Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-08 Thread vishal patel
Thanks for the reply.

What do you mean by "Shard1 Allocated memory"?
>> It means the JVM memory of one Solr node (instance).

How many Solr JVMs are you running?
>> On one server we run 2 Solr JVMs: one hosts a shard and the other hosts a replica.

What is the heap size for your JVMs?
>> 55GB per Solr JVM.

Regards,
Vishal Patel

Sent from Outlook

From: Walter Underwood 
Sent: Wednesday, July 8, 2020 8:45 PM
To: solr-user@lucene.apache.org 
Subject: Re: Replica goes into recovery mode in Solr 6.1.0

I don’t understand what you mean by "Shard1 Allocated memory". I don’t know of
any way to dedicate system RAM to an application object like a replica.

How many Solr JVMs are you running?

What is the heap size for your JVMs?

Setting soft commit max time to 100 ms does not magically make Solr super fast.
It makes Solr do too much work, makes the work queues fill up, and makes it 
fail.
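
For reference, a minimal sketch of a more conservative setting, in the same
solr.in.cmd style quoted below (the 10-second value is purely illustrative;
tune it to the freshness your application actually needs):

set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=10000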

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jul 7, 2020, at 10:55 PM, vishal patel  
> wrote:
>
> Thanks for your reply.
>
> One server has 320GB of RAM in total. On it run 2 Solr nodes: one is shard1 and
> the other is the shard2 replica. Each Solr node has 55GB of memory allocated.
> shard1 has 585GB of data and the shard2 replica has 492GB of data, so there is
> almost 1TB of data on this server. The server also runs other applications,
> which have 60GB of memory allocated. So 150GB of memory is left.
>
> Proper formatting details:
> https://drive.google.com/file/d/1K9JyvJ50Vele9pPJCiMwm25wV4A6x4eD/view
>
> Are you running multiple huge JVMs?
>>> Not huge, but 60GB of memory is allocated to our 11 applications. 150GB of
>>> memory is still free.
>
> The servers will be doing a LOT of disk IO, so look at the read and write 
> iops. I expect that the solr processes are blocked on disk reads almost all 
> the time.
>>> Is there a chance of going into recovery mode when there is heavy IO
>>> read/write or blocking?
>
> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
>>> Our requirement is NRT so we keep the less time
>
> Regards,
> Vishal Patel
> 
> From: Walter Underwood 
> Sent: Tuesday, July 7, 2020 8:15 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
>
> This isn’t a support list, so nobody looks at issues. We do try to help.
>
> It looks like you have 1 TB of index on a system with 320 GB of RAM.
> I don’t know what "Shard1 Allocated memory" is, but maybe half of
> that RAM is used by JVMs or some other process, I guess. Are you
> running multiple huge JVMs?
>
> The servers will be doing a LOT of disk IO, so look at the read and
> write iops. I expect that the solr processes are blocked on disk reads
> almost all the time.
>
> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
> That is probably causing your outages.
>
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
>
>> On Jul 7, 2020, at 5:18 AM, vishal patel  
>> wrote:
>>
>> Is anyone looking at my issue? Please guide me.
>>
>> Regards,
>> Vishal Patel
>>
>>
>> 
>> From: vishal patel 
>> Sent: Monday, July 6, 2020 7:11 PM
>> To: solr-user@lucene.apache.org 
>> Subject: Replica goes into recovery mode in Solr 6.1.0
>>
>> I am using Solr version 6.1.0, Java 8, and G1GC in production. We
>> have 2 shards and each shard has 1 replica. We have 3 collections.
>> We do not use any caches; they are also disabled in solrconfig.xml. Search and
>> update requests come in frequently on our live platform.
>>
>> *Our commit configuration in solrconfig.xml is below
>> <autoCommit>
>>   <maxTime>60</maxTime>
>>   <maxDocs>2</maxDocs>
>>   <openSearcher>false</openSearcher>
>> </autoCommit>
>> <autoSoftCommit>
>>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>> </autoSoftCommit>
>>
>> *We use Near Real Time searching, so we set the below configuration in
>> solr.in.cmd
>> set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=100
>>
>> *Our collection details are below:
>>
>> Collection    Shard1                 Shard1 Replica         Shard2                 Shard2 Replica
>>               Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)
>> collection1   26913364    201        26913379    202        26913380    198        26913379    198
>> collection2   13934360    310        13934367    310        13934368    219        13934367    219
>> collection3   351539689   73.5       351540040   73.5       351540136   75.2       351539722   75.2
>>
>> *My server configurations are below:
>>
>>                                          Server1           Server2
>> CPU                                      Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz, 2301 MHz, 10 Core(s), 20 Logical Processor(s) (same on both servers)
>> HardDisk(GB)                             3845 (3.84 TB)    3485 (3.48 TB)
>> Total memory(GB)                         320               320
>> Shard1 Allocated memory(GB)              55
>> Shard2 Replica Allocated memory(GB)      55
>> Shard2 Allocated memory(GB)                                55
>> Shard1 Replica Allocated memory(GB)                        55
>> Other Applications Allocated Memory(GB)  60                22
>> Other Number Of Applications             11                7

Solr multi word search across multiple fields with mm

2020-07-08 Thread Venu
Search on words spanning different fields using the edismax query parser
with minimum match (mm) and sow=false generates different queries when a
field undergoes different query-time analysis (like multi-word synonyms, stop
words, etc.).

Assuming I have 2 documents where brand, description_synonyms and tags have
different data
{id: 1
  brand: amul,
  description_synonyms: slice,
  tags: cheese
}
{id:2,
  brand: amul,
  description_synonyms:cake,
  tags: cheese
}


Below is the parsed query string for the query "amul cheese slice". In this
case, *mm(~2) is across fields*, since none of amul, cheese, and slice has
synonyms:

"parsedquery_toString": "+brand:amul)^10.0 |
(description_synonyms:amul)^4.0 | tags:amul)~1.0 ((brand:cheese)^10.0 |
(description_synonyms:cheese)^4.0 | tags:cheese)~1.0 ((brand:slice)^10.0 |
(description_synonyms:slice)^4.0 | tags:slice)~1.0)*~2*)"

while below is the parsed query string for "amul cheese cake". Since cake has
plum cake etc. as synonyms, edismax produced the below query with *mm(~2) per
field*, resulting in no match:

"parsedquery_toString": "+(((brand:amul brand:cheese brand:cake)~2)^10.0 |
((description_synonyms:amul description_synonyms:cheese
(description_synonyms:cupcak description_synonyms:pastri
description_synonyms:\"plum cake\" description_synonyms:cake))~2)^4.0 |
((tags:amul tags:cheese tags:cake)~2))~1.0"

I want to keep matching on individual fields rather than clubbing all fields
into a single field.

Is there a way we can solve this? Any help would be highly appreciated.
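
A possible workaround sketch, assuming the loss of multi-word synonym matching
is acceptable: sow=true makes edismax analyze each whitespace-separated term
separately, which yields a term-centric query again (the collection name is a
placeholder, and the field list follows the example above):

http://localhost:8983/solr/<collection>/select?defType=edismax&sow=true&mm=2&qf=brand description_synonyms tags&q=amul cheese slice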








Solr docker image works with image option but not with build option in docker-compose

2020-07-08 Thread gnandre
Hi,

I am using the Solr docker image 8.5.2-slim from https://hub.docker.com/_/solr.
I use it as a base image and then add some more stuff to it with my custom
Dockerfile. The final docker image builds successfully.
After that, when I try to use it in docker-compose.yml (with the build option)
to start a Solr service, it complains about not having permission to create
directories under the /var/solr path, even though I have given the solr user
read/write permission on /var/solr in the Dockerfile. Also, when I use the
image option instead of the build option in docker-compose.yml for the same
image, it does not throw any such errors and Solr starts without issues. Any
clue why this might be happening?
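
For reference, a minimal sketch of the kind of custom Dockerfile described
above (the base image tag is from the question; the solr user and /var/solr
path follow the official image's defaults):

FROM solr:8.5.2-slim
# Switch to root only to fix ownership of the data directory,
# then drop back to the image's solr user.
USER root
RUN mkdir -p /var/solr && chown -R solr:solr /var/solr
USER solr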


Re: Suggestion or recommendation for NRT

2020-07-08 Thread ramyogi
Hi Team, any suggestions or recommendations on the above approach, which we
are using to get better search performance?





Re: Tokenizing managed synonyms

2020-07-08 Thread Kayak28
Hello, Solr Community:

Actually, you can set up a tokenizer for managed synonyms.
But the configuration is not in the reference guide, and I do not know how
to add a tokenizer via an API call.
So you might need to manually edit a JSON file below the config directory.


In the _schema_analysis_synonyms_<resource name>.json file under the config
directory, you will see the JSON below.

{
  "responseHeader":{
"status":0,
"QTime":3},
  "synonymMappings":{
"initArgs":{
  "ignoreCase":true,
  "format":"solr"},
"initializedOn":"2014-12-16T22:44:05.33Z",
"managedMap":{
  "GB":
["GiB",
 "Gigabyte"],
  "TV":
["Television"],
  "happy":
["glad",
 "joyful"]}}}


In order to add a tokenizer, you need to add the following key-value pair under
the "initArgs" key, where the value is a tokenizer factory class such as
solr.StandardTokenizerFactory or solr.KeywordTokenizerFactory:
 "tokenizerFactory":"solr.StandardTokenizerFactory"

Eventually, you will get the following JSON.
{
  "responseHeader":{
    "status":0,
    "QTime":3},
  "synonymMappings":{
    "initArgs":{
      "ignoreCase":true,
      "format":"solr",
      "tokenizerFactory":"solr.StandardTokenizerFactory"},
    "initializedOn":"2014-12-16T22:44:05.33Z",
    "managedMap":{
      "GB": ["GiB", "Gigabyte"],
      "TV": ["Television"],
      "happy": ["glad", "joyful"]}}}


I would like to add this configuration to the Solr reference guide, but I have
not created a JIRA issue for it yet.
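
Note that changes made by editing a managed resource file on disk only take
effect after the core is reloaded; a minimal sketch, assuming a core named
mycore on the default port:

curl "http://localhost:8983/solr/admin/cores?action=RELOAD&core=mycore"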


-- 

Sincerely,
Kaya
github: https://github.com/28kayak



On Tue, Jul 7, 2020 at 11:55, Koji Sekiguchi wrote:

> I think the question makes sense, as SynonymGraphFilterFactory accepts
> tokenizerFactory; he asked whether the managed version of SynonymGraphFilter
> could accept it as well.
>
>
> https://lucene.apache.org/solr/guide/8_5/filter-descriptions.html#synonym-graph-filter
>
> The answer seems to be NO.
>
> Koji
>
>
> On 2020/07/07 8:18, Erick Erickson wrote:
> > This question doesn’t really make sense. You don’t specify tokenizers on
> > filters, they’re specified at the _field_ level.
> >
> > You can certainly define as many field(type)s as you want, each with a
> different
> > analysis chain and those chains can be made up of whatever you want to
> use, and
> > there are lots of choices.
> >
> > If you are asking to do _additional_ tokenization on the output of a
> synonym
> > filter, no.
> >
> > Perhaps if you defined the problem you’re trying to solve we could make
> some
> > suggestions.
> >
> > Best,
> > Erick
> >
> >> On Jul 6, 2020, at 6:43 PM, Thomas Corthals 
> wrote:
> >>
> >> Hi,
> >>
> >> Is it possible to specify a Tokenizer Factory on a Managed Synonym Graph
> >> Filter? I would like to use a Standard Tokenizer or Keyword Tokenizer on
> >> some fields.
> >>
> >> Best,
> >>
> >> Thomas
> >
> >
>





Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-08 Thread Walter Underwood
I don’t understand what you mean by "Shard1 Allocated memory". I don’t know of
any way to dedicate system RAM to an application object like a replica.

How many Solr JVMs are you running?

What is the heap size for your JVMs?

Setting soft commit max time to 100 ms does not magically make Solr super fast.
It makes Solr do too much work, makes the work queues fill up, and makes it 
fail.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Jul 7, 2020, at 10:55 PM, vishal patel  
> wrote:
> 
> Thanks for your reply.
> 
> One server has 320GB of RAM in total. On it run 2 Solr nodes: one is shard1 and
> the other is the shard2 replica. Each Solr node has 55GB of memory allocated.
> shard1 has 585GB of data and the shard2 replica has 492GB of data, so there is
> almost 1TB of data on this server. The server also runs other applications,
> which have 60GB of memory allocated. So 150GB of memory is left.
> 
> Proper formatting details:
> https://drive.google.com/file/d/1K9JyvJ50Vele9pPJCiMwm25wV4A6x4eD/view
> 
> Are you running multiple huge JVMs?
>>> Not huge, but 60GB of memory is allocated to our 11 applications. 150GB of
>>> memory is still free.
> 
> The servers will be doing a LOT of disk IO, so look at the read and write 
> iops. I expect that the solr processes are blocked on disk reads almost all 
> the time.
>>> Is there a chance of going into recovery mode when there is heavy IO
>>> read/write or blocking?
> 
> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
>>> Our requirement is NRT so we keep the less time
> 
> Regards,
> Vishal Patel
> 
> From: Walter Underwood 
> Sent: Tuesday, July 7, 2020 8:15 PM
> To: solr-user@lucene.apache.org 
> Subject: Re: Replica goes into recovery mode in Solr 6.1.0
> 
> This isn’t a support list, so nobody looks at issues. We do try to help.
> 
> It looks like you have 1 TB of index on a system with 320 GB of RAM.
> I don’t know what "Shard1 Allocated memory" is, but maybe half of
> that RAM is used by JVMs or some other process, I guess. Are you
> running multiple huge JVMs?
> 
> The servers will be doing a LOT of disk IO, so look at the read and
> write iops. I expect that the solr processes are blocked on disk reads
> almost all the time.
> 
> "-Dsolr.autoSoftCommit.maxTime=100” is way too short (100 ms).
> That is probably causing your outages.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Jul 7, 2020, at 5:18 AM, vishal patel  
>> wrote:
>> 
>> Is anyone looking at my issue? Please guide me.
>> 
>> Regards,
>> Vishal Patel
>> 
>> 
>> 
>> From: vishal patel 
>> Sent: Monday, July 6, 2020 7:11 PM
>> To: solr-user@lucene.apache.org 
>> Subject: Replica goes into recovery mode in Solr 6.1.0
>> 
>> I am using Solr version 6.1.0, Java 8, and G1GC in production. We
>> have 2 shards and each shard has 1 replica. We have 3 collections.
>> We do not use any caches; they are also disabled in solrconfig.xml. Search and
>> update requests come in frequently on our live platform.
>> 
>> *Our commit configuration in solrconfig.xml is below
>> <autoCommit>
>>   <maxTime>60</maxTime>
>>   <maxDocs>2</maxDocs>
>>   <openSearcher>false</openSearcher>
>> </autoCommit>
>> <autoSoftCommit>
>>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
>> </autoSoftCommit>
>> 
>> *We use Near Real Time searching, so we set the below configuration in
>> solr.in.cmd
>> set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=100
>> 
>> *Our collection details are below:
>>
>> Collection    Shard1                 Shard1 Replica         Shard2                 Shard2 Replica
>>               Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)
>> collection1   26913364    201        26913379    202        26913380    198        26913379    198
>> collection2   13934360    310        13934367    310        13934368    219        13934367    219
>> collection3   351539689   73.5       351540040   73.5       351540136   75.2       351539722   75.2
>> 
>> *My server configurations are below:
>>
>>                                          Server1           Server2
>> CPU                                      Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz, 2301 MHz, 10 Core(s), 20 Logical Processor(s) (same on both servers)
>> HardDisk(GB)                             3845 (3.84 TB)    3485 (3.48 TB)
>> Total memory(GB)                         320               320
>> Shard1 Allocated memory(GB)              55
>> Shard2 Replica Allocated memory(GB)      55
>> Shard2 Allocated memory(GB)                                55
>> Shard1 Replica Allocated memory(GB)                        55
>> Other Applications Allocated Memory(GB)  60                22
>> Other Number Of Applications             11                7
>> 
>> 
>> Sometimes one of the replicas goes into recovery mode. Why does a replica go
>> into recovery? Due to heavy search, heavy update/insert, or long GC pause times?
>> If it is one of these, what should we change in the configuration?
>> Should we add shards to address the recovery issue?
>> 
>> Regards,
>> Vishal Patel
>> 
> 



Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-08 Thread vishal patel
Actually, I had shown our collection details in Excel format, but the
formatting may have been removed here.

For this you can see 
https://drive.google.com/file/d/1K9JyvJ50Vele9pPJCiMwm25wV4A6x4eD/view

Regards,
Vishal Patel

From: Rodrigo Oliveira 
Sent: Wednesday, July 8, 2020 4:23 PM
To: solr-user@lucene.apache.org 
Subject: Re: Replica goes into recovery mode in Solr 6.1.0

Hi,

How did you get this? Is there a command that produces this summary?


*Our collection details are below:

Collection    Shard1                 Shard1 Replica         Shard2                 Shard2 Replica
              Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)
collection1   26913364    201        26913379    202        26913380    198        26913379    198
collection2   13934360    310        13934367    310        13934368    219        13934367    219
collection3   351539689   73.5       351540040   73.5       351540136   75.2       351539722   75.2



On Mon, Jul 6, 2020 at 10:41, vishal patel wrote:

> I am using Solr version 6.1.0, Java 8, and G1GC in production. We
> have 2 shards and each shard has 1 replica. We have 3 collections.
> We do not use any caches; they are also disabled in solrconfig.xml. Search and
> update requests come in frequently on our live platform.
>
> *Our commit configuration in solrconfig.xml is below
> <autoCommit>
>   <maxTime>60</maxTime>
>   <maxDocs>2</maxDocs>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
>
> *We use Near Real Time searching, so we set the below configuration in
> solr.in.cmd
> set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=100
>
> *Our collection details are below:
>
> Collection    Shard1                 Shard1 Replica         Shard2                 Shard2 Replica
>               Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)
> collection1   26913364    201        26913379    202        26913380    198        26913379    198
> collection2   13934360    310        13934367    310        13934368    219        13934367    219
> collection3   351539689   73.5       351540040   73.5       351540136   75.2       351539722   75.2
>
> *My server configurations are below:
>
>                                          Server1           Server2
> CPU                                      Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz, 2301 MHz, 10 Core(s), 20 Logical Processor(s) (same on both servers)
> HardDisk(GB)                             3845 (3.84 TB)    3485 (3.48 TB)
> Total memory(GB)                         320               320
> Shard1 Allocated memory(GB)              55
> Shard2 Replica Allocated memory(GB)      55
> Shard2 Allocated memory(GB)                                55
> Shard1 Replica Allocated memory(GB)                        55
> Other Applications Allocated Memory(GB)  60                22
> Other Number Of Applications             11                7
>
>
> Sometimes one of the replicas goes into recovery mode. Why does a replica go
> into recovery? Due to heavy search, heavy update/insert, or long GC pause times?
> If it is one of these, what should we change in the configuration?
> Should we add shards to address the recovery issue?
>
> Regards,
> Vishal Patel
>
>


Replication of Solr Model and feature store

2020-07-08 Thread krishan goyal
Hi,

How do I enable replication of the model and feature store ?

Thanks
Krishan
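
For standalone (master/slave) setups, one possible sketch is to ship the LTR
stores as config files through the replication handler; the store file names
below are assumptions about how the managed resources are persisted in the
conf directory, so verify them against your own conf directory first:

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <!-- Assumed file names of the managed LTR feature and model stores. -->
    <str name="confFiles">managed-schema,_schema_feature-store.json,_schema_model-store.json</str>
  </lst>
</requestHandler>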


Re: Replica goes into recovery mode in Solr 6.1.0

2020-07-08 Thread Rodrigo Oliveira
Hi,

How did you get this? Is there a command that produces this summary?


*Our collection details are below:

Collection    Shard1                 Shard1 Replica         Shard2                 Shard2 Replica
              Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)
collection1   26913364    201        26913379    202        26913380    198        26913379    198
collection2   13934360    310        13934367    310        13934368    219        13934367    219
collection3   351539689   73.5       351540040   73.5       351540136   75.2       351539722   75.2



On Mon, Jul 6, 2020 at 10:41, vishal patel wrote:

> I am using Solr version 6.1.0, Java 8, and G1GC in production. We
> have 2 shards and each shard has 1 replica. We have 3 collections.
> We do not use any caches; they are also disabled in solrconfig.xml. Search and
> update requests come in frequently on our live platform.
>
> *Our commit configuration in solrconfig.xml is below
> <autoCommit>
>   <maxTime>60</maxTime>
>   <maxDocs>2</maxDocs>
>   <openSearcher>false</openSearcher>
> </autoCommit>
> <autoSoftCommit>
>   <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime>
> </autoSoftCommit>
>
> *We use Near Real Time searching, so we set the below configuration in
> solr.in.cmd
> set SOLR_OPTS=%SOLR_OPTS% -Dsolr.autoSoftCommit.maxTime=100
>
> *Our collection details are below:
>
> Collection    Shard1                 Shard1 Replica         Shard2                 Shard2 Replica
>               Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)   Docs        Size(GB)
> collection1   26913364    201        26913379    202        26913380    198        26913379    198
> collection2   13934360    310        13934367    310        13934368    219        13934367    219
> collection3   351539689   73.5       351540040   73.5       351540136   75.2       351539722   75.2
>
> *My server configurations are below:
>
>                                          Server1           Server2
> CPU                                      Intel(R) Xeon(R) CPU E5-2650 v3 @ 2.30GHz, 2301 MHz, 10 Core(s), 20 Logical Processor(s) (same on both servers)
> HardDisk(GB)                             3845 (3.84 TB)    3485 (3.48 TB)
> Total memory(GB)                         320               320
> Shard1 Allocated memory(GB)              55
> Shard2 Replica Allocated memory(GB)      55
> Shard2 Allocated memory(GB)                                55
> Shard1 Replica Allocated memory(GB)                        55
> Other Applications Allocated Memory(GB)  60                22
> Other Number Of Applications             11                7
>
>
> Sometimes one of the replicas goes into recovery mode. Why does a replica go
> into recovery? Due to heavy search, heavy update/insert, or long GC pause times?
> If it is one of these, what should we change in the configuration?
> Should we add shards to address the recovery issue?
>
> Regards,
> Vishal Patel
>
>


QTime lesser for facet.limit=-1 than facet.limit=5000/10000

2020-07-08 Thread ana
Hi Team,
Which is more optimized: facet.limit=-1 or facet.limit=10000/50000/40000?
For a high-cardinality string field, with no cache enabled and no docValues
enabled, after every RELOAD from the Solr admin UI, for each query with a
different facet.limit, why is the QTime for "facet.limit=-1" lower than
that of "facet.limit=5000/10000"? What factors, apart from those listed
above, matter in calculating QTime?

My understanding is that facet.limit=-1 should have a higher response time,
as per the Solr ref guide, than any other higher facet.limit specified.

Experiment:

field = abc_s
cardinality: 71520
num of docs: 52055449
total num of facets: 70657
appliedMethod: FC
Test query:
http://localhost:8983/solr/<collection>/select?facet.field=abc_s&facet=on&q=*:*&rows=0&debug=true&f.abc_s.facet.limit=-1

facet.limit   -1    100   5000   10000   40000   50000
QTime         983   857   3429   5324    1006    1027

The debug response for facet.limit=10000 is attached as facet_Response_1.txt.





Re: Performance in solr7.5

2020-07-08 Thread Sankar Panda
Hi All,

Any suggestion?
Thanks
Sankar Panda

On Mon, Jul 6, 2020, 16:58 Sankar Panda  wrote:

> Hi Eric,
> Thanks for your mail. I am seeing that the CPU is idle 91% of the time.
>
> Looking at some of the stats below, I am still not able to see where it
> is going wrong:
>
> Filename                Type    Size     Used     Priority
> /mnt/resource/swapfile  file    2097148  1119016  -2
>
>        total  used  free  shared  buff/cache  available
> Mem:     251    51    16       2         184        196
> Swap:      1     1     0
>
> vmstat:
> procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----
>  r  b    swpd     free     buff     cache   si  so   bi   bo  in  cs  us sy id wa st
>  0  0 1119016 16969144  200036 192793264    0   0  177  217   0   0   2  2 92  4  0
>
> avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
>            2.49   0.00     1.60     4.14    0.00  91.77
>
> Can you please help me with where I need to look?
>
> Thanks
> Sankar Panda
>
>
>
> On Mon, Jul 6, 2020 at 12:19 AM Erick Erickson 
> wrote:
>
>> Look at your I/O stats. My bet is that you’re swapping like crazy and
>> your CPU is relatively idle.
>>
>> 2T of index on two machines is probably simply too much data on too
>> little hardware.
>>
>> Consider stress testing your hardware gradually, see:
>>
>>
>> https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/
>>
>> Best,
>> Erick
>>
>> > On Jul 5, 2020, at 9:38 AM, Sankar Panda 
>> wrote:
>> >
>> > Hi All,
>> >
>> > I am facing a performance issue while searching. It takes 10 minutes to
>> > get results. I have 2 shards and each shard has 2 replicas, 80M documents,
>> > and the index size on each shard is 2TB.
>> >
>> > Any suggestions?
>> >
>> > Thanks
>> > Sankar panda
>>
>>
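
A minimal way to watch the disk iops Erick mentions, assuming the sysstat
package is installed on the Solr hosts:

# Extended device stats every 5 seconds; watch r/s, w/s, await and %util.
iostat -x 5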


Solr multi word search across multiple fields with mm

2020-07-08 Thread Venu
Hi 
We observed that multi-word queries spanning multiple fields with
mm create a problem. Any help would be appreciated.

Current Problem:
Search on words spanning different fields with minimum match (mm) and
sow=false generates a field-centric query with per-field mm, rather than a
term-centric query with mm across fields, when a field undergoes different
query-time analysis (like multi-word synonyms, stop words, etc.).

Below are the sample field and term centric queries:

*term-centric query with the query string "amul cheese slice" (none of
the terms has synonyms):*

"parsedquery_toString": "+description:amul)^6.0 | description_l2:amul |
(description_l1:amul)^4.0 | (brand_name_h:amul)^8.0 |
(manual_tags:amul)^3.0) ((description:cheese)^6.0 | description_l2:cheese |
(description_l1:cheese)^4.0 | (brand_name_h:cheese)^8.0 |
(manual_tags:cheese)^3.0) ((description:slice)^6.0 | description_l2:slice |
(description_l1:slice)^4.0 | (brand_name_h:slice)^8.0 |
(manual_tags:slice)^3.0))~2)",

*field-centric query with the query string "amul cheese cake" (cake has
"plum cake" as a synonym):*

"parsedquery_toString": "+(((description:amul description:cheese
description:cake)~2)^6.0 | ((description_l2:amul description_l2:cheese
(description_l2:cupcak description_l2:pastri (+description_l2:plum
+description_l2:cake) description_l2:cake))~2) | ((description_l1:amul
description_l1:cheese description_l1:cake)~2)^4.0 | ((brand_name_h:amul
brand_name_h:cheese brand_name_h:cake)~2)^8.0 | ((manual_tags:amul
manual_tags:cheese manual_tags:cake)~2)^3.0)",


Referring to multiple blogs helped us try the different things below:
1. autoGeneratePhraseQueries
2. per-field mm: q=({!edismax qf=brand_name description v=$qx mm=2}^10 OR
{!edismax qf=description_l1 manual_tags_l1 v=$qx mm=2} OR {!edismax
qf=description_l2 v=$qx mm=2})&qx=amul cheese cake

But we observed that the above are still converted into field-centric
queries with per-field mm, resulting in no match when the words span
multiple fields.





--
Sent from: https://lucene.472066.n3.nabble.com/Solr-User-f472068.html