query keyword but no result (solr 8)

2019-05-13 Thread Derrick Cui
Hi,

I am trying to set up SolrCloud. I can index a few documents successfully,
but I cannot get results if I search for a keyword (without a field). If I use
field:keyword, I do get results.

Any idea why this happens?

Thank you

-- 
Regards,

Derrick Cui
Email: derrick...@gmail.com


Re: Solr node goes into recovery mode

2019-05-13 Thread Shawn Heisey

On 5/13/2019 8:26 AM, Maulin Rathod wrote:

Recently we have been observing an issue where a Solr node (any random node) 
automatically goes into recovery mode and stops responding.


Do you KNOW that these Solr instances actually need a 60GB heap?  That's 
a HUGE heap.  When a full GC happens on a heap that large, it's going to 
be a long pause, and there's nothing that can be done about it.



We have enough memory allocated to Solr (60 GB) and the system also has enough 
memory (300 GB)...


As just mentioned, unless you are CERTAIN that you need a 60GB heap, 
which most users do not, don't set it that high.  Any advice you read 
that says "set the heap to XX percent of the installed system memory" 
will frequently result in a setting that's incorrect for your specific 
setup.


And if you really DO need a 60GB heap, it would be recommended to either 
add more servers and put less of your index on each one, or to split 
your replicas between two Solr instances each running 31GB or less -- as 
Erick mentioned in his reply.
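Splitting one large heap into two smaller JVMs is typically done per instance in solr.in.sh. A hedged sketch of the idea — the values and ports below are illustrative, not from the thread:

```shell
# solr.in.sh for instance 1 (illustrative values -- adjust to your setup)
SOLR_HEAP="31g"     # staying at or below ~31g keeps compressed oops enabled
SOLR_PORT="8983"

# solr.in.sh for instance 2 would differ only in the port, e.g.:
# SOLR_HEAP="31g"
# SOLR_PORT="8984"
```

Two 31 GB heaps trade one very long full-GC pause for two shorter, independent ones.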



We have analyzed the GC logs and found a GC pause of 29.6583943 seconds when 
the problem happened. Can this GC pause make the node unavailable / put it into 
recovery mode, or could there be another reason?



Please note we have set zkClientTimeout to 10 minutes (zkClientTimeout=60) 
so that ZooKeeper will not consider this node unavailable during a long GC 
pause.


You can't actually set that timeout that high.  I believe that ZooKeeper 
limits the session timeout to 20 times the tickTime, which is typically 
set to 2 seconds.  So 40 seconds is typically the maximum you can have 
for that timeout.  Solr's zkClientTimeout value is used to set 
ZooKeeper's session timeout.
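The clamping Shawn describes can be sketched as a quick calculation. The 2x/20x multipliers are ZooKeeper's default minSessionTimeout/maxSessionTimeout bounds; the 600000 ms request below simply stands in for the "10 minutes" from the original mail:

```python
def effective_session_timeout(requested_ms: int, tick_time_ms: int = 2000) -> int:
    """Clamp a requested ZooKeeper session timeout to the server's bounds.

    By default ZooKeeper negotiates the session timeout into the range
    [2 * tickTime, 20 * tickTime], regardless of what the client asks for.
    """
    min_timeout = 2 * tick_time_ms
    max_timeout = 20 * tick_time_ms
    return max(min_timeout, min(requested_ms, max_timeout))

# Asking for "10 minutes" with the typical tickTime=2000 still yields 40 seconds:
print(effective_session_timeout(600_000))  # 40000
```

So a 29.66 s GC pause fits inside the negotiated 40 s session, but a requested 10-minute timeout never actually takes effect.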


And, as Erick also mentioned, there are other ways that a long GC pause 
can cause problems other than that specific timeout.  SolrCloud is not 
going to work well with a huge heap ... eventually a full GC is going to 
happen, and if it takes more than a few seconds, it's going to cause issues.


Thanks,
Shawn


Re: Solr node goes into recovery mode

2019-05-13 Thread Erick Erickson
There are a number of timeouts that can trip this; the ZK timeout is only one.

For instance, when a leader sends an update to a follower, if that times out 
the leader may put the follower into “Leader Initiated Recovery” (LIR).

60G heaps are, by and large, not recommended for this very reason. Consider 
creating more JVMs with less memory and each hosting fewer Solr replicas.

Best,
Erick

> On May 13, 2019, at 9:26 AM, Maulin Rathod  wrote:
> 
> Hi,
> 
> We are using solr 6.1 version with 2 shards. Each shard have 1 replica 
> set-up. i.e. We have total 4 server nodes (each node is assigned 60 gb of 
> RAM).
> 
> Recently we are observing issue where solr node (any random node) 
> automatically goes into recovery mode and stops responding.
> 
> We have enough memory allocated to Solr (60 gb) and system also have enough 
> memory (300 gb)...
> 
> We have analyzed GC logs and found that there was GC pause time of 29.6583943 
> second when problem happened. Can this GC Pause lead to make the node 
> unavailable/recovery mode? or there could be some another reason ?
> 
> Please note we have set zkClientTimeout to 10 minutes 
> (zkClientTimeout=60) so that zookeeper will not consider this node 
> unavailable during high GC pause time.
> 
> Solr GC Logs
> ==
> 
> {Heap before GC invocations=10940 (full 14):
> par new generation   total 17476288K, used 14724911K [0x8000, 
> 0x00058000, 0x00058000)
>  eden space 13981056K, 100% used [0x8000, 0x0003d556, 
> 0x0003d556)
>  from space 3495232K,  21% used [0x0003d556, 0x000402bcbdb0, 
> 0x0004aaab)
>  to   space 3495232K,   0% used [0x0004aaab, 0x0004aaab, 
> 0x00058000)
> concurrent mark-sweep generation total 62914560K, used 27668932K 
> [0x00058000, 0x00148000, 0x00148000)
> Metaspace   used 47602K, capacity 48370K, committed 49860K, reserved 
> 51200K
> 2019-05-13T12:23:19.103+0100: 174643.550: [GC (Allocation Failure) 
> 174643.550: [ParNew
> Desired survivor size 3221205808 bytes, new threshold 8 (max 8)
> - age   1:   52251504 bytes,   52251504 total
> - age   2:  208183784 bytes,  260435288 total
> - age   3:  274752960 bytes,  535188248 total
> - age   4:   12176528 bytes,  547364776 total
> - age   5:6135968 bytes,  553500744 total
> - age   6:3903152 bytes,  557403896 total
> - age   7:   15341896 bytes,  572745792 total
> - age   8:5518880 bytes,  578264672 total
> : 14724911K->762845K(17476288K), 24.7822734 secs] 
> 42393844K->28434889K(80390848K), 24.7825687 secs] [Times: user=157.97 
> sys=25.63, real=24.78 secs]
> Heap after GC invocations=10941 (full 14):
> par new generation   total 17476288K, used 762845K [0x8000, 
> 0x00058000, 0x00058000)
>  eden space 13981056K,   0% used [0x8000, 0x8000, 
> 0x0003d556)
>  from space 3495232K,  21% used [0x0004aaab, 0x0004d93a76a8, 
> 0x00058000)
>  to   space 3495232K,   0% used [0x0003d556, 0x0003d556, 
> 0x0004aaab)
> concurrent mark-sweep generation total 62914560K, used 27672043K 
> [0x00058000, 0x00148000, 0x00148000)
> Metaspace   used 47602K, capacity 48370K, committed 49860K, reserved 
> 51200K
> }
> 2019-05-13T12:23:44.456+0100: 174668.901: Total time for which application 
> threads were stopped: 29.6583943 seconds, Stopping threads took: 4.3050775 
> seconds
> 
> 
> ==
> 
> 
> 
> Regards,
> 
> Maulin
> 
> 



Re: SolrCloud limitations?

2019-05-13 Thread Edward Ribeiro
Just an addendum to Erick's answer: you can also consider using different
replica types such as TLOG or PULL. It will depend on your use case and
performance requirements. See
https://lucene.apache.org/solr/guide/7_7/shards-and-indexing-data-in-solrcloud.html

Best,
Edward

On Mon, May 13, 2019 at 9:54 AM Erick Erickson 
wrote:
>
> There’s no a-priori limit. 12 or 15 servers will be fine. As you add more
> and more replicas, there’s a little overhead at indexing time to get all the
> docs distributed from the leader to all replicas.
>
> I’ve seen 100s of replicas for a given shard.
>
> Best,
> Erick
>
> > On May 13, 2019, at 12:18 AM, Juergen Melzer (DPDHL IT Services)  wrote:
> >
> > Hi all
> >
> > At the moment we have 6 servers for doing the search. We want to go up
> > to 12 or 15 servers.
> > So my question is:
> > Are there any limitations for the SolrCloud and number of replicates?
> >
> >
> >
> > Regards
> > Juergen
> >


Re: Softer version of grouping and/or filter query

2019-05-13 Thread Edward Ribeiro
Cool! Paraphrasing the 'Solr in Action' book: edismax is the query parser to
use when dealing with users' queries. It has a lot of customization options
and is more resilient to ill-formed queries than the lucene parser. Whenever
possible, take some time to dig deeper into those. :)

Regards,
Edward

On Fri, May 10, 2019 at 6:09 PM Doug Reeder  wrote:

> Thanks much!  I dropped price from the fq term, changed to an edismax
> parser, and boosted with
> bq=price:[150+TO+*]^100
>
>
>
> On Thu, May 9, 2019 at 7:21 AM Edward Ribeiro 
> wrote:
>
> > On Wed, May 8, 2019 at 18:56, Doug Reeder 
> > wrote:
> >
> > >
> > > Similarly, we have a filter query that only returns products over $150:
> > > fq=price:[150+TO+*]
> > >
> > > Can this be changed to a q or qf parameter where products less than
> $150
> > > have score less than any product priced $150 or more? (A price higher
> > than
> > > $150 should not increase the score.)
> > >
> >
> > If you are using edismax then you could use a boost function. Maybe
> > something along these lines: bf=if(lt(price, 150), 0.5, 100)
> >
> > Your fq already filters out documents with prices less than 150. Using a
> > boost (function/query) will retrieve back docs with prices less than 150,
> > but probably with smaller scores.
> >
> > Edward
> >
> > >
> >
>
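The bf=if(lt(price, 150), 0.5, 100) suggestion above can be sanity-checked outside Solr. A sketch of the same step function in Python — the 0.5/100 boost values are the ones suggested in the reply, not tuned numbers:

```python
def price_boost(price: float) -> float:
    """Mirror of the Solr function query if(lt(price, 150), 0.5, 100):
    products under $150 get a small boost, products at/above $150 a large one,
    so cheaper products rank lower instead of being filtered out entirely."""
    return 0.5 if price < 150 else 100.0

# A product at $149.99 scores far below one at $150, but still appears:
print(price_boost(149.99), price_boost(150))  # 0.5 100.0
```

Note the boost is flat above $150, matching the requirement that prices higher than $150 should not further increase the score.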


Solr node goes into recovery mode

2019-05-13 Thread Maulin Rathod
Hi,

We are using Solr version 6.1 with 2 shards. Each shard has 1 replica, i.e. we 
have 4 server nodes in total (each node is assigned 60 GB of RAM).

Recently we have been observing an issue where a Solr node (any random node) 
automatically goes into recovery mode and stops responding.

We have enough memory allocated to Solr (60 GB) and the system also has enough 
memory (300 GB)...

We have analyzed the GC logs and found a GC pause of 29.6583943 seconds when 
the problem happened. Can this GC pause make the node unavailable / put it into 
recovery mode, or could there be another reason?

Please note we have set zkClientTimeout to 10 minutes (zkClientTimeout=60) 
so that ZooKeeper will not consider this node unavailable during a long GC 
pause.

Solr GC Logs
==

{Heap before GC invocations=10940 (full 14):
par new generation   total 17476288K, used 14724911K [0x8000, 
0x00058000, 0x00058000)
  eden space 13981056K, 100% used [0x8000, 0x0003d556, 
0x0003d556)
  from space 3495232K,  21% used [0x0003d556, 0x000402bcbdb0, 
0x0004aaab)
  to   space 3495232K,   0% used [0x0004aaab, 0x0004aaab, 
0x00058000)
concurrent mark-sweep generation total 62914560K, used 27668932K 
[0x00058000, 0x00148000, 0x00148000)
Metaspace   used 47602K, capacity 48370K, committed 49860K, reserved 51200K
2019-05-13T12:23:19.103+0100: 174643.550: [GC (Allocation Failure) 174643.550: 
[ParNew
Desired survivor size 3221205808 bytes, new threshold 8 (max 8)
- age   1:   52251504 bytes,   52251504 total
- age   2:  208183784 bytes,  260435288 total
- age   3:  274752960 bytes,  535188248 total
- age   4:   12176528 bytes,  547364776 total
- age   5:6135968 bytes,  553500744 total
- age   6:3903152 bytes,  557403896 total
- age   7:   15341896 bytes,  572745792 total
- age   8:5518880 bytes,  578264672 total
: 14724911K->762845K(17476288K), 24.7822734 secs] 
42393844K->28434889K(80390848K), 24.7825687 secs] [Times: user=157.97 
sys=25.63, real=24.78 secs]
Heap after GC invocations=10941 (full 14):
par new generation   total 17476288K, used 762845K [0x8000, 
0x00058000, 0x00058000)
  eden space 13981056K,   0% used [0x8000, 0x8000, 
0x0003d556)
  from space 3495232K,  21% used [0x0004aaab, 0x0004d93a76a8, 
0x00058000)
  to   space 3495232K,   0% used [0x0003d556, 0x0003d556, 
0x0004aaab)
concurrent mark-sweep generation total 62914560K, used 27672043K 
[0x00058000, 0x00148000, 0x00148000)
Metaspace   used 47602K, capacity 48370K, committed 49860K, reserved 51200K
}
2019-05-13T12:23:44.456+0100: 174668.901: Total time for which application 
threads were stopped: 29.6583943 seconds, Stopping threads took: 4.3050775 
seconds


==
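Pauses like the 29.66 s stop quoted above can be pulled straight out of a GC log. A minimal sketch, assuming the log contains the "Total time for which application threads were stopped" lines shown in this message:

```python
import re

STOPPED_RE = re.compile(
    r"Total time for which application threads were stopped: ([\d.]+) seconds"
)

def stopped_times(log_lines):
    """Yield every safepoint pause duration (in seconds) found in GC log lines."""
    for line in log_lines:
        m = STOPPED_RE.search(line)
        if m:
            yield float(m.group(1))

# One line from the log above (wrapped here for readability):
log = [
    "2019-05-13T12:23:44.456+0100: 174668.901: Total time for which application "
    "threads were stopped: 29.6583943 seconds, Stopping threads took: 4.3050775 seconds",
]
print(max(stopped_times(log)))  # 29.6583943
```

Scanning a whole log this way quickly shows whether long pauses are a one-off or a pattern.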



Regards,

Maulin




Re: Solr query takes a too much time in Solr 6.1.0

2019-05-13 Thread Shawn Heisey

On 5/13/2019 2:51 AM, vishal patel wrote:

Executing an identical query again will likely satisfy the query from Solr's 
caches.  Solr won't need to talk to the actual index, and it will be REALLY 
fast.  Even a massively complex query, if it is cached, will be fast.


All caches are disabled in our schema file because our indexing and 
searching rate is high in our live environment.


Solr's caches are defined in solrconfig.xml, not the schema.  I mention 
this so you can be sure you have the config you think you have.
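For reference, cache definitions live in the <query> section of solrconfig.xml. A hedged sketch of what (enabled) definitions look like there — class names and sizes mirror a stock solrconfig.xml, and are not a sizing recommendation:

```xml
<query>
  <!-- illustrative defaults from a stock solrconfig.xml -->
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <documentCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
</query>
```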


If your caches are in fact disabled, I am betting that the index data 
relevant to that query is getting pushed out of your OS disk cache. 
When you execute the same query twice, all the data required by Lucene 
to execute that query is available in the OS disk cache, so the second 
time is quick because Lucene is pulling the information from RAM, which 
is MUCH faster than disk.


Fixing that usually requires adding more memory to the server.

Thanks,
Shawn


Spellcheck Collations Phrase based instead of AND

2019-05-13 Thread Ashish Bisht
Hi,


For a sample collation during spellcheck.

 "collation",{
"collationQuery":"smart connected factory",
"hits":109,
"misspellingsAndCorrections":[
  "smart","smart",
  "connected","connected",
  "fator","factory"]},
  "collation",{
"collationQuery":"smart connected faster",
"hits":325,
"misspellingsAndCorrections":[
  "smart","smart",
  "connected","connected",
  "fator","faster"]},
  "collation",{
"collationQuery":"sparc connected factory",
"hits":14,
"misspellingsAndCorrections":[
  "smart","sparc",
  "connected","connected",
  "fator","factory"]},

The hits in the collationQuery are based on an AND between the keywords.

Is it possible to get the collations sorted based on a phrase match instead of AND?

Regards
Ashish



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: SolrCloud limitations?

2019-05-13 Thread Erick Erickson
There’s no a-priori limit. 12 or 15 servers will be fine. As you add more and 
more replicas, there’s a little overhead at indexing time to get all the docs 
distributed from the leader to all replicas. 

I’ve seen 100s of replicas for a given shard.

Best,
Erick

> On May 13, 2019, at 12:18 AM, Juergen Melzer (DPDHL IT Services) 
>  wrote:
> 
> Hi all
> 
> At the moment we have 6 servers for doing the search. We want to go up to 12 
> or 15 servers.
> So my question is:
> Are there any limitations for the SolrCloud and number of replicates?
> 
> 
> 
> Regards
> Juergen
> 



Re: Solr 8.0.0 error: cannot change field from index options=DOCS to inconsistent index options=DOCS_AND_FREQS_AND_POSITIONS

2019-05-13 Thread Erick Erickson
Whenever you change a field’s type, reindexing is usually indicated. There are 
very few times when it’s not. This really has nothing to do with 8.0, just 
the fact that you want to change the field’s type.

Do be aware that if the index was _ever_ touched by a 6x version of Lucene, you 
must re-index to use 8x. But that’s a separate issue from changing a field’s 
type.

Best,
Erick

> On May 13, 2019, at 2:59 AM, Bjarke Buur Mortensen  
> wrote:
> 
> OK, so the problem seems to come from
> https://issues.apache.org/jira/browse/LUCENE-8134
> Our field used to be type="string", but we have since changed it to a text
> type to be able to use synonyms (see below).
> 
> So we'll still have some documents that were indexed as "string". Am I
> right in assuming that we need to reindex in order to upgrade to 8.0.0?
> 
> Thanks,
> Bjarke
> 
>   positionIncrementGap="100">
> 
>  
>
>  
>
>
>
> synonyms="nuts-synonyms.txt" ignoreCase="false" expand="true"/>
>
>  
> 
> 
> Den fre. 10. maj 2019 kl. 22.38 skrev Erick Erickson <
> erickerick...@gmail.com>:
> 
>> I suspect that perhaps some defaults have changed? So I’d try changing the
>> definition in the schema for that field. These changes should be pointed
>> out in the upgrade notes in Lucene or Solr CHANGES.txt.
>> 
>> Best,
>> Erick
>> 
>>> On May 10, 2019, at 1:17 AM, Bjarke Buur Mortensen <
>> morten...@eluence.com> wrote:
>>> 
>>> Hi list,
>>> 
>>> I'm trying to open a 7.x core in Solr 8.
>>> I'm getting the error:
>>> 
>> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
>>> Error opening new searcher
>>> 
>>> Digging further in the logs, I see the error:
>>> "
>>> ...
>>> Caused by: java.lang.IllegalArgumentException: cannot change field
>>> "delivery_place_code" from index options=DOCS to inconsistent index
>>> options=DOCS_AND_FREQS_AND_POSITIONS
>>> ...
>>> "
>>> 
>>> Is this a known issue when upgrading to 8.0.0?
>>> Can I do anything to avoid it?
>>> 
>>> Thanks,
>>> Bjarke
>> 
>> 



Re: Solr query takes a too much time in Solr 6.1.0

2019-05-13 Thread Erick Erickson
Oh, and you can freely set docValues=true _and_ have indexed=true on the same 
field, Solr will use the right structure for the operations it needs. HOWEVER: 
if you change that definition you _must_ re-index the entire collection.

> On May 13, 2019, at 1:22 AM, Bernd Fehling  
> wrote:
> 
> Your "sort" parameter has "sort=id+desc,id+desc".
> 1. It doesn't make sense to have a sort on "id" in descending order twice.
> 2. Be aware that the id field has the highest cardinality.
> 3. To speedup sorting have a separate field with docValues=true for sorting.
>   E.g.
>  multiValued="false" />
>  docValues="true" useDocValuesAsStored="false" />
> 
> 
> Regards
> Bernd
> 
> 
> Am 10.05.19 um 15:32 schrieb vishal patel:
>> We have 2 shards and 2 replicas in Live environment. we have multiple 
>> collections.
>> Some times some query takes much time(QTime=52552).  There are so many 
>> documents indexing and searching within milliseconds.
>> When we executed the same query again using admin panel, it does not take a 
>> much time and it completes within 20 milliseconds.
>> My Solr Logs :
>> 2019-05-10 09:48:56.744 INFO  (qtp1239731077-128223) [c:actionscomments 
>> s:shard1 r:core_node1 x:actionscomments] o.a.s.c.S.Request [actionscomments] 
>>  webapp=/solr path=/select 
>> params={q=%2Bproject_id:(2102117)%2Brecipient_id:(4642365)+%2Bentity_type:(1)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2B(is_formtype_active:true)+%2B(appType:1)=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
>>  hits=198 status=0 QTime=52552
>> 2019-05-10 09:48:56.744 INFO  (qtp1239731077-127998) [c:actionscomments 
>> s:shard1 r:core_node1 x:actionscomments] o.a.s.c.S.Request [actionscomments] 
>>  webapp=/solr path=/select 
>> params={q=%2Bproject_id:(2102117)%2Brecipient_id:(4642365)+%2Bentity_type:(1)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2Bdue_date:[2019-05-09T19:30:00Z+TO+2019-05-09T19:30:00Z%2B1DAY]+%2B(is_formtype_active:true)+%2B(appType:1)=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
>>  hits=0 status=0 QTime=51970
>> 2019-05-10 09:48:56.746 INFO  (qtp1239731077-128224) [c:actionscomments 
>> s:shard1 r:core_node1 x:actionscomments] o.a.s.c.S.Request [actionscomments] 
>>  webapp=/solr path=/select 
>> params={q=%2Bproject_id:(2121600+2115171+2104206)%2Brecipient_id:(2834330)+%2Bentity_type:(2)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2Bdue_date:[2019-05-10T00:00:00Z+TO+2019-05-10T00:00:00Z%2B1DAY]=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
>>  hits=98 status=0 QTime=51402
>> My schema fields below :
>> > multiValued="false"/>
>> 
>> 
>> 
>> 
>> 
>> 
>> > />
>> 
>> 
>> What could be a problem here? why the query takes too much time at that time?
>> Sent from Outlook



Re: Solr query takes a too much time in Solr 6.1.0

2019-05-13 Thread Erick Erickson
That indicates you’re hitting the queryResultCache, which is also supported by 
your statement about how fast queries are returned after they’re run once. Look 
at admin UI>>select core>>stats/plugins>>cache>>queryResultCache and you’ll 
probably see a very high hit ratio, approaching 1.

You also have (in another e-mail) 7 zookeepers for a small system. This is 
severe overkill and gains you nothing. The only times I’ve seen ZooKeepers need 
more than three is when there are 100s of nodes.

Best,
Erick

> On May 13, 2019, at 7:10 AM, vishal patel  
> wrote:
> 
> there are many searching and indexing within a millisecond



Re: Solr query takes a too much time in Solr 6.1.0

2019-05-13 Thread vishal patel
In our live environment there is a lot of searching and indexing happening 
within milliseconds. We use faceting and sorting in queries.

> 3. To speedup sorting have a separate field with docValues=true for sorting.

Is it necessary or useful to make a separate field if I use that field for 
sorting or faceting?
If I do not make a separate field, is there any performance issue when the same 
field is also searched in a query?

Sent from Outlook

From: Bernd Fehling 
Sent: Monday, May 13, 2019 11:52 AM
To: solr-user@lucene.apache.org
Subject: Re: Solr query takes a too much time in Solr 6.1.0

Your "sort" parameter has "sort=id+desc,id+desc".
1. It doesn't make sense to have a sort on "id" in descending order twice.
2. Be aware that the id field has the highest cardinality.
3. To speedup sorting have a separate field with docValues=true for sorting.
E.g.
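The example field definitions here were stripped by the list archive. A hedged reconstruction of the pattern Bernd is describing — the field names and type are illustrative, not recovered from the original mail:

```xml
<!-- searchable/stored field -->
<field name="id" type="string" indexed="true" stored="true" multiValued="false"/>
<!-- separate sort-only companion field, docValues only -->
<field name="id_sort" type="string" indexed="false" stored="false"
       docValues="true" useDocValuesAsStored="false"/>
<copyField source="id" dest="id_sort"/>
```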




Regards
Bernd


Am 10.05.19 um 15:32 schrieb vishal patel:
> We have 2 shards and 2 replicas in Live environment. we have multiple 
> collections.
> Some times some query takes much time(QTime=52552).  There are so many 
> documents indexing and searching within milliseconds.
> When we executed the same query again using admin panel, it does not take a 
> much time and it completes within 20 milliseconds.
>
> My Solr Logs :
> 2019-05-10 09:48:56.744 INFO  (qtp1239731077-128223) [c:actionscomments 
> s:shard1 r:core_node1 x:actionscomments] o.a.s.c.S.Request [actionscomments]  
> webapp=/solr path=/select 
> params={q=%2Bproject_id:(2102117)%2Brecipient_id:(4642365)+%2Bentity_type:(1)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2B(is_formtype_active:true)+%2B(appType:1)=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
>  hits=198 status=0 QTime=52552
> 2019-05-10 09:48:56.744 INFO  (qtp1239731077-127998) [c:actionscomments 
> s:shard1 r:core_node1 x:actionscomments] o.a.s.c.S.Request [actionscomments]  
> webapp=/solr path=/select 
> params={q=%2Bproject_id:(2102117)%2Brecipient_id:(4642365)+%2Bentity_type:(1)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2Bdue_date:[2019-05-09T19:30:00Z+TO+2019-05-09T19:30:00Z%2B1DAY]+%2B(is_formtype_active:true)+%2B(appType:1)=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
>  hits=0 status=0 QTime=51970
> 2019-05-10 09:48:56.746 INFO  (qtp1239731077-128224) [c:actionscomments 
> s:shard1 r:core_node1 x:actionscomments] o.a.s.c.S.Request [actionscomments]  
> webapp=/solr path=/select 
> params={q=%2Bproject_id:(2121600+2115171+2104206)%2Brecipient_id:(2834330)+%2Bentity_type:(2)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2Bdue_date:[2019-05-10T00:00:00Z+TO+2019-05-10T00:00:00Z%2B1DAY]=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
>  hits=98 status=0 QTime=51402
>
>
> My schema fields below :
>
>  multiValued="false"/>
> 
> 
> 
> 
> 
> 
>  />
> 
> 
>
> What could be a problem here? why the query takes too much time at that time?
>
> Sent from Outlook
>


Shard got down in Solr 6.1.0

2019-05-13 Thread vishal patel
vishal patel has shared a OneDrive file with you. To view it, click the link 
below.




GC_log.txt



We have 2 shards and 2 replicas with 7 zookeepers in our live environment. 
Unexpectedly, a shard went down. From the logs, we cannot identify why. I have 
attached the GC and Solr logs.

*
My solr.xml data
**

<solr>

  <solrcloud>

    <str name="host">${host:localhost}</str>
    <int name="hostPort">${jetty.port:8983}</int>
    <str name="hostContext">${hostContext:solr}</str>

    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>

    <int name="zkClientTimeout">${zkClientTimeout:60}</int>
    <int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:60}</int>
    <int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:6}</int>
    <str name="zkCredentialsProvider">${zkCredentialsProvider:org.apache.solr.common.cloud.DefaultZkCredentialsProvider}</str>
    <str name="zkACLProvider">${zkACLProvider:org.apache.solr.common.cloud.DefaultZkACLProvider}</str>

  </solrcloud>

  <shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:60}</int>
    <int name="connTimeout">${connTimeout:6}</int>
  </shardHandlerFactory>

</solr>
*
My zoo.cfg data
**
tickTime=2000
initLimit=10
syncLimit=5
*



Can you suggest how we can find out the cause of this issue?
2019-05-10 13:00:54.559 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.ConnectionManager Watcher 
org.apache.solr.common.cloud.ConnectionManager@3a0b9e39 
name:ZooKeeperConnection 
Watcher:10.200.312.80:1,10.200.312.81:2,10.200.312.82:3,10.200.312.83:4,10.200.312.84:5,10.200.312.85:6,10.200.312.86:7
 got event WatchedEvent state:Expired type:None path:null path:null type:None
2019-05-10 13:00:54.559 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.ConnectionManager Our previous ZooKeeper session was expired. 
Attempting to reconnect to recover relationship with ZooKeeper...
2019-05-10 13:00:54.559 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.Overseer Overseer 
(id=246176007594049637-localhost:8983_solr-n_48) closing
2019-05-10 13:00:54.621 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.DefaultConnectionStrategy Connection expired - starting a new one...
2019-05-10 13:00:54.691 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.ConnectionManager Waiting for client to connect to ZooKeeper
2019-05-10 13:00:54.691 ERROR (OverseerExitThread) [   ] o.a.s.c.Overseer could 
not read the data
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = 
Session expired for /overseer_elect/leader
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:348)
at org.apache.solr.common.cloud.SolrZkClient$7.execute(SolrZkClient.java:345)
at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)
at org.apache.solr.common.cloud.SolrZkClient.getData(SolrZkClient.java:345)
at 
org.apache.solr.cloud.Overseer$ClusterStateUpdater.checkIfIamStillLeader(Overseer.java:309)
at 
org.apache.solr.cloud.Overseer$ClusterStateUpdater.access$300(Overseer.java:89)
at org.apache.solr.cloud.Overseer$ClusterStateUpdater$2.run(Overseer.java:268)
2019-05-10 13:00:57.662 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr-EventThread) [   ] 
o.a.s.c.c.ConnectionManager Watcher 
org.apache.solr.common.cloud.ConnectionManager@3a0b9e39 
name:ZooKeeperConnection 
Watcher:10.200.312.80:1,10.200.312.81:2,10.200.312.82:3,10.200.312.83:4,10.200.312.84:5,10.200.312.85:6,10.200.312.86:7
 got event WatchedEvent state:SyncConnected type:None path:null path:null 
type:None
2019-05-10 13:00:57.666 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.ConnectionManager Client is connected to ZooKeeper
2019-05-10 13:00:57.668 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.ConnectionManager Connection with ZooKeeper reestablished.
2019-05-10 13:00:57.668 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.ZkController ZooKeeper session re-connected ... refreshing core states 
after session expiration.
2019-05-10 13:00:57.700 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.ZkStateReader Updating cluster state from ZooKeeper... 
2019-05-10 13:01:01.138 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.ZkStateReader Loaded empty cluster properties
2019-05-10 13:01:03.382 INFO  
(zkCallback-4-thread-632-processing-n:localhost:8983_solr) [   ] 
o.a.s.c.c.ZkStateReader Updated live nodes from ZooKeeper... (4) -> (3)
2019-05-10 13:01:03.400 INFO  

Re: Solr query takes a too much time in Solr 6.1.0

2019-05-13 Thread vishal patel
Thanks for the reply.

> Executing an identical query again will likely satisfy the query from Solr's 
> caches.  Solr won't need to talk to the actual index, and it will be REALLY 
> fast.  Even a massively complex query, if it is cached, will be fast.

All caches are disabled in our schema file because our indexing and 
searching rate is high in our live environment.


Sent from Outlook

From: Shawn Heisey 
Sent: Friday, May 10, 2019 9:32 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr query takes a too much time in Solr 6.1.0

On 5/10/2019 7:32 AM, vishal patel wrote:
> We have 2 shards and 2 replicas in Live environment. we have multiple 
> collections.
> Some times some query takes much time(QTime=52552).  There are so many 
> documents indexing and searching within milliseconds.

There could be any number of causes of slow performance.

A common reason is not having enough spare memory in the machine to
allow the operating system to cache the index data.  This is memory NOT
allocated by programs (including Solr).

Another common reason is that the heap size is too small, which causes
Java to frequently perform full garbage collections, which will REALLY
kill performance.

Since there's very little information here, it's difficult for us to
diagnose the cause.  Here's a wiki page about performance problems:

https://wiki.apache.org/solr/SolrPerformanceProblems

(Disclaimer:  I am the principal author of that page)

> When we executed the same query again using admin panel, it does not take a 
> much time and it completes within 20 milliseconds.

Executing an identical query again will likely satisfy the query from
Solr's caches.  Solr won't need to talk to the actual index, and it will
be REALLY fast.  Even a massively complex query, if it is cached, will
be fast.

Running the information from your logs through a URL decoder, this is
what I found:

q=+project_id:(2102117)+recipient_id:(4642365) +entity_type:(1)
-action_id:(20 32) +action_status:(0) +is_active:(true)
+(is_formtype_active:true) +(appType:1)

If all of those fields are indexed, then I would not expect a properly
sized server to be slow.  If any of those fields are indexed=false and
have docValues, then it could be a schema configuration issue.
Searching docValues does work, but it's really slow.

Your query does have an empty fq ... "fq=" ... I do not know whether
that's problematic.  Try it without that to verify.  I would not expect
it to cause problems, but I can't be sure.

Thanks,
Shawn


Re: Streaming Expression: get the value of the array at the specified position

2019-05-13 Thread Nazerke S
That was really helpful for my use case.

It should definitely be included in the documentation.

On Sat, May 11, 2019 at 8:19 PM Joel Bernstein  wrote:

> There actually is an undocumented function called valueAt. It works both
> for an array and for a matrix.
>
> For an array:
>
> let(echo="b", a=array(1,2,3,4,5), b=valueAt(a, 2))  should return 3.
>
> I have lots of documentation still to do.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
>
> On Fri, May 10, 2019 at 11:12 AM David Hastings <
> hastings.recurs...@gmail.com> wrote:
>
> > no.
> >
> > On Fri, May 10, 2019 at 11:09 AM Nazerke S 
> wrote:
> >
> > > Hi,
> > >
> > > I am interested in getting the value of the array at the given index.
> For
> > > example,
> > >
> > > let(echo="b", a=array(1,2,3,4,5), b=getAt(a, 2))  should return 3.
> > >
> > > Is there a way to get access an array's element by indexing?
> > >
> > > Thanks!
> > >
> > > __Nazerke
> > >
> >
>
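For reference, valueAt uses zero-based indexing — which is why index 2 of array(1,2,3,4,5) returns 3 in Joel's example. The equivalent lookup in Python:

```python
def value_at(arr, idx):
    """Zero-based element lookup, mirroring the streaming expression
    valueAt(a, 2) for a = array(1,2,3,4,5)."""
    return arr[idx]

a = [1, 2, 3, 4, 5]
print(value_at(a, 2))  # 3
```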


Invalid Date Math Strings silently fail in q, not in fq

2019-05-13 Thread Ronja Koistinen
Hello,

I encountered an issue where invalid dates throw an exception when they
are in an fq parameter but are silently dropped when in q. For example:

{
  "responseHeader":{
"zkConnected":true,
"status":0,
"QTime":4,
"params":{
  "q":"visible_date:[2019-01-01T00:00:00Z TO 2019-12-31T23:59:99Z]",
  "defType":"edismax",
  "df":"text",
  "fl":"visible_date",
  "wt":"json",
  "debugQuery":"on",
  "stopwords":"true"}},
  "response":{"numFound":0,"start":0,"docs":[]
  },
  "debug":{
"rawquerystring":"visible_date:[2019-01-01T00:00:00Z TO
2019-12-31T23:59:99Z]",
"querystring":"visible_date:[2019-01-01T00:00:00Z TO
2019-12-31T23:59:99Z]",
"parsedquery":"+()",
"parsedquery_toString":"+()",
"explain":{},
"QParser":"ExtendedDismaxQParser",
"altquerystring":null,
"boost_queries":null,
"parsed_boost_queries":[],
"boostfuncs":null,
--snip--

Above the parsedquery is empty because the timestamp
2019-12-31T23:59:99Z is invalid.

However:

{
  "responseHeader":{
"zkConnected":true,
"status":400,
"QTime":2,
"params":{
  "q":"*:*",
  "defType":"edismax",
  "df":"text",
  "fl":"visible_date",
  "fq":"visible_date:[\"2019-01-01T00:00:00Z\" TO
\"2019-12-31T23:59:99Z\"]",
  "debugQuery":"on",
  "stopwords":"true",
  "_":"1557734718206"}},
  "error":{
"metadata":[
  "error-class","org.apache.solr.common.SolrException",
  "root-error-class","java.time.format.DateTimeParseException"],
"msg":"Invalid Date in Date Math String:'2019-12-31T23:59:99Z'",
"code":400}}

Above the date range filter is in fq, and now a DateTimeParseException
is thrown.

Is it intended behaviour that in a q parameter date filters are silently
dropped on parsing errors while in an fq parameter an exception is
thrown? Or is this a bug?

I am running Solr version 7.7.1.
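Independent of the q-vs-fq question, the range itself can be corrected: seconds run 00-59, so 23:59:99 never parses. Two hedged ways to express the end of 2019 (using the visible_date field from the examples above):

```text
fq=visible_date:[2019-01-01T00:00:00Z TO 2019-12-31T23:59:59Z]

fq=visible_date:[2019-01-01T00:00:00Z TO 2020-01-01T00:00:00Z}
```

The second form uses an exclusive upper bound (`}`), which avoids having to spell out the last representable instant of the year.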

-- 
Ronja Koistinen
IT Specialist
University of Helsinki





Re: Solr 8.0.0 error: cannot change field from index options=DOCS to inconsistent index options=DOCS_AND_FREQS_AND_POSITIONS

2019-05-13 Thread Bjarke Buur Mortensen
OK, so the problem seems to come from
https://issues.apache.org/jira/browse/LUCENE-8134
Our field used to be type="string", but we have since changed it to a text
type to be able to use synonyms (see below).

So we'll still have some documents that were indexed as "string". Am I
right in assuming that we need to reindex in order to upgrade to 8.0.0?

Thanks,
Bjarke

[the fieldType definition was stripped by the mailing-list archive; it was a text type with a synonym filter]
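Bjarke's schema snippet was stripped by the archive. A hedged sketch of what such a synonym-enabled text type typically looks like (the type name, analyzer chain, and synonyms.txt are assumptions, not his actual config):

```xml
<!-- Sketch of a text type with query-time synonyms; details are assumed -->
<fieldType name="text_synonyms" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms.txt" ignoreCase="true"/>
  </analyzer>
</fieldType>

<field name="delivery_place_code" type="text_synonyms" indexed="true" stored="true"/>
```

A TextField like this indexes DOCS_AND_FREQS_AND_POSITIONS by default, while the old StrField segments carry only DOCS, which matches the error message; so yes, a full reindex is needed before upgrading to 8.0.0.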

Den fre. 10. maj 2019 kl. 22.38 skrev Erick Erickson <
erickerick...@gmail.com>:

> I suspect that perhaps some defaults have changed? So I’d try changing the
> definition in the schema for that field. These changes should be pointed
> out in the upgrade notes in Lucene or Solr CHANGES.txt.
>
> Best,
> Erick
>
> > On May 10, 2019, at 1:17 AM, Bjarke Buur Mortensen <
> morten...@eluence.com> wrote:
> >
> > Hi list,
> >
> > I'm trying to open a 7.x core in Solr 8.
> > I'm getting the error:
> >
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> > Error opening new searcher
> >
> > Digging further in the logs, I see the error:
> > "
> > ...
> > Caused by: java.lang.IllegalArgumentException: cannot change field
> > "delivery_place_code" from index options=DOCS to inconsistent index
> > options=DOCS_AND_FREQS_AND_POSITIONS
> > ...
> > "
> >
> > Is this a known issue when upgrading to 8.0.0?
> > Can I do anything to avoid it?
> >
> > Thanks,
> > Bjarke
>
>


Re: Solr query takes too much time in Solr 6.1.0

2019-05-13 Thread Bernd Fehling

Your "sort" parameter has "sort=id+desc,id+desc".
1. It doesn't make sense to have a sort on "id" in descending order twice.
2. Be aware that the id field has the highest cardinality.
3. To speed up sorting, use a separate field with docValues=true.
   E.g. [the example field definition was stripped by the mailing-list archive]
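A hedged sketch of what such a docValues-backed sort field could look like (the field name id_sort and the copyField are assumptions):

```xml
<!-- Sketch: a field used only for sorting, backed by docValues; names are assumed -->
<field name="id_sort" type="string" indexed="false" stored="false" docValues="true"/>
<copyField source="id" dest="id_sort"/>
```

The query would then use sort=id_sort+desc, letting Solr sort from the docValues structure instead of un-inverting the indexed id field at query time.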
Regards
Bernd


On 10.05.19 at 15:32, vishal patel wrote:

We have 2 shards and 2 replicas in the live environment, and we have multiple 
collections.
Sometimes a query takes a very long time (QTime=52552), even though many other 
documents are being indexed and searched within milliseconds at the same time.
When we execute the same query again from the admin panel, it does not take 
much time and completes within 20 milliseconds.

My Solr Logs :
2019-05-10 09:48:56.744 INFO  (qtp1239731077-128223) [c:actionscomments s:shard1 r:core_node1 
x:actionscomments] o.a.s.c.S.Request [actionscomments]  webapp=/solr path=/select 
params={q=%2Bproject_id:(2102117)%2Brecipient_id:(4642365)+%2Bentity_type:(1)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2B(is_formtype_active:true)+%2B(appType:1)=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
 hits=198 status=0 QTime=52552
2019-05-10 09:48:56.744 INFO  (qtp1239731077-127998) [c:actionscomments s:shard1 r:core_node1 
x:actionscomments] o.a.s.c.S.Request [actionscomments]  webapp=/solr path=/select 
params={q=%2Bproject_id:(2102117)%2Brecipient_id:(4642365)+%2Bentity_type:(1)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2Bdue_date:[2019-05-09T19:30:00Z+TO+2019-05-09T19:30:00Z%2B1DAY]+%2B(is_formtype_active:true)+%2B(appType:1)=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
 hits=0 status=0 QTime=51970
2019-05-10 09:48:56.746 INFO  (qtp1239731077-128224) [c:actionscomments s:shard1 r:core_node1 
x:actionscomments] o.a.s.c.S.Request [actionscomments]  webapp=/solr path=/select 
params={q=%2Bproject_id:(2121600+2115171+2104206)%2Brecipient_id:(2834330)+%2Bentity_type:(2)+-action_id:(20+32)+%2Baction_status:(0)+%2Bis_active:(true)+%2Bdue_date:[2019-05-10T00:00:00Z+TO+2019-05-10T00:00:00Z%2B1DAY]=s1.example.com:8983/solr/actionscomments|s1r1.example.com:8983/solr/actionscomments,s2.example.com:8983/solr/actionscomments|s2r1.example.com:8983/solr/actionscomments=off=true=id=0=id+desc,id+desc==1}
 hits=98 status=0 QTime=51402


My schema fields are below:
[the field definitions were stripped by the mailing-list archive]

What could be the problem here? Why does the query take so much time on those occasions?

Sent from Outlook