Re: Solr queries slow down over time

2020-09-25 Thread Goutham Tholpadi
Hi Mark, Thanks for confirming Dwane's advice from your own experience. I
will shift to a streaming expressions implementation.

Best
Goutham

On Fri, Sep 25, 2020 at 7:03 PM Mark H. Wood  wrote:

> On Fri, Sep 25, 2020 at 11:49:22AM +0530, Goutham Tholpadi wrote:
> > I have around 30M documents in Solr, and I am doing repeated *:* queries
> > with rows=1, and changing start to 0, 1, 2, and so on, in a
> > loop in my script (using pysolr).
> >
> > At the start of the iteration, the calls to Solr were taking less than 1
> > sec each. After running for a few hours (with start at around 27M) I
> found
> > that each call was taking around 30-60 secs.
> >
> > Any pointers on why the same fetch of 1 records takes much longer
> now?
> > Does Solr need to load all the 27M before getting the last 1 records?
>
> I and many others have run into the same issue.  Yes, each windowed
> query starts fresh, having to find at least enough records to satisfy
> the query, walking the list to discard the first 'start' worth of
> them, and then returning the next 'rows' worth.  So as 'start' increases,
> the work required of Solr increases and the response time lengthens.
>
> > Is there a better way to do this operation using Solr?
>
> Another answer in this thread gives links to resources for addressing
> the problem, and I can't improve on those.
>
> I can say that when I switched from start= windowing to cursormark, I
> got a very nice improvement in overall speed and did not see the
> progressive slowing anymore.  A query loop that ran for *days* now
> completes in under five minutes.  In some way that I haven't quite
> figured out, a cursormark tells Solr where in the overall document
> sequence to start working.
>
> So yes, there *is* a better way.
>
> --
> Mark H. Wood
> Lead Technology Analyst
>
> University Library
> Indiana University - Purdue University Indianapolis
> 755 W. Michigan Street
> Indianapolis, IN 46202
> 317-274-0749
> www.ulib.iupui.edu
>


Re: Solr queries slow down over time

2020-09-25 Thread Goutham Tholpadi
Thanks a ton, Dwane. I went through the article and the documentation link.
This corresponds exactly to my use case.

Best
Goutham

On Fri, Sep 25, 2020 at 2:59 PM Dwane Hall  wrote:

> Goutham, I suggest you read Hossman's excellent article on deep paging and
> why returning rows=(some large number) is a bad idea. It provides a
> thorough overview of the concept and will explain it better than I ever
> could (
> https://lucidworks.com/post/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/#update_2013_12_18).
> In short if you want to extract that many documents out of your corpus use
> cursor mark, streaming expressions, or Solr's parallel SQL interface (that
> uses streaming expressions under the hood)
> https://lucene.apache.org/solr/guide/8_6/streaming-expressions.html.
>
> Thanks,
>
> Dwane
> --
> *From:* Goutham Tholpadi 
> *Sent:* Friday, 25 September 2020 4:19 PM
> *To:* solr-user@lucene.apache.org 
> *Subject:* Solr queries slow down over time
>
> Hi,
>
> I have around 30M documents in Solr, and I am doing repeated *:* queries
> with rows=1, and changing start to 0, 1, 2, and so on, in a
> loop in my script (using pysolr).
>
> At the start of the iteration, the calls to Solr were taking less than 1
> sec each. After running for a few hours (with start at around 27M) I found
> that each call was taking around 30-60 secs.
>
> Any pointers on why the same fetch of 1 records takes much longer now?
> Does Solr need to load all the 27M before getting the last 1 records?
> Is there a better way to do this operation using Solr?
>
> Thanks!
> Goutham
>


Re: solr performance with >1 NUMAs

2020-09-25 Thread Shawn Heisey

On 9/23/2020 7:42 PM, Wei wrote:

Recently we deployed solr 8.4.1 on a batch of new servers with 2 NUMAs. I
noticed that query latency almost doubled compared to deployment on single
NUMA machines. Not sure what's causing the huge difference. Is there any
tuning to boost the performance on multiple NUMA machines? Any pointer is
appreciated.


If you're running with standard options, Solr 8.4.1 will start using the 
G1 garbage collector.


As of Java 14, G1 has gained the ability to use the -XX:+UseNUMA option, 
which makes better decisions about memory allocations and multiple 
NUMAs.  If you're running a new enough Java, it would probably be 
beneficial to add this to the garbage collector options.  Solr itself is 
unaware of things like NUMA -- Java must handle that.
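
The version check described above can be sketched as follows. This is only an illustration in Python: the helper names are hypothetical, and the rule that G1 honours -XX:+UseNUMA from Java 14 onward is taken from this message and JEP 345, not verified against any particular JVM build. The resulting string is the kind of value one might assign to SOLR_OPTS in solr.in.sh.

```python
from typing import List


def numa_jvm_flags(java_major: int, collector: str = "G1") -> List[str]:
    """Return extra JVM flags for a host with more than one NUMA node.

    Assumption (from the message above): G1 supports -XX:+UseNUMA
    starting with Java 14. Other collectors are left untouched here.
    """
    if collector == "G1" and java_major >= 14:
        return ["-XX:+UseNUMA"]
    return []


def append_to_solr_opts(existing: str, flags: List[str]) -> str:
    """Join existing options and new flags into one option string."""
    return " ".join([existing, *flags]).strip()
```

On an older Java the helper returns no flags, so the option string is left unchanged.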


https://openjdk.java.net/jeps/345

Thanks,
Shawn


Re: solr performance with >1 NUMAs

2020-09-25 Thread Wei
Thanks Dominique. I'll start with the -XX:+UseNUMA option.

Best,
Wei

On Fri, Sep 25, 2020 at 7:04 AM Dominique Bejean 
wrote:

> Hi,
>
> This would be a Java VM option, not something Solr itself can know about.
> Take a look at the comments on this article. Maybe it will help.
>
> https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html?showComment=1347033706559#c229885263664926125
>
> Regards
>
> Dominique
>
>
>
> Le jeu. 24 sept. 2020 à 03:42, Wei  a écrit :
>
> > Hi,
> >
> > Recently we deployed solr 8.4.1 on a batch of new servers with 2 NUMAs. I
> > noticed that query latency almost doubled compared to deployment on
> single
> > NUMA machines. Not sure what's causing the huge difference. Is there any
> > tuning to boost the performance on multiple NUMA machines? Any pointer is
> > appreciated.
> >
> > Best,
> > Wei
> >
>


Re: Solr 8.6.2 UI issue

2020-09-25 Thread Alexandre Rafalovitch
Sounds strange. If you had Solr installed previously, it could be
cached Javascript. Force-reload or try doing it in an anonymous
window.

Also try starting with an example (bin/solr start -e techproducts, or
bin\solr.cmd start -e techproducts on Windows).

Finally, if you are up to it, see if there are any serious errors in
the Browser's developer console's log.

If all else fails, try an earlier version of Solr, just to check
whether it could be something about the latest version (unlikely).

Regards,
   Alex

On Fri, 25 Sep 2020 at 18:43, Manisha Rahatadkar
 wrote:
>
> Hello All
>
> I downloaded 8.6.2 and running it on windows 10 machine. Solr starts on 8983 
> port but whenever I click on any menu like Logging, Core Admin, Query it 
> always shows only dashboard screen.
> Has anyone experienced this issue?
>
>
> Regards
> Manisha Rahatadkar
>
> Confidentiality Notice
> 
> This email message, including any attachments, is for the sole use of the 
> intended recipient and may contain confidential and privileged information. 
> Any unauthorized view, use, disclosure or distribution is prohibited. If you 
> are not the intended recipient, please contact the sender by reply email and 
> destroy all copies of the original message. Anju Software, Inc. 4500 S. 
> Lakeshore Drive, Suite 620, Tempe, AZ USA 85282.


Solr 8.6.2 UI issue

2020-09-25 Thread Manisha Rahatadkar
Hello All

I downloaded 8.6.2 and am running it on a Windows 10 machine. Solr starts on
port 8983, but whenever I click on any menu such as Logging, Core Admin, or
Query, it always shows only the dashboard screen.
Has anyone experienced this issue?


Regards
Manisha Rahatadkar



SOLR Cursor Pagination Issue

2020-09-25 Thread vmakovsky

Good afternoon,
Could you please suggest a solution? While data is being updated in
SolrCloud, requests with a cursor mark return incorrect data. I suspect that
consecutive result pages are inconsistent with each other during indexing
because the data does not have enough time to replicate between the nodes.

Kind regards,
Vladislav Makovski
Developer
XB Software Ltd. | Minsk, Belarus
Site: https://xbsoftware.com
Skype: vlad__makovski
Cell:  +37529 6484100


Re: Delete from Solr console fails

2020-09-25 Thread Dominique Bejean
Hi Goutham,

I agree with Rahul: avoid large delete-by-query operations.
If you can, prefer running one query to get all the ids first, then use those
ids with delete by id.
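
A minimal sketch of the "ids first, then delete by id" approach, in Python. It only builds the JSON bodies that could be POSTed to the core's /update handler; the function name and the batch size are hypothetical choices, and collecting the ids (for example with a cursor-based query) is left out.

```python
import json
from typing import Iterable, Iterator, List


def delete_by_id_payloads(ids: Iterable[str], batch_size: int = 1000) -> Iterator[str]:
    """Yield JSON update bodies of the form {"delete": [id, id, ...]}.

    Each body deletes one batch of documents by unique key, so no single
    request has to carry (or resolve) the whole set at once.
    """
    batch: List[str] = []
    for doc_id in ids:
        batch.append(doc_id)
        if len(batch) == batch_size:
            yield json.dumps({"delete": batch})
            batch = []
    if batch:  # final, possibly smaller batch
        yield json.dumps({"delete": batch})
```

Each yielded string would be sent with Content-Type application/json, followed by a commit once all batches are done.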

Regards

Dominique


Le ven. 25 sept. 2020 à 06:50, Goutham Tholpadi  a
écrit :

> I spoke too soon. I am getting the "Connection lost" error again.
>
> I have never faced this problem when there are a small number of docs in
> the index. I was wondering if the size of the index (30M docs) has anything
> to do with this.
>
> Thanks
> Goutham
>
> On Fri, Sep 25, 2020 at 9:55 AM Goutham Tholpadi 
> wrote:
>
> > Thanks for your response Rahul!
> >
> > Yes, all the fields I tried with were indexed=true, but it did not work.
> >
> > Btw, when I try today, I am no longer getting the "Connection lost"
> > error. The delete command returns with status=success, however the
> document
> > is not actually deleted when I check in the search console again.
> >
> > I tried using Document Type as XML just now and I see the same behaviour
> > as above.
> >
> > Thanks
> > Goutham
> >
> > On Fri, Sep 25, 2020 at 7:17 AM Rahul Goswami 
> > wrote:
> >
> >> Goutham,
> >> Is the field you are trying to delete by indexed=true in the schema ?
> >> If the uniqueKey is indexed=true, does delete by id work for you?
> >> ( uniqueKey:value)
> >> Also, instead of  "Solr Command" if you choose the Document type as
> "XML"
> >> does it make any difference?
> >>
> >> Rahul
> >>
> >> On Thu, Sep 24, 2020 at 1:04 PM Goutham Tholpadi 
> >> wrote:
> >>
> >> > Hi,
> >> >
> >> > Setup:
> >> > We have a stand-alone Solr (v7.2) with around 30 million documents and
> >> with
> >> > 4 cores, 38G of RAM, and a 1TB disk. The documents were not directly
> >> > indexed but came from a restore of a backup from another Solr instance.
> >> >
> >> > Problem:
> >> > Search queries seem to be working fine. However, when I try to delete
> >> > documents from the Solr console, I get a "Connection to Solr lost"
> >> error. I
> >> > am trying by navigating to the "Documents" section of the chosen core,
> >> > using "Solr Command" as the "Document Type", and entering something like
> >> > this in the box below:
> >> > <delete>
> >> > <query>
> >> > field:value
> >> > </query>
> >> > </delete>
> >> >
> >> > I tried with the field being the unique key, and otherwise. I also
> tried
> >> > with values containing wild cards. I got the error in all cases.
> >> >
> >> > Any pointers on this?
> >> >
> >> > Thanks
> >> > Goutham
> >> >
> >>
> >
>


Re: solr performance with >1 NUMAs

2020-09-25 Thread Dominique Bejean
Hi,

This would be a Java VM option, not something Solr itself can know about.
Take a look at the comments on this article. Maybe it will help.
https://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html?showComment=1347033706559#c229885263664926125

Regards

Dominique



Le jeu. 24 sept. 2020 à 03:42, Wei  a écrit :

> Hi,
>
> Recently we deployed solr 8.4.1 on a batch of new servers with 2 NUMAs. I
> noticed that query latency almost doubled compared to deployment on single
> NUMA machines. Not sure what's causing the huge difference. Is there any
> tuning to boost the performance on multiple NUMA machines? Any pointer is
> appreciated.
>
> Best,
> Wei
>


Re: Solr 8.6.2 text_general

2020-09-25 Thread Erick Erickson
Uhhh, this is really dangerous. If you’ve indexed documents 
since upgrading, some were indexed with multiValued=false. Now
you’ve changed the definition at a fundamental Lucene level and
Things Can Go Wrong. 

You’re OK if (and only if) you have indexed _no_ documents since
you upgraded.

But even in that case, there may be other fields with different
multiValued values. And if you sort, group, or facet on them
when some have been indexed one way and some others, you’ll
get errors.

I strongly urge you to re-index all your data into a new collection
and, perhaps, use collection aliasing to seamlessly switch.
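
For the aliasing step, a small Python sketch of the Collections API request that points an alias at the freshly re-indexed collection. The base URL, alias name, and collection name are placeholders, and aliases assume a SolrCloud deployment.

```python
from urllib.parse import urlencode


def create_alias_url(solr_base: str, alias: str, collection: str) -> str:
    """Build a CREATEALIAS request URL for the Collections API.

    Calling CREATEALIAS again with the same alias name and a different
    collection repoints the alias, which is what makes the switch seamless
    for clients that only ever query the alias.
    """
    params = urlencode(
        {"action": "CREATEALIAS", "name": alias, "collections": collection}
    )
    return f"{solr_base}/admin/collections?{params}"
```

Clients would then query the alias (say "products") while the data actually lives in the new collection (say "products_v2").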

Best,
Erick

> On Sep 25, 2020, at 8:50 AM, Anuj Bhargava  wrote:
> 
> It worked. I just added multiValued="true".
> 
>multiValued="true"/>
>   
>multiValued="true"/>
>   
> 
> Thanks for all your help.
> 
> Regards,
> 
> Anuj
> 
> On Fri, 25 Sep 2020 at 18:08, Alexandre Rafalovitch 
> wrote:
> 
>> Ok, something is definitely not right. In those cases, I suggest
>> checking backwards from hard reality. Just in case the file you are
>> looking at is NOT the one that is actually used when collection is
>> actually setup. Happened to me more times than I can count.
>> 
>> Point your Admin UI to the collection you are having issues and check
>> the schema definitions there (either in Files or even in Schema
>> screen). I still think your multiValued definition changed somewhere.
>> 
>> Regards,
>>  Alex.
>> 
>> On Fri, 25 Sep 2020 at 03:57, Anuj Bhargava  wrote:
>>> 
>>> Schema on both are the same
>>> 
>>>   > stored="true"/>
>>>   
>>>   > stored="true"/>
>>>   
>>> 
>>> Regards,
>>> 
>>> Anuj
>>> 
>>> On Thu, 24 Sep 2020 at 18:58, Alexandre Rafalovitch 
>>> wrote:
>>> 
>>>> These are field definitions for _text_ and text, your original
>>>> question was about the fields named "country"/"currency" and whatever
>>>> type they mapped to.
>>>>
>>>> Your text/_text_ field is not actually returned to the browser,
>>>> because it is "stored=false", so it is most likely a catch-all
>>>> copyField destination. You may be searching against it, but you are
>>>> returning other (original) fields.
>>>>
>>>> Regards,
>>>>   Alex.
>>>>
>>>> On Thu, 24 Sep 2020 at 09:23, Anuj Bhargava wrote:
>>>>>
>>>>> In both it is the same
>>>>>
>>>>> In Solr 8.0.0
>>>>> indexed="true"
>>>>> stored="false"/>
>>>>>
>>>>> In Solr 8.6.2
>>>>> multiValued="true"/>
>>>>>
>>>>> On Thu, 24 Sep 2020 at 18:33, Alexandre Rafalovitch <arafa...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> I think that means your field went from multiValued to singleValued.
>>>>>> Double check your schema. Remember that multiValued flag can be set
>>>>>> both on the field itself and on its fieldType.
>>>>>>
>>>>>> Regards,
>>>>>>   Alex
>>>>>> P.s. However if your field is supposed to be single-valued, maybe you
>>>>>> should treat it as a feature not a bug. Multivalued fields have some
>>>>>> restrictions that single-valued fields do not have (around sorting for
>>>>>> example).
>>>>>>
>>>>>> On Thu, 24 Sep 2020 at 03:09, Anuj Bhargava wrote:
>>>>>>>
>>>>>>> In solr 8.0.0 when running the query the data (type="text_general") was
>>>>>>> shown in brackets *[ ]*
>>>>>>> "country":*[*"IN"*]*,
>>>>>>> "currency":*[*"INR"*]*,
>>>>>>> "date_c":"2020-08-23T18:30:00Z",
>>>>>>> "est_cost":0,
>>>>>>>
>>>>>>> However, in solr 8.6.2 the query the data (type="text_general") is not
>>>>>>> showing in brackets [ ]
>>>>>>> "country":"IN",
>>>>>>> "currency":"INR",
>>>>>>> "date_c":"2020-08-23T18:30:00Z",
>>>>>>> "est_cost":0,
>>>>>>>
>>>>>>>
>>>>>>> How to get the query results to show brackets in Solr 8.6.2
>>>>>>
>>>>
>> 



Re: Solr queries slow down over time

2020-09-25 Thread Mark H. Wood
On Fri, Sep 25, 2020 at 11:49:22AM +0530, Goutham Tholpadi wrote:
> I have around 30M documents in Solr, and I am doing repeated *:* queries
> with rows=1, and changing start to 0, 1, 2, and so on, in a
> loop in my script (using pysolr).
> 
> At the start of the iteration, the calls to Solr were taking less than 1
> sec each. After running for a few hours (with start at around 27M) I found
> that each call was taking around 30-60 secs.
> 
> Any pointers on why the same fetch of 1 records takes much longer now?
> Does Solr need to load all the 27M before getting the last 1 records?

I and many others have run into the same issue.  Yes, each windowed
query starts fresh, having to find at least enough records to satisfy
the query, walking the list to discard the first 'start' worth of
them, and then returning the next 'rows' worth.  So as 'start' increases,
the work required of Solr increases and the response time lengthens.
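
To put rough numbers on that, here is a deliberately simplified cost model in Python. It assumes, per the explanation above, that a start-based request has to collect start + rows candidates before returning the last 'rows' of them, while a cursor-based request collects only about 'rows'. Real costs depend on sorting, caching, and sharding, so treat the figures as orders of magnitude only.

```python
def offset_paging_cost(total_docs: int, rows: int) -> int:
    """Documents collected across all pages when paging with start=."""
    return sum(start + rows for start in range(0, total_docs, rows))


def cursor_paging_cost(total_docs: int, rows: int) -> int:
    """Documents collected across all pages when each page costs ~rows."""
    pages = -(-total_docs // rows)  # ceiling division
    return pages * rows
```

With total_docs=10 and rows=2 the model gives 30 versus 10; the gap grows roughly quadratically with the number of pages, which matches the progressive slowdown reported in this thread.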

> Is there a better way to do this operation using Solr?

Another answer in this thread gives links to resources for addressing
the problem, and I can't improve on those.

I can say that when I switched from start= windowing to cursormark, I
got a very nice improvement in overall speed and did not see the
progressive slowing anymore.  A query loop that ran for *days* now
completes in under five minutes.  In some way that I haven't quite
figured out, a cursormark tells Solr where in the overall document
sequence to start working.

So yes, there *is* a better way.
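
A minimal sketch of such a cursor loop in Python. To keep it self-contained it takes a fetch_page callable rather than talking to Solr directly; that callable and its signature are assumptions for illustration. With pysolr, fetch_page might wrap something like solr.search('*:*', sort='id asc', rows=1000, cursorMark=cursor) and return (results.docs, results.nextCursorMark). The sort must include the uniqueKey field as a tie-breaker.

```python
from typing import Callable, Iterator, List, Tuple

# A page is (documents, next cursor mark).
Page = Tuple[List[dict], str]


def iterate_with_cursor(fetch_page: Callable[[str], Page]) -> Iterator[dict]:
    """Yield every document, following cursor marks until they stop changing."""
    cursor = "*"  # "*" asks Solr to start from the beginning
    while True:
        docs, next_cursor = fetch_page(cursor)
        yield from docs
        if next_cursor == cursor:
            # Solr returns the same mark once the result set is exhausted.
            break
        cursor = next_cursor
```

Because each request carries the mark from the previous page, Solr can resume after the last returned sort value instead of re-collecting everything before 'start'.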

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu




Re: Solr 8.6.2 text_general

2020-09-25 Thread Anuj Bhargava
It worked. I just added multiValued="true".

   
   
   
   

Thanks for all your help.

Regards,

Anuj

On Fri, 25 Sep 2020 at 18:08, Alexandre Rafalovitch 
wrote:

> Ok, something is definitely not right. In those cases, I suggest
> checking backwards from hard reality. Just in case the file you are
> looking at is NOT the one that is actually used when collection is
> actually setup. Happened to me more times than I can count.
>
> Point your Admin UI to the collection you are having issues and check
> the schema definitions there (either in Files or even in Schema
> screen). I still think your multiValued definition changed somewhere.
>
> Regards,
>   Alex.
>
> On Fri, 25 Sep 2020 at 03:57, Anuj Bhargava  wrote:
> >
> > Schema on both are the same
> >
> > stored="true"/>
> >
> > stored="true"/>
> >
> >
> > Regards,
> >
> > Anuj
> >
> > On Thu, 24 Sep 2020 at 18:58, Alexandre Rafalovitch 
> > wrote:
> >
> > > These are field definitions for _text_ and text, your original
> > > question was about the fields named "country"/"currency" and whatever
> > > type they mapped to.
> > >
> > > Your text/_text_ field is not actually returned to the browser,
> > > because it is "stored=false", so it is most likely a catch-all
> > > copyField destination. You may be searching against it, but you are
> > > returning other (original) fields.
> > >
> > > Regards,
> > >Alex.
> > >
> > > On Thu, 24 Sep 2020 at 09:23, Anuj Bhargava 
> wrote:
> > > >
> > > > In both it is the same
> > > >
> > > > In Solr 8.0.0
> > > >  > > indexed="true"
> > > > stored="false"/>
> > > >
> > > > In Solr 8.6.2
> > > >  > > > multiValued="true"/>
> > > >
> > > > On Thu, 24 Sep 2020 at 18:33, Alexandre Rafalovitch <
> arafa...@gmail.com>
> > > > wrote:
> > > >
> > > > > I think that means your field went from multiValued to
> singleValued.
> > > > > Double check your schema. Remember that multiValued flag can be set
> > > > > both on the field itself and on its fieldType.
> > > > >
> > > > > Regards,
> > > > >Alex
> > > > > P.s. However if your field is supposed to be single-valued, maybe
> you
> > > > > should treat it as a feature not a bug. Multivalued fields have
> some
> > > > > restrictions that single-valued fields do not have (around sorting
> for
> > > > > example).
> > > > >
> > > > > On Thu, 24 Sep 2020 at 03:09, Anuj Bhargava 
> > > wrote:
> > > > > >
> > > > > > In solr 8.0.0 when running the query the data
> (type="text_general")
> > > was
> > > > > > shown in brackets *[ ]*
> > > > > > "country":*[*"IN"*]*,
> > > > > > "currency":*[*"INR"*]*,
> > > > > > "date_c":"2020-08-23T18:30:00Z",
> > > > > > "est_cost":0,
> > > > > >
> > > > > > However, in solr 8.6.2 the query the data (type="text_general")
> is
> > > not
> > > > > > showing in brackets [ ]
> > > > > > "country":"IN",
> > > > > > "currency":"INR",
> > > > > > "date_c":"2020-08-23T18:30:00Z",
> > > > > > "est_cost":0,
> > > > > >
> > > > > >
> > > > > > How to get the query results to show brackets in Solr 8.6.2
> > > > >
> > >
>


Re: Solr 8.6.2 text_general

2020-09-25 Thread Alexandre Rafalovitch
Ok, something is definitely not right. In those cases, I suggest
checking backwards from hard reality, just in case the file you are
looking at is NOT the one that is actually used when the collection is
actually set up. Happened to me more times than I can count.

Point your Admin UI to the collection you are having issues with and check
the schema definitions there (either in Files or even in the Schema
screen). I still think your multiValued definition changed somewhere.
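
The same check can be scripted against the Schema API instead of the Admin UI. The Python sketch below only builds the lookup URL and interprets a response body; the base URL, collection, and field names are placeholders. The showDefaults=true parameter is included so that settings inherited from the fieldType appear in the response.

```python
import json


def field_lookup_url(solr_base: str, collection: str, field: str) -> str:
    """URL of the Schema API entry for one field, including defaults."""
    return f"{solr_base}/{collection}/schema/fields/{field}?showDefaults=true"


def is_multivalued(schema_response_json: str) -> bool:
    """Read the multiValued flag from a Schema API field response.

    A missing flag is treated as single-valued in this sketch.
    """
    field = json.loads(schema_response_json)["field"]
    return bool(field.get("multiValued", False))
```

Running this against both the old and the new installation for "country" and "currency" would show whether the live definitions really match.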

Regards,
  Alex.

On Fri, 25 Sep 2020 at 03:57, Anuj Bhargava  wrote:
>
> Schema on both are the same
>
>
>
>
>
>
> Regards,
>
> Anuj
>
> On Thu, 24 Sep 2020 at 18:58, Alexandre Rafalovitch 
> wrote:
>
> > These are field definitions for _text_ and text, your original
> > question was about the fields named "country"/"currency" and whatever
> > type they mapped to.
> >
> > Your text/_text_ field is not actually returned to the browser,
> > because it is "stored=false", so it is most likely a catch-all
> > copyField destination. You may be searching against it, but you are
> > returning other (original) fields.
> >
> > Regards,
> >Alex.
> >
> > On Thu, 24 Sep 2020 at 09:23, Anuj Bhargava  wrote:
> > >
> > > In both it is the same
> > >
> > > In Solr 8.0.0
> > >  > indexed="true"
> > > stored="false"/>
> > >
> > > In Solr 8.6.2
> > >  > > multiValued="true"/>
> > >
> > > On Thu, 24 Sep 2020 at 18:33, Alexandre Rafalovitch 
> > > wrote:
> > >
> > > > I think that means your field went from multiValued to singleValued.
> > > > Double check your schema. Remember that multiValued flag can be set
> > > > both on the field itself and on its fieldType.
> > > >
> > > > Regards,
> > > >Alex
> > > > P.s. However if your field is supposed to be single-valued, maybe you
> > > > should treat it as a feature not a bug. Multivalued fields have some
> > > > restrictions that single-valued fields do not have (around sorting for
> > > > example).
> > > >
> > > > On Thu, 24 Sep 2020 at 03:09, Anuj Bhargava 
> > wrote:
> > > > >
> > > > > In solr 8.0.0 when running the query the data (type="text_general")
> > was
> > > > > shown in brackets *[ ]*
> > > > > "country":*[*"IN"*]*,
> > > > > "currency":*[*"INR"*]*,
> > > > > "date_c":"2020-08-23T18:30:00Z",
> > > > > "est_cost":0,
> > > > >
> > > > > However, in solr 8.6.2 the query the data (type="text_general") is
> > not
> > > > > showing in brackets [ ]
> > > > > "country":"IN",
> > > > > "currency":"INR",
> > > > > "date_c":"2020-08-23T18:30:00Z",
> > > > > "est_cost":0,
> > > > >
> > > > >
> > > > > How to get the query results to show brackets in Solr 8.6.2
> > > >
> >


Re: Any blog or url that explain step by step configure grafana dashboard to monitor solr metrics

2020-09-25 Thread Emir Arnautović
Hi,
In case you decide to go with a cloud solution, you can check how to monitor
Solr with Sematext:
https://sematext.com/blog/solr-monitoring-made-easy-with-sematext/ 


Regards,
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



> On 24 Sep 2020, at 04:55, yaswanth kumar  wrote:
> 
> Can some one post here any blogs or url where I can get the detailed steps 
> involved in configuring grafana dashboard for monitoring solr metrics??
> 
> Sent from my iPhone



Re: Solr queries slow down over time

2020-09-25 Thread Dwane Hall
Goutham, I suggest you read Hossman's excellent article on deep paging and why 
returning rows=(some large number) is a bad idea. It provides a thorough 
overview of the concept and will explain it better than I ever could 
(https://lucidworks.com/post/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/#update_2013_12_18).
 In short if you want to extract that many documents out of your corpus use 
cursor mark, streaming expressions, or Solr's parallel SQL interface (that uses 
streaming expressions under the hood)
https://lucene.apache.org/solr/guide/8_6/streaming-expressions.html.
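
As a rough illustration of the streaming-expressions route, the Python sketch below builds a search() expression that reads from the /export handler. The helper name, collection, and field names are placeholders. The expression would be sent as the expr parameter to the collection's /stream handler, and /export requires docValues on every field in fl and sort.

```python
from typing import List


def export_search_expression(collection: str, fields: List[str], sort: str) -> str:
    """Build a streaming search() expression over the full result set."""
    fl = ",".join(fields)
    return (
        f'search({collection}, q="*:*", fl="{fl}", '
        f'sort="{sort}", qt="/export")'
    )
```

For example, export_search_expression("docs", ["id", "title"], "id asc") produces an expression that streams every document's id and title in id order.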

Thanks,

Dwane

From: Goutham Tholpadi 
Sent: Friday, 25 September 2020 4:19 PM
To: solr-user@lucene.apache.org 
Subject: Solr queries slow down over time

Hi,

I have around 30M documents in Solr, and I am doing repeated *:* queries
with rows=1, and changing start to 0, 1, 2, and so on, in a
loop in my script (using pysolr).

At the start of the iteration, the calls to Solr were taking less than 1
sec each. After running for a few hours (with start at around 27M) I found
that each call was taking around 30-60 secs.

Any pointers on why the same fetch of 1 records takes much longer now?
Does Solr need to load all the 27M before getting the last 1 records?
Is there a better way to do this operation using Solr?

Thanks!
Goutham


Re: Solr 8.6.2 text_general

2020-09-25 Thread Anuj Bhargava
Schema on both are the same

   
   
   
   

Regards,

Anuj

On Thu, 24 Sep 2020 at 18:58, Alexandre Rafalovitch 
wrote:

> These are field definitions for _text_ and text, your original
> question was about the fields named "country"/"currency" and whatever
> type they mapped to.
>
> Your text/_text_ field is not actually returned to the browser,
> because it is "stored=false", so it is most likely a catch-all
> copyField destination. You may be searching against it, but you are
> returning other (original) fields.
>
> Regards,
>Alex.
>
> On Thu, 24 Sep 2020 at 09:23, Anuj Bhargava  wrote:
> >
> > In both it is the same
> >
> > In Solr 8.0.0
> >  indexed="true"
> > stored="false"/>
> >
> > In Solr 8.6.2
> >  > multiValued="true"/>
> >
> > On Thu, 24 Sep 2020 at 18:33, Alexandre Rafalovitch 
> > wrote:
> >
> > > I think that means your field went from multiValued to singleValued.
> > > Double check your schema. Remember that multiValued flag can be set
> > > both on the field itself and on its fieldType.
> > >
> > > Regards,
> > >Alex
> > > P.s. However if your field is supposed to be single-valued, maybe you
> > > should treat it as a feature not a bug. Multivalued fields have some
> > > restrictions that single-valued fields do not have (around sorting for
> > > example).
> > >
> > > On Thu, 24 Sep 2020 at 03:09, Anuj Bhargava 
> wrote:
> > > >
> > > > In solr 8.0.0 when running the query the data (type="text_general")
> was
> > > > shown in brackets *[ ]*
> > > > "country":*[*"IN"*]*,
> > > > "currency":*[*"INR"*]*,
> > > > "date_c":"2020-08-23T18:30:00Z",
> > > > "est_cost":0,
> > > >
> > > > However, in solr 8.6.2 the query the data (type="text_general") is
> not
> > > > showing in brackets [ ]
> > > > "country":"IN",
> > > > "currency":"INR",
> > > > "date_c":"2020-08-23T18:30:00Z",
> > > > "est_cost":0,
> > > >
> > > >
> > > > How to get the query results to show brackets in Solr 8.6.2
> > >
>


Solr queries slow down over time

2020-09-25 Thread Goutham Tholpadi
Hi,

I have around 30M documents in Solr, and I am doing repeated *:* queries
with rows=1, and changing start to 0, 1, 2, and so on, in a
loop in my script (using pysolr).

At the start of the iteration, the calls to Solr were taking less than 1
sec each. After running for a few hours (with start at around 27M) I found
that each call was taking around 30-60 secs.

Any pointers on why the same fetch of 1 records takes much longer now?
Does Solr need to load all the 27M before getting the last 1 records?
Is there a better way to do this operation using Solr?

Thanks!
Goutham