Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-07 Thread sanjay dutt
That is the best explanation I have found so far. I will migrate to
LatLonPointSpatialField and try to share the benchmark data here. Thanks
again, David.
Cheers, Sanjay



Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-07 Thread David Smiley
Since you have a typical use-case (point data, queries that are
rectangles), I strongly encourage you to migrate to LatLonPointSpatialField:

https://builds.apache.org/job/Solr-reference-guide-master/javadoc/spatial-search.html#latlonpointspatialfield
It's based on an internal "BKD" tree index (which doesn't use FSTs). That is
different from the terms-based index used by the RPT field that you are
using, which employs FSTs.  To be clear, FSTs are awesome, but the BKD index
is tailored for numeric data whereas terms/FSTs are not.

If your FSTs are/were taking up so much memory, you are probably not using
Solr 8.4.0 or beyond, which moved to having the FSTs off-heap -- at least
the ones associated with the field indexes.
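
As a quick way to apply the version threshold mentioned above, here is a minimal sketch that tests whether a Solr version string is 8.4.0 or later. The class and method names are hypothetical, and it assumes a plain dotted numeric version (no suffix such as -SNAPSHOT).

```java
import java.util.Arrays;

public class SolrVersionCheck {
    // Parses a dotted version such as "8.4.0" into {8, 4, 0}.
    // Missing components count as 0; non-numeric suffixes are not handled.
    static int[] parse(String version) {
        String[] parts = version.trim().split("\\.");
        int[] out = new int[3];
        for (int i = 0; i < out.length && i < parts.length; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    // True when the version is 8.4.0 or later (lexicographic compare of the parts).
    static boolean atLeast840(String version) {
        return Arrays.compare(parse(version), new int[] {8, 4, 0}) >= 0;
    }

    public static void main(String[] args) {
        for (String v : new String[] {"7.7.2", "8.3.1", "8.4.0", "8.6"}) {
            System.out.println(v + " -> " + atLeast840(v));
        }
    }
}
```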

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-06 Thread sanjay dutt
The fieldType is defined with class solr.SpatialRecursivePrefixTreeFieldType.

We add only points to this field, although the collection also has a few other
fields with point data, as well as fields of other fieldTypes.
One of the queries looks like this:
(my_field: [45,-94 TO 46,-93]+OR+my_field: [42,-94 TO 43,-93])
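
The query above looks like it was copied from a URL (note the +OR+). A minimal sketch of how such a rectangle query string could be assembled and URL-encoded is shown here. The class and helper names are hypothetical; the sketch only builds strings and does not talk to Solr, and it writes coordinates as doubles (45.0 rather than 45).

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class RectangleQuery {
    // Builds field:[minLat,minLon TO maxLat,maxLon], the range form used in the thread.
    static String rect(String field, double minLat, double minLon,
                       double maxLat, double maxLon) {
        return field + ":[" + minLat + "," + minLon
                + " TO " + maxLat + "," + maxLon + "]";
    }

    // Joins several clauses with OR and wraps the result in parentheses.
    static String anyOf(String... clauses) {
        return "(" + String.join(" OR ", clauses) + ")";
    }

    // URL-encodes the query for use as a request parameter value (spaces become '+').
    static String encode(String query) {
        return URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String q = anyOf(
                rect("my_field", 45, -94, 46, -93),
                rect("my_field", 42, -94, 43, -93));
        System.out.println(q);
        System.out.println(encode(q));
    }
}
```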

Thanks and regards, Sanjay Dutt


Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-05 Thread David Smiley
What is the Solr field type definition for this field?  And what sort of
spatial data do you add here -- just points or what?

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley




Re: org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-04 Thread Erick Erickson
Possibly https://issues.apache.org/jira/browse/LUCENE-9286? I’m not quite sure 
whether this only affects 8.5 or not.

Also: https://issues.apache.org/jira/browse/LUCENE-8920 and 
https://issues.apache.org/jira/browse/LUCENE-9398 might be of interest.

I have no idea whether these are relevant or whether you just need more heap…

Best,
Erick




org.apache.lucene.util.fst.FST taking up lot of Java Heap Memory

2020-08-03 Thread sanjay dutt
Hello Solr community,
On our production SolrCloud server, OutOfMemory errors have been occurring on a
lot of instances. I downloaded the heap dumps and analyzed them. In multiple
heap dumps there are many instances of
org.apache.lucene.codecs.blocktree.BlockTreeTermsReader, which has the highest
retained heap. I then checked the outgoing references for those objects, and
org.apache.lucene.util.fst.FST is the one that occupies 90% of the heap memory.
The numbers look like this:
Production heap memory: 12 GB, out of which
org.apache.lucene.codecs.blocktree.BlockTreeTermsReader total retained heap: 7-8 GB (varies from instance to instance)
org.apache.lucene.util.fst.FST total retained heap: 6-7 GB
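
To put the figures above in proportion, here is a small worked calculation. It uses only the numbers quoted in this message; pairing the low and high ends of each range is an assumption, since the message gives ranges rather than per-instance values. The class name is hypothetical.

```java
import java.util.Locale;

public class HeapShare {
    // Returns what percentage of `wholeGb` is taken by `partGb`.
    static double percent(double partGb, double wholeGb) {
        return 100.0 * partGb / wholeGb;
    }

    public static void main(String[] args) {
        double heapGb = 12.0;           // total production heap
        double fstLow = 6.0, fstHigh = 7.0;     // FST retained heap range
        double readerLow = 7.0, readerHigh = 8.0; // BlockTreeTermsReader retained heap range

        // FST share of the whole heap.
        System.out.println(String.format(Locale.ROOT,
                "FST share of total heap: %.0f%% to %.0f%%",
                percent(fstLow, heapGb), percent(fstHigh, heapGb)));

        // FST share of the BlockTreeTermsReader retained heap (widest possible bounds).
        System.out.println(String.format(Locale.ROOT,
                "FST share of reader retained heap: %.0f%% to %.0f%%",
                percent(fstLow, readerHigh), percent(fstHigh, readerLow)));
    }
}
```

On these figures the FSTs account for roughly half of the 12 GB heap, so the 90% quoted earlier may refer to their share of the BlockTreeTermsReader retained heap rather than of the whole heap.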
Looking further, I calculated that the total retained heap for
FieldReader.fieldInfo.name="my_field" is around 7 GB. This is the same
reader that also holds the reference to org.apache.lucene.util.fst.FST.
"my_field" is the field on which we perform spatial searches. Do
spatial searches use FSTs internally, and is that why we are seeing so much
heap memory used by FSTs alone?
Is there any way to optimize the spatial searches so that they take less
memory?
Can someone please give me a pointer on where I should start looking
to debug this issue?
Thanks and regards, Sanjay Dutt