You need to enable docValues on the field (and completely reindex). The standard inverted index structure is great for answering "for term X in field Y, what docs does it appear in?". It's rotten for the "uninverted" case: "for doc X, what is the value of field Y?". This latter question is the one that needs to be answered for sorting, faceting and grouping. So when you do one of those operations on a field that does not have docValues="true", Solr (well, Lucene actually) "uninverts" the field onto the JVM heap at query time, building a structure that answers this latter question efficiently. That structure is what you are seeing in FieldCacheImpl, keyed by the timestamp field.
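
To make the two questions concrete, here is a toy sketch in Python. It is not Lucene code: the corpus, the function names and the list-based "column" are all made up for illustration, and the real structures are far more compact. It only shows why a per-document lookup has to be rebuilt from an inverted index by walking every term.

```python
from collections import defaultdict

# Hypothetical toy corpus: doc id -> value of a "timestamp" field.
docs = {0: "2017-08-01", 1: "2017-08-03", 2: "2017-08-01", 3: "2017-08-02"}


def build_inverted(docs):
    """Inverted index: term -> list of doc ids that contain it."""
    inverted = defaultdict(list)
    for doc_id in sorted(docs):
        inverted[docs[doc_id]].append(doc_id)
    return dict(inverted)


def uninvert(inverted, num_docs):
    """Rebuild doc id -> value by walking every term's posting list.

    This is the work that has to happen at query time when no
    per-document structure was written at index time.
    """
    column = [None] * num_docs
    for term, doc_ids in inverted.items():
        for doc_id in doc_ids:
            column[doc_id] = term
    return column


inverted = build_inverted(docs)
column = uninvert(inverted, len(docs))

# "For term X, what docs does it appear in?" is a direct lookup.
docs_for_term = inverted["2017-08-01"]

# "For doc X, what is the value?" needs the uninverted column.
value_for_doc = column[3]

# Sorting (and likewise faceting or grouping) reads the column per doc.
sorted_docs = sorted(range(len(docs)), key=lambda d: column[d])
```

In this sketch the column is rebuilt in memory on demand; the point of docValues is that an equivalent per-document structure is written once when the documents are indexed instead.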
Specifying docValues="true" effectively builds this "uninverted" structure at _index time_ and serializes it out to disk. Then the structure is mapped into MMapDirectory space using the OS memory (much more efficient than the JVM).

Helpful background for MMapDirectory:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html

See also: https://lucene.apache.org/solr/guide/6_6/docvalues.html

Best,
Erick

On Thu, Aug 24, 2017 at 11:02 PM, Sundeep T <[email protected]> wrote:
> Hi,
>
> In our enterprise application, we occasionally get range facet queries
> ordered by the timestamp field. The timestamp field is of date type.
>
> Below is the query from solr.log -
>
> 2017-08-25 05:18:51.048 INFO (qtp1321530272-90) [ x:drums]
> o.a.s.c.S.Request [drums] webapp=/solr path=/select
> params={df=text&distrib=false&_facet_={}&fl=id&fl=score&shards.purpose=1048580&start=0&fsv=true&shard.url=
> http://localhost:8983/solr/drums&rows=0&version=2&q=*:*&json.facet={“timestamp”:{“type”:“range”,“field”:“timestamp”,“start”:“2016-05-28T16:19:09.857Z”,“end”:“2017-08-18T10:57:10.365Z”,“gap”:“+5000SECOND”,“limit”:100000,“sort”:{“index”:“desc”},“facet”:{}}}&NOW=1503638261623&isShard=true&timeAllowed=-1&wt=javabin}
> hits=68541066 status=0 QTime=69422
>
> Whenever such query runs we see that
> org.apache.solr.uninverting.FieldCacheImpl is being populated in the
> backend jvm heap. When we analyzed using heapdump, all the underlying
> objects in the FieldCacheImpl have timestamp as the cache key. It seems to
> be taking quite a bit of memory.
>
> Does any one have an idea what this cache is and why its being populated?
> Also, what is the criteria for clearing this cache?
>
> Really appreciate your response. Thanks!
