Hi Srini,

(and apologies for the delay in replying - only just spotted this message)

There is indeed a level of caching in the design: all of the terms for a 
field are loaded into RAM using FieldData, which lets us look up the terms 
in individual docs very quickly.
However, the stats on how frequently terms occur in the background 
(typically your whole corpus) are not served from that cache. They come 
from calls to the Lucene APIs that read frequencies from the Lucene index 
on disk. Generally the cost of doing this grows in proportion to the number 
of unique terms in your result set.
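
To make that cost model concrete, here is a rough toy sketch in plain 
Java. It is not the actual implementation and uses no Elasticsearch or 
Lucene APIs: BackgroundIndex, uniqueTerms and backgroundFreqs are names I 
made up for illustration, and an in-memory map stands in for the on-disk 
index. The point is only that one background lookup is made per distinct 
term in the matching docs.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are Elasticsearch or Lucene APIs.
final class BackgroundLookupCost {

    // Stand-in for the on-disk index; counts how often it is consulted.
    static final class BackgroundIndex {
        private final Map<String, Integer> docFreqs;
        private int lookups = 0;

        BackgroundIndex(Map<String, Integer> docFreqs) {
            this.docFreqs = docFreqs;
        }

        // In the real system this would be the expensive read from disk.
        int docFreq(String term) {
            lookups++;
            Integer freq = docFreqs.get(term);
            return freq == null ? 0 : freq;
        }

        int lookups() {
            return lookups;
        }
    }

    // Distinct terms across the matching docs (cheap: per-doc terms are in RAM).
    static Set<String> uniqueTerms(List<List<String>> matchingDocs) {
        Set<String> terms = new HashSet<>();
        for (List<String> docTerms : matchingDocs) {
            terms.addAll(docTerms);
        }
        return terms;
    }

    // Reads one background frequency per distinct term.
    static Map<String, Integer> backgroundFreqs(List<List<String>> matchingDocs,
                                                BackgroundIndex index) {
        Map<String, Integer> freqs = new HashMap<>();
        for (String term : uniqueTerms(matchingDocs)) {
            freqs.put(term, index.docFreq(term));
        }
        return freqs;
    }

    public static void main(String[] args) {
        Map<String, Integer> corpus = new HashMap<>();
        corpus.put("java", 900);
        corpus.put("lucene", 40);
        corpus.put("cache", 120);
        BackgroundIndex index = new BackgroundIndex(corpus);

        List<List<String>> hits = Arrays.asList(
                Arrays.asList("java", "lucene"),
                Arrays.asList("java", "cache"),
                Arrays.asList("lucene", "java"));

        backgroundFreqs(hits, index);
        // Three docs, but only three distinct terms, so three lookups.
        System.out.println("background lookups: " + index.lookups());
    }
}
```

Repeating the same search repeats the same lookups, since nothing in this 
model remembers the background frequencies between requests.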

We are currently looking at ways of improving this. For now, one approach 
may be to limit the size of the result set presented to the sig_terms agg 
for analysis. Generally speaking, the quality of suggestions can still be 
good on smaller (but not too small) sets of relevant results, and arguably 
it can go down if the agg is analysing result sets that include a long tail 
of garbage.
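
As a hedged illustration of why a smaller result set helps, here is a toy 
sketch in plain Java. Hit, topN and uniqueTagCount are hypothetical names 
and not part of any Elasticsearch API; in a real request you would narrow 
the result set through the query itself. The sketch just shows that keeping 
only the best-scoring hits cuts the number of distinct terms, and with it 
the number of background frequency lookups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: Hit, topN and uniqueTagCount are hypothetical helpers.
final class ResultSetLimit {

    // A search hit with a relevance score and the tags of the matching doc.
    static final class Hit {
        final double score;
        final List<String> tags;

        Hit(double score, String... tags) {
            this.score = score;
            this.tags = Arrays.asList(tags);
        }
    }

    // Keeps only the n highest-scoring hits (all of them if n is larger).
    static List<Hit> topN(List<Hit> hits, int n) {
        List<Hit> sorted = new ArrayList<>(hits);
        sorted.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        int end = Math.max(0, Math.min(n, sorted.size()));
        return new ArrayList<>(sorted.subList(0, end));
    }

    // Number of distinct tags, i.e. the number of background lookups needed.
    static int uniqueTagCount(List<Hit> hits) {
        Set<String> tags = new HashSet<>();
        for (Hit hit : hits) {
            tags.addAll(hit.tags);
        }
        return tags.size();
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
                new Hit(9.0, "lucene", "search"),
                new Hit(8.5, "lucene", "aggregation"),
                new Hit(0.3, "misc", "todo"),
                new Hit(0.2, "junk", "old"),
                new Hit(0.1, "spam"));

        // The low-scoring long tail contributes most of the distinct tags.
        System.out.println("unique tags, all hits: " + uniqueTagCount(hits));
        System.out.println("unique tags, top 2:    " + uniqueTagCount(topN(hits, 2)));
    }
}
```

With this made-up data the full set has 8 distinct tags and the top 2 hits 
have 3, so the long tail accounts for most of the lookups while adding 
little that is relevant.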

Hope this makes sense
Mark
 


On Thursday, May 29, 2014 6:21:01 PM UTC+1, Srinivasan Ramaswamy wrote:
>
> I am trying to use the significant terms aggregation feature, but it's 
> making the search very slow. Is there any optimization that I can do to 
> make it faster? I have an index with 24 shards and 1 replica, where each 
> shard is 2.5G in size. With the significant terms feature turned on, many 
> searches take ~5s (even when the same search is repeated); with this 
> feature disabled they take only ~150ms.
>
> I am using it as follows:
>
> SearchRequestBuilder srb = ...;
> SignificantTermsBuilder tags =
>     significantTerms("st_name").field("tags").size(11);
> srb.addAggregation(tags);
>
>
> Does anyone have any hints on how to optimize this feature? Is there 
> some level of caching involved in this feature? If there is, it shouldn't 
> take ~5s when the same query is executed again and again, should it?
>
> Thanks
> Srini
>
