Hi Shai,

Do ttf and docfreq return global stats in distributed mode? I wasn't aware that there was a mechanism for aggregating values in the field list.
Joel Bernstein
http://joelsolr.blogspot.com/

On Wed, Feb 22, 2017 at 7:18 AM, Shai Erera <[email protected]> wrote:

> Hi
>
> I am currently using function queries to obtain these two statistics, as I
> didn't see a better or more explicit API and the Terms component only
> returns docFreq, but not totalTermFreq.
>
> The way I use the API is submit requests as follows:
>
> curl "http://localhost:8983/solr/mycollection/select?q=*:*&rows=1&fl=ttf(text,'t1'),docfreq(text,'t1')"
>
> Today I noticed that it sometimes returns 0 for these stats for existing
> terms. After debugging and going through the code, I noticed that it
> performs analysis on the value that's given. So if I provide an already
> stemmed value, it analyzes the value further and in some cases it results
> in a non-existing term (and in other cases I get stats for a term I didn't
> ask for).
>
> I want to get the stats of the indexed version of the terms, and that's
> why I send the already stemmed one. In my case I tried to get the stats for
> the term 'disguis' which is the stem of 'disguise' and 'disguised', however
> it further analyzed the value to 'disgui' (per the analysis chain) and that
> term does not exist in the index.
>
> So first question is -- is this the right API to retrieve such statistics?
> I didn't find another one, but could be I missed it.
>
> If it is, why does it analyze the value? I tried to wrap the value with
> single and double quotes, but of course that does not affect the analysis
> ... is analysis an intended behavior or a bug?
>
> Shai
>
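
For anyone who wants to reproduce the request from the quoted message in a script, here is a minimal sketch in Python that only builds the URL. The helper name term_stats_url and its parameters are made up for illustration; the host, collection, field and term are the ones from the curl example above. It assumes the term contains no quotes or backslashes, and it does nothing about the analysis behaviour described in the message.

```python
from urllib.parse import urlencode


def term_stats_url(base_url, collection, field, term):
    """Build a /select URL whose field list asks for ttf and docfreq of one term.

    Hypothetical helper, not part of Solr or any client library.
    Assumes `term` contains no single quotes or backslashes.
    """
    # Same function queries as in the curl example: ttf(field,'term'),docfreq(field,'term')
    fl = f"ttf({field},'{term}'),docfreq({field},'{term}')"
    # rows=1 because the stats are repeated on every returned document
    params = {"q": "*:*", "rows": 1, "fl": fl}
    # urlencode percent-encodes the parentheses, quotes and commas
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(term_stats_url("http://localhost:8983/solr", "mycollection", "text", "t1"))
```

The printed URL can be passed to curl or any HTTP client. Percent-encoding the field list sidesteps shell quoting problems with the parentheses and single quotes.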
