For a tag cloud, the anomalous words are what you want. If you choose the most 
common words, then every tag cloud will have the same words. It will look like:

the, be, to, of, and, a, in, that, have, I, it, for, not, on, with, ...
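
To make the difference concrete, here is a minimal Python sketch that ranks terms both ways. It is only an illustration under stated assumptions: the tiny corpus, the function names (`doc_freq`, `most_common_terms`, `anomalous_terms`), and the scoring formula (foreground rate times the log of foreground rate over background rate) are all made up for this example. It is not the scoring that Solr's significantTerms uses.

```python
import math
from collections import Counter


def doc_freq(docs):
    """Count how many documents each term appears in."""
    counts = Counter()
    for doc in docs:
        counts.update(set(doc.lower().split()))
    return counts


def most_common_terms(result_docs, top_n):
    """Rank terms by raw document frequency inside the result set."""
    fg = doc_freq(result_docs)
    ranked = sorted(fg, key=lambda t: (-fg[t], t))  # ties broken alphabetically
    return ranked[:top_n]


def anomalous_terms(result_docs, all_docs, top_n):
    """Rank terms by how much more frequent they are in the result set
    than in the whole collection (hypothetical score, see lead-in).
    Assumes result_docs is a subset of all_docs, so every foreground
    term has a nonzero background count."""
    fg, bg = doc_freq(result_docs), doc_freq(all_docs)
    n_fg, n_bg = len(result_docs), len(all_docs)
    scores = {}
    for term, df in fg.items():
        fg_rate = df / n_fg
        bg_rate = bg[term] / n_bg
        scores[term] = fg_rate * math.log(fg_rate / bg_rate)
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_n]


# Hypothetical collection; the first three documents play the role of
# the query's result set.
all_docs = [
    "the solr index is fast",
    "the solr query is slow",
    "the lucene index is small",
    "the cat is on the mat",
    "the dog is in the house",
    "the bird is in the tree",
]
result_docs = all_docs[:3]

print(most_common_terms(result_docs, 2))
print(anomalous_terms(result_docs, all_docs, 2))
```

With raw counts, the stopwords that appear in every document come out on top for any query. With the contrast score, they score zero and the terms specific to the result set rise instead.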

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Nov 3, 2020, at 10:04 AM, uyilmaz <uyil...@vivaldi.net.INVALID> wrote:
> 
> 
> I have been trying to find a way to do this in Solr for a while: perform a 
> query and, for a text_general field in the result set, find each term's 
> number of occurrences.
> 
> - I tried the Terms Component, but it doesn't have the ability to restrict 
> the result set with a query.
> 
> - Tried faceting on the field. Since it's a text_general field, it doesn't 
> have docValues, and cardinality is very high (millions of documents * tens 
> of words in each field), so it works but it's very slow and sometimes times 
> out.
> 
> - Tried the significantTerms streaming expression, but it's logically not 
> the same as what I'm looking for. It gives the words occurring frequently in 
> the result set but not occurring as frequently outside it, so it's better 
> suited to finding frequency anomalies than to getting simple counts.
> 
> Do you have any suggestions?
> 
> Regards
> 
> -- 
> uyilmaz <uyil...@vivaldi.net>
