I think you can use the term stats that Lucene tracks for each field.
Compare Terms.getSumTotalTermFreq and Terms.getDocCount. If they are
equal, every document that has this field has exactly one token in it.
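
To show why the comparison works, here is a small plain-Java model of the two statistics. It is only a sketch and does not use Lucene: the class, the method names and the toy token lists are all made up for illustration. In a real index the two numbers would come from Terms.getSumTotalTermFreq and Terms.getDocCount for the field. Depending on the Lucene version and the field's index options, those calls may return -1 or a different measure, so check the Javadoc for your version.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of Lucene: models the per-field term stats
// on an in-memory list of documents, each given as its list of tokens.
final class SingleTokenCheck {

    // Models Terms.getSumTotalTermFreq: total tokens across all documents.
    static long sumTotalTermFreq(List<List<String>> docs) {
        long total = 0;
        for (List<String> tokens : docs) {
            total += tokens.size();
        }
        return total;
    }

    // Models Terms.getDocCount: documents with at least one token in the field.
    static int docCount(List<List<String>> docs) {
        int count = 0;
        for (List<String> tokens : docs) {
            if (!tokens.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    // Each counted document contributes at least one token, so the sum can
    // equal the document count only if every such document has exactly one.
    // Assumption of this sketch: a field that no document has reports false.
    static boolean everyDocHasOneToken(long sumTotalTermFreq, int docCount) {
        return docCount > 0 && sumTotalTermFreq == docCount;
    }

    public static void main(String[] args) {
        // Third document has no value for the field, so it is not counted.
        List<List<String>> title = Arrays.asList(
                Arrays.asList("solr"),
                Arrays.asList("lucene"),
                Collections.<String>emptyList());
        // First document has two tokens in the field.
        List<List<String>> tags = Arrays.asList(
                Arrays.asList("search", "index"),
                Arrays.asList("java"));

        System.out.println("title: "
                + everyDocHasOneToken(sumTotalTermFreq(title), docCount(title)));
        System.out.println("tags: "
                + everyDocHasOneToken(sumTotalTermFreq(tags), docCount(tags)));
    }
}
```

Note that the check counts tokens, not values. A text field holding a single value that analyzes into several tokens fails the test, so equality is a sufficient signal for "single-valued" but not a necessary one.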
Mike McCandless
http://blog.mikemccandless.com
On Fri, Nov 11, 2016 at 5:50 AM, Mik
I suppose it goes without saying that norm(field) reflects (though not
precisely, by default) the number of tokens in a doc's field (not the
actual text values).
On Fri, Nov 11, 2016 at 5:08 AM, Alexandre Rafalovitch
wrote:
> Hello,
>
> Say I indexed a large dataset against a schemaless
I don't think so. Once things are indexed, they look just like a
regular text field with odd offsets for some of the terms. Of course
if you returned the stored form (assuming it's stored) it'd look
different, but that's messy too.
Best,
Erick
On Thu, Nov 10, 2016 at 6:08 PM, Alexandre Rafalovitc
Hello,
Say I indexed a large dataset against a schemaless configuration. Now
I have a bunch of multivalued fields. Is there any way to tell which of
these (text) fields have (for the given data) only single values? I know I
am supposed to look at the original data, and all that, but this is
more for de