This means a single document requires more than 32 KB to store all of its ordinals ... so that document must have like at least 6K facets?
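The estimate above can be checked with some quick arithmetic. The following is only a sketch, assuming each ordinal is written as a variable-length integer (VInt) of at most 5 bytes; the class and method names are hypothetical and not part of Lucene. Only the 32766-byte limit comes from the exception message.

```java
/** Back-of-the-envelope check of the ordinal estimate; not Lucene code. */
public class OrdinalBudget {
    // Limit taken from the exception message: "must be <= 32766".
    static final int MAX_BINARY_DV_BYTES = 32766;

    // Assumption: a 32-bit value encoded as a VInt takes at most 5 bytes.
    static final int MAX_VINT_BYTES = 5;

    /** Bytes needed to encode value as a VInt (7 payload bits per byte). */
    static int vIntLength(int value) {
        int bytes = 1;
        while ((value & ~0x7F) != 0) {
            value >>>= 7;
            bytes++;
        }
        return bytes;
    }

    /** Smallest count n for which n * bytesPerOrdinal exceeds limitBytes. */
    static int minOrdinalsToOverflow(int limitBytes, int bytesPerOrdinal) {
        return limitBytes / bytesPerOrdinal + 1;
    }

    public static void main(String[] args) {
        System.out.println(vIntLength(Integer.MAX_VALUE));
        System.out.println(minOrdinalsToOverflow(MAX_BINARY_DV_BYTES, MAX_VINT_BYTES));
    }
}
```

Under the 5-byte worst case, the limit is exceeded only once a document carries more than about 6.5K ordinals. Smaller encoded values would push that count higher, so this is a lower bound. If parent categories are also indexed, one facet can contribute more than one ordinal.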
Are you sure this isn't a bug in your app? That's an insanely high number of facets for one document ...

Mike McCandless

http://blog.mikemccandless.com

On Fri, Apr 26, 2013 at 11:22 AM, Nicola Buso <nb...@ebi.ac.uk> wrote:
> Hi Shai,
>
> I can't say now how many of these entries I have, I need to trace them,
> but I expect they are exceptions, like 10 entries, no more.
>
> Can I enable partitions document by document? Should I activate
> partitions if I reach a threshold just for these exceptions?
>
> Nicola.
>
> On Fri, 2013-04-26 at 18:04 +0300, Shai Erera wrote:
>> Hi Nicola,
>>
>> I think this limit denotes the number of bytes you can write in a single DV
>> value. So the actual number of facets you can index is much lower. Do you
>> know how many categories are indexed for that one document?
>>
>> Also, do you expect to index a large number of facets for most documents, or
>> is this one extreme example?
>>
>> Basically I think you can achieve that by enabling partitions. Partitions
>> let you split the category space into smaller sets, so that each DV value
>> contains fewer values, and the RAM consumption during search is also lower
>> since FacetArrays is allocated at the size of the partition and not the
>> taxonomy. But you also incur a search performance loss, because counting a
>> certain dimension requires traversing multiple DV fields.
>>
>> To enable partitions you need to override the FacetIndexingParams partition
>> size. You can try to play with it.
>>
>> I am interested, though, in understanding the general scenario. Perhaps this
>> can be solved some other way...
>>
>> Shai
>> On Apr 26, 2013 5:44 PM, "Nicola Buso" <nb...@ebi.ac.uk> wrote:
>>
>> > Hi all,
>> >
>> > I'm encountering a problem indexing a document with a large number of
>> > values for one facet.
>> >
>> > Caused by: java.lang.IllegalArgumentException: DocValuesField "$facets"
>> > is too large, must be <= 32766
>> >     at org.apache.lucene.index.BinaryDocValuesWriter.addValue(BinaryDocValuesWriter.java:57)
>> >     at org.apache.lucene.index.DocValuesProcessor.addBinaryField(DocValuesProcessor.java:111)
>> >     at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:57)
>> >     at org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36)
>> >     at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:242)
>> >     at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:256)
>> >     at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:376)
>> >     at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1473)
>> >
>> > It's obviously hard to present such a large number of facets to the user,
>> > and it's also hard to decide which of these values to skip so that this
>> > document can be stored in the index.
>> >
>> > Do you have any suggestions on how to overcome this limit? Is it
>> > possible?
>> >
>> > Nicola
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: java-user-h...@lucene.apache.org