Text-based fields indeed do not have that limit for the _entire_ field. They _do_ have that limit for any _single token_ produced. Your case_insensitive_text type uses solr.KeywordTokenizerFactory, which emits the entire field value as one token, so your 242,905-byte value becomes a single term that exceeds the 32,766-byte limit. You'd hit the same error with, say, a base64-encoded image that no tokenizer breaks up into smaller tokens.
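If you really need keyword-style (whole-value) analysis on these fields, one option is to cap the token length at index time. A sketch, untested against your setup (the prefixLength value of 300 is an arbitrary example, and note that truncation means only the kept prefix is searchable):

```xml
<fieldType name="case_insensitive_text" class="solr.TextField" multiValued="false">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- keep only the first 300 characters of each token so the
         resulting term stays well under the 32,766-byte limit -->
    <filter class="solr.TruncateTokenFilterFactory" prefixLength="300"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.TruncateTokenFilterFactory" prefixLength="300"/>
  </analyzer>
</fieldType>
```

Alternatively, solr.LengthFilterFactory (with min/max attributes) drops oversized tokens entirely instead of truncating them, which may be preferable if the huge values don't need to be searchable at all.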
Best,
Erick

> On Oct 25, 2019, at 4:28 AM, Marko Ćurlin <marko.cur...@reversinglabs.com> wrote:
>
> Hi everyone,
>
> I am getting an org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException
> while trying to insert a list with 9 elements, of which one is 242905 bytes long,
> into Solr. I am aware that StrField has a hard limit of slightly less than 32k.
> I am using a TextField, which by my understanding doesn't have such a limit, as
> tested here
> <https://stackoverflow.com/questions/32936361/in-solr-what-is-the-maximum-size-of-a-text-field>
> (taking into consideration that the field wasn't multivalued). So I'm wondering
> what the correlation is here, and how it could be solved. Below are the error and
> the relevant part of the Solr managed_schema. I am still new to Solr, so take into
> account that there could be something obvious I am missing.
>
> ERROR:
>
> "error":{
>   "metadata":[
>     "error-class","org.apache.solr.common.SolrException",
>     "root-error-class","org.apache.lucene.util.BytesRefHash$MaxBytesLengthExceededException",
>     "error-class","org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException",
>     "root-error-class","org.apache.solr.update.processor.DistributedUpdateProcessor$DistributedUpdatesAsyncException"],
>   "msg":"Async exception during distributed update: Error from
>     server at http://solr-host:8983/solr/search_collection_xx: Bad Request
>     \n\n request: http://solr-host:8983/solr/search_collection_xx \n\n
>     Remote error message: Exception writing document id <document_id> to
>     the index; possible analysis error: Document contains at least one
>     immense term in field=\"text_field_name\" (whose UTF8 encoding is
>     longer than the max length 32766), all of which were skipped. Please
>     correct the analyzer to not produce such terms. The prefix of the
>     first immense term is: '[115, 97, 115, 109, 101, 45, 100, 97, 109,
>     101, 46, 99, 111, 109, 47, 108, 121, 99, 107, 97, 47, 37, 50, 50, 37,
>     50, 48, 109, 101, 116]...', original message: bytes can be at most
>     32766 in length; got 242905. Perhaps the document has an indexed
>     string field (solr.StrField) which is too large",
>   "code":400}
> }
>
> relevant managed_schema:
>
> <dynamicField name="text_field_*" indexed="true" stored="true"
>     multiValued="true" type="case_insensitive_text" />
>
> <fieldType name="case_insensitive_text" class="solr.TextField"
>     multiValued="false">
>   <analyzer type="index">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.KeywordTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
>
>
> Best regards,
> Marko

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org