[ 
https://issues.apache.org/jira/browse/METRON-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Justin Leet updated METRON-1567:
--------------------------------
    Summary: Large error message can't be written in Solr  (was: Large error 
message can't be written)

> Large error message can't be written in Solr
> --------------------------------------------
>
>                 Key: METRON-1567
>                 URL: https://issues.apache.org/jira/browse/METRON-1567
>             Project: Metron
>          Issue Type: Sub-task
>            Reporter: Justin Leet
>            Assignee: Justin Leet
>            Priority: Major
>
> Error message on the feature branch:
> {code:java}
> org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error 
> from server at 
> http://ip-11-0-1-51.us-west-2.compute.internal:8983/solr/error: Exception 
> writing document id cd6db5c1-f41b-4dcf-8f68-583c7fc08575 to the index; 
> possible analysis error: Document contains at least one immense term in 
> field="raw_message_1" (whose UTF8 encoding is longer than the max length 
> 32766), all of which were skipped. Please correct the analyzer to not produce 
> such terms. The prefix of the first immense term is: '[123, 34, 101, 120, 99, 
> 101, 112, 116, 105, 111, 110, 34, 58, 34, 106, 97, 118, 97, 46, 105, 111, 46, 
> 70, 105, 108, 101, 78, 111, 116, 70]...', original message: bytes can be at 
> most 32766 in length; got 165866. Perhaps the document has an indexed string 
> field (solr.StrField) which is too large
> at 
> org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:612)
>  ~[stormjar.jar:?]
> ...{code}
> This is a hard limit on string fields, per 
> https://lucene.apache.org/solr/guide/6_6/field-types-included-with-solr.html
> That page also notes that string fields aren't tokenized or analyzed, so it 
> doesn't seem like we'd be able to turn this limit off.
> Text fields don't list any sort of limit (although they may still have one), 
> so we may want to switch to a text field, but that would require testing.
> Additionally, it appears that raw_message is dynamic (it's getting a _1 
> suffix, and we don't define it in the schema).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
