Angel,
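The byte 20 (hex 0x20) is the ASCII space character, which is a separator in most contexts.
Breaking the buffer at the spaces, you can see 6 non-space tokens: 0, 0, 2, r, allen, r. That is one more than the max ngram size of 5 reported in the error.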
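
If you want to check the count yourself, here is a rough sketch in Python that decodes the hex bytes from your error message and splits them at 0x20. The variable names are mine, and counting grams as "separators + 1" is only my reading of the message, not something I have confirmed against the Lucene source.

```python
# Hex bytes copied verbatim from the token=[...] part of the error message.
token_hex = "30 20 30 20 32 20 72 20 61 6c 6c 65 6e 20 72"

# Assumption: 0x20 (ASCII space) is the separator byte in this setup.
SEPARATOR = 0x20

raw = bytes.fromhex(token_hex)   # fromhex ignores the spaces between hex pairs
text = raw.decode("ascii")       # human-readable form of the token

# Split at the separator byte and drop any empty pieces.
grams = [g.decode("ascii") for g in raw.split(bytes([SEPARATOR])) if g]

# Assumed counting rule: one gram, plus one more per separator byte seen.
gram_count = raw.count(SEPARATOR) + 1

print(repr(text))
print(grams)
print(gram_count)
```

So the single "token" handed to the suggester already contains several space-separated pieces, which is why I would look at the analysis chain first.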

Have a look at your analysis chain and see why you are getting this. Cheers -- 
Rick

On July 24, 2017 4:27:00 PM EDT, Angel Todorov <attodo...@gmail.com> wrote:
>Hi guys,
>
>I am trying to set up the FreeTextSuggester / Lookup Factory in a suggester
>definition in SOLR. Unfortunately, while the index is building, I am
>encountering the following errors:
>
>*"msg":"tokens must not contain separator byte; got token=[30 20 30 20 32 20 72 20 61 6c 6c 65 6e 20 72] but gramCount=6, which is greater than expected max ngram size=5",
>"trace":"java.lang.IllegalArgumentException: tokens must not contain separator byte; got token=[30 20 30 20 32 20 72 20 61 6c 6c 65 6e 20 72] but gramCount=6, which is greater than expected max ngram size=5\r\n\tat org.apache.lucene.search.suggest.analyzing.FreeTextSuggester.build(FreeTextSuggester.java:362)\r\n\tat
>*
>
>I've also opened the following issue, because I don't think it's right not
>to handle this exception:
>
>https://issues.apache.org/jira/browse/SOLR-11139
>
>But my question is about the error in general - why is it occurring? I only
>have English text, nothing special.
>
>Thanks,
>Angel

-- 
Sorry for being brief. Alternate email is rickleir at yahoo dot com 
