This exception comes from OffsetAttributeImpl (e.g. you dont need to
index anything to reproduce it).

Maybe you have a missing clearAttributes() call (your tokenizer
'returns true' without calling that first)? This could explain it, if
something like a StopFilter is also present in the chain: basically
the offsets overflow.

the test stuff in BaseTokenStreamTestCase should be able to detect
this as well...

On Fri, Jan 3, 2014 at 1:56 PM, Benson Margulies <ben...@basistech.com> wrote:
> Using Solr Cloud with 4.3.1.
>
> We've got a problem with a tokenizer that manifests as calling
> OffsetAtt.setOffsets() with invalid inputs. OK, so, we want to figure out
> what input provokes our code into getting into this pickle.
>
> The problem happens on SolrCloud nodes.
>
> The problem manifests as this sort of thing:
>
> Jan 3, 2014 6:05:33 PM org.apache.solr.common.SolrException log
> SEVERE: java.lang.IllegalArgumentException: startOffset must be
> non-negative, and endOffset must be >= startOffset,
> startOffset=-1811581632,endOffset=-1811581632
>
> How could we get a document ID so that we can tell which document was being
> processed?

Reply via email to