Re: Incomprehensible (to me) tokenizing behavior

2002-12-30 Thread Doug Cutting
Terry Steichen wrote: I tested StandardAnalyzer (which uses StandardTokenizer) by inputting a set of strings, which produced the following results: aa/bb/cc/dd was tokenized into 4 terms: aa, bb, cc, dd; aa/bb/cc/d1 was tokenized into 3 terms: aa, bb, cc/d1; aa/bb/c1/dd was tokenized into 2

Re: Incomprehensible (to me) tokenizing behavior

2002-12-30 Thread Terry Steichen
the same regardless of the presence or absence of numeric characters?

----- Original Message -----
From: Doug Cutting [EMAIL PROTECTED]
To: Lucene Users List [EMAIL PROTECTED]
Sent: Monday, December 30, 2002 1:42 PM
Subject: Re: Incomprehensible (to me) tokenizing behavior

Terry Steichen wrote: I

Incomprehensible (to me) tokenizing behavior

2002-12-26 Thread Terry Steichen
I tested StandardAnalyzer (which uses StandardTokenizer) by inputting a set of strings, which produced the following results:

aa/bb/cc/dd was tokenized into 4 terms: aa, bb, cc, dd
aa/bb/cc/d1 was tokenized into 3 terms: aa, bb, cc/d1
aa/bb/c1/dd was tokenized into 2 terms: aa, bb/c1/dd
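For anyone trying to reason about these results, here is a small toy model that happens to reproduce all three of them. It is a sketch only, not Lucene code: the rule set (a punctuation character such as "/" joins two alphanumeric runs only when a run containing a digit is involved, and the longest match wins) is an assumption that is not confirmed by the messages quoted here, and the names ALNUM, HAS_DIGIT, P, RULES and tokenize are all hypothetical.

```python
import re

# Hypothetical building blocks for the toy model (not taken from Lucene).
ALNUM = r"[A-Za-z0-9]+"                       # any run of letters/digits
HAS_DIGIT = r"[A-Za-z0-9]*[0-9][A-Za-z0-9]*"  # a run with at least one digit
P = r"[_\-/.,]"                               # assumed joining punctuation

# Assumed rules: a plain alphanumeric run, or runs joined by punctuation
# where every other run contains a digit.
RULES = [re.compile(p) for p in (
    ALNUM,
    ALNUM + P + HAS_DIGIT,
    HAS_DIGIT + P + ALNUM,
    ALNUM + "(?:" + P + HAS_DIGIT + P + ALNUM + ")+",
)]


def tokenize(text):
    """Split text into tokens, taking the longest rule match at each position."""
    tokens, pos = [], 0
    while pos < len(text):
        matches = [m for m in (r.match(text, pos) for r in RULES) if m]
        if not matches:
            pos += 1  # no rule matches here (e.g. a bare "/"), skip the character
            continue
        best = max(matches, key=lambda m: m.end())
        tokens.append(best.group())
        pos = best.end()
    return tokens


for s in ("aa/bb/cc/dd", "aa/bb/cc/d1", "aa/bb/c1/dd"):
    print(s, "->", tokenize(s))
```

Under these assumed rules, "cc/d1" stays together because the run after the slash contains a digit, and "bb/c1/dd" stays together because the digit-bearing run sits between two slashes, while "aa/bb/cc/dd" has no digits and so splits at every slash.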