DO NOT REPLY [Bug 7412] New: - GermanStemFilter setting wrong values for startoffset/endoffset of stemmed tokens

bugzilla Sun, 24 Mar 2002 07:48:44 -0800


http://nagoya.apache.org/bugzilla/show_bug.cgi?id=7412

           Summary: GermanStemFilter setting wrong values for
                    startoffset/endoffset of stemmed tokens
           Product: Lucene
           Version: CVS Nightly - Specify date in submission
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: Analysis
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


The GermanStemFilter sets wrong start and end offsets on the new Token object 
it creates when the stemmer succeeds in stemming the termText() string. The bug 
was found in 1.2-RC5-dev.

-----------------
Example: processing the string "this is a simple test" produces these tokens:
token : thi (0,3)
token : is (5,7)
token : a (8,9)
token : simpl (0,5)
token : test (17,21)

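(All the stemmed tokens have wrong start/end offsets: "thi" and "simpl" are 
reported at (0,3) and (0,5), i.e. from 0 to the length of the stem, whereas 
"this" and "simple" occupy (0,4) and (10,16) in the input string. The unstemmed 
tokens "is", "a" and "test" keep their correct offsets.)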
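
The symptom can be reproduced outside Lucene with a minimal sketch. The Token class below is a hypothetical stand-in, not Lucene's own class, and the two helper methods are illustrative names only; the actual GermanStemFilter source may differ. The sketch assumes the stemmed token is rebuilt with offsets computed from the stem itself, which would match the output above, and contrasts that with copying the offsets from the original token.

```java
public class OffsetSketch {
    /** Hypothetical stand-in for a token: term text plus offsets into the input string. */
    static final class Token {
        final String text;
        final int start;
        final int end;

        Token(String text, int start, int end) {
            this.text = text;
            this.start = start;
            this.end = end;
        }

        @Override
        public String toString() {
            return text + " (" + start + "," + end + ")";
        }
    }

    /** Assumed cause of the report: offsets are derived from the stem, not the source token. */
    static Token stemLosingOffsets(Token original, String stem) {
        return new Token(stem, 0, stem.length());
    }

    /** Keeps the offsets of the unstemmed token, so the stem still maps back to the input. */
    static Token stemKeepingOffsets(Token original, String stem) {
        return new Token(stem, original.start, original.end);
    }

    public static void main(String[] args) {
        // "simple" occupies characters 10..16 of "this is a simple test".
        Token simple = new Token("simple", 10, 16);
        System.out.println(stemLosingOffsets(simple, "simpl"));  // prints: simpl (0,5)
        System.out.println(stemKeepingOffsets(simple, "simpl")); // prints: simpl (10,16)
    }
}
```

Keeping the original offsets matters for anything that maps tokens back to the source text, such as highlighting a matched term.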
