[jira] [Created] (LUCENE-4640) JapaneseCharFilter can't be reusable

Wang Han (JIRA) Thu, 20 Dec 2012 01:43:18 -0800

Wang Han created LUCENE-4640:
--------------------------------

             Summary: JapaneseCharFilter can't be reusable
                 Key: LUCENE-4640
                 URL: https://issues.apache.org/jira/browse/LUCENE-4640
             Project: Lucene - Core
          Issue Type: Bug
          Components: modules/analysis
    Affects Versions: 4.0
            Reporter: Wang Han



com.hulu.lucene.analysis.JapaneseCharFilter is used in JPRomajiAnalyzer,
but its incrementToken() implementation does not support reuse:
-------------------------------------------------
                if (!isEnd) { // isEnd is still true when analyzer.tokenStream(...) is called again, so this block is skipped
                        if (input.incrementToken()) {
                                String reading = readingAttr.getReading();
                                if (reading == null) {
                                        reading = new String(termAttr.buffer());
                                }
                                if (reading != null) {
                                        reading = JapaneseStringTools.getRomanization(reading.trim());
                                        reading = StringTools.removeAccentCharacter(reading);
                                        for (int i = 0; i < reading.length(); i++) {
                                                chars.add(reading.charAt(i));
                                        }
                                }
                                isEnd = false;
                        } else {
                                isEnd = true;
                        }
                }
                if (chars.size() > 0 || isEnd == false) {
                        if (chars.size() > 0) {
                                char ch = chars.poll();
                                termAttr.setEmpty().append(ch);
                        }
                        return true;
                }
                return false;
--------------------------------------------------
So I can't call analyzer.tokenStream(...) for the same field twice.
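
One possible direction, sketched here without any Lucene dependency: a reusable stream needs a reset() that clears its per-stream state (here the isEnd flag and the pending character queue) before it is consumed again. The types below (SimpleTokenSource, ListTokenSource, CharSplittingFilter, ReuseDemo) are hypothetical stand-ins for illustration, not Lucene classes, and the refill loop is simplified compared with the code quoted above. In a real Lucene filter the equivalent change would presumably go into an overridden reset() that also calls super.reset().

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for a reusable token stream; not a Lucene type. */
interface SimpleTokenSource {
    boolean incrementToken();
    String term();
    void reset();
}

/** Upstream source that replays a fixed list of tokens after each reset(). */
class ListTokenSource implements SimpleTokenSource {
    private final List<String> tokens;
    private Iterator<String> it;
    private String current;

    ListTokenSource(List<String> tokens) {
        this.tokens = tokens;
        this.it = tokens.iterator();
    }

    public boolean incrementToken() {
        if (!it.hasNext()) return false;
        current = it.next();
        return true;
    }

    public String term() { return current; }

    public void reset() { it = tokens.iterator(); }
}

/** Emits one character per call, similar in spirit to the reported filter. */
class CharSplittingFilter implements SimpleTokenSource {
    private final SimpleTokenSource input;
    private final Deque<Character> chars = new ArrayDeque<>();
    private boolean isEnd = false;
    private String current;

    CharSplittingFilter(SimpleTokenSource input) { this.input = input; }

    public boolean incrementToken() {
        // Refill the queue from upstream only when it is empty.
        while (chars.isEmpty() && !isEnd) {
            if (input.incrementToken()) {
                for (char c : input.term().toCharArray()) chars.add(c);
            } else {
                isEnd = true;
            }
        }
        if (chars.isEmpty()) return false;
        current = String.valueOf(chars.poll());
        return true;
    }

    public String term() { return current; }

    /** Clears per-stream state so the same instance can be consumed again. */
    public void reset() {
        input.reset();
        chars.clear();
        isEnd = false;
    }
}

class ReuseDemo {
    /** Consumes the stream to exhaustion and concatenates the emitted terms. */
    static String drain(SimpleTokenSource s) {
        StringBuilder sb = new StringBuilder();
        while (s.incrementToken()) sb.append(s.term());
        return sb.toString();
    }

    public static void main(String[] args) {
        CharSplittingFilter f =
            new CharSplittingFilter(new ListTokenSource(Arrays.asList("ab", "c")));
        System.out.println("[" + drain(f) + "]"); // first pass
        System.out.println("[" + drain(f) + "]"); // no reset: isEnd is still true
        f.reset();
        System.out.println("[" + drain(f) + "]"); // after reset
    }
}
```

In this sketch the second pass without reset() yields nothing because isEnd is still true, which matches the symptom described above; after reset() the same instance produces the full output again.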

