[
https://issues.apache.org/jira/browse/LUCENE-5275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Muir updated LUCENE-5275:
--------------------------------
Attachment: LUCENE-5275.patch
Yeah: I think identityHashCode is way more useful (see updated patch).
This way the toString actually helps you figure out which guy is which if you
are putting SOPs in your analysis chain:
{noformat}
PorterStemFilter@23bd31c0 term=it,bytes=[69 74],startOffset=0,endOffset=3,positionIncrement=1,type=<ALPHANUM>,keyword=false
PorterStemFilter@23bd31c0 term=2013,bytes=[32 30 31 33],startOffset=4,endOffset=8,positionIncrement=1,type=<NUM>,keyword=false
PorterStemFilter@23bd31c0 term=let,bytes=[6c 65 74],startOffset=10,endOffset=15,positionIncrement=1,type=<ALPHANUM>,keyword=false
PorterStemFilter@23bd31c0 term=fix,bytes=[66 69 78],startOffset=16,endOffset=19,positionIncrement=1,type=<ALPHANUM>,keyword=false
PorterStemFilter@23bd31c0 term=alreadi,bytes=[61 6c 72 65 61 64 69],startOffset=25,endOffset=32,positionIncrement=2,type=<ALPHANUM>,keyword=false
{noformat}
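To illustrate the difference, here is a minimal, self-contained sketch. It is not the code from the patch: DemoStream, stateHashLabel and identityLabel are hypothetical names, and the class only assumes that hashCode() is derived from the current attribute values, which would explain why the prefix in the originally proposed output changes on every token. A prefix built from System.identityHashCode stays the same for the lifetime of the instance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for an attribute-holding stream; this is NOT Lucene's AttributeSource.
class DemoStream {
    private final Map<String, Object> attributes = new LinkedHashMap<>();

    void set(String key, Object value) {
        attributes.put(key, value);
    }

    // State-dependent hash (an assumption for this sketch): changes when attribute values change.
    @Override
    public int hashCode() {
        return attributes.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        return other instanceof DemoStream
            && attributes.equals(((DemoStream) other).attributes);
    }

    // Prefix built from hashCode(): can differ from one token to the next.
    String stateHashLabel() {
        return getClass().getSimpleName() + "@" + Integer.toHexString(hashCode());
    }

    // Prefix built from the identity hash: stable for the lifetime of this instance.
    String identityLabel() {
        return getClass().getSimpleName() + "@"
            + Integer.toHexString(System.identityHashCode(this));
    }

    // Identity-based prefix followed by the current attribute values.
    @Override
    public String toString() {
        StringBuilder sb = new StringBuilder(identityLabel()).append(' ');
        boolean first = true;
        for (Map.Entry<String, Object> e : attributes.entrySet()) {
            if (!first) {
                sb.append(',');
            }
            sb.append(e.getKey()).append('=').append(e.getValue());
            first = false;
        }
        return sb.toString();
    }
}
```

Calling set("term", "it") and then set("term", "2013") on one DemoStream changes stateHashLabel() but leaves identityLabel() untouched, so two filters of the same class in a chain can be told apart by their prefixes while each keeps a constant one.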
> Fix AttributeSource.toString()
> ------------------------------
>
> Key: LUCENE-5275
> URL: https://issues.apache.org/jira/browse/LUCENE-5275
> Project: Lucene - Core
> Issue Type: Bug
> Reporter: Robert Muir
> Attachments: LUCENE-5275.patch, LUCENE-5275.patch
>
>
> It's currently just Object.toString, e.g.:
> org.apache.lucene.analysis.en.PorterStemFilter@8a32165c
> But I think we should make it more useful, to end users trying to see what
> their chain is doing, and to make SOPs easier when debugging:
> {code}
> EnglishAnalyzer analyzer = new EnglishAnalyzer(TEST_VERSION_CURRENT);
> try (TokenStream ts = analyzer.tokenStream("body", "Its 2013, let's fix this already!")) {
>   ts.reset();
>   while (ts.incrementToken()) {
>     System.out.println(ts.toString());
>   }
>   ts.end();
> }
> {code}
> Proposed output:
> {noformat}
> PorterStemFilter@8a32165c term=it,bytes=[69 74],startOffset=0,endOffset=3,positionIncrement=1,type=<ALPHANUM>,keyword=false
> PorterStemFilter@987b9eea term=2013,bytes=[32 30 31 33],startOffset=4,endOffset=8,positionIncrement=1,type=<NUM>,keyword=false
> PorterStemFilter@6b5dbd1f term=let,bytes=[6c 65 74],startOffset=10,endOffset=15,positionIncrement=1,type=<ALPHANUM>,keyword=false
> PorterStemFilter@45cbde1b term=fix,bytes=[66 69 78],startOffset=16,endOffset=19,positionIncrement=1,type=<ALPHANUM>,keyword=false
> PorterStemFilter@bcd8f627 term=alreadi,bytes=[61 6c 72 65 61 64 69],startOffset=25,endOffset=32,positionIncrement=2,type=<ALPHANUM>,keyword=false
> {noformat}