[ https://issues.apache.org/jira/browse/SOLR-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Chris Male updated SOLR-1710:
-----------------------------

    Attachment: SOLR-1710-readable.patch

Updated patch with method name changes: doXYZ is now shouldXYZ, and writeClear is now writeAndClear.

> convert worddelimiterfilter to new tokenstream API
> --------------------------------------------------
>
>                 Key: SOLR-1710
>                 URL: https://issues.apache.org/jira/browse/SOLR-1710
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>            Reporter: Robert Muir
>         Attachments: SOLR-1710-readable.patch, SOLR-1710-readable.patch, SOLR-1710.patch, SOLR-1710.patch
>
>
> This one was a doozy. Attached is a patch to convert it to the new TokenStream API.
> Some of the logic was split into WordDelimiterIterator, which exposes a BreakIterator-like API for iterating over subwords.
> The filter is much more efficient now: no cloning.
> Before applying the patch, copy the existing WordDelimiterFilter to OriginalWordDelimiterFilter.
> The patch includes a test case (TestWordDelimiterBWComp) that generates random strings from various subword combinations.
> For each random string, it compares the output against the existing WordDelimiterFilter for all 512 combinations of boolean parameters.
> NOTE: due to bugs found (SOLR-1706), this currently tests only 256 of these combinations. The bugs discovered in SOLR-1706 are fixed here.
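
For readers unfamiliar with this style of backwards-compatibility test, here is a minimal sketch of enumerating every combination of boolean parameters with a bitmask and comparing two implementations on each one. It is not the code from the patch. The class name FlagCombinations, the constant NUM_FLAGS, and the countMismatches helper are hypothetical, and the flag count of 9 is only inferred from 512 = 2^9. The two filters are stood in for by plain functions from (input, flags) to a list of tokens.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

public class FlagCombinations {
    // Assumption: 9 boolean parameters, inferred from 512 = 2^9.
    static final int NUM_FLAGS = 9;

    /** Decodes a bitmask: flags[i] is true when bit i of mask is set. */
    static boolean[] decode(int mask) {
        boolean[] flags = new boolean[NUM_FLAGS];
        for (int i = 0; i < NUM_FLAGS; i++) {
            flags[i] = (mask & (1 << i)) != 0;
        }
        return flags;
    }

    /** Returns every combination of the NUM_FLAGS boolean parameters. */
    static List<boolean[]> allCombinations() {
        List<boolean[]> result = new ArrayList<>();
        for (int mask = 0; mask < (1 << NUM_FLAGS); mask++) {
            result.add(decode(mask));
        }
        return result;
    }

    /**
     * Runs both implementations on every input under every flag
     * combination and counts the cases where their outputs differ.
     * The two functions are hypothetical stand-ins for the original
     * and the rewritten filter.
     */
    static int countMismatches(
            List<String> inputs,
            BiFunction<String, boolean[], List<String>> reference,
            BiFunction<String, boolean[], List<String>> candidate) {
        int mismatches = 0;
        for (String input : inputs) {
            for (boolean[] flags : allCombinations()) {
                List<String> expected = reference.apply(input, flags);
                List<String> actual = candidate.apply(input, flags);
                if (!expected.equals(actual)) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Prints the number of combinations covered by the loop.
        System.out.println(allCombinations().size());
    }
}
```

Holding one of the flags fixed halves the space, which is consistent with the 256 combinations mentioned in the note above.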