[ https://issues.apache.org/jira/browse/SOLR-1710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Robert Muir updated SOLR-1710:
------------------------------
    Attachment: SOLR-1710.patch

> convert worddelimiterfilter to new tokenstream API
> --------------------------------------------------
>
>                 Key: SOLR-1710
>                 URL: https://issues.apache.org/jira/browse/SOLR-1710
>             Project: Solr
>          Issue Type: Improvement
>          Components: Schema and Analysis
>            Reporter: Robert Muir
>         Attachments: SOLR-1710.patch
>
>
> This one was a doozy. Attached is a patch to convert WordDelimiterFilter
> to the new TokenStream API.
>
> Some of the logic was split out into WordDelimiterIterator, which exposes
> a BreakIterator-like API for iterating over subwords. The filter is much
> more efficient now because it no longer clones tokens.
>
> Before applying the patch, rename the existing WordDelimiterFilter to
> OriginalWordDelimiterFilter.
>
> The patch includes a test case (TestWordDelimiterBWComp) that generates
> random strings from various subword combinations. For each random string,
> it compares the new filter's output against that of the existing
> WordDelimiterFilter for all 512 combinations of the boolean parameters.
>
> NOTE: because of the bugs found in SOLR-1706, the test currently covers
> only 256 of these combinations. The bugs discovered in SOLR-1706 are
> fixed here.
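The exhaustive comparison described for TestWordDelimiterBWComp could be sketched roughly as follows. This is not the code from the patch: the class name, method names, and the BiFunction stand-ins for the old and new filters are all hypothetical, and the flag count of 9 is only inferred from the stated 512 = 2^9 combinations. Each integer mask from 0 to 511 is decoded into one assignment of the boolean parameters, and both implementations are run on every random input under every assignment.

```java
import java.util.List;
import java.util.Random;
import java.util.function.BiFunction;

// Hypothetical back-compat harness; none of these names come from SOLR-1710.patch.
final class BackCompatSketch {
    // Assumption: 9 boolean parameters, inferred from 512 = 2^9 combinations.
    static final int FLAG_COUNT = 9;

    // Decodes bit i of the mask into the value of boolean parameter i.
    static boolean[] decode(int mask, int flagCount) {
        boolean[] flags = new boolean[flagCount];
        for (int i = 0; i < flagCount; i++) {
            flags[i] = (mask & (1 << i)) != 0;
        }
        return flags;
    }

    // Builds a random input by concatenating 1..maxPieces randomly chosen subword pieces.
    static String randomInput(Random random, String[] pieces, int maxPieces) {
        int count = 1 + random.nextInt(maxPieces);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < count; i++) {
            sb.append(pieces[random.nextInt(pieces.length)]);
        }
        return sb.toString();
    }

    // Runs both implementations on every input under every flag combination
    // and returns how many (input, combination) pairs produced different tokens.
    static int countMismatches(
            BiFunction<String, boolean[], List<String>> originalFilter,
            BiFunction<String, boolean[], List<String>> newFilter,
            List<String> inputs) {
        int mismatches = 0;
        for (String input : inputs) {
            for (int mask = 0; mask < (1 << FLAG_COUNT); mask++) {
                boolean[] flags = decode(mask, FLAG_COUNT);
                List<String> expected = originalFilter.apply(input, flags);
                List<String> actual = newFilter.apply(input, flags);
                if (!expected.equals(actual)) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }
}
```

A real test would replace the two BiFunction arguments with code that runs the renamed original filter and the converted filter over the same input and collects their tokens. Holding one bit of the mask fixed halves the loop to 256 combinations, which matches the reduced coverage mentioned in the NOTE.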