more performance improvements for snowball
------------------------------------------
Key: LUCENE-2201
URL: https://issues.apache.org/jira/browse/LUCENE-2201
Project: Lucene - Java
Issue Type: Improvement
Components: contrib/analyzers
Reporter: Robert Muir
Priority: Minor
Attachments: LUCENE-2201.patch
I took a more serious look at snowball after LUCENE-2194.
The attached patch greatly improves performance, but note that it makes some
minor breaking changes to snowball internals:
* Among.s becomes a char[] instead of a String
* SnowballProgram.current becomes a char[] instead of a StringBuilder
* SnowballProgram.eq_s(int, String) becomes eq_s(int, CharSequence), so that
eq_v(StringBuilder) doesn't need to create an extra String.
* the same change applies to eq_s_b and eq_v_b
* replace_s(int, int, String) becomes replace_s(int, int, CharSequence), so
that StringBuilder-based slice and insertion methods don't need to create an
extra String.
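
To illustrate the idea behind these changes, here is a minimal sketch. It is
not the code from the patch: the class CharBufferSketch and the methods eqS,
eqV, replaceS and text are hypothetical stand-ins for SnowballProgram and its
eq_s, eq_v and replace_s methods, and the cursor handling is simplified. It
only shows how a char[] buffer plus CharSequence parameters lets a
StringBuilder be compared or inserted without calling toString() first.

```java
import java.util.Arrays;

// Hypothetical sketch only; names and details are assumptions, not Snowball source.
final class CharBufferSketch {
    char[] current; // stand-in for SnowballProgram.current as a char[]
    int limit;      // number of valid chars in current
    int cursor;     // current read position

    CharBufferSketch(String word) {
        current = word.toCharArray();
        limit = current.length;
        cursor = 0;
    }

    // eq_s-style forward match. Taking a CharSequence means a String or a
    // StringBuilder can be passed directly, with no intermediate String.
    boolean eqS(int sLen, CharSequence s) {
        if (limit - cursor < sLen) return false;
        for (int i = 0; i < sLen; i++) {
            if (current[cursor + i] != s.charAt(i)) return false;
        }
        cursor += sLen;
        return true;
    }

    // eq_v-style match: delegates without s.toString().
    boolean eqV(StringBuilder s) {
        return eqS(s.length(), s);
    }

    // replace_s-style edit: replaces current[bra, ket) with s and returns
    // the change in length.
    int replaceS(int bra, int ket, CharSequence s) {
        int adjustment = s.length() - (ket - bra);
        int newLimit = limit + adjustment;
        if (newLimit > current.length) {
            current = Arrays.copyOf(current, newLimit); // grow, keeping contents
        }
        if (adjustment != 0 && ket < limit) {
            // Shift the tail; arraycopy handles overlapping regions correctly.
            System.arraycopy(current, ket, current, bra + s.length(), limit - ket);
        }
        for (int i = 0; i < s.length(); i++) {
            current[bra + i] = s.charAt(i);
        }
        limit = newLimit;
        if (cursor >= ket) cursor += adjustment;
        else if (cursor > bra) cursor = bra;
        return adjustment;
    }

    // Materializes a String only when the caller actually asks for one.
    String text() {
        return new String(current, 0, limit);
    }
}
```

In this sketch, a call such as eqV(new StringBuilder("run")) walks the builder
with charAt and allocates nothing, which is the kind of saving the list above
describes.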
All of these "breaks" are IMHO only theoretical; the problem is just that
pretty much everything is public or protected in the snowball internals.
The performance improvement here depends heavily on the snowball language in
use, but it's far more significant than that of LUCENE-2194.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.