[
https://issues.apache.org/jira/browse/CODEC-107?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12985467#action_12985467
]
Gary Gregory commented on CODEC-107:
------------------------------------
Feel free to provide a patch.
Personally, I do not see the point of providing performance comparisons of
language encoders. So I would not include that part of the docs (my opinion.)
> Enhance documentation for Language Encoders
> -------------------------------------------
>
> Key: CODEC-107
> URL: https://issues.apache.org/jira/browse/CODEC-107
> Project: Commons Codec
> Issue Type: Improvement
> Affects Versions: 1.4
> Reporter: Marc Pompl
> Priority: Minor
> Fix For: 1.5
>
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> The current userguide (http://commons.apache.org/codec/userguide.html) lists
> only four Language Encoders, but there are currently five. CODEC-106
> implements a sixth one.
> It would be a good idea to complete the documentation.
> Additionally, I suggest extending the userguide to show a simple
> performance measurement:
> _SNIP_
> org.apache.commons.codec.language.Metaphone encodings per msec: 327
> org.apache.commons.codec.language.DoubleMetaphone encodings per msec: 224
> org.apache.commons.codec.language.Soundex encodings per msec: 904
> org.apache.commons.codec.language.RefinedSoundex encodings per msec: 637
> org.apache.commons.codec.language.Caverphone encodings per msec: 5
> org.apache.commons.codec.language.ColognePhonetic encodings per msec: 289
> So, Soundex is the fastest encoder. Caverphone is much slower than any other
> algorithm. The remaining encoders fall between roughly 220 and 640 encodings
> per msec.
> Checked with the following code:
> {code:java}
> private static final int REPEATS = 1000000;
>
> public void checkSpeed() throws Exception {
>     checkSpeedEncoding(new Metaphone(), "easgasg", REPEATS);
>     checkSpeedEncoding(new DoubleMetaphone(), "easgasg", REPEATS);
>     checkSpeedEncoding(new Soundex(), "easgasg", REPEATS);
>     checkSpeedEncoding(new RefinedSoundex(), "easgasg", REPEATS);
>     checkSpeedEncoding(new Caverphone(), "Carlene", 100000);
>     checkSpeedEncoding(new ColognePhonetic(), "Schmitt", REPEATS);
> }
>
> // Times 'repeats' calls to encoder.encode() and prints the throughput.
> private void checkSpeedEncoding(Encoder encoder, String toBeEncoded, int repeats) throws Exception {
>     long start = System.currentTimeMillis();
>     for (int i = 0; i < repeats; i++) {
>         encoder.encode(toBeEncoded);
>     }
>     // Clamp to 1 ms so a very fast run cannot cause a division by zero.
>     long duration = Math.max(1, System.currentTimeMillis() - start);
>     System.out.println(encoder.getClass().getName() + " encodings per msec: " + (repeats / duration));
> }
> {code}
> _SNAP_
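
For reference, the measurement quoted above could be made somewhat more robust along the following lines. This is only a sketch under stated assumptions, not part of Commons Codec or of the proposed patch: the class `EncoderThroughput`, its `measure` method, the `sink` field, and the toy upper-casing "encoder" are all hypothetical names introduced for illustration. It adds a warm-up pass, times with `System.nanoTime()`, and divides in floating point so short runs neither truncate nor divide by zero.

```java
import java.util.function.Function;

// Hypothetical harness for illustration; not part of Commons Codec.
final class EncoderThroughput {

    // Results are written here so the JIT cannot discard the encode calls as dead code.
    static volatile Object sink;

    /** Returns the measured number of encodings per millisecond. */
    static double measure(Function<String, Object> encode, String input, int warmup, int repeats) {
        // Warm-up pass: give the JIT a chance to compile the hot path before timing.
        for (int i = 0; i < warmup; i++) {
            sink = encode.apply(input);
        }
        long start = System.nanoTime();
        for (int i = 0; i < repeats; i++) {
            sink = encode.apply(input);
        }
        // Clamp to 1 ns so the division below is always defined.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        // Floating-point division avoids integer truncation.
        return repeats / (elapsedNanos / 1_000_000.0);
    }

    public static void main(String[] args) {
        // Toy stand-in for a real encoder (assumption: a real one would be wrapped in a lambda).
        Function<String, Object> toy = s -> s.toUpperCase();
        double rate = measure(toy, "easgasg", 100_000, 1_000_000);
        System.out.printf("toy encoder encodings per msec: %.1f%n", rate);
    }
}
```

The absolute numbers from such a loop still depend heavily on the JVM, the hardware, and the input string, so they are best read as a rough ordering rather than as a precise comparison.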