[
https://issues.apache.org/jira/browse/CODEC-57?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12560781#action_12560781
]
Henri Yandell commented on CODEC-57:
------------------------------------
Looking at the code, the first thing it does is turn a leading WH into a W.
Later on, both W and Y are silent if they are not followed by a vowel. For
"why" that leaves WY, where neither letter is followed by a vowel, hence the
empty string.
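To make the interaction of those two rules concrete, here is a minimal sketch
in Java. It is not the Commons Codec implementation: the class name
WhRuleSketch and the method encode are hypothetical, and every letter other
than W and Y is simply skipped, so the output is not a real Metaphone code.

```java
import java.util.Locale;

// Hypothetical illustration of only the two rules discussed above.
// This is NOT the Commons Codec Metaphone class.
final class WhRuleSketch {

    // True if position i exists in s and holds one of A, E, I, O, U.
    // Note that Y is deliberately not treated as a vowel here.
    private static boolean isVowel(String s, int i) {
        return i >= 0 && i < s.length() && "AEIOU".indexOf(s.charAt(i)) >= 0;
    }

    // Applies rule 1 (leading WH becomes W) and rule 2 (W and Y are
    // silent unless followed by a vowel). All other letters are ignored.
    static String encode(String word) {
        String s = word.toUpperCase(Locale.ENGLISH);

        // Rule 1: a leading "WH" is rewritten to "W".
        if (s.startsWith("WH")) {
            s = "W" + s.substring(2);
        }

        StringBuilder code = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Rule 2: keep W or Y only when the next letter is a vowel.
            if ((c == 'W' || c == 'Y') && isVowel(s, i + 1)) {
                code.append(c);
            }
        }
        return code.toString();
    }
}
```

In this sketch, encode("why") rewrites the input to WY, drops the W because Y
is not counted as a vowel, drops the Y because nothing follows it, and returns
an empty string.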
Playing with the link above, it looks like WH is turned into H there. A quick
look at the PHP source code shows that it is indeed converted to H. Another
quick look, this time at text.rubyforge, shows that the Ruby version converts
to W as we do [though it claims to compare itself with the PHP version for
differences].
Looking at DoubleMetaphone, it handles WH differently. If ^WH, then it'll
append an A.
Looking at the original BASIC code [as posted by aspell.sf.net]:
IF TWO = "WH" THEN ENAME = "W":ENAME[3,9999]
So it looks like PHP is the one with the bigger bug: a surname of WHYE should
be YE and not HE. It also seems to me that Metaphone itself is weak here; in my
opinion it should treat 'Y' as a vowel when checking whether 'W' is followed by
a vowel.
I'm not sure what we should do though. The documentation at
http://text.rubyforge.org/svn/lib/text/metaphone.rb also indicates that there
are other bugs in the original BASIC compared to the original discussion
(anyone got that magazine article? :) ). So this might just be a bug in the
BASIC implementation rather than the original algorithm.
> Metaphone.metaphone(String) returns an empty string when passed the word
> "why".
> -------------------------------------------------------------------------------
>
> Key: CODEC-57
> URL: https://issues.apache.org/jira/browse/CODEC-57
> Project: Commons Codec
> Issue Type: Bug
> Affects Versions: 1.3
> Environment: Commons-codec built from source using jdk 1.4.2.
> OS: Windows XP
> Java Build: 1.4.2
> Reporter: Adam Wilmore
> Fix For: 1.4
>
>
> An empty string is returned from the Metaphone.metaphone(String) method when
> passed the value "why". Variations on the value, such as "wwwhy" and "wwhhhy"
> also return empty strings.
> This appears to be an issue since other implementations of the metaphone
> algorithm, namely the PHP version, return "H" when passed the value "why".