Ben-Waters commented on code in PR #189:
URL: https://github.com/apache/commons-codec/pull/189#discussion_r1244635165
##########
src/test/java/org/apache/commons/codec/language/NysiisTest.java:
##########
@@ -140,7 +140,8 @@ public void testDropBy() throws EncoderException {
new String[] { "JILES", "JAL" },
// violates 6: if the last two characters are AY, remove A
new String[] { "CARRAWAY", "CARY" }, // Original: CARAY
- new String[] { "YAMADA", "YANAD" });
+ new String[] { "YAMADA", "YANAD" },
+ new String[] { "ASH", "A"});
Review Comment:
Based on the [steps from
Wikipedia](https://en.wikipedia.org/wiki/New_York_State_Identification_and_Intelligence_System#cite_ref-taft_2-0),
the core of the issue is that step 4 should set the pointer to the second
character, but we are setting it to the first character. Step 9 on Wikipedia
is ambiguous, though, as to whether it also applies to the first character of
the key. If it does, you get an empty string, which is what commons-codec
currently returns. If not, you get 'A'.
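
To make the two readings of step 9 concrete, here is a minimal sketch. It is not the commons-codec implementation: the class and method names are hypothetical, and it assumes the key for "ASH" has already been reduced to "A" by the earlier steps.

```java
public class NysiisStep9Sketch {

    // Reading 1: step 9 applies to the whole key, first character included.
    public static String dropTrailingAInclusive(String key) {
        if (key.endsWith("A")) {
            return key.substring(0, key.length() - 1);
        }
        return key;
    }

    // Reading 2: the first character of the key is never removed.
    public static String dropTrailingAExclusive(String key) {
        if (key.length() > 1 && key.endsWith("A")) {
            return key.substring(0, key.length() - 1);
        }
        return key;
    }

    public static void main(String[] args) {
        // Assumed intermediate key for "ASH" just before step 9.
        String key = "A";
        System.out.println("[" + dropTrailingAInclusive(key) + "]"); // prints "[]"
        System.out.println("[" + dropTrailingAExclusive(key) + "]"); // prints "[A]"
    }
}
```

Both readings agree on longer keys such as "CARA"; they only differ when the key is the single character "A".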
Unfortunately, I don't have access to the original source book for the
algorithm to clarify this. It looks like [Berkeley
Law](https://lawcat.berkeley.edu/record/38122) has it, but I don't have an
account there.