Ben-Waters commented on code in PR #189:
URL: https://github.com/apache/commons-codec/pull/189#discussion_r1244635165


##########
src/test/java/org/apache/commons/codec/language/NysiisTest.java:
##########
@@ -140,7 +140,8 @@ public void testDropBy() throws EncoderException {
                 new String[] { "JILES", "JAL" },
                 // violates 6: if the last two characters are AY, remove A
                 new String[] { "CARRAWAY", "CARY" },       // Original: CARAY
-                new String[] { "YAMADA", "YANAD" });
+                new String[] { "YAMADA", "YANAD" },
+                new String[] { "ASH", "A"});

Review Comment:
   Based on the [steps from Wikipedia](https://en.wikipedia.org/wiki/New_York_State_Identification_and_Intelligence_System#cite_ref-taft_2-0),
 the core of the issue is that in step 4 we set the pointer to the first 
character instead of the second. Step 9 on Wikipedia is ambiguous, though, 
about whether it also applies to the first character. If it does, you get an 
empty string, which is what commons-codec currently returns. If it does not, 
you get 'A'.
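   
   To make the two readings concrete, here is a minimal sketch of the 
trailing-'A' removal under each interpretation. `dropTrailingA` and 
`protectFirstChar` are hypothetical names for illustration only, not 
commons-codec code, and the sketch assumes the key for "ASH" has already been 
reduced to "A" by the earlier steps.

```java
public class TrailingADemo {

    // Hypothetical helper, not part of commons-codec.
    // Removes a trailing 'A' from the key. When protectFirstChar is true,
    // the first character of the key is never removed.
    static String dropTrailingA(String key, boolean protectFirstChar) {
        int minLength = protectFirstChar ? 1 : 0;
        if (key.length() > minLength && key.charAt(key.length() - 1) == 'A') {
            return key.substring(0, key.length() - 1);
        }
        return key;
    }

    public static void main(String[] args) {
        // Assumption: intermediate key for "ASH" before the trailing-'A' step.
        String key = "A";

        // Reading 1: the step may remove the first character.
        System.out.println("[" + dropTrailingA(key, false) + "]"); // prints []

        // Reading 2: the first character is always kept.
        System.out.println("[" + dropTrailingA(key, true) + "]");  // prints [A]
    }
}
```

   Longer keys behave the same under both readings; the two only differ when 
the key is a single 'A'.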
   
   Unfortunately, I don't have access to the original source book for the 
algorithm, so I can't check which reading is intended. It looks like [Berkeley 
Law](https://lawcat.berkeley.edu/record/38122) has it, but I don't have an 
account there.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
