Re: [codec] Base32 decode is not case-insensitive?
All,

Can I get a review (and hopefully a commit) on this issue?

https://issues.apache.org/jira/browse/CODEC-234

I'd really like to have these features available directly in the library
instead of having to massage input before handing it off to commons-codec.

Thanks,
-chris

On 5/1/17 3:21 PM, Christopher Schultz wrote:
> Paulo,
>
> On 5/1/17 1:35 PM, Paulo Roberto Massa Cereda wrote:
>> Apologies, I quoted the wrong bit!
>>
>> --8<--
>> When decoding, upper and lower case letters are accepted, and i and l
>> will be treated as 1 and o will be treated as 0. When encoding, only
>> upper case letters are used.
>> --8<--
>
> So... none of the above are true in commons-codec. Was it the intent to
> follow Douglas Crockford's recommendations? If so, it's quite
> incompatible with RFC 4648.
>
> I think I'll file a JIRA issue for this and attach a patch.[1]
>
> Thanks,
> -chris
>
> [1] https://issues.apache.org/jira/browse/CODEC-234
>
>> On 01-05-2017 14:31, Paulo Roberto Massa Cereda wrote:
>>> 'ello!
>>>
>>> I suspect it has something to do with Douglas Crockford's base32 [1]:
>>>
>>> --8<--
>>> The encoding scheme is required to
>>>
>>> * Be human readable and machine readable.
>>> * Be compact. Humans have difficulty in manipulating long strings of
>>>   arbitrary symbols.
>>> * Be error resistant. Entering the symbols must not require
>>>   keyboarding gymnastics.
>>> * Be pronounceable. Humans should be able to accurately transmit the
>>>   symbols to other humans using a telephone.
>>> --8<--
>>>
>>> [1]: http://www.crockford.com/wrmg/base32.html
>>>
>>> Cheerio!
>>>
>>> Paulo
>>>
>>> On 01-05-2017 14:00, Christopher Schultz wrote:
>>>> All,
>>>>
>>>> I spent a few hours trying to figure out what I had done wrong when
>>>> trying to base32-decode a 32-character string and was getting a
>>>> 5-byte array back (instead of a 20-byte array, as expected).
>>>>
>>>> I finally determined that Base32.decode doesn't work properly with
>>>> lower-case input strings.
>>>>
>>>> Is there a particular reason that these don't work? Here's an
>>>> example:
>>>>
>>>> import org.apache.commons.codec.binary.Base32;
>>>> import org.apache.commons.codec.binary.Hex;
>>>>
>>>> public class Test {
>>>>     public static void main(String[] args) throws Exception {
>>>>         byte[] bytes = new Base32().decode("MI5AC42YG3U2CJXBF67ZLYAT5ZUJFGL4");
>>>>
>>>>         System.out.println("bytes.length=" + bytes.length);
>>>>         System.out.println("hex=" + Hex.encodeHexString(bytes));
>>>>
>>>>         bytes = new Base32().decode("mi5ac42yg3u2cjxbf67zlyat5zujfgl4");
>>>>
>>>>         System.out.println("bytes.length=" + bytes.length);
>>>>         System.out.println("hex=" + Hex.encodeHexString(bytes));
>>>>     }
>>>> }
>>>>
>>>> Produces this output:
>>>>
>>>> bytes.length=20
>>>> hex=623a01735836e9a126e12fbf95e013ee6892997c
>>>> bytes.length=5
>>>> hex=ef35bd7bfd
>>>>
>>>> It looks like Base32.decode is ignoring all of the characters it
>>>> doesn't see as being "in" its alphabet (all the lowercase letters)
>>>> and uses what is left over ("542326754" in the case above). If I
>>>> base32-decode "542326754" I get the 5-byte output that the second
>>>> case generates.
>>>>
>>>> While RFC 4648 doesn't specifically say that input strings should be
>>>> allowed to be case-insensitive, it does have this to say:
>>>>
>>>> "The Base 32 encoding is designed to represent arbitrary sequences
>>>> of octets in a form that needs to be case insensitive but that need
>>>> not be human readable." [1]
>>>>
>>>> The mention of case-insensitivity leads me to believe that case
>>>> should be ignored for decoding. The javadoc for the Base32 class
>>>> makes no mention of the requirement for input strings to be in
>>>> UPPER CASE, either.
>>>>
>>>> Would there be any appetite for extending the Base32 decoder to be
>>>> case-insensitive?
>>>>
>>>> I have a small patch and test case that look like they'll do the
>>>> trick.
>>>>
>>>> Thanks,
>>>> -chris
>>>>
>>>> [1] https://tools.ietf.org/html/rfc4648#section-6

To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org
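The behaviour reported in the message above (lower-case letters being dropped, so that only "542326754" is decoded) can be reproduced without commons-codec. What follows is a minimal sketch under stated assumptions, not the library's implementation and not the patch attached to CODEC-234: the class `Base32Sketch` and its method names are hypothetical, the alphabet is the RFC 4648 Base32 alphabet, and unknown characters are simply skipped. It also shows the "massage the input first" workaround, which upper-cases the text before decoding.

```java
import java.io.ByteArrayOutputStream;
import java.util.Locale;

// Hypothetical illustration only; not part of commons-codec.
final class Base32Sketch {
    // RFC 4648 Base32 alphabet: A-Z map to 0-25, 2-7 map to 26-31.
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    // Decodes Base32 text, silently skipping every character that is not
    // in the upper-case alphabet (including lower-case letters and '=').
    static byte[] decodeSkippingUnknown(String text) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int buffer = 0; // holds bits that have not been emitted yet
        int bits = 0;   // number of valid bits currently in buffer
        for (int i = 0; i < text.length(); i++) {
            int value = ALPHABET.indexOf(text.charAt(i));
            if (value < 0) {
                continue; // not in the alphabet: ignore it
            }
            buffer = (buffer << 5) | value;
            bits += 5;
            if (bits >= 8) {
                bits -= 8;
                out.write((buffer >> bits) & 0xFF);
                buffer &= (1 << bits) - 1; // keep only the leftover bits
            }
        }
        return out.toByteArray(); // trailing bits short of a byte are dropped
    }

    // Workaround: normalise the case before decoding.
    static byte[] decodeIgnoringCase(String text) {
        return decodeSkippingUnknown(text.toUpperCase(Locale.ROOT));
    }

    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String lower = "mi5ac42yg3u2cjxbf67zlyat5zujfgl4";
        System.out.println("skipping:    " + hex(decodeSkippingUnknown(lower)));
        System.out.println("ignore case: " + hex(decodeIgnoringCase(lower)));
    }
}
```

Upper-casing with `Locale.ROOT` keeps the result independent of the default locale; in a Turkish locale, for example, a plain `toUpperCase()` would map "i" to a dotted capital that is not in the alphabet.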
Re: [codec] Base32 decode is not case-insensitive?
Paulo,

On 5/1/17 1:35 PM, Paulo Roberto Massa Cereda wrote:
> Apologies, I quoted the wrong bit!
>
> --8<--
> When decoding, upper and lower case letters are accepted, and i and l
> will be treated as 1 and o will be treated as 0. When encoding, only
> upper case letters are used.
> --8<--

So... none of the above are true in commons-codec. Was it the intent to
follow Douglas Crockford's recommendations? If so, it's quite incompatible
with RFC 4648.

I think I'll file a JIRA issue for this and attach a patch.[1]

Thanks,
-chris

[1] https://issues.apache.org/jira/browse/CODEC-234
Re: [codec] Base32 decode is not case-insensitive?
Apologies, I quoted the wrong bit!

--8<--
When decoding, upper and lower case letters are accepted, and i and l
will be treated as 1 and o will be treated as 0. When encoding, only
upper case letters are used.
--8<--

Best,

Paulo
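The decoding rule quoted in this message (both cases accepted, i and l read as 1, o read as 0) can be sketched in a few lines. This is only an illustration of that rule: the class `CrockfordSketch` and its methods are hypothetical, the 32-symbol alphabet is assumed to be the one given on Crockford's page (digits plus upper-case letters without I, L, O and U), and the sketch treats the text as a number, ignoring hyphens, check symbols and overflow.

```java
// Hypothetical illustration only; not part of commons-codec.
final class CrockfordSketch {
    // Assumed Crockford alphabet: 0-9 then A-Z without I, L, O and U.
    private static final String SYMBOLS = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

    // Maps one input character to its 5-bit value, accepting either case
    // and treating i/l as 1 and o as 0, as the quoted text describes.
    static int valueOf(char c) {
        char u = Character.toUpperCase(c);
        if (u == 'I' || u == 'L') {
            u = '1';
        } else if (u == 'O') {
            u = '0';
        }
        int value = SYMBOLS.indexOf(u);
        if (value < 0) {
            throw new IllegalArgumentException("Not a Crockford Base32 symbol: " + c);
        }
        return value;
    }

    // Decodes a short symbol string to a number (no overflow check).
    static long decode(String text) {
        long result = 0;
        for (int i = 0; i < text.length(); i++) {
            result = result * 32 + valueOf(text.charAt(i));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(decode("10")); // one "thirty-two" and zero units
        System.out.println(decode("lo")); // same value after substitution
    }
}
```

Under this rule "I" and "L" are never letters with their own values, which is why the scheme cannot be read as the RFC 4648 alphabet, where "I", "L" and "O" stand for 8, 11 and 14.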
Re: [codec] Base32 decode is not case-insensitive?
'ello!

I suspect it has something to do with Douglas Crockford's base32 [1]:

--8<--
The encoding scheme is required to

* Be human readable and machine readable.
* Be compact. Humans have difficulty in manipulating long strings of
  arbitrary symbols.
* Be error resistant. Entering the symbols must not require keyboarding
  gymnastics.
* Be pronounceable. Humans should be able to accurately transmit the
  symbols to other humans using a telephone.
--8<--

[1]: http://www.crockford.com/wrmg/base32.html

Cheerio!

Paulo