Re: [codec] Base32 decode is not case-insensitive?

2017-09-21 Thread Christopher Schultz

All,

Can I get a review (and hopefully a commit) on this issue?

https://issues.apache.org/jira/browse/CODEC-234

I'd really like to have these features available directly in the
library instead of having to massage input before handing it off to
commons-codec.
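
For readers wondering what "massaging" the input might look like, here
is a minimal sketch. It assumes the only problem is letter case; the
class and method names are hypothetical and not part of commons-codec.
Locale.ROOT is used so that the result does not depend on the default
locale (in a Turkish locale, "i" would otherwise upper-case to a dotted
capital I, which is not in the Base32 alphabet).

```java
import java.util.Locale;

public class Base32InputMassageSketch {

    // Hypothetical helper: fold the input to upper case before decoding.
    public static String toUpperForBase32(String input) {
        return input.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String upper = toUpperForBase32("mi5ac42yg3u2cjxbf67zlyat5zujfgl4");
        // The result would then be handed to the library, e.g.
        // new org.apache.commons.codec.binary.Base32().decode(upper)
        System.out.println(upper);
    }
}
```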

Thanks,
-chris


Re: [codec] Base32 decode is not case-insensitive?

2017-05-01 Thread Christopher Schultz

Paulo,

On 5/1/17 1:35 PM, Paulo Roberto Massa Cereda wrote:
> Apologies, I quoted the wrong bit!
> 
> --8< When decoding, upper and lower case
> letters are accepted, and i and l will be treated as 1 and o will
> be treated as 0. When encoding, only upper case letters are used. 
> --8<

So... none of the above are true in commons-codec. Was it the intent to
follow Douglas Crockford's recommendations? If so, it's quite
incompatible with RFC 4648.

I think I'll file a JIRA issue for this and attach a patch.[1]

Thanks,
-chris

[1] https://issues.apache.org/jira/browse/CODEC-234



Re: [codec] Base32 decode is not case-insensitive?

2017-05-01 Thread Paulo Roberto Massa Cereda

Apologies, I quoted the wrong bit!

--8<
When decoding, upper and lower case letters are accepted, and i and l 
will be treated as 1 and o will be treated as 0. When encoding, only 
upper case letters are used.

--8<
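
The decoding rule quoted above can be sketched as a normalization
step. This is only an illustration with hypothetical names, not code
from commons-codec or from Crockford's page. Note that Crockford's
symbol set (digits 0-9 plus the letters without I, L, O and U) differs
from the RFC 4648 alphabet (A-Z plus 2-7), so the normalized string is
not valid input for an RFC 4648 decoder.

```java
import java.util.Locale;

public class CrockfordNormalizeSketch {

    // Applies the quoted rule: accept either case, read I and L as 1,
    // and read O as 0. All other characters are passed through.
    public static String normalize(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (char c : input.toUpperCase(Locale.ROOT).toCharArray()) {
            switch (c) {
                case 'I':
                case 'L':
                    out.append('1');
                    break;
                case 'O':
                    out.append('0');
                    break;
                default:
                    out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(normalize("oil")); // prints 011
    }
}
```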

Best,

Paulo


-
To unsubscribe, e-mail: user-unsubscr...@commons.apache.org
For additional commands, e-mail: user-h...@commons.apache.org






Re: [codec] Base32 decode is not case-insensitive?

2017-05-01 Thread Paulo Roberto Massa Cereda

'ello!

I suspect it has something to do with Douglas Crockford's base32 [1]:

-8<---
The encoding scheme is required to

* Be human readable and machine readable.
* Be compact. Humans have difficulty in manipulating long strings of 
arbitrary symbols.
* Be error resistant. Entering the symbols must not require keyboarding 
gymnastics.
* Be pronounceable. Humans should be able to accurately transmit the 
symbols to other humans using a telephone.

-8<---

[1]: http://www.crockford.com/wrmg/base32.html

Cheerio!

Paulo

On 01-05-2017 14:00, Christopher Schultz wrote:


All,

I spent a few hours trying to figure out what I had done wrong when
trying to base32-decode a 32-character string and was getting a 5-byte
array back (instead of a 20-byte array, as expected).

I finally determined that Base32.decode doesn't work properly with
lower-case input strings.

Is there a particular reason that these don't work? Here's an example:

import org.apache.commons.codec.binary.Base32;
import org.apache.commons.codec.binary.Hex;

public class Test {
    public static void main(String[] args) throws Exception {
        byte[] bytes = new Base32().decode("MI5AC42YG3U2CJXBF67ZLYAT5ZUJFGL4");

        System.out.println("bytes.length=" + bytes.length);
        System.out.println("hex=" + Hex.encodeHexString(bytes));

        bytes = new Base32().decode("mi5ac42yg3u2cjxbf67zlyat5zujfgl4");

        System.out.println("bytes.length=" + bytes.length);
        System.out.println("hex=" + Hex.encodeHexString(bytes));
    }
}

Produces this output:

bytes.length=20
hex=623a01735836e9a126e12fbf95e013ee6892997c
bytes.length=5
hex=ef35bd7bfd

It looks like Base32.decode is ignoring all of the characters it
doesn't see as being "in" its alphabet (all the lowercase letters) and
uses what is left over ("542326754" in the case above). If I
base32-decode "542326754" I get the 5-byte output that the second case
generates.
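
The observation above can be reproduced without commons-codec. The
following is a minimal sketch that assumes a decoder which silently
skips every character outside the upper-case RFC 4648 alphabet and
drops leftover bits at the end. It is not the library's actual
implementation, and all names in it are hypothetical.

```java
public class LenientBase32Sketch {
    // Upper-case RFC 4648 Base32 alphabet: index in this string = 5-bit value.
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    // Keeps only the characters that belong to the upper-case alphabet.
    public static String keepAlphabetOnly(String input) {
        StringBuilder kept = new StringBuilder();
        for (char c : input.toCharArray()) {
            if (ALPHABET.indexOf(c) >= 0) {
                kept.append(c);
            }
        }
        return kept.toString();
    }

    // Accumulates 5 bits per kept symbol and emits a byte whenever 8 bits
    // are available; bits left over at the end are discarded.
    public static byte[] decodeLeniently(String input) {
        String kept = keepAlphabetOnly(input);
        byte[] out = new byte[kept.length() * 5 / 8];
        int buffer = 0;
        int bits = 0;
        int pos = 0;
        for (char c : kept.toCharArray()) {
            buffer = (buffer << 5) | ALPHABET.indexOf(c);
            bits += 5;
            if (bits >= 8) {
                bits -= 8;
                out[pos++] = (byte) (buffer >> bits);
                buffer &= (1 << bits) - 1;
            }
        }
        return out;
    }

    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String lower = "mi5ac42yg3u2cjxbf67zlyat5zujfgl4";
        System.out.println(keepAlphabetOnly(lower));     // prints 542326754
        System.out.println(hex(decodeLeniently(lower))); // prints ef35bd7bfd
    }
}
```

Only the nine digits survive the filter, and nine symbols carry 45
bits, which yields five whole bytes with five bits left over.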

While RFC4648 doesn't specifically say that input strings should be
allowed to be case-insensitive, it does have this to say:

"
The Base 32 encoding is designed to represent arbitrary sequences of
octets in a form that needs to be case insensitive but that need not
be human readable.
" [1]

The mention of case-insensitivity leads me to believe that case should
be ignored for decoding. The javadoc for the Base32 class makes no
mention of the requirement for input strings to be in UPPER CASE, either.

Would there be any appetite for extending the Base32 decoder to be
case-insensitive?
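
As an illustration of what a case-insensitive decoder could look like,
here is a self-contained sketch. It is not the patch mentioned below
and not the commons-codec implementation; the class name is
hypothetical, and rejecting unknown characters (instead of skipping
them) is a design choice of this sketch.

```java
public class CaseInsensitiveBase32Sketch {
    // Upper-case RFC 4648 Base32 alphabet: index in this string = 5-bit value.
    private static final String ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567";

    public static byte[] decode(String input) {
        // Ignore trailing '=' padding.
        int end = input.length();
        while (end > 0 && input.charAt(end - 1) == '=') {
            end--;
        }
        byte[] out = new byte[end * 5 / 8];
        int buffer = 0;
        int bits = 0;
        int pos = 0;
        for (int i = 0; i < end; i++) {
            char c = input.charAt(i);
            // Fold ASCII lower-case letters to upper case before the lookup.
            if (c >= 'a' && c <= 'z') {
                c = (char) (c - 'a' + 'A');
            }
            int value = ALPHABET.indexOf(c);
            if (value < 0) {
                throw new IllegalArgumentException(
                        "Not a Base32 character at index " + i + ": " + c);
            }
            buffer = (buffer << 5) | value;
            bits += 5;
            if (bits >= 8) {
                bits -= 8;
                out[pos++] = (byte) (buffer >> bits);
                buffer &= (1 << bits) - 1;
            }
        }
        return out;
    }

    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Both spellings decode to the same 20 bytes in this sketch.
        System.out.println(hex(decode("MI5AC42YG3U2CJXBF67ZLYAT5ZUJFGL4")));
        System.out.println(hex(decode("mi5ac42yg3u2cjxbf67zlyat5zujfgl4")));
    }
}
```

Folding only the ASCII range a-z avoids any dependence on the default
locale.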

I have a small patch and test case that look like they'll do the trick.

Thanks,
-chris

[1] https://tools.ietf.org/html/rfc4648#section-6
