[ 
https://issues.apache.org/jira/browse/CODEC-341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18089635#comment-18089635
 ] 

Ruiqi Dong commented on CODEC-341:
----------------------------------

Base32 has a similar issue. I created a new ticket: 
[https://issues.apache.org/jira/browse/CODEC-342].

> Base16.Builder#setEncodeTable(...) can create an instance that cannot decode 
> its own output
> -------------------------------------------------------------------------------------------
>
>                 Key: CODEC-341
>                 URL: https://issues.apache.org/jira/browse/CODEC-341
>             Project: Commons Codec
>          Issue Type: Bug
>            Reporter: Ruiqi Dong
>            Priority: Major
>
> *Summary*
> Base16.Builder exposes setEncodeTable(...), which suggests callers can 
> provide a custom Base16 alphabet. Encoding does honor the custom table, but 
> the builder only switches the decode table between the built-in upper-case 
> and lower-case variants. As a result, a Base16 instance created with an 
> arbitrary custom alphabet can emit encoded data that the same instance 
> decodes incorrectly. *This issue also affects Base32.* Is it fine to report 
> Base32 in this ticket, or do I need to create a new ticket?
>  
> *Affected code*
> File: src/main/java/org/apache/commons/codec/binary/Base16.java
> File: src/main/java/org/apache/commons/codec/binary/Base32.java
> {code:java}
> // Base16
> @Override
> public Builder setEncodeTable(final byte... encodeTable) {
>     super.setDecodeTableRaw(Arrays.equals(encodeTable, LOWER_CASE_ENCODE_TABLE)
>             ? LOWER_CASE_DECODE_TABLE : UPPER_CASE_DECODE_TABLE);
>     return super.setEncodeTable(encodeTable);
> }{code}
> {code:java}
> // Base32
> @Override
> public Builder setEncodeTable(final byte... encodeTable) {
>     super.setDecodeTableRaw(Arrays.equals(encodeTable, HEX_ENCODE_TABLE)
>             ? HEX_DECODE_TABLE : DECODE_TABLE);
>     return super.setEncodeTable(encodeTable);
> } {code}
>  
> *Reproducer* 
> Add the following test to 
> src/test/java/org/apache/commons/codec/binary/Base16Test.java:
> {code:java}
> @Test
> void testBuilderCustomEncodeTableAffectsDecodeTable() {
>     final byte[] encodeTable = "0123456789ABCDEF".getBytes(StandardCharsets.US_ASCII);
>     final byte tmp = encodeTable[0];
>     encodeTable[0] = encodeTable[1];
>     encodeTable[1] = tmp;
>     final Base16 base16 = Base16.builder().setEncodeTable(encodeTable).get();
>     final byte[] encoded = base16.encode(new byte[] { 1 });
>     assertEquals("10", new String(encoded, StandardCharsets.US_ASCII),
>             "A custom Base16 alphabet should affect encoding");
>     assertArrayEquals(new byte[] { 1 }, base16.decode(encoded),
>             "A custom Base16 alphabet should decode its own encoded output");
> }{code}
> Run:
> {code:java}
> mvn -q -Dtest=org.apache.commons.codec.binary.Base16Test#testBuilderCustomEncodeTableAffectsDecodeTable test {code}
> Observed behavior:
> The encoding assertion passes, showing that the custom alphabet is used. The 
> encoded output is:
> {code:java}
> 10{code}
> But the decode assertion fails because 10 is interpreted with the default 
> decode table:
> {code:java}
> array contents differ at index [0], expected: <1> but was: <16> {code}
> Expected behavior:
> If setEncodeTable(...) is part of the public builder API, the resulting 
> Base16 instance should use a matching decode table so that it can decode its 
> own output consistently. If arbitrary custom alphabets are not supported, the 
> builder should reject them instead of silently pairing them with an 
> incompatible decode table.
> Add the following test to 
> src/test/java/org/apache/commons/codec/binary/Base32Test.java:
> {code:java}
> @Test
> void testBuilderCustomEncodeTableAffectsDecodeTable() {
>     final byte[] encodeTable = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".getBytes(StandardCharsets.US_ASCII);
>     final byte tmp = encodeTable[0];
>     encodeTable[0] = encodeTable[1];
>     encodeTable[1] = tmp;
>     final Base32 base32 = Base32.builder().setEncodeTable(encodeTable).setLineLength(0).get();
>     final byte[] encoded = base32.encode(new byte[] { 0 });
>     assertEquals("BB======", new String(encoded, StandardCharsets.US_ASCII),
>             "A custom Base32 alphabet should affect encoding");
>     assertArrayEquals(new byte[] { 0 }, base32.decode(encoded),
>             "A custom Base32 alphabet should decode its own encoded output");
> } {code}
> Run:
> {code:java}
> mvn -q -Dtest=org.apache.commons.codec.binary.Base32Test#testBuilderCustomEncodeTableAffectsDecodeTable test {code}
> Observed behavior:
> The encoding assertion passes, showing that the custom alphabet is used. The 
> encoded output is:
> {code:java}
> BB====== {code}
> But the decode assertion fails because "BB======" is interpreted with the 
> default decode table:
> {code:java}
> array contents differ at index [0], expected: <0> but was: <8> {code}
> Expected behavior:
> The resulting Base32 instance should use a matching decode table so that it 
> can decode its own output consistently. If arbitrary custom alphabets are not 
> supported, the builder should reject them instead of silently pairing them 
> with an incompatible decode table.
>  
>  
> This is a configuration/state inconsistency in a public API. The builder 
> accepts a custom alphabet and encoding follows that configuration, but 
> decoding silently continues to interpret characters under a different 
> alphabet. That makes the configured Base16 and Base32 instances internally 
> inconsistent.
>  
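
To illustrate the "matching decode table" idea from the description above, here is a minimal sketch of how a decode table could be derived by inverting a custom encode table. It is not the Commons Codec implementation or a proposed patch: the class DecodeTableSketch, the method toDecodeTable, the INVALID marker, and the 256-entry table size are all assumptions made for illustration. The sketch also rejects alphabets that repeat a symbol, since such a table cannot be inverted unambiguously.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Commons Codec. */
final class DecodeTableSketch {

    /** Assumed marker for byte values that are not in the alphabet. */
    static final byte INVALID = -1;

    /**
     * Builds a 256-entry decode table that inverts the given encode table:
     * decodeTable[symbol] is the index of that symbol in encodeTable.
     */
    static byte[] toDecodeTable(final byte[] encodeTable) {
        final byte[] decodeTable = new byte[256];
        Arrays.fill(decodeTable, INVALID);
        for (int value = 0; value < encodeTable.length; value++) {
            final int symbol = encodeTable[value] & 0xFF;
            if (decodeTable[symbol] != INVALID) {
                // A repeated symbol would make decoding ambiguous, so reject it.
                throw new IllegalArgumentException(
                        "Duplicate symbol in encode table: " + (char) symbol);
            }
            decodeTable[symbol] = (byte) value;
        }
        return decodeTable;
    }

    public static void main(final String[] args) {
        // Same alphabet as the Base16 reproducer: '0' and '1' are swapped.
        final byte[] encodeTable = "1023456789ABCDEF".getBytes(StandardCharsets.US_ASCII);
        final byte[] decodeTable = toDecodeTable(encodeTable);
        // Decode the two hex digits "10" by combining the high and low nibbles.
        final int decoded = (decodeTable['1'] << 4) | decodeTable['0'];
        System.out.println(decoded); // prints 1, the byte the reproducer encoded
    }
}
```

With a table built this way, the "10" emitted by the Base16 reproducer maps back to the original byte 1 rather than 16. Whether the builder should derive the table like this or reject non-standard alphabets outright is the open design question raised in the ticket.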



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
