[ 
https://issues.apache.org/jira/browse/CODEC-342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary D. Gregory resolved CODEC-342.
-----------------------------------
    Fix Version/s: 1.22.1
       Resolution: Fixed

> Base32.Builder#setEncodeTable(...) can create an instance that cannot decode 
> its own output
> -------------------------------------------------------------------------------------------
>
>                 Key: CODEC-342
>                 URL: https://issues.apache.org/jira/browse/CODEC-342
>             Project: Commons Codec
>          Issue Type: Bug
>            Reporter: Ruiqi Dong
>            Priority: Major
>             Fix For: 1.22.1
>
>
> *Summary*
> `Base32.Builder` exposes `setEncodeTable(...)`, which suggests callers can 
> provide a custom Base32 alphabet. Encoding does honor the custom table, but 
> the builder only switches the decode table between the built-in standard and 
> hex variants. As a result, a `Base32` instance created with an arbitrary 
> custom alphabet can emit encoded data that the same instance decodes 
> incorrectly.
>  
> *Affected code*
> File: `src/main/java/org/apache/commons/codec/binary/Base32.java`
> {code:java}
> @Override
> public Builder setEncodeTable(final byte... encodeTable) {
>     super.setDecodeTableRaw(Arrays.equals(encodeTable, HEX_ENCODE_TABLE) ? HEX_DECODE_TABLE : DECODE_TABLE);
>     return super.setEncodeTable(encodeTable);
> } {code}
> So any table other than the exact built-in hex alphabet gets paired with the 
> default decode table. Encoding uses the configured `encodeTable`, but 
> decoding uses the mismatched `decodeTable`, so encode/decode no longer agree 
> on the alphabet.
>  
> *Reproducer*
> Add the following test to 
> `src/test/java/org/apache/commons/codec/binary/Base32Test.java`:
> {code:java}
> @Test
> void testBuilderCustomEncodeTableAffectsDecodeTable() {
>     final byte[] encodeTable = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".getBytes(StandardCharsets.US_ASCII);
>     final byte tmp = encodeTable[0];
>     encodeTable[0] = encodeTable[1];
>     encodeTable[1] = tmp;
>     final Base32 base32 = Base32.builder().setEncodeTable(encodeTable).setLineLength(0).get();
>     final byte[] encoded = base32.encode(new byte[] { 0 });
>     assertEquals("BB======", new String(encoded, StandardCharsets.US_ASCII),
>             "A custom Base32 alphabet should affect encoding");
>     assertArrayEquals(new byte[] { 0 }, base32.decode(encoded),
>             "A custom Base32 alphabet should decode its own encoded output");
> } {code}
> Run:
> {code:java}
> mvn -q -Dtest=org.apache.commons.codec.binary.Base32Test#testBuilderCustomEncodeTableAffectsDecodeTable test {code}
> *Observed behavior*
> The encoding assertion passes, showing that the custom alphabet is used. The 
> encoded output is:
> {code:java}
> BB====== {code}
> But the decode assertion fails because `"BB======"` is interpreted with the 
> default decode table:
> {code:java}
> array contents differ at index [0], expected: <0> but was: <8> {code}
> *Expected behavior*
> If `setEncodeTable(...)` is part of the public builder API, the resulting 
> `Base32` instance should use a matching decode table so that it can decode 
> its own output consistently. If arbitrary custom alphabets are not supported, 
> the builder should reject them instead of silently pairing them with an 
> incompatible decode table.
>  
> This is a configuration/state inconsistency in a public API. The builder 
> accepts a custom alphabet and encoding follows that configuration, but 
> decoding silently continues to interpret characters under a different 
> alphabet. That makes the configured `Base32` instance internally inconsistent.
>  
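
For illustration only, here is a minimal standalone sketch of the two options the report describes: derive the decode table from whatever alphabet the caller supplies, and reject alphabets that cannot be inverted. It does not use Commons Codec and is not the fix that was committed for CODEC-342; the class `CustomBase32Tables` and the helpers `buildDecodeTable` and `firstByte` are hypothetical names introduced here. It also reproduces the reported values: reading "BB======" with the standard table yields 8, while reading it with the swapped table yields 0.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of Commons Codec.
final class CustomBase32Tables {

    /**
     * Inverts a 32-entry alphabet into a lookup table: byte value -> 5-bit value,
     * or -1 when the byte is not in the alphabet. Rejects alphabets that cannot
     * be decoded unambiguously.
     */
    static byte[] buildDecodeTable(final byte[] encodeTable, final byte pad) {
        if (encodeTable == null || encodeTable.length != 32) {
            throw new IllegalArgumentException("A Base32 alphabet needs exactly 32 entries");
        }
        final byte[] decodeTable = new byte[256];
        Arrays.fill(decodeTable, (byte) -1);
        for (int i = 0; i < encodeTable.length; i++) {
            final int c = encodeTable[i] & 0xFF;
            if (encodeTable[i] == pad) {
                throw new IllegalArgumentException("Alphabet must not contain the padding byte");
            }
            if (decodeTable[c] != -1) {
                throw new IllegalArgumentException("Duplicate alphabet entry at index " + i);
            }
            decodeTable[c] = (byte) i;
        }
        return decodeTable;
    }

    /** Decodes only the first byte, which is carried by the first two symbols. */
    static int firstByte(final String encoded, final byte[] decodeTable) {
        final int hi = decodeTable[encoded.charAt(0) & 0xFF];
        final int lo = decodeTable[encoded.charAt(1) & 0xFF];
        if (hi < 0 || lo < 0) {
            throw new IllegalArgumentException("Symbol outside the alphabet");
        }
        // All 5 bits of the first symbol, then the top 3 bits of the second.
        return ((hi << 3) | (lo >> 2)) & 0xFF;
    }

    public static void main(final String[] args) {
        final byte[] standard = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567".getBytes(StandardCharsets.US_ASCII);
        final byte[] swapped = standard.clone();
        swapped[0] = standard[1]; // 'B' now encodes the value 0
        swapped[1] = standard[0]; // 'A' now encodes the value 1
        // Mismatched pairing (the reported bug): prints 8
        System.out.println(firstByte("BB======", buildDecodeTable(standard, (byte) '=')));
        // Decode table derived from the custom alphabet: prints 0
        System.out.println(firstByte("BB======", buildDecodeTable(swapped, (byte) '=')));
    }
}
```

The sketch fails fast on an invalid alphabet rather than falling back to a built-in table, so a bad configuration cannot silently produce an instance whose encoder and decoder disagree.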
