[
https://issues.apache.org/jira/browse/CODEC-343?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gary D. Gregory resolved CODEC-343.
-----------------------------------
Fix Version/s: 1.22.1
Resolution: Fixed
> Base32.Builder#setHexDecodeTable(boolean) sets the encode table to the decode
> table, corrupting encoding
> --------------------------------------------------------------------------------------------------------
>
> Key: CODEC-343
> URL: https://issues.apache.org/jira/browse/CODEC-343
> Project: Commons Codec
> Issue Type: Bug
> Reporter: Ruiqi Dong
> Priority: Major
> Fix For: 1.22.1
>
>
> *Summary*
> `Base32.Builder#setHexDecodeTable(boolean)` is implemented as
> `setEncodeTable(decodeTable(useHex))` — it passes the **decode** lookup table
> to `setEncodeTable(...)`. Used on its own, the resulting `Base32` therefore
> encodes with the decode lookup array (whose low entries are the `-1`
> sentinel) instead of the Base32-Hex alphabet, so it emits bytes outside the
> alphabet and cannot decode its own output.
>
> *Affected code*
> File: `src/main/java/org/apache/commons/codec/binary/Base32.java`
> {code:java}
> public Builder setHexDecodeTable(final boolean useHex) {
>     // Bug: passes the DECODE table to setEncodeTable
>     return setEncodeTable(decodeTable(useHex));
> }
> {code}
> `decodeTable(useHex)` returns `HEX_DECODE_TABLE` / `DECODE_TABLE` (the lookup
> arrays used for decoding). Passing one of those to `setEncodeTable(...)`
> makes it the encode table, so encoding reads `-1` sentinels and emits invalid
> bytes.
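The difference between the two kinds of table can be sketched without Commons Codec. The class `TableShapes` and its methods are hypothetical names used only for illustration, and the decode table is rebuilt here from the RFC 4648 Base32-Hex alphabet rather than copied from the library (the library's own table may differ in details, for example by also accepting lowercase input). An encode table is indexed by a 5-bit value, while a decode table is indexed by a character code, so in this sketch every slot an encoder would read (indices 0 to 31) holds the `-1` sentinel.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Commons Codec.
final class TableShapes {

    /** RFC 4648 Base32-Hex alphabet: index is a 5-bit value, entry is the output character. */
    static final byte[] HEX_ENCODE =
            "0123456789ABCDEFGHIJKLMNOPQRSTUV".getBytes(StandardCharsets.US_ASCII);

    /** Builds the inverse table: index is a character code, entry is the 5-bit value or -1. */
    static byte[] buildDecodeTable(final byte[] encodeTable) {
        final byte[] decode = new byte[128];
        Arrays.fill(decode, (byte) -1);
        for (int value = 0; value < encodeTable.length; value++) {
            decode[encodeTable[value]] = (byte) value;
        }
        return decode;
    }

    /** Reads the entry an encoder would emit for a 5-bit value from whichever table it was given. */
    static byte lookup(final byte[] table, final int fiveBitValue) {
        return table[fiveBitValue & 0x1F];
    }

    public static void main(final String[] args) {
        final byte[] decode = buildDecodeTable(HEX_ENCODE);
        System.out.println((char) lookup(HEX_ENCODE, 10)); // prints A
        System.out.println(lookup(decode, 10));            // prints -1
    }
}
```

Character code 10 is a line feed, which is not in the alphabet, so the decode table has nothing useful to offer an encoder at that index.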
>
> The only test that touches it chains another setter right after:
> {code:java}
> Base32.builder()
>     .setHexDecodeTable(false)
>     .setHexDecodeTable(true)
>     .setHexEncodeTable(false)
>     .setHexEncodeTable(true) // "last set wins" overwrites the broken encode table
>     ...
> {code}
> The trailing `setHexEncodeTable(true)` restores a correct encode table,
> masking the defect, so the bug never surfaces when `setHexDecodeTable` is
> used in isolation.
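The masking effect can be modelled with a toy builder. This is only a sketch: `ToyBuilder`, `LastSetWins`, and the string labels standing in for the real byte tables are hypothetical, and the buggy setter deliberately reproduces the behavior described in this report rather than the library's actual source.

```java
// Hypothetical toy model; string labels stand in for the real lookup arrays.
final class LastSetWins {

    static final String ENCODE = "encode-table";
    static final String DECODE = "decode-table";

    static final class ToyBuilder {
        private String encodeTable = ENCODE;

        ToyBuilder setHexEncodeTable(final boolean useHex) {
            encodeTable = ENCODE; // correct: stores an encode table
            return this;
        }

        ToyBuilder setHexDecodeTable(final boolean useHex) {
            encodeTable = DECODE; // models the reported bug: stores a decode table
            return this;
        }

        /** Returns whichever table the encoder would end up using. */
        String get() {
            return encodeTable;
        }
    }

    /** Mirrors the chained call order of the existing test. */
    static String chained() {
        return new ToyBuilder()
                .setHexDecodeTable(false)
                .setHexDecodeTable(true)
                .setHexEncodeTable(false)
                .setHexEncodeTable(true)
                .get();
    }

    /** Uses the buggy setter on its own. */
    static String isolated() {
        return new ToyBuilder().setHexDecodeTable(true).get();
    }

    public static void main(final String[] args) {
        System.out.println(chained());  // prints encode-table
        System.out.println(isolated()); // prints decode-table
    }
}
```

Because each setter overwrites the same field, only the last call matters, which is why a test that always ends with `setHexEncodeTable(true)` cannot detect the defect.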
>
> *Reproducer*
> Add the following test to
> `src/test/java/org/apache/commons/codec/binary/Base32Test.java`:
> {code:java}
> @Test
> void testBuilderSetHexDecodeTableEncodesWithHexAlphabet() {
>     final Base32 base32 = Base32.builder().setHexDecodeTable(true).setLineLength(0).get();
>     final byte[] data = { 0 };
>     final byte[] encoded = base32.encode(data);
>     assertEquals("00======", new String(encoded, StandardCharsets.US_ASCII),
>         "setHexDecodeTable(true) should encode with the Base32-Hex alphabet");
>     assertArrayEquals(data, base32.decode(encoded),
>         "the instance should decode its own output");
> }
> {code}
> Run:
> {code:bash}
> mvn -q -Dtest=org.apache.commons.codec.binary.Base32Test#testBuilderSetHexDecodeTableEncodesWithHexAlphabet test
> {code}
> *Observed behavior*
> Encoding `{ 0 }` does not produce the Base32-Hex form `"00======"`. The
> encoder emits the `-1` sentinel from the decode table as `0xFF`:
> {code:java}
> encode({0}) -> bytes [-1, -1, 61, 61, 61, 61, 61, 61] // 0xFF 0xFF ======
> decode(...) -> [] // round-trip lost
> {code}
> So the encoding assertion fails and the instance cannot decode its own output.
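The observed bytes can be reproduced by hand with a minimal Base32 encoder that takes its lookup table as a parameter. This is a sketch under stated assumptions: `CrossedTableDemo` and its methods are hypothetical, the encoder is a simplified RFC 4648 implementation without line breaks, and the decode table is derived from the Base32-Hex alphabet instead of being taken from Commons Codec. The single byte `0` splits into the 5-bit groups `00000` and `000` (zero-padded to `00000`), so both output symbols come from index 0 of whichever table is used.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not part of Commons Codec.
final class CrossedTableDemo {

    /** RFC 4648 Base32-Hex alphabet: 5-bit value to output character. */
    static final byte[] HEX_ENCODE =
            "0123456789ABCDEFGHIJKLMNOPQRSTUV".getBytes(StandardCharsets.US_ASCII);

    /** Character code to 5-bit value, with -1 for anything outside the alphabet. */
    static byte[] hexDecodeTable() {
        final byte[] decode = new byte[128];
        Arrays.fill(decode, (byte) -1);
        for (int value = 0; value < HEX_ENCODE.length; value++) {
            decode[HEX_ENCODE[value]] = (byte) value;
        }
        return decode;
    }

    /** Simplified Base32 encoder: emits table[v] for each 5-bit group v, then '=' padding. */
    static byte[] encode(final byte[] data, final byte[] table) {
        final ByteArrayOutputStream out = new ByteArrayOutputStream();
        long buffer = 0;
        int bits = 0;
        for (final byte b : data) {
            buffer = (buffer << 8) | (b & 0xFF);
            bits += 8;
            while (bits >= 5) {
                bits -= 5;
                out.write(table[(int) ((buffer >> bits) & 0x1F)]);
            }
        }
        if (bits > 0) {
            // Left-align the leftover bits in a final 5-bit group.
            out.write(table[(int) ((buffer << (5 - bits)) & 0x1F)]);
        }
        while (out.size() % 8 != 0) {
            out.write('=');
        }
        return out.toByteArray();
    }

    public static void main(final String[] args) {
        final byte[] data = { 0 };
        // Correct table: prints 00======
        System.out.println(new String(encode(data, HEX_ENCODE), StandardCharsets.US_ASCII));
        // Crossed table: prints [-1, -1, 61, 61, 61, 61, 61, 61]
        System.out.println(Arrays.toString(encode(data, hexDecodeTable())));
    }
}
```

With the crossed table, index 0 holds the `-1` sentinel, which is written as the byte `0xFF`. That byte is not part of the alphabet, which is consistent with the empty decode result reported above.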
>
> *Expected behavior*
> `setHexDecodeTable(true)` should configure a `Base32` that encodes with the
> Base32-Hex alphabet and decodes its own output. To do that, it must pass the
> encode table, not the decode table, to `setEncodeTable(...)`:
> {code:java}
> public Builder setHexDecodeTable(final boolean useHex) {
>     return setEncodeTable(encodeTable(useHex));
> }
> {code}
>
> *Impact*
> `setHexDecodeTable(...)` is a public builder API (`@since 1.18.0`). Used on
> its own, which is the natural way to select the Base32-Hex variant, it
> produces an instance that emits non-alphabet bytes and corrupts data, because
> the encode and decode tables are crossed.
>
>
> This bug is in the same family as the custom-alphabet decode mismatches in
> `Base16.Builder` (CODEC-341
> [https://issues.apache.org/jira/browse/CODEC-341]) and `Base32.Builder`
> (CODEC-342 [https://issues.apache.org/jira/browse/CODEC-342]).
>