The GitHub Actions job "Java CI" on commons-compress.git/fix/7z-size-validation 
has failed.
Run started by GitHub user ppkarwasz (triggered by ppkarwasz).

Head commit for run:
91e375d7ec238e7ed99bede569120413c7f523dc / Piotr P. Karwasz 
<[email protected]>
7z: unsigned number parsing and improved header validation

The 7z file format specification defines only **unsigned numbers** (`UINT64`, 
`REAL_UINT64`, `UINT32`). However, the current implementation allows parsing 
methods like `readUint64`, `getLong`, and `getInt` to return negative values 
and then handles those inconsistently in downstream logic.

This PR introduces a safer and more specification-compliant number parsing 
model.

### Key changes

* **Strict unsigned number parsing**

  * Parsing methods now *never* return negative numbers.
  * `readUint64`, `readUint64ToIntExact`, `readRealUint64`, and `readUint32` 
follow the terminology from `7zFormat.txt`.
  * Eliminates scattered negative-value checks that previously compensated for 
parsing issues.

* **Improved header integrity validation**

  * Before large allocations, the size is now validated against the **actual 
available data in the header** as well as the memory limit.
  * Prevents unnecessary or unsafe allocations when the archive is corrupted or 
truncated.

* **Correct numeric type usage**

  * Some fields represent 7z numbers as 64-bit values but are constrained 
internally to Java `int` limits.
  * These are now declared as `int` to signal real constraints in our 
implementation.

* **Consistent error handling**
  Parsing now fails in only three well-defined ways:

  | Condition                                                 | Exception                                    |
  | --------------------------------------------------------- | -------------------------------------------- |
  | Declared structure exceeds `maxMemoryLimitKiB`            | `MemoryLimitException`                       |
  | Missing data inside header (truncated or corrupt)         | `ArchiveException("Corrupted 7z archive")`   |
  | Unsupported numeric values (too large for implementation) | `ArchiveException("Unsupported 7z archive")` |

  Note: `EOFException` is no longer used. A header with missing fields has not 
reached “EOF”; it is **corrupted**.
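
To illustrate the strict unsigned parsing described above, here is a minimal sketch of decoding a variable-length 7z `UINT64` and narrowing it to `int`. It is not the code from this commit: the method signatures, the `SevenZNumbers` class, and the use of plain `IOException` in place of `ArchiveException` are assumptions made to keep the example self-contained. Only the method names and the two error messages are taken from the description.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper class; IOException stands in for ArchiveException.
final class SevenZNumbers {

    /** Reads one unsigned byte; a missing byte means the header is corrupted, not "EOF". */
    static int readUnsignedByte(ByteBuffer header) throws IOException {
        if (!header.hasRemaining()) {
            throw new IOException("Corrupted 7z archive");
        }
        return header.get() & 0xFF;
    }

    /** Reads a variable-length 7z UINT64 and never returns a negative number. */
    static long readUint64(ByteBuffer header) throws IOException {
        int firstByte = readUnsignedByte(header);
        int mask = 0x80;
        long value = 0;
        for (int i = 0; i < 8; i++) {
            if ((firstByte & mask) == 0) {
                // The remaining low bits of the first byte are the most significant bits.
                value |= (long) (firstByte & (mask - 1)) << (8 * i);
                break;
            }
            // Each leading 1 bit in the first byte announces one more little-endian byte.
            value |= (long) readUnsignedByte(header) << (8 * i);
            mask >>>= 1;
        }
        if (value < 0) {
            // The number needs all 64 bits, which a signed Java long cannot represent.
            throw new IOException("Unsupported 7z archive");
        }
        return value;
    }

    /** Reads a UINT64 that the implementation can only handle within int range. */
    static int readUint64ToIntExact(ByteBuffer header) throws IOException {
        long value = readUint64(header);
        if (value > Integer.MAX_VALUE) {
            throw new IOException("Unsupported 7z archive");
        }
        return (int) value;
    }
}
```

With this shape, callers never need their own negative-value checks: a result is either a valid non-negative number or one of the two failures.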

This PR lays groundwork for safer parsing and easier future maintenance by 
aligning number handling with the actual 7z specification and making header 
parsing behavior *predictable and robust*.
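
As a rough illustration of the header validation idea, the sketch below checks a declared size against both a memory limit and the bytes actually left in the header before allocating anything. The `HeaderAllocation` class, the `readBytes` method, the `MemoryLimitExceeded` stand-in, and the order of the checks are all assumptions made for the example; the real change may structure this differently.

```java
import java.io.IOException;
import java.nio.ByteBuffer;

// Hypothetical helper; not the actual commons-compress implementation.
final class HeaderAllocation {

    /** Stand-in for the library's MemoryLimitException. */
    static final class MemoryLimitExceeded extends IOException {
        MemoryLimitExceeded(long neededKiB, int limitKiB) {
            super(neededKiB + " KiB needed, limit is " + limitKiB + " KiB");
        }
    }

    /** Validates a declared size, then copies that many bytes out of the header. */
    static byte[] readBytes(ByteBuffer header, long declaredSize, int maxMemoryLimitKiB)
            throws IOException {
        if (declaredSize < 0) {
            // Assumed to be impossible once parsing only yields unsigned numbers.
            throw new IllegalArgumentException("declaredSize must not be negative");
        }
        if (declaredSize > Integer.MAX_VALUE) {
            // Java arrays cannot be this large.
            throw new IOException("Unsupported 7z archive");
        }
        long neededKiB = (declaredSize + 1023) / 1024;
        if (neededKiB > maxMemoryLimitKiB) {
            throw new MemoryLimitExceeded(neededKiB, maxMemoryLimitKiB);
        }
        if (declaredSize > header.remaining()) {
            // The header promises more data than it contains.
            throw new IOException("Corrupted 7z archive");
        }
        byte[] data = new byte[(int) declaredSize]; // allocate only after every check passed
        header.get(data);
        return data;
    }
}
```

Because the allocation happens last, a truncated or hostile header that declares a huge size is rejected without reserving memory for it.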

Report URL: https://github.com/apache/commons-compress/actions/runs/18606448904

With regards,
GitHub Actions via GitBox
