asf-tooling opened a new issue, #1020:
URL: https://github.com/apache/tooling-trusted-releases/issues/1020

   **ASVS Level(s):** [L1]
   
   **Description:**
   
   ### Summary
   The `compute_sha3_256()` function in `atr/hashes.py` accepts a bytes object
and hashes it entirely in memory. The five other hash functions in the module
read files in chunks of _HASH_CHUNK_SIZE (4MB); this one does not. If it is
called with user-uploaded file data (up to 512MB per MAX_CONTENT_LENGTH), it
could consume significant memory. Actual impact depends on the call sites,
which were not verified within the audit scope.
   
   ### Details
   Affected location: `atr/hashes.py` line 51
   
   The function signature is:
   ```python
   def compute_sha3_256(data: bytes) -> str:
       return hashlib.sha3_256(data).hexdigest()
   ```
   
   This loads the entire `data` bytes object into memory, unlike other hash 
functions that read files in chunks.
   
   ### Recommended Remediation
   Four-step remediation:
   
   **(1) Audit call sites** - Search codebase for all invocations of 
`compute_sha3_256()` to determine if it's called with user-uploaded data.
   
   **(2) Add size guard** - Implement MAX_IN_MEMORY_SIZE check (10MB) in the 
function:
   
   ```python
   MAX_IN_MEMORY_SIZE = 10 * 1024 * 1024  # 10MB
   
   def compute_sha3_256(data: bytes) -> str:
       if len(data) > MAX_IN_MEMORY_SIZE:
           raise ValueError(
               f"Data size {len(data)} exceeds MAX_IN_MEMORY_SIZE 
{MAX_IN_MEMORY_SIZE}. "
               "Use compute_sha3_256_file() for large data."
           )
       return hashlib.sha3_256(data).hexdigest()
   ```
   
   **(3) Provide streaming alternative** - Create `compute_sha3_256_file()` 
that uses chunked reads with _HASH_CHUNK_SIZE (4MB) for memory-safe processing 
of large files.
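   As a sketch only: the helper below shows one way the streaming variant could
look. The name `compute_sha3_256_file()` and the reuse of `_HASH_CHUNK_SIZE`
follow the conventions described above, but the existing helpers in
`atr/hashes.py` were not inspected, so treat this as illustrative rather than
the actual implementation:

   ```python
   import hashlib

   # Assumption: mirrors the 4MB chunk size the other hash helpers reportedly use.
   _HASH_CHUNK_SIZE = 4 * 1024 * 1024

   def compute_sha3_256_file(path: str) -> str:
       """Hash a file with SHA3-256 using bounded memory via chunked reads."""
       hasher = hashlib.sha3_256()
       with open(path, "rb") as f:
           # Read and fold in one chunk at a time; memory stays O(chunk size).
           while chunk := f.read(_HASH_CHUNK_SIZE):
               hasher.update(chunk)
       return hasher.hexdigest()
   ```

   The result is identical to `hashlib.sha3_256(data).hexdigest()` over the
file's full contents, since SHA3-256 is computed incrementally via `update()`.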
   
   **(4) Update call sites** - If any call sites use user-uploaded data, 
migrate to the streaming version.
   
   ### Acceptance Criteria
   - [ ] Call sites are audited for user-uploaded data usage
   - [ ] A size guard prevents unbounded memory consumption
   - [ ] A streaming alternative exists for large files
   - [ ] Call sites handling large data are migrated to it
   - [ ] Unit tests verify the size guard and streaming behavior
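   
   A minimal test sketch for the size guard, self-contained here by restating
the guarded function from the remediation above (the in-repo version would
import it from `atr/hashes.py` instead):

   ```python
   import hashlib

   MAX_IN_MEMORY_SIZE = 10 * 1024 * 1024  # 10MB, per the proposed guard

   def compute_sha3_256(data: bytes) -> str:
       # Guarded variant proposed in step (2) above.
       if len(data) > MAX_IN_MEMORY_SIZE:
           raise ValueError("data exceeds MAX_IN_MEMORY_SIZE; use the streaming variant")
       return hashlib.sha3_256(data).hexdigest()

   def test_small_input_hashes() -> None:
       # In-memory path still works for small inputs.
       assert compute_sha3_256(b"abc") == hashlib.sha3_256(b"abc").hexdigest()

   def test_oversized_input_rejected() -> None:
       # One byte over the limit must be rejected.
       try:
           compute_sha3_256(b"\x00" * (MAX_IN_MEMORY_SIZE + 1))
       except ValueError:
           pass
       else:
           raise AssertionError("expected ValueError for oversized input")
   ```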
   
   ### References
   - Source reports: L1:5.2.1.md
   - Related findings: None
   - ASVS sections: 5.2.1
   
   ### Priority
   Medium
   
   ---


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to