asf-tooling opened a new issue, #1063:
URL: https://github.com/apache/tooling-trusted-releases/issues/1063

   **ASVS Level(s):** L2-only
   
   **Description:**
   
   ### Summary
   The thread message fetching functionality retrieves email messages from 
Apache mailing list archives without applying HTTP timeouts or limiting 
concurrent requests. For a thread with hundreds of messages, it issues 
hundreds of simultaneous HTTP requests with no semaphore control. Without a 
timeout, any of these requests can hang indefinitely, which can exhaust 
connections and trigger rate limiting by the remote server.
   
   ### Details
   The issue exists in `atr/util.py`, in the `thread_messages()` and 
`get_urls_as_completed()` functions. Message fetching has no concurrency 
limit and no timeout.
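
The effect of the missing concurrency limit can be illustrated with a minimal, self-contained sketch. It does not use the real `atr/util.py` code: `peak_in_flight` and `fake_fetch` are hypothetical names, and `asyncio.sleep` stands in for network latency. The sketch only counts how many simulated requests are in flight at once, with and without a semaphore.

```python
import asyncio
from typing import Optional


async def peak_in_flight(n_requests: int, limit: Optional[int] = None) -> int:
    """Return the highest number of simulated requests in flight at once."""
    in_flight = 0
    peak = 0
    # No semaphore at all when limit is None, mirroring the unbounded case.
    semaphore = asyncio.Semaphore(limit) if limit is not None else None

    async def fake_fetch() -> None:
        nonlocal in_flight, peak
        in_flight += 1
        peak = max(peak, in_flight)
        await asyncio.sleep(0.01)  # stands in for network latency
        in_flight -= 1

    async def guarded_fetch() -> None:
        if semaphore is None:
            await fake_fetch()
        else:
            async with semaphore:
                await fake_fetch()

    await asyncio.gather(*(guarded_fetch() for _ in range(n_requests)))
    return peak


if __name__ == "__main__":
    print("unbounded:", asyncio.run(peak_in_flight(300)))
    print("limit=20: ", asyncio.run(peak_in_flight(300, limit=20)))
```

Without a limit, every simulated request starts before any of them finishes, so the peak equals the thread size. With a semaphore, the peak cannot exceed the configured limit regardless of how many messages the thread contains.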
   
   ### Recommended Remediation
   Apply a timeout, a message count limit, and concurrency control:
   
   ```python
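   import asyncio
   import logging

   import aiohttp

   # Assumption: a standard-library logger; substitute the project's own logger.
   log = logging.getLogger(__name__)

   _THREAD_TIMEOUT = aiohttp.ClientTimeout(total=30, connect=10)
   _MAX_THREAD_MESSAGES = 200
   _FETCH_CONCURRENCY = 20

   async def thread_messages(thread_url: str) -> list[dict]:
       """Fetch thread messages with limits."""
       # Placeholder: extraction of message URLs from thread_url is elided here.
       message_urls: list[str] = []  # ... extract message URLs

       # Limit message count
       if len(message_urls) > _MAX_THREAD_MESSAGES:
           log.warning(
               "Thread has %d messages, limiting to %d",
               len(message_urls),
               _MAX_THREAD_MESSAGES,
           )
           message_urls = message_urls[:_MAX_THREAD_MESSAGES]

       # Fetch with concurrency limit and timeout
       return await get_urls_as_completed(
           message_urls,
           timeout=_THREAD_TIMEOUT,
           max_concurrent=_FETCH_CONCURRENCY,
       )

   async def get_urls_as_completed(
       urls: list[str],
       timeout: aiohttp.ClientTimeout,
       max_concurrent: int,
   ) -> list[dict]:
       """Fetch URLs with concurrency control, skipping failed requests."""
       semaphore = asyncio.Semaphore(max_concurrent)

       # Share one session across all requests instead of opening one per URL.
       # create_secure_session is the existing helper in atr/util.py, assumed
       # here to accept a timeout argument.
       async with create_secure_session(timeout=timeout) as session:

           async def fetch_with_semaphore(url: str) -> dict:
               async with semaphore:
                   async with session.get(url) as response:
                       response.raise_for_status()
                       # Read the body before the connection is released.
                       # Assumes the archive returns JSON, matching list[dict].
                       return await response.json()

           results = await asyncio.gather(
               *(fetch_with_semaphore(url) for url in urls),
               return_exceptions=True,
           )

       # Drop failures (including timeouts) so the result is really list[dict].
       messages = [r for r in results if not isinstance(r, BaseException)]
       if len(messages) < len(results):
           log.warning("Failed to fetch %d of %d messages",
                       len(results) - len(messages), len(results))
       return messages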
   ```
   
   ### Acceptance Criteria
   - [ ] HTTP timeout added to thread message fetching
   - [ ] Message count limit enforced (_MAX_THREAD_MESSAGES)
   - [ ] Concurrency control added with semaphore
   - [ ] Unit tests verify timeout enforcement
   - [ ] Unit tests verify message count limit
   - [ ] Unit tests verify concurrency limit
   - [ ] Integration tests verify thread fetching with limits
   - [ ] Performance tests verify acceptable behavior with large threads
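
As a rough starting point for the timeout and message count criteria, the unit tests could look something like the sketch below. It is self-contained and does not touch the real `thread_messages()` implementation: `fetch_or_none`, `cap_urls`, and `hanging_fetch` are hypothetical stand-ins, and `asyncio.wait_for` is used in place of `aiohttp.ClientTimeout` so that no network or third-party dependency is needed. Real tests would more likely mock the session used by `get_urls_as_completed()`.

```python
import asyncio
from typing import Awaitable, Callable, Optional


async def fetch_or_none(
    fetch: Callable[[str], Awaitable[dict]],
    url: str,
    timeout_s: float,
) -> Optional[dict]:
    """Await fetch(url), returning None if it takes longer than timeout_s."""
    try:
        return await asyncio.wait_for(fetch(url), timeout=timeout_s)
    except asyncio.TimeoutError:
        return None


def cap_urls(urls: list[str], max_messages: int) -> list[str]:
    """Keep at most max_messages URLs, preserving their order."""
    return urls[:max_messages]


async def hanging_fetch(url: str) -> dict:
    """Simulate a server that accepts the request but never responds."""
    await asyncio.Event().wait()  # never set, so this waits forever
    return {}


def test_timeout_is_enforced() -> None:
    url = "https://example.invalid/message/1"
    result = asyncio.run(fetch_or_none(hanging_fetch, url, timeout_s=0.05))
    assert result is None


def test_message_count_is_capped() -> None:
    urls = [f"https://example.invalid/message/{i}" for i in range(500)]
    capped = cap_urls(urls, max_messages=200)
    assert len(capped) == 200
    assert capped[0] == urls[0]
```

Both tests are plain functions, so they can be collected by pytest or called directly.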
   
   ### References
   - Source reports: L2:15.1.3.md
   - Related findings: FINDING-193, FINDING-052
   - ASVS sections: 15.1.3
   - CWE: CWE-400
   
   ### Priority
   Medium
   
   ---


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

