suvodeep-pyne opened a new pull request, #17239:
URL: https://github.com/apache/pinot/pull/17239

   ## Summary
   
   Introduces the `PinotThrottledLogger` utility, which provides rate-limited 
exception logging for record transformers. It addresses the "Noisy Neighbor" 
problem, where high-frequency errors consume the entire log quota and starve 
low-frequency critical errors.
   
   ### Key Changes
   
   - **New `PinotThrottledLogger` utility** 
(`pinot-common/src/main/java/org/apache/pinot/common/utils/PinotThrottledLogger.java`):
     - Implements per-exception-class rate limiting using Guava's `RateLimiter`
     - Each exception type gets an independent rate limiter (prevents the noisy 
neighbor problem)
     - Tracks suppressed occurrences and reports the count once the rate limit 
lifts
     - Thread-safe implementation using `ConcurrentHashMap` and `AtomicLong`
     - Falls back to DEBUG logging when the rate is 0 (backward-compatible 
behavior)
   
   - **New metric**: `ServerMeter.LOGS_DROPPED_BY_THROTTLED_LOGGER` for 
observability
   
   - **Updated 6 transformers** to use the throttled logger:
     - `FilterTransformer`
     - `DataTypeTransformer`
     - `ExpressionTransformer`
     - `ComplexTypeTransformer`
     - `SchemaConformingTransformer`
     - `TimeValidationTransformer`
   
   - **New configuration**: 
`IngestionConfig.ingestionExceptionLogRateLimitPerMin` (default: 0)
   
   ### Backward Compatibility
   
   - Default rate limit is 0, which disables throttling and falls back to DEBUG 
logging (original behavior)
   - No behavior change for existing deployments unless explicitly configured
   
   ### Benefits
   
   1. **Solves Noisy Neighbor Problem**: High-frequency `NumberFormatException` 
won't starve low-frequency `ConnectException`
   2. **Production Visibility**: When rate > 0, exceptions log at WARN/ERROR 
level instead of DEBUG
   3. **Controlled Log Volume**: Prevents log flooding during data quality 
issues
   4. **Observability**: Per-table metric tracks dropped logs
   
   ### Example Output
   
   ```
   WARN  [FilterTransformer] Caught exception while executing filter function...
   java.lang.NumberFormatException: For input string: "abc"
   
   [... 4 more similar logs within 1 minute ...]
   
   [After rate limit window passes]
   WARN  [FilterTransformer] ... Suppressed 9995 occurrences of 
NumberFormatException ...
   WARN  [FilterTransformer] Caught exception while executing filter function...
   ```
   
   Meanwhile, different exception types (e.g., `ConnectException`) log 
immediately using independent rate limiters.
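
A minimal sketch of the behavior shown above, assuming a fixed one-minute window per exception class. It is not the PR's implementation: the PR uses Guava's `RateLimiter`, and a plain counter stands in here so the sketch has no dependencies. `ThrottledLoggerSketch`, `warn`, and the returned strings are hypothetical names chosen for illustration, and the clock is passed in so the window can be exercised deterministically.

```java
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of per-exception-class throttling; not the PR's PinotThrottledLogger. */
final class ThrottledLoggerSketch {
  /** State for one exception class: a fixed one-minute window and a suppression count. */
  private static final class Bucket {
    long windowStartMs;   // start of the current window (0 until first use)
    int loggedInWindow;   // lines emitted in the current window
    long suppressed;      // calls dropped since the last emitted line
  }

  private final int permitsPerMin;
  private final ConcurrentHashMap<Class<?>, Bucket> buckets = new ConcurrentHashMap<>();

  ThrottledLoggerSketch(int permitsPerMin) {
    this.permitsPerMin = permitsPerMin;
  }

  /** Returns the line that would be logged, or null when the call is suppressed. */
  String warn(String msg, Throwable t, long nowMs) {
    if (permitsPerMin <= 0) {
      return "DEBUG " + msg; // rate 0: throttling disabled, fall back to DEBUG
    }
    // One bucket per exception class, so a noisy class cannot use another's quota.
    Bucket b = buckets.computeIfAbsent(t.getClass(), k -> new Bucket());
    synchronized (b) {
      if (nowMs - b.windowStartMs >= 60_000L) {
        b.windowStartMs = nowMs; // start a new window
        b.loggedInWindow = 0;
      }
      if (b.loggedInWindow >= permitsPerMin) {
        b.suppressed++;
        return null;
      }
      b.loggedInWindow++;
      long dropped = b.suppressed;
      b.suppressed = 0;
      String prefix = dropped > 0
          ? "Suppressed " + dropped + " occurrences of " + t.getClass().getSimpleName() + "; "
          : "";
      return "WARN " + prefix + msg;
    }
  }
}
```

With a limit of 5 per minute, 10,000 `NumberFormatException` calls inside one window yield 5 lines and 9,995 suppressed calls, which the first line of the next window reports. A `ConnectException` in the same window still logs at once because its bucket is separate.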
   
   ### Testing
   
   - Added comprehensive unit tests covering:
     - Independent rate limiting per exception class
     - Backward compatibility (null config → DEBUG)
     - Explicit zero rate → DEBUG fallback
     - Thread safety under concurrent load
   
   All tests pass: 4/4 in `PinotThrottledLoggerTest`
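
For illustration only, the thread-safety case could be checked along these lines. This is a sketch, not the PR's actual test: `SuppressionCounterSketch` and its methods are hypothetical, and it exercises only a per-class drop counter built from `ConcurrentHashMap` and `AtomicLong`, the two primitives named in the summary.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical per-exception-class drop counter; not code from the PR. */
final class SuppressionCounterSketch {
  private final ConcurrentHashMap<Class<?>, AtomicLong> counts = new ConcurrentHashMap<>();

  /** Records one dropped log line for the exception's class. */
  void recordDropped(Throwable t) {
    counts.computeIfAbsent(t.getClass(), k -> new AtomicLong()).incrementAndGet();
  }

  /** Returns the number of drops recorded for the class and resets it to zero. */
  long drainDropped(Class<?> type) {
    AtomicLong c = counts.get(type);
    return c == null ? 0L : c.getAndSet(0L);
  }

  /** Hits one counter from several threads at once and returns the total it recorded. */
  static long runConcurrently(int threads, int perThread) {
    SuppressionCounterSketch counter = new SuppressionCounterSketch();
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    CountDownLatch start = new CountDownLatch(1);
    for (int i = 0; i < threads; i++) {
      pool.execute(() -> {
        try {
          start.await(); // release all workers together to maximize contention
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        Throwable sample = new NumberFormatException("abc");
        for (int j = 0; j < perThread; j++) {
          counter.recordDropped(sample);
        }
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("workers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for workers", e);
    }
    return counter.drainDropped(NumberFormatException.class);
  }
}
```

If no increment is lost, `runConcurrently(8, 10_000)` returns exactly 80,000, and a second drain of the same class would return 0.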
   
   ### Implementation Details
   
   - API simplified: Transformers pass `IngestionConfig` directly to logger
   - Rate conversion logic (per-min → per-sec) encapsulated in logger
   - Metric emission only when table name is provided
   - Memory-bounded: one rate limiter per exception class, and the number of 
distinct exception classes is typically 10-50
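
Guava's `RateLimiter.create` takes permits per second and rejects non-positive rates, which is why a per-minute setting needs both a conversion and a separate disabled path. A hedged sketch of that logic follows; `RateConfigSketch` and its methods are hypothetical and not taken from the PR.

```java
/** Hypothetical helper for the per-minute to per-second conversion; not code from the PR. */
final class RateConfigSketch {
  private RateConfigSketch() {
  }

  /** A missing (null) or non-positive setting means throttling is off and DEBUG logging is used. */
  static boolean isThrottlingEnabled(Integer perMin) {
    return perMin != null && perMin > 0;
  }

  /** Converts a positive per-minute limit into the permits-per-second value a limiter expects. */
  static double toPermitsPerSecond(int perMin) {
    if (perMin <= 0) {
      throw new IllegalArgumentException("rate must be positive, got " + perMin);
    }
    return perMin / 60.0;
  }
}
```

Under these assumptions a setting of 300 per minute becomes 5.0 permits per second, while the default of 0 never reaches the conversion at all.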
   
   ## Test plan
   
   - [x] Unit tests pass
   - [x] Compilation successful across all modules
   - [ ] Manual testing with real ingestion workload
   - [ ] Verify metric emission in production-like environment


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

