xudong963 opened a new issue, #19846:
URL: https://github.com/apache/datafusion/issues/19846
The warning message in `datafusion/physical-plan/src/spill/mod.rs` (lines
157-162) is being triggered very aggressively in production environments,
causing log noise.
```rust
warn!(
    "Record batch memory usage ({actual_size} bytes) exceeds the expected limit ({max_record_batch_memory} bytes) \n\
     by more than the allowed tolerance ({SPILL_BATCH_MEMORY_MARGIN} bytes).\n\
     This likely indicates a bug in memory accounting during spilling.\n\
     Please report this issue in https://github.com/apache/datafusion/issues/17340."
);
```
Since this is a known issue (tracked in #17340) and doesn't affect
functional correctness, we should consider one of the following approaches:
### Option 1: Downgrade to debug level
Change `warn!` to `debug!` to reduce production noise while keeping the
diagnostic information available for development.
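To illustrate why the downgrade removes the noise, here is a minimal, std-only sketch of severity filtering. The `Level` enum and `is_emitted` function are hypothetical stand-ins for a logging facade, not DataFusion code. The sketch assumes production runs with verbosity capped at `Info` or stricter.

```rust
/// Hypothetical severity levels, declared from most to least severe so that
/// the derived ordering gives Error < Warn < Info < Debug.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Level {
    Error,
    Warn,
    Info,
    Debug,
}

/// A record is emitted only when it is at least as severe as the configured
/// maximum verbosity (assumption: this mirrors how typical loggers filter).
fn is_emitted(record: Level, max_verbosity: Level) -> bool {
    record <= max_verbosity
}
```

With `max_verbosity` set to `Level::Info`, a `Warn` record passes the filter and a `Debug` record does not. The message would therefore disappear from such deployments while remaining available to anyone who enables debug logging.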
### Option 2: Increase tolerance margin
Adjust `SPILL_BATCH_MEMORY_MARGIN` from 4096 bytes to a more realistic value
that accounts for expected Arrow IPC overhead. However, a realistic value may
vary case by case and would take effort to determine.
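As a rough sketch of how the margin controls when the warning fires, the check can be modelled as a comparison against the limit plus the tolerance. The function `exceeds_tolerance` and the exact form of the comparison are assumptions based on the wording of the message, not a copy of the code in `spill/mod.rs`. Only the 4096-byte value comes from this issue.

```rust
/// Current tolerance as stated in this issue.
const SPILL_BATCH_MEMORY_MARGIN: usize = 4096;

/// Hypothetical model of the check: true when the actual batch size exceeds
/// the expected limit by more than `margin` bytes.
fn exceeds_tolerance(
    actual_size: usize,
    max_record_batch_memory: usize,
    margin: usize,
) -> bool {
    // saturating_add avoids overflow if the limit is close to usize::MAX
    actual_size > max_record_batch_memory.saturating_add(margin)
}
```

For example, a batch that is 8192 bytes over a 1,000,000-byte limit (an invented overhead figure) trips the check with the current margin but not with a 16 KiB margin. This is why the right value depends on how much overhead real workloads actually show.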
I lean towards **Option 1**: it is the easiest way to stop the log noise.