avantgardnerio commented on issue #22723: URL: https://github.com/apache/datafusion/issues/22723#issuecomment-4615716656
@2010YOUY01 , re: issue 1 - PTAL at the actual [issue for the DataFusion 51 regression](https://github.com/apache/datafusion/issues/22739) that motivated this whole path. I should have lead with that, and I think it answers a few of your questions. Instead of doing that, I just cranked `HEADROOM` down to `5` and filed a bug for the first thing it hit (NLJ). Addressing your points (as I perceive them) not addressed by the above: 1. This conflates integration tests with memory accounting tests. i.e. the operator is at fault, not a random SLT. I would argue that most of our coverage is through SLTs so adding it here buys regression testing "for free". Normally that might be bad, because when an integration test breaks it's hard to debug, but this tracker usually points to the exact call stack of the bug. I think accounting regressions are real regressions and thus belong in the regression suite. 2. "small memory-limited queries" - I think the question is twofold: 1. what is "small"? and 2. how does it scale? 700K in a test might be "small", but if that multiplies out in a prod environment to be 7GB, then it's big. A possible solution: we can define both the existing relative `HEADROOM` and an `ABS_HEADROOM` and it doesn't fire unless it crosses both. I would lightly push back though because that means we specifically need to run all these with large amounts of data, which means we don't get the benefit of the existing suite. I think your other concerns (difficulty of tracking, size required to track, etc) are all addressed in the other issue. I hope I was able to give a proper response after I put in proper effort. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
