adriangb commented on PR #19390: URL: https://github.com/apache/datafusion/pull/19390#issuecomment-4555393957
> @adriangb Sorry for the late reply, I was off and somehow missed this notification. I agree that this change is pretty well tested and benchmarked now, so I'd be pretty surprised if it introduces major issues (we might still have some weird edge case of course, as usual.) What I'm a bit more concerned about is maintainability as it does introduce quite a bit of complex features. In particular, "Commit 7: Implement Direct Probe", while showing significant gains, is pretty tricky to understand well. I would never have come up with something like this without AI "helping" out. WDYT? Are the performance gains worth the additional future headaches when new work in this area is needed? That is my concern as well :) It's hard for me to judge what is complex because I've never seen it before / don't have a CS degree vs. is complex for anyone. I'm prepared to basically trust your judgment on it. How about this test for each commit / optimization: if your team came and said "hey we are hitting an `InList` bug in prod, please go fix it" and you had never seen the code before, would you feel "this is complicated but the X% perf gain makes it worth it" or "my first instinct is to rip this out it's not worth it". Maybe we pick 1-2 lowest "bang for the buck" optimizations in here and stash those for later? One thing we could do is merge as stacked PRs so that commits don't get squashed and it's easier to revert individual pieces. Ultimately if there were issues because this is self contained I think the initial answer would be to revert the problematic optimization. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
