pepijnve commented on PR #16196: URL: https://github.com/apache/datafusion/pull/16196#issuecomment-2944505070
> I feel like we are getting close to a point where we start having not-so-fruitful discussions. I think I have made a good effort to make my arguments and reasoning clear.

@ozankabak My apologies. I didn't mean to derail your efforts here, and I'll refrain from adding any more noise to the thread (beyond this, sorry). I appreciate that you guys have much more experience working in this codebase. I'm really trying to make a good-faith contribution here, where we compare the pros and cons of both approaches via measurements (API impact, performance impact, etc.), but I'll back off.

FWIW, I've added some more test cases in the meantime that you guys can use or ignore however you see fit.

I also have some benchmark results from a first run at https://gist.github.com/pepijnve/21fbd480ae3e60f780446ace974d3ef5. It's a very mixed bag. I'm going to run the suite a couple more times to see whether this is consistent before I dig deeper. During the runs I'm seeing so much variability in runtime on both branches that I have my doubts about how meaningful these results are. Would it be useful to let the benchmark perform more runs and adapt the tool a bit to report the mean and standard deviation rather than just the average?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at: us...@infra.apache.org