2010YOUY01 commented on PR #18488: URL: https://github.com/apache/datafusion/pull/18488#issuecomment-3500400479
> > It's running in a noisy cloud environment, and tpch_mem takes quite short time, so it might not be accurate.
>
> Very interesting
>
> > I’ve verified this with tpch_mem10 locally, and it actually slows down several queries.
>
> Thanks for confirming!
>
> > I tried to make this count distinct indices faster (and sent a PR [feniljain#2](https://github.com/feniljain/datafusion/pull/2)),
>
> Very curious about this PR, cause it seems you have written the same logic as mine, but using loops instead, am I missing some detail?
>
> Is it that the `None` check, which is causing all this overhead?

If the loop body is kept really simple, the compiler has a better chance of generating efficient machine code (for example, SIMD instructions), and the hardware can execute it faster through several mechanisms (for example, better memory prefetching). Together these can make one of two logically equivalent implementations several times faster than the other.

I'm not entirely sure under which circumstances the compiler fails to optimize, so I try to keep the loop body as simple as possible, and that usually works well.
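To make the point about simple loop bodies concrete, here is a minimal sketch in Rust. It is not the code from this PR or from feniljain#2: the function names, the `upper` bound, and the split of the data into `values` and `valid` slices are all assumptions made for illustration. The first function checks for `None` on every iteration. The second keeps the body to a single branch-free expression, which gives the compiler a better chance to vectorize, though whether it actually does depends on the target and compiler version and should be confirmed with a benchmark.

```rust
/// Branchy version (hypothetical): each element is an `Option`, so the loop
/// body has to test for `None` before it can look at the value.
fn count_in_range_branchy(indices: &[Option<u32>], upper: u32) -> usize {
    let mut count = 0;
    for idx in indices {
        if let Some(i) = idx {
            if *i < upper {
                count += 1;
            }
        }
    }
    count
}

/// Simple-body version (hypothetical): values and validity live in separate
/// slices, and the loop body is one branch-free expression. A null slot may
/// hold any value; it is masked out by `ok` being false.
fn count_in_range_simple(values: &[u32], valid: &[bool], upper: u32) -> usize {
    values
        .iter()
        .zip(valid.iter())
        // `&` on bools does not short-circuit, so no branch is needed here.
        .map(|(&v, &ok)| ((v < upper) & ok) as usize)
        .sum()
}
```

Both functions return the same count for equivalent inputs, so they can be swapped freely while benchmarking, for example with tpch_mem10 locally rather than in the noisy cloud environment.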
