Rachelint commented on PR #11802: URL: https://github.com/apache/datafusion/pull/11802#issuecomment-2271321844
This is really interesting. I profiled the two branches with the following command:

```
sudo perf stat -e cycles,instructions,cache-references,cache-misses,bus-cycles ./target/release/dfbench-main-d clickbench --iterations 2 --path "./benchmarks/data/hits_partitioned/" --queries-path "./benchmarks/queries/clickbench/queries.sql" -o "./result"
```

I found that this branch's `bus-cycles` count is higher than `main`'s, although its total `instructions` count is lower. This suggests that the CPU performs more memory accesses on this branch.

I then reverted only commit `b7262c2d56c6254dcb07a227ac89f9181c4cf570`, which introduces `Arc<Statistic>`, and kept the other commits in this PR. With that change, q22 becomes as fast as on `main`!

It seems the `Arc` here leads to more memory accesses?
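To illustrate why an `Arc` could plausibly add memory accesses, here is a minimal sketch. The `Stats`, `InlineNode`, and `SharedNode` types and the helper functions are hypothetical stand-ins, not DataFusion's actual types. The sketch only shows the mechanism (an extra pointer hop per read, plus an atomic reference-count update per clone); it does not measure anything and does not confirm that this is the cause of the q22 difference.

```rust
use std::sync::Arc;

// Hypothetical stand-in for a statistics struct; not DataFusion's real type.
#[derive(Clone)]
struct Stats {
    num_rows: usize,
    total_byte_size: usize,
}

// Inline storage: the fields live directly inside the owning node.
struct InlineNode {
    stats: Stats,
}

// Shared storage: the node holds only a pointer to a heap allocation
// that also carries the atomic strong/weak reference counts.
struct SharedNode {
    stats: Arc<Stats>,
}

// Reads the fields in place, with no pointer to follow.
fn sum_inline(nodes: &[InlineNode]) -> (usize, usize) {
    nodes.iter().fold((0, 0), |(rows, bytes), n| {
        (rows + n.stats.num_rows, bytes + n.stats.total_byte_size)
    })
}

// Each read first loads the Arc pointer, then follows it to the heap.
fn sum_shared(nodes: &[SharedNode]) -> (usize, usize) {
    nodes.iter().fold((0, 0), |(rows, bytes), n| {
        (rows + n.stats.num_rows, bytes + n.stats.total_byte_size)
    })
}

// Builds `n` nodes of each kind from the same (made-up) statistics values.
fn build_nodes(n: usize) -> (Vec<InlineNode>, Vec<SharedNode>, Arc<Stats>) {
    let stats = Stats { num_rows: 10, total_byte_size: 80 };
    let inline = (0..n).map(|_| InlineNode { stats: stats.clone() }).collect();
    let shared_stats = Arc::new(stats);
    // Arc::clone copies only the pointer, but atomically bumps the shared counter.
    let shared = (0..n)
        .map(|_| SharedNode { stats: Arc::clone(&shared_stats) })
        .collect();
    (inline, shared, shared_stats)
}

fn main() {
    let (inline, shared, shared_stats) = build_nodes(4);
    // Both layouts yield the same values; only the access path differs.
    assert_eq!(sum_inline(&inline), sum_shared(&shared));
    println!("strong count: {}", Arc::strong_count(&shared_stats));
    println!(
        "InlineNode: {} bytes, SharedNode: {} bytes",
        std::mem::size_of::<InlineNode>(),
        std::mem::size_of::<SharedNode>()
    );
}
```

Whether this indirection actually costs more than copying the struct depends on how often the statistics are cloned versus read, so a comparison like the `perf stat` run above is still needed to decide.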
