c21 commented on pull request #34702: URL: https://github.com/apache/spark/pull/34702#issuecomment-985181892
> Since there are quite some TPCDS queries that get plan changes, can we run a TPCDS benchmark to verify performance improvement?

@cloud-fan - sure. Today I ran the TPCDS benchmark (sf=1) on one AWS `r3.xlarge` (same as https://github.com/apache/spark/pull/26049). I don't see much performance difference between enabling and disabling this rule:

* [Benchmark result with enabling this rule](https://gist.github.com/c21/b253d2d0ab8091b2953a633152927204)
* [Benchmark result with disabling this rule](https://gist.github.com/c21/d16583e5acb35b402af25c47da45248a)

Then I tried sf=5, but the benchmark hit task failures with "no space left on device", so it cannot be run on a single machine.

Do you recommend disabling this rule by default? After adding sort aggregate code-gen, we can do more large-scale testing and then enable it. WDYT?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
