cxzl25 commented on pull request #32583: URL: https://github.com/apache/spark/pull/32583#issuecomment-894783546
> > Hive: Time taken: 2.816 seconds, Fetched: 10 row(s)
> > Spark: Time taken: 248 seconds, Fetched: 10 row(s)
> > Patch and verify:
> > Time taken: 19.241 seconds, Fetched 10 row(s)
>
> @cxzl25 I'm a little surprised that Hive is still so much faster than Spark even with the patch. Curious if you have any insight on this.

With this patch, Spark's `listPartitionsByFilter` and Hive's `doEvalClientSide` method run at about the same speed. However, Spark calls `listPartitionsByFilter` in several places, so the query as a whole is still slower than in Hive:

`PruneHiveTablePartitions`
`OptimizeMetadataOnlyQuery`
`HiveTableScanExec`
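To illustrate why several call sites add up, here is a minimal toy sketch in Python. It is not Spark or Hive code: `FakeMetastoreClient`, `list_partitions_by_filter`, and `plan_query` are hypothetical names, and the sketch assumes, purely for illustration, that each of the three places named above issues one partition listing for the same query.

```python
from collections import Counter


class FakeMetastoreClient:
    """Hypothetical stand-in for a metastore client (not a real Spark/Hive API)."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.calls = Counter()  # number of listings issued per call site

    def list_partitions_by_filter(self, predicate, caller):
        # Every call scans all partitions again, so the cost is paid per call.
        self.calls[caller] += 1
        return [p for p in self.partitions if predicate(p)]


# The three places named in the comment; assumed here to call once each.
CALL_SITES = (
    "PruneHiveTablePartitions",
    "OptimizeMetadataOnlyQuery",
    "HiveTableScanExec",
)


def plan_query(client, predicate):
    """Simulate one query in which every call site lists partitions itself."""
    result = []
    for site in CALL_SITES:
        result = client.list_partitions_by_filter(predicate, caller=site)
    return result


client = FakeMetastoreClient(["dt=2021-07-31", "dt=2021-08-01", "dt=2021-08-02"])
matched = plan_query(client, lambda p: p.startswith("dt=2021-08"))
print(sum(client.calls.values()))  # prints 3: one listing per call site
```

Under that assumption, a single listing that costs about the same as Hive's client-side evaluation is paid once per call site, which is consistent with the query as a whole remaining slower.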
