[GitHub] [spark] sunchao commented on pull request #33382: [SPARK-36137][SQL] HiveShim should fallback to getAllPartitionsOf even if directSQL is enabled in remote HMS

GitBox Thu, 15 Jul 2021 23:37:26 -0700


sunchao commented on pull request #33382:
URL: https://github.com/apache/spark/pull/33382#issuecomment-881214247



   > Could you share us your opinion about the performance regressions for the non-failed queries, @sunchao ?
   
   I'm not sure we should call it a regression, since it switches the behavior 
from failing the query (with no workaround at the moment) to letting it 
succeed. If there is a huge number of partitions, the cost of fetching 
partition metadata from HMS could increase, as we are returning all the 
partitions instead of the filtered ones. However, _normally_ this also means 
there is more data to be processed, so the increased planning time could be 
amortized over the total job execution time.
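
To make the trade-off concrete, here is a rough sketch of the fallback pattern. It is not the actual HiveShim code: `FakeMetastore`, `FilterPushdownError`, and `prune_partitions` are hypothetical names invented for illustration, and filtering on the client after the fallback is an assumption about how the caller would prune.

```python
from typing import Callable, Dict, List

# Hypothetical stand-ins for illustration only; not Spark's or Hive's real API.
Partition = Dict[str, str]
Predicate = Callable[[Partition], bool]


class FilterPushdownError(Exception):
    """Stands in for the metastore rejecting a pushed-down partition filter."""


class FakeMetastore:
    def __init__(self, partitions: List[Partition], supports_filter: bool):
        self.partitions = partitions
        self.supports_filter = supports_filter
        self.partitions_fetched = 0  # metadata entries returned to the client

    def get_partitions_by_filter(self, predicate: Predicate) -> List[Partition]:
        if not self.supports_filter:
            raise FilterPushdownError("filter pushdown failed")
        result = [p for p in self.partitions if predicate(p)]
        self.partitions_fetched += len(result)
        return result

    def get_all_partitions(self) -> List[Partition]:
        self.partitions_fetched += len(self.partitions)
        return list(self.partitions)


def prune_partitions(metastore: FakeMetastore, predicate: Predicate) -> List[Partition]:
    """Try the filtered call first; on failure, fetch everything and prune locally."""
    try:
        return metastore.get_partitions_by_filter(predicate)
    except FilterPushdownError:
        # Fallback: the query succeeds, but all partition metadata is transferred.
        return [p for p in metastore.get_all_partitions() if predicate(p)]


partitions = [{"dt": f"2021-07-{day:02d}"} for day in range(1, 31)]
wanted = lambda p: p["dt"] == "2021-07-15"

failing = FakeMetastore(partitions, supports_filter=False)
working = FakeMetastore(partitions, supports_filter=True)
fallback_result = prune_partitions(failing, wanted)
filtered_result = prune_partitions(working, wanted)
```

Both calls return the same single partition. The difference is only in `partitions_fetched`: the fallback path pulls metadata for every partition, which is the extra planning cost described above.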


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


