[GitHub] [spark] cloud-fan commented on pull request #30559: [SPARK-33617][SQL] spark.sql.files.minPartitionNum effective for LocalTableScan

GitBox Wed, 02 Dec 2020 03:54:25 -0800


cloud-fan commented on pull request #30559:
URL: https://github.com/apache/spark/pull/30559#issuecomment-737183359



   It's a common issue that we sometimes don't want to use Spark's default 
parallelism in Spark SQL, because:
   1. it depends on the cluster resources and can be very large;
   2. it's a global config, not a per-session one.
   
   To work around this, Spark SQL has added new configs for specific cases such 
as file scan and AQE coalescing of shuffle partitions. Maybe we should have a 
SQL config that specifies the default parallelism for Spark SQL queries and 
that falls back to the original default parallelism when unset. Then we can 
use the new config in local table scan and in other places in the future. 
cc @viirya @dongjoon-hyun 
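
The fallback behaviour proposed above could be sketched roughly as follows. This is a minimal illustration in plain Python, not Spark code: the config key `spark.sql.example.defaultParallelism` and the helper `sql_default_parallelism` are hypothetical names invented for this sketch, and the session conf is modelled as a simple dict of strings.

```python
from typing import Mapping

# Hypothetical config key, invented for this sketch; not a confirmed Spark config.
SQL_DEFAULT_PARALLELISM_KEY = "spark.sql.example.defaultParallelism"


def sql_default_parallelism(
    session_conf: Mapping[str, str],
    core_default_parallelism: int,
) -> int:
    """Resolve the parallelism a SQL operator should use.

    If the per-session SQL config is set, use it; otherwise fall back to
    the global (cluster-dependent) default parallelism.
    """
    raw = session_conf.get(SQL_DEFAULT_PARALLELISM_KEY)
    if raw is None:
        # Config unset: behaviour is unchanged from today.
        return core_default_parallelism
    value = int(raw)
    if value <= 0:
        raise ValueError(
            f"{SQL_DEFAULT_PARALLELISM_KEY} must be positive, got {value}"
        )
    return value


# Two sessions on the same cluster: one overrides the value, one does not.
cluster_default = 2000  # assumed large value derived from cluster resources
session_a = {SQL_DEFAULT_PARALLELISM_KEY: "8"}
session_b = {}
parallelism_a = sql_default_parallelism(session_a, cluster_default)
parallelism_b = sql_default_parallelism(session_b, cluster_default)
```

Under these assumptions, a session that sets the config gets its own value, while every other session keeps the existing default, so nothing changes unless a user opts in.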


