Once you need this level of fine-grained control, shouldn't you consider using the programmatic API for that part of the pipeline, so you can control individual jobs?
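For illustration, a minimal sketch of that programmatic alternative. The session name `spark`, the input path, and the column names are hypothetical; the two mechanisms shown (changing `spark.sql.shuffle.partitions` on the session between actions, and pinning an explicit partition count into the lineage with `repartition`) are standard Spark SQL API.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical session and input; replace with your own.
val spark = SparkSession.builder().appName("etl-sketch").getOrCreate()
val df = spark.read.parquet("/data/events")

// Heavy aggregation stage: raise the shuffle-partition count
// before triggering this action.
spark.conf.set("spark.sql.shuffle.partitions", "2000")
val aggregated = df.groupBy("user_id").count()
aggregated.write.parquet("/tmp/agg") // this action shuffles into 2000 partitions

// Lighter follow-up stage: lower the setting before the next action.
spark.conf.set("spark.sql.shuffle.partitions", "200")

// Alternatively, fix the partitioning explicitly in the plan itself,
// independent of the session-wide setting.
val compacted = aggregated.repartition(50)
compacted.write.parquet("/tmp/compact")
```

The caveat from the thread still applies: the `conf.set` value is read when an action runs, not stored per-operator in the lineage, so `repartition` is the only way to bind a partition count to a specific step.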
On Tue, Nov 15, 2016 at 1:19 AM leo9r <lezcano....@gmail.com> wrote:
> Hi Daniel,
>
> I completely agree with your request. As the amount of data processed
> with Spark SQL grows, tweaking sql.shuffle.partitions becomes a common
> need to prevent OOM errors and performance degradation. The fact that
> sql.shuffle.partitions cannot be set several times within the same
> job/action, for the reason you explain, is a big inconvenience for the
> development of ETL pipelines.
>
> Have you received any answer or feedback in this regard?
>
> Thanks,
> Leo Lezcano
>
> --
> View this message in context:
> http://apache-spark-developers-list.1001551.n3.nabble.com/Spark-SQL-parameters-like-shuffle-partitions-should-be-stored-in-the-lineage-tp13240p19867.html
> Sent from the Apache Spark Developers List mailing list archive at Nabble.com.