[jira] [Commented] (SPARK-2046) Support config properties that are changeable across tasks/stages within a job
[ https://issues.apache.org/jira/browse/SPARK-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019457#comment-14019457 ]

Zongheng Yang commented on SPARK-2046:
--

[~shivaram]

Support config properties that are changeable across tasks/stages within a job
--

Key: SPARK-2046
URL: https://issues.apache.org/jira/browse/SPARK-2046
Project: Spark
Issue Type: Improvement
Components: Spark Core
Reporter: Zongheng Yang

Suppose an application consists of multiple stages, where some stages contain computation-intensive tasks and other stages contain less computation-intensive (or otherwise ordinary) tasks. For such a job to run efficiently, it might make sense to provide the user with a function that sets spark.task.cpus to a high number right before the computation-intensive stages/tasks are generated in the user code, and sets the property to a lower number for other stages/tasks.

As a first step, supporting this feature across stages, rather than at the more fine-grained task level, might suffice.

--
This message was sent by Atlassian JIRA (v6.2#6252)
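To make the motivation in the issue description concrete, here is a rough sketch of the arithmetic involved. It is not Spark code. It assumes an executor offers roughly floor(executor cores / spark.task.cpus) concurrent task slots; the stage names, the executor size, and the helper `concurrent_tasks` are all hypothetical.

```python
def concurrent_tasks(executor_cores: int, task_cpus: int) -> int:
    """Task slots on one executor when each task reserves task_cpus cores."""
    if task_cpus < 1:
        raise ValueError("task_cpus must be at least 1")
    return executor_cores // task_cpus


# Hypothetical job: (stage name, cores each task in that stage should reserve).
stages = [("parse", 1), ("train", 4), ("report", 1)]
EXECUTOR_CORES = 8  # assumed executor size, for illustration only

# Slots per executor for each stage if the property could change per stage.
slots = {name: concurrent_tasks(EXECUTOR_CORES, cpus) for name, cpus in stages}

for name, n in slots.items():
    print(f"{name}: {n} concurrent task(s) per executor")
```

With a single job-wide setting, the user would have to pick one value for all three stages: either the heavy stage is oversubscribed or the light stages leave cores idle. That trade-off is what a per-stage setting would avoid.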
[jira] [Commented] (SPARK-2046) Support config properties that are changeable across tasks/stages within a job
[ https://issues.apache.org/jira/browse/SPARK-2046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14019458#comment-14019458 ]

Shivaram Venkataraman commented on SPARK-2046:
--

FWIW, I have an older implementation that did this using LocalProperties in SparkContext:
https://github.com/shivaram/spark-1/commit/256a34c12d4f3c8ed1a09174f331868a7bf30e11

I haven't tested it in a setting with multiple jobs running at the same time, though.
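As an illustration of the general idea behind a local-properties approach, the following toy model keeps a per-thread property bag and snapshots it when a stage is submitted. It is only a sketch under stated assumptions: it does not reproduce the linked commit or Spark's scheduler, and `set_local_property`, `submit_stage`, and `DEFAULTS` are hypothetical names. It also keeps threads fully isolated, which is a simplification.

```python
import threading

# Per-thread property bag (hypothetical stand-in for local properties).
_local = threading.local()

# Assumed job-wide default used when a thread has set no override.
DEFAULTS = {"spark.task.cpus": "1"}


def set_local_property(key: str, value: str) -> None:
    """Set a property that affects only stages submitted from this thread."""
    props = getattr(_local, "props", None)
    if props is None:
        props = _local.props = {}
    props[key] = value


def submit_stage(name: str) -> dict:
    """Snapshot the submitting thread's properties at submission time."""
    props = dict(DEFAULTS)
    props.update(getattr(_local, "props", {}))
    return {"stage": name, "task_cpus": int(props["spark.task.cpus"])}


submitted = [submit_stage("parse")]         # uses the default

set_local_property("spark.task.cpus", "4")  # raise before the heavy stage
submitted.append(submit_stage("train"))

# A job submitted from another thread does not see this thread's override.
other = []
worker = threading.Thread(target=lambda: other.append(submit_stage("other-job")))
worker.start()
worker.join()

set_local_property("spark.task.cpus", "1")  # lower again for ordinary stages
submitted.append(submit_stage("report"))

for s in submitted + other:
    print(s)
```

Because each stage carries its own snapshot, changing the property later does not affect stages that were already submitted. Whether a real scheduler behaves correctly when several jobs with different settings run at the same time is the open question raised in the comment, and this toy does not answer it.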