[jira] [Commented] (SPARK-2972) APPLICATION_COMPLETE not created in Python unless context explicitly stopped
[ https://issues.apache.org/jira/browse/SPARK-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126686#comment-14126686 ]

Shay Rojansky commented on SPARK-2972:
--------------------------------------

{quote}you're right! imho, this means your program is written better than the examples. it would be good to enhance the examples w/ try/finally semantics.{quote}

Then I can submit a pull request for that, no problem.

{quote}however, getting the shutdown semantics right is difficult, and may not apply broadly across applications. for instance, your application may want to catch a failure in stop() and retry to make sure that a history record is written. another application may be ok w/ best-effort writing of history events. still another application may want to exit w/o stop() to avoid having a history event written.{quote}

I don't think explicit stop() should be removed - of course users may choose to manually manage stop(), catch exceptions and retry, etc. For me it's just a question of what to do with a context that *didn't* get explicitly closed at the end of the application. As for apps that need to exit without a history event - that's a requirement that's hard to imagine (for me). At least with YARN/Mesos you will be leaving traces anyway, and these traces will be partial and difficult to understand, since the corresponding Spark traces haven't been produced.

{quote}asking the context creator to do context destruction shifts burden to the application writer and maintains flexibility for applications.{quote}

I guess it's a question of how high-level a tool you want Spark to be. It seems a bit strange for Spark to handle so many of the troublesome low-level details, while forcing the user to boilerplate-wrap all their programs with try/finally. But I do understand the points you're making, and it can be argued both ways.

As a minimum, I suggest having the context implement the language-specific dispose patterns ('using' in Java, 'with' in Python), so at least the code looks better?
APPLICATION_COMPLETE not created in Python unless context explicitly stopped
----------------------------------------------------------------------------

                 Key: SPARK-2972
                 URL: https://issues.apache.org/jira/browse/SPARK-2972
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.0.2
         Environment: Cloudera 5.1, yarn master on ubuntu precise
            Reporter: Shay Rojansky

If you don't explicitly stop a SparkContext at the end of a Python application with sc.stop(), an APPLICATION_COMPLETE file isn't created and the job doesn't get picked up by the history server. This can be easily reproduced with pyspark (but affects scripts as well).

The current workaround is to wrap the entire script with a try/finally and stop manually.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
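The try/finally workaround described in the issue can be sketched as follows. This is a minimal, self-contained sketch: `DummyContext` is a hypothetical stand-in for `pyspark.SparkContext` (so the example runs without Spark installed), but the shape is the same for the real class.

```python
# Sketch of the try/finally workaround: guarantee stop() runs even if the
# job raises. DummyContext is a stand-in for pyspark.SparkContext.

class DummyContext:
    def __init__(self):
        self.stopped = False

    def stop(self):
        # In real PySpark, stopping the context is what lets the
        # APPLICATION_COMPLETE marker be written for the history server.
        self.stopped = True

def run_job(sc):
    return sum(range(10))  # placeholder for real Spark work

sc = DummyContext()
try:
    result = run_job(sc)
finally:
    # Without this, an abnormal exit leaves the application invisible
    # to the history server.
    sc.stop()
```

The point of the finally block is that it runs on both the success and failure paths, which is exactly the guarantee the history server needs.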
[jira] [Updated] (SPARK-3452) Maven build should skip publishing artifacts people shouldn't depend on
[ https://issues.apache.org/jira/browse/SPARK-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-3452:
-----------------------------------
    Priority: Critical  (was: Major)

Maven build should skip publishing artifacts people shouldn't depend on
-----------------------------------------------------------------------

                 Key: SPARK-3452
                 URL: https://issues.apache.org/jira/browse/SPARK-3452
             Project: Spark
          Issue Type: Bug
          Components: Build
            Reporter: Patrick Wendell
            Assignee: Prashant Sharma
            Priority: Critical

I think it's easy to do this by just adding a skip configuration somewhere. We shouldn't be publishing repl, yarn, assembly, tools, repl-bin, or examples.
[jira] [Updated] (SPARK-3452) Maven build should skip publishing artifacts people shouldn't depend on
[ https://issues.apache.org/jira/browse/SPARK-3452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Patrick Wendell updated SPARK-3452:
-----------------------------------
    Affects Version/s: 1.1.0
                       1.0.0

Maven build should skip publishing artifacts people shouldn't depend on
-----------------------------------------------------------------------

                 Key: SPARK-3452
                 URL: https://issues.apache.org/jira/browse/SPARK-3452
             Project: Spark
          Issue Type: Bug
          Components: Build
    Affects Versions: 1.0.0, 1.1.0
            Reporter: Patrick Wendell
            Assignee: Prashant Sharma
            Priority: Critical

I think it's easy to do this by just adding a skip configuration somewhere. We shouldn't be publishing repl, yarn, assembly, tools, repl-bin, or examples.
[jira] [Created] (SPARK-3452) Maven build should skip publishing artifacts people shouldn't depend on
Patrick Wendell created SPARK-3452:
-----------------------------------

             Summary: Maven build should skip publishing artifacts people shouldn't depend on
                 Key: SPARK-3452
                 URL: https://issues.apache.org/jira/browse/SPARK-3452
             Project: Spark
          Issue Type: Bug
          Components: Build
            Reporter: Patrick Wendell
            Assignee: Prashant Sharma

I think it's easy to do this by just adding a skip configuration somewhere. We shouldn't be publishing repl, yarn, assembly, tools, repl-bin, or examples.
[jira] [Commented] (SPARK-3404) SparkSubmitSuite fails with spark-submit exits with code 1
[ https://issues.apache.org/jira/browse/SPARK-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126698#comment-14126698 ]

Apache Spark commented on SPARK-3404:
-------------------------------------

User 'srowen' has created a pull request for this issue:
https://github.com/apache/spark/pull/2328

SparkSubmitSuite fails with "spark-submit exits with code 1"
------------------------------------------------------------

                 Key: SPARK-3404
                 URL: https://issues.apache.org/jira/browse/SPARK-3404
             Project: Spark
          Issue Type: Bug
          Components: Build
    Affects Versions: 1.0.2, 1.1.0
            Reporter: Sean Owen
            Priority: Critical

Maven-based Jenkins builds have been failing for over a month. For example:
https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-pre-YARN/

It's SparkSubmitSuite that fails. For example:
https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-pre-YARN/541/hadoop.version=2.0.0-mr1-cdh4.1.2,label=centos/consoleFull

{code}
SparkSubmitSuite
...
- launch simple application with spark-submit *** FAILED ***
  org.apache.spark.SparkException: Process List(./bin/spark-submit, --class, org.apache.spark.deploy.SimpleApplicationTest, --name, testApp, --master, local, file:/tmp/1409815981504-0/testJar-1409815981505.jar) exited with code 1
  at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:837)
  at org.apache.spark.deploy.SparkSubmitSuite.runSparkSubmit(SparkSubmitSuite.scala:311)
  at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$14.apply$mcV$sp(SparkSubmitSuite.scala:291)
  at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$14.apply(SparkSubmitSuite.scala:284)
  at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$14.apply(SparkSubmitSuite.scala:284)
  at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
  at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
  at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  ...
- spark submit includes jars passed in through --jar *** FAILED ***
  org.apache.spark.SparkException: Process List(./bin/spark-submit, --class, org.apache.spark.deploy.JarCreationTest, --name, testApp, --master, local-cluster[2,1,512], --jars, file:/tmp/1409815984960-0/testJar-1409815985029.jar,file:/tmp/1409815985030-0/testJar-1409815985087.jar, file:/tmp/1409815984959-0/testJar-1409815984959.jar) exited with code 1
  at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:837)
  at org.apache.spark.deploy.SparkSubmitSuite.runSparkSubmit(SparkSubmitSuite.scala:311)
  at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$15.apply$mcV$sp(SparkSubmitSuite.scala:305)
  at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$15.apply(SparkSubmitSuite.scala:294)
  at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$15.apply(SparkSubmitSuite.scala:294)
  at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
  at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
  at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
  at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
  at org.scalatest.Transformer.apply(Transformer.scala:22)
  ...
{code}

SBT builds don't fail, so it is likely to be due to some difference in how the tests are run rather than a problem with the test or core project.

This is related to http://issues.apache.org/jira/browse/SPARK-3330, but the cause identified in that JIRA is, at least, not the only cause. (Although it wouldn't hurt to be doubly sure this is not an issue by changing the Jenkins config to invoke {{mvn clean}} followed by {{mvn ... package}} instead of a single {{mvn ... clean package}}.) This JIRA tracks investigation into a different cause.

Right now I have some further information but not a PR yet. Part of the issue is that there is no clue in the log about why {{spark-submit}} exited with status 1. See https://github.com/apache/spark/pull/2108/files and https://issues.apache.org/jira/browse/SPARK-3193 for a change that would at least print stdout to the log too.

The SparkSubmit program exits with 1 when the main class it is supposed to run is not found (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L322). This is, for example, SimpleApplicationTest (https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala#L339). The test actually submits an empty JAR not containing this class; it relies on {{spark-submit}} finding the class within the compiled test-classes of the Spark project. However, it does seem to be compiled and present even with Maven. If modified to print stdout and stderr, and dump the
[jira] [Created] (SPARK-3453) Refactor Netty module to use BlockTransferService
Reynold Xin created SPARK-3453:
-------------------------------

             Summary: Refactor Netty module to use BlockTransferService
                 Key: SPARK-3453
                 URL: https://issues.apache.org/jira/browse/SPARK-3453
             Project: Spark
          Issue Type: Sub-task
            Reporter: Reynold Xin
            Assignee: Reynold Xin
[jira] [Commented] (SPARK-3453) Refactor Netty module to use BlockTransferService
[ https://issues.apache.org/jira/browse/SPARK-3453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126726#comment-14126726 ]

Apache Spark commented on SPARK-3453:
-------------------------------------

User 'rxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/2330

Refactor Netty module to use BlockTransferService
-------------------------------------------------

                 Key: SPARK-3453
                 URL: https://issues.apache.org/jira/browse/SPARK-3453
             Project: Spark
          Issue Type: Sub-task
          Components: Shuffle, Spark Core
            Reporter: Reynold Xin
            Assignee: Reynold Xin
[jira] [Commented] (SPARK-3272) Calculate prediction for nodes separately from calculating information gain for splits in decision tree
[ https://issues.apache.org/jira/browse/SPARK-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126791#comment-14126791 ]

Qiping Li commented on SPARK-3272:
----------------------------------

Hi Joseph, I created a PR [#2332|https://github.com/apache/spark/pull/2332] based on our discussion. Could you please help me review it? Thanks for your help.

Calculate prediction for nodes separately from calculating information gain for splits in decision tree
-------------------------------------------------------------------------------------------------------

                 Key: SPARK-3272
                 URL: https://issues.apache.org/jira/browse/SPARK-3272
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 1.0.2
            Reporter: Qiping Li
             Fix For: 1.1.0

In the current implementation, the prediction for a node is calculated along with the information gain stats for each possible split, even though the value to predict for a specific node is determined no matter what the splits are. To save computation, we can calculate the prediction first and then calculate the information gain stats for each split. This is also necessary if we want to support a minimum-instances-per-node parameter ([SPARK-2207|https://issues.apache.org/jira/browse/SPARK-2207]), because when no split satisfies the minimum-instances requirement, we don't use the information gain of any split, but there should still be a way to get the prediction value.
[jira] [Commented] (SPARK-3272) Calculate prediction for nodes separately from calculating information gain for splits in decision tree
[ https://issues.apache.org/jira/browse/SPARK-3272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126799#comment-14126799 ]

Apache Spark commented on SPARK-3272:
-------------------------------------

User 'chouqin' has created a pull request for this issue:
https://github.com/apache/spark/pull/2332

Calculate prediction for nodes separately from calculating information gain for splits in decision tree
-------------------------------------------------------------------------------------------------------

                 Key: SPARK-3272
                 URL: https://issues.apache.org/jira/browse/SPARK-3272
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 1.0.2
            Reporter: Qiping Li
             Fix For: 1.1.0

In the current implementation, the prediction for a node is calculated along with the information gain stats for each possible split, even though the value to predict for a specific node is determined no matter what the splits are. To save computation, we can calculate the prediction first and then calculate the information gain stats for each split. This is also necessary if we want to support a minimum-instances-per-node parameter ([SPARK-2207|https://issues.apache.org/jira/browse/SPARK-2207]), because when no split satisfies the minimum-instances requirement, we don't use the information gain of any split, but there should still be a way to get the prediction value.
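The ordering the issue argues for can be sketched in miniature. This is an illustrative sketch only, not MLlib's implementation: it uses entropy-based information gain and a majority-vote prediction, and the function names (`node_prediction`, `best_split`) are hypothetical. The key point it shows is that the node's prediction exists even when every candidate split is rejected by the minimum-instances-per-node check.

```python
# Sketch: compute a node's prediction BEFORE evaluating splits, so a valid
# prediction is available even if no split passes minInstancesPerNode.
import math
from collections import Counter

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def node_prediction(labels):
    # Majority class: independent of any split decision.
    return Counter(labels).most_common(1)[0][0]

def info_gain(labels, left, right):
    n = len(labels)
    return (entropy(labels)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))

def best_split(labels, candidate_splits, min_instances_per_node):
    """candidate_splits: list of (left_labels, right_labels) partitions."""
    prediction = node_prediction(labels)  # computed first, unconditionally
    valid = [s for s in candidate_splits
             if len(s[0]) >= min_instances_per_node
             and len(s[1]) >= min_instances_per_node]
    if not valid:
        # Node becomes a leaf: no split, but the prediction is still usable.
        return prediction, None
    return prediction, max(valid, key=lambda s: info_gain(labels, *s))

labels = [1, 1, 1, 0]
# The only candidate split leaves 1 instance on the left, below the minimum:
pred, split = best_split(labels, [([1], [1, 1, 0])], min_instances_per_node=2)
```

Here `split` comes back as `None` (all splits rejected) while `pred` is still the majority label, which is exactly the separation of concerns the issue asks for.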
[jira] [Commented] (SPARK-2207) Add minimum information gain and minimum instances per node as training parameters for decision tree.
[ https://issues.apache.org/jira/browse/SPARK-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126798#comment-14126798 ]

Apache Spark commented on SPARK-2207:
-------------------------------------

User 'chouqin' has created a pull request for this issue:
https://github.com/apache/spark/pull/2332

Add minimum information gain and minimum instances per node as training parameters for decision tree.
-----------------------------------------------------------------------------------------------------

                 Key: SPARK-2207
                 URL: https://issues.apache.org/jira/browse/SPARK-2207
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
    Affects Versions: 1.0.0
            Reporter: Manish Amde
            Assignee: Qiping Li
[jira] [Resolved] (SPARK-3222) cross join support in HiveQl
[ https://issues.apache.org/jira/browse/SPARK-3222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adrian Wang resolved SPARK-3222.
--------------------------------
       Resolution: Fixed
    Fix Version/s: 1.1.0

Resolved by PR #2124.

cross join support in HiveQl
----------------------------

                 Key: SPARK-3222
                 URL: https://issues.apache.org/jira/browse/SPARK-3222
             Project: Spark
          Issue Type: New Feature
          Components: SQL
            Reporter: Adrian Wang
             Fix For: 1.1.0

Spark SQL HiveQl should support cross join.
[jira] [Updated] (SPARK-3454) Expose JSON representation of data shown in WebUI
[ https://issues.apache.org/jira/browse/SPARK-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kousuke Saruta updated SPARK-3454:
----------------------------------
    Summary: Expose JSON representation of data shown in WebUI  (was: Expose JSON expression of data shown in WebUI)

Expose JSON representation of data shown in WebUI
-------------------------------------------------

                 Key: SPARK-3454
                 URL: https://issues.apache.org/jira/browse/SPARK-3454
             Project: Spark
          Issue Type: Improvement
          Components: Web UI
    Affects Versions: 1.1.0
            Reporter: Kousuke Saruta

If the WebUI supported extracting the data it shows as JSON, it would be helpful for users who want to analyse stage / task / executor information. Fortunately, WebUI has a renderJson method, so we can implement that method in each subclass.
[jira] [Created] (SPARK-3454) Expose JSON expression of data shown in WebUI
Kousuke Saruta created SPARK-3454:
----------------------------------

             Summary: Expose JSON expression of data shown in WebUI
                 Key: SPARK-3454
                 URL: https://issues.apache.org/jira/browse/SPARK-3454
             Project: Spark
          Issue Type: Improvement
          Components: Web UI
    Affects Versions: 1.1.0
            Reporter: Kousuke Saruta

If the WebUI supported extracting the data it shows as JSON, it would be helpful for users who want to analyse stage / task / executor information. Fortunately, WebUI has a renderJson method, so we can implement that method in each subclass.
[jira] [Commented] (SPARK-3454) Expose JSON representation of data shown in WebUI
[ https://issues.apache.org/jira/browse/SPARK-3454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126854#comment-14126854 ]

Apache Spark commented on SPARK-3454:
-------------------------------------

User 'sarutak' has created a pull request for this issue:
https://github.com/apache/spark/pull/2333

Expose JSON representation of data shown in WebUI
-------------------------------------------------

                 Key: SPARK-3454
                 URL: https://issues.apache.org/jira/browse/SPARK-3454
             Project: Spark
          Issue Type: Improvement
          Components: Web UI
    Affects Versions: 1.1.0
            Reporter: Kousuke Saruta

If the WebUI supported extracting the data it shows as JSON, it would be helpful for users who want to analyse stage / task / executor information. Fortunately, WebUI has a renderJson method, so we can implement that method in each subclass.
[jira] [Created] (SPARK-3455) **HotFix** Unit test failed due to can not resolve the attribute references
Cheng Hao created SPARK-3455:
-----------------------------

             Summary: **HotFix** Unit test failed due to can not resolve the attribute references
                 Key: SPARK-3455
                 URL: https://issues.apache.org/jira/browse/SPARK-3455
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Cheng Hao
            Priority: Blocker

The test case "SPARK-3349 partitioning after limit" failed with the following exception:

{panel}
23:10:04.117 ERROR org.apache.spark.scheduler.TaskSetManager: Task 0 in stage 274.0 failed 1 times; aborting job
[info] - SPARK-3349 partitioning after limit *** FAILED ***
[info]   Exception thrown while executing query:
[info]   == Parsed Logical Plan ==
[info]   Project [*]
[info]    Join Inner, Some(('subset1.n = 'lowerCaseData.n))
[info]     UnresolvedRelation None, lowerCaseData, None
[info]     UnresolvedRelation None, subset1, None
[info]
[info]   == Analyzed Logical Plan ==
[info]   Project [n#605,l#606,n#12]
[info]    Join Inner, Some((n#12 = n#605))
[info]     SparkLogicalPlan (ExistingRdd [n#605,l#606], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219)
[info]     Limit 2
[info]      Sort [n#12 DESC]
[info]       Distinct
[info]        Project [n#12]
[info]         SparkLogicalPlan (ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219)
[info]
[info]   == Optimized Logical Plan ==
[info]   Project [n#605,l#606,n#12]
[info]    Join Inner, Some((n#12 = n#605))
[info]     SparkLogicalPlan (ExistingRdd [n#605,l#606], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219)
[info]     Limit 2
[info]      Sort [n#12 DESC]
[info]       Distinct
[info]        Project [n#12]
[info]         SparkLogicalPlan (ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219)
[info]
[info]   == Physical Plan ==
[info]   Project [n#605,l#606,n#12]
[info]    ShuffledHashJoin [n#605], [n#12], BuildRight
[info]     Exchange (HashPartitioning [n#605], 10)
[info]      ExistingRdd [n#605,l#606], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219
[info]     Exchange (HashPartitioning [n#12], 10)
[info]      TakeOrdered 2, [n#12 DESC]
[info]       Distinct false
[info]        Exchange (HashPartitioning [n#12], 10)
[info]         Distinct true
[info]          Project [n#12]
[info]           ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219
[info]
[info]   Code Generation: false
[info]   == RDD ==
[info]   == Exception ==
[info]   org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
[info]   Exchange (HashPartitioning [n#12], 10)
[info]    TakeOrdered 2, [n#12 DESC]
[info]     Distinct false
[info]      Exchange (HashPartitioning [n#12], 10)
[info]       Distinct true
[info]        Project [n#12]
[info]         ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219
[info]
[info]   org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
[info]   Exchange (HashPartitioning [n#12], 10)
[info]    TakeOrdered 2, [n#12 DESC]
[info]     Distinct false
[info]      Exchange (HashPartitioning [n#12], 10)
[info]       Distinct true
[info]        Project [n#12]
[info]         ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219
[info]
[info]   at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:47)
[info]   at org.apache.spark.sql.execution.Exchange.execute(Exchange.scala:44)
[info]   at org.apache.spark.sql.execution.ShuffledHashJoin.execute(joins.scala:354)
[info]   at org.apache.spark.sql.execution.Project.execute(basicOperators.scala:42)
[info]   at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:85)
[info]   at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)
[info]   at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:40)
[info]   at org.apache.spark.sql.SQLQuerySuite$$anonfun$31.apply$mcV$sp(SQLQuerySuite.scala:369)
[info]   at org.apache.spark.sql.SQLQuerySuite$$anonfun$31.apply(SQLQuerySuite.scala:362)
[info]   at org.apache.spark.sql.SQLQuerySuite$$anonfun$31.apply(SQLQuerySuite.scala:362)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
[info]   at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85)
[info]   at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer.apply(Transformer.scala:20)
[info]   at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:158)
[info]   at
[jira] [Commented] (SPARK-3455) **HotFix** Unit test failed due to can not resolve the attribute references
[ https://issues.apache.org/jira/browse/SPARK-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14126909#comment-14126909 ]

Apache Spark commented on SPARK-3455:
-------------------------------------

User 'chenghao-intel' has created a pull request for this issue:
https://github.com/apache/spark/pull/2334

**HotFix** Unit test failed due to can not resolve the attribute references
---------------------------------------------------------------------------

                 Key: SPARK-3455
                 URL: https://issues.apache.org/jira/browse/SPARK-3455
             Project: Spark
          Issue Type: Bug
          Components: SQL
            Reporter: Cheng Hao
            Priority: Blocker

The test case "SPARK-3349 partitioning after limit" failed with the following exception:

{panel}
23:10:04.117 ERROR org.apache.spark.scheduler.TaskSetManager: Task 0 in stage 274.0 failed 1 times; aborting job
[info] - SPARK-3349 partitioning after limit *** FAILED ***
[info]   Exception thrown while executing query:
[info]   == Parsed Logical Plan ==
[info]   Project [*]
[info]    Join Inner, Some(('subset1.n = 'lowerCaseData.n))
[info]     UnresolvedRelation None, lowerCaseData, None
[info]     UnresolvedRelation None, subset1, None
[info]
[info]   == Analyzed Logical Plan ==
[info]   Project [n#605,l#606,n#12]
[info]    Join Inner, Some((n#12 = n#605))
[info]     SparkLogicalPlan (ExistingRdd [n#605,l#606], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219)
[info]     Limit 2
[info]      Sort [n#12 DESC]
[info]       Distinct
[info]        Project [n#12]
[info]         SparkLogicalPlan (ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219)
[info]
[info]   == Optimized Logical Plan ==
[info]   Project [n#605,l#606,n#12]
[info]    Join Inner, Some((n#12 = n#605))
[info]     SparkLogicalPlan (ExistingRdd [n#605,l#606], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219)
[info]     Limit 2
[info]      Sort [n#12 DESC]
[info]       Distinct
[info]        Project [n#12]
[info]         SparkLogicalPlan (ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219)
[info]
[info]   == Physical Plan ==
[info]   Project [n#605,l#606,n#12]
[info]    ShuffledHashJoin [n#605], [n#12], BuildRight
[info]     Exchange (HashPartitioning [n#605], 10)
[info]      ExistingRdd [n#605,l#606], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219
[info]     Exchange (HashPartitioning [n#12], 10)
[info]      TakeOrdered 2, [n#12 DESC]
[info]       Distinct false
[info]        Exchange (HashPartitioning [n#12], 10)
[info]         Distinct true
[info]          Project [n#12]
[info]           ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219
[info]
[info]   Code Generation: false
[info]   == RDD ==
[info]   == Exception ==
[info]   org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
[info]   Exchange (HashPartitioning [n#12], 10)
[info]    TakeOrdered 2, [n#12 DESC]
[info]     Distinct false
[info]      Exchange (HashPartitioning [n#12], 10)
[info]       Distinct true
[info]        Project [n#12]
[info]         ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219
[info]
[info]   org.apache.spark.sql.catalyst.errors.package$TreeNodeException: execute, tree:
[info]   Exchange (HashPartitioning [n#12], 10)
[info]    TakeOrdered 2, [n#12 DESC]
[info]     Distinct false
[info]      Exchange (HashPartitioning [n#12], 10)
[info]       Distinct true
[info]        Project [n#12]
[info]         ExistingRdd [n#607,l#608], MapPartitionsRDD[13] at mapPartitions at basicOperators.scala:219
[info]
[info]   at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:47)
[info]   at org.apache.spark.sql.execution.Exchange.execute(Exchange.scala:44)
[info]   at org.apache.spark.sql.execution.ShuffledHashJoin.execute(joins.scala:354)
[info]   at org.apache.spark.sql.execution.Project.execute(basicOperators.scala:42)
[info]   at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:85)
[info]   at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)
[info]   at org.apache.spark.sql.QueryTest.checkAnswer(QueryTest.scala:40)
[info]   at org.apache.spark.sql.SQLQuerySuite$$anonfun$31.apply$mcV$sp(SQLQuerySuite.scala:369)
[info]   at org.apache.spark.sql.SQLQuerySuite$$anonfun$31.apply(SQLQuerySuite.scala:362)
[info]   at org.apache.spark.sql.SQLQuerySuite$$anonfun$31.apply(SQLQuerySuite.scala:362)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
[info]   at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22)
[jira] [Commented] (SPARK-2972) APPLICATION_COMPLETE not created in Python unless context explicitly stopped
[ https://issues.apache.org/jira/browse/SPARK-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127016#comment-14127016 ]

Matthew Farrellee commented on SPARK-2972:
------------------------------------------

{quote}I suggest having context implement the language-specific dispose patterns ('using' in Java, 'with' in Python), so at least the code looks better?{quote}

that's a great idea. i'll spec this out for python, would you care to do it for java / scala?

APPLICATION_COMPLETE not created in Python unless context explicitly stopped
----------------------------------------------------------------------------

                 Key: SPARK-2972
                 URL: https://issues.apache.org/jira/browse/SPARK-2972
             Project: Spark
          Issue Type: Bug
          Components: PySpark
    Affects Versions: 1.0.2
         Environment: Cloudera 5.1, yarn master on ubuntu precise
            Reporter: Shay Rojansky

If you don't explicitly stop a SparkContext at the end of a Python application with sc.stop(), an APPLICATION_COMPLETE file isn't created and the job doesn't get picked up by the history server. This can be easily reproduced with pyspark (but affects scripts as well).

The current workaround is to wrap the entire script with a try/finally and stop manually.
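The 'with' pattern proposed in this thread could look like the sketch below. This is not PySpark's actual API at the time of the discussion; `DummyContext` is a hypothetical stand-in for `SparkContext` so the sketch is self-contained, but the `__enter__`/`__exit__` protocol is exactly what Python's `with` statement requires.

```python
# Sketch of 'with' support for a Spark-like context: __exit__ guarantees
# stop() on both normal exit and exceptions, replacing user try/finally.
# DummyContext is a stand-in for pyspark.SparkContext.

class DummyContext:
    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.stop()   # runs whether or not the body raised
        return False  # don't swallow exceptions from the body

with DummyContext() as sc:
    total = sum(range(5))  # placeholder for real Spark work

# After the block, sc.stopped is True without any explicit try/finally.
```

Design-wise, returning `False` from `__exit__` keeps the semantics identical to the try/finally workaround: the context is always stopped, but failures still propagate to the caller.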
[jira] [Created] (SPARK-3456) YarnAllocator can lose container requests to RM
Thomas Graves created SPARK-3456:
---------------------------------

             Summary: YarnAllocator can lose container requests to RM
                 Key: SPARK-3456
                 URL: https://issues.apache.org/jira/browse/SPARK-3456
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 1.2.0
            Reporter: Thomas Graves
            Priority: Critical

I haven't actually tested this yet, but I believe that Spark on YARN can lose container requests to the RM. The reason is that we ask for the total number up front (say x), but then we don't ask for any more unless some are missing, and if we do, we could erase the original request. For example:

- ask for 3 containers
- 1 is allocated
- ask for 0 containers, since we asked for 3 originally (2 left)
- the 1 allocated container dies
- we now ask for 1 since it's missing; this overrides whatever is on the YARN side (in this case 2), and we then lose the 2 more we need.
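The accounting bug in the steps above can be illustrated with a toy model. This is an assumption-laden sketch, not Spark's or YARN's code: `ToyYarnRM` and its methods are hypothetical, and the only behavior it borrows from the report is that a new ask *replaces* the pending request on the RM side rather than adding to it.

```python
# Toy model of replace-semantics container asks: a new request SETS the
# RM's pending count instead of adding to it, so a small re-ask can erase
# a larger outstanding request.

class ToyYarnRM:
    def __init__(self):
        self.pending = 0

    def ask(self, new_total=None):
        # Replace semantics (the bug's trigger); an empty ask changes nothing.
        if new_total is not None:
            self.pending = new_total

    def grant(self, n):
        self.pending -= n
        return n

rm = ToyYarnRM()
running = 0

rm.ask(3)                         # ask for 3 containers up front
running += rm.grant(1)            # RM allocates 1 (2 still pending RM-side)
rm.ask(None)                      # nothing missing yet, so no new ask
running -= 1                      # the allocated container dies
rm.ask(1)                         # re-ask only for the 1 missing container...
running += rm.grant(rm.pending)   # ...so the RM now satisfies just that 1

shortfall = 3 - running           # the 2 erased containers are never delivered
```

With add semantics (as the comment below notes AMRMClient provides on yarn-stable), the final ask would have brought the pending count to 3 rather than resetting it to 1, and the shortfall would be zero.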
[jira] [Commented] (SPARK-3456) YarnAllocator can lose container requests to RM
[ https://issues.apache.org/jira/browse/SPARK-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127088#comment-14127088 ]

Thomas Graves commented on SPARK-3456:
--------------------------------------

Note: this is only a problem on yarn-alpha, because on yarn-stable we use the AMRMClient interface, which actually does an add.

YarnAllocator can lose container requests to RM
-----------------------------------------------

                 Key: SPARK-3456
                 URL: https://issues.apache.org/jira/browse/SPARK-3456
             Project: Spark
          Issue Type: Bug
          Components: YARN
    Affects Versions: 1.2.0
            Reporter: Thomas Graves
            Priority: Critical

I haven't actually tested this yet, but I believe that Spark on YARN can lose container requests to the RM. The reason is that we ask for the total number up front (say x), but then we don't ask for any more unless some are missing, and if we do, we could erase the original request. For example:

- ask for 3 containers
- 1 is allocated
- ask for 0 containers, since we asked for 3 originally (2 left)
- the 1 allocated container dies
- we now ask for 1 since it's missing; this overrides whatever is on the YARN side (in this case 2), and we then lose the 2 more we need.
[jira] [Created] (SPARK-3457) ConcurrentModificationException starting up pyspark
Shay Rojansky created SPARK-3457:
---------------------------------

             Summary: ConcurrentModificationException starting up pyspark
                 Key: SPARK-3457
                 URL: https://issues.apache.org/jira/browse/SPARK-3457
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 1.1.0
         Environment: Hadoop 2.3 (CDH 5.1) on Ubuntu precise
            Reporter: Shay Rojansky

Just downloaded Spark 1.1.0-rc4. Launching pyspark for the very first time in yarn-client mode (no additional params or anything), I got the exception below. Rerunning pyspark 5 times afterwards did not reproduce the issue.

{code}
14/09/09 18:07:58 INFO YarnClientSchedulerBackend: Application report from ASM:
	 appMasterRpcPort: 0
	 appStartTime: 1410275267606
	 yarnAppState: RUNNING

14/09/09 18:07:58 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, PROXY_HOST=master.grid.eaglerd.local,PROXY_URI_BASE=http://master.grid.eaglerd.local:8088/proxy/application_1410268447887_0011, /proxy/application_1410268447887_0011
Traceback (most recent call last):
  File "/opt/spark/python/pyspark/shell.py", line 44, in <module>
14/09/09 18:07:58 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
    sc = SparkContext(appName="PySparkShell", pyFiles=add_files)
  File "/opt/spark/python/pyspark/context.py", line 107, in __init__
    conf)
  File "/opt/spark/python/pyspark/context.py", line 155, in _do_init
    self._jsc = self._initialize_context(self._conf._jconf)
  File "/opt/spark/python/pyspark/context.py", line 201, in _initialize_context
    return self._jvm.JavaSparkContext(jconf)
  File "/opt/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py", line 701, in __call__
  File "/opt/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py", line 300, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.util.ConcurrentModificationException
	at java.util.Hashtable$Enumerator.next(Hashtable.java:1167)
	at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$3.next(Wrappers.scala:458)
	at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$3.next(Wrappers.scala:454)
	at scala.collection.Iterator$class.toStream(Iterator.scala:1143)
	at scala.collection.AbstractIterator.toStream(Iterator.scala:1157)
	at scala.collection.Iterator$$anonfun$toStream$1.apply(Iterator.scala:1143)
	at scala.collection.Iterator$$anonfun$toStream$1.apply(Iterator.scala:1143)
	at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1085)
	at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1077)
	at scala.collection.immutable.Stream$$anonfun$filteredTail$1.apply(Stream.scala:1149)
	at scala.collection.immutable.Stream$$anonfun$filteredTail$1.apply(Stream.scala:1149)
	at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1085)
	at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1077)
	at scala.collection.immutable.Stream.length(Stream.scala:284)
	at scala.collection.SeqLike$class.sorted(SeqLike.scala:608)
	at scala.collection.AbstractSeq.sorted(Seq.scala:40)
	at org.apache.spark.SparkEnv$.environmentDetails(SparkEnv.scala:324)
	at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:1297)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:334)
	at org.apache.spark.api.java.JavaSparkContext.<init>(JavaSparkContext.scala:53)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
	at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379)
	at py4j.Gateway.invoke(Gateway.java:214)
	at py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79)
	at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68)
	at py4j.GatewayConnection.run(GatewayConnection.java:207)
	at java.lang.Thread.run(Thread.java:745)
{code}
[jira] [Updated] (SPARK-3457) ConcurrentModificationException starting up pyspark
[ https://issues.apache.org/jira/browse/SPARK-3457?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shay Rojansky updated SPARK-3457: - Description: Just downloaded Spark 1.1.0-rc4. Launching pyspark for the very first time in yarn-client mode (no additional params or anything), I got the exception below. Rerunning pyspark 5 times afterwards did not reproduce the issue. {code} 14/09/09 18:07:58 INFO YarnClientSchedulerBackend: Application report from ASM: appMasterRpcPort: 0 appStartTime: 1410275267606 yarnAppState: RUNNING 14/09/09 18:07:58 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, PROXY_HOST=master. grid.eaglerd.local,PROXY_URI_BASE=http://master.grid.eaglerd.local:8088/proxy/application_1410268447887_0011, /proxy/application_1410268447887_0011 Traceback (most recent call last): File /opt/spark/python/pyspark/shell.py, line 44, in module 14/09/09 18:07:58 INFO JettyUtils: Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter sc = SparkContext(appName=PySparkShell, pyFiles=add_files) File /opt/spark/python/pyspark/context.py, line 107, in __init__ conf) File /opt/spark/python/pyspark/context.py, line 155, in _do_init self._jsc = self._initialize_context(self._conf._jconf) File /opt/spark/python/pyspark/context.py, line 201, in _initialize_context return self._jvm.JavaSparkContext(jconf) File /opt/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py, line 701, in __call__ File /opt/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py, line 300, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. 
: java.util.ConcurrentModificationException at java.util.Hashtable$Enumerator.next(Hashtable.java:1167) at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$3.next(Wrappers.scala:458) at scala.collection.convert.Wrappers$JPropertiesWrapper$$anon$3.next(Wrappers.scala:454) at scala.collection.Iterator$class.toStream(Iterator.scala:1143) at scala.collection.AbstractIterator.toStream(Iterator.scala:1157) at scala.collection.Iterator$$anonfun$toStream$1.apply(Iterator.scala:1143) at scala.collection.Iterator$$anonfun$toStream$1.apply(Iterator.scala:1143) at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1085) at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1077) at scala.collection.immutable.Stream$$anonfun$filteredTail$1.apply(Stream.scala:1149) at scala.collection.immutable.Stream$$anonfun$filteredTail$1.apply(Stream.scala:1149) at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1085) at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1077) at scala.collection.immutable.Stream.length(Stream.scala:284) at scala.collection.SeqLike$class.sorted(SeqLike.scala:608) at scala.collection.AbstractSeq.sorted(Seq.scala:40) at org.apache.spark.SparkEnv$.environmentDetails(SparkEnv.scala:324) at org.apache.spark.SparkContext.postEnvironmentUpdate(SparkContext.scala:1297) at org.apache.spark.SparkContext.init(SparkContext.scala:334) at org.apache.spark.api.java.JavaSparkContext.init(JavaSparkContext.scala:53) at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:526) at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:234) at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:379) at py4j.Gateway.invoke(Gateway.java:214) at 
py4j.commands.ConstructorCommand.invokeConstructor(ConstructorCommand.java:79) at py4j.commands.ConstructorCommand.execute(ConstructorCommand.java:68) at py4j.GatewayConnection.run(GatewayConnection.java:207) at java.lang.Thread.run(Thread.java:745) {code} was: Just downloaded Spark 1.1.0-rc4. Launching pyspark for the very first time in yarn-client mode (no additional params or anything), I got the exception below. Rerunning pyspark 5 times afterwards did not reproduce the issue. 14/09/09 18:07:58 INFO YarnClientSchedulerBackend: Application report from ASM: appMasterRpcPort: 0 appStartTime: 1410275267606 yarnAppState: RUNNING 14/09/09 18:07:58 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, PROXY_HOST=master. grid.eaglerd.local,PROXY_URI_BASE=http://master.grid.eaglerd.local:8088/proxy/application_1410268447887_0011,
[jira] [Created] (SPARK-3458) enable use of python's with statements for SparkContext management
Matthew Farrellee created SPARK-3458: Summary: enable use of python's with statements for SparkContext management Key: SPARK-3458 URL: https://issues.apache.org/jira/browse/SPARK-3458 Project: Spark Issue Type: New Feature Components: PySpark Reporter: Matthew Farrellee best practice for managing SparkContexts involves exception handling, e.g.

```
try:
    sc = SparkContext()
    app(sc)
finally:
    sc.stop()
```

python provides the with statement to simplify this code, e.g.

```
with SparkContext() as sc:
    app(sc)
```

the SparkContext should be usable in a with statement -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
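For context, with-statement support amounts to adding `__enter__` and `__exit__`. Here is a sketch using a stub class in place of the real SparkContext (so it runs without Spark installed), with `__exit__` delegating to stop():

```python
class StubSparkContext:
    """Stand-in for SparkContext; the real change would add the same
    two methods to pyspark's SparkContext."""
    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.stop()        # runs on normal exit *and* on exceptions
        return False       # do not swallow the exception


# stop() is called even when the body raises, just like try/finally:
ctx = StubSparkContext()
try:
    with ctx as sc:
        raise RuntimeError("job failed")
except RuntimeError:
    pass
assert ctx.stopped
```

This gives the same guarantee as the try/finally form with none of the boilerplate.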
[jira] [Commented] (SPARK-2972) APPLICATION_COMPLETE not created in Python unless context explicitly stopped
[ https://issues.apache.org/jira/browse/SPARK-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127171#comment-14127171 ] Shay Rojansky commented on SPARK-2972: -- I'd love to help on this, but I know 0 Scala (I could have helped with the Python though :)). A quick search shows that Scala has no built-in equivalent of Python's 'with' or Java's Closeable. There are several third-party implementations out there, but it doesn't seem right to bring in a non-core library for this kind of thing. I think someone with real Scala knowledge should take a look at this. We can close this issue and open a separate one for the Scala closeability if you want. APPLICATION_COMPLETE not created in Python unless context explicitly stopped Key: SPARK-2972 URL: https://issues.apache.org/jira/browse/SPARK-2972 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 1.0.2 Environment: Cloudera 5.1, yarn master on ubuntu precise Reporter: Shay Rojansky If you don't explicitly stop a SparkContext at the end of a Python application with sc.stop(), an APPLICATION_COMPLETE file isn't created and the job doesn't get picked up by the history server. This can be easily reproduced with pyspark (but affects scripts as well). The current workaround is to wrap the entire script with a try/finally and stop manually. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2972) APPLICATION_COMPLETE not created in Python unless context explicitly stopped
[ https://issues.apache.org/jira/browse/SPARK-2972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127187#comment-14127187 ] Matthew Farrellee commented on SPARK-2972: -- +1 close this and open 2 feature requests, one for java and one for scala that mirror SPARK-3458 APPLICATION_COMPLETE not created in Python unless context explicitly stopped Key: SPARK-2972 URL: https://issues.apache.org/jira/browse/SPARK-2972 Project: Spark Issue Type: Bug Components: PySpark Affects Versions: 1.0.2 Environment: Cloudera 5.1, yarn master on ubuntu precise Reporter: Shay Rojansky If you don't explicitly stop a SparkContext at the end of a Python application with sc.stop(), an APPLICATION_COMPLETE file isn't created and the job doesn't get picked up by the history server. This can be easily reproduced with pyspark (but affects scripts as well). The current workaround is to wrap the entire script with a try/finally and stop manually. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3174) Under YARN, add and remove executors based on load
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127188#comment-14127188 ] Thomas Graves commented on SPARK-3174: -- Since you mention the graceful decommission as large enough to be a feature of its own, the only way we would give executors back is if they are not being used and have no data in the cache, correct? Under YARN, add and remove executors based on load -- Key: SPARK-3174 URL: https://issues.apache.org/jira/browse/SPARK-3174 Project: Spark Issue Type: Improvement Components: YARN Affects Versions: 1.0.2 Reporter: Sandy Ryza Assignee: Andrew Or Attachments: SPARK-3174design.pdf A common complaint with Spark in a multi-tenant environment is that applications have a fixed allocation that doesn't grow and shrink with their resource needs. We're blocked on YARN-1197 for dynamically changing the resources within executors, but we can still allocate and discard whole executors. I think it would be useful to have some heuristics that * Request more executors when many pending tasks are building up * Request more executors when RDDs can't fit in memory * Discard executors when few tasks are running / pending and there's not much in memory Bonus points: migrate blocks from executors we're about to discard to executors with free space. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
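The heuristics in the issue description could be sketched as a target-executor calculation. Everything here (function name, parameters, thresholds) is illustrative, not the proposed design:

```python
def desired_executors(pending_tasks, cached_blocks, current,
                      slots_per_executor, max_executors, min_executors=1):
    """Toy policy: grow on task backlog, shrink when idle with no cache."""
    if pending_tasks > 0:
        # request enough extra executors to cover the pending tasks
        want = current + -(-pending_tasks // slots_per_executor)  # ceil division
    elif cached_blocks == 0:
        # nothing pending and nothing cached: give executors back
        want = min_executors
    else:
        # idle but holding cached data: keep what we have (shrinking here
        # would need the graceful decommission / block migration discussed)
        want = current
    return max(min_executors, min(want, max_executors))
```

The last branch is exactly where Thomas Graves's question bites: without block migration, executors holding cached data cannot be reclaimed safely.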
[jira] [Created] (SPARK-3459) MulticlassMetrics is not serializable
Xiangrui Meng created SPARK-3459: Summary: MulticlassMetrics is not serializable Key: SPARK-3459 URL: https://issues.apache.org/jira/browse/SPARK-3459 Project: Spark Issue Type: Bug Components: MLlib Reporter: Xiangrui Meng Some task closures contain member variables and hence hold a reference to the enclosing object, which causes a task-not-serializable exception on a real cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
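The failure mode, a closure built inside a method implicitly capturing the whole enclosing object, can be shown with plain Python closures (class and field names here are illustrative, not MLlib's):

```python
class Metrics:
    """Illustrative stand-in for a class like MulticlassMetrics."""
    def __init__(self, offset):
        self.offset = offset

    def bad_mapper(self):
        # referencing self.offset captures *self*, so the whole object
        # (and anything unserializable it holds) ships with the task
        return lambda x: x + self.offset

    def good_mapper(self):
        offset = self.offset          # copy the field to a local first
        return lambda x: x + offset   # now only the int is captured


m = Metrics(10)
bad_captured = [c.cell_contents for c in m.bad_mapper().__closure__]
good_captured = [c.cell_contents for c in m.good_mapper().__closure__]
```

The common fix is what good_mapper does: copy the needed field into a local variable before building the closure, so the task serializes only the value it uses.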
[jira] [Commented] (SPARK-3458) enable use of python's with statements for SparkContext management
[ https://issues.apache.org/jira/browse/SPARK-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127210#comment-14127210 ] Apache Spark commented on SPARK-3458: - User 'mattf' has created a pull request for this issue: https://github.com/apache/spark/pull/2335 enable use of python's with statements for SparkContext management Key: SPARK-3458 URL: https://issues.apache.org/jira/browse/SPARK-3458 Project: Spark Issue Type: New Feature Components: PySpark Reporter: Matthew Farrellee Assignee: Matthew Farrellee Labels: features, python, sparkcontext best practice for managing SparkContexts involves exception handling, e.g. {code} try: sc = SparkContext() app(sc) finally: sc.stop() {code} python provides the with statement to simplify this code, e.g. {code} with SparkContext() as sc: app(sc) {code} the SparkContext should be usable in a with statement -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-3174) Under YARN, add and remove executors based on load
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127188#comment-14127188 ] Thomas Graves edited comment on SPARK-3174 at 9/9/14 4:51 PM: -- Since you mention the graceful decommission as large enough to be a feature of its own the only way we would give executors back is if they are not being used and have no data in the cache, correct? Perhaps this needs umbrella jira if we are splitting those apart. was (Author: tgraves): Since you mention the graceful decommission as large enough to be a feature of its own the only way we would give executors back is if they are not being used and have no data in the cache, correct? Under YARN, add and remove executors based on load -- Key: SPARK-3174 URL: https://issues.apache.org/jira/browse/SPARK-3174 Project: Spark Issue Type: Improvement Components: YARN Affects Versions: 1.0.2 Reporter: Sandy Ryza Assignee: Andrew Or Attachments: SPARK-3174design.pdf A common complaint with Spark in a multi-tenant environment is that applications have a fixed allocation that doesn't grow and shrink with their resource needs. We're blocked on YARN-1197 for dynamically changing the resources within executors, but we can still allocate and discard whole executors. I think it would be useful to have some heuristics that * Request more executors when many pending tasks are building up * Request more executors when RDDs can't fit in memory * Discard executors when few tasks are running / pending and there's not much in memory Bonus points: migrate blocks from executors we're about to discard to executors with free space. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-3460) Graceful decommission of idle YARN sessions
Patrick Wendell created SPARK-3460: -- Summary: Graceful decommission of idle YARN sessions Key: SPARK-3460 URL: https://issues.apache.org/jira/browse/SPARK-3460 Project: Spark Issue Type: Sub-task Components: YARN Reporter: Patrick Wendell Assignee: Andrew Or This is a simpler case of the more general ideas discussed in SPARK-3174. If we have a YARN session that is no longer submitting tasks and has no in-scope shuffle data or cached blocks, then we should scale down the cluster and give up containers. This general behavior could be enabled/disabled with a config setting. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-3174) Under YARN, add and remove executors based on load
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127232#comment-14127232 ] Patrick Wendell edited comment on SPARK-3174 at 9/9/14 5:08 PM: Yeah so how about we create a sub-task that covers only graceful decommission. IMO that's a much simpler feature to implement. [~tgraves] is this an issue you've run into at Yahoo (people leaving clusters up that are no longer using any resources?). was (Author: pwendell): Yeah so how about we create a sub-task that covers only graceful decommission. IMO that's a much simpler feature to implement. Under YARN, add and remove executors based on load -- Key: SPARK-3174 URL: https://issues.apache.org/jira/browse/SPARK-3174 Project: Spark Issue Type: Improvement Components: YARN Affects Versions: 1.0.2 Reporter: Sandy Ryza Assignee: Andrew Or Attachments: SPARK-3174design.pdf A common complaint with Spark in a multi-tenant environment is that applications have a fixed allocation that doesn't grow and shrink with their resource needs. We're blocked on YARN-1197 for dynamically changing the resources within executors, but we can still allocate and discard whole executors. I think it would be useful to have some heuristics that * Request more executors when many pending tasks are building up * Request more executors when RDDs can't fit in memory * Discard executors when few tasks are running / pending and there's not much in memory Bonus points: migrate blocks from executors we're about to discard to executors with free space. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3438) Support for accessing secured HDFS in Standalone Mode
[ https://issues.apache.org/jira/browse/SPARK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3438: --- Component/s: Deploy Support for accessing secured HDFS in Standalone Mode - Key: SPARK-3438 URL: https://issues.apache.org/jira/browse/SPARK-3438 Project: Spark Issue Type: New Feature Components: Deploy, Spark Core Affects Versions: 1.0.2 Reporter: Zhanfeng Huo Reading data from secure HDFS into Spark is a useful feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3438) Support for accessing secured HDFS in Standalone Mode
[ https://issues.apache.org/jira/browse/SPARK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3438: --- Summary: Support for accessing secured HDFS in Standalone Mode (was: Adding support for accessing secured HDFS) Support for accessing secured HDFS in Standalone Mode - Key: SPARK-3438 URL: https://issues.apache.org/jira/browse/SPARK-3438 Project: Spark Issue Type: New Feature Components: Deploy, Spark Core Affects Versions: 1.0.2 Reporter: Zhanfeng Huo Reading data from secure HDFS into Spark is a useful feature. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3438) Support for accessing secured HDFS in Standalone Mode
[ https://issues.apache.org/jira/browse/SPARK-3438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3438: --- Description: Secured HDFS is supported in YARN currently, but not in standalone mode. The tricky bit is how to disseminate the delegation tokens securely in standalone mode. (was: Reading data from secure HDFS into spark is a usefull feature. ) Support for accessing secured HDFS in Standalone Mode - Key: SPARK-3438 URL: https://issues.apache.org/jira/browse/SPARK-3438 Project: Spark Issue Type: New Feature Components: Deploy, Spark Core Affects Versions: 1.0.2 Reporter: Zhanfeng Huo Secured HDFS is supported in YARN currently, but not in standalone mode. The tricky bit is how to disseminate the delegation tokens securely in standalone mode. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2182) Scalastyle rule blocking unicode operators
[ https://issues.apache.org/jira/browse/SPARK-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-2182: --- Assignee: Prashant Sharma Scalastyle rule blocking unicode operators -- Key: SPARK-2182 URL: https://issues.apache.org/jira/browse/SPARK-2182 Project: Spark Issue Type: Bug Components: Build Reporter: Andrew Ash Assignee: Prashant Sharma Attachments: Screen Shot 2014-06-18 at 3.28.44 PM.png Some IDEs don't support Scala's [unicode operators|http://www.scala-lang.org/old/node/4723] so we should consider adding a scalastyle rule to block them for wider compatibility among contributors. See this PR for a place we reverted a unicode operator: https://github.com/apache/spark/pull/1119 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2182) Scalastyle rule blocking unicode operators
[ https://issues.apache.org/jira/browse/SPARK-2182?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127248#comment-14127248 ] Patrick Wendell commented on SPARK-2182: [~prashant_] as our resident expert on scalastyle... is this possible? Scalastyle rule blocking unicode operators -- Key: SPARK-2182 URL: https://issues.apache.org/jira/browse/SPARK-2182 Project: Spark Issue Type: Bug Components: Build Reporter: Andrew Ash Assignee: Prashant Sharma Attachments: Screen Shot 2014-06-18 at 3.28.44 PM.png Some IDEs don't support Scala's [unicode operators|http://www.scala-lang.org/old/node/4723] so we should consider adding a scalastyle rule to block them for wider compatibility among contributors. See this PR for a place we reverted a unicode operator: https://github.com/apache/spark/pull/1119 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3426) Sort-based shuffle compression behavior is inconsistent
[ https://issues.apache.org/jira/browse/SPARK-3426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3426: --- Priority: Blocker (was: Critical) Sort-based shuffle compression behavior is inconsistent --- Key: SPARK-3426 URL: https://issues.apache.org/jira/browse/SPARK-3426 Project: Spark Issue Type: Bug Affects Versions: 1.1.0 Reporter: Andrew Or Assignee: Andrew Or Priority: Blocker We have the following configs: {code} spark.shuffle.compress spark.shuffle.spill.compress {code} When these two diverge, sort-based shuffle fails with a compression exception under certain workloads. This is because in sort-based shuffle we serve the index file (using spark.shuffle.spill.compress) as a normal shuffle file (using spark.shuffle.compress). It was unfortunate in retrospect that these two configs were exposed so we can't easily remove them. Here is how this can be reproduced. Set the following in your spark-defaults.conf: {code} spark.master local-cluster[1,1,512] spark.shuffle.spill.compress false spark.shuffle.compress true spark.shuffle.manager sort spark.shuffle.memoryFraction 0.001 {code} Then run the following in spark-shell: {code} sc.parallelize(0 until 10).map(i => (i/4, i)).groupByKey().collect() {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-3461) Support external group-by
Patrick Wendell created SPARK-3461: -- Summary: Support external group-by Key: SPARK-3461 URL: https://issues.apache.org/jira/browse/SPARK-3461 Project: Spark Issue Type: Bug Reporter: Patrick Wendell Assignee: Patrick Wendell Given that we have SPARK-2978, it seems like we could support an external group by operator pretty easily. We'd just have to wrap the existing iterator exposed by SPARK-2978 with a lookahead iterator that detects the group boundaries. Also, we'd have to override the cache() operator to cache the parent RDD so that if this object is cached it doesn't wind through the iterator. I haven't totally followed all the sort-shuffle internals, but just given the stated semantics of SPARK-2978 it seems like this would be possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
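The lookahead wrapper described above, which emits a group each time the key changes in a key-sorted stream, can be sketched in plain Python (only the boundary detection, none of Spark's sort-shuffle machinery):

```python
def grouped(sorted_pairs):
    """Yield (key, [values]) from an iterator of (key, value) pairs that is
    already sorted by key, detecting group boundaries by lookahead."""
    it = iter(sorted_pairs)
    try:
        key, value = next(it)
    except StopIteration:
        return
    current, bucket = key, [value]
    for key, value in it:
        if key != current:        # boundary: the key just changed
            yield current, bucket
            current, bucket = key, [value]
        else:
            bucket.append(value)
    yield current, bucket


pairs = [(0, "a"), (0, "b"), (1, "c"), (2, "d"), (2, "e")]
groups = list(grouped(pairs))
```

Only one group's values are buffered at a time; the rest of the stream stays behind the iterator, which is what makes wrapping SPARK-2978's externally-sorted iterator attractive here.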
[jira] [Updated] (SPARK-3461) Support external groupBy
[ https://issues.apache.org/jira/browse/SPARK-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3461: --- Summary: Support external groupBy (was: Support external group-by) Support external groupBy Key: SPARK-3461 URL: https://issues.apache.org/jira/browse/SPARK-3461 Project: Spark Issue Type: Bug Reporter: Patrick Wendell Assignee: Patrick Wendell Given that we have SPARK-2978, it seems like we could support an external group by operator pretty easily. We'd just have to wrap the existing iterator exposed by SPARK-2978 with a lookahead iterator that detects the group boundaries. Also, we'd have to override the cache() operator to cache the parent RDD so that if this object is cached it doesn't wind through the iterator. I haven't totally followed all the sort-shuffle internals, but just given the stated semantics of SPARK-2978 it seems like this would be possible. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3461) Support external groupBy
[ https://issues.apache.org/jira/browse/SPARK-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3461: --- Description: Given that we have SPARK-2978, it seems like we could support an external group by operator pretty easily. We'd just have to wrap the existing iterator exposed by SPARK-2978 with a lookahead iterator that detects the group boundaries. Also, we'd have to override the cache() operator to cache the parent RDD so that if this object is cached it doesn't wind through the iterator. I haven't totally followed all the sort-shuffle internals, but just given the stated semantics of SPARK-2978 it seems like this would be possible. It would be really nice to externalize this because many beginner users write jobs in terms of groupBy. was: Given that we have SPARK-2978, it seems like we could support an external group by operator pretty easily. We'd just have to wrap the existing iterator exposed by SPARK-2978 with a lookahead iterator that detects the group boundaries. Also, we'd have to override the cache() operator to cache the parent RDD so that if this object is cached it doesn't wind through the iterator. I haven't totally followed all the sort-shuffle internals, but just given the stated semantics of SPARK-2978 it seems like this would be possible. Support external groupBy Key: SPARK-3461 URL: https://issues.apache.org/jira/browse/SPARK-3461 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Patrick Wendell Assignee: Patrick Wendell Given that we have SPARK-2978, it seems like we could support an external group by operator pretty easily. We'd just have to wrap the existing iterator exposed by SPARK-2978 with a lookahead iterator that detects the group boundaries. Also, we'd have to override the cache() operator to cache the parent RDD so that if this object is cached it doesn't wind through the iterator. 
I haven't totally followed all the sort-shuffle internals, but just given the stated semantics of SPARK-2978 it seems like this would be possible. It would be really nice to externalize this because many beginner users write jobs in terms of groupBy. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3461) Support external groupBy
[ https://issues.apache.org/jira/browse/SPARK-3461?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-3461: --- Component/s: Spark Core Support external groupBy Key: SPARK-3461 URL: https://issues.apache.org/jira/browse/SPARK-3461 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Patrick Wendell Assignee: Patrick Wendell Given that we have SPARK-2978, it seems like we could support an external group by operator pretty easily. We'd just have to wrap the existing iterator exposed by SPARK-2978 with a lookahead iterator that detects the group boundaries. Also, we'd have to override the cache() operator to cache the parent RDD so that if this object is cached it doesn't wind through the iterator. I haven't totally followed all the sort-shuffle internals, but just given the stated semantics of SPARK-2978 it seems like this would be possible. It would be really nice to externalize this because many beginner users write jobs in terms of groupBy. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-3459) MulticlassMetrics is not serializable
[ https://issues.apache.org/jira/browse/SPARK-3459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng closed SPARK-3459. Resolution: Cannot Reproduce MulticlassMetrics is not serializable - Key: SPARK-3459 URL: https://issues.apache.org/jira/browse/SPARK-3459 Project: Spark Issue Type: Bug Components: MLlib Reporter: Xiangrui Meng Some task closures contain member variables and hence hold a reference to the enclosing object, which causes a task-not-serializable exception on a real cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3445) Deprecate and later remove YARN alpha support
[ https://issues.apache.org/jira/browse/SPARK-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127301#comment-14127301 ] Reynold Xin commented on SPARK-3445: [~tgraves] when is Yahoo moving? (or was that already completed?) Deprecate and later remove YARN alpha support - Key: SPARK-3445 URL: https://issues.apache.org/jira/browse/SPARK-3445 Project: Spark Issue Type: Improvement Components: YARN Reporter: Patrick Wendell This will depend a bit on both user demand and the commitment level of maintainers, but I'd like to propose the following timeline for yarn-alpha support. Spark 1.2: Deprecate YARN-alpha Spark 1.3: Remove YARN-alpha (i.e. require YARN-stable) Since YARN-alpha is clearly identified as an alpha API, it seems reasonable to drop support for it in a minor release. However, it does depend a bit whether anyone uses this outside of Yahoo!, and that I'm not sure of. In the past this API has been used and maintained by Yahoo, but they'll be migrating soon to the stable API's. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-3462) parquet pushdown for unionAll
Cody Koeninger created SPARK-3462: - Summary: parquet pushdown for unionAll Key: SPARK-3462 URL: https://issues.apache.org/jira/browse/SPARK-3462 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.0 Reporter: Cody Koeninger http://apache-spark-developers-list.1001551.n3.nabble.com/parquet-predicate-projection-pushdown-into-unionAll-td8339.html // single table, pushdown scala p.where('age 40).select('name) res36: org.apache.spark.sql.SchemaRDD = SchemaRDD[97] at RDD at SchemaRDD.scala:103 == Query Plan == == Physical Plan == Project [name#3] ParquetTableScan [name#3,age#4], (ParquetRelation /var/tmp/people, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml), org.apache.spark.sql.SQLContext@6d7e79f6, []), [(age#4 40)] // union of 2 tables, no pushdown scala b.where('age 40).select('name) res37: org.apache.spark.sql.SchemaRDD = SchemaRDD[99] at RDD at SchemaRDD.scala:103 == Query Plan == == Physical Plan == Project [name#3] Filter (age#4 40) Union [ParquetTableScan [name#3,age#4,phones#5], (ParquetRelation /var/tmp/people, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml), org.apache.spark.sql.SQLContext@6d7e79f6, []), [] ,ParquetTableScan [name#0,age#1,phones#2], (ParquetRelation /var/tmp/people2, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml), org.apache.spark.sql.SQLContext@6d7e79f6, []), [] ] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
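The plans above show the filter staying above the Union in the second query instead of being pushed into each ParquetTableScan. The rewrite the optimizer would need — pushing a predicate below a union — is legal because the two plan shapes are equivalent. A minimal, Spark-free sketch of that equivalence in plain Python, with lists of dicts standing in for the Parquet tables (all names here are illustrative, not Spark APIs):

```python
def union_then_filter(tables, pred):
    # the un-pushed plan from the report: union everything, filter afterwards
    rows = [r for t in tables for r in t]
    return [r for r in rows if pred(r)]

def filter_then_union(tables, pred):
    # the pushed-down plan: each "scan" applies the predicate itself,
    # so only matching rows ever cross the union
    rows = []
    for t in tables:
        rows.extend(r for r in t if pred(r))
    return rows

people = [{"name": "ann", "age": 51}, {"name": "bob", "age": 30}]
people2 = [{"name": "cat", "age": 45}]
pred = lambda r: r["age"] > 40

# Equivalence is what makes the pushdown a pure optimization.
assert union_then_filter([people, people2], pred) == \
       filter_then_union([people, people2], pred)
```

The pushed form matters for Parquet because the predicate can then be evaluated inside each scan, skipping row groups instead of materializing every row first.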
[jira] [Updated] (SPARK-2425) Standalone Master is too aggressive in removing Applications
[ https://issues.apache.org/jira/browse/SPARK-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-2425: - Fix Version/s: (was: 1.1.1) Standalone Master is too aggressive in removing Applications Key: SPARK-2425 URL: https://issues.apache.org/jira/browse/SPARK-2425 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Mark Hamstra Assignee: Mark Hamstra Priority: Critical Fix For: 1.2.0 When standalone Executors trying to run a particular Application fail a cumulative ApplicationState.MAX_NUM_RETRY times, Master will remove the Application. This will be true even if there actually are a number of Executors that are successfully running the Application. This makes long-running standalone-mode Applications in particular unnecessarily vulnerable to limited failures in the cluster -- e.g., a single bad node on which Executors repeatedly fail for any reason can prevent an Application from starting or can result in a running Application being removed even though it could continue to run successfully (just not making use of all potential Workers and Executors.) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3445) Deprecate and later remove YARN alpha support
[ https://issues.apache.org/jira/browse/SPARK-3445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127318#comment-14127318 ] Thomas Graves commented on SPARK-3445: -- We are in the process of moving, and the timeline of the proposal fits with the rest of our plans. Deprecate and later remove YARN alpha support - Key: SPARK-3445 URL: https://issues.apache.org/jira/browse/SPARK-3445 Project: Spark Issue Type: Improvement Components: YARN Reporter: Patrick Wendell This will depend a bit on both user demand and the commitment level of maintainers, but I'd like to propose the following timeline for yarn-alpha support. Spark 1.2: Deprecate YARN-alpha Spark 1.3: Remove YARN-alpha (i.e. require YARN-stable) Since YARN-alpha is clearly identified as an alpha API, it seems reasonable to drop support for it in a minor release. However, it does depend a bit whether anyone uses this outside of Yahoo!, and that I'm not sure of. In the past this API has been used and maintained by Yahoo, but they'll be migrating soon to the stable API's. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-2425) Standalone Master is too aggressive in removing Applications
[ https://issues.apache.org/jira/browse/SPARK-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or reopened SPARK-2425: -- Standalone Master is too aggressive in removing Applications Key: SPARK-2425 URL: https://issues.apache.org/jira/browse/SPARK-2425 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Mark Hamstra Assignee: Mark Hamstra Priority: Critical Fix For: 1.2.0 When standalone Executors trying to run a particular Application fail a cumulative ApplicationState.MAX_NUM_RETRY times, Master will remove the Application. This will be true even if there actually are a number of Executors that are successfully running the Application. This makes long-running standalone-mode Applications in particular unnecessarily vulnerable to limited failures in the cluster -- e.g., a single bad node on which Executors repeatedly fail for any reason can prevent an Application from starting or can result in a running Application being removed even though it could continue to run successfully (just not making use of all potential Workers and Executors.) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-2425) Standalone Master is too aggressive in removing Applications
[ https://issues.apache.org/jira/browse/SPARK-2425?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-2425. Resolution: Fixed Standalone Master is too aggressive in removing Applications Key: SPARK-2425 URL: https://issues.apache.org/jira/browse/SPARK-2425 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Reporter: Mark Hamstra Assignee: Mark Hamstra Priority: Critical Fix For: 1.2.0 When standalone Executors trying to run a particular Application fail a cumulative ApplicationState.MAX_NUM_RETRY times, Master will remove the Application. This will be true even if there actually are a number of Executors that are successfully running the Application. This makes long-running standalone-mode Applications in particular unnecessarily vulnerable to limited failures in the cluster -- e.g., a single bad node on which Executors repeatedly fail for any reason can prevent an Application from starting or can result in a running Application being removed even though it could continue to run successfully (just not making use of all potential Workers and Executors.) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
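The policy change SPARK-2425 argues for — don't remove an Application on a lifetime cumulative failure count while healthy Executors are still running it — can be sketched as a small state machine. This is a hedged illustration of one possible policy, not the actual Master code; the class and method names are invented:

```python
MAX_NUM_RETRY = 10  # stand-in for ApplicationState.MAX_NUM_RETRY

class AppState:
    """Toy model: remove an app only when retries are exhausted AND
    nothing is running, and reset the failure window on any success."""
    def __init__(self):
        self.consecutive_failures = 0
        self.running_executors = set()
        self.removed = False

    def executor_started(self, exec_id):
        self.running_executors.add(exec_id)
        self.consecutive_failures = 0  # progress was made: reset the window

    def executor_failed(self, exec_id):
        self.running_executors.discard(exec_id)
        self.consecutive_failures += 1
        if not self.running_executors and \
           self.consecutive_failures >= MAX_NUM_RETRY:
            self.removed = True

app = AppState()
app.executor_started("healthy-node-exec")
for i in range(20):                      # one bad node failing repeatedly
    app.executor_failed(f"bad-node-exec-{i}")
# A healthy executor is still running, so the app survives the bad node.
assert not app.removed
```

Under the cumulative policy described in the report, the same 20 failures would have removed the application despite the healthy executor.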
[jira] [Updated] (SPARK-3464) Graceful decommission of executors
[ https://issues.apache.org/jira/browse/SPARK-3464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sandy Ryza updated SPARK-3464: -- Description: In most cases, even when an application is utilizing only a small fraction of its available resources, executors will still have tasks running or blocks cached. It would be useful to have a mechanism for waiting for running tasks on an executor to finish and migrating its cached blocks elsewhere before discarding it. Graceful decommission of executors -- Key: SPARK-3464 URL: https://issues.apache.org/jira/browse/SPARK-3464 Project: Spark Issue Type: Sub-task Components: YARN Reporter: Sandy Ryza In most cases, even when an application is utilizing only a small fraction of its available resources, executors will still have tasks running or blocks cached. It would be useful to have a mechanism for waiting for running tasks on an executor to finish and migrating its cached blocks elsewhere before discarding it. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3174) Under YARN, add and remove executors based on load
[ https://issues.apache.org/jira/browse/SPARK-3174?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127446#comment-14127446 ] Thomas Graves commented on SPARK-3174: -- {quote} Yeah so how about we create a sub-task that covers only graceful decommission. IMO that's a much simpler feature to implement. Thomas Graves is this an issue you've run into at Yahoo (people leaving clusters up that are no longer using any resources?). {quote} I haven't seen a lot of it at this point. Generally when I do, it's someone who used spark-shell or pyspark and left it up. I haven't analyzed many customer jobs deeply enough, though, to know whether half the time they were wasting the resources. I can definitely see its usefulness. Under YARN, add and remove executors based on load -- Key: SPARK-3174 URL: https://issues.apache.org/jira/browse/SPARK-3174 Project: Spark Issue Type: Improvement Components: YARN Affects Versions: 1.0.2 Reporter: Sandy Ryza Assignee: Andrew Or Attachments: SPARK-3174design.pdf A common complaint with Spark in a multi-tenant environment is that applications have a fixed allocation that doesn't grow and shrink with their resource needs. We're blocked on YARN-1197 for dynamically changing the resources within executors, but we can still allocate and discard whole executors. I think it would be useful to have some heuristics that * Request more executors when many pending tasks are building up * Request more executors when RDDs can't fit in memory * Discard executors when few tasks are running / pending and there's not much in memory Bonus points: migrate blocks from executors we're about to discard to executors with free space. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
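The heuristics listed in SPARK-3174 boil down to a decision function over current load. The thresholds and names below are invented for illustration (the real design lives in the attached PDF); this is only a sketch of the shape such a policy could take:

```python
def executor_delta(pending_tasks, running_tasks, executors,
                   tasks_per_executor=4):
    """Return how many executors to request (+) or release (-)."""
    # Scale up: enough backlog that extra executors would stay busy.
    if pending_tasks > executors * tasks_per_executor:
        needed = -(-pending_tasks // tasks_per_executor)  # ceiling division
        return needed - executors
    # Scale down: no backlog and executors outnumber running tasks;
    # keep a floor of one executor so the app can make progress.
    if pending_tasks == 0 and running_tasks < executors:
        return -(executors - max(running_tasks, 1))
    return 0

assert executor_delta(pending_tasks=40, running_tasks=8, executors=2) > 0
assert executor_delta(pending_tasks=0, running_tasks=1, executors=8) < 0
assert executor_delta(pending_tasks=5, running_tasks=5, executors=2) == 0
```

The "RDDs can't fit in memory" and block-migration heuristics are omitted here; they need storage-level signals that a pure task-count function cannot see.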
[jira] [Commented] (SPARK-1985) SPARK_HOME shouldn't be required when spark.executor.uri is provided
[ https://issues.apache.org/jira/browse/SPARK-1985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127455#comment-14127455 ] Chip Senkbeil commented on SPARK-1985: -- Does anyone know what the status of this is? SPARK_HOME shouldn't be required when spark.executor.uri is provided Key: SPARK-1985 URL: https://issues.apache.org/jira/browse/SPARK-1985 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.0 Environment: MESOS Reporter: Gerard Maas Labels: mesos When trying to run that simple example on a Mesos installation, I get an error that SPARK_HOME is not set. A local spark installation should not be required to run a job on Mesos. All that's needed is the executor package, being the assembly.tar.gz on a reachable location (HDFS/S3/HTTP). I went looking into the code and indeed there's a check on SPARK_HOME [2] regardless of the presence of the assembly but it's actually only used if the assembly is not provided (which is a kind-of best-effort recovery strategy). Current flow: if (!SPARK_HOME) fail(No SPARK_HOME) else if (assembly) { use assembly) } else { try use SPARK_HOME to build spark_executor } Should be: sparkExecutor = if (assembly) {assembly} else if (SPARK_HOME) {try use SPARK_HOME to build spark_executor} else { fail(No executor found. Please provide spark.executor.uri (preferred) or spark.home) [1] http://apache-spark-user-list.1001560.n3.nabble.com/ClassNotFoundException-with-Spark-Mesos-spark-shell-works-fine-td6165.html [2] https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L89 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
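The corrected flow from the report — prefer spark.executor.uri, fall back to SPARK_HOME, and fail only when neither is available — can be written down directly. A hedged sketch in plain Python (a hypothetical helper, not the actual MesosSchedulerBackend code):

```python
def resolve_executor(executor_uri=None, spark_home=None):
    """Pick how Mesos slaves obtain the Spark executor."""
    # Preferred: a pre-built assembly reachable by every slave (HDFS/S3/HTTP).
    if executor_uri:
        return ("assembly", executor_uri)
    # Best-effort fallback: build the executor command from a local install.
    if spark_home:
        return ("spark_home", spark_home)
    raise ValueError("No executor found. Please provide spark.executor.uri "
                     "(preferred) or spark.home")

assert resolve_executor(executor_uri="hdfs:///spark-assembly.tar.gz")[0] == "assembly"
assert resolve_executor(spark_home="/opt/spark")[0] == "spark_home"
```

The key inversion versus the buggy flow is that the SPARK_HOME check no longer guards the assembly path: it is consulted only after the assembly option is exhausted.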
[jira] [Commented] (SPARK-3450) Enable specifying the --jars CLI option multiple times
[ https://issues.apache.org/jira/browse/SPARK-3450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127533#comment-14127533 ] Marcelo Vanzin commented on SPARK-3450: --- My only concern is that adding this would probably break things for those relying on the current behavior for whatever reason. I don't expect many of those to exist, but you never know... Enable specifying the --jars CLI option multiple times --- Key: SPARK-3450 URL: https://issues.apache.org/jira/browse/SPARK-3450 Project: Spark Issue Type: Improvement Components: Deploy Affects Versions: 1.0.2 Reporter: wolfgang hoschek spark-submit should support specifying the --jars option multiple times, e.g. --jars foo.jar,bar.jar --jars baz.jar,oops.jar should be equivalent to --jars foo.jar,bar.jar,baz.jar,oops.jar This would allow using wrapper scripts that simplify usage for enterprise customers along the following lines: {code} my-spark-submit.sh: jars= for i in /opt/myapp/*.jar; do if [ -n "$jars" ]; then jars=$jars,; fi; jars=$jars$i; done; spark-submit --jars "$jars" "$@" {code} Example usage: {code} my-spark-submit.sh --jars myUserDefinedFunction.jar {code} The relevant enhancement code might go into SparkSubmitArguments. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
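On the Spark side, supporting the option multiple times essentially means accumulating values instead of letting the last occurrence overwrite the earlier ones. A minimal sketch of that merge in plain Python (illustrative only, not the actual SparkSubmitArguments code):

```python
def merge_jars(option_values):
    """Combine repeated --jars values, preserving order, dropping duplicates."""
    seen, merged = set(), []
    for value in option_values:            # one entry per --jars occurrence
        for jar in value.split(","):
            if jar and jar not in seen:
                seen.add(jar)
                merged.append(jar)
    return ",".join(merged)

assert merge_jars(["foo.jar,bar.jar", "baz.jar,oops.jar"]) == \
       "foo.jar,bar.jar,baz.jar,oops.jar"
```

Deduplication is a design choice here, not part of the request; the compatibility concern Marcelo raises is exactly that any caller relying on "last --jars wins" would see different behavior after this change.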
[jira] [Commented] (SPARK-3215) Add remote interface for SparkContext
[ https://issues.apache.org/jira/browse/SPARK-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127564#comment-14127564 ] Marcelo Vanzin commented on SPARK-3215: --- For those following, I moved the prototype to this location: https://github.com/vanzin/spark-client This is so the Hive-on-Spark project can start playing with it while we work on all the details. Add remote interface for SparkContext - Key: SPARK-3215 URL: https://issues.apache.org/jira/browse/SPARK-3215 Project: Spark Issue Type: New Feature Components: Spark Core Reporter: Marcelo Vanzin Labels: hive Attachments: RemoteSparkContext.pdf A quick description of the issue: as part of running Hive jobs on top of Spark, it's desirable to have a SparkContext that is running in the background and listening for job requests for a particular user session. Running multiple contexts in the same JVM is not a very good solution. Not only does SparkContext currently have issues sharing the same JVM among multiple instances, but doing so turns the JVM running the contexts into a huge bottleneck in the system. So I'm proposing a solution where we have a SparkContext that is running in a separate process, and listening for requests from the client application via some RPC interface (most probably Akka). I'll attach a document shortly with the current proposal. Let's use this bug to discuss the proposal and any other suggestions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-3404) SparkSubmitSuite fails with spark-submit exits with code 1
[ https://issues.apache.org/jira/browse/SPARK-3404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3404. -- Resolution: Fixed Fix Version/s: 1.2.0 1.1.1 Tests are now failing due to HiveQL test problems, but you can see they have passed SparkSubmitSuite: https://amplab.cs.berkeley.edu/jenkins/view/Spark/ I think this one's resolved now. SparkSubmitSuite fails with spark-submit exits with code 1 Key: SPARK-3404 URL: https://issues.apache.org/jira/browse/SPARK-3404 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.0.2, 1.1.0 Reporter: Sean Owen Priority: Critical Fix For: 1.1.1, 1.2.0 Maven-based Jenkins builds have been failing for over a month. For example: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-pre-YARN/ It's SparkSubmitSuite that fails. For example: https://amplab.cs.berkeley.edu/jenkins/job/Spark-Master-Maven-pre-YARN/541/hadoop.version=2.0.0-mr1-cdh4.1.2,label=centos/consoleFull {code} SparkSubmitSuite ... 
- launch simple application with spark-submit *** FAILED *** org.apache.spark.SparkException: Process List(./bin/spark-submit, --class, org.apache.spark.deploy.SimpleApplicationTest, --name, testApp, --master, local, file:/tmp/1409815981504-0/testJar-1409815981505.jar) exited with code 1 at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:837) at org.apache.spark.deploy.SparkSubmitSuite.runSparkSubmit(SparkSubmitSuite.scala:311) at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$14.apply$mcV$sp(SparkSubmitSuite.scala:291) at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$14.apply(SparkSubmitSuite.scala:284) at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$14.apply(SparkSubmitSuite.scala:284) at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22) at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) ... 
- spark submit includes jars passed in through --jar *** FAILED *** org.apache.spark.SparkException: Process List(./bin/spark-submit, --class, org.apache.spark.deploy.JarCreationTest, --name, testApp, --master, local-cluster[2,1,512], --jars, file:/tmp/1409815984960-0/testJar-1409815985029.jar,file:/tmp/1409815985030-0/testJar-1409815985087.jar, file:/tmp/1409815984959-0/testJar-1409815984959.jar) exited with code 1 at org.apache.spark.util.Utils$.executeAndGetOutput(Utils.scala:837) at org.apache.spark.deploy.SparkSubmitSuite.runSparkSubmit(SparkSubmitSuite.scala:311) at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$15.apply$mcV$sp(SparkSubmitSuite.scala:305) at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$15.apply(SparkSubmitSuite.scala:294) at org.apache.spark.deploy.SparkSubmitSuite$$anonfun$15.apply(SparkSubmitSuite.scala:294) at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22) at org.scalatest.Transformer$$anonfun$apply$1.apply(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) ... {code} SBT builds don't fail, so it is likely to be due to some difference in how the tests are run rather than a problem with test or core project. This is related to http://issues.apache.org/jira/browse/SPARK-3330 but the cause identified in that JIRA is, at least, not the only cause. (Although, it wouldn't hurt to be doubly-sure this is not an issue by changing the Jenkins config to invoke {{mvn clean mvn ... package}} {{mvn ... clean package}}.) This JIRA tracks investigation into a different cause. Right now I have some further information but not a PR yet. Part of the issue is that there is no clue in the log about why {{spark-submit}} exited with status 1. 
See https://github.com/apache/spark/pull/2108/files and https://issues.apache.org/jira/browse/SPARK-3193 for a change that would at least print stdout to the log too. The SparkSubmit program exits with 1 when the main class it is supposed to run is not found (https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L322) This is for example SimpleApplicationTest (https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/deploy/SparkSubmitSuite.scala#L339) The test actually submits an empty JAR not containing this class. It relies on {{spark-submit}} finding the class within the compiled test-classes of the
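The diagnostic gap described above — the child exits with status 1 and nothing in the log says why — is the classic argument for capturing a subprocess's output and attaching it to the failure, which is what the linked change does for spark-submit's stdout. A generic sketch of the pattern (plain Python, not Spark's Utils.executeAndGetOutput):

```python
import subprocess
import sys

def run_and_capture(cmd):
    """Run cmd; on nonzero exit, raise with the child's output attached."""
    proc = subprocess.run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError(
            f"{cmd[0]} exited with code {proc.returncode}\n"
            f"stdout:\n{proc.stdout}stderr:\n{proc.stderr}")
    return proc.stdout

try:
    run_and_capture([sys.executable, "-c",
                     "import sys; print('class not found'); sys.exit(1)"])
except RuntimeError as e:
    # The reason for the failure is now visible in the exception message.
    assert "class not found" in str(e)
```

Without the captured output, the caller sees only "exited with code 1", which is exactly the situation the SparkSubmitSuite investigation was stuck in.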
[jira] [Created] (SPARK-3465) Task metrics are not aggregated correctly in local mode
Davies Liu created SPARK-3465: - Summary: Task metrics are not aggregated correctly in local mode Key: SPARK-3465 URL: https://issues.apache.org/jira/browse/SPARK-3465 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Priority: Blocker In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object as the one in TaskContext (because there is no serialization for MetricsUpdate in local mode), so all subsequent changes to the metrics will be lost, because updateAggregateMetrics() only counts the difference between the two. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3465) Task metrics are not aggregated correctly in local mode
[ https://issues.apache.org/jira/browse/SPARK-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-3465: -- Description: In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099. was: In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in #2099. Task metrics are not aggregated correctly in local mode --- Key: SPARK-3465 URL: https://issues.apache.org/jira/browse/SPARK-3465 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3465) Task metrics are not aggregated correctly in local mode
[ https://issues.apache.org/jira/browse/SPARK-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-3465: -- Description: In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in #2099. was:In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. Task metrics are not aggregated correctly in local mode --- Key: SPARK-3465 URL: https://issues.apache.org/jira/browse/SPARK-3465 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in #2099. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3465) Task metrics are not aggregated correctly in local mode
[ https://issues.apache.org/jira/browse/SPARK-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-3465: -- Description: In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099, cc @sandy rayza was: In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099. Task metrics are not aggregated correctly in local mode --- Key: SPARK-3465 URL: https://issues.apache.org/jira/browse/SPARK-3465 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099, cc @sandy rayza -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3465) Task metrics are not aggregated correctly in local mode
[ https://issues.apache.org/jira/browse/SPARK-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davies Liu updated SPARK-3465: -- Description: In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099, cc [~sandyr]] was: In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099, cc @sandy rayza Task metrics are not aggregated correctly in local mode --- Key: SPARK-3465 URL: https://issues.apache.org/jira/browse/SPARK-3465 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object with that in TaskContext (because there is no serialization for MetricsUpdate in local mode), then all the upcoming changes in metrics will be lost, because updateAggregateMetrics() only counts the difference in these two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099, cc [~sandyr]] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
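The aliasing bug described in SPARK-3465 is easy to reproduce outside Spark: if the stored snapshot is the same live object the task keeps mutating (no serialization in local mode), every delta computed against it is zero. A minimal Python model of the delta-based aggregation (class and function names invented for illustration):

```python
import copy

class TaskMetrics:
    def __init__(self):
        self.bytes_read = 0

def update_aggregate(total, stored, incoming):
    """Count only the difference between incoming and the stored snapshot."""
    prev = stored.bytes_read if stored is not None else 0
    return total + (incoming.bytes_read - prev)

# Buggy local-mode flow: the listener stores the SAME object the task mutates.
live = TaskMetrics()
live.bytes_read = 100
total = update_aggregate(0, None, live)          # +100
stored = live                                    # aliased: no serialization
live.bytes_read = 250
total = update_aggregate(total, stored, live)    # delta = 250 - 250 = 0
assert total == 100                              # 150 bytes of progress lost

# Fixed flow: snapshot the metrics, as serialization does on a real cluster.
live = TaskMetrics()
live.bytes_read = 100
total = update_aggregate(0, None, live)
stored = copy.deepcopy(live)                     # break the alias
live.bytes_read = 250
total = update_aggregate(total, stored, live)    # delta = 150
assert total == 250
```

The deep copy plays the role that serializing the MetricsUpdate event plays on a cluster: it freezes the snapshot so later task-side mutations show up as a nonzero difference.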
[jira] [Updated] (SPARK-3409) Avoid pulling in Exchange operator itself in Exchange's closures
[ https://issues.apache.org/jira/browse/SPARK-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3409: - Fix Version/s: 1.1.1 Avoid pulling in Exchange operator itself in Exchange's closures Key: SPARK-3409 URL: https://issues.apache.org/jira/browse/SPARK-3409 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.0 Reporter: Reynold Xin Assignee: Reynold Xin Fix For: 1.1.1, 1.2.0 {code} val rdd = child.execute().mapPartitions { iter = if (sortBasedShuffleOn) { iter.map(r = (null, r.copy())) } else { val mutablePair = new MutablePair[Null, Row]() iter.map(r = mutablePair.update(null, r)) } } {code} The above snippet from Exchange references sortBasedShuffleOn within a closure, which requires pulling in the entire Exchange object in the closure. This is a tiny teeny optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3409) Avoid pulling in Exchange operator itself in Exchange's closures
[ https://issues.apache.org/jira/browse/SPARK-3409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3409: - Affects Version/s: 1.1.0 Avoid pulling in Exchange operator itself in Exchange's closures Key: SPARK-3409 URL: https://issues.apache.org/jira/browse/SPARK-3409 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.0 Reporter: Reynold Xin Assignee: Reynold Xin Fix For: 1.1.1, 1.2.0 {code} val rdd = child.execute().mapPartitions { iter = if (sortBasedShuffleOn) { iter.map(r = (null, r.copy())) } else { val mutablePair = new MutablePair[Null, Row]() iter.map(r = mutablePair.update(null, r)) } } {code} The above snippet from Exchange references sortBasedShuffleOn within a closure, which requires pulling in the entire Exchange object in the closure. This is a tiny teeny optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
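The capture problem the Exchange snippet fixes can be shown in Python too: referencing an instance attribute inside a lambda closes over the whole object, while hoisting the flag into a local first closes over just the boolean. This is a sketch of the optimization's mechanics, not the actual Exchange code:

```python
class Exchange:
    def __init__(self, sort_based_shuffle_on):
        self.sort_based_shuffle_on = sort_based_shuffle_on
        self.heavy_state = bytearray(1_000_000)  # stands in for operator state

    def map_func_bad(self):
        # 'self' is a free variable of the lambda, so shipping this
        # closure drags the entire operator (and heavy_state) with it.
        return lambda r: (None, r) if self.sort_based_shuffle_on else (r, None)

    def map_func_good(self):
        sort_based = self.sort_based_shuffle_on  # hoist into a local first
        return lambda r: (None, r) if sort_based else (r, None)

op = Exchange(sort_based_shuffle_on=True)
bad, good = op.map_func_bad(), op.map_func_good()
assert bad("row") == good("row") == (None, "row")

# Inspect what each closure actually holds on to:
bad_cells = [c.cell_contents for c in bad.__closure__]
good_cells = [c.cell_contents for c in good.__closure__]
assert any(isinstance(c, Exchange) for c in bad_cells)       # whole operator
assert all(not isinstance(c, Exchange) for c in good_cells)  # just the flag
```

In Spark the stakes are serialization cost and accidental non-serializable captures (compare SPARK-3459 above); the hoist-to-a-local pattern is the standard remedy in both languages.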
[jira] [Updated] (SPARK-3345) Do correct parameters for ShuffleFileGroup
[ https://issues.apache.org/jira/browse/SPARK-3345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3345: - Fix Version/s: 1.1.1 Do correct parameters for ShuffleFileGroup -- Key: SPARK-3345 URL: https://issues.apache.org/jira/browse/SPARK-3345 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Liang-Chi Hsieh Assignee: Liang-Chi Hsieh Priority: Minor Fix For: 1.1.1, 1.2.0 In the method newFileGroup of class FileShuffleBlockManager, the parameters for creating the new ShuffleFileGroup object are in the wrong order. Wrong: new ShuffleFileGroup(fileId, shuffleId, files) Correct: new ShuffleFileGroup(shuffleId, fileId, files) In the current code the parameters shuffleId and fileId are not used, so this does not cause a problem now, but it should be corrected for readability and to avoid future problems. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
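This class of bug — right names, wrong positional order, same types so nothing fails to compile — is why call sites with several same-typed parameters are safer with named arguments. A generic sketch (illustrative stand-in, not the Scala ShuffleFileGroup):

```python
from dataclasses import dataclass

@dataclass
class ShuffleFileGroup:
    shuffle_id: int
    file_id: int
    files: tuple

# A positional call silently accepts swapped arguments:
wrong = ShuffleFileGroup(7, 42, ("f1",))   # fileId, shuffleId order by mistake
assert wrong.shuffle_id == 7               # not what the caller meant

# Keyword arguments pin each value to its parameter at the call site:
right = ShuffleFileGroup(shuffle_id=42, file_id=7, files=("f1",))
assert right.shuffle_id == 42 and right.file_id == 7
```

Scala offers the same defense via named arguments (`ShuffleFileGroup(shuffleId = ..., fileId = ..., files = ...)`), which would have made the swap visible in review.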
[jira] [Updated] (SPARK-3061) Maven build fails in Windows OS
[ https://issues.apache.org/jira/browse/SPARK-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-3061: - Fix Version/s: 1.1.1 Maven build fails in Windows OS --- Key: SPARK-3061 URL: https://issues.apache.org/jira/browse/SPARK-3061 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.0.2, 1.1.0 Environment: Windows Reporter: Masayoshi TSUZUKI Assignee: Andrew Or Priority: Minor Fix For: 1.1.1, 1.2.0 Maven build fails in Windows OS with this error message. {noformat} [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:exec (default) on project spark-core_2.10: Command execution failed. Cannot run program unzip (in directory C:\path\to\gitofspark\python): CreateProcess error=2, 指定されたファイルが見つかりません - [Help 1] {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3061) Maven build fails in Windows OS
[ https://issues.apache.org/jira/browse/SPARK-3061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127677#comment-14127677 ] Andrew Or commented on SPARK-3061: -- Ok, backported. Thanks Josh. Maven build fails in Windows OS --- Key: SPARK-3061 URL: https://issues.apache.org/jira/browse/SPARK-3061 Project: Spark Issue Type: Bug Components: Build Affects Versions: 1.0.2, 1.1.0 Environment: Windows Reporter: Masayoshi TSUZUKI Assignee: Josh Rosen Priority: Minor Fix For: 1.1.1, 1.2.0 Maven build fails in Windows OS with this error message. {noformat} [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:exec (default) on project spark-core_2.10: Command execution failed. Cannot run program unzip (in directory C:\path\to\gitofspark\python): CreateProcess error=2, w肳ꂽt@ - [Help 1] {noformat}
[jira] [Commented] (SPARK-3463) Show metrics about spilling in Python
[ https://issues.apache.org/jira/browse/SPARK-3463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127918#comment-14127918 ] Apache Spark commented on SPARK-3463: - User 'davies' has created a pull request for this issue: https://github.com/apache/spark/pull/2336 Show metrics about spilling in Python - Key: SPARK-3463 URL: https://issues.apache.org/jira/browse/SPARK-3463 Project: Spark Issue Type: Improvement Components: PySpark Reporter: Davies Liu Assignee: Davies Liu It should also show the number of bytes spilled to disk while doing aggregation in Python.
[jira] [Commented] (SPARK-3465) Task metrics are not aggregated correctly in local mode
[ https://issues.apache.org/jira/browse/SPARK-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127920#comment-14127920 ] Apache Spark commented on SPARK-3465: - User 'davies' has created a pull request for this issue: https://github.com/apache/spark/pull/2338 Task metrics are not aggregated correctly in local mode --- Key: SPARK-3465 URL: https://issues.apache.org/jira/browse/SPARK-3465 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Davies Liu Assignee: Davies Liu Priority: Blocker In local mode, after onExecutorMetricsUpdate(), t.taskMetrics will be the same object as the one in TaskContext (because there is no serialization for MetricsUpdate in local mode), so all upcoming changes to the metrics will be lost, because updateAggregateMetrics() only counts the difference between the two. This bug was introduced in https://issues.apache.org/jira/browse/SPARK-2099, cc [~sandyr]
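The aliasing problem described in this report is easy to reproduce in miniature. A hedged Python sketch (TaskMetrics here is an illustrative stand-in, not the Spark class): when the "previous" snapshot and the live metrics are the same object, the computed delta is always zero, so updates are invisible; a real copy, which serialization would normally produce, restores correct deltas.

```python
import copy

# Illustrative stand-in for a mutable metrics object (not Spark's class).
class TaskMetrics:
    def __init__(self):
        self.bytes_read = 0

live = TaskMetrics()

# Local mode: no serialization boundary, so the "snapshot" aliases the
# live object instead of being a copy of it.
snapshot = live
live.bytes_read += 100
delta = live.bytes_read - snapshot.bytes_read   # always 0: update lost

# A real copy (what serialization gives you in cluster mode) fixes it.
snapshot = copy.deepcopy(live)
live.bytes_read += 50
delta_fixed = live.bytes_read - snapshot.bytes_read   # 50: update visible
```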
[jira] [Commented] (SPARK-3446) FutureAction should expose the job ID
[ https://issues.apache.org/jira/browse/SPARK-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127919#comment-14127919 ] Apache Spark commented on SPARK-3446: - User 'vanzin' has created a pull request for this issue: https://github.com/apache/spark/pull/2337 FutureAction should expose the job ID - Key: SPARK-3446 URL: https://issues.apache.org/jira/browse/SPARK-3446 Project: Spark Issue Type: New Feature Components: Spark Core Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin This is a follow up to SPARK-2636. The patch for that bug added a {{jobId}} method to {{SimpleFutureAction}}. The problem is that {{SimpleFutureAction}} is not exposed through any existing API; all the {{AsyncRDDActions}} methods return just {{FutureAction}}. So clients have to resort to casting / isInstanceOf to be able to use it. Exposing the {{jobId}} through {{FutureAction}} has extra complications, though, because {{ComplexFutureAction}} also extends that class.
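The casting problem can be sketched in a few lines. This is an illustrative Python analogue of the Scala types, not the actual Spark API: the attribute lives only on the subclass, so code that only holds the base type must type-check (the "cast") to reach it.

```python
class FutureAction:
    """Base type returned by the (hypothetical) async actions; no job id."""
    pass

class SimpleFutureAction(FutureAction):
    """Subclass that carries the job id, like the patch for SPARK-2636."""
    def __init__(self, job_id):
        self.job_id = job_id

def job_id_of(action):
    # Clients must resort to an isinstance check (the casting the report
    # complains about) because FutureAction itself exposes no job id.
    if isinstance(action, SimpleFutureAction):
        return action.job_id
    return None
```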
[jira] [Resolved] (SPARK-3458) enable use of python's with statements for SparkContext management
[ https://issues.apache.org/jira/browse/SPARK-3458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-3458. Resolution: Fixed Fix Version/s: 1.2.0 enable use of python's with statements for SparkContext management Key: SPARK-3458 URL: https://issues.apache.org/jira/browse/SPARK-3458 Project: Spark Issue Type: New Feature Components: PySpark Reporter: Matthew Farrellee Assignee: Matthew Farrellee Labels: features, python, sparkcontext Fix For: 1.2.0 best practice for managing SparkContexts involves exception handling, e.g. {code} try: sc = SparkContext() app(sc) finally: sc.stop() {code} python provides the with statement to simplify this code, e.g. {code} with SparkContext() as sc: app(sc) {code} the SparkContext should be usable in a with statement
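The with-statement support requested above relies on Python's context-manager protocol. A minimal, self-contained sketch of how such support works (the Context class here is illustrative, not the actual PySpark SparkContext): __exit__ runs whether the body completes or raises, which gives the try/finally semantics without the boilerplate.

```python
# Illustrative context-like object (not the real SparkContext).
class Context:
    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True

    def __enter__(self):
        # The object bound by "with Context() as sc".
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        # stop() runs even if the with-body raised; returning False
        # re-raises any pending exception after cleanup.
        self.stop()
        return False
```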
[jira] [Commented] (SPARK-3212) Improve the clarity of caching semantics
[ https://issues.apache.org/jira/browse/SPARK-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14127959#comment-14127959 ] Michael Armbrust commented on SPARK-3212: - [~matei] also points out that we should make sure to uncache cached RDDs when the base table is dropped. Improve the clarity of caching semantics Key: SPARK-3212 URL: https://issues.apache.org/jira/browse/SPARK-3212 Project: Spark Issue Type: Bug Components: SQL Reporter: Michael Armbrust Assignee: Michael Armbrust Priority: Blocker Right now there are a bunch of different ways to cache tables in Spark SQL. For example: - tweets.cache() - sql("SELECT * FROM tweets").cache() - table("tweets").cache() - tweets.cache().registerTempTable("tweets") - sql("CACHE TABLE tweets") - cacheTable("tweets") Each of the above commands has subtly different semantics, leading to a very confusing user experience. Ideally, we would stop doing caching based on simple table names and instead have a phase of optimization that does intelligent matching of query plans with available cached data.
[jira] [Updated] (SPARK-3160) Simplify DecisionTree data structure for training
[ https://issues.apache.org/jira/browse/SPARK-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joseph K. Bradley updated SPARK-3160: - Description: Improvement: code clarity Currently, we maintain a tree structure, a flat array of nodes, and a parentImpurities array. Proposed fix: Maintain everything within a growing tree structure. This would let us eliminate the flat array of nodes, thus saving storage when we do not grow a full tree. It would also potentially make it easier to pass subtrees to compute nodes for local training. Note: * This JIRA used to have this item as well: We could have a “LearningNode extends Node” setup where the LearningNode holds metadata for learning (such as impurities). The test-time model could be extracted from this training-time model, so that extra information (such as impurities) does not have to be kept after training. * However, this is really a separate issue, so I removed it. was: Improvement: code clarity Currently, we maintain a tree structure, a flat array of nodes, and a parentImpurities array. Proposed fix: Maintain everything within a growing tree structure. For this, we could have a “LearningNode extends Node” setup where the LearningNode holds metadata for learning (such as impurities). The test-time model could be extracted from this training-time model, so that extra information (such as impurities) does not have to be kept after training. This would let us eliminate the flat array of nodes, thus saving storage when we do not grow a full tree. It would also potentially make it easier to pass subtrees to compute nodes for local training. Simplify DecisionTree data structure for training - Key: SPARK-3160 URL: https://issues.apache.org/jira/browse/SPARK-3160 Project: Spark Issue Type: Improvement Components: MLlib Reporter: Joseph K. Bradley Assignee: Joseph K. Bradley Priority: Minor Improvement: code clarity Currently, we maintain a tree structure, a flat array of nodes, and a parentImpurities array. Proposed fix: Maintain everything within a growing tree structure. This would let us eliminate the flat array of nodes, thus saving storage when we do not grow a full tree. It would also potentially make it easier to pass subtrees to compute nodes for local training. Note: * This JIRA used to have this item as well: We could have a “LearningNode extends Node” setup where the LearningNode holds metadata for learning (such as impurities). The test-time model could be extracted from this training-time model, so that extra information (such as impurities) does not have to be kept after training. * However, this is really a separate issue, so I removed it.
[jira] [Commented] (SPARK-3160) Simplify DecisionTree data structure for training
[ https://issues.apache.org/jira/browse/SPARK-3160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128048#comment-14128048 ] Apache Spark commented on SPARK-3160: - User 'jkbradley' has created a pull request for this issue: https://github.com/apache/spark/pull/2341 Simplify DecisionTree data structure for training - Key: SPARK-3160 URL: https://issues.apache.org/jira/browse/SPARK-3160 Project: Spark Issue Type: Improvement Components: MLlib Reporter: Joseph K. Bradley Assignee: Joseph K. Bradley Priority: Minor Improvement: code clarity Currently, we maintain a tree structure, a flat array of nodes, and a parentImpurities array. Proposed fix: Maintain everything within a growing tree structure. This would let us eliminate the flat array of nodes, thus saving storage when we do not grow a full tree. It would also potentially make it easier to pass subtrees to compute nodes for local training. Note: * This JIRA used to have this item as well: We could have a “LearningNode extends Node” setup where the LearningNode holds metadata for learning (such as impurities). The test-time model could be extracted from this training-time model, so that extra information (such as impurities) does not have to be kept after training. * However, this is really a separate issue, so I removed it.
[jira] [Created] (SPARK-3468) WebUI Timeline-View feature
Kousuke Saruta created SPARK-3468: - Summary: WebUI Timeline-View feature Key: SPARK-3468 URL: https://issues.apache.org/jira/browse/SPARK-3468 Project: Spark Issue Type: New Feature Components: Web UI Reporter: Kousuke Saruta I sometimes troubleshoot and analyse the causes of long-running jobs. First I find the stages which take a long time or fail, then the tasks which take a long time or fail, and next I analyse the proportion of each phase within a task. In other cases, I find executors which take a long time to run a task and analyse the details of that task. In such situations, I think it's helpful to visualize a timeline view of stages / tasks / executors and to visualize the proportion of activity within each task. I'm now developing prototypes like the captures I attached. I'll integrate these viewers into the WebUI.
[jira] [Updated] (SPARK-3468) WebUI Timeline-View feature
[ https://issues.apache.org/jira/browse/SPARK-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-3468: -- Attachment: executors.png WebUI Timeline-View feature --- Key: SPARK-3468 URL: https://issues.apache.org/jira/browse/SPARK-3468 Project: Spark Issue Type: New Feature Components: Web UI Reporter: Kousuke Saruta Attachments: executors.png, stages.png, taskDetails.png, tasks.png I sometimes troubleshoot and analyse the causes of long-running jobs. First I find the stages which take a long time or fail, then the tasks which take a long time or fail, and next I analyse the proportion of each phase within a task. In other cases, I find executors which take a long time to run a task and analyse the details of that task. In such situations, I think it's helpful to visualize a timeline view of stages / tasks / executors and to visualize the proportion of activity within each task. I'm now developing prototypes like the captures I attached. I'll integrate these viewers into the WebUI.
[jira] [Updated] (SPARK-3468) WebUI Timeline-View feature
[ https://issues.apache.org/jira/browse/SPARK-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Kousuke Saruta updated SPARK-3468: -- Attachment: taskDetails.png WebUI Timeline-View feature --- Key: SPARK-3468 URL: https://issues.apache.org/jira/browse/SPARK-3468 Project: Spark Issue Type: New Feature Components: Web UI Reporter: Kousuke Saruta Attachments: executors.png, stages.png, taskDetails.png, tasks.png I sometimes troubleshoot and analyse the causes of long-running jobs. First I find the stages which take a long time or fail, then the tasks which take a long time or fail, and next I analyse the proportion of each phase within a task. In other cases, I find executors which take a long time to run a task and analyse the details of that task. In such situations, I think it's helpful to visualize a timeline view of stages / tasks / executors and to visualize the proportion of activity within each task. I'm now developing prototypes like the captures I attached. I'll integrate these viewers into the WebUI.
[jira] [Commented] (SPARK-3468) WebUI Timeline-View feature
[ https://issues.apache.org/jira/browse/SPARK-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14128109#comment-14128109 ] Apache Spark commented on SPARK-3468: - User 'sarutak' has created a pull request for this issue: https://github.com/apache/spark/pull/2342 WebUI Timeline-View feature --- Key: SPARK-3468 URL: https://issues.apache.org/jira/browse/SPARK-3468 Project: Spark Issue Type: New Feature Components: Web UI Reporter: Kousuke Saruta Attachments: executors.png, stages.png, taskDetails.png, tasks.png I sometimes troubleshoot and analyse the causes of long-running jobs. First I find the stages which take a long time or fail, then the tasks which take a long time or fail, and next I analyse the proportion of each phase within a task. In other cases, I find executors which take a long time to run a task and analyse the details of that task. In such situations, I think it's helpful to visualize a timeline view of stages / tasks / executors and to visualize the proportion of activity within each task. I'm now developing prototypes like the captures I attached. I'll integrate these viewers into the WebUI.