[GitHub] spark pull request #16195: [Spark-18765] [CORE] Make values for spark.yarn.{...
Github user daisukebe closed the pull request at: https://github.com/apache/spark/pull/16195

---

If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA.

---

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #16195: [Spark-18765] [CORE] Make values for spark.yarn.{am|driv...
Github user daisukebe commented on the issue: https://github.com/apache/spark/pull/16195

Oh, I see. I was not aware of that. Please close this PR then. Thanks!
[GitHub] spark issue #16195: [Spark-18765] [CORE] Make values for spark.yarn.{am|driv...
Github user daisukebe commented on the issue: https://github.com/apache/spark/pull/16195

@vanzin, 2.0 already has this capability per https://issues.apache.org/jira/browse/SPARK-529, so my patch targets 1.6.
[GitHub] spark pull request #16195: [Spark-18765] [CORE] Make values for spark.yarn.{...
Github user daisukebe commented on a diff in the pull request: https://github.com/apache/spark/pull/16195#discussion_r91589664

Diff: yarn/src/main/scala/org/apache/spark/deploy/yarn/ClientArguments.scala

```
@@ -61,11 +61,15 @@ private[spark] class ClientArguments(args: Array[String], sparkConf: SparkConf)
   // Additional memory to allocate to containers
   val amMemoryOverheadConf = if (isClusterMode) driverMemOverheadKey else amMemOverheadKey
-  val amMemoryOverhead = sparkConf.getInt(amMemoryOverheadConf,
-    math.max((MEMORY_OVERHEAD_FACTOR * amMemory).toInt, MEMORY_OVERHEAD_MIN))
-
-  val executorMemoryOverhead = sparkConf.getInt("spark.yarn.executor.memoryOverhead",
-    math.max((MEMORY_OVERHEAD_FACTOR * executorMemory).toInt, MEMORY_OVERHEAD_MIN))
+  val amMemoryOverheadDefault = math.max((MEMORY_OVERHEAD_FACTOR * executorMemory).toInt,
+    MEMORY_OVERHEAD_MIN)
```

End diff --

I should have followed your example earlier. Is the indentation below correct?

```
val amMemoryOverheadDefault = math.max(
  (MEMORY_OVERHEAD_FACTOR * executorMemory).toInt, MEMORY_OVERHEAD_MIN)
val amMemoryOverhead = sparkConf.getSizeAsMb(
  amMemoryOverheadConf, amMemoryOverheadDefault.toString).toInt
```
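For reference, the overhead-default arithmetic under discussion can be sketched outside Spark. This is a hedged sketch, not Spark's code: the constants mirror Spark 1.6's `ClientArguments` (overhead factor 0.10, floor 384 MB), and `parseAsMb` is a hypothetical stand-in for the unit parsing that `SparkConf.getSizeAsMb` performs (the real helper supports more suffixes).

```scala
// Sketch only: mimics the default-overhead computation and the kind of
// unit-aware parsing the PR switches to. Names and parser are assumptions.
object OverheadSketch {
  val MEMORY_OVERHEAD_FACTOR = 0.10 // 10% of the container memory
  val MEMORY_OVERHEAD_MIN = 384     // floor, in MB

  // Default overhead: max(factor * memory, floor), as in ClientArguments
  def defaultOverheadMb(memoryMb: Int): Int =
    math.max((MEMORY_OVERHEAD_FACTOR * memoryMb).toInt, MEMORY_OVERHEAD_MIN)

  // Toy unit parser: "512m" -> 512, "2g" -> 2048, bare "512" -> 512 (MB assumed)
  def parseAsMb(s: String): Int = s.trim.toLowerCase match {
    case v if v.endsWith("g") => v.dropRight(1).toInt * 1024
    case v if v.endsWith("m") => v.dropRight(1).toInt
    case v                    => v.toInt
  }
}
```

With these definitions, a 4 GB AM gets a 409 MB default overhead (10% of 4096), while a 1 GB AM is held at the 384 MB floor, and a user-supplied `"2g"` would parse to 2048 MB regardless of the default.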
[GitHub] spark issue #16195: [Spark-18765] [CORE] Make values for spark.yarn.{am|driv...
Github user daisukebe commented on the issue: https://github.com/apache/spark/pull/16195

This failure doesn't look related to my code change. Should I fix this, @vanzin?

```
Traceback (most recent call last):
  File "./dev/run-tests-jenkins.py", line 228, in <module>
    main()
  File "./dev/run-tests-jenkins.py", line 215, in main
    test_result_code, test_result_note = run_tests(tests_timeout)
  File "./dev/run-tests-jenkins.py", line 138, in run_tests
    test_result_note = ' * This patch **fails %s**.' % failure_note_by_errcode[test_result_code]
KeyError: -9
Build step 'Execute shell' marked build as failure
Archiving artifacts
Recording test results
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/69846/
Test FAILed.
Finished: FAILURE
```
[GitHub] spark issue #16195: [Spark-18765] [CORE] Make values for spark.yarn.{am|driv...
Github user daisukebe commented on the issue: https://github.com/apache/spark/pull/16195

Per @vanzin's suggestion:
- revised the code style,
- added a new default variable,
- and also fixed the warning: "
[GitHub] spark pull request #16195: [Spark-18765] [CORE] Make values for spark.yarn.{...
GitHub user daisukebe opened a pull request: https://github.com/apache/spark/pull/16195

[Spark-18765] [CORE] Make values for spark.yarn.{am|driver|executor}.memoryOverhead have configurable units

## What changes were proposed in this pull request?

Make values for spark.yarn.{am|driver|executor}.memoryOverhead have configurable units.

## How was this patch tested?

Manual tests were done by running spark-shell and SparkPi.

Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/daisukebe/spark SPARK-18765

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16195.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #16195

commit 9c0cf22f7681ae05d894ae05f6a91a9467787519
Author: Grzegorz Chilkiewicz
Date: 2016-02-02T19:16:24Z

    [SPARK-12711][ML] ML StopWordsRemover does not protect itself from column name duplication

    Fixes problem and verifies fix by test suite. Also adds an optional parameter, nullable (Boolean), to SchemaUtils.appendColumn and deduplicates the SchemaUtils.appendColumn functions.

    Author: Grzegorz Chilkiewicz
    Closes #10741 from grzegorz-chilkiewicz/master.

    (cherry picked from commit b1835d727234fdff42aa8cadd17ddcf43b0bed15)
    Signed-off-by: Joseph K. Bradley

commit 3c92333ee78f249dae37070d3b6558b9c92ec7f4
Author: Daoyuan Wang
Date: 2016-02-02T19:09:40Z

    [SPARK-13056][SQL] map column would throw NPE if value is null

    Jira: https://issues.apache.org/jira/browse/SPARK-13056

    Create a map like { "a": "somestring", "b": null } and query it like SELECT col["b"] FROM t1; an NPE would be thrown.

    Author: Daoyuan Wang
    Closes #10964 from adrian-wang/npewriter.

    (cherry picked from commit 358300c795025735c3b2f96c5447b1b227d4abc1)
    Signed-off-by: Michael Armbrust

    Conflicts:
        sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala

commit e81333be05cc5e2a41e5eb1a630c5af59a47dd23
Author: Kevin (Sangwoo) Kim
Date: 2016-02-02T21:24:09Z

    [DOCS] Update StructType.scala

    The example will throw an error like ":20: error: not found: value StructType". Need to add this line: import org.apache.spark.sql.types._

    Author: Kevin (Sangwoo) Kim
    Closes #10141 from swkimme/patch-1.

    (cherry picked from commit b377b03531d21b1d02a8f58b3791348962e1f31b)
    Signed-off-by: Michael Armbrust

commit 2f8abb4afc08aa8dc4ed763bcb93ff6b1d6f0d78
Author: Adam Budde
Date: 2016-02-03T03:35:33Z

    [SPARK-13122] Fix race condition in MemoryStore.unrollSafely()

    https://issues.apache.org/jira/browse/SPARK-13122

    A race condition can occur in MemoryStore's unrollSafely() method if two threads that return the same value for currentTaskAttemptId() execute this method concurrently. This change makes the operation of reading the initial amount of unroll memory used, performing the unroll, and updating the associated memory maps atomic in order to avoid this race condition. The initial proposed fix wraps all of unrollSafely() in a memoryManager.synchronized { } block. A cleaner approach might be to introduce a mechanism that synchronizes based on task attempt ID. An alternative option might be to track unroll/pending unroll memory based on block ID rather than task attempt ID.

    Author: Adam Budde
    Closes #11012 from budde/master.

    (cherry picked from commit ff71261b651a7b289ea2312abd6075da8b838ed9)
    Signed-off-by: Andrew Or

    Conflicts:
        core/src/main/scala/org/apache/spark/storage/MemoryStore.scala

commit 5fe8796c2fa859e30cf5ba293bee8957e23163bc
Author: Mario Briggs
Date: 2016-02-03T17:50:28Z

    [SPARK-12739][STREAMING] Details of batch in Streaming tab uses two Duration columns

    I have clearly prefixed the two 'Duration' columns in the 'Details of Batch' Streaming tab as 'Output Op Duration' and 'Job Duration'.

    Author: Mario Briggs
    Author: mariobriggs
    Closes #11022 from mariobriggs/spark-12739.

    (cherry picked from commit e9eb248edfa81d75f99c9afc2063e6b3d9ee7392)
    Signed-off-by: Shixiong Zhu

commit cdfb2a1410aa799596c8b751187dbac28b2cc678
Author: Wenchen Fan
Date: 2016-02-04T00:13:23Z

    [SPARK-13101][SQL][BRANCH-1.6] nullability of array type element should not fail analysis of encoder

    Nullability should only be considered an optimization rather than part of the type system, so instead of failing analysis for mismatched nullability, we should pass analysis and add ru
[GitHub] spark pull request: [SPARK-7704] Updating Programming Guides per S...
Github user daisukebe commented on the pull request: https://github.com/apache/spark/pull/6234#issuecomment-103308933

Thanks guys. Then, does adding the following make sense?

> If the Spark version is prior to 1.3.0, users need to explicitly import org.apache.spark.SparkContext._ to enable the implicit conversions.
[GitHub] spark pull request: Updating Programming Guides per SPARK-4397
GitHub user daisukebe opened a pull request: https://github.com/apache/spark/pull/6234

Updating Programming Guides per SPARK-4397

The change in SPARK-4397 lets the compiler find the implicit objects in SparkContext automatically, so we no longer need to import o.a.s.SparkContext._ explicitly and can remove some statements about the "implicit conversions" from the latest Programming Guides (1.3.0 and higher).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/daisukebe/spark patch-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/6234.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #6234

commit a29be5fca8c80dd5fa60107efda7001a2460e226
Author: Dice
Date: 2015-05-18T11:41:12Z

    Updating Programming Guides per SPARK-4397

    The change in SPARK-4397 lets the compiler find the implicit objects in SparkContext automatically, so we no longer need to import o.a.s.SparkContext._ explicitly and can remove some statements about the "implicit conversions" from the latest Programming Guides (1.3.0 and higher).
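The mechanism SPARK-4397 relies on can be sketched with a toy example (hypothetical names, not Spark code): an implicit conversion defined in the companion object of the source type is part of the compiler's implicit scope, so callers no longer need an explicit `import Companion._`, which is exactly why the guide's "import SparkContext._" advice became obsolete.

```scala
// Toy illustration of companion-object implicit scope (assumed names).
import scala.language.implicitConversions

class Word(val w: String)

object Word {
  // Because this lives in Word's companion object, the compiler finds it
  // when resolving members on a Word, with no explicit import required.
  implicit def toRich(word: Word): RichWord = new RichWord(word.w)
}

class RichWord(val w: String) {
  def shout: String = w.toUpperCase + "!"
}

// new Word("hi").shout compiles without `import Word._`
```

Before SPARK-4397, the RDD-to-PairRDDFunctions conversions lived in the SparkContext object (outside the implicit scope of RDD), hence the explicit import; moving them made the import unnecessary from 1.3.0 on.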