[jira] [Commented] (SPARK-4338) Remove yarn-alpha support
[ https://issues.apache.org/jira/browse/SPARK-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206104#comment-14206104 ] Sandy Ryza commented on SPARK-4338: --- Planning to take a stab at this Remove yarn-alpha support - Key: SPARK-4338 URL: https://issues.apache.org/jira/browse/SPARK-4338 Project: Spark Issue Type: Sub-task Components: YARN Reporter: Sandy Ryza -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4338) Remove yarn-alpha support
Sandy Ryza created SPARK-4338: - Summary: Remove yarn-alpha support Key: SPARK-4338 URL: https://issues.apache.org/jira/browse/SPARK-4338 Project: Spark Issue Type: Sub-task Components: YARN Reporter: Sandy Ryza -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-3647) Shaded Guava patch causes access issues with package private classes
[ https://issues.apache.org/jira/browse/SPARK-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-3647: -- Fix Version/s: 1.2.0 Shaded Guava patch causes access issues with package private classes Key: SPARK-3647 URL: https://issues.apache.org/jira/browse/SPARK-3647 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Priority: Critical Fix For: 1.2.0 The patch that introduced shading to Guava (SPARK-2848) tried to maintain backwards compatibility in the Java API by not relocating the Optional class. That causes problems when that class references package private members in the Absent and Present classes, which are now in a different package: {noformat} Exception in thread main java.lang.IllegalAccessError: tried to access class org.spark-project.guava.common.base.Present from class com.google.common.base.Optional at com.google.common.base.Optional.of(Optional.java:86) at org.apache.spark.api.java.JavaUtils$.optionToOptional(JavaUtils.scala:25) at org.apache.spark.api.java.JavaSparkContext.getSparkHome(JavaSparkContext.scala:542) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3647) Shaded Guava patch causes access issues with package private classes
[ https://issues.apache.org/jira/browse/SPARK-3647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206112#comment-14206112 ] Andrew Ash commented on SPARK-3647: --- Based on poking at the git repo below I'm marking with a fix version of 1.2.0 (the next release on branch-1.2) {noformat} aash@aash-mbp ~/git/spark$ git log origin/master | grep SPARK-3647 [SPARK-3647] Add more exceptions to Guava relocation. Closes #2496 from vanzin/SPARK-3647 and squashes the following commits: 84f58d7 [Marcelo Vanzin] [SPARK-3647] Add more exceptions to Guava relocation. aash@aash-mbp ~/git/spark$ git log origin/branch-1.0 | grep SPARK-3647 aash@aash-mbp ~/git/spark$ git log origin/branch-1.1 | grep SPARK-3647 aash@aash-mbp ~/git/spark$ git log origin/branch-1.2 | grep SPARK-3647 [SPARK-3647] Add more exceptions to Guava relocation. Closes #2496 from vanzin/SPARK-3647 and squashes the following commits: 84f58d7 [Marcelo Vanzin] [SPARK-3647] Add more exceptions to Guava relocation. aash@aash-mbp ~/git/spark$ {noformat} Shaded Guava patch causes access issues with package private classes Key: SPARK-3647 URL: https://issues.apache.org/jira/browse/SPARK-3647 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Marcelo Vanzin Assignee: Marcelo Vanzin Priority: Critical Fix For: 1.2.0 The patch that introduced shading to Guava (SPARK-2848) tried to maintain backwards compatibility in the Java API by not relocating the Optional class. That causes problems when that class references package private members in the Absent and Present classes, which are now in a different package: {noformat} Exception in thread main java.lang.IllegalAccessError: tried to access class org.spark-project.guava.common.base.Present from class com.google.common.base.Optional at com.google.common.base.Optional.of(Optional.java:86) at org.apache.spark.api.java.JavaUtils$.optionToOptional(JavaUtils.scala:25) at org.apache.spark.api.java.JavaSparkContext.getSparkHome(JavaSparkContext.scala:542) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
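The failure is easy to trip from plain user code, because any Java-API method that returns a Guava Optional goes through Optional.of, which touches the now-relocated package-private Present class. A minimal, hypothetical driver (not taken from the report) that reproduces it before the fix:
{code}
// Hedged reproduction sketch (hypothetical app code): getSparkHome() returns a
// Guava Optional, so it hits the IllegalAccessError described above.
import org.apache.spark.SparkConf
import org.apache.spark.api.java.JavaSparkContext

object GuavaOptionalRepro {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local[*]").setAppName("guava-optional-repro")
    val jsc = new JavaSparkContext(conf)
    val home = jsc.getSparkHome() // throws IllegalAccessError before this fix
    println(home)
    jsc.stop()
  }
}
{code}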
[jira] [Created] (SPARK-4339) Use configuration instead of constant
DoingDone9 created SPARK-4339: - Summary: Use configuration instead of constant Key: SPARK-4339 URL: https://issues.apache.org/jira/browse/SPARK-4339 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-572) Forbid update of static mutable variables
[ https://issues.apache.org/jira/browse/SPARK-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206143#comment-14206143 ] Andrew Ash commented on SPARK-572: -- Static mutable variables are now a standard way of having code run on a per-executor basis. To run per-entry, you can use map(), for per-partition you can use mapPartitions(), but for per-executor you need static variables or initializers. If for example you want to open a connection to another data storage system and write all of an executor's data into that system, a static connection object is the common way to do that. I would propose closing this ticket as Won't Fix. Using this technique is confusing, but prohibiting it is difficult and introduces additional roadblocks to Spark power users. cc [~rxin] Forbid update of static mutable variables - Key: SPARK-572 URL: https://issues.apache.org/jira/browse/SPARK-572 Project: Spark Issue Type: Improvement Reporter: tjhunter Consider the following piece of code:
{code}
object Foo {
  var xx = -1
  def main() {
    xx = 1
    val sc = new SparkContext(...)
    sc.broadcast(xx)
    sc.parallelize(0 to 10).map(i => { ... xx ... })
  }
}
{code}
Can you guess the value of xx? It is 1 when you use the local scheduler and -1 when you use the mesos scheduler. Given the complications, it should probably just be forbidden for now... -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
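For readers reaching for the pattern Andrew describes, a minimal sketch of per-executor initialization via a static (singleton) object is below. It is illustrative only and not from the ticket: the JDBC URL, table and the surrounding RDD are assumptions, and the same idea applies to any connection-like resource.
{code}
// Hedged sketch of the per-executor static-initializer pattern discussed above.
// The JDBC URL and SQL are placeholders; `rdd` is assumed to be in scope.
import java.sql.{Connection, DriverManager}

object ExecutorLocalConnection {
  // A JVM-level singleton: created lazily once per executor and then reused by
  // every task that runs in that executor.
  lazy val connection: Connection =
    DriverManager.getConnection("jdbc:postgresql://db-host/metrics")
}

rdd.foreachPartition { partition =>
  val stmt = ExecutorLocalConnection.connection
    .prepareStatement("INSERT INTO events(value) VALUES (?)")
  partition.foreach { value =>
    stmt.setString(1, value.toString)
    stmt.executeUpdate()
  }
}
{code}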
[jira] [Updated] (SPARK-4339) Use configuration instead of constant
[ https://issues.apache.org/jira/browse/SPARK-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4339: -- Description: fixedPoint limits the max number of iterations; it should be configurable. Use configuration instead of constant - Key: SPARK-4339 URL: https://issues.apache.org/jira/browse/SPARK-4339 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor fixedPoint limits the max number of iterations; it should be configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-572) Forbid update of static mutable variables
[ https://issues.apache.org/jira/browse/SPARK-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-572. - Resolution: Won't Fix Closing this as won't fix since it is very hard to enforce and we do abuse it to run stateful computation. Forbid update of static mutable variables - Key: SPARK-572 URL: https://issues.apache.org/jira/browse/SPARK-572 Project: Spark Issue Type: Improvement Reporter: tjhunter Consider the following piece of code:
{code}
object Foo {
  var xx = -1
  def main() {
    xx = 1
    val sc = new SparkContext(...)
    sc.broadcast(xx)
    sc.parallelize(0 to 10).map(i => { ... xx ... })
  }
}
{code}
Can you guess the value of xx? It is 1 when you use the local scheduler and -1 when you use the mesos scheduler. Given the complications, it should probably just be forbidden for now... -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4339) Make fixedPoint Configurable in Analyzer.scala
[ https://issues.apache.org/jira/browse/SPARK-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4339: -- Summary: Make fixedPoint Configurable in Analyzer.scala (was: Use configuration instead of constant) Make fixedPoint Configurable in Analyzer.scala -- Key: SPARK-4339 URL: https://issues.apache.org/jira/browse/SPARK-4339 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor fixedPoint limits the max number of iterations,it should be Configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4339) Make fixedPoint Configurable in Analyzer.scala
[ https://issues.apache.org/jira/browse/SPARK-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4339: -- Description: fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala was:fixedPoint limits the max number of iterations,it should be Configurable. Make fixedPoint Configurable in Analyzer.scala -- Key: SPARK-4339 URL: https://issues.apache.org/jira/browse/SPARK-4339 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4339) Make fixedPoint Configurable in Analyzer
[ https://issues.apache.org/jira/browse/SPARK-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4339: -- Summary: Make fixedPoint Configurable in Analyzer (was: Make fixedPoint Configurable in Analyzer.scala) Make fixedPoint Configurable in Analyzer Key: SPARK-4339 URL: https://issues.apache.org/jira/browse/SPARK-4339 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4339) Make fixedPoint Configurable in Analyzer
[ https://issues.apache.org/jira/browse/SPARK-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4339: -- Description: fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala,like that val fixedPoint = FixedPoint(100). was: fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala,like that val fixedPoint = FixedPoint(100). Make fixedPoint Configurable in Analyzer Key: SPARK-4339 URL: https://issues.apache.org/jira/browse/SPARK-4339 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala,like that val fixedPoint = FixedPoint(100). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4339) Make fixedPoint Configurable in Analyzer
[ https://issues.apache.org/jira/browse/SPARK-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4339: -- Description: fixedPoint limits the max number of iterations,it should be Configurable. But it is a contant in analyzer.scala,like that val fixedPoint = FixedPoint(100). was: fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala,like that val fixedPoint = FixedPoint(100). Make fixedPoint Configurable in Analyzer Key: SPARK-4339 URL: https://issues.apache.org/jira/browse/SPARK-4339 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor fixedPoint limits the max number of iterations,it should be Configurable. But it is a contant in analyzer.scala,like that val fixedPoint = FixedPoint(100). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4339) Make fixedPoint Configurable in Analyzer
[ https://issues.apache.org/jira/browse/SPARK-4339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4339: -- Description: fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala,like that val fixedPoint = FixedPoint(100). was: fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala Make fixedPoint Configurable in Analyzer Key: SPARK-4339 URL: https://issues.apache.org/jira/browse/SPARK-4339 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor fixedPoint limits the max number of iterations,it should be Configurable.But it is a contant in analyzer.scala,like that val fixedPoint = FixedPoint(100). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
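To make the request concrete, here is a hedged sketch of the kind of change the ticket asks for; the parameter name, default and surrounding signature are assumptions for illustration, not the eventual patch:
{code}
// Illustrative sketch only (rule batches elided): expose the iteration cap as a
// constructor parameter instead of the hard-coded FixedPoint(100).
import org.apache.spark.sql.catalyst.analysis.{Catalog, FunctionRegistry}
import org.apache.spark.sql.catalyst.plans.logical.LogicalPlan
import org.apache.spark.sql.catalyst.rules.RuleExecutor

class Analyzer(catalog: Catalog,
               registry: FunctionRegistry,
               caseSensitive: Boolean,
               maxIterations: Int = 100)   // new parameter, defaulting to today's constant
  extends RuleExecutor[LogicalPlan] {

  // was: val fixedPoint = FixedPoint(100)
  val fixedPoint = FixedPoint(maxIterations)

  val batches: Seq[Batch] = Nil   // elided: the existing resolution batches go here
}
{code}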
[jira] [Commented] (SPARK-632) Akka system names need to be normalized (since they are case-sensitive)
[ https://issues.apache.org/jira/browse/SPARK-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206158#comment-14206158 ] Andrew Ash commented on SPARK-632: -- // link moved to http://doc.akka.io/docs/akka/current/additional/faq.html#what-is-the-name-of-a-remote-actor I believe having the hostname change case will still break Spark. But after a search of the dev and user mailing lists over the past year I haven't seen any other users with this issue. A potential fix could be to call .toLower on the hostname in the Akka string across the cluster, but it's a little dirty to make this assumption everywhere. Technically [hostnames ARE case insensitive|http://serverfault.com/questions/261341/is-the-hostname-case-sensitive] so Spark's behavior is wrong, but the issue is in the underlying Akka library. This is the same underlying behavior where Akka requires that hostnames exactly match as well -- you can't use an IP address to refer to an Akka system listening on a hostname -- SPARK-625. Until Akka handles differently-cased hostnames, I think this can only be done with an ugly workaround. Possibly relevant Akka issues: - https://github.com/akka/akka/issues/15990 - https://github.com/akka/akka/issues/15007 My preference would be to close this as Won't Fix until it's raised again as a problem from the community. cc [~rxin] Akka system names need to be normalized (since they are case-sensitive) --- Key: SPARK-632 URL: https://issues.apache.org/jira/browse/SPARK-632 Project: Spark Issue Type: Bug Reporter: Matt Massie The system name of the Akka full path is case-sensitive (see http://akka.io/faq/#what_is_the_name_of_a_remote_actor). Since DNS names are case-insensitive and we're using them in the system name, we need to normalize them (e.g. make them all lowercase). Otherwise, users will find the workers will not be able to connect with the master even though the URI appears to be correct. For example, Berkeley DNS occasionally uses names like foo.Berkeley.EDU. If I used foo.berkeley.edu as the master address, the workers would write to their logs that they are connecting to foo.berkeley.edu but failed to. They never show up in the master UI. If I use the foo.Berkeley.EDU address, everything works as it should. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
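The .toLower idea above amounts to normalizing the host before it is embedded in the actor path. A hedged sketch of that workaround (a hypothetical helper, not Spark's actual code):
{code}
// Hedged sketch of the normalization workaround: lowercase the host before it
// becomes part of the Akka URL, so differently-cased DNS names produce the same
// actor path. The helper name and URL shape are illustrative assumptions.
def toAkkaUrl(host: String, port: Int, actorName: String): String = {
  val normalizedHost = host.toLowerCase  // DNS names are case-insensitive
  s"akka.tcp://sparkMaster@$normalizedHost:$port/user/$actorName"
}
{code}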
[jira] [Created] (SPARK-4340) add java opts argument substitute to avoid gc log overwritten
Haitao Yao created SPARK-4340: - Summary: add java opts argument substitute to avoid gc log overwritten Key: SPARK-4340 URL: https://issues.apache.org/jira/browse/SPARK-4340 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.2.0 Reporter: Haitao Yao Priority: Minor In standalone mode, if more than one executor is assigned to a host, the GC log will be overwritten, so I added {{CORE_ID}}, {{EXECUTOR_ID}} and {{APP_ID}} substitution to support configuration with APP_ID. Here's the pull request: https://github.com/apache/spark/pull/3205 Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
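As context, the kind of setting this substitution would enable is sketched below; the log path is illustrative and the concrete token syntax is whatever the pull request defines, so treat the placeholders as assumptions:
{noformat}
# spark-defaults.conf (illustrative): one GC log per executor rather than one
# shared, overwritten file per host. APP_ID / EXECUTOR_ID stand in for the
# substitution tokens introduced by the pull request.
spark.executor.extraJavaOptions  -verbose:gc -XX:+PrintGCDetails -Xloggc:/var/log/spark/gc-APP_ID-EXECUTOR_ID.log
{noformat}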
[jira] [Updated] (SPARK-689) Task will crash when setting SPARK_WORKER_CORES > 128
[ https://issues.apache.org/jira/browse/SPARK-689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash updated SPARK-689: - Description: when I set SPARK_WORKER_CORES 128(for example 200), and run a job in standalone mode that will allocate 200 tasks in one worker node, then task will crash(it seems that worker cores has been hard-code) {noformat} 13/02/07 11:25:02 ERROR StandaloneExecutorBackend: Task spark.executor.Executor$TaskRunner@5367839e rejected from java.util.concurrent.ThreadPoolExecutor@30f224d9[Running, pool size = 128, active threads = 128, queued tasks = 0, completed tasks = 0] java.util.concurrent.RejectedExecutionException: Task spark.executor.Executor$TaskRunner@5367839e rejected from java.util.concurrent.ThreadPoolExecutor@30f224d9[Running, pool size = 128, active threads = 128, queued tasks = 0, completed tasks = 0] at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2013) at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:816) at java.util.concurrent.ThreadPoolExecutor.execute(ThreadPoolExecutor.java:1337) at spark.executor.Executor.launchTask(Executor.scala:59) at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:57) at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:46) at akka.actor.Actor$class.apply(Actor.scala:318) at spark.executor.StandaloneExecutorBackend.apply(StandaloneExecutorBackend.scala:17) at akka.actor.ActorCell.invoke(ActorCell.scala:626) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197) at akka.dispatch.Mailbox.run(Mailbox.scala:179) at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516) at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259) at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975) at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479) at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104) 13/02/07 11:25:02 INFO StandaloneExecutorBackend: Connecting to master: akka://spark@10.0.2.19:60882/user/StandaloneScheduler 13/02/07 11:25:02 INFO StandaloneExecutorBackend: Got assigned task 1929 13/02/07 11:25:02 INFO Executor: launch taskId: 1929 13/02/07 11:25:02 ERROR StandaloneExecutorBackend: java.lang.NullPointerException at spark.executor.Executor.launchTask(Executor.scala:59) at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:57) at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:46) at akka.actor.Actor$class.apply(Actor.scala:318) at spark.executor.StandaloneExecutorBackend.apply(StandaloneExecutorBackend.scala:17) at akka.actor.ActorCell.invoke(ActorCell.scala:626) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197) at akka.dispatch.Mailbox.run(Mailbox.scala:179) at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516) at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259) at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975) at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479) at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104) 13/02/07 11:25:02 INFO StandaloneExecutorBackend: Connecting to master: akka://spark@10.0.2.19:60882/user/StandaloneScheduler 13/02/07 11:25:02 INFO StandaloneExecutorBackend: Got assigned task 1930 13/02/07 11:25:02 INFO 
Executor: launch taskId: 1930 13/02/07 11:25:02 ERROR StandaloneExecutorBackend: java.lang.NullPointerException at spark.executor.Executor.launchTask(Executor.scala:59) at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:57) at spark.executor.StandaloneExecutorBackend$$anonfun$receive$1.apply(StandaloneExecutorBackend.scala:46) at akka.actor.Actor$class.apply(Actor.scala:318) at spark.executor.StandaloneExecutorBackend.apply(StandaloneExecutorBackend.scala:17) at akka.actor.ActorCell.invoke(ActorCell.scala:626) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:197) at akka.dispatch.Mailbox.run(Mailbox.scala:179) at akka.dispatch.ForkJoinExecutorConfigurator$MailboxExecutionTask.exec(AbstractDispatcher.scala:516) at akka.jsr166y.ForkJoinTask.doExec(ForkJoinTask.java:259) at akka.jsr166y.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:975) at akka.jsr166y.ForkJoinPool.runWorker(ForkJoinPool.java:1479) at akka.jsr166y.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:104)
[jira] [Commented] (SPARK-650) Add a setup hook API for running initialization code on each executor
[ https://issues.apache.org/jira/browse/SPARK-650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206182#comment-14206182 ] Andrew Ash commented on SPARK-650: -- As mentioned in SPARK-572 static classes' initialization methods are being abused to perform this functionality. [~matei] do you still feel that a per-executor initialization function is a hook that Spark should expose in its public API? Add a setup hook API for running initialization code on each executor --- Key: SPARK-650 URL: https://issues.apache.org/jira/browse/SPARK-650 Project: Spark Issue Type: New Feature Reporter: Matei Zaharia Priority: Minor Would be useful to configure things like reporting libraries -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-632) Akka system names need to be normalized (since they are case-sensitive)
[ https://issues.apache.org/jira/browse/SPARK-632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206197#comment-14206197 ] Reynold Xin commented on SPARK-632: --- Sounds good. In the future we might roll our own RPC rather than using Actor for RPC. I think the current RPC library built for the shuffle service is already ok with case insensitive hostnames. Akka system names need to be normalized (since they are case-sensitive) --- Key: SPARK-632 URL: https://issues.apache.org/jira/browse/SPARK-632 Project: Spark Issue Type: Bug Reporter: Matt Massie The system name of the Akka full path is case-sensitive (see http://akka.io/faq/#what_is_the_name_of_a_remote_actor). Since DNS names are case-insensitive and we're using them in the system name, we need to normalize them (e.g. make them all lowercase). Otherwise, users will find the workers will not be able to connect with the master even though the URI appears to be correct. For example, Berkeley DNS occasionally uses names e.g. foo.Berkley.EDU. If I used foo.berkeley.edu as the master adddress, the workers would write to their logs that they are connecting to foo.berkeley.edu but failed to. They never show up in the master UI. If use the foo.Berkeley.EDU address, everything works as it should. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-632) Akka system names need to be normalized (since they are case-sensitive)
[ https://issues.apache.org/jira/browse/SPARK-632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin closed SPARK-632. - Resolution: Fixed Akka system names need to be normalized (since they are case-sensitive) --- Key: SPARK-632 URL: https://issues.apache.org/jira/browse/SPARK-632 Project: Spark Issue Type: Bug Reporter: Matt Massie The system name of the Akka full path is case-sensitive (see http://akka.io/faq/#what_is_the_name_of_a_remote_actor). Since DNS names are case-insensitive and we're using them in the system name, we need to normalize them (e.g. make them all lowercase). Otherwise, users will find the workers will not be able to connect with the master even though the URI appears to be correct. For example, Berkeley DNS occasionally uses names e.g. foo.Berkley.EDU. If I used foo.berkeley.edu as the master adddress, the workers would write to their logs that they are connecting to foo.berkeley.edu but failed to. They never show up in the master UI. If use the foo.Berkeley.EDU address, everything works as it should. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4341) Spark need to set num-executors automatically
Hong Shen created SPARK-4341: Summary: Spark need to set num-executors automatically Key: SPARK-4341 URL: https://issues.apache.org/jira/browse/SPARK-4341 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.1.0 Reporter: Hong Shen A MapReduce job can set the number of map tasks automatically, but in Spark we have to set num-executors, executor memory and cores. It's difficult for users to set these args, especially for users who want to use Spark SQL. So when a user hasn't set num-executors, Spark should set num-executors automatically according to the input partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
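For reference, the settings the description refers to are the ones users pass explicitly today, for example on YARN (the class and jar names below are purely illustrative):
{noformat}
spark-submit --master yarn-cluster \
  --num-executors 8 \
  --executor-memory 4g \
  --executor-cores 2 \
  --class com.example.MyJob my-job.jar
{noformat}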
[jira] [Commented] (SPARK-4341) Spark need to set num-executors automatically
[ https://issues.apache.org/jira/browse/SPARK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206216#comment-14206216 ] Sean Owen commented on SPARK-4341: -- I don't agree with this. How would Spark know a priori the number and spec of machines you have? How would it know how to balance its desire to grab it all versus yours not to commit everything to Spark? MapReduce does *not* set the amount of resource it consumes on the host machine automatically. This is up to the administrator. The number of map tasks in a job is set by MR, but that's different. Spark does the same thing already since it uses the same InputSplits. MR does not set the number of reducers. Spark need to set num-executors automatically - Key: SPARK-4341 URL: https://issues.apache.org/jira/browse/SPARK-4341 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.1.0 Reporter: Hong Shen A MapReduce job can set the number of map tasks automatically, but in Spark we have to set num-executors, executor memory and cores. It's difficult for users to set these args, especially for users who want to use Spark SQL. So when a user hasn't set num-executors, Spark should set num-executors automatically according to the input partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4314) Exception when textFileStream attempts to read deleted _COPYING_ file
[ https://issues.apache.org/jira/browse/SPARK-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maji2014 updated SPARK-4314: Description: [Reproduce] 1. Run HdfsWordCount interface, such as ssc.textFileStream(args(0)) 2. Upload file to hdfs(reason as followings) 3. Exception as followings. [Exception stack] 14/11/10 01:21:19 DEBUG Client: IPC Client (842425021) connection to master/192.168.84.142:9000 from ocdc sending #13 14/11/10 01:21:19 ERROR JobScheduler: Error generating jobs for time 1415611274000 ms org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://master:9000/user/spark/200.COPYING at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:285) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340) at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:202) at org.apache.spark.streaming.dstream.FileInputDStream$$anonfun$org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD$1.apply(FileInputDStream.scala:125) at org.apache.spark.streaming.dstream.FileInputDStream$$anonfun$org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD$1.apply(FileInputDStream.scala:124) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.streaming.dstream.FileInputDStream.org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD(FileInputDStream.scala:124) at org.apache.spark.streaming.dstream.FileInputDStream.compute(FileInputDStream.scala:83) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:40) at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:40) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.immutable.List.foreach(List.scala:318) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.AbstractTraversable.map(Traversable.scala:105) at org.apache.spark.streaming.dstream.TransformedDStream.compute(TransformedDStream.scala:40) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.ShuffledDStream.compute(ShuffledDStream.scala:41) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at 
org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:38) at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:115) at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:115) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251) at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105) at org.apache.spark.streaming.DStreamGraph.generateJobs(DStreamGraph.scala:115) at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:221) at org.apache.spark.streaming.scheduler.JobGenerator$$anonfun$2.apply(JobGenerator.scala:221) at scala.util.Try$.apply(Try.scala:161) at org.apache.spark.streaming.scheduler.JobGenerator.generateJobs(JobGenerator.scala:221) at org.apache.spark.streaming.scheduler.JobGenerator.org$apache$spark$streaming$scheduler$JobGenerator$$processEvent(JobGenerator.scala:165) at
[jira] [Issue Comment Deleted] (SPARK-4314) Exception when textFileStream attempts to read deleted _COPYING_ file
[ https://issues.apache.org/jira/browse/SPARK-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] maji2014 updated SPARK-4314: Comment: was deleted (was: Yes, Not all of this intermediate state are caught. i wanna add following code into defaultFilter method under FileInputDStream. Any suggestions?) Exception when textFileStream attempts to read deleted _COPYING_ file - Key: SPARK-4314 URL: https://issues.apache.org/jira/browse/SPARK-4314 Project: Spark Issue Type: Bug Components: Streaming Reporter: maji2014 [Reproduce] 1. Run HdfsWordCount interface, such as ssc.textFileStream(args(0)) 2. Upload file to hdfs(reason as followings) 3. Exception as followings. [Exception stack] 14/11/10 01:21:19 DEBUG Client: IPC Client (842425021) connection to master/192.168.84.142:9000 from ocdc sending #13 14/11/10 01:21:19 ERROR JobScheduler: Error generating jobs for time 1415611274000 ms org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: hdfs://master:9000/user/spark/200.COPYING at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:285) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:340) at org.apache.spark.rdd.NewHadoopRDD.getPartitions(NewHadoopRDD.scala:95) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204) at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.rdd.RDD.partitions(RDD.scala:202) at org.apache.spark.streaming.dstream.FileInputDStream$$anonfun$org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD$1.apply(FileInputDStream.scala:125) at org.apache.spark.streaming.dstream.FileInputDStream$$anonfun$org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD$1.apply(FileInputDStream.scala:124) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.streaming.dstream.FileInputDStream.org$apache$spark$streaming$dstream$FileInputDStream$$filesToRDD(FileInputDStream.scala:124) at org.apache.spark.streaming.dstream.FileInputDStream.compute(FileInputDStream.scala:83) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:40) at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:40) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at scala.collection.immutable.List.foreach(List.scala:318) at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) at scala.collection.AbstractTraversable.map(Traversable.scala:105) at org.apache.spark.streaming.dstream.TransformedDStream.compute(TransformedDStream.scala:40) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.ShuffledDStream.compute(ShuffledDStream.scala:41) at 
org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.MappedDStream.compute(MappedDStream.scala:35) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:291) at org.apache.spark.streaming.dstream.ForEachDStream.generateJob(ForEachDStream.scala:38) at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:115) at org.apache.spark.streaming.DStreamGraph$$anonfun$1.apply(DStreamGraph.scala:115) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251) at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105) at
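A user-side workaround consistent with the filtering idea in the (deleted) comment above is to use fileStream with an explicit path filter instead of textFileStream, so the HDFS client's temporary ._COPYING_ files are skipped. This is a hedged sketch of that workaround, not the proposed FileInputDStream patch; ssc and args(0) are taken from the reproduction steps:
{code}
// Hedged workaround sketch: skip the HDFS client's temporary "._COPYING_" files
// by supplying a path filter to fileStream instead of using textFileStream.
import org.apache.hadoop.fs.Path
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat

val lines = ssc.fileStream[LongWritable, Text, TextInputFormat](
    args(0),
    (path: Path) => !path.getName.endsWith("._COPYING_"),
    newFilesOnly = true)
  .map(_._2.toString)
{code}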
[jira] [Resolved] (SPARK-4295) [External]Exception throws in SparkSinkSuite although all test cases pass
[ https://issues.apache.org/jira/browse/SPARK-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-4295. -- Resolution: Fixed Fix Version/s: 1.2.0 1.1.1 [External]Exception throws in SparkSinkSuite although all test cases pass - Key: SPARK-4295 URL: https://issues.apache.org/jira/browse/SPARK-4295 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.1.0 Reporter: maji2014 Priority: Minor Fix For: 1.1.1, 1.2.0 [reproduce] Run test suite normally, after the first test case, all other test cases throw javax.management.InstanceAlreadyExistsException: org.apache.flume.channel:type=null [exception stack] exception as followings: 14/11/07 00:24:51 ERROR MonitoredCounterGroup: Failed to register monitored counter group for type: CHANNEL, name: null javax.management.InstanceAlreadyExistsException: org.apache.flume.channel:type=null at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:437) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerWithRepository(DefaultMBeanServerInterceptor.java:1898) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:966) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:900) at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:324) at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:522) at org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:108) at org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:88) at org.apache.flume.channel.MemoryChannel.start(MemoryChannel.java:345) at org.apache.spark.streaming.flume.sink.SparkSinkSuite$$anonfun$2.apply$mcV$sp(SparkSinkSuite.scala:63) at org.apache.spark.streaming.flume.sink.SparkSinkSuite$$anonfun$2.apply(SparkSinkSuite.scala:61) at org.apache.spark.streaming.flume.sink.SparkSinkSuite$$anonfun$2.apply(SparkSinkSuite.scala:61) at org.scalatest.Transformer$$anonfun$apply$1.apply$mcV$sp(Transformer.scala:22) at org.scalatest.OutcomeOf$class.outcomeOf(OutcomeOf.scala:85) at org.scalatest.OutcomeOf$.outcomeOf(OutcomeOf.scala:104) at org.scalatest.Transformer.apply(Transformer.scala:22) at org.scalatest.Transformer.apply(Transformer.scala:20) at org.scalatest.FunSuiteLike$$anon$1.apply(FunSuiteLike.scala:166) at org.scalatest.Suite$class.withFixture(Suite.scala:1122) at org.scalatest.FunSuite.withFixture(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$class.invokeWithFixture$1(FunSuiteLike.scala:163) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.FunSuiteLike$$anonfun$runTest$1.apply(FunSuiteLike.scala:175) at org.scalatest.SuperEngine.runTestImpl(Engine.scala:306) at org.scalatest.FunSuiteLike$class.runTest(FunSuiteLike.scala:175) at org.scalatest.FunSuite.runTest(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.FunSuiteLike$$anonfun$runTests$1.apply(FunSuiteLike.scala:208) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:413) at org.scalatest.SuperEngine$$anonfun$traverseSubNodes$1$1.apply(Engine.scala:401) at scala.collection.immutable.List.foreach(List.scala:318) at org.scalatest.SuperEngine.traverseSubNodes$1(Engine.scala:401) at org.scalatest.SuperEngine.org$scalatest$SuperEngine$$runTestsInBranch(Engine.scala:396) at 
org.scalatest.SuperEngine.runTestsImpl(Engine.scala:483) at org.scalatest.FunSuiteLike$class.runTests(FunSuiteLike.scala:208) at org.scalatest.FunSuite.runTests(FunSuite.scala:1555) at org.scalatest.Suite$class.run(Suite.scala:1424) at org.scalatest.FunSuite.org$scalatest$FunSuiteLike$$super$run(FunSuite.scala:1555) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.FunSuiteLike$$anonfun$run$1.apply(FunSuiteLike.scala:212) at org.scalatest.SuperEngine.runImpl(Engine.scala:545) at org.scalatest.FunSuiteLike$class.run(FunSuiteLike.scala:212) at org.scalatest.FunSuite.run(FunSuite.scala:1555) at org.scalatest.tools.SuiteRunner.run(SuiteRunner.scala:55) at org.scalatest.tools.Runner$$anonfun$doRunRunRunDaDoRunRun$3.apply(Runner.scala:2563) at
[jira] [Resolved] (SPARK-2492) KafkaReceiver minor changes to align with Kafka 0.8
[ https://issues.apache.org/jira/browse/SPARK-2492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das resolved SPARK-2492. -- Resolution: Fixed Fix Version/s: 1.2.0 KafkaReceiver minor changes to align with Kafka 0.8 Key: SPARK-2492 URL: https://issues.apache.org/jira/browse/SPARK-2492 Project: Spark Issue Type: Improvement Components: Streaming Affects Versions: 1.0.0 Reporter: Saisai Shao Assignee: Saisai Shao Priority: Minor Fix For: 1.2.0 Update to delete Zookeeper metadata when Kafka's parameter auto.offset.reset is set to smallest, which is aligned with Kafka 0.8's ConsoleConsumer. Also use Kafka offered API without directly using zkClient. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4341) Spark need to set num-executors automatically
[ https://issues.apache.org/jira/browse/SPARK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206263#comment-14206263 ] Hong Shen commented on SPARK-4341: -- There must be some relation between input splits, num-executors and Spark parallelism. For example, if the number of input splits (which determines the partitions of the input RDD) is less than num-executors, it will lead to a waste of resources; if the input splits far outnumber num-executors, the job will take a long time. The same applies to num-executors and Spark parallelism. So if we want Spark to be widely used, this should be set by Spark automatically. Spark need to set num-executors automatically - Key: SPARK-4341 URL: https://issues.apache.org/jira/browse/SPARK-4341 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.1.0 Reporter: Hong Shen A MapReduce job can set the number of map tasks automatically, but in Spark we have to set num-executors, executor memory and cores. It's difficult for users to set these args, especially for users who want to use Spark SQL. So when a user hasn't set num-executors, Spark should set num-executors automatically according to the input partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4326) unidoc is broken on master
[ https://issues.apache.org/jira/browse/SPARK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206265#comment-14206265 ] Sean Owen commented on SPARK-4326: -- Hm. {{hashInt}} isn't in Guava 11, but is in 12. This leads me to believe that unidoc is picking up Guava 11 from Hadoop, and not Guava 14 from Spark since it's shaded. I would like to phone a friend: [~vanzin] unidoc is broken on master -- Key: SPARK-4326 URL: https://issues.apache.org/jira/browse/SPARK-4326 Project: Spark Issue Type: Bug Components: Build, Documentation Affects Versions: 1.3.0 Reporter: Xiangrui Meng On master, `jekyll build` throws the following error: {code} [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/AppendOnlyMap.scala:205: value hashInt is not a member of com.google.common.hash.HashFunction [error] private def rehash(h: Int): Int = Hashing.murmur3_32().hashInt(h).asInt() [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala:426: value limit is not a member of object com.google.common.io.ByteStreams [error] val bufferedStream = new BufferedInputStream(ByteStreams.limit(fileStream, end - start)) [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala:558: value limit is not a member of object com.google.common.io.ByteStreams [error] val bufferedStream = new BufferedInputStream(ByteStreams.limit(fileStream, end - start)) [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala:261: value hashInt is not a member of com.google.common.hash.HashFunction [error] private def hashcode(h: Int): Int = Hashing.murmur3_32().hashInt(h).asInt() [error]^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/Utils.scala:37: type mismatch; [error] found : java.util.Iterator[T] [error] required: Iterable[?] [error] collectionAsScalaIterable(ordering.leastOf(asJavaIterator(input), num)).iterator [error] ^ [error] /Users/meng/src/spark/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableOperations.scala:421: value putAll is not a member of com.google.common.cache.Cache[org.apache.hadoop.fs.FileStatus,parquet.hadoop.Footer] [error] footerCache.putAll(newFooters) [error] ^ [warn] /Users/meng/src/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/parquet/FakeParquetSerDe.scala:34: @deprecated now takes two arguments; see the scaladoc. [warn] @deprecated(No code should depend on FakeParquetHiveSerDe as it is only intended as a + [warn] ^ [info] No documentation generated with unsucessful compiler run [warn] two warnings found [error] 6 errors found [error] (spark/scalaunidoc:doc) Scaladoc generation failed [error] Total time: 48 s, completed Nov 10, 2014 1:31:01 PM {code} It doesn't happen on branch-1.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4340) add java opts argument substitute to avoid gc log overwritten
[ https://issues.apache.org/jira/browse/SPARK-4340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206286#comment-14206286 ] Apache Spark commented on SPARK-4340: - User 'haitaoyao' has created a pull request for this issue: https://github.com/apache/spark/pull/3205 add java opts argument substitute to avoid gc log overwritten - Key: SPARK-4340 URL: https://issues.apache.org/jira/browse/SPARK-4340 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.2.0 Reporter: Haitao Yao Priority: Minor In standalone mode, if more executors are assigned to 1 host, the gc log will be overwritten. so I add {{CORE_ID}}, {{EXECUTOR_ID}}, {{APP_ID}} substitute to support configuration with APP_ID Heres' the push request. https://github.com/apache/spark/pull/3205 Thanks. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4341) Spark need to set num-executors automatically
[ https://issues.apache.org/jira/browse/SPARK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206288#comment-14206288 ] Sean Owen commented on SPARK-4341: -- Yes, but the executors live as long as the app does. The app may invoke lots of operations, large and small, with different numbers of partitions each. It is not like MR, where one MR job executes one map and one reduce. Too many splits does not waste resources; it means you incur the overhead of launching more tasks, but that's relatively small. Concretely, how do you propose to set this automatically? Spark need to set num-executors automatically - Key: SPARK-4341 URL: https://issues.apache.org/jira/browse/SPARK-4341 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.1.0 Reporter: Hong Shen A MapReduce job can set the number of map tasks automatically, but in Spark we have to set num-executors, executor memory and cores. It's difficult for users to set these args, especially for users who want to use Spark SQL. So when a user hasn't set num-executors, Spark should set num-executors automatically according to the input partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4342) connection ack timeout improvement, replace Timer with ScheudledExecutor...
[ https://issues.apache.org/jira/browse/SPARK-4342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206305#comment-14206305 ] Apache Spark commented on SPARK-4342: - User 'haitaoyao' has created a pull request for this issue: https://github.com/apache/spark/pull/3207 connection ack timeout improvement, replace Timer with ScheudledExecutor... --- Key: SPARK-4342 URL: https://issues.apache.org/jira/browse/SPARK-4342 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Haitao Yao replace java.util.Timer with scheduledExecutorService, use message id directly in the task. for details, see the mailing list. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
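A hedged sketch of the direction described in the summary is below; the names and structure are illustrative assumptions, not the actual pull request:
{code}
// Sketch: schedule an ack-timeout action per message id on a
// ScheduledExecutorService instead of a java.util.Timer, using the id directly.
import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

val ackTimeoutMonitor: ScheduledExecutorService =
  Executors.newSingleThreadScheduledExecutor()

def scheduleAckTimeout(messageId: Int, timeoutSec: Long)(onTimeout: Int => Unit): Unit = {
  ackTimeoutMonitor.schedule(new Runnable {
    // The message id is captured directly; no lookup in a shared map is needed.
    override def run(): Unit = onTimeout(messageId)
  }, timeoutSec, TimeUnit.SECONDS)
}
{code}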
[jira] [Resolved] (SPARK-2055) bin$ ./run-example is bad. must run SPARK_HOME$ bin/run-example. look at the file run-example at line 54.
[ https://issues.apache.org/jira/browse/SPARK-2055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2055. -- Resolution: Invalid I think this is obsolete or at least quite unclear. The script does control the directory from which spark-submit is run explicitly on about line 54. bin$ ./run-example is bad. must run SPARK_HOME$ bin/run-example. look at the file run-example at line 54. -- Key: SPARK-2055 URL: https://issues.apache.org/jira/browse/SPARK-2055 Project: Spark Issue Type: Improvement Components: Examples Affects Versions: 1.0.0 Reporter: Peerless.feng -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1858) Update third-party Hadoop distros doc to list more distros
[ https://issues.apache.org/jira/browse/SPARK-1858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206353#comment-14206353 ] Sean Owen commented on SPARK-1858: -- Same, this is one of two issues left under the stale-sounding https://issues.apache.org/jira/browse/SPARK-1351 . Are any more distros on the radar that aren't on the page? Update third-party Hadoop distros doc to list more distros Key: SPARK-1858 URL: https://issues.apache.org/jira/browse/SPARK-1858 Project: Spark Issue Type: Sub-task Components: Documentation Reporter: Matei Zaharia Fix For: 1.0.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1564) Add JavaScript into Javadoc to turn ::Experimental:: and such into badges
[ https://issues.apache.org/jira/browse/SPARK-1564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206352#comment-14206352 ] Sean Owen commented on SPARK-1564: -- This is one of two issues left under the stale-sounding https://issues.apache.org/jira/browse/SPARK-1351 . Can this be turned loose into a floating issue for 1.3+ or marked WontFix? Add JavaScript into Javadoc to turn ::Experimental:: and such into badges - Key: SPARK-1564 URL: https://issues.apache.org/jira/browse/SPARK-1564 Project: Spark Issue Type: Sub-task Components: Documentation Reporter: Matei Zaharia Assignee: Andrew Or Priority: Minor Fix For: 1.2.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2092) This is a test issue
[ https://issues.apache.org/jira/browse/SPARK-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2092. -- Resolution: Not a Problem Patrick, can these test issues be closed? I'm just looking over old issues. This looks like an old test. This is a test issue Key: SPARK-2092 URL: https://issues.apache.org/jira/browse/SPARK-2092 Project: Spark Issue Type: New Feature Reporter: Test Assignee: Test -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-1943) Testing use of target version field
[ https://issues.apache.org/jira/browse/SPARK-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1943. -- Resolution: Not a Problem Patrick, can these test issues be closed? I'm just looking over old issues. This looks like an old test. Testing use of target version field --- Key: SPARK-1943 URL: https://issues.apache.org/jira/browse/SPARK-1943 Project: Spark Issue Type: Bug Affects Versions: 1.0.0 Reporter: Patrick Wendell -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4196) Streaming + checkpointing + saveAsNewAPIHadoopFiles = NotSerializableException for Hadoop Configuration
[ https://issues.apache.org/jira/browse/SPARK-4196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen updated SPARK-4196: - Summary: Streaming + checkpointing + saveAsNewAPIHadoopFiles = NotSerializableException for Hadoop Configuration (was: Streaming + checkpointing yields NotSerializableException for Hadoop Configuration from saveAsNewAPIHadoopFiles ?) More info. The problem is that {{CheckpointWriter}} serializes the {{DStreamGraph}} when checkpointing is enabled. In the case of, for example, {{saveAsNewAPIHadoopFiles}}, this includes a {{ForEachDStream}} with a reference to a Hadoop {{Configuration}}. This isn't a problem without checkpointing because Spark is not going to need to serialize this {{ForEachDStream}} closure to execute it in general. But it does to checkpoint it. Does that make sense? I'm not sure what to do but this is presenting a significant problem to me as I can't see a sly workaround to make streaming, with saving Hadoop files, with checkpointing, to work. Here's a cobbled-together test that shows the problem: {code} test(recovery with save to HDFS stream) { // Set up the streaming context and input streams val testDir = Utils.createTempDir() val outDir = Utils.createTempDir() var ssc = new StreamingContext(master, framework, Seconds(1)) ssc.checkpoint(checkpointDir) val fileStream = ssc.textFileStream(testDir.toString) for (i - Seq(1, 2, 3)) { Files.write(i + \n, new File(testDir, i.toString), Charset.forName(UTF-8)) // wait to make sure that the file is written such that it gets shown in the file listings } val reducedStream = fileStream.map(x = (x, x)).saveAsNewAPIHadoopFiles( outDir.toURI.toString, saveAsNewAPIHadoopFilesTest, classOf[Text], classOf[Text], classOf[TextOutputFormat[Text,Text]], ssc.sparkContext.hadoopConfiguration) ssc.start() ssc.awaitTermination(5000) ssc.stop() val checkpointDirFile = new File(checkpointDir) assert(outDir.listFiles().length 0) assert(checkpointDirFile.listFiles().length == 1) assert(checkpointDirFile.listFiles()(0).listFiles().length 0) Utils.deleteRecursively(testDir) Utils.deleteRecursively(outDir) } {code} You'll see the {{NotSerializableException}} clearly if you hack {{Checkpoint.write()}}: {code} def write(checkpoint: Checkpoint) { val bos = new ByteArrayOutputStream() val zos = compressionCodec.compressedOutputStream(bos) val oos = new ObjectOutputStream(zos) try { oos.writeObject(checkpoint) } catch { case e: Exception = e.printStackTrace() throw e } ... {code} Streaming + checkpointing + saveAsNewAPIHadoopFiles = NotSerializableException for Hadoop Configuration --- Key: SPARK-4196 URL: https://issues.apache.org/jira/browse/SPARK-4196 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.1.0 Reporter: Sean Owen I am reasonably sure there is some issue here in Streaming and that I'm not missing something basic, but not 100%. I went ahead and posted it as a JIRA to track, since it's come up a few times before without resolution, and right now I can't get checkpointing to work at all. When Spark Streaming checkpointing is enabled, I see a NotSerializableException thrown for a Hadoop Configuration object, and it seems like it is not one from my user code. Before I post my particular instance see http://mail-archives.apache.org/mod_mbox/spark-user/201408.mbox/%3c1408135046777-12202.p...@n3.nabble.com%3E for another occurrence. 
I was also on customer site last week debugging an identical issue with checkpointing in a Scala-based program and they also could not enable checkpointing without hitting exactly this error. The essence of my code is: {code} final JavaSparkContext sparkContext = new JavaSparkContext(sparkConf); JavaStreamingContextFactory streamingContextFactory = new JavaStreamingContextFactory() { @Override public JavaStreamingContext create() { return new JavaStreamingContext(sparkContext, new Duration(batchDurationMS)); } }; streamingContext = JavaStreamingContext.getOrCreate( checkpointDirString, sparkContext.hadoopConfiguration(), streamingContextFactory, false); streamingContext.checkpoint(checkpointDirString); {code} It yields: {code} 2014-10-31 14:29:00,211 ERROR OneForOneStrategy:66 org.apache.hadoop.conf.Configuration - field (class org.apache.spark.streaming.dstream.PairDStreamFunctions$$anonfun$9, name: conf$2, type: class org.apache.hadoop.conf.Configuration) - object (class
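A possible user-side mitigation for closures you control (a minimal sketch, not the reporter's code; the class name is hypothetical) is to hold the Hadoop Configuration behind a small wrapper that implements Java serialization by hand, relying on the fact that {{Configuration}} is a Hadoop {{Writable}}:
{code}
import java.io.{ObjectInputStream, ObjectOutputStream}

import org.apache.hadoop.conf.Configuration

// Illustrative wrapper: write the Configuration's entries out manually so the
// object graph that CheckpointWriter serializes no longer contains a raw,
// non-serializable Configuration.
class SerializableConfiguration(@transient var value: Configuration) extends Serializable {
  private def writeObject(out: ObjectOutputStream): Unit = {
    out.defaultWriteObject()
    value.write(out) // Configuration implements Writable
  }

  private def readObject(in: ObjectInputStream): Unit = {
    in.defaultReadObject()
    value = new Configuration(false)
    value.readFields(in)
  }
}
{code}
This only helps closures written by the application (e.g. a hand-rolled {{foreachRDD}} that saves files); the built-in {{saveAsNewAPIHadoopFiles}} captures the {{Configuration}} directly, which is exactly what the stack trace above points at.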
[jira] [Updated] (SPARK-1825) Windows Spark fails to work with Linux YARN
[ https://issues.apache.org/jira/browse/SPARK-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ángel Álvarez updated SPARK-1825: - Attachment: SPARK-1825.patch Is it really necessary to change the file ExecutorRunnableUtil.scala? I only changed the file ClientBase.scala and it (apparently) works for Spark 1.1. In order to make it work, you'll have to add the following configuration: - Program arguments: --master yarn-cluster - VM arguments: -Dspark.app-submission.cross-platform=true Windows Spark fails to work with Linux YARN --- Key: SPARK-1825 URL: https://issues.apache.org/jira/browse/SPARK-1825 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.0.0 Reporter: Taeyun Kim Fix For: 1.2.0 Attachments: SPARK-1825.patch Windows Spark fails to work with Linux YARN. This is a cross-platform problem. This error occurs when 'yarn-client' mode is used. (yarn-cluster/yarn-standalone mode was not tested.) On the YARN side, Hadoop 2.4.0 resolved the issue as follows: https://issues.apache.org/jira/browse/YARN-1824 But the Spark YARN module does not incorporate the new YARN API yet, so the problem persists for Spark. First, the following source files should be changed: - /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ClientBase.scala - /yarn/common/src/main/scala/org/apache/spark/deploy/yarn/ExecutorRunnableUtil.scala The change is as follows: - Replace .$() with .$$() - Replace File.pathSeparator for Environment.CLASSPATH.name with ApplicationConstants.CLASS_PATH_SEPARATOR (import org.apache.hadoop.yarn.api.ApplicationConstants is required for this) Unless the above changes are applied, launch_container.sh will contain invalid shell script statements (since they will contain Windows-specific separators), and the job will fail. The following symptoms should also be fixed (I could not find the relevant source code): - The SPARK_HOME environment variable is copied straight to launch_container.sh. It should be changed to the path format for the server OS, or, better, a separate environment variable or a configuration variable should be created. - The '%HADOOP_MAPRED_HOME%' string still exists in launch_container.sh after the above change is applied. Maybe I missed a few lines. I'm not sure whether this is all, since I'm new to both Spark and YARN. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
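To make the substitutions above concrete, here is a rough sketch (assuming the Hadoop 2.4+ API; the helper name and the env map are illustrative, not the actual ClientBase code):
{code}
import scala.collection.mutable

import org.apache.hadoop.yarn.api.ApplicationConstants
import org.apache.hadoop.yarn.api.ApplicationConstants.Environment

// CLASS_PATH_SEPARATOR and the $$() form are placeholders that the YARN node
// manager expands on the target host, so a Windows client can build a
// classpath and environment references that are valid on Linux nodes.
def addClasspathEntry(entry: String, env: mutable.Map[String, String]): Unit = {
  val sep = ApplicationConstants.CLASS_PATH_SEPARATOR // instead of File.pathSeparator
  env(Environment.CLASSPATH.name) =
    env.get(Environment.CLASSPATH.name).map(_ + sep + entry).getOrElse(entry)
}

// Cross-platform reference to another environment variable,
// instead of Environment.PWD.$():
val pwdRef = Environment.PWD.$$()
{code}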
[jira] [Resolved] (SPARK-1493) Apache RAT excludes don't work with file path (instead of file name)
[ https://issues.apache.org/jira/browse/SPARK-1493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1493. -- Resolution: Fixed Looks like this was fixed? There are no paths of this form in .rat-excludes now. Apache RAT excludes don't work with file path (instead of file name) Key: SPARK-1493 URL: https://issues.apache.org/jira/browse/SPARK-1493 Project: Spark Issue Type: Bug Components: Project Infra Reporter: Patrick Wendell Labels: starter Right now the way we do RAT checks, it doesn't work if you try to exclude: /path/to/file.ext you have to just exclude file.ext -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4301) StreamingContext should not allow start() to be called after calling stop()
[ https://issues.apache.org/jira/browse/SPARK-4301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206406#comment-14206406 ] Sean Owen commented on SPARK-4301: -- Sorry, a bit late, but I noticed this is pretty related to https://issues.apache.org/jira/browse/SPARK-2645 which discusses calling stop() twice. StreamingContext should not allow start() to be called after calling stop() --- Key: SPARK-4301 URL: https://issues.apache.org/jira/browse/SPARK-4301 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.0.0, 1.0.1, 1.0.2, 1.1.0 Reporter: Josh Rosen Assignee: Josh Rosen Fix For: 1.1.1, 1.2.0, 1.0.3 In Spark 1.0.0+, calling {{stop()}} on a StreamingContext that has not been started is a no-op which has no side-effects. This allows users to call {{stop()}} on a fresh StreamingContext followed by {{start()}}. I believe that this almost always indicates an error and is not behavior that we should support. Since we don't allow {{start() stop() start()}} then I don't think it makes sense to allow {{stop() start()}}. The current behavior can lead to resource leaks when StreamingContext constructs its own SparkContext: if I call {{stop(stopSparkContext=True)}}, then I expect StreamingContext's underlying SparkContext to be stopped irrespective of whether the StreamingContext has been started. This is useful when writing unit test fixtures. Prior discussions: - https://github.com/apache/spark/pull/3053#discussion-diff-19710333R490 - https://github.com/apache/spark/pull/3121#issuecomment-61927353 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
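For reference, a minimal sketch of the sequence under discussion (the configuration and the dummy output operation are illustrative, just to make the snippet self-contained):
{code}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf().setMaster("local[2]").setAppName("stop-then-start")
val ssc = new StreamingContext(conf, Seconds(1))
val lines = ssc.socketTextStream("localhost", 9999)
lines.print() // register one output operation so start() passes validation

// Today this stop() is a no-op because the context was never started, so the
// SparkContext the StreamingContext created is not stopped either (the leak
// described above)...
ssc.stop(stopSparkContext = true)

// ...and this start() still succeeds. The proposal is to reject start() after
// stop(), mirroring the existing ban on start() stop() start().
ssc.start()
{code}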
[jira] [Resolved] (SPARK-2725) Add instructions about how to build with Hive to building-with-maven.md
[ https://issues.apache.org/jira/browse/SPARK-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2725. -- Resolution: Fixed The current {{building-spark.md}} has Hive-related build instructions. This was fixed along the way, it seems. Add instructions about how to build with Hive to building-with-maven.md --- Key: SPARK-2725 URL: https://issues.apache.org/jira/browse/SPARK-2725 Project: Spark Issue Type: Documentation Components: Documentation Reporter: Matei Zaharia -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2720) spark-examples should depend on HBase modules for HBase 0.96+
[ https://issues.apache.org/jira/browse/SPARK-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2720. -- Resolution: Duplicate Basically subsumed by SPARK-1297, which was resolved. spark-examples should depend on HBase modules for HBase 0.96+ - Key: SPARK-2720 URL: https://issues.apache.org/jira/browse/SPARK-2720 Project: Spark Issue Type: Task Reporter: Ted Yu Priority: Minor With this change:
{code}
diff --git a/pom.xml b/pom.xml
index 93ef3b9..092430a 100644
--- a/pom.xml
+++ b/pom.xml
@@ -122,7 +122,7 @@
     <hadoop.version>1.0.4</hadoop.version>
     <protobuf.version>2.4.1</protobuf.version>
     <yarn.version>${hadoop.version}</yarn.version>
-    <hbase.version>0.94.6</hbase.version>
+    <hbase.version>0.98.4</hbase.version>
     <zookeeper.version>3.4.5</zookeeper.version>
     <hive.version>0.12.0</hive.version>
     <parquet.version>1.4.3</parquet.version>
{code}
I got:
{code}
[ERROR] Failed to execute goal on project spark-examples_2.10: Could not resolve dependencies for project org.apache.spark:spark-examples_2.10:jar:1.1.0-SNAPSHOT: Could not find artifact org.apache.hbase:hbase:jar:0.98.4 in maven-repo (http://repo.maven.apache.org/maven2) -> [Help 1]
{code}
To build against HBase 0.96+, spark-examples needs to specify HBase modules (hbase-client, etc) in dependencies - possibly using a new profile. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2732) Update build script to Tachyon 0.5.0
[ https://issues.apache.org/jira/browse/SPARK-2732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2732. -- Resolution: Duplicate Dupe of SPARK-2702, and resolved anyway. Update build script to Tachyon 0.5.0 Key: SPARK-2732 URL: https://issues.apache.org/jira/browse/SPARK-2732 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Henry Saputra Update Maven pom.xml and sbt script to use Tachyon 0.5.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2733) Update make-distribution.sh to download Tachyon 0.5.0
[ https://issues.apache.org/jira/browse/SPARK-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2733. -- Resolution: Duplicate Also roughly a duplicate of SPARK-2702, like its parent, and also resolved. Update make-distribution.sh to download Tachyon 0.5.0 - Key: SPARK-2733 URL: https://issues.apache.org/jira/browse/SPARK-2733 Project: Spark Issue Type: Sub-task Components: Spark Core Reporter: Henry Saputra Need to update make-distribution.sh to download Tachyon 0.5.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2819) Difficult to turn on intercept with linear models
[ https://issues.apache.org/jira/browse/SPARK-2819?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206415#comment-14206415 ] Sean Owen commented on SPARK-2819: -- Is this still in play? The convenience methods can't cover every possible combination of params, or else they merely duplicate the constructors in a complicated way. Difficult to turn on intercept with linear models - Key: SPARK-2819 URL: https://issues.apache.org/jira/browse/SPARK-2819 Project: Spark Issue Type: Improvement Components: MLlib Reporter: Sandy Ryza If I want to train a logistic regression model with default parameters and include an intercept, I can run: val alg = new LogisticRegressionWithSGD() alg.setIntercept(true) alg.run(data) but if I want to set a parameter like numIterations, I need to use LogisticRegressionWithSGD.train(data, 50) and have no opportunity to turn on the intercept. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
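For context, the instance-based route can set both today; it is the static train() helpers that lack an intercept flag. A minimal sketch (assuming an RDD[LabeledPoint] named {{data}} already exists):
{code}
import org.apache.spark.mllib.classification.LogisticRegressionWithSGD

// Configure the intercept and the optimizer parameters on an instance instead
// of going through LogisticRegressionWithSGD.train(data, 50), which exposes
// numIterations but not setIntercept.
val alg = new LogisticRegressionWithSGD()
alg.setIntercept(true)
alg.optimizer
  .setNumIterations(50)
  .setStepSize(1.0)
val model = alg.run(data) // `data` is assumed to be an RDD[LabeledPoint]
{code}
Whether that makes additional convenience overloads worthwhile is the open question in the comment above.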
[jira] [Resolved] (SPARK-2982) Glitch of spark streaming
[ https://issues.apache.org/jira/browse/SPARK-2982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2982. -- Resolution: Invalid Target Version/s: (was: 1.0.2) Glitch of spark streaming - Key: SPARK-2982 URL: https://issues.apache.org/jira/browse/SPARK-2982 Project: Spark Issue Type: Improvement Components: Streaming Affects Versions: 1.0.0 Reporter: dai zhiyuan Attachments: cpu.png, io.png, network.png Spark Streaming task startup times are tightly clustered, which creates spikes (glitches) in network and CPU usage; the CPU and network then sit idle much of the time, which is wasteful of system resources. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2121) Not fully cached when there is enough memory in ALS
[ https://issues.apache.org/jira/browse/SPARK-2121?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-2121. -- Resolution: Not a Problem Not fully cached when there is enough memory in ALS --- Key: SPARK-2121 URL: https://issues.apache.org/jira/browse/SPARK-2121 Project: Spark Issue Type: Bug Components: Block Manager, MLlib, Spark Core Affects Versions: 1.0.0 Reporter: Shuo Xiang While factorizing a large matrix using the latest Alternating Least Squares (ALS) in MLlib, the Spark UI suggests that Spark fails to cache all the partitions of some RDDs even though memory is sufficient. Please see [this post](http://apache-spark-user-list.1001560.n3.nabble.com/Not-fully-cached-when-there-is-enough-memory-tt7429.html) for screenshots. This may cause subsequent job failures while executing `userOut.count()` or `productsOut.count`. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-3228) When DStream save RDD to hdfs , don't create directory and empty file if there are no data received from source in the batch duration .
[ https://issues.apache.org/jira/browse/SPARK-3228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3228. -- Resolution: Won't Fix Given the PR discussion, this sounds like a Won't Fix. When DStream save RDD to hdfs , don't create directory and empty file if there are no data received from source in the batch duration . --- Key: SPARK-3228 URL: https://issues.apache.org/jira/browse/SPARK-3228 Project: Spark Issue Type: Improvement Components: Streaming Reporter: Leo When I use DStream to save files to HDFS, it creates a directory and an empty file named _SUCCESS for each job generated in the batch duration. But if there is no data from the source for a long time, and the duration is very short (e.g. 10s), it creates many directories and empty files in HDFS. I don't think that is necessary. So I want to modify DStream's methods saveAsObjectFiles and saveAsTextFiles so that they create the directory and files only when the RDD's partitions size > 0. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
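A user-side sketch of the same effect, without changing DStream itself (the method name and path prefix are illustrative):
{code}
import org.apache.spark.streaming.dstream.DStream

// Only write a batch out when it actually contains records, so idle batch
// intervals do not leave empty directories and _SUCCESS files on HDFS.
def saveNonEmptyAsTextFiles(stream: DStream[String], prefix: String): Unit = {
  stream.foreachRDD { (rdd, time) =>
    if (rdd.partitions.length > 0 && rdd.take(1).nonEmpty) {
      rdd.saveAsTextFile(prefix + "-" + time.milliseconds)
    }
  }
}
{code}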
[jira] [Resolved] (SPARK-3624) Failed to find Spark assembly in /usr/share/spark/lib for RELEASED debian packages
[ https://issues.apache.org/jira/browse/SPARK-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-3624. -- Resolution: Won't Fix The PR discussion seems to have several votes for Won't Fix. Failed to find Spark assembly in /usr/share/spark/lib for RELEASED debian packages Key: SPARK-3624 URL: https://issues.apache.org/jira/browse/SPARK-3624 Project: Spark Issue Type: Bug Components: Build, Deploy Affects Versions: 1.1.0 Reporter: Christian Tzolov Priority: Minor compute-classpath.sh requires that, for a 'RELEASED' package, the Spark assembly jar is accessible from a <spark home>/lib folder. Currently the jdeb packaging (assembly module) bundles the assembly jar into a folder called 'jars'. The result is: /usr/share/spark/bin/spark-submit --num-executors 10 --master yarn-cluster --class org.apache.spark.examples.SparkPi /usr/share/spark/jars/spark-examples-1.1.0-hadoop2.2.0-gphd-3.0.1.0.jar 10 ls: cannot access /usr/share/spark/lib: No such file or directory Failed to find Spark assembly in /usr/share/spark/lib You need to build Spark before running this program. A trivial solution is to rename the '<prefix>${deb.install.path}/jars</prefix>' inside assembly/pom.xml to <prefix>${deb.install.path}/lib</prefix>. Another, less impactful (considering backward compatibility) solution is to define a lib -> jars symlink in assembly/pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-1317) sbt doesn't work for building Spark programs
[ https://issues.apache.org/jira/browse/SPARK-1317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1317. -- Resolution: Not a Problem sbt doesn't work for building Spark programs Key: SPARK-1317 URL: https://issues.apache.org/jira/browse/SPARK-1317 Project: Spark Issue Type: Bug Components: Build, Documentation Affects Versions: 0.9.0 Reporter: Diana Carroll I don't know if this is a doc bug or a product bug, because I don't know how it is supposed to work. The Spark quick start guide page has a section that walks you through creating a standalone Spark app in Scala. I think the instructions worked in 0.8.1 but I can't get them to work in 0.9.0. The instructions have you create a directory structure in the canonical sbt format, but do not tell you where to locate this directory. However, after setting up the structure, the tutorial then instructs you to use the command {code}sbt/sbt package{code} which implies that the working directory must be SPARK_HOME. I tried it both ways: creating a mysparkapp directory right in SPARK_HOME and creating it in my home directory. Neither worked, with different results: - if I create a mysparkapp directory as instructed in SPARK_HOME, cd to SPARK_HOME and run the command sbt/sbt package as specified, it packages ALL of Spark...but does not build my own app. - if I create a mysparkapp directory elsewhere, cd to that directory, and run the command there, I get an error:
{code}
$SPARK_HOME/sbt/sbt package
awk: cmd. line:1: fatal: cannot open file `./project/build.properties' for reading (No such file or directory)
Attempting to fetch sbt
/usr/lib/spark/sbt/sbt: line 33: sbt/sbt-launch-.jar: No such file or directory
/usr/lib/spark/sbt/sbt: line 33: sbt/sbt-launch-.jar: No such file or directory
Our attempt to download sbt locally to sbt/sbt-launch-.jar failed. Please install sbt manually from http://www.scala-sbt.org/
{code}
So, either: 1: the Spark distribution of sbt can only be used to build Spark itself, not your own code...in which case the quick start guide is wrong, and should instead say that users should install sbt separately OR 2: the Spark distribution of sbt CAN be used, with proper configuration, in which case that configuration should be documented (I wasn't able to figure it out, but I didn't try that hard either) OR 3: the Spark distribution of sbt is *supposed* to be able to build Spark apps, but is configured incorrectly in the product, in which case there's a product bug rather than a doc bug Although this is not a show-stopper, because the obvious workaround is to simply install sbt separately, I think at least updating the docs is pretty high priority, because most people learning Spark start with that Quick Start page, which doesn't work. (If it's doc issue #1, let me know, and I'll fix the docs myself. :-) ) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
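For what it's worth, the route that does work is a separately installed sbt plus a build definition at the root of the application project, along these lines (a sketch matching the 0.9.x-era quick start; the project name and version numbers are illustrative):
{code}
// build.sbt in the mysparkapp directory (not inside SPARK_HOME)
name := "my-spark-app"

version := "0.1"

scalaVersion := "2.10.3"

libraryDependencies += "org.apache.spark" %% "spark-core" % "0.9.0-incubating"

resolvers += "Akka Repository" at "http://repo.akka.io/releases/"
{code}
Running a standalone {{sbt package}} from that directory then builds the application jar; the {{sbt/sbt}} script in the Spark source tree is set up to build Spark itself, which is consistent with option 1 above.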
[jira] [Resolved] (SPARK-1463) cleanup unnecessary dependency jars in the spark assembly jars
[ https://issues.apache.org/jira/browse/SPARK-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1463. -- Resolution: Fixed Fix Version/s: (was: 1.0.0) Fixed at some point, it seems. No longer in the project. cleanup unnecessary dependency jars in the spark assembly jars -- Key: SPARK-1463 URL: https://issues.apache.org/jira/browse/SPARK-1463 Project: Spark Issue Type: Improvement Components: Build Affects Versions: 0.9.0 Reporter: Jenny MA Priority: Minor Labels: easyfix There are a couple of GPL/LGPL-based dependencies included in the final assembly jar which are not used by the Spark runtime. I identified the following libraries; we can provide a fix in assembly/pom.xml: <exclude>com.google.code.findbugs:*</exclude> <exclude>org.acplt:oncrpc:*</exclude> <exclude>glassfish:*</exclude> <exclude>com.cenqua.clover:clover:*</exclude> <exclude>org.glassfish:*</exclude> <exclude>org.glassfish.grizzly:*</exclude> <exclude>org.glassfish.gmbal:*</exclude> <exclude>org.glassfish.external:*</exclude> -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4343) Mima considers protected API methods for exclusion from binary checks.
Prashant Sharma created SPARK-4343: -- Summary: Mima considers protected API methods for exclusion from binary checks. Key: SPARK-4343 URL: https://issues.apache.org/jira/browse/SPARK-4343 Project: Spark Issue Type: Bug Reporter: Prashant Sharma Related SPARK-4335 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4343) Mima considers protected API methods for exclusion from binary checks.
[ https://issues.apache.org/jira/browse/SPARK-4343?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206472#comment-14206472 ] Prashant Sharma commented on SPARK-4343: I am not sure if it's a desired behaviour. From my understanding this might be mostly okay. Mima considers protected API methods for exclusion from binary checks. --- Key: SPARK-4343 URL: https://issues.apache.org/jira/browse/SPARK-4343 Project: Spark Issue Type: Bug Reporter: Prashant Sharma Related SPARK-4335 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4034) get java.lang.NoClassDefFoundError: com/google/common/util/concurrent/ThreadFactoryBuilder in idea
[ https://issues.apache.org/jira/browse/SPARK-4034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206505#comment-14206505 ] Sean Owen commented on SPARK-4034: -- I use IDEA and I have never encountered this. From the PR discussion I'm not clear that the proposed change won't disrupt the rest of the build. I suggest closing? get java.lang.NoClassDefFoundError: com/google/common/util/concurrent/ThreadFactoryBuilder in idea Key: SPARK-4034 URL: https://issues.apache.org/jira/browse/SPARK-4034 Project: Spark Issue Type: Bug Reporter: baishuo -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1216) Add a OneHotEncoder for handling categorical features
[ https://issues.apache.org/jira/browse/SPARK-1216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206526#comment-14206526 ] Sean Owen commented on SPARK-1216: -- [~sandyr] This is basically https://issues.apache.org/jira/browse/SPARK-4081 and Joseph has a PR for it now? Add a OneHotEncoder for handling categorical features - Key: SPARK-1216 URL: https://issues.apache.org/jira/browse/SPARK-1216 Project: Spark Issue Type: Bug Components: MLlib Affects Versions: 0.9.0 Reporter: Sandy Pérez González Assignee: Sandy Ryza It would be nice to add something to MLLib to make it easy to do one-of-K encoding of categorical features. Something like: http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.OneHotEncoder.html -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Reopened] (SPARK-3624) Failed to find Spark assembly in /usr/share/spark/lib for RELEASED debian packages
[ https://issues.apache.org/jira/browse/SPARK-3624?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen reopened SPARK-3624: -- I jumped the gun; per Mark's comments this still deserves some resolution. Failed to find Spark assembly in /usr/share/spark/lib for RELEASED debian packages Key: SPARK-3624 URL: https://issues.apache.org/jira/browse/SPARK-3624 Project: Spark Issue Type: Bug Components: Build, Deploy Affects Versions: 1.1.0 Reporter: Christian Tzolov Priority: Minor compute-classpath.sh requires that, for a 'RELEASED' package, the Spark assembly jar is accessible from a <spark home>/lib folder. Currently the jdeb packaging (assembly module) bundles the assembly jar into a folder called 'jars'. The result is: /usr/share/spark/bin/spark-submit --num-executors 10 --master yarn-cluster --class org.apache.spark.examples.SparkPi /usr/share/spark/jars/spark-examples-1.1.0-hadoop2.2.0-gphd-3.0.1.0.jar 10 ls: cannot access /usr/share/spark/lib: No such file or directory Failed to find Spark assembly in /usr/share/spark/lib You need to build Spark before running this program. A trivial solution is to rename the '<prefix>${deb.install.path}/jars</prefix>' inside assembly/pom.xml to <prefix>${deb.install.path}/lib</prefix>. Another, less impactful (considering backward compatibility) solution is to define a lib -> jars symlink in assembly/pom.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-1830) Deploy failover, Make Persistence engine and LeaderAgent Pluggable.
[ https://issues.apache.org/jira/browse/SPARK-1830?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron Davidson resolved SPARK-1830. --- Resolution: Fixed Fix Version/s: (was: 1.2.0) (was: 1.0.1) 1.3.0 Assignee: Prashant Sharma Deploy failover, Make Persistence engine and LeaderAgent Pluggable. --- Key: SPARK-1830 URL: https://issues.apache.org/jira/browse/SPARK-1830 Project: Spark Issue Type: New Feature Components: Deploy Reporter: Prashant Sharma Assignee: Prashant Sharma Fix For: 1.3.0 With the current code base it is difficult to plug in an external, user-specified persistence engine or leader election agent. It would be good to expose this as a pluggable API. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
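Roughly, the shape of the interfaces being made pluggable (an illustrative sketch, not the exact traits that were merged):
{code}
import scala.reflect.ClassTag

// Standalone Master recovery state behind a user-implementable interface, so a
// custom engine (e.g. backed by a relational database) can replace the
// built-in filesystem and ZooKeeper implementations.
trait PersistenceEngineSketch {
  def persist(name: String, obj: AnyRef): Unit
  def unpersist(name: String): Unit
  def read[T: ClassTag](prefix: String): Seq[T]
}

// Election of the active Master sits behind a similarly small interface.
trait LeaderElectionAgentSketch {
  def stop(): Unit
}
{code}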
[jira] [Created] (SPARK-4344) spark.yarn.user.classpath.first is undocumented
Arun Ahuja created SPARK-4344: - Summary: spark.yarn.user.classpath.first is undocumented Key: SPARK-4344 URL: https://issues.apache.org/jira/browse/SPARK-4344 Project: Spark Issue Type: Documentation Affects Versions: 1.1.0 Reporter: Arun Ahuja Priority: Trivial spark.yarn.user.classpath.first is not documented, while spark.files.userClassPathFirst is, and its documentation does not point to the corresponding YARN parameter. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4344) spark.yarn.user.classpath.first is undocumented
[ https://issues.apache.org/jira/browse/SPARK-4344?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206722#comment-14206722 ] Apache Spark commented on SPARK-4344: - User 'arahuja' has created a pull request for this issue: https://github.com/apache/spark/pull/3209 spark.yarn.user.classpath.first is undocumented --- Key: SPARK-4344 URL: https://issues.apache.org/jira/browse/SPARK-4344 Project: Spark Issue Type: Documentation Affects Versions: 1.1.0 Reporter: Arun Ahuja Priority: Trivial spark.yarn.user.classpath.first is not documented, while spark.files.userClassPathFirst is, and its documentation does not point to the corresponding YARN parameter. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
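For reference, the two settings in question can be set like any other Spark configuration property, for example (a usage sketch; whether both belong in the docs, and how they interact per deploy mode, is what the missing documentation should spell out):
{code}
import org.apache.spark.SparkConf

val conf = new SparkConf()
  // Documented, non-YARN setting:
  .set("spark.files.userClassPathFirst", "true")
  // YARN-specific counterpart that the docs do not mention:
  .set("spark.yarn.user.classpath.first", "true")
{code}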
[jira] [Created] (SPARK-4345) Spark SQL Hive throws exception when drop a none-exist table
Alex Liu created SPARK-4345: --- Summary: Spark SQL Hive throws exception when drop a none-exist table Key: SPARK-4345 URL: https://issues.apache.org/jira/browse/SPARK-4345 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.1.0 Reporter: Alex Liu Priority: Minor When drop a none-exist hive table, an exception is thrown. log {code} scala val t = hql(drop table if exists test_table); warning: there were 1 deprecation warning(s); re-run with -deprecation for details t: org.apache.spark.sql.SchemaRDD = SchemaRDD[13] at RDD at SchemaRDD.scala:103 == Query Plan == == Physical Plan == DropTable test_table, true scala val t = hql(drop table if exists test_table); warning: there were 1 deprecation warning(s); re-run with -deprecation for details 14/11/11 10:21:49 ERROR Hive: NoSuchObjectException(message:default.test_table table not found) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1373) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103) at com.sun.proxy.$Proxy14.get_table(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:854) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) at com.sun.proxy.$Proxy15.getTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:892) at org.apache.hadoop.hive.ql.exec.DDLTask.dropTable(DDLTask.java:3276) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:277) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:298) at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:272) at org.apache.spark.sql.hive.execution.DropTable.sideEffectResult$lzycompute(commands.scala:65) at org.apache.spark.sql.hive.execution.DropTable.sideEffectResult(commands.scala:63) at org.apache.spark.sql.hive.execution.DropTable.execute(commands.scala:71) at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:360) at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:360) at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58) at org.apache.spark.sql.SchemaRDD.init(SchemaRDD.scala:103) at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:106) at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:110) at 
$line28.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.init(console:69) at $line28.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.init(console:74) at $line28.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.init(console:76) at $line28.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.init(console:78) at $line28.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.init(console:80)
[jira] [Created] (SPARK-4346) YarnClientSchedulerBack.asyncMonitorApplication should be common with Client.monitorApplication
Thomas Graves created SPARK-4346: Summary: YarnClientSchedulerBack.asyncMonitorApplication should be common with Client.monitorApplication Key: SPARK-4346 URL: https://issues.apache.org/jira/browse/SPARK-4346 Project: Spark Issue Type: Improvement Reporter: Thomas Graves The YarnClientSchedulerBackend.asyncMonitorApplication routine should move into ClientBase and be made common with monitorApplication. Make sure stop is handled properly. See discussion on https://github.com/apache/spark/pull/3143 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4282) Stopping flag in YarnClientSchedulerBackend should be volatile
[ https://issues.apache.org/jira/browse/SPARK-4282?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves resolved SPARK-4282. -- Resolution: Fixed Fix Version/s: 1.3.0 1.2.0 Assignee: Kousuke Saruta Stopping flag in YarnClientSchedulerBackend should be volatile -- Key: SPARK-4282 URL: https://issues.apache.org/jira/browse/SPARK-4282 Project: Spark Issue Type: Bug Components: YARN Affects Versions: 1.2.0 Reporter: Kousuke Saruta Assignee: Kousuke Saruta Fix For: 1.2.0, 1.3.0 In YarnClientSchedulerBackend, a variable stopping is used as a flag and it's accessed by some threads so it should be volatile. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4347) GradientBoostingSuite takes more than 1 minute to finish
Xiangrui Meng created SPARK-4347: Summary: GradientBoostingSuite takes more than 1 minute to finish Key: SPARK-4347 URL: https://issues.apache.org/jira/browse/SPARK-4347 Project: Spark Issue Type: Improvement Components: MLlib Affects Versions: 1.2.0 Reporter: Xiangrui Meng On a MacBook Pro: {code} [info] GradientBoostingSuite: [info] - Regression with continuous features: SquaredError (22 seconds, 875 milliseconds) [info] - Regression with continuous features: Absolute Error (25 seconds, 652 milliseconds) [info] - Binary classification with continuous features: Log Loss (26 seconds, 604 milliseconds) {code} Maybe we can reduce the size of test data and make it faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4169) [Core] Locale dependent code
[ https://issues.apache.org/jira/browse/SPARK-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or updated SPARK-4169: - Assignee: Niklas Wilcke [Core] Locale dependent code Key: SPARK-4169 URL: https://issues.apache.org/jira/browse/SPARK-4169 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Environment: Debian, Locale: de_DE Reporter: Niklas Wilcke Assignee: Niklas Wilcke Labels: patch, test Fix For: 1.1.1, 1.2.0 Original Estimate: 0.25h Remaining Estimate: 0.25h With a non english locale the method isBindCollision in core/src/main/scala/org/apache/spark/util/Utils.scala doesn't work because it checks the exception message, which is locale dependent. The test suite core/src/test/scala/org/apache/spark/util/UtilsSuite.scala also contains a locale dependent test string formatting of time durations which uses a DecimalSeperator which is locale dependent. I created a pull request on github to solve this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
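To illustrate the locale dependence (a sketch, not the merged patch): matching on the exception type rather than on its message keeps a check like isBindCollision independent of the platform language.
{code}
import java.net.BindException

// A BindException is a bind collision whether its message reads "Address
// already in use" or the de_DE translation of it; walking the cause chain
// covers the case where it arrives wrapped in another exception.
def isBindCollision(exception: Throwable): Boolean = exception match {
  case null => false
  case e: BindException => true
  case e if e.getCause != null => isBindCollision(e.getCause)
  case _ => false
}
{code}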
[jira] [Commented] (SPARK-3682) Add helpful warnings to the UI
[ https://issues.apache.org/jira/browse/SPARK-3682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14206891#comment-14206891 ] Kay Ousterhout commented on SPARK-3682: --- Some of the metrics you mentioned fall under the additional metrics that are hidden by default; as part of this, it might be nice to automatically show a metric as part of warning a user that the value is problematic. Add helpful warnings to the UI -- Key: SPARK-3682 URL: https://issues.apache.org/jira/browse/SPARK-3682 Project: Spark Issue Type: New Feature Components: Web UI Affects Versions: 1.1.0 Reporter: Sandy Ryza Attachments: SPARK-3682Design.pdf Spark has a zillion configuration options and a zillion different things that can go wrong with a job. Improvements like incremental and better metrics and the proposed spark replay debugger provide more insight into what's going on under the covers. However, it's difficult for non-advanced users to synthesize this information and understand where to direct their attention. It would be helpful to have some sort of central location on the UI users could go to that would provide indications about why an app/job is failing or performing poorly. Some helpful messages that we could provide: * Warn that the tasks in a particular stage are spending a long time in GC. * Warn that spark.shuffle.memoryFraction does not fit inside the young generation. * Warn that tasks in a particular stage are very short, and that the number of partitions should probably be decreased. * Warn that tasks in a particular stage are spilling a lot, and that the number of partitions should probably be increased. * Warn that a cached RDD that gets a lot of use does not fit in memory, and a lot of time is being spent recomputing it. To start, probably two kinds of warnings would be most helpful. * Warnings at the app level that report on misconfigurations, issues with the general health of executors. * Warnings at the job level that indicate why a job might be performing slowly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
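As a concrete example of the first kind of warning, something along these lines could back the GC message (purely illustrative; none of these names are existing Spark APIs):
{code}
// Stage-level heuristic: warn when tasks spend more than `threshold` of their
// wall-clock time in GC, one of the warnings proposed in the description.
case class TaskTimes(runTimeMs: Long, gcTimeMs: Long)

def gcWarning(tasks: Seq[TaskTimes], threshold: Double = 0.1): Option[String] = {
  val total = tasks.map(_.runTimeMs).sum.toDouble
  val gc = tasks.map(_.gcTimeMs).sum.toDouble
  if (total > 0 && gc / total > threshold) {
    Some(f"Tasks spent ${100 * gc / total}%.1f%% of their time in GC; " +
      "consider increasing executor memory or caching less data")
  } else {
    None
  }
}
{code}
Surfacing such a warning next to the (normally hidden) GC time metric would fit the suggestion above of automatically showing a metric when its value looks problematic.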
[jira] [Updated] (SPARK-4347) GradientBoostingSuite takes more than 1 minute to finish
[ https://issues.apache.org/jira/browse/SPARK-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng updated SPARK-4347: - Assignee: Manish Amde GradientBoostingSuite takes more than 1 minute to finish Key: SPARK-4347 URL: https://issues.apache.org/jira/browse/SPARK-4347 Project: Spark Issue Type: Improvement Components: MLlib Affects Versions: 1.2.0 Reporter: Xiangrui Meng Assignee: Manish Amde On a MacBook Pro: {code} [info] GradientBoostingSuite: [info] - Regression with continuous features: SquaredError (22 seconds, 875 milliseconds) [info] - Regression with continuous features: Absolute Error (25 seconds, 652 milliseconds) [info] - Binary classification with continuous features: Log Loss (26 seconds, 604 milliseconds) {code} Maybe we can reduce the size of test data and make it faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2205) Unnecessary exchange operators in a join on multiple tables with the same join key.
[ https://issues.apache.org/jira/browse/SPARK-2205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207031#comment-14207031 ] Yin Huai commented on SPARK-2205: - Just a note to myself. It will be good to also look at if outputPartitioning in other physical operators are properly set. For example, the outputPartitioning in LeftSemiJoinHash is using the default UnknownPartitioning. Unnecessary exchange operators in a join on multiple tables with the same join key. --- Key: SPARK-2205 URL: https://issues.apache.org/jira/browse/SPARK-2205 Project: Spark Issue Type: Bug Components: SQL Reporter: Yin Huai Assignee: Yin Huai Priority: Minor {code} hql(select * from src x join src y on (x.key=y.key) join src z on (y.key=z.key)) SchemaRDD[1] at RDD at SchemaRDD.scala:100 == Query Plan == Project [key#4:0,value#5:1,key#6:2,value#7:3,key#8:4,value#9:5] HashJoin [key#6], [key#8], BuildRight Exchange (HashPartitioning [key#6], 200) HashJoin [key#4], [key#6], BuildRight Exchange (HashPartitioning [key#4], 200) HiveTableScan [key#4,value#5], (MetastoreRelation default, src, Some(x)), None Exchange (HashPartitioning [key#6], 200) HiveTableScan [key#6,value#7], (MetastoreRelation default, src, Some(y)), None Exchange (HashPartitioning [key#8], 200) HiveTableScan [key#8,value#9], (MetastoreRelation default, src, Some(z)), None {code} However, this is fine... {code} hql(select * from src x join src y on (x.key=y.key) join src z on (x.key=z.key)) res5: org.apache.spark.sql.SchemaRDD = SchemaRDD[5] at RDD at SchemaRDD.scala:100 == Query Plan == Project [key#26:0,value#27:1,key#28:2,value#29:3,key#30:4,value#31:5] HashJoin [key#26], [key#30], BuildRight HashJoin [key#26], [key#28], BuildRight Exchange (HashPartitioning [key#26], 200) HiveTableScan [key#26,value#27], (MetastoreRelation default, src, Some(x)), None Exchange (HashPartitioning [key#28], 200) HiveTableScan [key#28,value#29], (MetastoreRelation default, src, Some(y)), None Exchange (HashPartitioning [key#30], 200) HiveTableScan [key#30,value#31], (MetastoreRelation default, src, Some(z)), None {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4205) Timestamp and Date objects with comparison operators
[ https://issues.apache.org/jira/browse/SPARK-4205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207037#comment-14207037 ] Apache Spark commented on SPARK-4205: - User 'culler' has created a pull request for this issue: https://github.com/apache/spark/pull/3210 Timestamp and Date objects with comparison operators Key: SPARK-4205 URL: https://issues.apache.org/jira/browse/SPARK-4205 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.0 Reporter: Marc Culler Fix For: 1.1.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1739) Close PR's after period of inactivity
[ https://issues.apache.org/jira/browse/SPARK-1739?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207112#comment-14207112 ] Josh Rosen commented on SPARK-1739: --- My proposal would be to have SparkQA post a comment in the PR that mentions the component maintainers. This could happen once a PR sits inactive or unreviewed for more than X days. I can do this myself, but I'm kind of overloaded with other work so this is going to be a low priority. I'd welcome pull requests for this, though: https://github.com/databricks/spark-pr-dashboard Close PR's after period of inactivity - Key: SPARK-1739 URL: https://issues.apache.org/jira/browse/SPARK-1739 Project: Spark Issue Type: Task Components: Project Infra Reporter: Patrick Wendell Assignee: Josh Rosen Sometimes PR's get abandoned if people aren't responsive to feedback or it just falls to a lower priority. We should automatically close stale PR's in order to keep the queue from growing infinitely. I think we just want to do this with a friendly message that says "This seems inactive; please re-open this if you are interested in contributing the patch." We should also explicitly ping any reviewers (via @mentioning them) and ask them to provide feedback one way or the other, for instance, if the feature is being rejected. This will help us avoid letting features slip through the cracks by forcing some action when there is no activity after 30 days. Also, it's ASF policy that we should really be tracking our feature backlog and prioritization in JIRA and only be using Github for active reviews. I don't think we should close it if there was _no_ feedback from any reviewer - in that case we should leave it open (we should be providing at least some feedback on all incoming patches). -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4326) unidoc is broken on master
[ https://issues.apache.org/jira/browse/SPARK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207156#comment-14207156 ] Marcelo Vanzin commented on SPARK-4326: --- Funny that it doesn't happen on 1.2 since the dependency mess should be the same in both. I'll try this out when I'm done with some other tests, to see if I can figure it out. unidoc is broken on master -- Key: SPARK-4326 URL: https://issues.apache.org/jira/browse/SPARK-4326 Project: Spark Issue Type: Bug Components: Build, Documentation Affects Versions: 1.3.0 Reporter: Xiangrui Meng On master, `jekyll build` throws the following error: {code} [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/AppendOnlyMap.scala:205: value hashInt is not a member of com.google.common.hash.HashFunction [error] private def rehash(h: Int): Int = Hashing.murmur3_32().hashInt(h).asInt() [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala:426: value limit is not a member of object com.google.common.io.ByteStreams [error] val bufferedStream = new BufferedInputStream(ByteStreams.limit(fileStream, end - start)) [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala:558: value limit is not a member of object com.google.common.io.ByteStreams [error] val bufferedStream = new BufferedInputStream(ByteStreams.limit(fileStream, end - start)) [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala:261: value hashInt is not a member of com.google.common.hash.HashFunction [error] private def hashcode(h: Int): Int = Hashing.murmur3_32().hashInt(h).asInt() [error]^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/Utils.scala:37: type mismatch; [error] found : java.util.Iterator[T] [error] required: Iterable[?] [error] collectionAsScalaIterable(ordering.leastOf(asJavaIterator(input), num)).iterator [error] ^ [error] /Users/meng/src/spark/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableOperations.scala:421: value putAll is not a member of com.google.common.cache.Cache[org.apache.hadoop.fs.FileStatus,parquet.hadoop.Footer] [error] footerCache.putAll(newFooters) [error] ^ [warn] /Users/meng/src/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/parquet/FakeParquetSerDe.scala:34: @deprecated now takes two arguments; see the scaladoc. [warn] @deprecated(No code should depend on FakeParquetHiveSerDe as it is only intended as a + [warn] ^ [info] No documentation generated with unsucessful compiler run [warn] two warnings found [error] 6 errors found [error] (spark/scalaunidoc:doc) Scaladoc generation failed [error] Total time: 48 s, completed Nov 10, 2014 1:31:01 PM {code} It doesn't happen on branch-1.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4348) pyspark.mllib.random conflicts with random module
Davies Liu created SPARK-4348: - Summary: pyspark.mllib.random conflicts with random module Key: SPARK-4348 URL: https://issues.apache.org/jira/browse/SPARK-4348 Project: Spark Issue Type: Bug Components: MLlib, PySpark Affects Versions: 1.1.0, 1.2.0 Reporter: Davies Liu Priority: Blocker There are conflicts in two cases: 1. The random module is used by pyspark.mllib.feature; if the first part of sys.path is not '', then the hack in pyspark/__init__.py will fail to fix the conflict. 2. When running tests in mllib/xxx.py, the '' should be popped out before importing anything, or it will fail. The first one is not fully fixed for users; it will introduce problems in some cases, such as:
{code}
import sys
sys.path.insert(0, PATH_OF_MODULE)
import pyspark
# using Word2Vec will fail
{code}
I'd like to rename mllib/random.py as random/_random.py, then in mllib/__init__.py
{code}
import pyspark.mllib._random as random
{code}
cc [~mengxr] [~dorx] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4345) Spark SQL Hive throws exception when drop a none-exist table
[ https://issues.apache.org/jira/browse/SPARK-4345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207171#comment-14207171 ] Alex Liu commented on SPARK-4345: - Swallow NoSuchObjectException exception when drop a none-exist hive table. pull @https://github.com/apache/spark/pull/3211 Spark SQL Hive throws exception when drop a none-exist table Key: SPARK-4345 URL: https://issues.apache.org/jira/browse/SPARK-4345 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.1.0 Reporter: Alex Liu Priority: Minor When drop a none-exist hive table, an exception is thrown. log {code} scala val t = hql(drop table if exists test_table); warning: there were 1 deprecation warning(s); re-run with -deprecation for details t: org.apache.spark.sql.SchemaRDD = SchemaRDD[13] at RDD at SchemaRDD.scala:103 == Query Plan == == Physical Plan == DropTable test_table, true scala val t = hql(drop table if exists test_table); warning: there were 1 deprecation warning(s); re-run with -deprecation for details 14/11/11 10:21:49 ERROR Hive: NoSuchObjectException(message:default.test_table table not found) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_table(HiveMetaStore.java:1373) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:103) at com.sun.proxy.$Proxy14.get_table(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTable(HiveMetaStoreClient.java:854) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) at com.sun.proxy.$Proxy15.getTable(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:950) at org.apache.hadoop.hive.ql.metadata.Hive.getTable(Hive.java:892) at org.apache.hadoop.hive.ql.exec.DDLTask.dropTable(DDLTask.java:3276) at org.apache.hadoop.hive.ql.exec.DDLTask.execute(DDLTask.java:277) at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:151) at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:65) at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1414) at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:1192) at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1020) at org.apache.hadoop.hive.ql.Driver.run(Driver.java:888) at org.apache.spark.sql.hive.HiveContext.runHive(HiveContext.scala:298) at org.apache.spark.sql.hive.HiveContext.runSqlHive(HiveContext.scala:272) at org.apache.spark.sql.hive.execution.DropTable.sideEffectResult$lzycompute(commands.scala:65) at org.apache.spark.sql.hive.execution.DropTable.sideEffectResult(commands.scala:63) at org.apache.spark.sql.hive.execution.DropTable.execute(commands.scala:71) at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd$lzycompute(HiveContext.scala:360) at org.apache.spark.sql.hive.HiveContext$QueryExecution.toRdd(HiveContext.scala:360) at org.apache.spark.sql.SchemaRDDLike$class.$init$(SchemaRDDLike.scala:58) at 
org.apache.spark.sql.SchemaRDD.init(SchemaRDD.scala:103) at org.apache.spark.sql.hive.HiveContext.hiveql(HiveContext.scala:106) at org.apache.spark.sql.hive.HiveContext.hql(HiveContext.scala:110) at $line28.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.init(console:69) at $line28.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.init(console:74) at $line28.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC.init(console:76) at
[jira] [Commented] (SPARK-4326) unidoc is broken on master
[ https://issues.apache.org/jira/browse/SPARK-4326?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207177#comment-14207177 ] Nicholas Chammas commented on SPARK-4326: - Side question: Should we be (or are we already) regularly building the docs to catch these problems at PR/review time? unidoc is broken on master -- Key: SPARK-4326 URL: https://issues.apache.org/jira/browse/SPARK-4326 Project: Spark Issue Type: Bug Components: Build, Documentation Affects Versions: 1.3.0 Reporter: Xiangrui Meng On master, `jekyll build` throws the following error: {code} [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/AppendOnlyMap.scala:205: value hashInt is not a member of com.google.common.hash.HashFunction [error] private def rehash(h: Int): Int = Hashing.murmur3_32().hashInt(h).asInt() [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/ExternalAppendOnlyMap.scala:426: value limit is not a member of object com.google.common.io.ByteStreams [error] val bufferedStream = new BufferedInputStream(ByteStreams.limit(fileStream, end - start)) [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/ExternalSorter.scala:558: value limit is not a member of object com.google.common.io.ByteStreams [error] val bufferedStream = new BufferedInputStream(ByteStreams.limit(fileStream, end - start)) [error] ^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/OpenHashSet.scala:261: value hashInt is not a member of com.google.common.hash.HashFunction [error] private def hashcode(h: Int): Int = Hashing.murmur3_32().hashInt(h).asInt() [error]^ [error] /Users/meng/src/spark/core/src/main/scala/org/apache/spark/util/collection/Utils.scala:37: type mismatch; [error] found : java.util.Iterator[T] [error] required: Iterable[?] [error] collectionAsScalaIterable(ordering.leastOf(asJavaIterator(input), num)).iterator [error] ^ [error] /Users/meng/src/spark/sql/core/src/main/scala/org/apache/spark/sql/parquet/ParquetTableOperations.scala:421: value putAll is not a member of com.google.common.cache.Cache[org.apache.hadoop.fs.FileStatus,parquet.hadoop.Footer] [error] footerCache.putAll(newFooters) [error] ^ [warn] /Users/meng/src/spark/sql/hive/src/main/scala/org/apache/spark/sql/hive/parquet/FakeParquetSerDe.scala:34: @deprecated now takes two arguments; see the scaladoc. [warn] @deprecated(No code should depend on FakeParquetHiveSerDe as it is only intended as a + [warn] ^ [info] No documentation generated with unsucessful compiler run [warn] two warnings found [error] 6 errors found [error] (spark/scalaunidoc:doc) Scaladoc generation failed [error] Total time: 48 s, completed Nov 10, 2014 1:31:01 PM {code} It doesn't happen on branch-1.2. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4349) Spark driver hangs on sc.parallelize() if exception is thrown during serialization
Matt Cheah created SPARK-4349: - Summary: Spark driver hangs on sc.parallelize() if exception is thrown during serialization Key: SPARK-4349 URL: https://issues.apache.org/jira/browse/SPARK-4349 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Matt Cheah Fix For: 1.3.0 Executing the following in the Spark Shell will lead to the Spark Shell hanging after a stack trace is printed. The serializer is set to the Kryo serializer. {code} scala import com.esotericsoftware.kryo.io.Input import com.esotericsoftware.kryo.io.Input scala import com.esotericsoftware.kryo.io.Output import com.esotericsoftware.kryo.io.Output scala class MyKryoSerializable extends com.esotericsoftware.kryo.KryoSerializable { def write (kryo: com.esotericsoftware.kryo.Kryo, output: Output) { throw new com.esotericsoftware.kryo.KryoException; } ; def read (kryo: com.esotericsoftware.kryo.Kryo, input: Input) { throw new com.esotericsoftware.kryo.KryoException; } } defined class MyKryoSerializable scala sc.parallelize(Seq(new MyKryoSerializable, new MyKryoSerializable)).collect {code} A stack trace is printed during serialization as expected, but another stack trace is printed afterwards, indicating that the driver can't recover: {code} 14/11/11 14:10:03 ERROR OneForOneStrategy: actor name [ExecutorActor] is not unique! akka.actor.PostRestartException: exception post restart (class java.io.IOException) at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:249) at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:247) at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:302) at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:297) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:247) at akka.actor.dungeon.FaultHandling$class.faultRecreate(FaultHandling.scala:76) at akka.actor.ActorCell.faultRecreate(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:459) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: akka.actor.InvalidActorNameException: actor name [ExecutorActor] is not unique! 
at akka.actor.dungeon.ChildrenContainer$NormalChildrenContainer.reserve(ChildrenContainer.scala:130) at akka.actor.dungeon.Children$class.reserveChild(Children.scala:77) at akka.actor.ActorCell.reserveChild(ActorCell.scala:369) at akka.actor.dungeon.Children$class.makeChild(Children.scala:202) at akka.actor.dungeon.Children$class.attachChild(Children.scala:42) at akka.actor.ActorCell.attachChild(ActorCell.scala:369) at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:552) at org.apache.spark.executor.Executor.init(Executor.scala:97) at org.apache.spark.scheduler.local.LocalActor.init(LocalBackend.scala:53) at org.apache.spark.scheduler.local.LocalBackend$$anonfun$start$1.apply(LocalBackend.scala:96) at org.apache.spark.scheduler.local.LocalBackend$$anonfun$start$1.apply(LocalBackend.scala:96) at akka.actor.TypedCreatorFunctionConsumer.produce(Props.scala:343) at akka.actor.Props.newActor(Props.scala:252) at akka.actor.ActorCell.newActor(ActorCell.scala:552) at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:234) ... 11 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4349) Spark driver hangs on sc.parallelize() if exception is thrown during serialization
[ https://issues.apache.org/jira/browse/SPARK-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207185#comment-14207185 ] Matt Cheah commented on SPARK-4349: --- I'm investigating this now. Someone can assign to me. Spark driver hangs on sc.parallelize() if exception is thrown during serialization -- Key: SPARK-4349 URL: https://issues.apache.org/jira/browse/SPARK-4349 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Matt Cheah Fix For: 1.3.0 Executing the following in the Spark Shell will lead to the Spark Shell hanging after a stack trace is printed. The serializer is set to the Kryo serializer. {code} scala import com.esotericsoftware.kryo.io.Input import com.esotericsoftware.kryo.io.Input scala import com.esotericsoftware.kryo.io.Output import com.esotericsoftware.kryo.io.Output scala class MyKryoSerializable extends com.esotericsoftware.kryo.KryoSerializable { def write (kryo: com.esotericsoftware.kryo.Kryo, output: Output) { throw new com.esotericsoftware.kryo.KryoException; } ; def read (kryo: com.esotericsoftware.kryo.Kryo, input: Input) { throw new com.esotericsoftware.kryo.KryoException; } } defined class MyKryoSerializable scala sc.parallelize(Seq(new MyKryoSerializable, new MyKryoSerializable)).collect {code} A stack trace is printed during serialization as expected, but another stack trace is printed afterwards, indicating that the driver can't recover: {code} 14/11/11 14:10:03 ERROR OneForOneStrategy: actor name [ExecutorActor] is not unique! akka.actor.PostRestartException: exception post restart (class java.io.IOException) at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:249) at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:247) at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:302) at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:297) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:247) at akka.actor.dungeon.FaultHandling$class.faultRecreate(FaultHandling.scala:76) at akka.actor.ActorCell.faultRecreate(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:459) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: akka.actor.InvalidActorNameException: actor name [ExecutorActor] is not unique! 
at akka.actor.dungeon.ChildrenContainer$NormalChildrenContainer.reserve(ChildrenContainer.scala:130) at akka.actor.dungeon.Children$class.reserveChild(Children.scala:77) at akka.actor.ActorCell.reserveChild(ActorCell.scala:369) at akka.actor.dungeon.Children$class.makeChild(Children.scala:202) at akka.actor.dungeon.Children$class.attachChild(Children.scala:42) at akka.actor.ActorCell.attachChild(ActorCell.scala:369) at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:552) at org.apache.spark.executor.Executor.init(Executor.scala:97) at org.apache.spark.scheduler.local.LocalActor.init(LocalBackend.scala:53) at org.apache.spark.scheduler.local.LocalBackend$$anonfun$start$1.apply(LocalBackend.scala:96) at org.apache.spark.scheduler.local.LocalBackend$$anonfun$start$1.apply(LocalBackend.scala:96) at akka.actor.TypedCreatorFunctionConsumer.produce(Props.scala:343) at akka.actor.Props.newActor(Props.scala:252) at akka.actor.ActorCell.newActor(ActorCell.scala:552) at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:234) ... 11 more {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (SPARK-4348) pyspark.mllib.random conflicts with random module
[ https://issues.apache.org/jira/browse/SPARK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207199#comment-14207199 ] Doris Xin commented on SPARK-4348: -- I fully support this. It took a lot of hacking just to override the default random module in Python, and it wasn't clear if the override was the ideal solution. pyspark.mllib.random conflicts with random module - Key: SPARK-4348 URL: https://issues.apache.org/jira/browse/SPARK-4348 Project: Spark Issue Type: Bug Components: MLlib, PySpark Affects Versions: 1.1.0, 1.2.0 Reporter: Davies Liu Priority: Blocker There are conflicts in two cases: 1. The random module is used by pyspark.mllib.feature; if the first part of sys.path is not '', then the hack in pyspark/__init__.py will fail to fix the conflict. 2. When running tests in mllib/xxx.py, the '' should be popped out before importing anything, or it will fail. The first one is not fully fixed for users; it will introduce problems in some cases, such as: {code} import sys; sys.path.insert(0, PATH_OF_MODULE); import pyspark # using Word2Vec will then fail {code} I'd like to rename mllib/random.py as mllib/_random.py, then in mllib/__init__.py {code} import pyspark.mllib._random as random {code} cc [~mengxr] [~dorx] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4092) Input metrics don't work for coalesce()'d RDD's
[ https://issues.apache.org/jira/browse/SPARK-4092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207190#comment-14207190 ] Kostas Sakellis commented on SPARK-4092: [~aash], yes this should solve a superset of the same problems that SPARK-2630 aims to fix. I say superset because https://github.com/apache/spark/pull/3120 also includes a similar fix when hadoop 2.5 is used with the bytesReadCallback. Input metrics don't work for coalesce()'d RDD's --- Key: SPARK-4092 URL: https://issues.apache.org/jira/browse/SPARK-4092 Project: Spark Issue Type: Bug Components: Spark Core Reporter: Patrick Wendell Assignee: Kostas Sakellis Priority: Critical In every case where we set input metrics (from both Hadoop and block storage) we currently assume that exactly one input partition is computed within the task. This is not a correct assumption in the general case. The main example in the current API is coalesce(), but user-defined RDD's could also be affected. To deal with the most general case, we would need to support the notion of a single task having multiple input sources. A more surgical and less general fix is to simply go to HadoopRDD and check if there are already inputMetrics defined for the task with the same type. If there are, then merge in the new data rather than blowing away the old one. This wouldn't cover case where, e.g. a single task has input from both on-disk and in-memory blocks. It _would_ cover the case where someone calls coalesce on a HadoopRDD... which is more common. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
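The "surgical" fix sketched in the description, merging new input metrics into whatever the task has already recorded instead of replacing them, can be pictured roughly as follows. The names here (InputMetrics, recordRead) are illustrative only, not the actual HadoopRDD/TaskMetrics code:
{code}
// Merge-instead-of-replace sketch: when a task reads a second input split of the same kind,
// add the new bytes to the metrics already recorded for that task.
case class InputMetrics(readMethod: String, var bytesRead: Long)

def recordRead(existing: Option[InputMetrics], readMethod: String, newBytes: Long): InputMetrics =
  existing match {
    case Some(m) if m.readMethod == readMethod =>
      m.bytesRead += newBytes            // merge: accumulate instead of blowing away the old value
      m
    case _ =>
      InputMetrics(readMethod, newBytes) // first (or differently-typed) input source for this task
  }
{code}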
[jira] [Created] (SPARK-4350) aggregate doesn't make copies of zeroValue in local mode
Xiangrui Meng created SPARK-4350: Summary: aggregate doesn't make copies of zeroValue in local mode Key: SPARK-4350 URL: https://issues.apache.org/jira/browse/SPARK-4350 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0, 1.0.2, 1.2.0 Reporter: Xiangrui Meng Priority: Critical RDD.aggregate makes a copy of zeroValue to collect the task result. However, it doesn't make copies of zeroValue for each partition. In local mode, this causes race conditions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
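The race is easiest to see with a mutable zeroValue. A minimal sketch, assuming a local-mode SparkContext `sc` such as the spark-shell provides:
{code}
// If every partition's task ends up reusing the same mutable zeroValue instead of its own copy,
// concurrent seqOp calls corrupt it and the result becomes nondeterministic.
import scala.collection.mutable.ArrayBuffer

val zero = ArrayBuffer.empty[Int]
val result = sc.parallelize(1 to 1000, 8).aggregate(zero)(
  (buf, x) => { buf += x; buf },   // seqOp mutates the buffer in place
  (b1, b2) => { b1 ++= b2; b1 }    // combOp merges partition buffers
)
// With a shared `zero`, result.size is not reliably 1000; copying zeroValue per partition avoids this.
{code}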
[jira] [Closed] (SPARK-2269) Clean up and add unit tests for resourceOffers in MesosSchedulerBackend
[ https://issues.apache.org/jira/browse/SPARK-2269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-2269. Resolution: Fixed Fix Version/s: 1.2.0 Target Version/s: 1.2.0 Clean up and add unit tests for resourceOffers in MesosSchedulerBackend --- Key: SPARK-2269 URL: https://issues.apache.org/jira/browse/SPARK-2269 Project: Spark Issue Type: Bug Components: Mesos Affects Versions: 1.1.0 Reporter: Patrick Wendell Assignee: Tim Chen Fix For: 1.2.0 This function could be simplified a bit. We could re-write it without offerableIndices or creating the mesosTasks array as large as the offer list. There is a lot of logic around making sure you get the correct index into mesosTasks and offers, really we should just build mesosTasks directly from the offers we get back. To associate the tasks we are launching with the offers we can just create a hashMap from the slaveId to the original offer. The basic logic of the function is that you take the mesos offers, convert them to spark offers, then convert the results back. One reason I think it might be designed as it is now is to deal with the case where Mesos gives multiple offers for a single slave. I checked directly with the Mesos team and they said this won't ever happen, you'll get at most one offer per mesos slave within a set of offers. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
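The restructuring described above, building the task list directly from the offers and keeping a map from slave id back to the original offer, might look roughly like the sketch below. The types and the buildLaunches helper are illustrative, not the real MesosSchedulerBackend code:
{code}
case class MesosOffer(offerId: String, slaveId: String, cores: Int)
case class WorkerOffer(executorId: String, host: String, cores: Int)
case class TaskDesc(executorId: String)

def buildLaunches(offers: Seq[MesosOffer],
                  schedule: Seq[WorkerOffer] => Seq[TaskDesc]): Map[String, Seq[TaskDesc]] = {
  val offerBySlave = offers.map(o => o.slaveId -> o).toMap           // slaveId -> original offer
  val sparkOffers  = offers.map(o => WorkerOffer(o.slaveId, o.slaveId, o.cores))
  schedule(sparkOffers)
    .groupBy(_.executorId)                                           // the executorId is the slaveId here
    .map { case (slaveId, tasks) => offerBySlave(slaveId).offerId -> tasks }
}
{code}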
[jira] [Commented] (SPARK-4228) Save a SchemaRDD in JSON format
[ https://issues.apache.org/jira/browse/SPARK-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207262#comment-14207262 ] Apache Spark commented on SPARK-4228: - User 'dwmclary' has created a pull request for this issue: https://github.com/apache/spark/pull/3213 Save a SchemaRDD in JSON format --- Key: SPARK-4228 URL: https://issues.apache.org/jira/browse/SPARK-4228 Project: Spark Issue Type: New Feature Components: SQL Reporter: Yin Huai Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4228) Save a SchemaRDD in JSON format
[ https://issues.apache.org/jira/browse/SPARK-4228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207261#comment-14207261 ] Dan McClary commented on SPARK-4228: Pull request here: https://github.com/apache/spark/pull/3213 Save a SchemaRDD in JSON format --- Key: SPARK-4228 URL: https://issues.apache.org/jira/browse/SPARK-4228 Project: Spark Issue Type: New Feature Components: SQL Reporter: Yin Huai Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3066) Support recommendAll in matrix factorization model
[ https://issues.apache.org/jira/browse/SPARK-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207298#comment-14207298 ] Debasish Das commented on SPARK-3066: - [~mengxr] I am testing the recommendAllUsers and recommendAllProducts API and I will add the code to the RankingMetrics PR: https://github.com/apache/spark/pull/3098 I have not used level-3 BLAS yet since we should be able to re-use the DistributedMatrix API that's coming online (here all the matrices are dense)... I used ideas 1 and 2, and I also added a skipRatings parameter to the API (using that you can skip the ratings that each user has already provided... for the validation I basically skip the train set). Example API: def recommendAllUsers(num: Int, skipUserRatings: RDD[Rating]) = { val skipUsers = skipUserRatings.map { x => ((x.user, x.product), x.rating) }; val productVectors = productFeatures.collect; recommend(productVectors, userFeatures, num, skipUsers) } def recommendAllProducts(num: Int, skipProductRatings: RDD[Rating]) = { val skipProducts = skipProductRatings.map { x => ((x.product, x.user), x.rating) }; val userVectors = userFeatures.collect; recommend(userVectors, productFeatures, num, skipProducts) } Support recommendAll in matrix factorization model -- Key: SPARK-3066 URL: https://issues.apache.org/jira/browse/SPARK-3066 Project: Spark Issue Type: New Feature Components: MLlib Reporter: Xiangrui Meng ALS returns a matrix factorization model, which we can use to predict ratings for individual queries as well as small batches. In practice, users may want to compute top-k recommendations offline for all users. It is very expensive but a common problem. We can do some optimizations like 1) collect one side (either user or product) and broadcast it as a matrix 2) use level-3 BLAS to compute inner products 3) use Utils.takeOrdered to find top-k -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
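The three optimizations listed in the description translate into a simple broadcast-and-score loop. The sketch below is illustrative only (recommendTopK is not the API proposed in the PR above) and assumes the factors are available as RDDs of (id, Array[Double]):
{code}
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD

def recommendTopK(userFeatures: RDD[(Int, Array[Double])],
                  productFeatures: RDD[(Int, Array[Double])],
                  k: Int): RDD[(Int, Seq[(Int, Double)])] = {
  // 1) collect the (smaller) product side and broadcast it to every executor
  val products = userFeatures.sparkContext.broadcast(productFeatures.collect())
  userFeatures.mapValues { uf =>
    // 2) score every product for this user; the JIRA suggests level-3 BLAS, a plain loop keeps the sketch simple
    val scored = products.value.map { case (pid, pf) =>
      (pid, uf.zip(pf).map { case (a, b) => a * b }.sum)
    }
    // 3) keep only the top-k products (the JIRA suggests Utils.takeOrdered; sortBy/take here)
    scored.sortBy { case (_, score) => -score }.take(k).toSeq
  }
}
{code}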
[jira] [Resolved] (SPARK-4157) Task input statistics incomplete when a task reads from multiple locations
[ https://issues.apache.org/jira/browse/SPARK-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Charles Reiss resolved SPARK-4157. -- Resolution: Duplicate Task input statistics incomplete when a task reads from multiple locations -- Key: SPARK-4157 URL: https://issues.apache.org/jira/browse/SPARK-4157 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Charles Reiss Priority: Minor SPARK-1683 introduced tracking of filesystem reads for tasks, but the tracking code assumes that each task reads from exactly one file/cache block, and replaces any prior InputMetrics object for a task after each read. But, for example, a task computing a shuffle-less join (input RDDs are prepartitioned by key) may read two or more cached dependency RDD blocks from cache. In this case, the displayed input size will be for whichever dependency was requested last. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4349) Spark driver hangs on sc.parallelize() if exception is thrown during serialization
[ https://issues.apache.org/jira/browse/SPARK-4349?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207392#comment-14207392 ] Matt Cheah commented on SPARK-4349: --- Investigation showed that the DAGScheduler may not catch un-serializable tasks, and the Task set manager assumes that serialization exceptions are caught in the DAGScheduler. What happens is in DAGScheduler.submitMissingTasks, a Seq of tasks is created and the first task in the set is proactively serialized to check for exceptions. However, in the case of parallel collection partitions and the code I provided above, the first task can be serialized since the first task's partition has an empty array for values, while other tasks in the array may have the actual data that cannot be serialized. I'm not sure what the best way to go forward is. Proactively serializing all of the tasks is too expensive. Spark driver hangs on sc.parallelize() if exception is thrown during serialization -- Key: SPARK-4349 URL: https://issues.apache.org/jira/browse/SPARK-4349 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: Matt Cheah Fix For: 1.3.0 Executing the following in the Spark Shell will lead to the Spark Shell hanging after a stack trace is printed. The serializer is set to the Kryo serializer. {code} scala import com.esotericsoftware.kryo.io.Input import com.esotericsoftware.kryo.io.Input scala import com.esotericsoftware.kryo.io.Output import com.esotericsoftware.kryo.io.Output scala class MyKryoSerializable extends com.esotericsoftware.kryo.KryoSerializable { def write (kryo: com.esotericsoftware.kryo.Kryo, output: Output) { throw new com.esotericsoftware.kryo.KryoException; } ; def read (kryo: com.esotericsoftware.kryo.Kryo, input: Input) { throw new com.esotericsoftware.kryo.KryoException; } } defined class MyKryoSerializable scala sc.parallelize(Seq(new MyKryoSerializable, new MyKryoSerializable)).collect {code} A stack trace is printed during serialization as expected, but another stack trace is printed afterwards, indicating that the driver can't recover: {code} 14/11/11 14:10:03 ERROR OneForOneStrategy: actor name [ExecutorActor] is not unique! 
akka.actor.PostRestartException: exception post restart (class java.io.IOException) at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:249) at akka.actor.dungeon.FaultHandling$$anonfun$6.apply(FaultHandling.scala:247) at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:302) at akka.actor.dungeon.FaultHandling$$anonfun$handleNonFatalOrInterruptedException$1.applyOrElse(FaultHandling.scala:297) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply$mcVL$sp(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:33) at scala.runtime.AbstractPartialFunction$mcVL$sp.apply(AbstractPartialFunction.scala:25) at akka.actor.dungeon.FaultHandling$class.finishRecreate(FaultHandling.scala:247) at akka.actor.dungeon.FaultHandling$class.faultRecreate(FaultHandling.scala:76) at akka.actor.ActorCell.faultRecreate(ActorCell.scala:369) at akka.actor.ActorCell.invokeAll$1(ActorCell.scala:459) at akka.actor.ActorCell.systemInvoke(ActorCell.scala:478) at akka.dispatch.Mailbox.processAllSystemMessages(Mailbox.scala:263) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107) Caused by: akka.actor.InvalidActorNameException: actor name [ExecutorActor] is not unique! at akka.actor.dungeon.ChildrenContainer$NormalChildrenContainer.reserve(ChildrenContainer.scala:130) at akka.actor.dungeon.Children$class.reserveChild(Children.scala:77) at akka.actor.ActorCell.reserveChild(ActorCell.scala:369) at akka.actor.dungeon.Children$class.makeChild(Children.scala:202) at akka.actor.dungeon.Children$class.attachChild(Children.scala:42) at akka.actor.ActorCell.attachChild(ActorCell.scala:369) at akka.actor.ActorSystemImpl.actorOf(ActorSystem.scala:552) at org.apache.spark.executor.Executor.init(Executor.scala:97) at
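The failure mode described in the comment above is easy to see directly in the shell: parallelize() can leave the first partitions empty when there are more slices than elements, so serializing only the first task cannot catch unserializable data that lands in a later partition. This is only an illustration (assuming a SparkContext `sc`), not a fix:
{code}
val rdd = sc.parallelize(Seq("a", "b"), numSlices = 8)
rdd.glom().map(_.length).collect()   // e.g. Array(0, 0, 0, 1, 0, 0, 0, 1): task 0 carries no data
{code}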
[jira] [Updated] (SPARK-4322) Analysis incorrectly rejects accessing grouping fields
[ https://issues.apache.org/jira/browse/SPARK-4322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4322: Assignee: Cheng Lian Analysis incorrectly rejects accessing grouping fields -- Key: SPARK-4322 URL: https://issues.apache.org/jira/browse/SPARK-4322 Project: Spark Issue Type: Bug Components: SQL Reporter: Michael Armbrust Assignee: Cheng Lian Priority: Blocker {code} sqlContext.jsonRDD(sc.parallelize({a: {b: [{c: 1}]}} :: Nil)).registerTempTable(data) sqlContext.sql(SELECT a.b[0].c FROM data GROUP BY a.b[0].c).collect() {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4258) NPE with new Parquet Filters
[ https://issues.apache.org/jira/browse/SPARK-4258?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4258: Assignee: Cheng Lian NPE with new Parquet Filters Key: SPARK-4258 URL: https://issues.apache.org/jira/browse/SPARK-4258 Project: Spark Issue Type: Bug Components: SQL Reporter: Michael Armbrust Assignee: Cheng Lian Priority: Blocker {code} Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 21.0 failed 4 times, most recent failure: Lost task 0.3 in stage 21.0 (TID 160, ip-10-0-247-144.us-west-2.compute.internal): java.lang.NullPointerException: parquet.io.api.Binary$ByteArrayBackedBinary.compareTo(Binary.java:206) parquet.io.api.Binary$ByteArrayBackedBinary.compareTo(Binary.java:162) parquet.filter2.statisticslevel.StatisticsFilter.visit(StatisticsFilter.java:100) parquet.filter2.statisticslevel.StatisticsFilter.visit(StatisticsFilter.java:47) parquet.filter2.predicate.Operators$Eq.accept(Operators.java:162) parquet.filter2.statisticslevel.StatisticsFilter.visit(StatisticsFilter.java:210) parquet.filter2.statisticslevel.StatisticsFilter.visit(StatisticsFilter.java:47) parquet.filter2.predicate.Operators$Or.accept(Operators.java:302) parquet.filter2.statisticslevel.StatisticsFilter.visit(StatisticsFilter.java:201) parquet.filter2.statisticslevel.StatisticsFilter.visit(StatisticsFilter.java:47) parquet.filter2.predicate.Operators$And.accept(Operators.java:290) parquet.filter2.statisticslevel.StatisticsFilter.canDrop(StatisticsFilter.java:52) parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:46) parquet.filter2.compat.RowGroupFilter.visit(RowGroupFilter.java:22) parquet.filter2.compat.FilterCompat$FilterPredicateCompat.accept(FilterCompat.java:108) parquet.filter2.compat.RowGroupFilter.filterRowGroups(RowGroupFilter.java:28) parquet.hadoop.ParquetRecordReader.initializeInternalReader(ParquetRecordReader.java:158) parquet.hadoop.ParquetRecordReader.initialize(ParquetRecordReader.java:138) {code} This occurs when reading parquet data encoded with the older version of the library for TPC-DS query 34. Will work on coming up with a smaller reproduction -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4347) GradientBoostingSuite takes more than 1 minute to finish
[ https://issues.apache.org/jira/browse/SPARK-4347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207435#comment-14207435 ] Apache Spark commented on SPARK-4347: - User 'manishamde' has created a pull request for this issue: https://github.com/apache/spark/pull/3214 GradientBoostingSuite takes more than 1 minute to finish Key: SPARK-4347 URL: https://issues.apache.org/jira/browse/SPARK-4347 Project: Spark Issue Type: Improvement Components: MLlib Affects Versions: 1.2.0 Reporter: Xiangrui Meng Assignee: Manish Amde On a MacBook Pro: {code} [info] GradientBoostingSuite: [info] - Regression with continuous features: SquaredError (22 seconds, 875 milliseconds) [info] - Regression with continuous features: Absolute Error (25 seconds, 652 milliseconds) [info] - Binary classification with continuous features: Log Loss (26 seconds, 604 milliseconds) {code} Maybe we can reduce the size of test data and make it faster. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4351) Record cacheable RDD reads and display RDD miss rates
Charles Reiss created SPARK-4351: Summary: Record cacheable RDD reads and display RDD miss rates Key: SPARK-4351 URL: https://issues.apache.org/jira/browse/SPARK-4351 Project: Spark Issue Type: Improvement Reporter: Charles Reiss Priority: Minor Currently, when Spark fails to keep an RDD cached, there is little visibility to the user (beyond performance effects), especially if the user is not reading executor logs. We could expose this information to the Web UI and the event log like we do for RDD storage information by reporting RDD reads and their results with task metrics. From this, live computation of RDD miss rates is straightforward, and information in the event log would enable more complicated post-hoc analyses. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
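If task metrics carried such read results, a per-RDD miss rate would just be a ratio over the aggregated counters. A minimal sketch of that bookkeeping, with hypothetical names:
{code}
case class RddReadStats(hits: Long, misses: Long) {
  // fraction of cacheable reads that missed the cache; 0.0 when the RDD was never read
  def missRate: Double = if (hits + misses == 0) 0.0 else misses.toDouble / (hits + misses)
}
{code}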
[jira] [Commented] (SPARK-4338) Remove yarn-alpha support
[ https://issues.apache.org/jira/browse/SPARK-4338?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207450#comment-14207450 ] Apache Spark commented on SPARK-4338: - User 'sryza' has created a pull request for this issue: https://github.com/apache/spark/pull/3215 Remove yarn-alpha support - Key: SPARK-4338 URL: https://issues.apache.org/jira/browse/SPARK-4338 Project: Spark Issue Type: Sub-task Components: YARN Reporter: Sandy Ryza -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4352) Incorporate locality preferences in dynamic allocation requests
Sandy Ryza created SPARK-4352: - Summary: Incorporate locality preferences in dynamic allocation requests Key: SPARK-4352 URL: https://issues.apache.org/jira/browse/SPARK-4352 Project: Spark Issue Type: Improvement Components: Spark Core, YARN Affects Versions: 1.2.0 Reporter: Sandy Ryza -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2429) Hierarchical Implementation of KMeans
[ https://issues.apache.org/jira/browse/SPARK-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207463#comment-14207463 ] Yu Ishikawa commented on SPARK-2429: [~rnowling] Sorry for commenting again. Could you tell me what you think about the new function to cut a dendrogram? (For example, do you think we don't need the new function, or that we should instead look for an advantage over KMeans from another point of view?) 1. This algorithm doesn't have an advantage over KMeans in assignment time. 2. The new function generates another model by cutting the dendrogram at a given height, without re-training with different parameters. Thanks, Hierarchical Implementation of KMeans - Key: SPARK-2429 URL: https://issues.apache.org/jira/browse/SPARK-2429 Project: Spark Issue Type: New Feature Components: MLlib Reporter: RJ Nowling Assignee: Yu Ishikawa Priority: Minor Attachments: 2014-10-20_divisive-hierarchical-clustering.pdf, The Result of Benchmarking a Hierarchical Clustering.pdf, benchmark-result.2014-10-29.html, benchmark2.html Hierarchical clustering algorithms are widely used and would make a nice addition to MLlib. Clustering algorithms are useful for determining relationships between clusters as well as offering faster assignment. Discussion on the dev list suggested the following possible approaches: * Top down, recursive application of KMeans * Reuse DecisionTree implementation with a different objective function * Hierarchical SVD It was also suggested that support for distance metrics other than Euclidean, such as negative dot product or cosine, is necessary. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
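The "cut by height" idea from the comment can be pictured with a toy dendrogram. This is an illustrative sketch (the Dendrogram/Leaf/Node types and the cut helper are made up for the example), not the proposed MLlib API:
{code}
sealed trait Dendrogram
case class Leaf(points: Seq[Int]) extends Dendrogram
case class Node(height: Double, left: Dendrogram, right: Dendrogram) extends Dendrogram

// A tree built once by divisive clustering can be cut at any height threshold to produce
// flat clusters, with no re-training under different parameters.
def cut(tree: Dendrogram, height: Double): Seq[Dendrogram] = tree match {
  case Node(h, l, r) if h > height => cut(l, height) ++ cut(r, height)  // keep splitting above the threshold
  case other                       => Seq(other)                        // everything below becomes one flat cluster
}
{code}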
[jira] [Commented] (SPARK-4348) pyspark.mllib.random conflicts with random module
[ https://issues.apache.org/jira/browse/SPARK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207511#comment-14207511 ] Apache Spark commented on SPARK-4348: - User 'davies' has created a pull request for this issue: https://github.com/apache/spark/pull/3216 pyspark.mllib.random conflicts with random module - Key: SPARK-4348 URL: https://issues.apache.org/jira/browse/SPARK-4348 Project: Spark Issue Type: Bug Components: MLlib, PySpark Affects Versions: 1.1.0, 1.2.0 Reporter: Davies Liu Priority: Blocker There are conflicts in two cases: 1. The random module is used by pyspark.mllib.feature; if the first part of sys.path is not '', then the hack in pyspark/__init__.py will fail to fix the conflict. 2. When running tests in mllib/xxx.py, the '' should be popped out before importing anything, or it will fail. The first one is not fully fixed for users; it will introduce problems in some cases, such as: {code} import sys; sys.path.insert(0, PATH_OF_MODULE); import pyspark # using Word2Vec will then fail {code} I'd like to rename mllib/random.py as mllib/_random.py, then in mllib/__init__.py {code} import pyspark.mllib._random as random {code} cc [~mengxr] [~dorx] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4348) pyspark.mllib.random conflicts with random module
[ https://issues.apache.org/jira/browse/SPARK-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207529#comment-14207529 ] Davies Liu commented on SPARK-4348: --- After some experiments, I found it's harder than expected; it still needs some hacks to make it work (see the PR), but I think this hack is safer than before: 1. The rand.py module will not overwrite the default random module, so it's safe to run mllib/xxx.py without hacking, and we also do not need a hack to use random in the mllib package. 2. The RandomModuleHook is only installed when a user tries to import 'pyspark.mllib', and it also only works for 'pyspark.mllib.random'. Note: in order to use the default random module, we need 'from __future__ import absolute_import' in the caller module, this also need as more. Without this, 'import random' can be translated as 'from pyspark.mllib import random'. So, there is a bug in master (Word2Vec) pyspark.mllib.random conflicts with random module - Key: SPARK-4348 URL: https://issues.apache.org/jira/browse/SPARK-4348 Project: Spark Issue Type: Bug Components: MLlib, PySpark Affects Versions: 1.1.0, 1.2.0 Reporter: Davies Liu Priority: Blocker There are conflicts in two cases: 1. The random module is used by pyspark.mllib.feature; if the first part of sys.path is not '', then the hack in pyspark/__init__.py will fail to fix the conflict. 2. When running tests in mllib/xxx.py, the '' should be popped out before importing anything, or it will fail. The first one is not fully fixed for users; it will introduce problems in some cases, such as: {code} import sys; sys.path.insert(0, PATH_OF_MODULE); import pyspark # using Word2Vec will then fail {code} I'd like to rename mllib/random.py as mllib/_random.py, then in mllib/__init__.py {code} import pyspark.mllib._random as random {code} cc [~mengxr] [~dorx] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4038) Outlier Detection Algorithm for MLlib
[ https://issues.apache.org/jira/browse/SPARK-4038?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207550#comment-14207550 ] Ashutosh Trivedi commented on SPARK-4038: - A similar issue was opened at Mahout: https://issues.apache.org/jira/browse/MAHOUT-384 [~sowen] What are your thoughts? I saw you were helping with the patch there. Please see the following for discussion on it: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-Contributing-Algorithm-for-Outlier-Detection-td8880.html Outlier Detection Algorithm for MLlib - Key: SPARK-4038 URL: https://issues.apache.org/jira/browse/SPARK-4038 Project: Spark Issue Type: New Feature Components: MLlib Reporter: Ashutosh Trivedi Priority: Minor The aim of this JIRA is to discuss which parallel outlier detection algorithms can be included in MLlib. The one I am familiar with is Attribute Value Frequency (AVF). It scales linearly with the number of data points and attributes, and relies on a single data scan. It is not distance based and is well suited for categorical data. In the original paper a parallel version is also given, which is not complicated to implement. I am working on the implementation and will soon submit the initial code for review. Here is the link to the paper: http://ieeexplore.ieee.org/xpl/articleDetails.jsp?arnumber=4410382 As pointed out by Xiangrui in the discussion at http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-Contributing-Algorithm-for-Outlier-Detection-td8880.html there are other algorithms as well. Let's discuss which will be more general and easily parallelized. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
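For reference, the AVF scoring described above can be sketched in a few lines of plain Scala. This is only an illustration of the idea (the avfScores helper is made up for the example), not the parallel MLlib implementation under discussion: a record's score is the mean frequency of its attribute values, and the lowest scores are flagged as outliers.
{code}
def avfScores(data: Seq[Array[String]]): Seq[Double] = {
  val numAttrs = data.head.length
  // frequency of each (attribute index, value) pair, computed from a single pass over the data
  val freq: Map[(Int, String), Long] =
    data.flatMap(row => row.zipWithIndex.map { case (v, i) => (i, v) })
        .groupBy(identity).map { case (k, vs) => k -> vs.size.toLong }
  // a record's AVF score is the mean frequency of its attribute values
  data.map { row =>
    row.zipWithIndex.map { case (v, i) => freq((i, v)).toDouble }.sum / numAttrs
  }
}

// Usage: the record ("red", "xl") gets the lowest score because both of its values are rare.
val pts = Seq(Array("blue", "s"), Array("blue", "s"), Array("blue", "m"), Array("red", "xl"))
val outlierIdx = avfScores(pts).zipWithIndex.minBy(_._1)._2   // -> 3
{code}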
[jira] [Created] (SPARK-4353) Delete the val that never used
DoingDone9 created SPARK-4353: - Summary: Delete the val that never used Key: SPARK-4353 URL: https://issues.apache.org/jira/browse/SPARK-4353 Project: Spark Issue Type: Wish Reporter: DoingDone9 Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4353) Delete the val that never used
[ https://issues.apache.org/jira/browse/SPARK-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4353: -- Issue Type: Improvement (was: Wish) Delete the val that never used -- Key: SPARK-4353 URL: https://issues.apache.org/jira/browse/SPARK-4353 Project: Spark Issue Type: Improvement Reporter: DoingDone9 Priority: Minor -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4353) Delete the val that never used
[ https://issues.apache.org/jira/browse/SPARK-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4353: -- Component/s: SQL Delete the val that never used -- Key: SPARK-4353 URL: https://issues.apache.org/jira/browse/SPARK-4353 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor dbName in Catalog is never used, as in: val (dbName, tblName) = processDatabaseAndTableName(databaseName, tableName); tables -= tblName. I think it should be deleted; it should be val tblName = processDatabaseAndTableName(databaseName, tableName)._2 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4353) Delete the val that never used
[ https://issues.apache.org/jira/browse/SPARK-4353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] DoingDone9 updated SPARK-4353: -- Description: dbName in Catalog is never used, as in: val (dbName, tblName) = processDatabaseAndTableName(databaseName, tableName); tables -= tblName. I think it should be deleted; it should be val tblName = processDatabaseAndTableName(databaseName, tableName)._2 Delete the val that never used -- Key: SPARK-4353 URL: https://issues.apache.org/jira/browse/SPARK-4353 Project: Spark Issue Type: Improvement Components: SQL Reporter: DoingDone9 Priority: Minor dbName in Catalog is never used, as in: val (dbName, tblName) = processDatabaseAndTableName(databaseName, tableName); tables -= tblName. I think it should be deleted; it should be val tblName = processDatabaseAndTableName(databaseName, tableName)._2 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4351) Record cacheable RDD reads and display RDD miss rates
[ https://issues.apache.org/jira/browse/SPARK-4351?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207566#comment-14207566 ] Apache Spark commented on SPARK-4351: - User 'woggle' has created a pull request for this issue: https://github.com/apache/spark/pull/3218 Record cacheable RDD reads and display RDD miss rates - Key: SPARK-4351 URL: https://issues.apache.org/jira/browse/SPARK-4351 Project: Spark Issue Type: Improvement Reporter: Charles Reiss Priority: Minor Currently, when Spark fails to keep an RDD cached, there is little visibility to the user (beyond performance effects), especially if the user is not reading executor logs. We could expose this information to the Web UI and the event log like we do for RDD storage information by reporting RDD reads and their results with task metrics. From this, live computation of RDD miss rates is straightforward, and information in the event log would enable more complicated post-hoc analyses. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3630) Identify cause of Kryo+Snappy PARSING_ERROR
[ https://issues.apache.org/jira/browse/SPARK-3630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14207568#comment-14207568 ] Ryan Williams commented on SPARK-3630: -- I'm seeing many Snappy {{FAILED_TO_UNCOMPRESS(5)}} and {{PARSING_ERROR(2)}} errors. I just built Spark yesterday off of [227488d|https://github.com/apache/spark/commit/227488d], so I expected that to have picked up some of the fixes detailed in this thread. I am running on a Yarn cluster whose 100 nodes have kernel 2.6.32, so in a few of these attempts I used {{spark.file.transferTo=false}} and still saw these errors. Here are some notes about some of my runs, along with the stdout I got: * 1000 partitions, {{spark.file.transferTo=false}}: [stdout|https://www.dropbox.com/s/141keqpojucfbai/logs.1000?dl=0]. This was my latest run; it took a while to get to my reduceByKeyLocally stage, and immediately upon finishing the preceding stage it emitted ~190K {{FetchFailure}}s over ~200 attempts of the stage in about one minute, followed by some Snappy errors and the job shutting down. * 2000 partitions, {{spark.file.transferTo=false}}: [stdout|https://www.dropbox.com/s/jr1dsldodq4rvbz/logs.2000?dl=0]. This one had ~150 FetchFailures out of the gate, seemingly ran fine for ~8mins, then had a futures timeout, seemingly ran fine for another ~17m, then got to my reduceByKeyLocally stage and died from Snappy errors. * 2000 partitions, {{spark.file.transferTo=true}}: [stdout|https://www.dropbox.com/s/9n24ffcdq0j43ue/logs.2000.tt?dl=0]. Before running the above two, I was hoping that {{spark.file.transferTo=false}} was going to fix my problems, so I ran this to see whether 2000 partitions was the determining factor in the Snappy errors happening, as [~joshrosen] suggested in this thread. No such luck! ~15 FetchFailures right away, ran fine for 24mins, got to reduceByKeyLocally phase, Snappy-failed and died. * these and other stdout logs can be found [here|https://www.dropbox.com/sh/pn0bik3tvy73wfi/AAByFlQVJ3QUOqiKYKXt31RGa?dl=0] In all of these I was running on a dataset (~170GB) that should be easily handled by my cluster (5TB RAM total), and in fact I successfully ran this job against this dataset last night using a Spark 1.1 build. That job was dying of FetchFailures when I tried to run against a larger dataset (~300GB), and I thought maybe I needed shuffle sorting or external shuffle service, or other 1.2.0 goodies, so I've been trying to run with 1.2.0 but can't get anything to finish. This job reads a file in from hadoop, coalesces to the number of partitions I've asked for, and does a {{flatMap}}, a {{reduceByKey}}, a map, and a {{reduceByKeyLocally}}. I am pretty confident that the {{Map}} I'm materializing onto the driver in the {{reduceByKeyLocally}} is a reasonable size; it's a {{Map[Long, Long]}} with about 40K entries, and I've actually successfully run this job on this data to materialize that exact map at different points this week, as I mentioned before. Something causes this job to die almost immediately upon starting the {{reduceByKeyLocally}} phase, however, usually just with Snappy errors, but with a preponderance of FetchFailures preceding them in my last attempt. Let me know what other information I can provide that might be useful. Thanks!
Identify cause of Kryo+Snappy PARSING_ERROR --- Key: SPARK-3630 URL: https://issues.apache.org/jira/browse/SPARK-3630 Project: Spark Issue Type: Task Components: Spark Core Affects Versions: 1.1.0, 1.2.0 Reporter: Andrew Ash Assignee: Josh Rosen A recent GraphX commit caused non-deterministic exceptions in unit tests so it was reverted (see SPARK-3400). Separately, [~aash] observed the same exception stacktrace in an application-specific Kryo registrator: {noformat} com.esotericsoftware.kryo.KryoException: java.io.IOException: failed to uncompress the chunk: PARSING_ERROR(2) com.esotericsoftware.kryo.io.Input.fill(Input.java:142) com.esotericsoftware.kryo.io.Input.require(Input.java:169) com.esotericsoftware.kryo.io.Input.readInt(Input.java:325) com.esotericsoftware.kryo.io.Input.readFloat(Input.java:624) com.esotericsoftware.kryo.serializers.DefaultSerializers$FloatSerializer.read(DefaultSerializers.java:127) com.esotericsoftware.kryo.serializers.DefaultSerializers$FloatSerializer.read(DefaultSerializers.java:117) com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732) com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:109) com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18) com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:732) ... {noformat} This ticket is to identify the cause of the exception