serialization stack overflow error during reduce on nested objects
Hi, I'm working on an RDD of tuples of objects that represent trees (a Node containing a hashmap of child nodes). I'm trying to aggregate these trees over the RDD. Let's take an example with 2 graphs:

C - D - B - A - D - B - E
F - E - B - A - D - B - F

I split each graph at the vertex A, resulting in:

(B(1, D(1, C(1,))) , D(1, B(1, E(1,)))
(B(1, E(1, F(1,))) , D(1, B(1, F(1,)))

and I want to aggregate both graphs to get:

(B(2, (D(1, C(1,)), E(1, F(1, , D(2, B(2, (E(1,), F(1,)))

Some graphs are potentially large (4000+ vertices), but I'm not supposed to have any cyclic references. When I run my program I get this error:

java.lang.StackOverflowError
    at com.esotericsoftware.kryo.util.DefaultClassResolver.readName(DefaultClassResolver.java:127)
    at com.esotericsoftware.kryo.util.DefaultClassResolver.readClass(DefaultClassResolver.java:115)
    at com.esotericsoftware.kryo.Kryo.readClass(Kryo.java:610)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:721)
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:109)
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)
    at com.esotericsoftware.kryo.serializers.FieldSerializer$ObjectField.read(FieldSerializer.java:605)
    at com.esotericsoftware.kryo.serializers.FieldSerializer.read(FieldSerializer.java:221)
    at com.esotericsoftware.kryo.Kryo.readClassAndObject(Kryo.java:729)
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:109)
    at com.esotericsoftware.kryo.serializers.CollectionSerializer.read(CollectionSerializer.java:18)
    at com.esotericsoftware.kryo.Kryo.readObject(Kryo.java:648)

I've tried to increase the stack size and to use the standard Java serializer, but with no effect. Any hint about the cause of this error, or ways to change my code to solve it?
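For illustration (this sketch is not from the original post; the Node class and flatten method are hypothetical): the trace shows Kryo's readClassAndObject recursing once per nesting level, so a ~4000-deep tree can exhaust the stack even after raising -Xss. One workaround is to flatten each tree into a non-recursive representation (e.g. a path -> count map) before the shuffle, and aggregate the flat maps instead. A plain-Java sketch of the iterative flattening, using an explicit stack so no step recurses:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical Node type mirroring "Node containing a hashmap of nodes".
class Node {
    final String label;
    int count;
    final Map<String, Node> children = new HashMap<>();
    Node(String label, int count) { this.label = label; this.count = count; }
}

public class FlattenTree {
    // Walk the tree with an explicit stack, collecting (path -> count)
    // entries; serializing the resulting flat map needs no deep recursion.
    static Map<String, Integer> flatten(Node root) {
        Map<String, Integer> flat = new HashMap<>();
        Deque<Object[]> stack = new ArrayDeque<>();
        stack.push(new Object[]{root, root.label});
        while (!stack.isEmpty()) {
            Object[] top = stack.pop();
            Node n = (Node) top[0];
            String path = (String) top[1];
            // Summing on merge means two flattened trees can be
            // aggregated by merging their maps the same way.
            flat.merge(path, n.count, Integer::sum);
            for (Node c : n.children.values()) {
                stack.push(new Object[]{c, path + "/" + c.label});
            }
        }
        return flat;
    }

    public static void main(String[] args) {
        // B(1, D(1, C(1,))) from the example above
        Node b = new Node("B", 1);
        Node d = new Node("D", 1);
        Node c = new Node("C", 1);
        b.children.put("D", d);
        d.children.put("C", c);
        Map<String, Integer> flat = flatten(b);
        System.out.println(flat.get("B/D/C")); // 1
        System.out.println(flat.size());       // 3
    }
}
```

Two flattened trees can then be reduced with a plain map merge (summing counts per path), which keeps both the shuffle payload and the deserialization flat.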
-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/serialization-stakeoverflow-error-during-reduce-on-nested-objects-tp22040.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional commands, e-mail: user-h...@spark.apache.org
Re: Spark webUI - application details page
Hi, I don't have any history server running. As SK already pointed out in a previous post, the history server seems to be required only in Mesos or YARN mode, not in standalone mode. https://spark.apache.org/docs/1.1.1/monitoring.html

"If Spark is run on Mesos or YARN, it is still possible to reconstruct the UI of a finished application through Spark’s history server, provided that the application’s event logs exist."

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p21379.html
Re: Spark webUI - application details page
Hi, I have a similar problem. I want to see the detailed logs of Completed Applications, so I've set in my program:

set("spark.eventLog.enabled","true").
set("spark.eventLog.dir","file:/tmp/spark-events")

but when I click on the application in the webui, I get a page with the message:

Application history not found (app-20150126000651-0331)
No event logs found for application xxx$ in file:/tmp/spark-events/xxx-147211500. Did you specify the correct logging directory?

despite the fact that the directory exists and contains 3 files:

APPLICATION_COMPLETE*
EVENT_LOG_1*
SPARK_VERSION_1.1.0*

I use Spark 1.1.0 on a standalone cluster with 3 nodes. Any suggestion to solve the problem? Thanks.

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-webUI-application-details-page-tp3490p21358.html
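One thing worth checking on a multi-node standalone cluster (this note is not from the original message): with file:/tmp/spark-events, the driver writes event logs to its own local filesystem, while the standalone Master, which serves the Completed Applications history pages, reads from /tmp/spark-events on its own host. Unless the driver always runs on the Master's host, a location visible to both is typically needed, e.g. (the HDFS URI below is a placeholder):

```
set("spark.eventLog.enabled","true").
set("spark.eventLog.dir","hdfs://namenode:8020/spark-events")
```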
key already cancelled error
Hi everyone, I'm writing a program that updates a Cassandra table. I've written a first version where I update the table row by row from an RDD through a map. Now I want to build a batch of updates using the same kind of syntax as in this thread: https://groups.google.com/forum/#!msg/spark-users/LUb7ZysYp2k/MhymcFddb8cJ

But as soon as I use mapPartitions I get a "key already cancelled" error. The program updates the table properly, but it seems that the problem appears when the driver tries to shut down the resources.

15/01/26 00:07:00 INFO SparkContext: Job finished: collect at CustomerIdReconciliation.scala:143, took 1.998601568 s
15/01/26 00:07:00 INFO SparkUI: Stopped Spark web UI at http://cim1-dev:4044
15/01/26 00:07:00 INFO DAGScheduler: Stopping DAGScheduler
15/01/26 00:07:00 INFO SparkDeploySchedulerBackend: Shutting down all executors
15/01/26 00:07:00 INFO SparkDeploySchedulerBackend: Asking each executor to shut down
15/01/26 00:07:00 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(cim1-dev2,52516)
15/01/26 00:07:00 INFO ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(cim1-dev2,52516)
15/01/26 00:07:00 ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId(cim1-dev2,52516) not found
15/01/26 00:07:00 INFO ConnectionManager: Key not valid ? sun.nio.ch.SelectionKeyImpl@7cedcb23
15/01/26 00:07:00 INFO ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl@7cedcb23
java.nio.channels.CancelledKeyException
    at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
    at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
15/01/26 00:07:00 INFO ConnectionManager: Key not valid ? sun.nio.ch.SelectionKeyImpl@38e8c534
15/01/26 00:07:00 INFO ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl@38e8c534
java.nio.channels.CancelledKeyException
    at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:310)
    at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
15/01/26 00:07:00 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(cim1-dev,44773)
15/01/26 00:07:00 INFO ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(cim1-dev3,29293)
15/01/26 00:07:00 INFO ConnectionManager: Removing SendingConnection to ConnectionManagerId(cim1-dev3,29293)
15/01/26 00:07:00 INFO ConnectionManager: Key not valid ? sun.nio.ch.SelectionKeyImpl@159adcf5
15/01/26 00:07:00 INFO ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl@159adcf5
java.nio.channels.CancelledKeyException
    at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:386)
    at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
15/01/26 00:07:00 INFO ConnectionManager: Removing ReceivingConnection to ConnectionManagerId(cim1-dev,44773)
15/01/26 00:07:00 ERROR ConnectionManager: Corresponding SendingConnection to ConnectionManagerId(cim1-dev,44773) not found
15/01/26 00:07:00 INFO ConnectionManager: Key not valid ? sun.nio.ch.SelectionKeyImpl@329a6d86
15/01/26 00:07:00 INFO ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl@329a6d86
java.nio.channels.CancelledKeyException
    at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:310)
    at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
15/01/26 00:07:00 INFO ConnectionManager: Key not valid ? sun.nio.ch.SelectionKeyImpl@3d3e86d5
15/01/26 00:07:00 INFO ConnectionManager: key already cancelled ? sun.nio.ch.SelectionKeyImpl@3d3e86d5
java.nio.channels.CancelledKeyException
    at org.apache.spark.network.ConnectionManager.run(ConnectionManager.scala:310)
    at org.apache.spark.network.ConnectionManager$$anon$4.run(ConnectionManager.scala:139)
15/01/26 00:07:01 INFO MapOutputTrackerMasterActor: MapOutputTrackerActor stopped!
15/01/26 00:07:01 INFO ConnectionManager: Selector thread was interrupted!
15/01/26 00:07:01 INFO ConnectionManager: ConnectionManager stopped
15/01/26 00:07:01 INFO MemoryStore: MemoryStore cleared
15/01/26 00:07:01 INFO BlockManager: BlockManager stopped
15/01/26 00:07:01 INFO BlockManagerMaster: BlockManagerMaster stopped
15/01/26 00:07:01 INFO RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/01/26 00:07:01 INFO RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/01/26 00:07:01 INFO SparkContext: Successfully stopped SparkContext

I've tried to set these 2 options but it doesn't change anything:

set("spark.core.connection.ack.wait.timeout","600")
set("spark.akka.frameSize","50")

Thanks for your help.

-- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/key-already-cancelled-error-tp21357.html
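For reference (not the poster's code; names are illustrative and there is no Spark or Cassandra dependency here), the core of the per-partition batching idea is just chunking the partition's row iterator so one statement is issued per chunk instead of per row. A minimal sketch of that chunking, which would run inside mapPartitions:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

public class Batching {
    // Group an iterator into fixed-size batches: the same pattern used
    // inside mapPartitions to send one Cassandra batch per chunk
    // rather than one write per row.
    static <T> List<List<T>> batches(Iterator<T> it, int size) {
        List<List<T>> out = new ArrayList<>();
        List<T> current = new ArrayList<>(size);
        while (it.hasNext()) {
            current.add(it.next());
            if (current.size() == size) {
                out.add(current);
                current = new ArrayList<>(size);
            }
        }
        if (!current.isEmpty()) out.add(current); // flush the tail
        return out;
    }

    public static void main(String[] args) {
        List<Integer> rows = List.of(1, 2, 3, 4, 5);
        List<List<Integer>> b = batches(rows.iterator(), 2);
        System.out.println(b); // [[1, 2], [3, 4], [5]]
    }
}
```

In the real mapPartitions body, any client session opened for the partition should be closed before the function returns; a connection left open past the job's end is one plausible (unconfirmed) source of shutdown-time noise like the log above.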