[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069300#comment-16069300 ] Helena Edelson commented on SPARK-18057: IMHO, name it kafka-0-11 to be explicit, and wait until Kafka 0.11.1.0, which per https://issues.apache.org/jira/browse/KAFKA-4879 resolves the last blocker to upgrading? > Update structured streaming kafka from 10.0.1 to 10.2.0 > --- > > Key: SPARK-18057 > URL: https://issues.apache.org/jira/browse/SPARK-18057 > Project: Spark > Issue Type: Improvement > Components: Structured Streaming > Reporter: Cody Koeninger > > There are a couple of relevant KIPs here, > https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000130#comment-16000130 ] Helena Edelson edited comment on SPARK-18057 at 5/8/17 8:23 PM: It's not that simple; the PR I have queued for this required some code changes in the upgrade. It's not just a dependency addition/exclusion. was (Author: helena_e): Did that a while ago; my only point is that, ideally, we should not modify artifacts by adding and excluding in builds.
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000130#comment-16000130 ] Helena Edelson commented on SPARK-18057: Did that a while ago; my only point is that, ideally, we should not modify artifacts by adding and excluding dependencies in builds.
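For context, the add-and-exclude approach being discussed (the build-level workaround the comment argues against as an ideal) typically looks like this in an sbt build; the coordinates and versions below are illustrative assumptions, not a recommendation:

```scala
// build.sbt fragment (illustrative): pull in the Spark Kafka source but
// exclude its transitive kafka-clients, then pin a newer client version.
// This is the kind of build-level artifact surgery the comment says to avoid.
libraryDependencies ++= Seq(
  ("org.apache.spark" %% "spark-sql-kafka-0-10" % "2.1.1")
    .exclude("org.apache.kafka", "kafka-clients"),
  "org.apache.kafka" % "kafka-clients" % "0.10.2.0"
)
```

Whether a newer client actually works against the older Spark code is exactly the open question in this ticket, which is why an upstream version bump is preferable to this kind of exclusion.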
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996166#comment-15996166 ] Helena Edelson commented on SPARK-18057: With the current 0.10.0.1 version we have several issues, forcing us into ever tighter situations. Much of this stems from constraints around functionality in later Kafka releases, such as Kafka security and SASL_SSL, and related behavior not present in previous Kafka versions. Users in our ecosystem cannot delete topics on clusters, so that is not our relevant use case; it seems only structured streaming Kafka does deleteTopic, vs spark-streaming-kafka. I've had to create an internal fork so that we can use Kafka 0.10.2.0 in Spark, which is bad, but we are blocked otherwise. [~ijuma] good to know on the timing. A group of us voted for https://issues.apache.org/jira/browse/KAFKA-4879.
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983832#comment-15983832 ] Helena Edelson commented on SPARK-18057: It is the timeout. I think waiting is better; I'll be watching that ticket in Kafka.
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983796#comment-15983796 ] Helena Edelson commented on SPARK-18057: I have a branch off branch-2.2 with the 0.10.2.0 upgrade and changes done. All the delete-topic-related tests fail (mainly just in streaming Kafka SQL). I can open a PR with those few tests commented out, but that doesn't sound right. Or should I wait to PR?
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979788#comment-15979788 ] Helena Edelson commented on SPARK-18057: Confirming that https://issues.apache.org/jira/browse/KAFKA-4879 - KafkaConsumer.position may hang forever when deleting a topic - is the only blocker. I upgraded in my fork with some minor code changes and the delete-related tests in spark-sql-kafka-0-10 hang. I can submit this as a PR as soon as that is resolved.
[jira] [Issue Comment Deleted] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-18057: --- Comment: was deleted (was: There's an RC for 0.10.2.1 that's been open for a while. - It introduces a backward-compatible protocol (new clients can talk to old brokers and vice versa). - There are many fixes in this RC: https://issues.apache.org/jira/browse/KAFKA-4198?jql=fixVersion%20%3D%200.10.2.1%20AND%20project%20%3D%20KAFKA)
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979678#comment-15979678 ] Helena Edelson commented on SPARK-18057: There's an RC for 0.10.2.1 that's been open for a while. - It introduces a backward-compatible protocol (new clients can talk to old brokers and vice versa). - There are many fixes in this RC: https://issues.apache.org/jira/browse/KAFKA-4198?jql=fixVersion%20%3D%200.10.2.1%20AND%20project%20%3D%20KAFKA
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977896#comment-15977896 ] Helena Edelson commented on SPARK-18057: I think this fix in 0.10.2.0 was a big part of it: https://issues.apache.org/jira/browse/KAFKA-4547. I saw that behavior. Possible concerns: - https://issues.apache.org/jira/browse/SPARK-18779 - I've seen this - https://issues.apache.org/jira/browse/KAFKA-4879 - Not seen this; noted by Michael and [~zsxwing]
[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977690#comment-15977690 ] Helena Edelson edited comment on SPARK-18057 at 4/20/17 10:15 PM: -- Hi [~marmbrus], 0.10.2.0 is out. When I modify the Kafka version, tests pass - as opposed to the failures with 0.10.1.x. was (Author: helena_e): Hi [~marmbrus], 0.10.2.0 is out. When I modify the kafka version, tests pass.
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977690#comment-15977690 ] Helena Edelson commented on SPARK-18057: Hi [~marmbrus], 0.10.2.0 is out. When I modify the Kafka version, tests pass.
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.1.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15838274#comment-15838274 ] Helena Edelson commented on SPARK-18057: Thanks [~c...@koeninger.org]. Chatting with [~tdas] today.
[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.1.0
[ https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837791#comment-15837791 ] Helena Edelson commented on SPARK-18057: I tried this upgrade as a cursory attempt, changing only the version to see the behavior, and definitely ran into offset logic changes that will need to happen in the Spark streaming Kafka code. I expected the offset behavior change, but I didn't investigate enough to clarify where in Kafka it comes from. This is a very important issue for us; we need this Kafka upgrade to 0.10.1.0.
[jira] [Commented] (SPARK-19185) ConcurrentModificationExceptions with CachedKafkaConsumers when Windowing
[ https://issues.apache.org/jira/browse/SPARK-19185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827215#comment-15827215 ] Helena Edelson commented on SPARK-19185: I've seen this as well; the exceptions, as expected, are never raised if not using the cache. Spark 2.1. The exception is raised in the seek function. I added an opt-in config for that temporarily, but will work on a better solution. Perf hits aren't something I can take :) > ConcurrentModificationExceptions with CachedKafkaConsumers when Windowing > - > > Key: SPARK-19185 > URL: https://issues.apache.org/jira/browse/SPARK-19185 > Project: Spark > Issue Type: Bug > Components: DStreams > Affects Versions: 2.0.2 > Environment: Spark 2.0.2 > Spark Streaming Kafka 010 > Mesos 0.28.0 - client mode > spark.executor.cores 1 > spark.mesos.extra.cores 1 > Reporter: Kalvin Chau > Labels: streaming, windowing > > We've been running into ConcurrentModificationExceptions "KafkaConsumer is > not safe for multi-threaded access" with the CachedKafkaConsumer. I've been > working through debugging this issue, and after looking through some of the > Spark source code I think this is a bug. > Our set up is: > Spark 2.0.2, running in Mesos 0.28.0-2 in client mode, using > Spark-Streaming-Kafka-010 > spark.executor.cores 1 > spark.mesos.extra.cores 1 > Batch interval: 10s, window interval: 180s, and slide interval: 30s > We would see the exception when in one executor there are two task worker > threads assigned the same Topic+Partition, but a different set of offsets. > They would both get the same CachedKafkaConsumer, and whichever task thread > went first would seek and poll for all the records, and at the same time the > second thread would try to seek to its offset but fail because it is unable > to acquire the lock.
> Time0 E0 Task0 - TopicPartition("abc", 0) X to Y > Time0 E0 Task1 - TopicPartition("abc", 0) Y to Z > Time1 E0 Task0 - Seeks and starts to poll > Time1 E0 Task1 - Attempts to seek, but fails > Here are some relevant logs: > {code} > 17/01/06 03:10:01 Executor task launch worker-1 INFO KafkaRDD: Computing > topic test-topic, partition 2 offsets 4394204414 -> 4394238058 > 17/01/06 03:10:01 Executor task launch worker-0 INFO KafkaRDD: Computing > topic test-topic, partition 2 offsets 4394238058 -> 4394257712 > 17/01/06 03:10:01 Executor task launch worker-1 DEBUG CachedKafkaConsumer: > Get spark-executor-consumer test-topic 2 nextOffset 4394204414 requested > 4394204414 > 17/01/06 03:10:01 Executor task launch worker-0 DEBUG CachedKafkaConsumer: > Get spark-executor-consumer test-topic 2 nextOffset 4394204414 requested > 4394238058 > 17/01/06 03:10:01 Executor task launch worker-0 INFO CachedKafkaConsumer: > Initial fetch for spark-executor-consumer test-topic 2 4394238058 > 17/01/06 03:10:01 Executor task launch worker-0 DEBUG CachedKafkaConsumer: > Seeking to test-topic-2 4394238058 > 17/01/06 03:10:01 Executor task launch worker-0 WARN BlockManager: Putting > block rdd_199_2 failed due to an exception > 17/01/06 03:10:01 Executor task launch worker-0 WARN BlockManager: Block > rdd_199_2 could not be removed as it was not found on disk or in memory > 17/01/06 03:10:01 Executor task launch worker-0 ERROR Executor: Exception in > task 49.0 in stage 45.0 (TID 3201) > java.util.ConcurrentModificationException: KafkaConsumer is not safe for > multi-threaded access > at > org.apache.kafka.clients.consumer.KafkaConsumer.acquire(KafkaConsumer.java:1431) > at > org.apache.kafka.clients.consumer.KafkaConsumer.seek(KafkaConsumer.java:1132) > at > org.apache.spark.streaming.kafka010.CachedKafkaConsumer.seek(CachedKafkaConsumer.scala:95) > at > org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:69) > at > 
org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:227) > at > org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:193) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:409) > at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:462) > at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408) > at > org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:360) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:951) > at > org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:926) > at
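The race described above can be reproduced outside Spark and Kafka. The following is a minimal, self-contained sketch of the owner-thread guard that makes KafkaConsumer throw ConcurrentModificationException when two task threads share one cached consumer; the class, method names, and guard internals are illustrative assumptions, not Kafka's actual source:

```scala
import java.util.ConcurrentModificationException
import java.util.concurrent.CountDownLatch
import java.util.concurrent.atomic.AtomicLong

// Sketch of the single-thread guard KafkaConsumer.acquire() implements:
// the consumer records the owning thread and throws
// ConcurrentModificationException when a second thread calls in while the
// first still holds it -- the exact error in the logs above.
class GuardedConsumer {
  private val NoOwner = -1L
  private val owner = new AtomicLong(NoOwner)

  private def acquire(): Unit = {
    val me = Thread.currentThread().getId
    // Take ownership if free; re-entry by the same thread is allowed.
    if (!owner.compareAndSet(NoOwner, me) && owner.get != me)
      throw new ConcurrentModificationException(
        "KafkaConsumer is not safe for multi-threaded access")
  }

  private def release(): Unit = owner.set(NoOwner)

  // `offset` stands in for the seek target; `body` for the long poll.
  def seekAndPoll(offset: Long)(body: => Unit): Unit = {
    acquire()
    try body finally release()
  }
}

object Race {
  // Two "task threads" share one cached consumer for the same
  // TopicPartition, as in the report; returns true when the second
  // thread hits the guard while the first is mid-poll.
  def demo(): Boolean = {
    val consumer = new GuardedConsumer
    val polling = new CountDownLatch(1)
    val finish = new CountDownLatch(1)
    var clashed = false
    val t0 = new Thread(() => consumer.seekAndPoll(100L) {
      polling.countDown() // task 0 now owns the consumer...
      finish.await()      // ...and stays inside its poll
    })
    val t1 = new Thread(() => {
      polling.await() // wait until task 0 definitely holds the consumer
      try consumer.seekAndPoll(200L)(())
      catch { case _: ConcurrentModificationException => clashed = true }
      finish.countDown() // let task 0 complete
    })
    t0.start(); t1.start(); t0.join(); t1.join()
    clashed
  }
}
```

The opt-in "no cache" workaround mentioned in the comment sidesteps this guard by giving each task its own consumer, at the cost of re-creating consumers (and broker connections) per task, which is the perf hit being weighed.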
[jira] [Closed] (SPARK-6283) Add a CassandraInputDStream to stream from a C* table
[ https://issues.apache.org/jira/browse/SPARK-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson closed SPARK-6283. - Resolution: Done I've written this, but sadly DataStax has decided to close-source it. Add a CassandraInputDStream to stream from a C* table - Key: SPARK-6283 URL: https://issues.apache.org/jira/browse/SPARK-6283 Project: Spark Issue Type: New Feature Components: Streaming Reporter: Helena Edelson Add support for streaming from Cassandra to Spark Streaming - external. Related ticket: https://datastax-oss.atlassian.net/browse/SPARKC-40 [~helena_e] is doing the work.
[jira] [Created] (SPARK-6283) Add a CassandraInputDStream to stream from a C* table
Helena Edelson created SPARK-6283: - Summary: Add a CassandraInputDStream to stream from a C* table Key: SPARK-6283 URL: https://issues.apache.org/jira/browse/SPARK-6283 Project: Spark Issue Type: New Feature Components: Streaming Reporter: Helena Edelson Add support for streaming from Cassandra to Spark Streaming - external. Related ticket: https://datastax-oss.atlassian.net/browse/SPARKC-40 [~helena_e] is doing the work.
[jira] [Commented] (SPARK-3203) ClassNotFoundException in spark-shell with Cassandra
[ https://issues.apache.org/jira/browse/SPARK-3203?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14303215#comment-14303215 ] Helena Edelson commented on SPARK-3203: --- Are you both using the spark shell and open source Cassandra when you get this error? ClassNotFoundException in spark-shell with Cassandra Key: SPARK-3203 URL: https://issues.apache.org/jira/browse/SPARK-3203 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.0.2 Environment: Ubuntu 12.04, openjdk 64 bit 7u65 Reporter: Rohit Kumar I am using Spark as a processing engine over Cassandra. I have only one master and a worker node. I am executing the following code in spark-shell: sc.stop import org.apache.spark.SparkContext import org.apache.spark.SparkConf import com.datastax.spark.connector._ val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1") val sc = new SparkContext("spark://L-BXP44Z1:7077", "Cassandra Connector Test", conf) val rdd = sc.cassandraTable("test", "kv") println(rdd.map(_.getInt("value")).sum) I am getting the following error: 14/08/25 18:47:17 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address 14/08/25 18:49:39 INFO CoarseGrainedExecutorBackend: Got assigned task 0 14/08/25 18:49:39 INFO Executor: Running task ID 0 14/08/25 18:49:39 ERROR Executor: Exception in task ID 0 java.lang.ClassNotFoundException: $line29.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1 at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Class.java:270) at org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:60)
at java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612) at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) at scala.collection.immutable.$colon$colon.readObject(List.scala:362) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) at 
org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63) at org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61) at org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141) at java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837) at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796) at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
[jira] [Closed] (SPARK-3924) Upgrade to Akka version 2.3.7
[ https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson closed SPARK-3924. - Resolution: Fixed Upgrade to Akka version 2.3.7 - Key: SPARK-3924 URL: https://issues.apache.org/jira/browse/SPARK-3924 Project: Spark Issue Type: Dependency upgrade Environment: deploy env Reporter: Helena Edelson I tried every sbt trick in the book but can't use the latest Akka version in my project with Spark. It would be great if I could. Also I cannot use the latest Typesafe Config, 1.2.1, which would also be great. See https://issues.apache.org/jira/browse/SPARK-2593 This is a big change. If I have time I can do a PR. [~helena_e]
[jira] [Commented] (SPARK-3924) Upgrade to Akka version 2.3.7
[ https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282763#comment-14282763 ] Helena Edelson commented on SPARK-3924: --- I wrote this ticket against the previous, not the latest, version.
[jira] [Commented] (SPARK-4923) Maven build should keep publishing spark-repl
[ https://issues.apache.org/jira/browse/SPARK-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261639#comment-14261639 ] Helena Edelson commented on SPARK-4923: --- We are updating https://github.com/datastax/spark-cassandra-connector, which integrates with the REPL, as does DSE, so this is a blocker for our upgrade to Spark 1.2.0 as well. Maven build should keep publishing spark-repl - Key: SPARK-4923 URL: https://issues.apache.org/jira/browse/SPARK-4923 Project: Spark Issue Type: Bug Components: Build, Spark Shell Affects Versions: 1.2.0 Reporter: Peng Cheng Priority: Critical Labels: shell Attachments: SPARK-4923__Maven_build_should_keep_publishing_spark-repl.patch Original Estimate: 1h Remaining Estimate: 1h Spark-repl installation and deployment have been discontinued (see SPARK-3452), but it's in the dependency list of a few projects that extend its initialization process. Please remove the 'skip' setting in spark-repl and make it an 'official' API to encourage more platforms to integrate with it.
[jira] [Updated] (SPARK-3924) Upgrade to Akka version 2.3.7
[ https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-3924: -- Summary: Upgrade to Akka version 2.3.7 (was: Upgrade to Akka version 2.3.6)
[jira] [Comment Edited] (SPARK-3924) Upgrade to Akka version 2.3.7
[ https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181319#comment-14181319 ] Helena Edelson edited comment on SPARK-3924 at 12/9/14 12:55 PM: - Modified to 2.3.7 was (Author: helena_e): I met with Matei at Strata about this. Hopefully there is a way to not lock a user into the same version of Akka that Spark is on :)
[jira] [Commented] (SPARK-2808) update kafka to version 0.8.2
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237150#comment-14237150 ] Helena Edelson commented on SPARK-2808: --- I've done the migration and am testing the changes against Scala 2.10.4, since the parent Spark pom uses <scala.version>2.10.4</scala.version> <scala.binary.version>2.10</scala.binary.version> This will allow usage of the new producer, among other nice additions in 0.8.2, though I do not know when it will be GA. update kafka to version 0.8.2 - Key: SPARK-2808 URL: https://issues.apache.org/jira/browse/SPARK-2808 Project: Spark Issue Type: Sub-task Components: Build, Spark Core Reporter: Anand Avati First kafka_2.11 0.8.1 has to be released
[jira] [Comment Edited] (SPARK-2808) update kafka to version 0.8.2
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237150#comment-14237150 ] Helena Edelson edited comment on SPARK-2808 at 12/7/14 3:35 PM: I've done the migration and am testing the changes against Scala 2.10.4, since the parent Spark pom uses <scala.version>2.10.4</scala.version> and <scala.binary.version>2.10</scala.binary.version>. I do not know when it will be GA. was (Author: helena_e): I've done the migration, am testing the changes against Scala 2.10.4 since the parent spark pom uses <scala.version>2.10.4</scala.version> <scala.binary.version>2.10</scala.binary.version> This will allow usage of the new producer, among other nice additions in 0.8.2. Though I do not know when it will be GA. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2808) update kafka to version 0.8.2
[ https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14237150#comment-14237150 ] Helena Edelson edited comment on SPARK-2808 at 12/7/14 3:52 PM: I've done the migration and am testing the changes against Scala 2.10.4, since the parent Spark pom uses <scala.version>2.10.4</scala.version> and <scala.binary.version>2.10</scala.binary.version>. I do not know when it will be GA. https://github.com/apache/spark/pull/3631 was (Author: helena_e): I've done the migration, am testing the changes against Scala 2.10.4 since the parent spark pom uses <scala.version>2.10.4</scala.version> <scala.binary.version>2.10</scala.binary.version> I do not know when it will be GA. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4063) Add the ability to send messages to Kafka in the stream
[ https://issues.apache.org/jira/browse/SPARK-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14236899#comment-14236899 ] Helena Edelson commented on SPARK-4063: --- Obviously. But since I created the ticket, someone else submitted a PR: https://github.com/apache/spark/pull/2994. The idea, of course, is that it is much cleaner to have an integration that can both read from and write to the stream. Add the ability to send messages to Kafka in the stream --- Key: SPARK-4063 URL: https://issues.apache.org/jira/browse/SPARK-4063 Project: Spark Issue Type: New Feature Components: Input/Output Reporter: Helena Edelson Currently you can only receive from Kafka in the stream. This would be adding the ability to publish from the stream as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-4063) Add the ability to send messages to Kafka in the stream
[ https://issues.apache.org/jira/browse/SPARK-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson closed SPARK-4063. - Resolution: Unresolved A PR has been submitted since I created the ticket -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3924) Upgrade to Akka version 2.3.6
[ https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181319#comment-14181319 ] Helena Edelson commented on SPARK-3924: --- I met with Matei at Strata about this. Hopefully there is a way to not lock a user into the same version of Akka that Spark is on :) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson resolved SPARK-2593. --- Resolution: Won't Fix As a user, I want to be able to use the latest version of Akka with Spark and not be locked into the version Spark is using :) I can live with 2 ActorSystem instances per node if it means I can use the Akka version I need. Hopefully there is a way in the build to scope Spark's Akka version. Add ability to pass an existing Akka ActorSystem into Spark --- Key: SPARK-2593 URL: https://issues.apache.org/jira/browse/SPARK-2593 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Helena Edelson As a developer I want to pass an existing ActorSystem into StreamingContext in load-time so that I do not have 2 actor systems running on a node in an Akka application. This would mean having spark's actor system on its own named-dispatchers as well as exposing the new private creation of its own actor system. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
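One build-side approach to "scoping Spark's Akka version" mentioned above could be sketched as follows, with the caveat that the coordinates are illustrative of the era and unverified, and that excluding artifacts does not by itself resolve binary incompatibility between Akka 2.2.x and 2.3.x at runtime:

```scala
// build.sbt sketch (hypothetical coordinates): drop Spark's transitive Akka
// and depend on the application's own Akka version instead.
libraryDependencies ++= Seq(
  ("org.apache.spark" %% "spark-streaming" % "1.1.0")
    .exclude("org.spark-project.akka", "akka-actor_2.10"),
  "com.typesafe.akka" %% "akka-actor" % "2.3.6"
)
```

As the related SPARK-18057 comments note, this kind of add-and-exclude build surgery is exactly what users would rather avoid, and it only works at all when the two versions are wire- and binary-compatible.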
[jira] [Created] (SPARK-4063) Add the ability to send messages to Kafka in the stream
Helena Edelson created SPARK-4063: - Summary: Add the ability to send messages to Kafka in the stream Key: SPARK-4063 URL: https://issues.apache.org/jira/browse/SPARK-4063 Project: Spark Issue Type: New Feature Components: Input/Output Reporter: Helena Edelson Currently you can only receive from Kafka in the stream. This would be adding the ability to publish from the stream as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4063) Add the ability to send messages to Kafka in the stream
[ https://issues.apache.org/jira/browse/SPARK-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181520#comment-14181520 ] Helena Edelson commented on SPARK-4063: --- I have this started in a WIP branch -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-3924) Upgrade to Akka version 2.3.6
Helena Edelson created SPARK-3924: - Summary: Upgrade to Akka version 2.3.6 Key: SPARK-3924 URL: https://issues.apache.org/jira/browse/SPARK-3924 Project: Spark Issue Type: Dependency upgrade Environment: deploy env Reporter: Helena Edelson I tried every sbt in the book but can't use the latest Akka version in my project with Spark. It would be great if I could. Also I can not use the latest Typesafe Config - 1.2.1, which would also be great. This is a big change. If I have time I can do a PR. [~helena_e] -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169208#comment-14169208 ] Helena Edelson edited comment on SPARK-2593 at 10/13/14 11:55 AM: -- [~matei], [~pwendell] Yes, I see the pain point here now. I just created a ticket to upgrade the Akka and thus Typesafe Config versions, because I am now locked into 2.2.3 and have binary incompatibility with the latest Akka 2.3.6 / Config 1.2.1. Makes me very sad. I think I would throw in the towel on this one if you can make it completely separate, so that a user with its own ActorSystem and Config versions is not affected? Tricky, because when deploying, Spark needs its version (provided?) and the user app needs the other. was (Author: helena_e): [~matei] [~pwendell] Yes I see the pain point here now. I just created a ticket to upgrade Akka and thus Typesafe Config versions because I am now locked into 2.2.3 and have binary incompatibility with using latest Akka 2.3.6 / config 1.2.1. Makes me very sad. I think I would throw in the towel on this one if you can make it completely separate so that a user with it's own AkkaSystem and Config versions are not affected? Tricky because when deploying, spark needs its version (provided?) and the user app needs the other.
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14169208#comment-14169208 ] Helena Edelson commented on SPARK-2593: --- [~matei] [~pwendell] Yes, I see the pain point here now. I just created a ticket to upgrade the Akka and thus Typesafe Config versions, because I am now locked into 2.2.3 and have binary incompatibility with the latest Akka 2.3.6 / Config 1.2.1. Makes me very sad. I think I would throw in the towel on this one if you can make it completely separate, so that a user with its own ActorSystem and Config versions is not affected? Tricky, because when deploying, Spark needs its version (provided?) and the user app needs the other. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14138906#comment-14138906 ] Helena Edelson commented on SPARK-2593: --- [~matei] +1 for Spark Streaming, that is a primary concern here. And I understand your concern over support for Akka upgrades. However, I am more than happy to help WRT that, and am sure I can find a few others who feel the same, so that time isn't taken away from your team's bandwidth for new features/enhancements. I will get more data on having 2 actor systems on a node. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14137338#comment-14137338 ] Helena Edelson commented on SPARK-2593: --- [~pwendell] I forgot to note this in my reply above, but I was offering to do the work myself - or at least start it. What would be ideal is to have both of these: - Expose the Spark actor system, which also requires ensuring all Spark actors are on specified dispatchers (very important) - Optionally allow Spark users to pass in their existing actor system I feel both are incredibly important for users. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14137338#comment-14137338 ] Helena Edelson edited comment on SPARK-2593 at 9/17/14 2:42 PM: [~pwendell] I forgot to note this in my reply above, but I was offering to do the work myself - or at least start it. What would be ideal is to have both of these: - Expose the Spark actor system, which also requires ensuring all Spark actors are on specified dispatchers (very important) - Optionally allow Spark users to pass in their existing actor system I feel both are incredibly important for users. was (Author: helena_e): [~pwendell] I forgot to not this on my reply above but I was offering to do the work myself - or at least start it. What would be ideal is to have both of these: - Expose the spark actor system which also requires insuring all spark actors are on specified dispatchers (very important) - Optionally allow spark users to pass in their existing actor system I feel both are incredibly important for users. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14137338#comment-14137338 ] Helena Edelson edited comment on SPARK-2593 at 9/17/14 2:44 PM: [~pwendell] I forgot to note this in my reply above, but I was offering to do the work myself - or at least start it. What would be ideal is to have all of these: - Expose the Spark actor system, which also requires ensuring all Spark actors are on specified dispatchers (very important) - Optionally allow Spark users to pass in their existing actor system - Add a logical naming convention for Spark Streaming actors, or a function to get it I feel these are incredibly important for users. was (Author: helena_e): [~pwendell] I forgot to note this on my reply above but I was offering to do the work myself - or at least start it. What would be ideal is to have both of these: - Expose the spark actor system which also requires insuring all spark actors are on specified dispatchers (very important) - Optionally allow spark users to pass in their existing actor system I feel both are incredibly important for users. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
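The "specified dispatchers" item above could look roughly like the following Typesafe Config fragment. This is purely a sketch: Spark never shipped such a configuration key, and all names and parallelism numbers here are hypothetical.

```hocon
# Hypothetical dispatcher that Spark's internal actors would be pinned to,
# keeping them off the default dispatcher shared with application actors.
spark-internal-dispatcher {
  type = Dispatcher
  executor = "fork-join-executor"
  fork-join-executor {
    parallelism-min = 2
    parallelism-max = 8
  }
  throughput = 100
}
```

Spark actors created with `.withDispatcher("spark-internal-dispatcher")` would then be isolated from user actors even when both run inside one shared ActorSystem, which is the prerequisite for the "pass in an existing actor system" item.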
[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14136463#comment-14136463 ] Helena Edelson commented on SPARK-2593: --- [~pwendell] I've spent no time on it thus far because, yes, the Akka ActorSystem is private - that is the point of my ticket ;) Once it can be exposed I'm g2g. - Helena -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-2593: -- Description: As a developer I want to pass an existing ActorSystem into StreamingContext in load-time so that I do not have 2 actor systems running on a node in an Akka application. This would mean having spark's actor system on its own named-dispatchers as well as exposing the new private creation of its own actor system. was: As a developer I want to pass an existing ActorSystem into StreamingContext in load-time so that I do not have 2 actor systems running on a node in an Akka application. This would mean having spark's actor system on its own named-dispatchers as well as exposing the new private creation of its own actor system. I would like to create an Akka Extension that wraps around Spark/Spark Streaming and Cassandra. So the programmatic creation would simply be this for a user val extension = SparkCassandra(system) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14132797#comment-14132797 ] Helena Edelson commented on SPARK-2593: --- Here is a good example of just one of the issues: it is difficult to locate a remote Spark actor to publish data to the stream. Here I have to have the streaming actor get created and, in its preStart, publish a custom message with `self`, which my actors in my ActorSystem can receive in order to get the ActorRef to send to. This is incredibly clunky. I will try to carve out some time to do this PR this week. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
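The handshake described above can be reduced to an Akka-free sketch (all names hypothetical, actors and ActorRefs simplified to plain Scala): because the receiver's reference is not discoverable from outside, the receiver has to announce itself from its start hook before anyone can send to it.

```scala
import scala.collection.mutable

// Stand-in for "publish `self` from preStart so others can find me".
// In the real pattern the registry would be an application-side actor
// receiving a custom message carrying an ActorRef.
object ReceiverRegistry {
  private val refs = mutable.Map.empty[String, AnyRef]
  def register(name: String, ref: AnyRef): Unit = refs.update(name, ref)
  def lookup(name: String): Option[AnyRef] = refs.get(name)
}

class StreamingReceiver(val name: String) {
  // Analogue of Akka's preStart: the only way out is pushing our own reference.
  def preStart(): Unit = ReceiverRegistry.register(name, this)
}
```

The clunkiness shows even in miniature: a lookup succeeds only after preStart has run, so the application side must also handle the window in which the receiver has not yet registered.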
[jira] [Commented] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121324#comment-14121324 ] Helena Edelson commented on SPARK-2892: --- I see the same with 1.0.2 streaming: ERROR 08:26:21,139 Deregistered receiver for stream 0: Stopped by driver WARN 08:26:21,211 Stopped executor without error WARN 08:26:21,213 All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,ActorReceiver-0,null,false,host,Stopped by driver,)) Socket Receiver does not stop when streaming context is stopped --- Key: SPARK-2892 URL: https://issues.apache.org/jira/browse/SPARK-2892 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.0.2 Reporter: Tathagata Das Assignee: Tathagata Das Priority: Critical Running NetworkWordCount with {quote} ssc.start(); Thread.sleep(1); ssc.stop(stopSparkContext = false); Thread.sleep(6) {quote} gives the following error {quote} 14/08/06 18:37:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 10047 ms on localhost (1/1) 14/08/06 18:37:13 INFO DAGScheduler: Stage 0 (runJob at ReceiverTracker.scala:275) finished in 10.056 s 14/08/06 18:37:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool 14/08/06 18:37:13 INFO SparkContext: Job finished: runJob at ReceiverTracker.scala:275, took 10.179263 s 14/08/06 18:37:13 INFO ReceiverTracker: All of the receivers have been terminated 14/08/06 18:37:13 WARN ReceiverTracker: All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,SocketReceiver-0,null,false,localhost,Stopped by driver,)) 14/08/06 18:37:13 INFO ReceiverTracker: ReceiverTracker stopped 14/08/06 18:37:13 INFO JobGenerator: Stopping JobGenerator immediately 14/08/06 18:37:13 INFO RecurringTimer: Stopped timer for JobGenerator after time 1407375433000 14/08/06 18:37:13 INFO JobGenerator: Stopped JobGenerator 14/08/06 18:37:13 INFO JobScheduler: Stopped JobScheduler 14/08/06 18:37:13 INFO
StreamingContext: StreamingContext stopped successfully 14/08/06 18:37:43 INFO SocketReceiver: Stopped receiving 14/08/06 18:37:43 INFO SocketReceiver: Closed socket to localhost: {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121324#comment-14121324 ] Helena Edelson edited comment on SPARK-2892 at 9/4/14 1:12 PM: --- I see the same with 1.0.2 streaming, with or without stopGracefully = true ssc.stop(stopSparkContext = false, stopGracefully = true) ERROR 08:26:21,139 Deregistered receiver for stream 0: Stopped by driver WARN 08:26:21,211 Stopped executor without error WARN 08:26:21,213 All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,ActorReceiver-0,null,false,host,Stopped by driver,)) was (Author: helena_e): I see the same with 1.0.2 streaming: ERROR 08:26:21,139 Deregistered receiver for stream 0: Stopped by driver WARN 08:26:21,211 Stopped executor without error WARN 08:26:21,213 All of the receivers have not deregistered, Map(0 -> ReceiverInfo(0,ActorReceiver-0,null,false,host,Stopped by driver,)) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121561#comment-14121561 ] Helena Edelson commented on SPARK-2892: --- I wonder if the ERROR should be a WARN or INFO since it occurs as a result of ReceiverSupervisorImpl receiving a StopReceiver, and Deregistered receiver for stream seems like the expected behavior. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped
[ https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14121561#comment-14121561 ] Helena Edelson edited comment on SPARK-2892 at 9/4/14 5:01 PM: --- I wonder if the ERROR should be a WARN or INFO since it occurs as a result of ReceiverSupervisorImpl receiving a StopReceiver, and Deregistered receiver for stream seems like the expected behavior. DEBUG 13:00:22,418 Stopping JobScheduler INFO 13:00:22,441 Received stop signal INFO 13:00:22,441 Sent stop signal to all 1 receivers INFO 13:00:22,442 Stopping receiver with message: Stopped by driver: INFO 13:00:22,442 Called receiver onStop INFO 13:00:22,443 Deregistering receiver 0 ERROR 13:00:22,445 Deregistered receiver for stream 0: Stopped by driver INFO 13:00:22,445 Stopped receiver 0 was (Author: helena_e): I wonder if the ERROR should be a WARN or INFO since it occurs as a result of ReceiverSupervisorImpl receiving a StopReceiver, and Deregistered receiver for stream seems like the expected behavior. 
Socket Receiver does not stop when streaming context is stopped --- Key: SPARK-2892 URL: https://issues.apache.org/jira/browse/SPARK-2892 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.0.2 Reporter: Tathagata Das Assignee: Tathagata Das Priority: Critical
[jira] [Commented] (SPARK-3178) setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the worker memory limit to zero
[ https://issues.apache.org/jira/browse/SPARK-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110116#comment-14110116 ] Helena Edelson commented on SPARK-3178: --- +1, it doesn't look like the input is validated to fail fast when the m/g suffix is missing. setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the worker memory limit to zero Key: SPARK-3178 URL: https://issues.apache.org/jira/browse/SPARK-3178 Project: Spark Issue Type: Bug Environment: osx Reporter: Jon Haddad This should either default to m or just completely fail. Starting a worker with zero memory isn't very helpful.
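A fail-fast validation of the kind this comment asks for could look like the following plain-Scala sketch. The object name and regex are hypothetical, not Spark's actual parsing code: the point is that a value without an m/g suffix raises immediately instead of silently yielding a zero memory limit.

```scala
// Hypothetical sketch: reject SPARK_WORKER_MEMORY values that lack an
// m/g suffix instead of silently computing a zero limit.
object WorkerMemory {
  private val Pattern = """(?i)^(\d+)([mg])$""".r

  /** Returns the memory limit in megabytes, or fails fast on bad input. */
  def parseMb(setting: String): Int = setting.trim match {
    case Pattern(n, unit) =>
      val base = n.toInt
      if (unit.equalsIgnoreCase("g")) base * 1024 else base
    case other =>
      throw new IllegalArgumentException(
        s"SPARK_WORKER_MEMORY must end in 'm' or 'g', got: '$other'")
  }
}
```

With this in place, "512" (no suffix) fails at startup rather than starting a worker with zero memory.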
[jira] [Created] (SPARK-2802) Improve the Cassandra sample and Add a new sample for Streaming to Cassandra
Helena Edelson created SPARK-2802: - Summary: Improve the Cassandra sample and Add a new sample for Streaming to Cassandra Key: SPARK-2802 URL: https://issues.apache.org/jira/browse/SPARK-2802 Project: Spark Issue Type: Improvement Reporter: Helena Edelson Priority: Minor
[jira] [Updated] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-2593: -- Issue Type: Improvement (was: Brainstorming) Add ability to pass an existing Akka ActorSystem into Spark --- Key: SPARK-2593 URL: https://issues.apache.org/jira/browse/SPARK-2593 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Helena Edelson As a developer I want to pass an existing ActorSystem into StreamingContext at load time so that I do not have 2 actor systems running on a node. This would mean having Spark's actor system on its own named dispatchers as well as exposing the currently private creation of its own actor system. If it makes sense... I would like to create an Akka Extension that wraps around Spark/Spark Streaming and Cassandra. So the creation would simply be this for a user: val extension = SparkCassandra(system) and using it is as easy as: import extension._ spark. // do work or, streaming. // do work and all config comes from reference.conf and user overrides of that. The conf file would pick up settings from the deployed environment first, then fall back to -D with a final fallback to configured settings.
[jira] [Updated] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-2593: -- Description: As a developer I want to pass an existing ActorSystem into StreamingContext at load time so that I do not have 2 actor systems running on a node in an Akka application. This would mean having Spark's actor system on its own named dispatchers as well as exposing the currently private creation of its own actor system. I would like to create an Akka Extension that wraps around Spark/Spark Streaming and Cassandra. So the programmatic creation would simply be this for a user: val extension = SparkCassandra(system) was: As a developer I want to pass an existing ActorSystem into StreamingContext in load-time so that I do not have 2 actor systems running on a node. This would mean having spark's actor system on its own named-dispatchers as well as exposing the new private creation of its own actor system. If it makes sense... I would like to create an Akka Extension that wraps around Spark/Spark Streaming and Cassandra. So the creation would simply be this for a user val extension = SparkCassandra(system) and using is as easy as: import extension._ spark. // do work or, streaming. // do work and all config comes from reference.conf and user overrides of that. The conf file would pick up settings from the deployed environment first, then fallback to -D with a final fallback to configured settings. Add ability to pass an existing Akka ActorSystem into Spark --- Key: SPARK-2593 URL: https://issues.apache.org/jira/browse/SPARK-2593 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Helena Edelson As a developer I want to pass an existing ActorSystem into StreamingContext at load time so that I do not have 2 actor systems running on a node in an Akka application. 
This would mean having Spark's actor system on its own named dispatchers as well as exposing the currently private creation of its own actor system. I would like to create an Akka Extension that wraps around Spark/Spark Streaming and Cassandra. So the programmatic creation would simply be this for a user: val extension = SparkCassandra(system)
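The `SparkCassandra(system)` one-liner proposed in this description follows Akka's Extension pattern: one instance per ActorSystem, created lazily and cached, so a node never ends up with two. The sketch below is dependency-free so it compiles without Akka; `ActorSystemStub` stands in for `akka.actor.ActorSystem`, and the cache plays the role Akka's `ExtensionId` machinery would play in a real implementation.

```scala
import java.util.concurrent.ConcurrentHashMap

// Stand-in for akka.actor.ActorSystem so the sketch compiles without Akka.
final case class ActorSystemStub(name: String)

class SparkCassandra(val system: ActorSystemStub) {
  // In the real extension this would expose the shared StreamingContext,
  // Cassandra session, etc.
  def describe: String = s"SparkCassandra bound to ${system.name}"
}

object SparkCassandra {
  private val instances =
    new ConcurrentHashMap[ActorSystemStub, SparkCassandra]()

  // Akka's ExtensionId gives exactly this guarantee: one extension
  // instance per ActorSystem, created lazily on first lookup.
  def apply(system: ActorSystemStub): SparkCassandra =
    instances.computeIfAbsent(system, (s: ActorSystemStub) => new SparkCassandra(s))
}
```

Repeated calls to `SparkCassandra(system)` for the same system then return the same instance, which is the property the description relies on to avoid two actor systems (or two extension instances) per node.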
[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
[ https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067540#comment-14067540 ] Helena Edelson commented on SPARK-2593: --- I should note that I'd be happy to do the changes. I am a committer to Akka Cluster. Add ability to pass an existing Akka ActorSystem into Spark --- Key: SPARK-2593 URL: https://issues.apache.org/jira/browse/SPARK-2593 Project: Spark Issue Type: Brainstorming Components: Spark Core Reporter: Helena Edelson As a developer I want to pass an existing ActorSystem into StreamingContext at load time so that I do not have 2 actor systems running on a node. This would mean having Spark's actor system on its own named dispatchers as well as exposing the currently private creation of its own actor system. If it makes sense... I would like to create an Akka Extension that wraps around Spark/Spark Streaming and Cassandra. So the creation would simply be this for a user: val extension = SparkCassandra(system) and using it is as easy as: import extension._ spark. // do work or, streaming. // do work and all config comes from reference.conf and user overrides of that. The conf file would pick up settings from the deployed environment first, then fall back to -D with a final fallback to configured settings.
[jira] [Created] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark
Helena Edelson created SPARK-2593: - Summary: Add ability to pass an existing Akka ActorSystem into Spark Key: SPARK-2593 URL: https://issues.apache.org/jira/browse/SPARK-2593 Project: Spark Issue Type: Brainstorming Components: Spark Core Reporter: Helena Edelson As a developer I want to pass an existing ActorSystem into StreamingContext at load time so that I do not have 2 actor systems running on a node. This would mean having Spark's actor system on its own named dispatchers as well as exposing the currently private creation of its own actor system. If it makes sense... I would like to create an Akka Extension that wraps around Spark/Spark Streaming and Cassandra. So the creation would simply be this for a user: val extension = SparkCassandra(system) and using it is as easy as: import extension._ spark. // do work or, streaming. // do work and all config comes from reference.conf and user overrides of that. The conf file would pick up settings from the deployed environment first, then fall back to -D with a final fallback to configured settings.
[jira] [Created] (SPARK-2198) Partition the scala build file so that it is easier to maintain
Helena Edelson created SPARK-2198: - Summary: Partition the scala build file so that it is easier to maintain Key: SPARK-2198 URL: https://issues.apache.org/jira/browse/SPARK-2198 Project: Spark Issue Type: Task Components: Build Reporter: Helena Edelson Priority: Minor Partition into the standard Dependencies, Version, Settings, and Publish.scala files, keeping SparkBuild clean to describe the modules and their deps, so that changes in versions, for example, need only be made in Version.scala, settings changes such as scalac options in Settings.scala, etc. I'd be happy to do this ([~helena_e])
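The layout this issue proposes might look like the following sketch. File names follow the description; the version numbers and dependency coordinates are purely illustrative, and plain strings stand in for sbt ModuleIDs so the sketch compiles without sbt on the classpath.

```scala
// project/Version.scala — hypothetical contents: every version constant
// lives in one place, so a version bump touches exactly one file.
object Version {
  val Akka      = "2.3.4"   // illustrative version numbers
  val ScalaTest = "2.2.1"
}

// project/Dependencies.scala — in a real sbt build these would be
// ModuleIDs ("org" %% "name" % version); plain strings keep the
// sketch self-contained.
object Dependencies {
  val akkaActor = s"com.typesafe.akka:akka-actor:${Version.Akka}"
  val scalaTest = s"org.scalatest:scalatest:${Version.ScalaTest}"
}
```

Settings.scala and Publish.scala would be split out the same way, leaving SparkBuild.scala to describe only the modules and which Dependencies entries each one uses.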
[jira] [Updated] (SPARK-2198) Partition the scala build file so that it is easier to maintain
[ https://issues.apache.org/jira/browse/SPARK-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-2198: -- Description: Partition into the standard Dependencies, Version, Settings, and Publish.scala files, keeping SparkBuild clean to describe the modules and their deps, so that changes in versions, for example, need only be made in Version.scala, settings changes such as scalac options in Settings.scala, etc. I'd be happy to do this ([~helena_e]) was: Partition to standard Dependencies, Version, Settings, Publish.scala. keeping the SparkBuild clean to describe the modules and their deps so that changes in versions, for example, need only be made in Version.scala, settings changes such as in scalac in Settings.scala, etc. I'd be happy to do this ([~helena_e] Partition the scala build file so that it is easier to maintain --- Key: SPARK-2198 URL: https://issues.apache.org/jira/browse/SPARK-2198 Project: Spark Issue Type: Task Components: Build Reporter: Helena Edelson Priority: Minor Original Estimate: 3h Remaining Estimate: 3h Partition into the standard Dependencies, Version, Settings, and Publish.scala files, keeping SparkBuild clean to describe the modules and their deps, so that changes in versions, for example, need only be made in Version.scala, settings changes such as scalac options in Settings.scala, etc. I'd be happy to do this ([~helena_e])
[jira] [Updated] (SPARK-2198) Partition the scala build file so that it is easier to maintain
[ https://issues.apache.org/jira/browse/SPARK-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-2198: -- Remaining Estimate: 2h (was: 1m) Original Estimate: 2h (was: 1m) Partition the scala build file so that it is easier to maintain --- Key: SPARK-2198 URL: https://issues.apache.org/jira/browse/SPARK-2198 Project: Spark Issue Type: Task Components: Build Reporter: Helena Edelson Priority: Minor Original Estimate: 2h Remaining Estimate: 2h Partition into the standard Dependencies, Version, Settings, and Publish.scala files, keeping SparkBuild clean to describe the modules and their deps, so that changes in versions, for example, need only be made in Version.scala, settings changes such as scalac options in Settings.scala, etc. I'd be happy to do this ([~helena_e])
[jira] [Updated] (SPARK-2198) Partition the scala build file so that it is easier to maintain
[ https://issues.apache.org/jira/browse/SPARK-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Helena Edelson updated SPARK-2198: -- Remaining Estimate: 3h (was: 2h) Original Estimate: 3h (was: 2h) Partition the scala build file so that it is easier to maintain --- Key: SPARK-2198 URL: https://issues.apache.org/jira/browse/SPARK-2198 Project: Spark Issue Type: Task Components: Build Reporter: Helena Edelson Priority: Minor Original Estimate: 3h Remaining Estimate: 3h Partition into the standard Dependencies, Version, Settings, and Publish.scala files, keeping SparkBuild clean to describe the modules and their deps, so that changes in versions, for example, need only be made in Version.scala, settings changes such as scalac options in Settings.scala, etc. I'd be happy to do this ([~helena_e])
[jira] [Commented] (SPARK-2198) Partition the scala build file so that it is easier to maintain
[ https://issues.apache.org/jira/browse/SPARK-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037467#comment-14037467 ] Helena Edelson commented on SPARK-2198: --- I am sad to hear that the Maven POMs will remain primary (vs. the Scala SBT build). The SBT/Maven redundancy was very odd to see, however. Partition the scala build file so that it is easier to maintain --- Key: SPARK-2198 URL: https://issues.apache.org/jira/browse/SPARK-2198 Project: Spark Issue Type: Task Components: Build Reporter: Helena Edelson Priority: Minor Original Estimate: 3h Remaining Estimate: 3h Partition into the standard Dependencies, Version, Settings, and Publish.scala files, keeping SparkBuild clean to describe the modules and their deps, so that changes in versions, for example, need only be made in Version.scala, settings changes such as scalac options in Settings.scala, etc. I'd be happy to do this ([~helena_e])