[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-06-29 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16069300#comment-16069300
 ] 

Helena Edelson commented on SPARK-18057:


IMHO, name it kafka-0-11 to be explicit, and wait until Kafka 0.11.1.0, which per 
https://issues.apache.org/jira/browse/KAFKA-4879 resolves the last blocker to 
upgrading?

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-05-08 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000130#comment-16000130
 ] 

Helena Edelson edited comment on SPARK-18057 at 5/8/17 8:23 PM:


It's not that simple; the PR I have queued for this required some code changes 
as part of the upgrade. It's not just a dependency addition/exclusion.


was (Author: helena_e):
Did that a while ago, my only point is not modifying artifacts ideally, by 
adding and excluding in builds.

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-05-07 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16000130#comment-16000130
 ] 

Helena Edelson commented on SPARK-18057:


Did that a while ago; my only point is that, ideally, we should not have to 
modify artifacts by adding and excluding dependencies in builds.
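
As a rough illustration of the add-and-exclude approach in question, a minimal 
build.sbt sketch (artifact and version numbers are illustrative only):

{code}
// Pull in the Kafka integration but exclude its transitive kafka-clients,
// then pin the client version explicitly.
libraryDependencies += ("org.apache.spark" %% "spark-sql-kafka-0-10" % "2.1.1")
  .exclude("org.apache.kafka", "kafka-clients")

libraryDependencies += "org.apache.kafka" % "kafka-clients" % "0.10.2.0"
{code}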

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-05-03 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15996166#comment-15996166
 ] 

Helena Edelson commented on SPARK-18057:


With the current 0.10.0.1 version we are hitting several issues that force us 
into ever tighter situations. Much of this comes from constraints around new 
functionality in later Kafka releases: kafka security, SASL_SSL, and related 
behavior that does not exist in previous versions of Kafka. 

Users in our ecosystem can not delete topics on clusters, so topic deletion is 
not a relevant use case for us. It seems only structured streaming kafka does 
deleteTopic, vs spark-streaming-kafka. 

I've had to create an internal fork so that we can use Kafka 0.10.2.0 with 
Spark, which is bad, but we are blocked otherwise.

[~ijuma] good to know on the timing. A group of us voted for 
https://issues.apache.org/jira/browse/KAFKA-4879. 
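
For reference, a hedged sketch of the kind of SASL_SSL read the older client 
blocks, assuming a SparkSession named spark (broker, mechanism, and topic are 
placeholders; "kafka."-prefixed options are passed through to the consumer):

{code}
val df = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9093")
  .option("kafka.security.protocol", "SASL_SSL")
  .option("kafka.sasl.mechanism", "PLAIN")
  .option("subscribe", "secured-topic")
  .load()
{code}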

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983832#comment-15983832
 ] 

Helena Edelson commented on SPARK-18057:


It is the timeout. I think waiting is better; I will be watching that ticket in 
Kafka.

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-25 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15983796#comment-15983796
 ] 

Helena Edelson commented on SPARK-18057:


I have a branch off branch-2.2 with the 0.10.2.0 upgrade and changes done. All 
the delete-topic-related tests fail (mainly just in streaming kafka sql).

I could open a PR with those few tests commented out, but that doesn't sound 
right. Or should I wait to PR?

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-22 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979788#comment-15979788
 ] 

Helena Edelson commented on SPARK-18057:


Confirming that https://issues.apache.org/jira/browse/KAFKA-4879 - 
KafkaConsumer.position may hang forever when deleting a topic - is the only 
blocker. I upgraded in my fork with some minor code changes and the 
delete-related tests in spark-sql-kafka-0-10 hang. I can submit this as a PR as 
soon as that is resolved.
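
For anyone trying to reproduce the hang, a rough sketch (topic name and 
connection properties are illustrative):

{code}
import java.util.{Collections, Properties}
import org.apache.kafka.clients.consumer.KafkaConsumer
import org.apache.kafka.common.TopicPartition

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")
props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer")

val consumer = new KafkaConsumer[String, String](props)
val tp = new TopicPartition("doomed-topic", 0)
consumer.assign(Collections.singletonList(tp))
// ... the topic is deleted out from under the consumer here ...
consumer.position(tp) // KAFKA-4879: may block forever on affected versions
{code}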

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Issue Comment Deleted] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson updated SPARK-18057:
---
Comment: was deleted

(was: There’s a RC for 0.10.2.1 that’s been opened for a while.

- It introduces backward compatible protocol (new clients can talk to old 
broker and vice versa).
- There are many fixes in this RC:
https://issues.apache.org/jira/browse/KAFKA-4198?jql=fixVersion%20%3D%200.10.2.1%20AND%20project%20%3D%20KAFKA)

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-21 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15979678#comment-15979678
 ] 

Helena Edelson commented on SPARK-18057:


There’s an RC for 0.10.2.1 that’s been open for a while.

- It introduces a backward-compatible protocol (new clients can talk to old 
brokers and vice versa).
- There are many fixes in this RC:
https://issues.apache.org/jira/browse/KAFKA-4198?jql=fixVersion%20%3D%200.10.2.1%20AND%20project%20%3D%20KAFKA

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977896#comment-15977896
 ] 

Helena Edelson commented on SPARK-18057:


I think this fix in 0.10.2.0 was a big part of it: 
https://issues.apache.org/jira/browse/KAFKA-4547. I saw that behavior.

Possible concerns:
- https://issues.apache.org/jira/browse/SPARK-18779 - I've seen this
- https://issues.apache.org/jira/browse/KAFKA-4879 - Not seen this; noted by 
Michael and [~zsxwing]

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Comment Edited] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977690#comment-15977690
 ] 

Helena Edelson edited comment on SPARK-18057 at 4/20/17 10:15 PM:
--

Hi [~marmbrus], 0.10.2.0 is out. When I modify the kafka version, tests pass - 
as opposed to the failures with 0.10.1.x.


was (Author: helena_e):
Hi [~marmbrus], 0.10.2.0 is out. When I modify the kafka version, tests pass.

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.2.0

2017-04-20 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15977690#comment-15977690
 ] 

Helena Edelson commented on SPARK-18057:


Hi [~marmbrus], 0.10.2.0 is out. When I modify the kafka version, tests pass.

> Update structured streaming kafka from 10.0.1 to 10.2.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.1.0

2017-01-25 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15838274#comment-15838274
 ] 

Helena Edelson commented on SPARK-18057:


Thanks [~c...@koeninger.org]. Chatting with [~tdas] today.

> Update structured streaming kafka from 10.0.1 to 10.1.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-18057) Update structured streaming kafka from 10.0.1 to 10.1.0

2017-01-25 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18057?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15837791#comment-15837791
 ] 

Helena Edelson commented on SPARK-18057:


I'd tried this upgrade as just a cursory attempt, changing the version to see 
the behavior, and definitely ran into offset logic changes that will need to 
happen in the spark streaming kafka code. I expected the offset behavior 
change, but I didn't investigate enough to clarify where in Kafka it comes 
from. 

This is a very important issue for us; we need this Kafka upgrade to 10.1.0.

> Update structured streaming kafka from 10.0.1 to 10.1.0
> ---
>
> Key: SPARK-18057
> URL: https://issues.apache.org/jira/browse/SPARK-18057
> Project: Spark
>  Issue Type: Improvement
>  Components: Structured Streaming
>Reporter: Cody Koeninger
>
> There are a couple of relevant KIPs here, 
> https://archive.apache.org/dist/kafka/0.10.1.0/RELEASE_NOTES.html






[jira] [Commented] (SPARK-19185) ConcurrentModificationExceptions with CachedKafkaConsumers when Windowing

2017-01-17 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-19185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15827215#comment-15827215
 ] 

Helena Edelson commented on SPARK-19185:


I've seen this as well; the exceptions, as expected, are never raised when the 
cache is not used. This is on Spark 2.1.

The exception is raised in the seek function. I added an opt-in config for that 
temporarily but will work on a better solution. Perf hits aren't something I 
can do :)
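
A sketch of the shape such an opt-in flag might take (the config key is 
hypothetical, not a shipped Spark setting):

{code}
// Hypothetical opt-in knob: fall back to a fresh consumer per task,
// trading the cache's perf win for thread safety.
val conf = new org.apache.spark.SparkConf()
  .set("spark.streaming.kafka.consumer.cache.enabled", "false") // hypothetical key
{code}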

> ConcurrentModificationExceptions with CachedKafkaConsumers when Windowing
> -
>
> Key: SPARK-19185
> URL: https://issues.apache.org/jira/browse/SPARK-19185
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.0.2
> Environment: Spark 2.0.2
> Spark Streaming Kafka 010
> Mesos 0.28.0 - client mode
> spark.executor.cores 1
> spark.mesos.extra.cores 1
>Reporter: Kalvin Chau
>  Labels: streaming, windowing
>
> We've been running into ConcurrentModificationExceptions "KafkaConsumer is 
> not safe for multi-threaded access" with the CachedKafkaConsumer. I've been 
> working through debugging this issue, and after looking through some of the 
> spark source code I think this is a bug.
> Our setup is:
> Spark 2.0.2, running in Mesos 0.28.0-2 in client mode, using 
> Spark-Streaming-Kafka-010
> spark.executor.cores 1
> spark.mesos.extra.cores 1
> Batch interval: 10s, window interval: 180s, and slide interval: 30s
> We would see the exception when in one executor there are two task worker 
> threads assigned the same Topic+Partition, but a different set of offsets.
> They would both get the same CachedKafkaConsumer, and whichever task thread 
> went first would seek and poll for all the records, and at the same time the 
> second thread would try to seek to its offset but fail because it is unable 
> to acquire the lock.
> Time0 E0 Task0 - TopicPartition("abc", 0) X to Y
> Time0 E0 Task1 - TopicPartition("abc", 0) Y to Z
> Time1 E0 Task0 - Seeks and starts to poll
> Time1 E0 Task1 - Attempts to seek, but fails
> Here are some relevant logs:
> {code}
> 17/01/06 03:10:01 Executor task launch worker-1 INFO KafkaRDD: Computing 
> topic test-topic, partition 2 offsets 4394204414 -> 4394238058
> 17/01/06 03:10:01 Executor task launch worker-0 INFO KafkaRDD: Computing 
> topic test-topic, partition 2 offsets 4394238058 -> 4394257712
> 17/01/06 03:10:01 Executor task launch worker-1 DEBUG CachedKafkaConsumer: 
> Get spark-executor-consumer test-topic 2 nextOffset 4394204414 requested 
> 4394204414
> 17/01/06 03:10:01 Executor task launch worker-0 DEBUG CachedKafkaConsumer: 
> Get spark-executor-consumer test-topic 2 nextOffset 4394204414 requested 
> 4394238058
> 17/01/06 03:10:01 Executor task launch worker-0 INFO CachedKafkaConsumer: 
> Initial fetch for spark-executor-consumer test-topic 2 4394238058
> 17/01/06 03:10:01 Executor task launch worker-0 DEBUG CachedKafkaConsumer: 
> Seeking to test-topic-2 4394238058
> 17/01/06 03:10:01 Executor task launch worker-0 WARN BlockManager: Putting 
> block rdd_199_2 failed due to an exception
> 17/01/06 03:10:01 Executor task launch worker-0 WARN BlockManager: Block 
> rdd_199_2 could not be removed as it was not found on disk or in memory
> 17/01/06 03:10:01 Executor task launch worker-0 ERROR Executor: Exception in 
> task 49.0 in stage 45.0 (TID 3201)
> java.util.ConcurrentModificationException: KafkaConsumer is not safe for 
> multi-threaded access
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.acquire(KafkaConsumer.java:1431)
>   at 
> org.apache.kafka.clients.consumer.KafkaConsumer.seek(KafkaConsumer.java:1132)
>   at 
> org.apache.spark.streaming.kafka010.CachedKafkaConsumer.seek(CachedKafkaConsumer.scala:95)
>   at 
> org.apache.spark.streaming.kafka010.CachedKafkaConsumer.get(CachedKafkaConsumer.scala:69)
>   at 
> org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:227)
>   at 
> org.apache.spark.streaming.kafka010.KafkaRDD$KafkaRDDIterator.next(KafkaRDD.scala:193)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$11.next(Iterator.scala:409)
>   at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:462)
>   at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
>   at 
> org.apache.spark.storage.memory.MemoryStore.putIteratorAsBytes(MemoryStore.scala:360)
>   at 
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:951)
>   at 
> org.apache.spark.storage.BlockManager$$anonfun$doPutIterator$1.apply(BlockManager.scala:926)
>   at 

[jira] [Closed] (SPARK-6283) Add a CassandraInputDStream to stream from a C* table

2015-08-04 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-6283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson closed SPARK-6283.
-
Resolution: Done

I've written this, but sadly DataStax has decided to close-source it.

 Add a CassandraInputDStream to stream from a C* table
 -

 Key: SPARK-6283
 URL: https://issues.apache.org/jira/browse/SPARK-6283
 Project: Spark
  Issue Type: New Feature
  Components: Streaming
Reporter: Helena Edelson

 Add support for streaming from Cassandra to Spark Streaming - external.
 Related ticket: https://datastax-oss.atlassian.net/browse/SPARKC-40 
 [~helena_e] is doing the work.






[jira] [Created] (SPARK-6283) Add a CassandraInputDStream to stream from a C* table

2015-03-11 Thread Helena Edelson (JIRA)
Helena Edelson created SPARK-6283:
-

 Summary: Add a CassandraInputDStream to stream from a C* table
 Key: SPARK-6283
 URL: https://issues.apache.org/jira/browse/SPARK-6283
 Project: Spark
  Issue Type: New Feature
  Components: Streaming
Reporter: Helena Edelson


Add support for streaming from Cassandra to Spark Streaming - external.

Related ticket: https://datastax-oss.atlassian.net/browse/SPARKC-40 

[~helena_e] is doing the work.






[jira] [Commented] (SPARK-3203) ClassNotFoundException in spark-shell with Cassandra

2015-02-03 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3203?page=com.atlassian.jira.plugin.system.issuetabpanel:comment-tabpanel&focusedCommentId=14303215#comment-14303215
 ] 

Helena Edelson commented on SPARK-3203:
---

Are you both using the spark shell and open source cassandra when you get this 
error?

 ClassNotFoundException in spark-shell with Cassandra
 

 Key: SPARK-3203
 URL: https://issues.apache.org/jira/browse/SPARK-3203
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.0.2
 Environment: Ubuntu 12.04, openjdk 64 bit 7u65
Reporter: Rohit Kumar

 I am using Spark as a processing engine over Cassandra. I have only one 
 master and a worker node. 
 I am executing the following code in spark-shell:
 sc.stop
 import org.apache.spark.SparkContext
 import org.apache.spark.SparkConf
 import com.datastax.spark.connector._
 val conf = new SparkConf(true).set("spark.cassandra.connection.host", "127.0.0.1")
 val sc = new SparkContext("spark://L-BXP44Z1:7077", "Cassandra Connector Test", conf)
 val rdd = sc.cassandraTable("test", "kv")
 println(rdd.map(_.getInt("value")).sum)
 I am getting following error:
 14/08/25 18:47:17 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to 
 another address
 14/08/25 18:49:39 INFO CoarseGrainedExecutorBackend: Got assigned task 0
 14/08/25 18:49:39 INFO Executor: Running task ID 0
 14/08/25 18:49:39 ERROR Executor: Exception in task ID 0
 java.lang.ClassNotFoundException: 
 $line29.$read$$iwC$$iwC$$iwC$$iwC$$iwC$$iwC$$anonfun$1
   at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
   at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
   at java.security.AccessController.doPrivileged(Native Method)
   at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
   at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
   at java.lang.Class.forName0(Native Method)
   at java.lang.Class.forName(Class.java:270)
   at 
 org.apache.spark.serializer.JavaDeserializationStream$$anon$1.resolveClass(JavaSerializer.scala:60)
   at 
 java.io.ObjectInputStream.readNonProxyDesc(ObjectInputStream.java:1612)
   at java.io.ObjectInputStream.readClassDesc(ObjectInputStream.java:1517)
   at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1771)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
   at 
 java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
   at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
   at 
 java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
   at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
   at scala.collection.immutable.$colon$colon.readObject(List.scala:362)
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 java.io.ObjectStreamClass.invokeReadObject(ObjectStreamClass.java:1017)
   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1893)
   at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
   at 
 java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990)
   at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915)
   at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
   at java.io.ObjectInputStream.readObject(ObjectInputStream.java:370)
   at 
 org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:63)
   at 
 org.apache.spark.scheduler.ResultTask$.deserializeInfo(ResultTask.scala:61)
   at 
 org.apache.spark.scheduler.ResultTask.readExternal(ResultTask.scala:141)
   at 
 java.io.ObjectInputStream.readExternalData(ObjectInputStream.java:1837)
   at 
 java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1796)
   at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350)
   

[jira] [Closed] (SPARK-3924) Upgrade to Akka version 2.3.7

2015-01-19 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson closed SPARK-3924.
-
Resolution: Fixed

 Upgrade to Akka version 2.3.7
 -

 Key: SPARK-3924
 URL: https://issues.apache.org/jira/browse/SPARK-3924
 Project: Spark
  Issue Type: Dependency upgrade
 Environment: deploy env
Reporter: Helena Edelson

 I tried every sbt in the book but can't use the latest Akka version in my 
 project with Spark. It would be great if I could.
 Also I can not use the latest Typesafe Config - 1.2.1, which would also be 
 great.
 See https://issues.apache.org/jira/browse/SPARK-2593
 This is a big change. If I have time I can do a PR.
 [~helena_e] 






[jira] [Commented] (SPARK-3924) Upgrade to Akka version 2.3.7

2015-01-19 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14282763#comment-14282763
 ] 

Helena Edelson commented on SPARK-3924:
---

I wrote this ticket against the previous, not the latest, version.


 Upgrade to Akka version 2.3.7
 -

 Key: SPARK-3924
 URL: https://issues.apache.org/jira/browse/SPARK-3924
 Project: Spark
  Issue Type: Dependency upgrade
 Environment: deploy env
Reporter: Helena Edelson

 I tried every sbt in the book but can't use the latest Akka version in my 
 project with Spark. It would be great if I could.
 Also I can not use the latest Typesafe Config - 1.2.1, which would also be 
 great.
 See https://issues.apache.org/jira/browse/SPARK-2593
 This is a big change. If I have time I can do a PR.
 [~helena_e] 






[jira] [Commented] (SPARK-4923) Maven build should keep publishing spark-repl

2014-12-30 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261639#comment-14261639
 ] 

Helena Edelson commented on SPARK-4923:
---

We are updating https://github.com/datastax/spark-cassandra-connector which 
integrates with the REPL, as does DSE, so this is a blocker for our upgrade to 
spark 1.2.0 as well.

 Maven build should keep publishing spark-repl
 -

 Key: SPARK-4923
 URL: https://issues.apache.org/jira/browse/SPARK-4923
 Project: Spark
  Issue Type: Bug
  Components: Build, Spark Shell
Affects Versions: 1.2.0
Reporter: Peng Cheng
Priority: Critical
  Labels: shell
 Attachments: 
 SPARK-4923__Maven_build_should_keep_publishing_spark-repl.patch

   Original Estimate: 1h
  Remaining Estimate: 1h

 Spark-repl installation and deployment has been discontinued (see 
 SPARK-3452), but it's in the dependency list of a few projects that extend 
 its initialization process.
 Please remove the 'skip' setting in spark-repl and make it an 'official' API 
 to encourage more platforms to integrate with it.






[jira] [Updated] (SPARK-3924) Upgrade to Akka version 2.3.7

2014-12-09 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson updated SPARK-3924:
--
Summary: Upgrade to Akka version 2.3.7  (was: Upgrade to Akka version 2.3.6)

 Upgrade to Akka version 2.3.7
 -

 Key: SPARK-3924
 URL: https://issues.apache.org/jira/browse/SPARK-3924
 Project: Spark
  Issue Type: Dependency upgrade
 Environment: deploy env
Reporter: Helena Edelson

 I tried every sbt in the book but can't use the latest Akka version in my 
 project with Spark. It would be great if I could.
 Also I can not use the latest Typesafe Config - 1.2.1, which would also be 
 great.
 See https://issues.apache.org/jira/browse/SPARK-2593
 This is a big change. If I have time I can do a PR.
 [~helena_e] 






[jira] [Comment Edited] (SPARK-3924) Upgrade to Akka version 2.3.7

2014-12-09 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181319#comment-14181319
 ] 

Helena Edelson edited comment on SPARK-3924 at 12/9/14 12:55 PM:
-

Modified to 2.3.7


was (Author: helena_e):
I met with Matei at Strata about this. Hopefully there is a way to not lock a 
user into the same version of Akka that Spark is on :)

 Upgrade to Akka version 2.3.7
 -

 Key: SPARK-3924
 URL: https://issues.apache.org/jira/browse/SPARK-3924
 Project: Spark
  Issue Type: Dependency upgrade
 Environment: deploy env
Reporter: Helena Edelson

 I tried every sbt in the book but can't use the latest Akka version in my 
 project with Spark. It would be great if I could.
 Also I can not use the latest Typesafe Config - 1.2.1, which would also be 
 great.
 See https://issues.apache.org/jira/browse/SPARK-2593
 This is a big change. If I have time I can do a PR.
 [~helena_e] 






[jira] [Commented] (SPARK-2808) update kafka to version 0.8.2

2014-12-07 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237150#comment-14237150
 ] 

Helena Edelson commented on SPARK-2808:
---

I've done the migration and am testing the changes against Scala 2.10.4, since 
the parent spark pom uses 
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
This will allow usage of the new producer, among other nice additions in 0.8.2. 
Though I do not know when it will be GA.
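
For context, a short sketch of the new producer API that 0.8.2 brings (broker 
address, topic, and values are placeholders):

{code}
import java.util.Properties
import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

val props = new Properties()
props.put("bootstrap.servers", "localhost:9092")
props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer")

val producer = new KafkaProducer[String, String](props)
producer.send(new ProducerRecord[String, String]("events", "key", "value"))
producer.close()
{code}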

 update kafka to version 0.8.2
 -

 Key: SPARK-2808
 URL: https://issues.apache.org/jira/browse/SPARK-2808
 Project: Spark
  Issue Type: Sub-task
  Components: Build, Spark Core
Reporter: Anand Avati

 First kafka_2.11 0.8.1 has to be released






[jira] [Comment Edited] (SPARK-2808) update kafka to version 0.8.2

2014-12-07 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237150#comment-14237150
 ] 

Helena Edelson edited comment on SPARK-2808 at 12/7/14 3:35 PM:


I've done the migration and am testing the changes against Scala 2.10.4, since 
the parent spark pom uses 
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
I do not know when it will be GA.


was (Author: helena_e):
I've done the migration, am testing the changes against Scala 2.10.4 since the 
parent spark pom uses 
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
This will allow usage of the new producer, among other nice additions in 0.8.2. 
Though I do not know when it will be GA.

 update kafka to version 0.8.2
 -

 Key: SPARK-2808
 URL: https://issues.apache.org/jira/browse/SPARK-2808
 Project: Spark
  Issue Type: Sub-task
  Components: Build, Spark Core
Reporter: Anand Avati

 First kafka_2.11 0.8.1 has to be released






[jira] [Comment Edited] (SPARK-2808) update kafka to version 0.8.2

2014-12-07 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2808?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14237150#comment-14237150
 ] 

Helena Edelson edited comment on SPARK-2808 at 12/7/14 3:52 PM:


I've done the migration and am testing the changes against Scala 2.10.4, since 
the parent spark pom uses 
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
I do not know when it will be GA.

https://github.com/apache/spark/pull/3631


was (Author: helena_e):
I've done the migration, am testing the changes against Scala 2.10.4 since the 
parent spark pom uses 
<scala.version>2.10.4</scala.version>
<scala.binary.version>2.10</scala.binary.version>
I do not know when it will be GA.

 update kafka to version 0.8.2
 -

 Key: SPARK-2808
 URL: https://issues.apache.org/jira/browse/SPARK-2808
 Project: Spark
  Issue Type: Sub-task
  Components: Build, Spark Core
Reporter: Anand Avati

 First kafka_2.11 0.8.1 has to be released






[jira] [Commented] (SPARK-4063) Add the ability to send messages to Kafka in the stream

2014-12-06 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14236899#comment-14236899
 ] 

Helena Edelson commented on SPARK-4063:
---

Obviously. But since I created the ticket, someone else submitted a PR: 
https://github.com/apache/spark/pull/2994.
The idea, of course, is that it is much cleaner to have one integration that 
can both read from and write to the stream.
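
In the meantime, a rough sketch of the usual workaround pattern (stream is 
assumed to be a DStream[String]; createProducer is a hypothetical helper that 
builds a KafkaProducer[String, String] for each partition):

{code}
import org.apache.kafka.clients.producer.ProducerRecord

stream.foreachRDD { rdd =>
  rdd.foreachPartition { records =>
    val producer = createProducer() // hypothetical helper
    records.foreach { r =>
      producer.send(new ProducerRecord[String, String]("out-topic", r))
    }
    producer.close()
  }
}
{code}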

 Add the ability to send messages to Kafka in the stream
 ---

 Key: SPARK-4063
 URL: https://issues.apache.org/jira/browse/SPARK-4063
 Project: Spark
  Issue Type: New Feature
  Components: Input/Output
Reporter: Helena Edelson

 Currently you can only receive from Kafka in the stream. This would be adding 
 the ability to publish from the stream as well.






[jira] [Closed] (SPARK-4063) Add the ability to send messages to Kafka in the stream

2014-12-06 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson closed SPARK-4063.
-
Resolution: Unresolved

A PR has been submitted since I created the ticket.

 Add the ability to send messages to Kafka in the stream
 ---

 Key: SPARK-4063
 URL: https://issues.apache.org/jira/browse/SPARK-4063
 Project: Spark
  Issue Type: New Feature
  Components: Input/Output
Reporter: Helena Edelson

 Currently you can only receive from Kafka in the stream. This would be adding 
 the ability to publish from the stream as well.






[jira] [Commented] (SPARK-3924) Upgrade to Akka version 2.3.6

2014-10-23 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14181319#comment-14181319
 ] 

Helena Edelson commented on SPARK-3924:
---

I met with Matei at Strata about this. Hopefully there is a way to not lock a 
user into the same version of Akka that Spark is on :)

 Upgrade to Akka version 2.3.6
 -

 Key: SPARK-3924
 URL: https://issues.apache.org/jira/browse/SPARK-3924
 Project: Spark
  Issue Type: Dependency upgrade
 Environment: deploy env
Reporter: Helena Edelson

 I tried every sbt in the book but can't use the latest Akka version in my 
 project with Spark. It would be great if I could.
 Also I can not use the latest Typesafe Config - 1.2.1, which would also be 
 great.
 See https://issues.apache.org/jira/browse/SPARK-2593
 This is a big change. If I have time I can do a PR.
 [~helena_e] 






[jira] [Resolved] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-10-23 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson resolved SPARK-2593.
---
Resolution: Won't Fix

As a user, I want to be able to use the latest version of Akka with Spark and 
not be locked into the version Spark is using :) I can live with 2 ActorSystem 
instances per node if it means I can use the Akka version I need. Hopefully 
there is a way in the build to scope Spark's Akka version.

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  






[jira] [Created] (SPARK-4063) Add the ability to send messages to Kafka in the stream

2014-10-23 Thread Helena Edelson (JIRA)
Helena Edelson created SPARK-4063:
-

 Summary: Add the ability to send messages to Kafka in the stream
 Key: SPARK-4063
 URL: https://issues.apache.org/jira/browse/SPARK-4063
 Project: Spark
  Issue Type: New Feature
  Components: Input/Output
Reporter: Helena Edelson


Currently you can only receive from Kafka in the stream. This would be adding 
the ability to publish from the stream as well.






[jira] [Commented] (SPARK-4063) Add the ability to send messages to Kafka in the stream

2014-10-23 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14181520#comment-14181520
 ] 

Helena Edelson commented on SPARK-4063:
---

I have this started in a WIP branch

 Add the ability to send messages to Kafka in the stream
 ---

 Key: SPARK-4063
 URL: https://issues.apache.org/jira/browse/SPARK-4063
 Project: Spark
  Issue Type: New Feature
  Components: Input/Output
Reporter: Helena Edelson

 Currently you can only receive from Kafka in the stream. This would be adding 
 the ability to publish from the stream as well.






[jira] [Created] (SPARK-3924) Upgrade to Akka version 2.3.6

2014-10-13 Thread Helena Edelson (JIRA)
Helena Edelson created SPARK-3924:
-

 Summary: Upgrade to Akka version 2.3.6
 Key: SPARK-3924
 URL: https://issues.apache.org/jira/browse/SPARK-3924
 Project: Spark
  Issue Type: Dependency upgrade
 Environment: deploy env
Reporter: Helena Edelson


I tried every sbt in the book but can't use the latest Akka version in my 
project with Spark. It would be great if I could.

Also I can not use the latest Typesafe Config - 1.2.1, which would also be 
great.

This is a big change. If I have time I can do a PR.
[~helena_e]






[jira] [Comment Edited] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-10-13 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169208#comment-14169208
 ] 

Helena Edelson edited comment on SPARK-2593 at 10/13/14 11:55 AM:
--

[~matei], [~pwendell] Yes, I see the pain point here now. I just created a 
ticket to upgrade the Akka and thus Typesafe Config versions, because I am now 
locked into 2.2.3 and have binary incompatibility with the latest Akka 2.3.6 
/ config 1.2.1. Makes me very sad.

I think I would throw in the towel on this one if you can make it completely 
separate, so that a user with their own ActorSystem and Config versions is not 
affected? Tricky because when deploying, spark needs its version (provided?) 
and the user app needs the other.


was (Author: helena_e):
[~matei] [~pwendell] Yes I see the pain point here now. I just created a ticket 
to upgrade Akka and thus Typesafe Config versions because I am now locked into 
2.2.3 and have binary incompatibility with using latest Akka 2.3.6 / config 
1.2.1. Makes me very sad.

I think I would throw in the towel on this one if you can make it completely 
separate so that a user with it's own AkkaSystem and Config versions are not 
affected? Tricky because when deploying, spark needs its version (provided?) 
and the user app needs the other.

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  






[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-10-13 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14169208#comment-14169208
 ] 

Helena Edelson commented on SPARK-2593:
---

[~matei] [~pwendell] Yes, I see the pain point here now. I just created a 
ticket to upgrade the Akka and thus Typesafe Config versions, because I am now 
locked into 2.2.3 and have binary incompatibility with the latest Akka 2.3.6 
/ config 1.2.1. Makes me very sad.

I think I would throw in the towel on this one if you can make it completely 
separate, so that a user with their own ActorSystem and Config versions is not 
affected? Tricky because when deploying, spark needs its version (provided?) 
and the user app needs the other.
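
For illustration, a minimal build.sbt sketch of that deployment split (versions 
are illustrative, and this only helps if the two Akka versions are binary 
compatible at runtime, which is exactly the open question):

{code}
// Spark is compiled against but not bundled; the app ships its own Akka.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "1.1.0" % "provided",
  "com.typesafe.akka" %% "akka-actor" % "2.3.6"
)
{code}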

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  






[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-18 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14138906#comment-14138906
 ] 

Helena Edelson commented on SPARK-2593:
---

[~matei] +1 for spark streaming; that is a primary concern here. And I 
understand your concern over supporting Akka upgrades. However, I am more than 
happy to help WRT that, and I am sure I can find a few others who feel the 
same, so that time isn't taken away from your team's bandwidth for new 
features/enhancements. I will get more data on having 2 actor systems on a node.

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  






[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-17 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137338#comment-14137338
 ] 

Helena Edelson commented on SPARK-2593:
---

[~pwendell] I forgot to note this in my reply above, but I was offering to do 
the work myself - or at least start it.

What would be ideal is to have both of these:
- Expose the spark actor system which also requires insuring all spark actors 
are on specified dispatchers (very important)
- Optionally allow spark users to pass in their existing actor system

I feel both are incredibly important for users.
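
As a small sketch of the first point, an actor pinned to a named dispatcher 
(the dispatcher name is illustrative and would have to be defined in the Akka 
configuration):

{code}
import akka.actor.{Actor, ActorSystem, Props}

class Worker extends Actor {
  def receive = { case m => /* handle m */ }
}

val system = ActorSystem("app")
// "spark-dispatcher" is assumed to be configured in application.conf
val ref = system.actorOf(Props[Worker].withDispatcher("spark-dispatcher"))
{code}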

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  






[jira] [Comment Edited] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-17 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137338#comment-14137338
 ] 

Helena Edelson edited comment on SPARK-2593 at 9/17/14 2:42 PM:


[~pwendell] I forgot to note this on my reply above but I was offering to do 
the work myself - or at least start it.

What would be ideal is to have both of these:
- Expose the spark actor system which also requires insuring all spark actors 
are on specified dispatchers (very important)
- Optionally allow spark users to pass in their existing actor system

I feel both are incredibly important for users.


was (Author: helena_e):
[~pwendell] I forgot to not this on my reply above but I was offering to do the 
work myself - or at least start it.

What would be ideal is to have both of these:
- Expose the spark actor system which also requires insuring all spark actors 
are on specified dispatchers (very important)
- Optionally allow spark users to pass in their existing actor system

I feel both are incredibly important for users.

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  






[jira] [Comment Edited] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-17 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14137338#comment-14137338
 ] 

Helena Edelson edited comment on SPARK-2593 at 9/17/14 2:44 PM:


[~pwendell] I forgot to note this on my reply above but I was offering to do 
the work myself - or at least start it.

What would be ideal is to have both of these:
- Expose the spark actor system which also requires insuring all spark actors 
are on specified dispatchers (very important)
- Optionally allow spark users to pass in their existing actor system
- Add a logical naming convention for spark streaming actors or a function to 
get it

I feel both are incredibly important for users.


was (Author: helena_e):
[~pwendell] I forgot to note this on my reply above but I was offering to do 
the work myself - or at least start it.

What would be ideal is to have both of these:
- Expose the spark actor system which also requires insuring all spark actors 
are on specified dispatchers (very important)
- Optionally allow spark users to pass in their existing actor system

I feel both are incredibly important for users.

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  






[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-16 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14136463#comment-14136463
 ] 

Helena Edelson commented on SPARK-2593:
---

[~pwendell] I've spent no time thus far because, yes, the Akka ActorSystem is 
private; that is the point of my ticket ;) Once that can be exposed I'm g2g.

- Helena
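
To make the situation concrete, a minimal sketch of what the ticket describes 
on Spark 1.x (names are illustrative):

{code}
import akka.actor.ActorSystem
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val appSystem = ActorSystem("app") // the application's own actor system
val conf = new SparkConf().setAppName("demo").setMaster("local[2]")
// Spark brings up a second, internal ActorSystem here; there is currently
// no public way to hand it appSystem instead.
val ssc = new StreamingContext(conf, Seconds(10))
{code}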

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  






[jira] [Updated] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-13 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson updated SPARK-2593:
--
Description: 
As a developer I want to pass an existing ActorSystem into StreamingContext in 
load-time so that I do not have 2 actor systems running on a node in an Akka 
application.

This would mean having spark's actor system on its own named-dispatchers as 
well as exposing the new private creation of its own actor system.
  
 

  was:
As a developer I want to pass an existing ActorSystem into StreamingContext in 
load-time so that I do not have 2 actor systems running on a node in an Akka 
application.

This would mean having spark's actor system on its own named-dispatchers as 
well as exposing the new private creation of its own actor system.
 
I would like to create an Akka Extension that wraps around Spark/Spark 
Streaming and Cassandra. So the programmatic creation would simply be this for 
a user

val extension = SparkCassandra(system)
 


 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-09-13 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132797#comment-14132797
 ] 

Helena Edelson commented on SPARK-2593:
---

Here is a good example of just one of the issues: it is difficult to locate a 
remote spark actor to publish data to the stream. Here I have to have the 
streaming actor get created and, in preStart, publish a custom message with 
`self`, which my actors in my ActorSystem can receive in order to get the 
ActorRef to send to. This is incredibly clunky.

I will try to carve out some time to do this PR this week.
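
Roughly, the workaround looks like this (a sketch; the message type and the 
remote path are made up for the example):

{code}
import akka.actor.{Actor, ActorRef, Props}
import org.apache.spark.streaming.receiver.ActorHelper

// Custom handshake message so my actors can learn the receiver's ActorRef.
case class StreamReceiverReady(ref: ActorRef)

class StreamingActor extends Actor with ActorHelper {
  // The receiver lives in Spark's private actor system, so on startup it
  // has to announce `self` to a known address in my own ActorSystem.
  override def preStart(): Unit =
    context.actorSelection("akka.tcp://MyApp@host:2552/user/listener") !
      StreamReceiverReady(self)

  def receive = {
    case data: String => store(data) // push incoming data into the stream
  }
}

// Wired up via: ssc.actorStream[String](Props[StreamingActor], "receiver")
{code}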
 

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
   
  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped

2014-09-04 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121324#comment-14121324
 ] 

Helena Edelson commented on SPARK-2892:
---

I see the same with 1.0.2 streaming:

ERROR 08:26:21,139 Deregistered receiver for stream 0: Stopped by driver
 WARN 08:26:21,211 Stopped executor without error
 WARN 08:26:21,213 All of the receivers have not deregistered, Map(0 -> 
ReceiverInfo(0,ActorReceiver-0,null,false,host,Stopped by driver,))


 Socket Receiver does not stop when streaming context is stopped
 ---

 Key: SPARK-2892
 URL: https://issues.apache.org/jira/browse/SPARK-2892
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.0.2
Reporter: Tathagata Das
Assignee: Tathagata Das
Priority: Critical

 Running NetworkWordCount with
 {quote}  
 ssc.start(); Thread.sleep(1); ssc.stop(stopSparkContext = false); 
 Thread.sleep(6)
 {quote}
 gives the following error
 {quote}
 14/08/06 18:37:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) 
 in 10047 ms on localhost (1/1)
 14/08/06 18:37:13 INFO DAGScheduler: Stage 0 (runJob at 
 ReceiverTracker.scala:275) finished in 10.056 s
 14/08/06 18:37:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks 
 have all completed, from pool
 14/08/06 18:37:13 INFO SparkContext: Job finished: runJob at 
 ReceiverTracker.scala:275, took 10.179263 s
 14/08/06 18:37:13 INFO ReceiverTracker: All of the receivers have been 
 terminated
 14/08/06 18:37:13 WARN ReceiverTracker: All of the receivers have not 
 deregistered, Map(0 -> 
 ReceiverInfo(0,SocketReceiver-0,null,false,localhost,Stopped by driver,))
 14/08/06 18:37:13 INFO ReceiverTracker: ReceiverTracker stopped
 14/08/06 18:37:13 INFO JobGenerator: Stopping JobGenerator immediately
 14/08/06 18:37:13 INFO RecurringTimer: Stopped timer for JobGenerator after 
 time 1407375433000
 14/08/06 18:37:13 INFO JobGenerator: Stopped JobGenerator
 14/08/06 18:37:13 INFO JobScheduler: Stopped JobScheduler
 14/08/06 18:37:13 INFO StreamingContext: StreamingContext stopped successfully
 14/08/06 18:37:43 INFO SocketReceiver: Stopped receiving
 14/08/06 18:37:43 INFO SocketReceiver: Closed socket to localhost:
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped

2014-09-04 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121324#comment-14121324
 ] 

Helena Edelson edited comment on SPARK-2892 at 9/4/14 1:12 PM:
---

I see the same with 1.0.2 streaming, with or without stopGracefully = true

ssc.stop(stopSparkContext = false, stopGracefully = true)

ERROR 08:26:21,139 Deregistered receiver for stream 0: Stopped by driver
 WARN 08:26:21,211 Stopped executor without error
 WARN 08:26:21,213 All of the receivers have not deregistered, Map(0 -> 
ReceiverInfo(0,ActorReceiver-0,null,false,host,Stopped by driver,))



was (Author: helena_e):
I see the same with 1.0.2 streaming:

ERROR 08:26:21,139 Deregistered receiver for stream 0: Stopped by driver
 WARN 08:26:21,211 Stopped executor without error
 WARN 08:26:21,213 All of the receivers have not deregistered, Map(0 -> 
ReceiverInfo(0,ActorReceiver-0,null,false,host,Stopped by driver,))


 Socket Receiver does not stop when streaming context is stopped
 ---

 Key: SPARK-2892
 URL: https://issues.apache.org/jira/browse/SPARK-2892
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.0.2
Reporter: Tathagata Das
Assignee: Tathagata Das
Priority: Critical

 Running NetworkWordCount with
 {quote}  
 ssc.start(); Thread.sleep(1); ssc.stop(stopSparkContext = false); 
 Thread.sleep(6)
 {quote}
 gives the following error
 {quote}
 14/08/06 18:37:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) 
 in 10047 ms on localhost (1/1)
 14/08/06 18:37:13 INFO DAGScheduler: Stage 0 (runJob at 
 ReceiverTracker.scala:275) finished in 10.056 s
 14/08/06 18:37:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks 
 have all completed, from pool
 14/08/06 18:37:13 INFO SparkContext: Job finished: runJob at 
 ReceiverTracker.scala:275, took 10.179263 s
 14/08/06 18:37:13 INFO ReceiverTracker: All of the receivers have been 
 terminated
 14/08/06 18:37:13 WARN ReceiverTracker: All of the receivers have not 
 deregistered, Map(0 -> 
 ReceiverInfo(0,SocketReceiver-0,null,false,localhost,Stopped by driver,))
 14/08/06 18:37:13 INFO ReceiverTracker: ReceiverTracker stopped
 14/08/06 18:37:13 INFO JobGenerator: Stopping JobGenerator immediately
 14/08/06 18:37:13 INFO RecurringTimer: Stopped timer for JobGenerator after 
 time 1407375433000
 14/08/06 18:37:13 INFO JobGenerator: Stopped JobGenerator
 14/08/06 18:37:13 INFO JobScheduler: Stopped JobScheduler
 14/08/06 18:37:13 INFO StreamingContext: StreamingContext stopped successfully
 14/08/06 18:37:43 INFO SocketReceiver: Stopped receiving
 14/08/06 18:37:43 INFO SocketReceiver: Closed socket to localhost:
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped

2014-09-04 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121561#comment-14121561
 ] 

Helena Edelson commented on SPARK-2892:
---

I wonder if the ERROR should be a WARN or INFO, since it occurs as a result of 
ReceiverSupervisorImpl receiving a StopReceiver, and "Deregistered receiver 
for stream" seems like the expected behavior.

 Socket Receiver does not stop when streaming context is stopped
 ---

 Key: SPARK-2892
 URL: https://issues.apache.org/jira/browse/SPARK-2892
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.0.2
Reporter: Tathagata Das
Assignee: Tathagata Das
Priority: Critical

 Running NetworkWordCount with
 {quote}  
 ssc.start(); Thread.sleep(1); ssc.stop(stopSparkContext = false); 
 Thread.sleep(6)
 {quote}
 gives the following error
 {quote}
 14/08/06 18:37:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) 
 in 10047 ms on localhost (1/1)
 14/08/06 18:37:13 INFO DAGScheduler: Stage 0 (runJob at 
 ReceiverTracker.scala:275) finished in 10.056 s
 14/08/06 18:37:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks 
 have all completed, from pool
 14/08/06 18:37:13 INFO SparkContext: Job finished: runJob at 
 ReceiverTracker.scala:275, took 10.179263 s
 14/08/06 18:37:13 INFO ReceiverTracker: All of the receivers have been 
 terminated
 14/08/06 18:37:13 WARN ReceiverTracker: All of the receivers have not 
 deregistered, Map(0 -> 
 ReceiverInfo(0,SocketReceiver-0,null,false,localhost,Stopped by driver,))
 14/08/06 18:37:13 INFO ReceiverTracker: ReceiverTracker stopped
 14/08/06 18:37:13 INFO JobGenerator: Stopping JobGenerator immediately
 14/08/06 18:37:13 INFO RecurringTimer: Stopped timer for JobGenerator after 
 time 1407375433000
 14/08/06 18:37:13 INFO JobGenerator: Stopped JobGenerator
 14/08/06 18:37:13 INFO JobScheduler: Stopped JobScheduler
 14/08/06 18:37:13 INFO StreamingContext: StreamingContext stopped successfully
 14/08/06 18:37:43 INFO SocketReceiver: Stopped receiving
 14/08/06 18:37:43 INFO SocketReceiver: Closed socket to localhost:
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Comment Edited] (SPARK-2892) Socket Receiver does not stop when streaming context is stopped

2014-09-04 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14121561#comment-14121561
 ] 

Helena Edelson edited comment on SPARK-2892 at 9/4/14 5:01 PM:
---

I wonder if the ERROR should be a WARN or INFO, since it occurs as a result of 
ReceiverSupervisorImpl receiving a StopReceiver, and "Deregistered receiver 
for stream" seems like the expected behavior.


DEBUG 13:00:22,418 Stopping JobScheduler
 INFO 13:00:22,441 Received stop signal
 INFO 13:00:22,441 Sent stop signal to all 1 receivers
 INFO 13:00:22,442 Stopping receiver with message: Stopped by driver: 
 INFO 13:00:22,442 Called receiver onStop
 INFO 13:00:22,443 Deregistering receiver 0
ERROR 13:00:22,445 Deregistered receiver for stream 0: Stopped by driver
 INFO 13:00:22,445 Stopped receiver 0


was (Author: helena_e):
I wonder if the ERROR should be a WARN or INFO, since it occurs as a result of 
ReceiverSupervisorImpl receiving a StopReceiver, and "Deregistered receiver 
for stream" seems like the expected behavior.

 Socket Receiver does not stop when streaming context is stopped
 ---

 Key: SPARK-2892
 URL: https://issues.apache.org/jira/browse/SPARK-2892
 Project: Spark
  Issue Type: Bug
  Components: Streaming
Affects Versions: 1.0.2
Reporter: Tathagata Das
Assignee: Tathagata Das
Priority: Critical

 Running NetworkWordCount with
 {quote}  
 ssc.start(); Thread.sleep(1); ssc.stop(stopSparkContext = false); 
 Thread.sleep(6)
 {quote}
 gives the following error
 {quote}
 14/08/06 18:37:13 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) 
 in 10047 ms on localhost (1/1)
 14/08/06 18:37:13 INFO DAGScheduler: Stage 0 (runJob at 
 ReceiverTracker.scala:275) finished in 10.056 s
 14/08/06 18:37:13 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks 
 have all completed, from pool
 14/08/06 18:37:13 INFO SparkContext: Job finished: runJob at 
 ReceiverTracker.scala:275, took 10.179263 s
 14/08/06 18:37:13 INFO ReceiverTracker: All of the receivers have been 
 terminated
 14/08/06 18:37:13 WARN ReceiverTracker: All of the receivers have not 
 deregistered, Map(0 -> 
 ReceiverInfo(0,SocketReceiver-0,null,false,localhost,Stopped by driver,))
 14/08/06 18:37:13 INFO ReceiverTracker: ReceiverTracker stopped
 14/08/06 18:37:13 INFO JobGenerator: Stopping JobGenerator immediately
 14/08/06 18:37:13 INFO RecurringTimer: Stopped timer for JobGenerator after 
 time 1407375433000
 14/08/06 18:37:13 INFO JobGenerator: Stopped JobGenerator
 14/08/06 18:37:13 INFO JobScheduler: Stopped JobScheduler
 14/08/06 18:37:13 INFO StreamingContext: StreamingContext stopped successfully
 14/08/06 18:37:43 INFO SocketReceiver: Stopped receiving
 14/08/06 18:37:43 INFO SocketReceiver: Closed socket to localhost:
 {quote}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-3178) setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the worker memory limit to zero

2014-08-25 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-3178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14110116#comment-14110116
 ] 

Helena Edelson commented on SPARK-3178:
---

+1, it doesn't look like the input value is validated to fail fast if m/g is 
not specified.
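
For example, a fail-fast check could look something like this (a sketch, not 
Spark's actual parsing code):

{code}
// Hypothetical helper: reject memory settings without an explicit unit
// instead of silently treating them as zero.
object MemorySetting {
  private val Pattern = """(?i)(\d+)([mg])""".r

  def parseMB(s: String): Int = s.trim match {
    case Pattern(n, unit) =>
      if (unit.equalsIgnoreCase("g")) n.toInt * 1024 else n.toInt
    case other =>
      throw new IllegalArgumentException(
        s"SPARK_WORKER_MEMORY must include a unit (m or g), got: '$other'")
  }
}

// MemorySetting.parseMB("4g")   // 4096
// MemorySetting.parseMB("512")  // fails fast instead of yielding 0
{code}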

 setting SPARK_WORKER_MEMORY to a value without a label (m or g) sets the 
 worker memory limit to zero
 

 Key: SPARK-3178
 URL: https://issues.apache.org/jira/browse/SPARK-3178
 Project: Spark
  Issue Type: Bug
 Environment: osx
Reporter: Jon Haddad

 This should either default to m or just completely fail.  Starting a worker 
 with zero memory isn't very helpful.



--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-2802) Improve the Cassandra sample and Add a new sample for Streaming to Cassandra

2014-08-02 Thread Helena Edelson (JIRA)
Helena Edelson created SPARK-2802:
-

 Summary: Improve the Cassandra sample and Add a new sample for 
Streaming to Cassandra
 Key: SPARK-2802
 URL: https://issues.apache.org/jira/browse/SPARK-2802
 Project: Spark
  Issue Type: Improvement
Reporter: Helena Edelson
Priority: Minor






--
This message was sent by Atlassian JIRA
(v6.2#6252)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-22 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson updated SPARK-2593:
--

Issue Type: Improvement  (was: Brainstorming)

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
 If it makes sense...
 I would like to create an Akka Extension that wraps around Spark/Spark 
 Streaming and Cassandra. So the creation would simply be this for a user
 val extension = SparkCassandra(system)
 and using it is as easy as:
 import extension._
 spark. // do work or, 
 streaming. // do work
  
 and all config comes from reference.conf and user overrides of that.
 The conf file would pick up settings from the deployed environment first, 
 then fall back to -D, with a final fallback to configured settings.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-22 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson updated SPARK-2593:
--

Description: 
As a developer I want to pass an existing ActorSystem into StreamingContext in 
load-time so that I do not have 2 actor systems running on a node in an Akka 
application.

This would mean having spark's actor system on its own named-dispatchers as 
well as exposing the new private creation of its own actor system.
 
I would like to create an Akka Extension that wraps around Spark/Spark 
Streaming and Cassandra. So the programmatic creation would simply be this for 
a user

val extension = SparkCassandra(system)
 

  was:
As a developer I want to pass an existing ActorSystem into StreamingContext in 
load-time so that I do not have 2 actor systems running on a node.

This would mean having spark's actor system on its own named-dispatchers as 
well as exposing the new private creation of its own actor system.

If it makes sense...

I would like to create an Akka Extension that wraps around Spark/Spark 
Streaming and Cassandra. So the creation would simply be this for a user

val extension = SparkCassandra(system)
 and using it is as easy as:

import extension._
spark. // do work or, 
streaming. // do work
 
and all config comes from reference.conf and user overrides of that.
The conf file would pick up settings from the deployed environment first, then 
fall back to -D, with a final fallback to configured settings.




 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node in an 
 Akka application.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
  
 I would like to create an Akka Extension that wraps around Spark/Spark 
 Streaming and Cassandra. So the programmatic creation would simply be this 
 for a user
 val extension = SparkCassandra(system)
  



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-19 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14067540#comment-14067540
 ] 

Helena Edelson commented on SPARK-2593:
---

I should note that I'd be happy to do the changes. I am a committer to Akka 
Cluster.

 Add ability to pass an existing Akka ActorSystem into Spark
 ---

 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Brainstorming
  Components: Spark Core
Reporter: Helena Edelson

 As a developer I want to pass an existing ActorSystem into StreamingContext 
 in load-time so that I do not have 2 actor systems running on a node.
 This would mean having spark's actor system on its own named-dispatchers as 
 well as exposing the new private creation of its own actor system.
 If it makes sense...
 I would like to create an Akka Extension that wraps around Spark/Spark 
 Streaming and Cassandra. So the creation would simply be this for a user
 val extension = SparkCassandra(system)
 and using it is as easy as:
 import extension._
 spark. // do work or, 
 streaming. // do work
  
 and all config comes from reference.conf and user overrides of that.
 The conf file would pick up settings from the deployed environment first, 
 then fall back to -D, with a final fallback to configured settings.



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (SPARK-2593) Add ability to pass an existing Akka ActorSystem into Spark

2014-07-19 Thread Helena Edelson (JIRA)
Helena Edelson created SPARK-2593:
-

 Summary: Add ability to pass an existing Akka ActorSystem into 
Spark
 Key: SPARK-2593
 URL: https://issues.apache.org/jira/browse/SPARK-2593
 Project: Spark
  Issue Type: Brainstorming
  Components: Spark Core
Reporter: Helena Edelson


As a developer I want to pass an existing ActorSystem into StreamingContext in 
load-time so that I do not have 2 actor systems running on a node.

This would mean having spark's actor system on its own named-dispatchers as 
well as exposing the new private creation of its own actor system.

If it makes sense...

I would like to create an Akka Extension that wraps around Spark/Spark 
Streaming and Cassandra. So the creation would simply be this for a user

val extension = SparkCassandra(system)
 and using it is as easy as:

import extension._
spark. // do work or, 
streaming. // do work
 
and all config comes from reference.conf and user overrides of that.
The conf file would pick up settings from the deployed environment first, then 
fall back to -D, with a final fallback to configured settings.
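
For illustration, a minimal sketch of such an extension under the assumptions 
above (the SparkCassandra name, the config key, and the fallback chain are the 
proposal, not existing code):

{code}
import akka.actor.{ExtendedActorSystem, Extension, ExtensionId, ExtensionIdProvider}
import com.typesafe.config.ConfigFactory
import org.apache.spark.{SparkConf, SparkContext}

object SparkCassandra extends ExtensionId[SparkCassandraExt] with ExtensionIdProvider {
  override def lookup = SparkCassandra
  override def createExtension(system: ExtendedActorSystem) = new SparkCassandraExt(system)
}

class SparkCassandraExt(system: ExtendedActorSystem) extends Extension {
  // Deployed environment first, then -D system properties, then the
  // reference.conf already loaded into the actor system's settings.
  private val config =
    ConfigFactory.systemEnvironment()
      .withFallback(ConfigFactory.systemProperties())
      .withFallback(system.settings.config)

  val spark: SparkContext = new SparkContext(
    new SparkConf()
      .setMaster(config.getString("spark.master")) // illustrative key
      .setAppName(system.name))
}

// val extension = SparkCassandra(system)
// extension.spark. // do work
{code}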





--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Created] (SPARK-2198) Partition the scala build file so that it is easier to maintain

2014-06-19 Thread Helena Edelson (JIRA)
Helena Edelson created SPARK-2198:
-

 Summary: Partition the scala build file so that it is easier to 
maintain
 Key: SPARK-2198
 URL: https://issues.apache.org/jira/browse/SPARK-2198
 Project: Spark
  Issue Type: Task
  Components: Build
Reporter: Helena Edelson
Priority: Minor


Partition into standard Dependencies.scala, Version.scala, Settings.scala, and 
Publish.scala files, keeping SparkBuild clean to describe the modules and 
their deps, so that changes in versions, for example, need only be made in 
Version.scala, settings changes such as scalac options in Settings.scala, etc.

I'd be happy to do this ([~helena_e]
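
For illustration, a sketch of the proposed layout (file contents and version 
numbers are made up):

{code}
// project/Version.scala
object Version {
  val akka  = "2.3.4"
  val scala = "2.10.4"
}

// project/Dependencies.scala
import sbt._

object Dependencies {
  val akkaActor = "com.typesafe.akka" %% "akka-actor" % Version.akka
  val core      = Seq(akkaActor /*, ... */)
}

// project/SparkBuild.scala then only wires modules to their dependencies:
//   lazy val core = project.settings(libraryDependencies ++= Dependencies.core)
{code}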



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (SPARK-2198) Partition the scala build file so that it is easier to maintain

2014-06-19 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson updated SPARK-2198:
--

Description: 
Partition into standard Dependencies.scala, Version.scala, Settings.scala, and 
Publish.scala files, keeping SparkBuild clean to describe the modules and 
their deps, so that changes in versions, for example, need only be made in 
Version.scala, settings changes such as scalac options in Settings.scala, etc.

I'd be happy to do this ([~helena_e])

  was:
Partition into standard Dependencies.scala, Version.scala, Settings.scala, and 
Publish.scala files, keeping SparkBuild clean to describe the modules and 
their deps, so that changes in versions, for example, need only be made in 
Version.scala, settings changes such as scalac options in Settings.scala, etc.

I'd be happy to do this ([~helena_e]


 Partition the scala build file so that it is easier to maintain
 ---

 Key: SPARK-2198
 URL: https://issues.apache.org/jira/browse/SPARK-2198
 Project: Spark
  Issue Type: Task
  Components: Build
Reporter: Helena Edelson
Priority: Minor
   Original Estimate: 3h
  Remaining Estimate: 3h

 Partition into standard Dependencies.scala, Version.scala, Settings.scala, 
 and Publish.scala files, keeping SparkBuild clean to describe the modules and 
 their deps, so that changes in versions, for example, need only be made in 
 Version.scala, settings changes such as scalac options in Settings.scala, etc.
 I'd be happy to do this ([~helena_e])



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (SPARK-2198) Partition the scala build file so that it is easier to maintain

2014-06-19 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson updated SPARK-2198:
--

Remaining Estimate: 2h  (was: 1m)
 Original Estimate: 2h  (was: 1m)

 Partition the scala build file so that it is easier to maintain
 ---

 Key: SPARK-2198
 URL: https://issues.apache.org/jira/browse/SPARK-2198
 Project: Spark
  Issue Type: Task
  Components: Build
Reporter: Helena Edelson
Priority: Minor
   Original Estimate: 2h
  Remaining Estimate: 2h

 Partition into standard Dependencies.scala, Version.scala, Settings.scala, 
 and Publish.scala files, keeping SparkBuild clean to describe the modules and 
 their deps, so that changes in versions, for example, need only be made in 
 Version.scala, settings changes such as scalac options in Settings.scala, etc.
 I'd be happy to do this ([~helena_e]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Updated] (SPARK-2198) Partition the scala build file so that it is easier to maintain

2014-06-19 Thread Helena Edelson (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Helena Edelson updated SPARK-2198:
--

Remaining Estimate: 3h  (was: 2h)
 Original Estimate: 3h  (was: 2h)

 Partition the scala build file so that it is easier to maintain
 ---

 Key: SPARK-2198
 URL: https://issues.apache.org/jira/browse/SPARK-2198
 Project: Spark
  Issue Type: Task
  Components: Build
Reporter: Helena Edelson
Priority: Minor
   Original Estimate: 3h
  Remaining Estimate: 3h

 Partition into standard Dependencies.scala, Version.scala, Settings.scala, 
 and Publish.scala files, keeping SparkBuild clean to describe the modules and 
 their deps, so that changes in versions, for example, need only be made in 
 Version.scala, settings changes such as scalac options in Settings.scala, etc.
 I'd be happy to do this ([~helena_e]



--
This message was sent by Atlassian JIRA
(v6.2#6252)


[jira] [Commented] (SPARK-2198) Partition the scala build file so that it is easier to maintain

2014-06-19 Thread Helena Edelson (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037467#comment-14037467
 ] 

Helena Edelson commented on SPARK-2198:
---

I am sad to hear that the Maven POMs will be primary (vs. Scala SBT) and are 
staying. It was very odd to see the SBT/Maven redundancies, however.

 Partition the scala build file so that it is easier to maintain
 ---

 Key: SPARK-2198
 URL: https://issues.apache.org/jira/browse/SPARK-2198
 Project: Spark
  Issue Type: Task
  Components: Build
Reporter: Helena Edelson
Priority: Minor
   Original Estimate: 3h
  Remaining Estimate: 3h

 Partition into standard Dependencies.scala, Version.scala, Settings.scala, 
 and Publish.scala files, keeping SparkBuild clean to describe the modules and 
 their deps, so that changes in versions, for example, need only be made in 
 Version.scala, settings changes such as scalac options in Settings.scala, etc.
 I'd be happy to do this ([~helena_e])



--
This message was sent by Atlassian JIRA
(v6.2#6252)