[
https://issues.apache.org/jira/browse/SPARK-14693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15250860#comment-15250860
]
Evan Oman commented on SPARK-14693:
-----------------------------------
[~srowen] I apologize, I am very new to all of this: how would I go about
gathering this information?
I have looked at the Spark driver logs and there isn't anything interesting
there (just an "Executing command, time = blah" message whenever I run the
block containing my ssc.start() command).
Additionally, none of the Spark UI tabs contain any information about the
ssc.start() command's execution.
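In case it helps narrow things down, one thing I can try is dumping the
driver's thread stacks from another notebook cell while ssc.start() is stuck;
a minimal sketch using the standard JMX thread bean (nothing here is
Databricks-specific):
{code:borderStyle=solid}
// Sketch: print every live thread's stack from inside the driver JVM,
// to see which call ssc.start() is actually blocked in.
import java.lang.management.ManagementFactory

ManagementFactory.getThreadMXBean
  .dumpAllThreads(true, true) // include locked monitors and synchronizers
  .foreach(println)
{code}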
> Spark Streaming Context Hangs on Start
> --------------------------------------
>
> Key: SPARK-14693
> URL: https://issues.apache.org/jira/browse/SPARK-14693
> Project: Spark
> Issue Type: Bug
> Components: Streaming
> Affects Versions: 1.6.0, 1.6.1
> Environment: Databricks Cloud
> Reporter: Evan Oman
>
> All,
> I am trying to use Kinesis with Spark Streaming on Spark 1.6.0 via
> Databricks, and my `ssc.start()` call hangs.
> I am using the following function (based on [this
> guide|http://spark.apache.org/docs/latest/streaming-kinesis-integration.html],
> which, as an aside, contains some broken GitHub links) to create my Spark
> Streaming Context:
> {code:borderStyle=solid}
> def creatingFunc(sc: SparkContext): StreamingContext = {
>   // Create a StreamingContext with the given batch interval
>   val ssc = new StreamingContext(sc, Seconds(batchIntervalSeconds))
>
>   // Create a Kinesis stream
>   val kinesisStream = KinesisUtils.createStream(ssc,
>     kinesisAppName, kinesisStreamName,
>     kinesisEndpointUrl,
>     RegionUtils.getRegionByEndpoint(kinesisEndpointUrl).getName,
>     InitialPositionInStream.LATEST,
>     Seconds(kinesisCheckpointIntervalSeconds),
>     StorageLevel.MEMORY_AND_DISK_SER_2,
>     config.awsAccessKeyId, config.awsSecretKey)
>
>   kinesisStream.print()
>   ssc.remember(Minutes(1))
>   ssc.checkpoint(checkpointDir)
>   ssc
> }
> {code}
> However when I run the following to start the streaming context:
> {code:borderStyle=solid}
> // Stop any existing StreamingContext.
> val stopActiveContext = true
> if (stopActiveContext) {
>   StreamingContext.getActive.foreach { _.stop(stopSparkContext = false) }
> }
>
> // Get or create a streaming context.
> val ssc = StreamingContext.getActiveOrCreate(() => main.creatingFunc(sc))
>
> // This starts the streaming context in the background.
> ssc.start()
> {code}
> The last bit, `ssc.start()`, hangs indefinitely without issuing any log
> messages. I am running this on a freshly spun-up cluster with no other
> notebooks attached, so there aren't any other streaming contexts running.
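> In case it helps, I can also poll the context's state from a separate cell
> while start() is stuck; a minimal sketch (getState() is the public 1.6 API,
> the polling loop is just illustrative):
> {code:borderStyle=solid}
> // Sketch: check whether start() ever moves the context out of INITIALIZED.
> // getState() returns INITIALIZED, ACTIVE, or STOPPED.
> (1 to 12).foreach { _ =>
>   println(s"state = ${ssc.getState()}")
>   Thread.sleep(5000)
> }
> {code}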
> Any thoughts?
> Additionally, here are the libraries I am using (from my build.sbt file):
> {code:borderStyle=solid}
> "org.apache.spark" % "spark-core_2.10" % "1.6.0"
> "org.apache.spark" % "spark-sql_2.10" % "1.6.0"
> "org.apache.spark" % "spark-streaming-kinesis-asl_2.10" % "1.6.0"
> "org.apache.spark" % "spark-streaming_2.10" % "1.6.0"
> {code}
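> As an aside, since Databricks already supplies the Spark jars at runtime, I
> believe the usual advice is to mark the core artifacts "provided" and only
> bundle the Kinesis ASL jar; a build.sbt sketch (an assumption about the
> setup, not something I've confirmed affects the hang):
> {code:borderStyle=solid}
> // Sketch: use the cluster's own Spark jars; ship only the Kinesis connector.
> libraryDependencies ++= Seq(
>   "org.apache.spark" % "spark-core_2.10"      % "1.6.0" % "provided",
>   "org.apache.spark" % "spark-sql_2.10"       % "1.6.0" % "provided",
>   "org.apache.spark" % "spark-streaming_2.10" % "1.6.0" % "provided",
>   "org.apache.spark" % "spark-streaming-kinesis-asl_2.10" % "1.6.0"
> )
> {code}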