Yes, there is; I do it every day. It's very useful for trying things out to see whether a job is behaving according to plan.
Create the streaming context from the sc SparkContext. I mark everything @transient because I run this outside any enclosing scope (like an object), so otherwise values get "sucked up" into the closure, have to be serialized, and cause lots of errors. Here, windowDuration is a streaming Duration (e.g. Seconds(30)):

    @transient val ssc = new StreamingContext(sc, windowDuration)

Here is an example of the kind of thing you can do (I'm consuming Kinesis streams; you might use something else):

    @transient val streams = (0 until 1).map { i =>
      KinesisUtils.createStream(ssc, appName, streamName, endpointUrl, regionName,
        InitialPositionInStream.LATEST, windowDuration, StorageLevel.MEMORY_ONLY)
    }

    @transient val events = ssc
      .union(streams)
      .map(byteArray => new String(byteArray))
      .map(str => Funcs.parseLogEvent(str))
      .filter(tryLogEvent => tryLogEvent.isSuccess)
      .map(tryLogEvent => tryLogEvent.get)
      .filter(logEvent => logEvent.DataType == "Event" &&
        logEvent.RequestType != "UIInputError" &&
        logEvent.RequestType != "UIEmptyInput")

You need an output action like print(), otherwise nothing happens (you probably know that):

    events.print(10)

Now you start the context:

    ssc.start()

Then you wait for as long as it takes a window to "close", and stop the streaming context, *but not the underlying SparkContext*, otherwise you need to restart the interpreter:

    ssc.stop(stopSparkContext = false, stopGracefully = true)

Everything should work, I think. Let me know if you run into other problems. There's also a minimal end-to-end sketch in the P.S. at the bottom of this message.

*FA*
http://queirozf.com

"Every time you stay out late; every time you sleep in; every time you miss a workout; every time you don't give 100% – You make it that much easier for me to beat you." - Unknown author

On 15 February 2016 at 23:15, Michael Gummelt <mgumm...@mesosphere.io> wrote:

> Hi,
>
> I've seen this JIRA regarding Spark Streaming:
> https://issues.apache.org/jira/browse/ZEPPELIN-274, but I have a more
> basic question. Is there a way to iterate on a Spark Streaming job in
> Zeppelin without restarting the interpreter? Specifically, I'd like to
> start a streaming job, see some results, rewrite the job, then start it
> again. Spark has some limitations that make this difficult:
> http://spark.apache.org/docs/latest/streaming-programming-guide.html#points-to-remember
>
> But I'm wondering if there's a workaround. Thanks.
>
> --
> Michael Gummelt
> Software Engineer
> Mesosphere
>
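P.S. For completeness, here is a minimal self-contained sketch of the whole cycle in one place, ready to paste into a Zeppelin paragraph. It's only an illustration: I've swapped in socketTextStream as the source so it runs without Kinesis credentials, and the host/port and the 10-second batch duration are placeholders you'd adjust for your setup:

    import org.apache.spark.streaming.{Seconds, StreamingContext}

    // sc is the SparkContext Zeppelin already provides; the
    // 10-second batch duration is just a placeholder
    @transient val ssc = new StreamingContext(sc, Seconds(10))

    // stand-in source: any host/port emitting newline-delimited text
    // (note: a receiver occupies a core of its own, so the interpreter
    // needs at least two cores for anything to get processed)
    @transient val lines = ssc.socketTextStream("localhost", 9999)

    // an output action is required, or the job does nothing
    lines.filter(_.nonEmpty).print(10)

    ssc.start()

    // ...let a batch or two run, inspect the output, then (in a
    // later paragraph) stop the streaming context but keep sc alive:
    ssc.stop(stopSparkContext = false, stopGracefully = true)
    // sc survives, so you can rebuild ssc with new logic and
    // iterate without restarting the interpreter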