[jira] [Commented] (SPARK-20168) Enable kinesis to start stream from Initial position specified by a timestamp
[ https://issues.apache.org/jira/browse/SPARK-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16602909#comment-16602909 ]

Vladimir Pchelko commented on SPARK-20168:
------------------------------------------

[~srowen] This bug must be covered by unit tests.

> Enable kinesis to start stream from Initial position specified by a timestamp
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-20168
>                 URL: https://issues.apache.org/jira/browse/SPARK-20168
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>    Affects Versions: 2.1.0
>            Reporter: Yash Sharma
>            Assignee: Yash Sharma
>            Priority: Minor
>              Labels: kinesis, streaming
>             Fix For: 2.4.0
>
>
> The Kinesis client can resume from a specified timestamp while creating a stream.
> We should have an option to pass a timestamp in the config to allow Kinesis to
> resume from the given timestamp.
> I have started initial work and will post a PR after I test the patch -
> https://github.com/yssharma/spark/commit/11269abf8b2a533a1b10ceee80ac2c3a2a80c4e8

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-20168) Enable kinesis to start stream from Initial position specified by a timestamp
[ https://issues.apache.org/jira/browse/SPARK-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16509933#comment-16509933 ]

Apache Spark commented on SPARK-20168:
--------------------------------------

User 'yashs360' has created a pull request for this issue:
https://github.com/apache/spark/pull/21541

> Enable kinesis to start stream from Initial position specified by a timestamp
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-20168
>                 URL: https://issues.apache.org/jira/browse/SPARK-20168
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>    Affects Versions: 2.1.0
>            Reporter: Yash Sharma
>            Assignee: Yash Sharma
>            Priority: Major
>              Labels: kinesis, streaming
>             Fix For: 2.3.0
[jira] [Commented] (SPARK-20168) Enable kinesis to start stream from Initial position specified by a timestamp
[ https://issues.apache.org/jira/browse/SPARK-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16488477#comment-16488477 ]

Sarath Chandra Bandaru commented on SPARK-20168:
------------------------------------------------

I think there is a bug in this. This call
{noformat}
.withInitialPositionInStream(initialPosition.getPosition)
{noformat}
is added to the builder first, and it throws an error when AT_TIMESTAMP is enabled, even though this is added later:
{noformat}
baseClientLibConfiguration.withTimestampAtInitialPositionInStream(ts.getTimestamp)
{noformat}
Below is the stack trace from streaming with AT_TIMESTAMP:
{noformat}
java.lang.IllegalArgumentException: Invalid InitialPosition: AT_TIMESTAMP
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.InitialPositionInStreamExtended.newInitialPosition(InitialPositionInStreamExtended.java:68)
    at com.amazonaws.services.kinesis.clientlibrary.lib.worker.KinesisClientLibConfiguration.withInitialPositionInStream(KinesisClientLibConfiguration.java:748)
    at org.apache.spark.streaming.kinesis.KinesisReceiver.onStart(KinesisReceiver.scala:163)
    at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:149)
    at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:131)
    at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:600)
    at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:590)
    at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2178)
    at org.apache.spark.SparkContext$$anonfun$34.apply(SparkContext.scala:2178)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:109)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{noformat}

> Enable kinesis to start stream from Initial position specified by a timestamp
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-20168
>                 URL: https://issues.apache.org/jira/browse/SPARK-20168
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>    Affects Versions: 2.1.0
>            Reporter: Yash Sharma
>            Assignee: Yash Sharma
>            Priority: Major
>              Labels: kinesis, streaming
>             Fix For: 2.3.0
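
The ordering problem reported in the comment above can be illustrated with a small self-contained sketch. Everything in it is a hypothetical stand-in: `Position` and `StubConfig` are not the real KCL classes, and `StubConfig` models only the one behaviour visible in the stack trace, namely that `withInitialPositionInStream` rejects AT_TIMESTAMP. Branching on the position before touching the builder is one plausible way to avoid the exception; it is not claimed to be the fix that was merged into Spark.

```java
import java.util.Date;

final class InitialPositionSketch {

    // Hypothetical stand-in for the KCL initial-position enum.
    enum Position { LATEST, TRIM_HORIZON, AT_TIMESTAMP }

    // Hypothetical stand-in for the KCL configuration builder. It only
    // mimics the rejection of AT_TIMESTAMP seen in the stack trace.
    static final class StubConfig {
        Position position = Position.LATEST;
        Date timestamp;

        StubConfig withInitialPositionInStream(Position p) {
            if (p == Position.AT_TIMESTAMP) {
                throw new IllegalArgumentException("Invalid InitialPosition: " + p);
            }
            position = p;
            timestamp = null;
            return this;
        }

        StubConfig withTimestampAtInitialPositionInStream(Date ts) {
            position = Position.AT_TIMESTAMP;
            timestamp = ts;
            return this;
        }
    }

    // Ordering described in the comment: the position is always set first,
    // so the timestamp call is never reached for AT_TIMESTAMP.
    static StubConfig configureBuggy(Position p, Date ts) {
        StubConfig config = new StubConfig().withInitialPositionInStream(p);
        if (p == Position.AT_TIMESTAMP) {
            config.withTimestampAtInitialPositionInStream(ts);
        }
        return config;
    }

    // Alternative ordering: decide which setter to call before calling either.
    static StubConfig configure(Position p, Date ts) {
        StubConfig config = new StubConfig();
        if (p == Position.AT_TIMESTAMP) {
            if (ts == null) {
                throw new IllegalArgumentException("AT_TIMESTAMP requires a timestamp");
            }
            return config.withTimestampAtInitialPositionInStream(ts);
        }
        return config.withInitialPositionInStream(p);
    }

    public static void main(String[] args) {
        Date ts = new Date(0L);
        try {
            configureBuggy(Position.AT_TIMESTAMP, ts);
        } catch (IllegalArgumentException e) {
            System.out.println("buggy ordering failed: " + e.getMessage());
        }
        StubConfig ok = configure(Position.AT_TIMESTAMP, ts);
        System.out.println("branching ordering: " + ok.position + " at " + ok.timestamp.getTime());
    }
}
```

A unit test along the lines requested elsewhere in this thread could assert exactly these two outcomes: the first ordering throws for AT_TIMESTAMP, and the branching ordering yields a configuration carrying the timestamp.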
[jira] [Commented] (SPARK-20168) Enable kinesis to start stream from Initial position specified by a timestamp
[ https://issues.apache.org/jira/browse/SPARK-20168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16015706#comment-16015706 ]

Apache Spark commented on SPARK-20168:
--------------------------------------

User 'yssharma' has created a pull request for this issue:
https://github.com/apache/spark/pull/18029

> Enable kinesis to start stream from Initial position specified by a timestamp
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-20168
>                 URL: https://issues.apache.org/jira/browse/SPARK-20168
>             Project: Spark
>          Issue Type: Improvement
>          Components: DStreams
>    Affects Versions: 2.1.0
>            Reporter: Yash Sharma
>              Labels: kinesis, streaming