[jira] [Commented] (SPARK-18020) Kinesis receiver does not snapshot when shard completes

2016-12-08 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15731884#comment-15731884
 ] 

Apache Spark commented on SPARK-18020:
--

User 'maropu' has created a pull request for this issue:
https://github.com/apache/spark/pull/16213

> Kinesis receiver does not snapshot when shard completes
> ---
>
> Key: SPARK-18020
> URL: https://issues.apache.org/jira/browse/SPARK-18020
> Project: Spark
>  Issue Type: Bug
>  Components: DStreams
>Affects Versions: 2.0.0
>Reporter: Yonathan Randolph
>Priority: Minor
>  Labels: kinesis
>
> When a Kinesis shard is split or merged and the old shard ends, the Amazon Kinesis Client Library
> [calls IRecordProcessor.shutdown|https://github.com/awslabs/amazon-kinesis-client/blob/v1.7.0/src/main/java/com/amazonaws/services/kinesis/clientlibrary/lib/worker/ShutdownTask.java#L100]
> and expects {{IRecordProcessor.shutdown}} to checkpoint the sequence number
> {{ExtendedSequenceNumber.SHARD_END}} before returning. Unfortunately, Spark's
> [KinesisRecordProcessor|https://github.com/apache/spark/blob/v2.0.1/external/kinesis-asl/src/main/scala/org/apache/spark/streaming/kinesis/KinesisRecordProcessor.scala]
> sometimes does not checkpoint SHARD_END. This results in an error message, and
> Spark is then blocked indefinitely from processing any items from the child shards.
> This issue has also been raised on StackOverflow: [resharding while spark running on kinesis stream|http://stackoverflow.com/questions/38898691/resharding-while-spark-running-on-kinesis-stream]
> Exception that is logged:
> {code}
> 16/10/19 19:37:49 ERROR worker.ShutdownTask: Application exception.
> java.lang.IllegalArgumentException: Application didn't checkpoint at end of shard shardId-0030
>     at com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShutdownTask.call(ShutdownTask.java:106)
>     at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:49)
>     at com.amazonaws.services.kinesis.clientlibrary.lib.worker.MetricsCollectingTaskDecorator.call(MetricsCollectingTaskDecorator.java:24)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>     at java.lang.Thread.run(Thread.java:745)
> {code}
> Command used to split shard:
> {code}
> aws kinesis --region us-west-1 split-shard --stream-name my-stream --shard-to-split shardId-0030 --new-starting-hash-key 5316911983139663491615228241121378303
> {code}
> After the Spark Streaming job has hung, examining the DynamoDB table 
> indicates that the parent shard's processor has not reached 
> {{ExtendedSequenceNumber.SHARD_END}} and the child shards are still at 
> {{ExtendedSequenceNumber.TRIM_HORIZON}}, waiting for the parent to finish:
> {code}
> aws kinesis --region us-west-1 describe-stream --stream-name my-stream
> {
>     "StreamDescription": {
>         "RetentionPeriodHours": 24,
>         "StreamName": "my-stream",
>         "Shards": [
>             {
>                 "ShardId": "shardId-0030",
>                 "HashKeyRange": {
>                     "EndingHashKey": "10633823966279326983230456482242756606",
>                     "StartingHashKey": "0"
>                 },
>                 ...
>             },
>             {
>                 "ShardId": "shardId-0062",
>                 "HashKeyRange": {
>                     "EndingHashKey": "5316911983139663491615228241121378302",
>                     "StartingHashKey": "0"
>                 },
>                 "ParentShardId": "shardId-0030",
>                 "SequenceNumberRange": {
>                     "StartingSequenceNumber": "49566806087883755242230188435465744452396445937434624994"
>                 }
>             },
>             {
>                 "ShardId": "shardId-0063",
>                 "HashKeyRange": {
>                     "EndingHashKey": "10633823966279326983230456482242756606",
>                     "StartingHashKey": "5316911983139663491615228241121378303"
>                 },
>                 "ParentShardId": "shardId-0030",
>                 "SequenceNumberRange": {
>                     "StartingSequenceNumber": "49566806087906055987428719058607280170669094298940605426"
>                 }
>             },
>             ...
>         ],
>         "StreamStatus": "ACTIVE"
>     }
> }
> aws dynamodb --region us-west-1 scan --table-name my-processor
> {
>     "Items": [
>         {
>             "leaseOwner": {
>                 "S": 
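
For reference, below is a minimal sketch (Scala, against the KCL 1.x interfaces used by Spark's Kinesis connector; the class name and structure are illustrative, not Spark's actual KinesisRecordProcessor) of the contract the description above refers to: when the KCL shuts a processor down with reason TERMINATE, the processor is expected to checkpoint so the lease is recorded at SHARD_END and the child shards can start consuming.

{code}
// Illustrative sketch only -- not Spark's KinesisRecordProcessor.
import java.util.{List => JList}

import com.amazonaws.services.kinesis.clientlibrary.interfaces.{IRecordProcessor, IRecordProcessorCheckpointer}
import com.amazonaws.services.kinesis.clientlibrary.lib.worker.ShutdownReason
import com.amazonaws.services.kinesis.model.Record

class ShardEndAwareProcessor extends IRecordProcessor {

  override def initialize(shardId: String): Unit = ()

  override def processRecords(records: JList[Record],
                              checkpointer: IRecordProcessorCheckpointer): Unit = {
    // Hand records off to the application and checkpoint periodically as usual.
  }

  override def shutdown(checkpointer: IRecordProcessorCheckpointer,
                        reason: ShutdownReason): Unit = {
    if (reason == ShutdownReason.TERMINATE) {
      // The shard has ended (split or merge). The parameterless checkpoint()
      // records SHARD_END for this lease; skipping it produces
      // "Application didn't checkpoint at end of shard ..." and leaves the
      // child shards waiting at TRIM_HORIZON.
      checkpointer.checkpoint()
    }
    // ShutdownReason.ZOMBIE: the lease has been lost, so do not checkpoint.
  }
}
{code}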

[jira] [Commented] (SPARK-18020) Kinesis receiver does not snapshot when shard completes

2016-12-04 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15719797#comment-15719797
 ] 

Takeshi Yamamuro commented on SPARK-18020:
--

I made a workaround patch for this issue 
(https://github.com/apache/spark/compare/master...maropu:SPARK-18020) and 
manually verified that it resolves the problem.
But I'm not sure this workaround is acceptable, so I'll keep looking for 
better approaches.


[jira] [Commented] (SPARK-18020) Kinesis receiver does not snapshot when shard completes

2016-12-04 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15719774#comment-15719774
 ] 

Takeshi Yamamuro commented on SPARK-18020:
--

I tried to checkpoint at SHARD_END via 
`IRecordProcessorCheckpointer#checkpoint(ExtendedSequenceNumber.SHARD_END.toString())`,
 but I got an IllegalArgumentException with a message like "Sequence 
number must be numeric, but was SHARD_END". Since I'm not sure whether this is 
expected behaviour, I've asked the AWS folks about it here: 
https://forums.aws.amazon.com/thread.jspa?threadID=244218. 
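
For what it's worth, here is a small sketch (Scala; the object and method names are hypothetical) contrasting the two calls: checkpoint(String) validates its argument as a numeric sequence number, which would explain the rejection above, while the parameterless checkpoint(), when invoked from shutdown with reason TERMINATE, is the call that advances the lease to SHARD_END.

{code}
// Hypothetical helper illustrating the two checkpoint calls discussed above.
import com.amazonaws.services.kinesis.clientlibrary.interfaces.IRecordProcessorCheckpointer

object CheckpointSketch {
  def checkpointAtShardEnd(checkpointer: IRecordProcessorCheckpointer): Unit = {
    // Rejected by the KCL ("Sequence number must be numeric, but was SHARD_END"):
    // checkpointer.checkpoint(ExtendedSequenceNumber.SHARD_END.toString)

    // From shutdown(checkpointer, ShutdownReason.TERMINATE), the parameterless
    // call is what records SHARD_END for the lease:
    checkpointer.checkpoint()
  }
}
{code}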


[jira] [Commented] (SPARK-18020) Kinesis receiver does not snapshot when shard completes

2016-12-03 Thread Takeshi Yamamuro (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15717915#comment-15717915
 ] 

Takeshi Yamamuro commented on SPARK-18020:
--

I'm currently looking into this issue.


[jira] [Commented] (SPARK-18020) Kinesis receiver does not snapshot when shard completes

2016-11-28 Thread Basile Deustua (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15702167#comment-15702167
 ] 

Basile Deustua commented on SPARK-18020:


Same here


[jira] [Commented] (SPARK-18020) Kinesis receiver does not snapshot when shard completes

2016-11-09 Thread Hiroki Takeda (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-18020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15650539#comment-15650539
 ] 

Hiroki Takeda commented on SPARK-18020:
---

I'm actually experiencing the exact same problem.

