GitHub user maropu opened a pull request:

    https://github.com/apache/spark/pull/16213

    [SPARK-18020][Streaming][Kinesis] Checkpoint SHARD_END to finish reading 
closed shards 

    ## What changes were proposed in this pull request?
    This pr is to fix an issue occurred when resharding Kinesis streams; the 
resharding makes the KCL throw an exception because Spark does not checkpoint 
`SHARD_END` when finishing reading closed shards in 
`KinesisRecordProcessor#shutdown`. This bug finally leads to stopping 
subscribing new split (or merged) shards.
    
    ## How was this patch tested?
    Added a test in `KinesisStreamSuite` to check if it works well when 
splitting/merging shards.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/maropu/spark SPARK-18020

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/16213.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #16213
    
----
commit e556d3ca64b9c0bf74dd7d4466425b4656342460
Author: Takeshi YAMAMURO <[email protected]>
Date:   2016-12-04T11:26:31Z

    Force advancing a position to SHARD_END by using the KCL internal function, 
this is workaround

commit 7c51da8c1d31e2d963c8ca724c83010c278ea65a
Author: Takeshi YAMAMURO <[email protected]>
Date:   2016-12-08T02:02:29Z

    Add tests for splitting and merging shards

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to