GitHub user Gauravshah opened a pull request:
https://github.com/apache/spark/pull/16842
SPARK-19304 fix kinesis slow checkpoint recovery
## What changes were proposed in this pull request?
Added a limit to the getRecords API call in KinesisBackedBlockRDD. This
reduces the amount of data returned per Kinesis API call, making recovery
considerably faster; a rough sketch of the idea follows.
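A minimal sketch, assuming the AWS SDK v1 Kinesis client that the connector uses; `shardIterator` and `recordLimit` are placeholders for values the RDD's record iterator already has in scope, not names from the actual patch:

```scala
import com.amazonaws.services.kinesis.AmazonKinesisClient
import com.amazonaws.services.kinesis.model.{GetRecordsRequest, GetRecordsResult}

// Cap the number of records fetched per getRecords call during recovery.
// Without a limit, a single call may return far more records than the
// checkpointed sequence-number range actually needs.
def getRecordsWithLimit(client: AmazonKinesisClient,
                        shardIterator: String,
                        recordLimit: Int): GetRecordsResult = {
  val request = new GetRecordsRequest()
  request.setShardIterator(shardIterator)
  request.setLimit(Integer.valueOf(recordLimit))
  client.getRecords(request)
}
```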
Since we already store the `fromSeqNum` and `toSeqNum` in the checkpoint
metadata, we can also store the number of records, which can later be used as
the limit for the API call.
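A rough sketch of how the checkpoint metadata could carry that count, assuming it is modeled along the lines of the connector's sequence-number range; the field and helper names here are illustrative, not the exact ones in `KinesisBackedBlockRDD`:

```scala
// Illustrative checkpoint metadata: alongside the from/to sequence numbers,
// keep the number of records the range spans.
case class SequenceNumberRange(
    streamName: String,
    shardId: String,
    fromSeqNumber: String,
    toSeqNumber: String,
    recordCount: Int)

// On recovery, the stored count bounds the getRecords request instead of
// the Kinesis default (up to 10,000 records per call).
def recordLimitFor(range: SequenceNumberRange, maxPerCall: Int = 10000): Int =
  math.min(range.recordCount, maxPerCall)
```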
## How was this patch tested?
The patch was tested manually.
Apologies for any silly mistakes; this is my first pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/Gauravshah/spark
kinesis_checkpoint_recovery_fix_2_1_0
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/16842.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #16842
----
commit b5e544a8ec326149b7d03773dd7abf8703ee44a2
Author: Gaurav <[email protected]>
Date: 2017-02-07T19:21:28Z
added limit to kinesis checkpoint backed rdd to reduce the number of records
loaded per aws getRecords call
----