Hi Gordon: Cool. Thanks for the thumb-up!
We will include some test cases around the behavior of re-sharding. If needed we can double check the behavior with AWS, and see if additional changes are needed. Will keep you posted. - Ying On Wed, Jul 4, 2018 at 7:22 PM, Tzu-Li (Gordon) Tai <tzuli...@apache.org> wrote: > Hi Ying, > > Sorry for the late reply here. > > From the looks of the AmazonDynamoDBStreamsClient, yes it seems like this > should simply work. > > Regarding the resharding behaviour I mentioned in the JIRA: > I'm not sure if this is really a difference in behaviour. Internally, if > DynamoDB streams is actually just working on Kinesis Streams, then the > resharding primitives should be similar. > The shard discovery logic of the Flink Kinesis Consumer assumes that > splitting / merging shards will result in new shards of increasing, > consecutive shard ids. As long as this is also the behaviour for DynamoDB > resharding, then we should be fine. > > Feel free to start with the implementation for this, I think design-wise > we're good to go. And thanks for working on this! > > Cheers, > Gordon > > On Wed, Jul 4, 2018 at 1:59 PM Ying Xu <y...@lyft.com> wrote: > > > HI Gordon: > > > > We are starting to implement some of the primitives along this path. > Please > > let us know if you have any suggestions. > > > > Thanks! > > > > On Fri, Jun 29, 2018 at 12:31 AM, Ying Xu <y...@lyft.com> wrote: > > > > > Hi Gordon: > > > > > > Really appreciate the reply. > > > > > > Yes our plan is to build the connector on top of the > > FlinkKinesisConsumer. > > > At the high level, FlinkKinesisConsumer mainly interacts with Kinesis > > > through the AmazonKinesis client, more specifically through the > following > > > three function calls: > > > > > > - describeStream > > > - getRecords > > > - getShardIterator > > > > > > Given that the low-level DynamoDB client (AmazonDynamoDBStreamsClient) > > > has already implemented similar calls, it is possible to use that > client > > to > > > interact with the dynamoDB streams, and adapt the results from the > > dynamoDB > > > streams model to the kinesis model. > > > > > > It appears this is exactly what the AmazonDynamoDBStreamsAdapterClient > > > < > > https://github.com/awslabs/dynamodb-streams-kinesis- > adapter/blob/master/src/main/java/com/amazonaws/services/ > dynamodbv2/streamsadapter/AmazonDynamoDBStreamsAdapterClient.java > > > > > > does. The adaptor client implements the AmazonKinesis client interface, > > > and is officially supported by AWS. Hence it is possible to replace > the > > > internal Kinesis client inside FlinkKinesisConsumer with this adapter > > > client when interacting with dynamoDB streams. The new object can be a > > > subclass of FlinkKinesisConsumer with a new name e.g, > > FlinkDynamoStreamCon > > > sumer. > > > > > > At best this could simply work. But we would like to hear if there are > > > other situations to take care of. In particular, I am wondering what's > > the *"resharding > > > behavior"* mentioned in FLINK-4582. > > > > > > Thanks a lot! > > > > > > - > > > Ying > > > > > > On Wed, Jun 27, 2018 at 10:43 PM, Tzu-Li (Gordon) Tai < > > tzuli...@apache.org > > > > wrote: > > > > > >> Hi! > > >> > > >> I think it would be definitely nice to have this feature. > > >> > > >> No actual previous work has been made on this issue, but AFAIK, we > > should > > >> be able to build this on top of the FlinkKinesisConsumer. > > >> Whether this should live within the Kinesis connector module or an > > >> independent module of its own is still TBD. > > >> If you want, I would be happy to look at any concrete design proposals > > you > > >> have for this before you start the actual development efforts. > > >> > > >> Cheers, > > >> Gordon > > >> > > >> On Thu, Jun 28, 2018 at 2:12 AM Ying Xu <y...@lyft.com> wrote: > > >> > > >> > Thanks Fabian for the suggestion. > > >> > > > >> > *Ying Xu* > > >> > Software Engineer > > >> > 510.368.1252 <+15103681252> > > >> > [image: Lyft] <http://www.lyft.com/> > > >> > > > >> > On Wed, Jun 27, 2018 at 2:01 AM, Fabian Hueske <fhue...@gmail.com> > > >> wrote: > > >> > > > >> > > Hi Ying, > > >> > > > > >> > > I'm not aware of any effort for this issue. > > >> > > You could check with the assigned contributor in Jira if there is > > some > > >> > > previous work. > > >> > > > > >> > > Best, Fabian > > >> > > > > >> > > 2018-06-26 9:46 GMT+02:00 Ying Xu <y...@lyft.com>: > > >> > > > > >> > > > Hello Flink dev: > > >> > > > > > >> > > > We have a number of use cases which involves pulling data from > > >> DynamoDB > > >> > > > streams into Flink. > > >> > > > > > >> > > > Given that this issue is tracked by Flink-4582 > > >> > > > <https://issues.apache.org/jira/browse/FLINK-4582>. we would > like > > >> to > > >> > > check > > >> > > > if any prior work has been completed by the community. We are > > also > > >> > very > > >> > > > interested in contributing to this effort. Currently, we have a > > >> > > high-level > > >> > > > proposal which is based on extending the existing > > >> FlinkKinesisConsumer > > >> > > and > > >> > > > making it work with DynamoDB streams (via integrating with the > > >> > > > AmazonDynamoDBStreams API). > > >> > > > > > >> > > > Any suggestion is welcome. Thank you very much. > > >> > > > > > >> > > > > > >> > > > - > > >> > > > Ying > > >> > > > > > >> > > > > >> > > > >> > > > > > > > > >