+1. For now we can keep this in hudi-utilities itself, IMO. As for the connector, or the DeltaStreamer Source to be specific, should we just integrate with Kinesis? If DynamoDB will pump its changes into Kinesis anyway, why should we be aware of DynamoDB directly? We may also need to rethink how we are going to maintain the schema: do Kinesis streams already have schemas mapped from DynamoDB, or should we implement a DynamoDBSchemaProvider as well?
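As a rough illustration of the schema question above: DynamoDB Streams records carry typed attribute values (for example {"S": "a1"} or {"N": "3"}) rather than a declared schema, so a schema provider would probably have to derive one from the records themselves. The Python sketch below is only an assumption about how such a mapping might look. The names unmarshal, flatten_change and infer_schema are hypothetical and are not part of Hudi or any AWS SDK, and only a subset of the DynamoDB types is handled.

```python
from decimal import Decimal


def unmarshal(attr):
    """Convert one DynamoDB typed attribute value, e.g. {"N": "42"}, to a plain Python value."""
    # Each attribute value is a single-entry dict: {type_tag: payload}.
    (tag, value), = attr.items()
    if tag == "S":
        return value
    if tag == "N":
        # Numbers arrive as strings; Decimal avoids losing precision.
        return Decimal(value)
    if tag == "BOOL":
        return bool(value)
    if tag == "NULL":
        return None
    if tag == "L":
        return [unmarshal(v) for v in value]
    if tag == "M":
        return {k: unmarshal(v) for k, v in value.items()}
    raise ValueError(f"unsupported DynamoDB type tag in this sketch: {tag}")


def flatten_change(record):
    """Turn one stream record into a flat row (hypothetical row layout)."""
    change = record["dynamodb"]
    # REMOVE events carry no NewImage, so fall back to the key attributes.
    image = change.get("NewImage") or change.get("Keys", {})
    row = {name: unmarshal(attr) for name, attr in image.items()}
    row["_event_name"] = record["eventName"]  # INSERT, MODIFY or REMOVE
    return row


def infer_schema(row):
    """Derive a naive field-name -> type-name mapping from a single row."""
    return {name: type(value).__name__ for name, value in row.items()}


if __name__ == "__main__":
    sample = {
        "eventName": "INSERT",
        "dynamodb": {
            "Keys": {"id": {"S": "a1"}},
            "NewImage": {"id": {"S": "a1"}, "qty": {"N": "3"}},
        },
    }
    row = flatten_change(sample)
    print(row)
    print(infer_schema(row))
```

Inferring the schema from a single record is clearly too naive for a schemaless table, where two items can have different attributes; it is shown here only to make the question concrete.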
This would be a really great addition. But I can also see how challenging it can be (which is fun :))

On Sun, Sep 22, 2019 at 4:09 AM Taher Koitawala <[email protected]> wrote:

> I think this will be a good opportunity to plan better in terms of
> abstraction too, which is needed for the Flink and Beam engines we might
> use.
>
> Regards,
> Taher Koitawala
>
> On Sun, Sep 22, 2019, 3:37 PM leesf <[email protected]> wrote:
>
> > +1.
> > Happy to see DeltaStreamer become more and more powerful. Also, we need
> > to pay some attention to the layout and organization of these connectors
> > as more and more data sources are introduced to HUDI, like vinoyang
> > suggested.
> >
> > Best,
> > Leesf
> >
> > On Sun, Sep 22, 2019 at 12:18 PM Bhavani Sudha Saktheeswaran
> > <[email protected]> wrote:
> >
> > > +1 to adding more connectors to DeltaStreamer and making them
> > > pluggable modules as much as possible, like Vino Yang suggested.
> > >
> > > On Sat, Sep 21, 2019 at 7:12 PM vino yang <[email protected]> wrote:
> > >
> > > > +1 to introducing these connectors. It's nice to see that Hudi's
> > > > ecosystem is growing. As Hudi connects to more and more systems, it
> > > > is necessary to introduce separate modules to place these
> > > > connectors. This can lead to module relayout or code refactoring.
> > > > Of course, all this needs to be discussed in more depth.
> > > >
> > > > Best,
> > > > Vino
> > > >
> > > > On 09/21/2019 18:59, Vinay Patil wrote:
> > > >
> > > > > Hi Taher,
> > > > >
> > > > > Basically, this can be a proposal to support Kinesis; DynamoDB
> > > > > Streams support can then be enabled by reusing this source code.
> > > > > Flink has provided support for DynamoDB Streams by reusing its
> > > > > Kinesis Streams classes.
> > > > >
> > > > > Regards,
> > > > > Vinay Patil
> > > > >
> > > > > On Sat, Sep 21, 2019 at 4:26 PM Taher Koitawala
> > > > > <[email protected]> wrote:
> > > > >
> > > > > > That would be a great addition, Vinay. How about adding
> > > > > > Kinesis as well?
> > > > > >
> > > > > > Regards,
> > > > > > Taher Koitawala
> > > > > >
> > > > > > On Sat, Sep 21, 2019, 4:20 PM Vinay Patil
> > > > > > <[email protected]> wrote:
> > > > > >
> > > > > > > Hi Team,
> > > > > > >
> > > > > > > DynamoDB Streams contain the CDC data when enabled on a
> > > > > > > DynamoDB table. We can add a source for DeltaStreamer which
> > > > > > > will enable us to read this data and write it back either
> > > > > > > to a Hudi dataset or to another sink.
> > > > > > >
> > > > > > > Thoughts on adding this support in Hudi?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Vinay Patil
