Sounds good, let's capture these discussions in the JIRA (a link to the
discussion thread should suffice) and we can revisit them one by one.

On Mon, Sep 23, 2019 at 8:31 PM Taher Koitawala <taher...@gmail.com> wrote:

> Sure Vinoth, I think we need to try this out and check how it fits together
> and how deployable it is.
>
> On Sun, Sep 22, 2019, 7:01 PM Vinoth Chandar <vin...@apache.org> wrote:
>
> > I see a lot of Spark Streaming receiver-based code there, which
> > makes me a bit worried about scalability.
> >
> > Nonetheless, API-wise, can't we just do dstream.rdd.forEach and issue
> > these writes using the WriteClient API?
> >
> > On Sat, Sep 21, 2019 at 4:16 AM Taher Koitawala <taher...@gmail.com>
> > wrote:
> >
> > > Hi Vinoth,
> > > Nifi has the capability to pass data to a custom Spark job.
> > > However, that is done through a StreamingContext, and I'm not sure if
> > > we can build something on this. I'm trying to wrap my head around how
> > > to fit the StreamingContext into our existing code.
> > >
> > > Here is an example:
> > > https://blogs.apache.org/nifi/entry/stream_processing_nifi_and_spark
> > >
> > > Regards,
> > > Taher Koitawala
> > >
> > > On Wed, Sep 18, 2019, 8:27 PM Vinoth Chandar <vin...@apache.org>
> > > wrote:
> > >
> > > > Not too familiar with Nifi myself. Would this still target a use case
> > > > like what Pratyaksh mentioned?
> > > > For DeltaStreamer specifically, we are moving more and more towards
> > > > continuous mode, where Hudi writing and compaction are managed by a
> > > > single long-running Spark application.
> > > >
> > > > Would Nifi also help us manage compactions when working with the Hudi
> > > > datasource or just writing plain Spark Hudi pipelines?
> > > >
> > > > On 2019/09/18 08:18:44, Taher Koitawala <taher...@gmail.com> wrote:
> > > > > That's another way of doing things. I want to know if someone has
> > > > > written something like PutParquet that can write data directly to
> > > > > Hudi. AFAIK, no one has.
> > > > >
> > > > > That would be really powerful.
> > > > >
> > > > > On Wed, Sep 18, 2019, 1:37 PM Pratyaksh Sharma <pratyaks...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Taher,
> > > > > >
> > > > > > In the initial phase of our CDC pipeline, we were using Hudi with
> > > > > > Nifi. Nifi was used to read the MySQL binlog file and push that
> > > > > > data to a Kafka topic. This topic was then consumed by
> > > > > > DeltaStreamer, so Nifi was only indirectly involved in that flow.
> > > > > >
> > > > > > On Wed, Sep 18, 2019 at 10:29 AM Taher Koitawala <taher...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > > Just wanted to know whether anyone has tried to write data to
> > > > > > > Hudi with a Nifi flow?
> > > > > > >
> > > > > > > Perhaps just a CSV file on local disk to a Hudi dataset? If not,
> > > > > > > then let's try that!
> > > > > > >
> > > > > > > Regards,
> > > > > > > Taher Koitawala
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
