Sounds good, let's capture these discussions in the JIRA (a link to the
discussion thread should suffice) and we can revisit them one by one.
On Mon, Sep 23, 2019 at 8:31 PM Taher Koitawala wrote:
Sure Vinoth, I think we need to try this out and check how it fits together
and how deployable it is.
On Sun, Sep 22, 2019, 7:01 PM Vinoth Chandar wrote:
> See a lot of Spark Streaming receiver based approach code there, which
> makes me a bit worried about scalability.
>
> Nonetheless. API
Hi Vinoth,
Nifi has the capability to pass data to a custom Spark job.
However, that is done through a StreamingContext, and I'm not sure if we can
build something on this. I'm trying to wrap my head around how to fit the
StreamingContext into our existing code.
Here is an example:
I think we will have to make a Nifi Processor. The Nifi processor should
host everything we do with Spark to write data. We will have to scope out the
work on this and on compactions.
Regards,
Taher Koitawala
On Wed, Sep 18, 2019, 8:30 PM Suneel Marthi wrote:
Adding Nifi dev@ to this thread.
On Wed, Sep 18, 2019 at 10:57 AM Vinoth Chandar wrote:
Not too familiar with Nifi myself. Would this still target a use case like what
Pratyaksh mentioned?
For delta streamer specifically, we are moving more and more towards continuous
mode, where Hudi writing and compaction are managed by a single long-running
Spark application.
Would Nifi
That's another way of doing things. I want to know if someone has written
something like PutParquet that can write data directly to Hudi. AFAIK no one
has.
That would be really powerful.
On Wed, Sep 18, 2019, 1:37 PM Pratyaksh Sharma wrote:
Hi Taher,
In the initial phase of our CDC pipeline, we were using Hudi with Nifi.
Nifi was being used to read the MySQL binlog file and push that data to
a Kafka topic. This topic was then consumed by DeltaStreamer, so
Nifi was only indirectly involved in that flow.
On Wed, Sep 18, 2019
Hi All,
Just wanted to know, has anyone tried writing data to Hudi with a
Nifi flow?
Perhaps just a CSV file on local disk to a Hudi dataset? If not, then let's
try that!
Regards,
Taher Koitawala
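
For reference, the CSV-to-Hudi experiment proposed above could be sketched
roughly as below, outside of Nifi, using the Hudi Spark datasource from
PySpark. This is only an illustration and not something posted in the thread:
the helper names, the table name and the column names are hypothetical, and it
assumes a SparkSession that already has the Hudi Spark bundle on its classpath.

```python
def hudi_write_options(table_name, record_key, partition_field, precombine_field):
    """Build the option map for a Hudi Spark datasource upsert.

    The keys are standard Hudi write configs; the values are whatever
    table and column names the caller supplies (hypothetical here).
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.operation": "upsert",
    }


def write_csv_to_hudi(spark, csv_path, base_path, options):
    """Read a CSV with an existing SparkSession and append it to a Hudi dataset.

    Assumes `spark` was created with the Hudi bundle available; this helper
    does not create or configure the session itself.
    """
    df = spark.read.option("header", "true").csv(csv_path)
    (
        df.write.format("org.apache.hudi")
        .options(**options)
        .mode("append")
        .save(base_path)
    )


# Example (hypothetical paths and columns; requires a running SparkSession):
# opts = hudi_write_options("trips", "uuid", "region", "ts")
# write_csv_to_hudi(spark, "file:///tmp/trips.csv", "file:///tmp/hudi/trips", opts)
```

A Nifi processor along the lines discussed in this thread would presumably
need to wrap a write like this one, with compaction handled separately.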