Thanks for the reply!
Could you also help explain the two questions from my first message?
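
For context, the Structured Streaming pipeline from my earlier message (Kafka data source plus foreachBatch writing to Hudi) might look roughly like the sketch below. It is only a sketch under assumptions: the table name, column names (id, payload, ts, dt), and both function names are made up for illustration, every Kafka message is assumed to have a non-null key, and the Kafka and Hudi Spark bundles are assumed to be on the classpath.

```python
def hudi_write_options(table_name, record_key, precombine_field,
                       partition_field, operation="upsert"):
    """Build Hudi Spark DataSource write options (hypothetical helper)."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": operation,
    }


def start_kafka_to_hudi(spark, brokers, topic, base_path, checkpoint_path):
    """Read Kafka via the Kafka data source and write each micro-batch to Hudi.

    `spark` is an existing SparkSession; all other arguments are placeholders.
    """
    # Assumed layout: key -> record key, Kafka timestamp -> precombine field,
    # day of the timestamp -> partition path.
    options = hudi_write_options("events", "id", "ts", "dt")

    events = (
        spark.readStream.format("kafka")
        .option("kafka.bootstrap.servers", brokers)
        .option("subscribe", topic)
        .load()
        .selectExpr(
            "CAST(key AS STRING) AS id",
            "CAST(value AS STRING) AS payload",
            "timestamp AS ts",
            "date_format(timestamp, 'yyyy-MM-dd') AS dt",
        )
    )

    def write_batch(batch_df, batch_id):
        # Each micro-batch is a plain DataFrame, so the regular Hudi
        # DataSource writer can be used here.
        (batch_df.write.format("org.apache.hudi")
         .options(**options)
         .mode("append")
         .save(base_path))

    return (
        events.writeStream
        .foreachBatch(write_batch)
        .option("checkpointLocation", checkpoint_path)
        .start()
    )
```

Calling start_kafka_to_hudi with an active SparkSession, a broker list, a topic, and two storage paths would start the query. Passing operation="insert" to hudi_write_options would switch the write from upsert to insert.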

On Tue, Dec 1, 2020 at 5:17 PM, Trevor <[email protected]> wrote:

> Hi songj,
>
> DeltaStreamer can be understood as a packaged Spark DataSource. You only
> need to set the required parameters, which makes it more convenient for
> data ingestion.
>
> Best,
>
> Trevor
>
>
> [email protected]
>
> From: songj songj
> Date: 2020-12-01 16:48
> To: dev
> Subject: Re: why not use spark datasource in DeltaStreamer
> If Spark Structured Streaming consumes Kafka through the Kafka data source
> and uses foreachBatch to do insert/upsert/... into Hudi,
> is that similar to DeltaStreamer?
>
> On Tue, Dec 1, 2020 at 4:28 PM, songj songj <[email protected]> wrote:
>
> > Hi, I have some questions:
> >
> > 1. DeltaStreamer has its own Source<JavaRDD<String>> to consume source
> > data, such as Kafka. Why not use the Spark data source directly?
> >
> > 2. Hudi has a lot of logic that uses RDDs. Why not use Spark DataFrames?
> >
> > I just want to know the background of the above implementation, thanks!
> >
>
