Hi songj,

DeltaStreamer can be understood as a packaged Spark DataSource. You only need
to set the required parameters, which makes data ingestion more convenient.

Best,

Trevor


[email protected]
 
From: songj songj
Date: 2020-12-01 16:48
To: dev
Subject: Re: why not use spark datasource in DeltaStreamer
Spark Structured Streaming can consume Kafka through the Kafka data source
and use foreachBatch to insert/upsert/... into Hudi.
Is that similar to DeltaStreamer?
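
For concreteness, the pipeline described above (Kafka data source plus
foreachBatch writing to Hudi) might look roughly like the PySpark sketch
below. This is not how DeltaStreamer is implemented. The function names, the
column mapping, and the use of the Kafka topic as the partition path are
assumptions made for illustration only. Running it also assumes the Spark
Kafka connector and the Hudi Spark bundle are on the classpath.

```python
def hudi_upsert_options(table_name, record_key, precombine_field, partition_field):
    """Build Hudi DataSource write options for an upsert (hypothetical helper)."""
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.operation": "upsert",
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
    }


def make_batch_writer(base_path, options):
    """Return a foreachBatch callback that writes each micro-batch to Hudi."""
    def write_batch(batch_df, batch_id):
        # Each micro-batch is an ordinary DataFrame, so the regular
        # DataFrameWriter path to Hudi applies.
        batch_df.write.format("hudi").options(**options).mode("append").save(base_path)
    return write_batch


def start_stream(spark, brokers, topic, base_path, checkpoint, options):
    """Read a Kafka topic with the Kafka data source and upsert via foreachBatch."""
    raw = (spark.readStream.format("kafka")
           .option("kafka.bootstrap.servers", brokers)
           .option("subscribe", topic)
           .load())
    # Kafka rows carry binary key/value columns. This assumed mapping casts
    # them to strings; a real job would parse the payload into typed columns.
    records = raw.selectExpr(
        "CAST(key AS STRING) AS id",
        "CAST(value AS STRING) AS payload",
        "timestamp AS ts",
        "topic",
    )
    return (records.writeStream
            .foreachBatch(make_batch_writer(base_path, options))
            .option("checkpointLocation", checkpoint)
            .start())
```

A call could look like start_stream(spark, "localhost:9092", "events",
"/tmp/hudi/events", "/tmp/checkpoints/events",
hudi_upsert_options("events", "id", "ts", "topic")), where every value is
a placeholder.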
 
songj songj <[email protected]> wrote on Tue, Dec 1, 2020 at 4:28 PM:
 
> Hi, I have some questions:
>
> 1. DeltaStreamer has its own Source<JavaRDD<String>> to consume source
> data such as Kafka. Why not use the Spark DataSource directly?
>
> 2. Hudi has a lot of logic that uses RDDs. Why not use Spark DataFrames?
>
> I just want to know the background of the above implementation, thanks!
>
