If I use Spark Structured Streaming to consume Kafka through the Kafka data source, and use foreachBatch to do insert/upsert/... into Hudi, is that similar to DeltaStreamer?
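
To make the question concrete, here is a rough PySpark sketch of the setup I mean. It is only an illustration under assumptions: the function names (`hudi_write_options`, `make_batch_writer`, `start_stream`) are hypothetical, the Kafka values are assumed to be JSON with fields `id`, `ts` and `region`, and the Kafka source package and the Hudi Spark bundle are assumed to be on the classpath.

```python
def hudi_write_options(table_name, record_key, precombine_field,
                       partition_field, operation="upsert"):
    """Build the Hudi datasource write options for one table.

    The field names passed in are assumptions about the payload schema.
    """
    return {
        "hoodie.table.name": table_name,
        "hoodie.datasource.write.recordkey.field": record_key,
        "hoodie.datasource.write.precombine.field": precombine_field,
        "hoodie.datasource.write.partitionpath.field": partition_field,
        "hoodie.datasource.write.operation": operation,  # e.g. insert / upsert
    }


def make_batch_writer(base_path, options):
    """Return a foreachBatch callback that writes each micro-batch to Hudi."""
    def write_batch(batch_df, batch_id):
        # batch_df is an ordinary (non-streaming) DataFrame, so the normal
        # batch writer API can be used here.
        (batch_df.write
            .format("hudi")
            .options(**options)
            .mode("append")
            .save(base_path))
    return write_batch


def start_stream(spark, bootstrap_servers, topic, base_path,
                 checkpoint_dir, options):
    """Read from Kafka with the Spark Kafka data source and write to Hudi."""
    # Imported here so the helpers above can be used without PySpark installed.
    from pyspark.sql.functions import col, from_json

    # Assumed payload schema, given as a DDL string.
    schema = "id STRING, ts LONG, region STRING"

    raw = (spark.readStream
           .format("kafka")
           .option("kafka.bootstrap.servers", bootstrap_servers)
           .option("subscribe", topic)
           .option("startingOffsets", "earliest")
           .load())

    # Kafka delivers the value as bytes; cast to string and parse the JSON.
    events = (raw
              .select(from_json(col("value").cast("string"), schema).alias("e"))
              .select("e.*"))

    return (events.writeStream
            .foreachBatch(make_batch_writer(base_path, options))
            .option("checkpointLocation", checkpoint_dir)
            .start())
```

Usage would be something like `start_stream(spark, "localhost:9092", "events", "/tmp/hudi/events", "/tmp/ckpt/events", hudi_write_options("events", "id", "ts", "region"))`, where the broker address, topic and paths are placeholders.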
songj songj <[email protected]> wrote on Tue, Dec 1, 2020 at 4:28 PM:

> Hi, I have some questions:
>
> 1. DeltaStreamer has its own Source<JavaRDD<String>> to consume source
> data, such as Kafka. Why not use the Spark datasource directly?
>
> 2. Hudi has a lot of logic that uses RDDs. Why not use Spark DataFrames?
>
> I just want to know the background of the above implementation, thanks!
