Thanks.
IMO, we should focus more on SparkDatasource level, not the compatibility with the HoodieClient level.

Thanks,
Hmatu

------------------ Original ------------------
From: "Vinoth Chandar"<[email protected]>;
Date: Mon, Jan 27, 2020 02:45 AM
To: "dev"<[email protected]>;
Subject: Re: [DISCUSS] Remove HoodieWriteClient

The datasource and deltastreamer are all built on top of the HoodieWriteClient.. So, we cannot remove it.

Plus, the RDD level API is actually more efficient for ingesting data from say Kafka. We can go from avro to parquet or avro to avro directly (as opposed to avro -> row -> parquet, or avro -> row -> avro). This is one of the reasons for Hudi's design even..

RFC-13 will change a bunch of things here.. But we do need the RDD api IMO

On Sun, Jan 26, 2020 at 8:13 AM hmatu <[email protected]> wrote:

> Hi guys,
>
>
> As we know, hudi project contains HoodieWriteClient and HoodieSparkSource
> level framework. But may 99% user just use HoodieSparkSource except for
> uber. So I suggest remove HoodieWriteClient. WDYT?
>
>
> Thanks
> Hmatu
