Re: Write Streaming data using Datasource Writer is not working

2019-10-30 Thread nishith agarwal
I looked at the DataStreamWriter in Spark (https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/sql/streaming/DataStreamWriter.html) and the implementation seems to be different from DataSource. I haven't looked into what other classes need to be extended to support hudi form

Re: Write Streaming data using Datasource Writer is not working

2019-10-28 Thread Qian Wang
Hi Nishith,
Thanks for the reply. I did use the Datasource Writer to write, rather than the DataStreamWriter. I think the Datasource Writer can also support writing streaming data, correct?
Best,
Qian
On Oct 28, 2019, 9:31 PM -0700, nishith agarwal wrote:
> Qian,
>
> It seems like you are using the
> h

Re: Write Streaming data using Datasource Writer is not working

2019-10-28 Thread nishith agarwal
Qian,
It seems like you are using the DataStreamWriter (https://spark.apache.org/docs/latest/api/java/index.html?org/apache/spark/sql/streaming/DataStreamWriter.html) and not the Spark DataSource. To use the Spark datasource, look at the example here: https://hudi.apache.org/writing_data.html#datasource-writer. DataS
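For readers following the distinction above, here is a minimal sketch of how the two write paths differ in shape. It is not taken from the thread. The table name, field names, paths, and the `HudiWriteOptions` helper are hypothetical. The Spark calls appear only as comments so the snippet runs without a cluster, and whether a given Hudi release accepts the streaming path is not established here.

```scala
// Hypothetical helper: collects the Hudi options that both write paths would share.
object HudiWriteOptions {
  def hudiOptions(tableName: String,
                  recordKey: String,
                  partitionPath: String,
                  precombine: String): Map[String, String] =
    Map(
      "hoodie.table.name" -> tableName,
      "hoodie.datasource.write.recordkey.field" -> recordKey,
      "hoodie.datasource.write.partitionpath.field" -> partitionPath,
      "hoodie.datasource.write.precombine.field" -> precombine
    )

  def main(args: Array[String]): Unit = {
    // All four values below are placeholders, not values from the thread.
    val opts = hudiOptions("example_table", "id", "event_date", "ts")
    opts.toSeq.sortBy(_._1).foreach { case (k, v) => println(s"$k=$v") }

    // Batch path (DataFrameWriter), for a non-streaming DataFrame `df`:
    //   df.write.format("org.apache.hudi").options(opts)
    //     .mode(SaveMode.Append).save("/tmp/example_table")
    //
    // Streaming path (DataStreamWriter), for a streaming DataFrame `data`:
    //   data.writeStream.format("org.apache.hudi").options(opts)
    //     .option("checkpointLocation", "/tmp/example_checkpoint")
    //     .start("/tmp/example_table")
  }
}
```

The point of the sketch is only that `write` returns a DataFrameWriter and `writeStream` returns a DataStreamWriter, so a format has to implement the matching sink for each path to work.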

Write Streaming data using Datasource Writer is not working

2019-10-28 Thread Qian Wang
Hi All,
I tried to use the Datasource Writer to read streaming data from a Kafka topic and write it to a Hudi dataset on HDFS. I used the following code:
  val output = data
    .writeStream
    .trigger(Trigger.ProcessingTime("300 seconds"))
    .format("org.apache.hudi")
    .option("hoodie.table.name", "hudi_ro