Thanks Jungtaek. I could not remove the watermark but setting 0 works for me.
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
Anyone know the answers or pointers? thanks.
--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/
-
To unsubscribe e-mail: user-unsubscr...@spark.apache.org
We have a scenario to group raw records by correlation id every 3 minutes and
append groupped result to some HDFS store, below is an example of our query
val df= records.readStream.format("SomeDataSource")
.selectExpr("current_timestamp() as CurrentTime", "*")
.withWatermark("Curr
Hi,
I started building a custom data source using data source v2 api to stream
data incrementally from azure data lake (HDFS) in Spark 2.4.0. I used kafka
as the reference to implement the
DataSourceV2/MicroBatchReader/InputPartitionReader/InputPartition/OffsetV2
classes and goteverything working