Hi,
Lately, I encountered a problem, when I was writing as structured streaming
job to write things into opentsdb.
The write-stream part looks something like
outputDs
.coalesce(14)
.writeStream
.outputMode("append")
.trigger(Trigger.ProcessingTime(s"$triggerSeconds seconds"))
.option("checkpointLocation",s"$checkpointDir/$appName/tsdb")
.foreach {
TsdbWriter(
tsdbUrl,
MongoProp(mongoUrl, mongoPort, mongoUser, mongoPassword,
mongoDatabase, mongoCollection,mongoAuthenticationDatabase)
)(createMetricBuilder(tsdbMetricPrefix))
}
.start()
And when I check the checkpoint dir, I discover that the
"/checkpoint/state" dir is empty. I looked into the executor's log and found
that the HDFSBackedStateStoreProvider didn't write anything on the checkpoint
dir.
Strange thing is, when I replace the "coalesce" function into "repartition"
function, the problem solved. Is there a difference between these two functions
when using structured streaming?
Looking forward to you help, thanks.