The following Structured Streaming code seems to keep consuming the usercache directory until all disk space is occupied.
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.streaming.Trigger

val monitoring_stream = monitoring_df.writeStream
  .trigger(Trigger.ProcessingTime("120 seconds"))
  .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    // only show non-empty micro-batches
    if (!batchDF.isEmpty) batchDF.show()
  }
  .start()
I did not even call batchDF.persist(). Do I really need to save/write
batchDF somewhere in order to release the usercache?
I also tried calling spark.catalog.clearCache() explicitly in a loop, but
that did not help either.
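For reference, the clearCache() attempt was a simple driver-side loop, roughly like the sketch below; the loop condition and sleep interval here are only illustrative, not the exact code that was run:

// sketch of the clearCache() attempt; interval and stop condition are assumptions
while (monitoring_stream.isActive) {
  spark.catalog.clearCache()   // clears Spark's in-memory table/DataFrame cache
  Thread.sleep(120 * 1000L)    // roughly one trigger interval
}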
The figure below also shows that the cluster's remaining capacity keeps
decreasing while this code is running.