It seems the following structured streaming code keeps consuming the usercache until all disk space is occupied.
val monitoring_stream = monitoring_df.writeStream
  .trigger(Trigger.ProcessingTime("120 seconds"))
  .foreachBatch { (batchDF: DataFrame, batchId: Long) =>
    if (!batchDF.isEmpty) batchDF.show()
  }

I did not even call batchDF.persist(). Do I really need to save/write batchDF somewhere to release the usercache? I also tried calling spark.catalog.clearCache() explicitly in a loop, but that does not solve the problem either. The figure below also shows the cluster's capacity decreasing while this code runs.
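For reference, this is roughly what I mean by calling spark.catalog.clearCache() in a loop; the loop condition and sleep interval here are placeholders, not my exact code:

  // Rough sketch of the explicit cache-clearing attempt mentioned above;
  // the infinite loop and 120-second sleep are illustrative placeholders.
  while (true) {
    spark.catalog.clearCache()   // drop any cached tables/DataFrames
    Thread.sleep(120 * 1000)     // wait roughly one trigger interval before clearing again
  }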