[ https://issues.apache.org/jira/browse/SPARK-28712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-28712.
----------------------------------
    Resolution: Invalid

> Spark Structured Streaming with Kafka does not really delete temp files in a Spark
> standalone cluster
> ---------------------------------------------------------------------------------------------
>
>                 Key: SPARK-28712
>                 URL: https://issues.apache.org/jira/browse/SPARK-28712
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.3
>         Environment: redhat 7
> jdk 1.8
> scala 2.11.12
> spark standalone cluster 2.4.3
> kafka 0.10.2.1
>            Reporter: 凭落
>            Priority: Major
>
> The folder on the driver
> {noformat}
> /tmp/temporary-xxxxxxxx{noformat}
> takes up all the space in /tmp after running a Spark Structured Streaming job for a
> long time. The usage is mainly under the offsets and commits folders. But when we check
> them with the commands
> {noformat}
> du -sh offsets
> du -sh commits{noformat}
> they show more than 600M, while the commands
> {noformat}
> ll -h offsets
> ll -h commits{noformat}
> show only 400K.
> I think this is because the files are deleted while they are still held open by the
> job, so the space is not released until the job is stopped.
> How can I solve it?
> We use
> {code}
> df.writeStream.trigger(ProcessingTime("1 seconds"))
> {code}
> not
> {code}
> df.writeStream.trigger(Continuous("1 seconds"))
> {code}
> Is there something wrong here?

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
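The "deleted but still held open" behaviour the reporter suspects can be reproduced in isolation, without Spark at all: on POSIX filesystems, unlinking a file removes its directory entry (so `ll` no longer counts it) while the inode and its blocks survive until the last open handle is closed. A minimal sketch of that mechanism in plain Python (the 1 KiB size is arbitrary, chosen only for illustration):

```python
import os
import tempfile

# Open a temp file and keep the handle, mimicking a long-running job
# that still has a metadata file open.
f = tempfile.NamedTemporaryFile(delete=False)
f.write(b"x" * 1024)
f.flush()
path = f.name

# Unlink it: the directory entry disappears, so `ls`/`ll` no longer
# see the file...
os.unlink(path)
assert not os.path.exists(path)

# ...but the open handle still works, and the blocks are not freed
# until the handle is closed (or the process exits).
f.seek(0)
data = f.read()
assert len(data) == 1024

f.close()  # only now is the disk space actually released
```

This matches the reporter's observation that the space is released only when the job stops; whether Spark's metadata-log cleanup is actually at fault here is a separate question (the issue was resolved as Invalid).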