[jira] [Comment Edited] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown
[ https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16733815#comment-16733815 ]

Fengyu Cao edited comment on SPARK-26389 at 1/4/19 4:03 AM:

{quote}Temp checkpoint can be used in one-node scenario and deleted only if the query didn't fail. {quote}

Yes, but no log or error message says that we *must* set a non-temp checkpoint when we run a framework non-locally.

If we do this (run non-locally with a temp checkpoint), the checkpoint dir on the executor consumes a lot of space and is not deleted if the query fails, and this checkpoint can't be used for recovery, as I mentioned above.

I just think that Spark should either prohibit users from using temp checkpoints when their frameworks are non-local, or be responsible for cleaning up this useless checkpoint directory even if the query fails.

was (Author: camper42):

{quote}Temp checkpoint can be used in one-node scenario and deleted only if the query didn't fail. {quote}

Yes, and there're no logs or error msgs says that we *must* set a non-temp checkpoint if we run a framework non-local

And if we do this(run non-local with temp checkpoint), the checkpoint dir on executor consume lots of space and not be deleted if the query if fail, and this checkpoint can't be used to recover as I mentioned above.

I just think that spark either should prohibits users from using temp checkpoints when their frameworks are non-local, or should be responsible for cleaning up this useless checkpoint directory even if the query fails.
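To make the second option in the comment above concrete (clean up the temp checkpoint directory even if the query fails), here is a minimal sketch of the pattern. It is not Spark code and does not reflect how Spark manages checkpoints; it is a Python standard-library analogy, and `temp_checkpoint_dir` and `run_query` are hypothetical names introduced only for illustration.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temp_checkpoint_dir(prefix="temporary-"):
    """Yield a scratch checkpoint directory and always delete it afterwards."""
    path = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield path
    finally:
        # Runs on success and on failure, so a failed query leaves nothing behind.
        shutil.rmtree(path, ignore_errors=True)


def run_query(fail):
    """Hypothetical stand-in for a streaming query that writes checkpoint data."""
    with temp_checkpoint_dir() as ckpt:
        (ckpt / "offsets").write_text("0")
        run_query.last_dir = ckpt  # remembered only so the caller can inspect it
        if fail:
            raise RuntimeError("Writing job aborted.")
```

With this shape, the directory is removed whether `run_query` returns normally or raises, which is the behaviour the comment asks for when a temp checkpoint cannot be reused for recovery anyway.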
> temp checkpoint folder at executor should be deleted on graceful shutdown
> -------------------------------------------------------------------------
>
>                 Key: SPARK-26389
>                 URL: https://issues.apache.org/jira/browse/SPARK-26389
>             Project: Spark
>          Issue Type: Bug
>          Components: Structured Streaming
>    Affects Versions: 2.4.0
>            Reporter: Fengyu Cao
>            Priority: Major
>
> {{spark-submit --master mesos:// --conf spark.streaming.stopGracefullyOnShutdown=true framework>}}
> CTRL-C, framework shutdown
> {{18/12/18 10:27:36 ERROR MicroBatchExecution: Query [id = f512e17a-df88-4414-a5cd-a23550cf1e7f, runId = 24d99723-8d61-48c0-beab-af432f7a19d3] terminated with error org.apache.spark.SparkException: Writing job aborted.}}
> {{/tmp/temporary- on executor not deleted due to org.apache.spark.SparkException: Writing job aborted., and this temp checkpoint can't used to recovery.}}
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown
[ https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725528#comment-16725528 ]

Fengyu Cao edited comment on SPARK-26389 at 12/20/18 2:45 AM:
--

Thanks for the reply.

Two scenarios:

# {{temp checkpoint dir /tmp/temporary- on worker node}}
# framework restarts
# {{temp checkpoint dir is now /tmp/temporary- (/tmp/temporary- can't be used for recovery and should be deleted)}}

# {{temp checkpoint dir /tmp/temporary- on worker node}}
# executor stops for some reason
# executor starts on another worker node (/tmp/temporary- can't be used for recovery either)

Maybe the temp checkpoint dir should be deleted on JVM stop?

{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not Structured Streaming one. {quote}

Sorry, I didn't notice this.

was (Author: camper42):

thanks for reply

Two scenarios:

# {{temp checkpoint dir /tmp/temporary-}}
# framework restart
# {{temp checkpoint dir now /tmp/temporary- (/tmp/temporary- can't used to recovery and should be deleted)}}

# {{temp checkpoint dir /tmp/temporary-}}
# executor stop in some reason
# executor start on another worker nodes (/tmp/temporary- can't used to recovery either)

Maybe temp checkpoint dir should be deleted on JVM stop?

{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not Structured Streaming one. {quote}

sorry, I didn't notice this.
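The suggestion above, deleting the temp checkpoint dir when the JVM stops, amounts to registering a shutdown hook when the directory is created. A minimal sketch of that idea follows, using Python's `atexit` as a stand-in for a JVM shutdown hook. This is an analogy under stated assumptions, not Spark's implementation, and `create_temp_checkpoint` and `cleanup_temp_checkpoint` are hypothetical names.

```python
import atexit
import shutil
import tempfile
from pathlib import Path


def cleanup_temp_checkpoint(path):
    """Delete the directory tree; do nothing if it is already gone."""
    shutil.rmtree(path, ignore_errors=True)


def create_temp_checkpoint(prefix="temporary-"):
    """Create a scratch checkpoint dir that is removed when the process exits."""
    path = Path(tempfile.mkdtemp(prefix=prefix))
    # Registered callbacks run at normal interpreter exit, including after an
    # unhandled exception or CTRL-C, but not after SIGKILL or os._exit().
    atexit.register(cleanup_temp_checkpoint, path)
    return path
```

A hook of this kind would cover the first scenario (framework restart on the same node). It only helps in the second scenario if the executor process exits in a way that lets the hook run.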
[jira] [Comment Edited] (SPARK-26389) temp checkpoint folder at executor should be deleted on graceful shutdown
[ https://issues.apache.org/jira/browse/SPARK-26389?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16725528#comment-16725528 ]

Fengyu Cao edited comment on SPARK-26389 at 12/20/18 2:44 AM:
--

Thanks for the reply.

Two scenarios:

# {{temp checkpoint dir /tmp/temporary-}}
# framework restarts
# {{temp checkpoint dir is now /tmp/temporary- (/tmp/temporary- can't be used for recovery and should be deleted)}}

# {{temp checkpoint dir /tmp/temporary-}}
# executor stops for some reason
# executor starts on another worker node (/tmp/temporary- can't be used for recovery either)

Maybe the temp checkpoint dir should be deleted on JVM stop?

{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not Structured Streaming one. {quote}

Sorry, I didn't notice this.

was (Author: camper42):

thanks for reply

Two scenarios:

# {{temp checkpoint dir /tmp/temporary-}}
# framework restart
# {{temp checkpoint dir now /tmp/temporary- (/tmp/temporary- can't used to recovery and should be deleted)}}

# {{temp checkpoint dir /tmp/temporary-}}
# executor stop in some reason
# executor start on another worker nodes (/tmp/temporary- can't used to recovery either)

May be temp checkpoint dir should be deleted on JVM stop?

{quote}spark.streaming.stopGracefullyOnShutdown is a DStreams parameter and not Structured Streaming one. {quote}

sorry, I didn't notice this.