anishshri-db opened a new pull request, #40292:
URL: https://github.com/apache/spark/pull/40292

   ### What changes were proposed in this pull request?
   Write temporary checkpoints for streaming queries to the local filesystem, 
even if the default FS is set to something else.
   
   ### Why are the changes needed?
   We have seen cases where the default FS is a remote file system. Because the 
path for temporary streaming checkpoints is not specified explicitly, it 
resolves against that remote FS, and checkpoint directories can pile up there 
in two cases:
   
   - the query exits with an exception and the flag to force checkpoint removal 
is not set
   - the driver/cluster terminates without the query being stopped gracefully
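
To illustrate the resolution problem described above, here is a minimal, standard-library-only sketch. It is not the code from this PR and does not call Spark or Hadoop: `qualify` is a hypothetical toy stand-in for how a scheme-less path ends up on the default FS, `local_temp_checkpoint` is a hypothetical helper, and the `s3a://some-bucket` default FS and the example path are assumptions made up for the demonstration.

```python
import tempfile
from pathlib import Path
from urllib.parse import urlparse


def qualify(path: str, default_fs: str) -> str:
    """Toy model of path qualification (an assumption, not Hadoop's code):
    a path without a scheme is resolved against the default FS,
    while a path that already carries a scheme is left untouched."""
    if urlparse(path).scheme:
        return path
    return default_fs.rstrip("/") + path


def local_temp_checkpoint(prefix: str = "temporary-") -> str:
    """Create a local temp directory and return it as an explicit file: URI."""
    return Path(tempfile.mkdtemp(prefix=prefix)).as_uri()


default_fs = "s3a://some-bucket"             # hypothetical remote default FS
bare = "/local_disk0/tmp/temporary-example"  # hypothetical scheme-less path
pinned = local_temp_checkpoint()             # explicit file: URI

print(qualify(bare, default_fs))    # resolved onto the remote default FS
print(qualify(pinned, default_fs))  # unchanged, stays on the local FS
```

The point of the sketch is only that attaching an explicit `file:` scheme to the temporary directory keeps it on local disk regardless of the default FS setting, so a leftover directory does not accumulate on remote storage.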
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Verified that the checkpoint is resolved and written to the local FS
   
   ```
   23/03/04 01:42:49 INFO ResolveWriteToStream: Checkpoint root 
file:/local_disk0/tmp/temporary-c97ab8bd-6b03-4c28-93ea-751d30a2d3f9 resolved 
to file:/local_disk0/tmp/temporary-c97ab8bd-6b03-4c28-93ea-751d30a2d3f9.
   ...
   23/03/04 01:46:37 INFO MicroBatchExecution: [queryId = 66c4c] Deleting 
checkpoint file:/local_disk0/tmp/temporary-c97ab8bd-6b03-4c28-93ea-751d30a2d3f9.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

