Adam Binford created SPARK-44639:
------------------------------------
Summary: Add option to use Java tmp dir for RocksDB state store
Key: SPARK-44639
URL: https://issues.apache.org/jira/browse/SPARK-44639
Project: Spark
Issue Type: Improvement
Components: SQL, Structured Streaming
Affects Versions: 3.4.1
Reporter: Adam Binford
Currently local RocksDB state is stored in a local directory given by
Utils.getLocalDir. On yarn this is a directory created inside the root
application folder such as
{{/tmp/nm-local-dir/usercache/<user>/appcache/<app_id>/}}
The problem with this is that if an executor crashes for some reason (like OOM)
and the shutdown hooks don't get run, this directory will stay around forever
until the application finishes, which can cause jobs to slowly accumulate more
and more temporary space until finally the node manager goes unhealthy.
Because this data will only ever be accessed by the executor that created this
directory, it would make sense to store the data inside the container folder,
which will always get cleaned up by the node manager when that yarn container
gets cleaned up. Yarn sets the `java.io.tmpdir` to a path inside this
directory, such as
{{/tmp/nm-local-dir/usercache/<user>/appcache/<app_id>/<container_id>/tmp/}}
I'm not sure the behavior for other resource managers, so this could be an
opt-in config that can be specified.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]