Github user NicoK commented on a diff in the pull request:
https://github.com/apache/flink/pull/5176#discussion_r160353839
--- Diff: flink-runtime/src/main/java/org/apache/flink/runtime/blob/BlobUtils.java ---
@@ -127,21 +132,28 @@ private static BlobStoreService createFileSystemBlobStore(Configuration configur
}
/**
- * Creates a local storage directory for a blob service under the given parent directory.
+ * Creates a local storage directory for a blob service under the configuration parameter given
+ * by {@link BlobServerOptions#STORAGE_DIRECTORY}. If this is <tt>null</tt> or empty, we will
+ * fall back to Flink's temp directories (given by
+ * {@link org.apache.flink.configuration.CoreOptions#TMP_DIRS}) and choose one among them at
+ * random.
*
- * @param basePath
- * base path, i.e. parent directory, of the storage directory to use (if <tt>null</tt> or
- * empty, the path in <tt>java.io.tmpdir</tt> will be used)
+ * @param config
+ * Flink configuration
*
* @return a new local storage directory
*
* @throws IOException
* thrown if the local file storage cannot be created or is not usable
*/
- static File initLocalStorageDirectory(String basePath) throws IOException {
+ static File initLocalStorageDirectory(Configuration config) throws IOException {
+
+ String basePath = config.getString(BlobServerOptions.STORAGE_DIRECTORY);
+
File baseDir;
if (StringUtils.isNullOrWhitespaceOnly(basePath)) {
- baseDir = new File(System.getProperty("java.io.tmpdir"));
+ final String[] tmpDirPaths = TaskManagerServicesConfiguration.parseTempDirectories(config);
+ baseDir = new File(tmpDirPaths[rnd.nextInt(tmpDirPaths.length)]);
--- End diff --
Unfortunately, the BLOB caches cannot handle multiple directories well.
When looking up a cached file, they check whether it exists locally and
download it if it is missing. Searching all of the temp directories on every
lookup may be cumbersome as well (and should not be solved in this PR, imho).
I'd also expect most use cases to use a single directory but, tbh, I lack an
overview of the use cases here.
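
To illustrate the concern, here is a rough sketch of what a lookup across
several storage directories could look like. `MultiDirBlobLookup` and
`findCached` are hypothetical names for illustration only and are not part of
Flink's BLOB cache code; the point is just that every lookup would have to
probe each directory before deciding to download.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.Optional;

/** Hypothetical helper (not Flink code): probes several storage directories for a cached blob. */
public class MultiDirBlobLookup {

    /**
     * Returns the first existing regular file named blobName among storageDirs,
     * or an empty Optional if no directory contains it (the caller would then download it).
     */
    public static Optional<File> findCached(File[] storageDirs, String blobName) {
        for (File dir : storageDirs) {
            File candidate = new File(dir, blobName);
            if (candidate.isFile()) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Two temp directories stand in for multiple configured temp dirs.
        File dirA = Files.createTempDirectory("blob-a").toFile();
        File dirB = Files.createTempDirectory("blob-b").toFile();
        Files.write(new File(dirB, "blob_123").toPath(), new byte[] {1, 2, 3});

        File[] dirs = {dirA, dirB};
        // Present in dirB, so no download would be needed.
        System.out.println(findCached(dirs, "blob_123").isPresent());
        // Absent everywhere, so a download would be triggered.
        System.out.println(findCached(dirs, "blob_456").isPresent());
    }
}
```

With a single storage directory the loop collapses to one existence check,
which is the case the caches handle today.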
---