[ https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15708761#comment-15708761 ]
ASF GitHub Bot commented on FLINK-5178:
---------------------------------------

GitHub user NicoK opened a pull request:

    https://github.com/apache/flink/pull/2911

    [FLINK-5178] allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system

    Previously, this was restricted to a local file system path but now we
    can allow it to be distributed, too, which is therefore not restricted
    to HA mode anymore.

    Example for hdfs:
    `blob.storage.directory=hdfs:///flink/data/`

    Unfortunately, we cannot detect the case when a locally-mounted
    distributed file system is used. In this case, we require the user to
    give us a hint, e.g.:
    `blob.storage.directory=dfs:///flink/data/`
    for a file system mounted to `/flink/data/`.

    If this hint is missing, each job manager and task manager will create
    its own unique storage directory under this path and files will be
    requested from the blob server at the job manager as usual, i.e. the
    task manager requests a blob from the blob server which reads the file
    from the file system and sends it back where it is stored in the task
    manager's individual storage path of the same file system.

    BEWARE: If HA mode is configured and a local file system without the
    `dfs://` hint is given as HA_STORAGE_PATH, an
    IllegalConfigurationException will be thrown!
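The path handling described above can be sketched roughly as follows. This is a minimal illustration in Java, not the code in this pull request: the class `BlobStoragePathSketch`, its method names, the rule that any scheme other than `file` counts as shared, and the `blobStore-<id>` directory name are all assumptions made for the example.

```java
import java.net.URI;
import java.util.UUID;

/** Hypothetical sketch; names and rules are assumptions, not Flink's API. */
final class BlobStoragePathSketch {

    /** How a process would treat the configured blob storage directory. */
    enum Kind { SHARED, LOCAL }

    /**
     * Assumption: a path without a scheme, or with the "file" scheme, is
     * local; anything else (e.g. "hdfs", or the "dfs" hint) is shared.
     */
    static Kind classify(String configuredPath) {
        String scheme = URI.create(configuredPath).getScheme();
        if (scheme == null || scheme.equalsIgnoreCase("file")) {
            return Kind.LOCAL;
        }
        return Kind.SHARED;
    }

    /** For a "dfs:///flink/data/" hint, the local mount point is the URI path. */
    static String mountedPath(String configuredPath) {
        return URI.create(configuredPath).getPath();
    }

    /**
     * Shared storage is used as configured; local storage gets a unique
     * sub-directory per process so that instances do not interfere.
     */
    static String storageDirectory(String configuredPath, String uniqueId) {
        if (classify(configuredPath) == Kind.SHARED) {
            return configuredPath;
        }
        String base = configuredPath.endsWith("/") ? configuredPath : configuredPath + "/";
        return base + "blobStore-" + uniqueId;
    }

    /** Without shared storage, a task manager asks the blob server for blobs. */
    static boolean fetchesViaBlobServer(String configuredPath) {
        return classify(configuredPath) == Kind.LOCAL;
    }

    public static void main(String[] args) {
        String[] examples = {"hdfs:///flink/data/", "dfs:///flink/data/", "/flink/data/"};
        for (String path : examples) {
            String id = UUID.randomUUID().toString();
            System.out.println(path + " -> " + classify(path)
                    + ", directory: " + storageDirectory(path, id)
                    + ", via blob server: " + fetchesViaBlobServer(path));
        }
    }
}
```

In this sketch a plain `/flink/data/` is always treated as local, which mirrors the statement that a locally-mounted distributed file system cannot be detected without the `dfs://` hint.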

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/NicoK/flink FLINK-5178

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/2911.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2911

----
commit b65e74dd92bdf74b2816a0d8a26a5ebaa25ca586
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-22T11:49:03Z

    [hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath

    This was actually the same implementation as
    FileSystemBlobStore#get(java.lang.String, java.io.File) and either of
    the two could have been removed but the implementation makes most sense
    at the concrete file system abstraction layer, i.e. in
    FileSystemBlobStore.

commit 09bdd49e6282268fd9c1b2672f0ea6222e097ca2
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-23T15:11:35Z

    [hotfix] do not create intermediate strings inside String.format in BlobUtils

commit 93938ff97fef9e39c17ac795e1e89ca9de25e028
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-24T16:11:19Z

    [hotfix] properly shut down the BlobServer in BlobServerRangeTest

commit c0c9d2239a767154d6071171d4c33e762e01aa62
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-24T17:50:43Z

    [FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs

    Also use JUnit's TemporaryFolder in BlobRecoveryITCase, too. This makes
    cleaning up simpler.

commit 8b9c7d9fd6e1ab3c7f2175a31d0e29b41b01cc61
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-23T18:50:52Z

    [FLINK-5129] make the BlobCache use the HA filesystem back-end properly

    Previously, the BlobServer holds a local copy and in case high
    availability (HA) is set, it also copies jar files to a distributed
    file system. Upon restore, these files are copied to local store from
    which they are used.
    This commit abstracts the BlobServer's backing file system and makes it
    use the distributed file system directly in HA mode, i.e. without the
    local file system copy. Other than that the behaviour does not change.

commit 249b2ea48f19c54498faa56ad45d299efaad4521
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-25T16:42:05Z

    [FLINK-5129] make the BlobCache also use a distributed file system in HA mode

    * re-factor the file system abstraction in FileSystemBlobStore so that
      it can be used by the task managers, too, which should not be able to
      delete files in a distributed file system shared among different
      nodes
    * only download blobs from the blob server if not in HA mode or the
      distributed file system is not accessible by the BlobCache, e.g. at
      the task managers

commit dd69f65a47205eb55ac8cc2c8f3aa9f7232dc8ba
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-28T10:42:13Z

    [FLINK-5129] restore non-HA mode unique directory setup in the blob server and cache

    If not in high availability mode, local (and now also distributed) file
    systems again try to set up a unique directory structure so that other
    instances with the same configuration file or storage path do not
    interfere. This was lost in 8b9c7d9fd6.

commit 76ccc9ffaaa63d6e0bd55ba7f6c08f8c1cff98cb
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-28T15:19:20Z

    [hotfix] add a missing "'" to FileSystemBlobStore

commit 53702add38d1087062e84a7e804b08920dfc0c23
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-28T15:41:11Z

    [FLINK-5129] move path-related methods from BlobUtils to FileSystemBlobStore and cleanup unused methods

commit d45e4615f422ff3cf1b66e6388a0929e366df128
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-29T15:50:57Z

    [FLINK-5129] BlobService#delete does not throw an IOException anymore

    Instead, the return value indicates whether a delete operation was
    successful.
    This is a result of the FileSystem abstraction layer in
    FileSystemBlobStore and follows the idiom that a failing delete
    operation is not that grave and the program can still continue.

commit fe4c1c331d803675f2241f8179e452ea8153cb38
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-29T17:43:41Z

    [FLINK-5129] fix wrongly set isGlobal flag in BlobCache

    This was set in 249b2ea48f.

commit 022817cc47c437f2476f5dad322b3936d1754ac2
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-30T13:30:29Z

    [FLINK-5129] add a unit test for the fix of fe4c1c331d

commit 660eba522b88ee036964d23a5dfbab5182c7cd40
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-30T13:52:12Z

    [FLINK-5129] add some more documentation

commit 4a2445bc502d9f20b145c9c094214591fb57165a
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-28T15:24:28Z

    [FLINK-5178] allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system

    Previously, this was restricted to a local file system path but now we
    can allow it to be distributed, too, which is therefore not restricted
    to HA mode anymore.

    Example for hdfs:
    `blob.storage.directory=hdfs:///flink/data/`

    Unfortunately, we cannot detect the case when a locally-mounted
    distributed file system is used. In this case, we require the user to
    give us a hint, e.g.:
    `blob.storage.directory=dfs:///flink/data/`
    for a file system mounted to `/flink/data/`.

    If this hint is missing, each job manager and task manager will create
    its own unique storage directory under this path and files will be
    requested from the blob server at the job manager as usual, i.e. the
    task manager requests a blob from the blob server which reads the file
    from the file system and sends it back where it is stored in the task
    manager's individual storage path of the same file system.

    BEWARE: If HA mode is configured and a local file system without the
    `dfs://` hint is given as HA_STORAGE_PATH, an
    IllegalConfigurationException will be thrown!

commit 6e0ee197c57c46c393b77a65007e8a7c9f5d2ec4
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-29T17:11:06Z

    [hotfix] re-use some code in BlobServerDeleteTest

commit 7cc8c9086428074decb1ccf37fa85925f2705add
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-29T14:20:07Z

    [FLINK-5178] test cases for using a shared blob storage directory

commit 454a659e38a241ef5f33016925879c1545413689
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-30T13:35:38Z

    [hotfix] improve some failure messages in the BlobService's HA unit tests

commit e714bc91bbb407beb6f5f6dd0d560dfb17629a82
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-30T13:55:34Z

    [FLINK-5178] adapt unit tests to the latest changes in the FLINK-5129 branch

commit 02bbdf2268a10a505bc6461b5e12cd5b25ba25f6
Author: Nico Kruber <n...@data-artisans.com>
Date:   2016-11-30T14:39:54Z

    [FLINK-5178] add more documentation for the updated config options

----

> allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system
> ----------------------------------------------------------------------
>
>                 Key: FLINK-5178
>                 URL: https://issues.apache.org/jira/browse/FLINK-5178
>             Project: Flink
>          Issue Type: Improvement
>          Components: Network
>            Reporter: Nico Kruber
>            Assignee: Nico Kruber
>
> After FLINK-5129, high availability (HA) mode adds the ability for the
> BlobCache instances at the task managers to download blobs directly from the
> distributed file system. It would be nice if this also worked in non-HA mode
> and BLOB_STORAGE_DIRECTORY_KEY may point to a distributed file system.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)