GitHub user NicoK opened a pull request:
https://github.com/apache/flink/pull/3076
[FLINK-5129] make the BlobServer use a distributed file system
Make the BlobCache use the BlobServer's distributed file system in HA mode.
Previously, even in HA mode and even if the cache had access to the file
system, it would download BLOBs from one central BlobServer. By using the
underlying distributed file system directly, we can leverage its scalability
and remove a single point of (performance) failure. If the distributed file
system is not accessible from the blob caches, the old behaviour is used.
@uce can you have a look?
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/NicoK/flink FLINK-5129a
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/3076.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #3076
commit 464f2c834688507c67acb3ad584827132ebe444e
Author: Nico Kruber
Date: 2016-11-22T11:49:03Z
[hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath
This was actually the same implementation as
FileSystemBlobStore#get(java.lang.String, java.io.File), so either of the two
could have been removed, but the implementation makes most sense at the
concrete file system abstraction layer, i.e. in FileSystemBlobStore.
commit 2ebffd4c2d499b61f164b4d54dc86c9d44b9c0ea
Author: Nico Kruber
Date: 2016-11-23T15:11:35Z
[hotfix] do not create intermediate strings inside String.format in BlobUtils
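For reviewers, a minimal sketch of the kind of change this hotfix title describes. The class, method, and parameter names are hypothetical and not taken from BlobUtils; the point is only that the pieces are passed as separate format arguments instead of being concatenated first.

```java
import java.io.File;

public class FormatSketch {

    // Before (hypothetical): concatenation builds intermediate strings that
    // are then handed to String.format.
    public static String before(File dir, String prefix, String key) {
        return String.format("%s" + File.separator + "%s", dir.getPath(), prefix + key);
    }

    // After (hypothetical): every piece is a format argument, so no
    // intermediate string is created by the caller.
    public static String after(File dir, String prefix, String key) {
        return String.format("%s%s%s%s", dir.getPath(), File.separator, prefix, key);
    }
}
```

Both variants produce the same path string; only the number of temporary strings differs.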
commit 36ab6121e336f63138e442ea48a751ede7fb04c3
Author: Nico Kruber
Date: 2016-11-24T16:11:19Z
[hotfix] properly shut down the BlobServer in BlobServerRangeTest
commit c8c12c67ae875ca5c96db78375bef880cf2a3c59
Author: Nico Kruber
Date: 2017-01-05T17:06:01Z
[hotfix] use JUnit's TemporaryFolder in BlobRecoveryITCase, too
This makes cleaning up simpler.
commit a078cb0c26071fe70e3668d23d0c8bef8550892f
Author: Nico Kruber
Date: 2017-01-05T17:27:00Z
[hotfix] add a missing "'" to the BlobStore class
commit a643f0b989c640a81b112ad14ae27a2a2b1ab257
Author: Nico Kruber
Date: 2017-01-05T17:07:13Z
[FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs
This applies to the ZookeeperHaServices implementation.
commit 7d832919040059961940fc96d0cdb285bc9f77d3
Author: Nico Kruber
Date: 2017-01-05T17:18:10Z
[FLINK-5129] unify duplicate code between the BlobServer and ZookeeperHaServices
(this was introduced by c64860677f)
commit 19879a01b99c4772a09627eb5f380f794f6c1e27
Author: Nico Kruber
Date: 2016-11-30T13:52:12Z
[hotfix] add some more documentation in BlobStore-related classes
commit 80c17ef83104d1186c06d8f5d4cde11e4b05f2b8
Author: Nico Kruber
Date: 2017-01-06T10:55:23Z
[hotfix] minor code beautifications when checking parameters
+ also check the blobService parameter in BlobLibraryCacheManager
commit ff920e48bd69acef280bdef2a12e5f5f9cca3a88
Author: Nico Kruber
Date: 2017-01-06T13:21:42Z
[FLINK-5129] let BlobUtils#initStorageDirectory() throw a proper IOException
commit c8e2815787338f52e5ad369bcaedb1798284dd29
Author: Nico Kruber
Date: 2017-01-06T13:59:51Z
[hotfix] simplify code in BlobCache#deleteGlobal()
Also, re-order the code so that a local delete is always tried before creating
a connection to the BlobServer. If the connection fails, the local file has at
least been deleted.
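The ordering described above can be sketched roughly as follows. This is an illustration under assumptions, not the code in BlobCache#deleteGlobal(): the RemoteDelete interface and the method signature are hypothetical stand-ins for the connection to the BlobServer.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class DeleteOrderSketch {

    /** Hypothetical stand-in for the delete request sent to the BlobServer. */
    public interface RemoteDelete {
        void delete(String key) throws IOException;
    }

    public static void deleteGlobal(File localFile, String key, RemoteDelete server)
            throws IOException {
        // Local delete first: it needs no network and cannot be blocked by
        // an unreachable server.
        Files.deleteIfExists(localFile.toPath());

        // Remote delete second: if this throws, the local copy is already gone.
        server.delete(key);
    }
}
```

With this order, a caller that sees an IOException from the remote step can still rely on the local file having been removed.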
commit 38626a705fd0725a8e54f2ee1c3d0ec410184b8a
Author: Nico Kruber
Date: 2017-01-06T14:06:30Z
[FLINK-5129] make the BlobCache also use a distributed file system in HA mode
If available (in HA mode), download the jar files from the distributed file
system directly instead of querying the BlobServer. This way, the load is
spread across the nodes of the file system (depending on its implementation,
of course) rather than putting all the burden on a single BlobServer.
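A rough sketch of the lookup order this commit describes, assuming the cache falls back to the BlobServer when the file system is missing or the read fails. BlobSource and fetch are hypothetical names for illustration; the actual BlobCache and BlobStore interfaces in the pull request may differ.

```java
import java.io.IOException;
import java.nio.file.Path;

public class BlobFetchSketch {

    /** Hypothetical abstraction over "something that can deliver a BLOB". */
    public interface BlobSource {
        void fetch(String key, Path target) throws IOException;
    }

    /**
     * Tries the distributed file system first (if configured) and falls back
     * to the central BlobServer. Returns which source served the request.
     */
    public static String fetch(String key, Path target, BlobSource fileSystem, BlobSource server)
            throws IOException {
        if (fileSystem != null) {
            try {
                fileSystem.fetch(key, target);
                return "file-system";
            } catch (IOException e) {
                // File system not accessible from this node: use the old path.
            }
        }
        server.fetch(key, target);
        return "blob-server";
    }
}
```

Passing null for the file system source models the non-HA case, where only the BlobServer is queried.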
commit 1e86c5c92f9ac35c26c1e707d2d840c4edbeefb1
Author: Nico Kruber
Date: 2016-11-29T17:11:06Z
[hotfix] re-use some code in BlobServerDeleteTest
commit 68d2959b60f6b583cb48de8ed5aee3e18b163082
Author: Nico Kruber
Date: 2016-11-30T13:35:38Z
[hotfix] improve some failure messages in the BlobService's HA unit tests
commit 7cfbeb7707329cad57604a58f44254d4f8b6c9b3
Author: Nico Kruber
Date: 2017-01-06T16:21:05Z
[FLINK-5129] add unit tests for the BlobCache accessing the