[jira] [Commented] (FLINK-5178) allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system

2017-01-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15801974#comment-15801974
 ] 

ASF GitHub Bot commented on FLINK-5178:
---

Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/2911
  
I need to adapt a few things and choose a different approach - I'll re-open later


> allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system
> --
>
> Key: FLINK-5178
> URL: https://issues.apache.org/jira/browse/FLINK-5178
> Project: Flink
> Issue Type: Improvement
> Components: Network
> Reporter: Nico Kruber
> Assignee: Nico Kruber
>
> After FLINK-5129, high availability (HA) mode adds the ability for the 
> BlobCache instances at the task managers to download blobs directly from the 
> distributed file system. It would be nice if this also worked in non-HA mode, 
> so that BLOB_STORAGE_DIRECTORY_KEY may point to a distributed file system.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (FLINK-5178) allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system

2017-01-05 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15801975#comment-15801975
 ] 

ASF GitHub Bot commented on FLINK-5178:
---

Github user NicoK closed the pull request at:

https://github.com/apache/flink/pull/2911




[jira] [Commented] (FLINK-5178) allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system

2016-11-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708766#comment-15708766
 ] 

ASF GitHub Bot commented on FLINK-5178:
---

Github user NicoK commented on the issue:

https://github.com/apache/flink/pull/2911
  
@uce can you have a look after processing #2891 (FLINK-5129)?




[jira] [Commented] (FLINK-5178) allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system

2016-11-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/FLINK-5178?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15708761#comment-15708761
 ] 

ASF GitHub Bot commented on FLINK-5178:
---

GitHub user NicoK opened a pull request:

https://github.com/apache/flink/pull/2911

[FLINK-5178] allow BLOB_STORAGE_DIRECTORY_KEY to point to a distributed file system

Previously, this was restricted to a local file system path. Now it may also
point to a distributed file system, so distributed blob storage is no longer
restricted to HA mode.

Example for hdfs: `blob.storage.directory=hdfs:///flink/data/`

Unfortunately, we cannot detect the case when a locally-mounted distributed
file system is used. In this case, we require the user to give us a hint, e.g.:
`blob.storage.directory=dfs:///flink/data/`
for a file system mounted to `/flink/data/`. If this hint is missing, each
job manager and task manager will create its own unique storage directory under
this path, and files will be requested from the blob server at the job manager
as usual: the task manager requests a blob from the blob server, which reads
the file from the file system and sends it back, and the task manager then
stores it in its individual storage path on the same file system.

BEWARE: If HA mode is configured and a local file system without the `dfs://`
hint is given as HA_STORAGE_PATH, an IllegalConfigurationException will be
thrown!
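
The scheme handling described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in this pull request: the class `BlobStoragePathSketch` and its methods are hypothetical, paths without a scheme or with `file://` are assumed to count as local, every other scheme (including the `dfs://` hint) is assumed to count as distributed, and `IllegalArgumentException` stands in for Flink's IllegalConfigurationException.

```java
import java.net.URI;

/** Hypothetical helper (not part of Flink): classifies a storage path by its URI scheme. */
final class BlobStoragePathSketch {

    enum Kind { LOCAL, DISTRIBUTED }

    /**
     * Assumption: no scheme or "file" means a local path; any other scheme,
     * including the "dfs" hint for a locally-mounted distributed file system,
     * is treated as distributed.
     */
    static Kind classify(String storageDirectory) {
        String scheme = URI.create(storageDirectory).getScheme();
        if (scheme == null || scheme.equals("file")) {
            return Kind.LOCAL;
        }
        return Kind.DISTRIBUTED;
    }

    /**
     * Mirrors the BEWARE note: in HA mode a local path without the hint is rejected.
     * IllegalArgumentException is a stand-in for Flink's IllegalConfigurationException.
     */
    static void checkHaStoragePath(boolean haMode, String haStoragePath) {
        if (haMode && classify(haStoragePath) == Kind.LOCAL) {
            throw new IllegalArgumentException(
                    "HA mode needs a distributed storage path, got: " + haStoragePath);
        }
    }

    public static void main(String[] args) {
        System.out.println(classify("hdfs:///flink/data/"));
        System.out.println(classify("dfs:///flink/data/"));
        System.out.println(classify("/flink/data/"));
    }
}
```

Keeping the decision on the URI scheme alone means a mounted distributed file system is indistinguishable from a local disk unless the user supplies the hint, which is the limitation noted above.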

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/NicoK/flink FLINK-5178

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/flink/pull/2911.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2911


commit b65e74dd92bdf74b2816a0d8a26a5ebaa25ca586
Author: Nico Kruber 
Date:   2016-11-22T11:49:03Z

[hotfix] remove unused package-private BlobUtils#copyFromRecoveryPath

This was actually the same implementation as
FileSystemBlobStore#get(java.lang.String, java.io.File), and either of the two
could have been removed, but the implementation makes most sense at the
concrete file system abstraction layer, i.e. in FileSystemBlobStore.

commit 09bdd49e6282268fd9c1b2672f0ea6222e097ca2
Author: Nico Kruber 
Date:   2016-11-23T15:11:35Z

[hotfix] do not create intermediate strings inside String.format in BlobUtils

commit 93938ff97fef9e39c17ac795e1e89ca9de25e028
Author: Nico Kruber 
Date:   2016-11-24T16:11:19Z

[hotfix] properly shut down the BlobServer in BlobServerRangeTest

commit c0c9d2239a767154d6071171d4c33e762e01aa62
Author: Nico Kruber 
Date:   2016-11-24T17:50:43Z

[FLINK-5129] BlobServer: include the cluster id in the HA storage path for blobs

Also use JUnit's TemporaryFolder in BlobRecoveryITCase. This makes
cleaning up simpler.

commit 8b9c7d9fd6e1ab3c7f2175a31d0e29b41b01cc61
Author: Nico Kruber 
Date:   2016-11-23T18:50:52Z

[FLINK-5129] make the BlobCache use the HA filesystem back-end properly

Previously, the BlobServer held a local copy and, if high availability (HA)
was enabled, also copied jar files to a distributed file system. Upon restore,
these files were copied to the local store, from which they were used.

This commit abstracts the BlobServer's backing file system and makes it use the
distributed file system directly in HA mode, i.e. without the local file system
copy. Other than that, the behaviour does not change.

commit 249b2ea48f19c54498faa56ad45d299efaad4521
Author: Nico Kruber 
Date:   2016-11-25T16:42:05Z

[FLINK-5129] make the BlobCache also use a distributed file system in HA mode

* refactor the file system abstraction in FileSystemBlobStore so that it can
  be used by the task managers, too, which should not be able to delete files
  in a distributed file system shared among different nodes
* only download blobs from the blob server if not in HA mode or the distributed
  file system is not accessible by the BlobCache, e.g. at the task managers
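
The second bullet boils down to a small decision rule, sketched here under assumptions. `BlobFetchSketch`, its `Source` enum and `chooseSource` are hypothetical names for illustration and do not appear in the commit.

```java
/** Hypothetical illustration (not Flink code) of where a BlobCache would fetch a blob from. */
final class BlobFetchSketch {

    enum Source { DISTRIBUTED_FILE_SYSTEM, BLOB_SERVER }

    /**
     * Reads directly from the distributed file system only when HA mode is on
     * and that file system is reachable from the cache; otherwise falls back
     * to downloading from the blob server.
     */
    static Source chooseSource(boolean haMode, boolean distributedFsAccessible) {
        return (haMode && distributedFsAccessible)
                ? Source.DISTRIBUTED_FILE_SYSTEM
                : Source.BLOB_SERVER;
    }

    public static void main(String[] args) {
        System.out.println(chooseSource(true, true));
        System.out.println(chooseSource(true, false));
        System.out.println(chooseSource(false, true));
    }
}
```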

commit dd69f65a47205eb55ac8cc2c8f3aa9f7232dc8ba
Author: Nico Kruber 
Date:   2016-11-28T10:42:13Z

[FLINK-5129] restore non-HA mode unique directory setup in the blob server and cache

If not in high availability mode, local (and now also distributed) file systems
again try to set up a unique directory structure so that other instances with
the same configuration file or storage path do not interfere.

This was lost in 8b9c7d9fd6.
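
One way such a unique directory setup could look is sketched below for a local path, using only `java.nio.file`. This is an assumption-laden illustration, not the code of this commit: the class name, the `blobStore-` prefix and the retry limit of 10 are all made up for the example.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

/** Hypothetical sketch (not Flink code): one unique storage directory per instance. */
final class UniqueStorageDirSketch {

    /**
     * Creates a fresh sub-directory under the shared base path. The
     * "blobStore-" prefix and the retry count are illustrative assumptions.
     */
    static Path createUniqueDir(Path base) throws IOException {
        Files.createDirectories(base);
        for (int attempt = 0; attempt < 10; attempt++) {
            Path candidate = base.resolve("blobStore-" + UUID.randomUUID());
            try {
                // createDirectory fails if the directory already exists, so two
                // instances can never end up sharing the same directory.
                return Files.createDirectory(candidate);
            } catch (FileAlreadyExistsException e) {
                // Extremely unlikely name clash: try another random name.
            }
        }
        throw new IOException("Could not create a unique storage directory under " + base);
    }

    public static void main(String[] args) throws IOException {
        Path base = Files.createTempDirectory("blob-sketch");
        System.out.println(createUniqueDir(base));
        System.out.println(createUniqueDir(base));
    }
}
```

Relying on the atomic create-if-absent behaviour of `Files.createDirectory` avoids a separate existence check, which would be racy when several instances start with the same configured path.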

commit 76ccc9ffaaa63d6e0bd55ba7f6c08f8c1cff98cb
Author: Nico Kruber