[ 
https://issues.apache.org/jira/browse/BEAM-2211?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16001736#comment-16001736
 ] 

ASF GitHub Bot commented on BEAM-2211:
--------------------------------------

GitHub user dhalperi opened a pull request:

    https://github.com/apache/beam/pull/2972

    [BEAM-2211] Move PathValidator into GCP-Core

    For now, this is not a Beam concept.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dhalperi/beam b2211-path-validator

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2972.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2972
    
----
commit cd6dc1c4cfc940095ee1dd4c7c9d6a080d425d25
Author: Dan Halperin <[email protected]>
Date:   2017-05-08T22:37:29Z

    [BEAM-2211] DataflowRunner: remove validation of file read/write paths
    
    Now that users can implement and register custom FileSystems,
    we can no longer really effectively validate filesystems they
    can read or write files from. They can even register file://
    to point to some HDFS path, e.g.,

commit b75596b2b75404ed695ec49bb9793b4b1048129e
Author: Dan Halperin <[email protected]>
Date:   2017-05-08T23:01:40Z

    [BEAM-2211] Move PathValidator into GCP-Core
    
    For now, this does not need to be a Beam concept

----


> DataflowRunner (Java) rejects all but GCS paths for FileBasedSource/Sink
> ------------------------------------------------------------------------
>
>                 Key: BEAM-2211
>                 URL: https://issues.apache.org/jira/browse/BEAM-2211
>             Project: Beam
>          Issue Type: Bug
>          Components: runner-dataflow
>            Reporter: Daniel Halperin
>            Assignee: Daniel Halperin
>             Fix For: 2.0.0
>
>
> {{FileBasedSource}} and {{Sink}} have switched in Beam to the {{FileSystems}} 
> API from the {{IOChannelUtils}} API, which means they now support HDFS 
> and GCS and others.
> However, the {{DataflowRunner}} still uses {{GcsPathValidator}}, which means 
> it will likely currently disallow HDFS and other new {{FileSystem}} 
> implementations.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
