[ 
https://issues.apache.org/jira/browse/BEAM-2005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15989683#comment-15989683
 ] 

ASF GitHub Bot commented on BEAM-2005:
--------------------------------------

GitHub user lukecwik opened a pull request:

    https://github.com/apache/beam/pull/2776

    [BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2033, BEAM-2070] Base implementation of HadoopFileSystem.

    TODO:
    * Add a multiplexing FileSystem that routes requests by base URI when multiple file systems are configured.
    * Revisit copy/rename to see whether we can avoid moving all the bytes through the local machine.
    
    Be sure to do all of the following to help us incorporate your contribution
    quickly and easily:
    
     - [x] Make sure the PR title is formatted like:
       `[BEAM-<Jira issue #>] Description of pull request`
     - [x] Make sure tests pass via `mvn clean verify`. (Even better, enable
           Travis-CI on your fork and ensure the whole test matrix passes).
     - [x] Replace `<Jira issue #>` in the title with the actual Jira issue
           number, if there is one.
     - [x] If this contribution is large, please file an Apache
           [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
    
    ---


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lukecwik/incubator-beam hdfs2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/beam/pull/2776.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2776
    
----
commit ecfb534c79cb38bdf378dffe25d6a7ee3e20e5c6
Author: Luke Cwik <lc...@google.com>
Date:   2017-04-29T01:37:03Z

    [BEAM-2005, BEAM-2030, BEAM-2031, BEAM-2032, BEAM-2033, BEAM-2070] Base implementation of HDFS.
    
    TODO:
    * Add a multiplexing FileSystem that routes requests by base URI when multiple file systems are configured.
    * Revisit copy/rename to see whether we can avoid moving all the bytes through the local machine.

----
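
The first TODO item above (routing requests by base URI) could be sketched roughly as follows. This is only an illustration of the idea, not code from the pull request: the class BaseUriRouter, its methods, and the backend labels are all hypothetical, and a real implementation would hold file system instances rather than strings.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: picks a backend label from a path's scheme and authority. */
final class BaseUriRouter {
  // Maps a base such as "hdfs://namenode-a:8020" to a backend label (assumed stand-in
  // for a configured file system instance).
  private final Map<String, String> backendsByBase = new HashMap<>();

  /** Registers a backend label for a base URI such as hdfs://namenode-a:8020. */
  void register(String baseUri, String backend) {
    backendsByBase.put(baseKey(URI.create(baseUri)), backend);
  }

  /** Returns the backend registered for the path's base URI, or throws if none matches. */
  String route(String path) {
    String key = baseKey(URI.create(path));
    String backend = backendsByBase.get(key);
    if (backend == null) {
      throw new IllegalArgumentException("No file system configured for " + key);
    }
    return backend;
  }

  /** Builds the lookup key from scheme and authority, ignoring the path component. */
  private static String baseKey(URI uri) {
    if (uri.getScheme() == null) {
      throw new IllegalArgumentException("Path has no scheme: " + uri);
    }
    String authority = uri.getAuthority() == null ? "" : uri.getAuthority();
    return uri.getScheme() + "://" + authority;
  }
}
```

Keying on scheme plus authority lets two clusters that share the hdfs scheme be told apart; paths without a scheme are rejected rather than guessed at.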


> Add a Hadoop FileSystem implementation of Beam's FileSystem
> -----------------------------------------------------------
>
>                 Key: BEAM-2005
>                 URL: https://issues.apache.org/jira/browse/BEAM-2005
>             Project: Beam
>          Issue Type: New Feature
>          Components: sdk-java-extensions
>            Reporter: Stephen Sisk
>            Assignee: Luke Cwik
>             Fix For: First stable release
>
>
> Beam's FileSystem provides an abstraction for reading files from many 
> different places. 
> We should add an implementation backed by Hadoop's FileSystem 
> (https://hadoop.apache.org/docs/r2.8.0/api/org/apache/hadoop/fs/FileSystem.html), 
> which would let us read from any file system that implements Hadoop's 
> FileSystem (including HDFS, Azure, S3, etc.).
> I'm investigating this now.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
