[ 
https://issues.apache.org/jira/browse/BEAM-7854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomer Zeltzer updated BEAM-7854:
--------------------------------
    Description: 
Folder structure:   
{code:java}
A
    B
        a=100
            data1
                file1.zst
                file2.zst 
        a=999 
            data2
                file6.zst
        a=397
            data3
                file7.zst{code}
 

Glob:

 
{code:java}
/A/B/a=[0-9][0-9][0-9]/*/*{code}
Code:  

 
{code:java}
input.apply(Create.of(patterns))
     .apply("Matching patterns", FileIO.matchAll())
     .apply(FileIO.readMatches());
{code}
 

input is of type PBegin.

The above code matches 0 files even though, from the glob, it's clear it should 
match all files. I suspect it's because of line 227, where only the first parent 
folder is checked, while it could be an asterisk in a glob. I believe the right 
behaviour should be to check all parent folders and use the first one that 
exists.
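
To illustrate the expected result, here is a minimal sketch that uses only java.nio, not Beam. It recreates the folder structure above in a temp directory, finds the deepest parent folder of the glob that contains no wildcard, and walks from there. GlobSketch, wildcardFreeParent, match and demo are hypothetical names made up for this example, and it assumes '/'-separated paths with no escaped glob characters. It is not meant to describe how Beam resolves the glob, only what I would expect the glob to match.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper for illustration only; not Beam code. Assumes '/'-separated paths. */
final class GlobSketch {

  /** Returns the deepest directory named in the glob before any wildcard component. */
  static Path wildcardFreeParent(String glob) {
    int firstMeta = -1;
    for (int i = 0; i < glob.length(); i++) {
      if ("*?[{".indexOf(glob.charAt(i)) >= 0) {
        firstMeta = i;
        break;
      }
    }
    // Without wildcards the whole string is a literal path; search from its end instead.
    int from = firstMeta < 0 ? glob.length() - 1 : firstMeta;
    int lastSep = glob.lastIndexOf('/', from);
    if (lastSep < 0) return Paths.get("");   // relative glob: start at the working directory
    if (lastSep == 0) return Paths.get("/"); // wildcard directly under the root
    return Paths.get(glob.substring(0, lastSep));
  }

  /** Walks the wildcard-free parent and keeps the regular files that match the glob. */
  static List<Path> match(String glob) throws IOException {
    Path start = wildcardFreeParent(glob);
    if (!Files.isDirectory(start)) return Collections.emptyList();
    PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
    try (Stream<Path> paths = Files.walk(start)) {
      return paths.filter(Files::isRegularFile)
          .filter(matcher::matches)
          .sorted()
          .collect(Collectors.toList());
    }
  }

  /** Builds the folder structure from this issue in a temp directory and matches the glob. */
  static List<Path> demo() {
    try {
      Path root = Files.createTempDirectory("beam7854");
      String[] files = {
        "A/B/a=100/data1/file1.zst", "A/B/a=100/data1/file2.zst",
        "A/B/a=999/data2/file6.zst", "A/B/a=397/data3/file7.zst"
      };
      for (String f : files) {
        Path p = root.resolve(f);
        Files.createDirectories(p.getParent());
        Files.createFile(p);
      }
      return match(root + "/A/B/a=[0-9][0-9][0-9]/*/*");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    demo().forEach(System.out::println); // expected: the four .zst files
  }
}
```

With this structure the wildcard-free parent is /A/B, and all four .zst files match the glob, which is the result I expected from FileIO.matchAll().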



> Reading files from local file system does not fully support glob
> ----------------------------------------------------------------
>
>                 Key: BEAM-7854
>                 URL: https://issues.apache.org/jira/browse/BEAM-7854
>             Project: Beam
>          Issue Type: Bug
>          Components: sdk-java-core
>            Reporter: Tomer Zeltzer
>            Priority: Major
>
> Folder structure:   
> {code:java}
> A
>     B
>         a=100
>             data1
>                 file1.zst
>                 file2.zst 
>         a=999 
>             data2
>                 file6.zst
>         a=397
>             data3
>                 file7.zst{code}
>  
> Glob:
>  
> {code:java}
> /A/B/a=[0-9][0-9][0-9]/*/*{code}
> Code:  
>  
> {code:java}
> input.apply(Create.of(patterns))
>      .apply("Matching patterns", FileIO.matchAll())
>      .apply(FileIO.readMatches());
> {code}
>  
> input is of type PBegin.
> The above code matches 0 files even though, from the glob, it's clear it 
> should match all files. I suspect it's because of line 227, where only the 
> first parent folder is checked, while it could be an asterisk in a glob. I 
> believe the right behaviour should be to check all parent folders and use the 
> first one that exists.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
