[
https://issues.apache.org/jira/browse/DRILL-1805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Daniel Barclay (Drill) updated DRILL-1805:
------------------------------------------
Description:
Scanning the file system for view files fails (resulting in "Table 'vv' not
found" errors) if the directory being scanned for view files contains a file
whose simple name (last pathname segment) contains a colon.
For example, the unit test method {{testDRILL_811View}} in Drill's
{{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}}
fails if {{/tmp}} contains a file with a name such as "{{aptitude-root.1528:JIsVaZ}}".
The cause is that Hadoop filesystem glob-pattern-matching code
({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling
{{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file
pathname strings and relative URI-style {{Path}} strings.
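The ambiguity can be reproduced without Hadoop. The sketch below uses {{java.net.URI}} only as a stand-in for URI-style parsing; Hadoop's {{Path}} does its own parsing, so this is an illustration under that assumption, not a trace of the Hadoop code. The class and method names are hypothetical.

```java
import java.net.URI;

final class ColonSegmentDemo {
    /** Returns the scheme that java.net.URI sees in the string, or null if it sees none. */
    static String schemeOf(String s) {
        return URI.create(s).getScheme();
    }

    public static void main(String[] args) {
        // File name taken from the report above.
        String raw = "aptitude-root.1528:JIsVaZ";

        // Read as a URI reference, everything before the colon is taken as a scheme.
        System.out.println("raw scheme:     " + schemeOf(raw));

        // With a leading "./" the same text is a relative path with no scheme.
        System.out.println("encoded scheme: " + schemeOf("./" + raw));
        System.out.println("encoded path:   " + URI.create("./" + raw).getPath());
    }
}
```

The same characters therefore mean different things depending on whether they are treated as a file name or as a URI reference.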
The problem seems to be that {{glob()}} calls {{child.getPath().getName()}} to
get the raw final segment of the pathname and then passes it as the second
argument to {{Path.Path(Path, String)}}, which expects URI/{{Path}} syntax.
Because the segment contains a colon, it would first need to be encoded as a
relative URI/{{Path}} string by prepending "{{./}}" (as {{Path.Path(String,
String, String)}} does internally), but {{glob()}} skips that step.
It seems that {{glob()}} should first use {{Path.Path(String, String, String)}}
to handle that encoding and then call {{Path.Path(Path, Path)}}.
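The encode-then-resolve order suggested above can be sketched with {{java.net.URI}} standing in for Hadoop's {{Path}}. This is an assumption-laden illustration, not the Hadoop fix: {{encodeSegment}} and {{child}} are hypothetical helpers, and the {{file:///tmp/}} parent is chosen only to match the example in this report.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class EncodeSegmentSketch {
    /** Hypothetical helper: turns a raw file name into a relative URI reference. */
    static URI encodeSegment(String rawName) {
        try {
            // The leading "./" keeps a colon in the name from being read as a scheme delimiter.
            // The multi-argument constructor also quotes characters that are illegal in a URI path.
            return new URI(null, null, "./" + rawName, null, null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Hypothetical helper: resolves a raw child name against a parent directory URI. */
    static URI child(URI parentDir, String rawName) {
        return parentDir.resolve(encodeSegment(rawName));
    }

    public static void main(String[] args) {
        URI tmp = URI.create("file:///tmp/");
        String raw = "aptitude-root.1528:JIsVaZ";

        // Unencoded: the name parses as an opaque URI, and resolve() hands it back
        // unchanged, so the parent directory is lost.
        System.out.println("unencoded: " + tmp.resolve(URI.create(raw)));

        // Encoded first: the name stays a path segment under the parent directory.
        System.out.println("encoded:   " + child(tmp, raw).getPath());
    }
}
```

The point of the sketch is the ordering: the raw name is converted to URI syntax once, at the boundary, and only the converted value is combined with the parent.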
Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.
was:
Scanning the file system for view files fails (resulting in "Table 'vv' not
found" errors) if the directory being scanned for view files contains a file
whose simple name (last pathname segment) contains a colon.
For example, the unit test method {{testDRILL_811View}} in Drill's
{{./exec/java-exec/src/test/java/org/apache/drill/TestExampleQueries.java}}
fails if {{/tmp}} contains a file named like "{{aptitude-root.1528:JIsVaZ}}".
The cause is that Hadoop filesystem glob-pattern-matching code
({{org.apache.hadoop.fs.Globber}}'s {{glob()}}, calling
{{org.apache.hadoop.fs.Path}}'s {{Path(Path,String)}}) mixes up relative file
pathname strings and relative URI-style {{Path}} strings. [Note: Changes to
isolate tests from each other might su
The problem seems to be where {{glob()}} calls {{child.getPath().getName()}} to
get the raw final segment of the pathname and passes that as the second
argument to {{Path.Path(Path, String)}} (which takes URI/{{Path}} syntax)
without encoding the raw segment into a relative URI/{{Path}} string by
prepending "{{./}}" because of the colon (e.g., as {{Path.Path(String, String,
String)}} does internally).
Action items:
1) Report Hadoop bug to Hadoop.
2) Review Drill's handling and propagation of the error.
> view not found if view file directory contains child with colon in simple name
> ------------------------------------------------------------------------------
>
> Key: DRILL-1805
> URL: https://issues.apache.org/jira/browse/DRILL-1805
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Daniel Barclay (Drill)
> Priority: Minor
> Fix For: Future
>