Todd Lipcon has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/13091 )

Change subject: [acid] Predicate to test if dir must be included.
......................................................................


Patch Set 2:

(13 comments)

http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java
File fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java:

http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@73
PS2, Line 73: pathPredicate
Can you clarify in the javadocs whether this predicate is applied to all paths 
or just to directories (to limit recursion)? Also, why use String here instead 
of Path? If using String we should also clarify what the string is (full URL? 
just reltaive path name to the part dir?)


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/FileMetadataLoader.java@121
PS2, Line 121:         fileStatuses = FileSystemUtil.listFiles(fs, partDir_, 
recursive_);
I'd think we'd need the predicate here too.


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
File fe/src/main/java/org/apache/impala/catalog/HdfsTable.java:

http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@485
PS2, Line 485:   public void initializePartitionMetadata(
was this necessary>?


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@607
PS2, Line 607:    * TODO:Sudhanshu: Move this to interface and then use this 
implementation in BaseTableRef
yea FeFsTable.Utils already has some similar stuff (it's in that Utils class 
because it predcates java 8 default interfaces)


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@616
PS2, Line 616: "true"
perhaps we should be using Boolean.parseValue here? Maybe we need a tolower as 
well? is 'TRUE' supported?


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1562
PS2, Line 1562:     Preconditions.checkNotNull(hdfsBaseDir_, "HdfsTable base 
dir is null");
how did we get to this state?


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java@1726
PS2, Line 1726:   public FileSystemUtil.FsType getFsType() {
seems to have been pulled into your patch by mistake?


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/catalog/local/DirectMetaProvider.java@293
PS2, Line 293:         //TODO:Sudhanshu: Understand how DirectMetadataProvider 
is used.
sadly it's not used right now. Basically dead code from a half-finished attempt 
to excise catalogd. Perhaps we should remove it for the time being.


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java
File fe/src/main/java/org/apache/impala/common/FileSystemUtil.java:

http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@525
PS2, Line 525:           return listFiles(fs, p, recursive);
We need to postprocess this result to evaluate the predicate, right?


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/main/java/org/apache/impala/common/FileSystemUtil.java@603
PS2, Line 603:       boolean includeDir = 
predicate_.test(fileStatus.getPath().toString());
We should clarify throughout that the predicate only applies to limit recursion 
and does not actually limit the returned _files_.


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java
File fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java:

http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/test/java/org/apache/impala/catalog/FileMetadataLoaderTest.java@45
PS2, Line 45:         /* oldFds = */Collections.emptyList(), hostIndex, 
x->true);
maybe add a new case to this test that passes a predicate to exclude one of the 
partitions or something?


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/test/java/org/apache/impala/common/RecursingIteratorTest.java
File fe/src/test/java/org/apache/impala/common/RecursingIteratorTest.java:

http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/test/java/org/apache/impala/common/RecursingIteratorTest.java@46
PS2, Line 46:   @Mock
impressive mocking, but I wonder whether it isn't simpler to just change one of 
the other tests that actually runs against the real filesystem. It's still 
relatively fast to run, and we always assume that unit tests run in an 
envronment with FileSystem set up


http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/test/java/org/apache/impala/planner/ExplainTest.java
File fe/src/test/java/org/apache/impala/planner/ExplainTest.java:

http://gerrit.cloudera.org:8080/#/c/13091/2/fe/src/test/java/org/apache/impala/planner/ExplainTest.java@59
PS2, Line 59: public class ExplainTest extends FrontendTestBase {
somehow this got pulled into your patch?



--
To view, visit http://gerrit.cloudera.org:8080/13091
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0e88281d277127c9499d37b95fbba55dcc7761c
Gerrit-Change-Number: 13091
Gerrit-PatchSet: 2
Gerrit-Owner: Sudhanshu Arora <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Vihang Karajgaonkar <[email protected]>
Gerrit-Reviewer: Yongzhi Chen <[email protected]>
Gerrit-Comment-Date: Fri, 26 Apr 2019 06:04:18 +0000
Gerrit-HasComments: Yes

Reply via email to