Gabor Kaszab has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/17847 )

Change subject: IMPALA-10838: Error when struct returned from WITH()
......................................................................


Patch Set 9:

(20 comments)

http://gerrit.cloudera.org:8080/#/c/17847/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17847/9//COMMIT_MSG@7
PS9, Line 7: IMPALA-10838: Error when struct returned from WITH()
I wouldn't say I understand everything that happens in ExprSubstitutionMap but 
left some comments.
Nice work to fill this feature gap, anyway!


http://gerrit.cloudera.org:8080/#/c/17847/9//COMMIT_MSG@29
PS9, Line 29: in expression substitutions (in
            :     org.apache.impala.analysis.Expr.substituteImpl), if there is 
no
            :     substitution mapping for a field of a struct but there is a 
mapping
            :     for an enclosing struct, find the mapping for that enclosing 
struct
            :     and from its subexpressions use the one corresponding to the
            :     original expression for substitution
If I'm not mistaken with this approach every time you unsuccessfully try to 
substitute a field of a struct you then find the enclosing field and then try 
the substitution from that angle. Wouldn't it be more performant to populate 
the substitute map in a way to hold the fields of a struct from the beginning?


http://gerrit.cloudera.org:8080/#/c/17847/9//COMMIT_MSG@44
PS9, Line 44: Tests
Would it make sense to somehow assert on the 'tuple-size' in the scan node? (in 
a planner test maybe?) This way we could see if querying a struct member 
actually reads the whole struct or not.


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/Expr.java@1148
PS9, Line 1148: SlotRef overrides this method
              :     // so we need no check here.
This part in my opinion is only relevant when the reader knew that here was an 
'if' that was removed. Otherwise this sentence just brings confusion.

In case ResetAnalysisState() is overridden in SlotRef we should comment its 
behaviour here because in case it changes later on we won't know that this 
comment has to be rewritten as well.


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/Expr.java@1154
PS9, Line 1154:   protected final void 
callSubstituteImplOnChildren(ExprSubstitutionMap smap,
This doesn't add much just extracts the for loop into a different function. I 
don't feel that this brings more readability and seems an overkill at first 
sight.


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
File fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java:

http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@106
PS9, Line 106:     SlotRef enclosingStruct = findEnclosingStruct(structField, 
remainingPath);
If you read just this 4 lines of code you would have no idea what 
'remainingPath' is when you perform a findEnclosingStruct() function. This 
hurts readability in my opinion.
I think the mai reason of this is that findEnclosingStruct() is responsible not 
just for finding something but for returning something that is not really 
relevant taking into account the function name.


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@106
PS9, Line 106: structField
Do we know that 'structField' is in fact a field of a struct? Or do we perform 
all these substitution magic every time if there is a slot that is not found in 
the substitution map?


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@120
PS9, Line 120:     // if (getDesc().getParent().getParentSlotDesc() == null) 
return null;
This stinks for me. In case structField is indeed a field of a struct then it's 
parent should be the itemTupleDesc inside a struct slot descriptor.


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@124
PS9, Line 124:       final Expr e = getLhs().get(i);
This 'e' is expected to be a struct, right? I'd do that check here and do a 
continue of it isn't.
And below you can make a removeStructPathPrefixFromStructField() or something 
similar.


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@130
PS9, Line 130:         // TODO: Is it always true?
In general, could you please put some effort to eliminate the new TODOs you 
introduced? Or are those questions towards the reviewer?


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@141
PS9, Line 141: checkConditionsAndRemovePrefix
Is there a way to be more specific with the name. "checkConditions" could mean 
anything.
Additionally, from the name it's not obvious what prefix you remove from where.


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@148
PS9, Line 148:     if (refPath == null) return null;
if 'expr'/'ref' is analyzed is there a case when it doesn't have a resolvedPath?


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@153
PS9, Line 153: SlotRef root
could you emphasise more that 'root' is a struct?


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java@154
PS9, Line 154: root
Shouldn't you check if 'root' has already been analysed before calling 
getResolvedPath()? Or is this something we can assume from the above code?


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/Path.java
File fe/src/main/java/org/apache/impala/analysis/Path.java:

http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/Path.java@469
PS9, Line 469: path
             :    * after removing the prefix
The return type is a list of strings so I guess this function returns a 'raw 
path'. Could you emphasise this face in the comment?


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
File fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java:

http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java@182
PS9, Line 182:   public void getEnclosingSlotAndTupleDescs(List<SlotDescriptor> 
slotDescs,
Is this just a helper function used within SlotDescriptor? I haven't found any 
other usages. Could it be private then?


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/SlotRef.java
File fe/src/main/java/org/apache/impala/analysis/SlotRef.java:

http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@285
PS9, Line 285:   public boolean hasDesc() {
nit: should fit into a single line


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@409
PS9, Line 409:       // TODO: Is it possible that we don't YET have 
itemTupleDesc?
Could you dig deeper a bit to eliminate this TODO?


http://gerrit.cloudera.org:8080/#/c/17847/9/fe/src/main/java/org/apache/impala/analysis/SlotRef.java@410
PS9, Line 410:       // Preconditions.checkState(itemTupleDesc != null);
nit: commented code


http://gerrit.cloudera.org:8080/#/c/17847/9/testdata/workloads/functional-query/queries/QueryTest/nested-struct-in-select-list.test
File 
testdata/workloads/functional-query/queries/QueryTest/nested-struct-in-select-list.test:

http://gerrit.cloudera.org:8080/#/c/17847/9/testdata/workloads/functional-query/queries/QueryTest/nested-struct-in-select-list.test@260
PS9, Line 260: ====
I wonder if providing an inline view in a different way would also trigger this 
code. E.g.:
SELECT id, struct_name.field_name FROM (SELECT id, struct_name FROM tbl) v
WHERE struct_name.other_field_name;



--
To view, visit http://gerrit.cloudera.org:8080/17847
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iadb9233677355b85d424cc3f22b00b5a3bf61c57
Gerrit-Change-Number: 17847
Gerrit-PatchSet: 9
Gerrit-Owner: Daniel Becker <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Gabor Kaszab <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Wed, 19 Jan 2022 10:19:32 +0000
Gerrit-HasComments: Yes

Reply via email to