Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5286: Kudu column name case cleanup
......................................................................


Patch Set 2:

(6 comments)

Is all of the special casing for "kudu case" and "impala case" in code that 
doesn't call directly into Kudu w/ col/table names really necessary?

Why can't we just lower case everything and have a fn which maps 
case-insensitive names to case sensitive names. Clearly that wouldn't work when 
there are col names that match in a case-insensitive comparison (e.g. COL and 
Col), but we should refuse to load those tables anyway. I know that is 
technically a separate JIRA but addressing that here may simplify this problem. 
Maybe I'm still missing something?

http://gerrit.cloudera.org:8080/#/c/6902/2//COMMIT_MSG
Commit Message:

PS2, Line 16: general code
unclear what this is


PS2, Line 17: which fixes a problem where the Analyzer would create
            :   two SlotDescriptors that point to the same column because
            :   registerSlotRef() was being called with inconsistent casing
            :   when ordering on the column.
Can you start this bullet with the problem, and perhaps a bit more detail about 
how this might happen, e.g. occurs when a col created externally with 
upper-cased characters is referenced by an order by expr. Then discuss the fix 
for this problem.


PS2, Line 20: It also exposes a getKuduName()
            :   that returns the name in Kudu casing, for use by Kudu specific
            :   code.
Nice. How can we make it more clear in the future that this needs to be used 
when calling into Kudu w/ the col name? Maybe I'll have some more ideas after 
reading the rest of the patch.


PS2, Line 23:  K
what's the issue? Helpful to state that first otherwise it's not clear what 
we're fixing.


Line 31: - Manually edited functional_kudu to change column names to have
nice. Re my question earlier about how to catch issues with casing in the 
future, maybe this is enough.


http://gerrit.cloudera.org:8080/#/c/6902/2/common/thrift/CatalogObjects.thrift
File common/thrift/CatalogObjects.thrift:

Line 193:   // The column name. For Kudu columns, this will be in Kudu case.
It's a shame we have to expose a new concept in non-Kudu-specific structs. 
Let's see if there's some way we can hide this problem. I'll revisit this after 
looking through the rest of the code.


-- 
To view, visit http://gerrit.cloudera.org:8080/6902
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I14aba88510012174716691b9946e1c7d54d01b44
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Thomas Tauber-Marshall <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Matthew Jacobs <[email protected]>
Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]>
Gerrit-HasComments: Yes

Reply via email to