Matthew Jacobs has posted comments on this change. Change subject: IMPALA-5286: Kudu column name case cleanup ......................................................................
Patch Set 2: (6 comments) Is all of the special casing for "kudu case" and "impala case" in code that doesn't call directly into Kudu w/ col/table names really necessary? Why can't we just lower case everything and have a fn which maps case-insensitive names to case sensitive names. Clearly that wouldn't work when there are col names that match in a case-insensitive comparison (e.g. COL and Col), but we should refuse to load those tables anyway. I know that is technically a separate JIRA but addressing that here may simplify this problem. Maybe I'm still missing something? http://gerrit.cloudera.org:8080/#/c/6902/2//COMMIT_MSG Commit Message: PS2, Line 16: general code unclear what this is PS2, Line 17: which fixes a problem where the Analyzer would create : two SlotDescriptors that point to the same column because : registerSlotRef() was being called with inconsistent casing : when ordering on the column. Can you start this bullet with the problem, and perhaps a bit more detail about how this might happen, e.g. occurs when a col created externally with upper-cased characters is referenced by an order by expr. Then discuss the fix for this problem. PS2, Line 20: It also exposes a getKuduName() : that returns the name in Kudu casing, for use by Kudu specific : code. Nice. How can we make it more clear in the future that this needs to be used when calling into Kudu w/ the col name? Maybe I'll have some more ideas after reading the rest of the patch. PS2, Line 23: K what's the issue? Helpful to state that first otherwise it's not clear what we're fixing. Line 31: - Manually edited functional_kudu to change column names to have nice. Re my question earlier about how to catch issues with casing in the future, maybe this is enough. http://gerrit.cloudera.org:8080/#/c/6902/2/common/thrift/CatalogObjects.thrift File common/thrift/CatalogObjects.thrift: Line 193: // The column name. For Kudu columns, this will be in Kudu case. It's a shame we have to expose a new concept in non-Kudu-specific structs. Let's see if there's some way we can hide this problem. I'll revisit this after looking through the rest of the code. -- To view, visit http://gerrit.cloudera.org:8080/6902 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I14aba88510012174716691b9946e1c7d54d01b44 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Thomas Tauber-Marshall <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Matthew Jacobs <[email protected]> Gerrit-Reviewer: Thomas Tauber-Marshall <[email protected]> Gerrit-HasComments: Yes
