Alex Behm has posted comments on this change. Change subject: IMPALA-3084: Cache the sequence of table ref and materialized tuple ids during analysis. ......................................................................
Patch Set 1: (8 comments) http://gerrit.cloudera.org:8080/#/c/2415/1/fe/src/main/java/com/cloudera/impala/analysis/TableRef.java File fe/src/main/java/com/cloudera/impala/analysis/TableRef.java: Line 111: // Lists of table ref ids and materialized tuple ids of the full sequence of table > we talked about separating the tblref abstraction into a) a logical concept The purpose of caching the table refs ids here is to determine which tables originally appeared to the left of a certain table during plan generation. I'm thinking that with the table expr concept this caching will become unnecessary because the information would be captured in the table expr tree. I think the plan generation from table exprs would basically make this bug impossible, but maybe I'm missing something. Line 365: public List<TupleId> getAllTupleIds() { > if you want to make the distinction between tuple and tblref ids you should Done Line 444: * table ref ids and materialized tuple ids. > reference member names directly. Done Line 449: Preconditions.checkState(desc_ != null); > this depends on analyzeJoin() having been called for leftTblRef; checkState Done Line 459: if (leftTblRef_ != null) { > interleave so you only need one if-stmt. Done http://gerrit.cloudera.org:8080/#/c/2415/1/fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java File fe/src/main/java/com/cloudera/impala/planner/SingleNodePlanner.java: Line 1389: List<SlotId> lhsSlotIds = analyzer.getEquivSlots(slotDesc.getId(), lhsTblRefIds); > (rhsSid, ... Done Line 1393: // eliminate rows that would have been non-matches of an outer or anti join. > this seems to make assumptions about how the returned predicates are being The comment is explaining why a uni-directional value transfer is not sufficient. Note that an equivalence classes contains slots that have 'any' value transfer, not necessarily a mutual one, so we need to check if here explicitly. Added a comment to lhsSlotIds. http://gerrit.cloudera.org:8080/#/c/2415/1/testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test File testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test: Line 1540: partitions=1/1 files=4 size=554.13MB > do you know what this is? I reverted these changes. We ignore the file size anyway when doing the actual/expected comparison. We sometimes see slightly different files after upgrading Hive, etc. -- To view, visit http://gerrit.cloudera.org:8080/2415 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I298b8695c9f26644a395ca9f0e86040e3f5f3846 Gerrit-PatchSet: 1 Gerrit-Project: Impala Gerrit-Branch: cdh5-trunk Gerrit-Owner: Alex Behm <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-HasComments: Yes
