Abhishek Rawat has posted comments on this change. ( http://gerrit.cloudera.org:8080/13193 )
Change subject: IMPALA-7293: Show tuple layout in explain plan. ...................................................................... Patch Set 3: (10 comments) I have addressed the review comments. Patch #4 makes the necessary code changes for proper explain layout. I have however only updated the "tuple-layout.test" for now. Changing all the tests is rather extensive and I want to first get input on the layout in patch #4. Once we agree on a proper layout I can make the changes in all test units. http://gerrit.cloudera.org:8080/#/c/13193/3/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java File fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java: http://gerrit.cloudera.org:8080/#/c/13193/3/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java@76 PS3, Line 76: public static final OffsetComparator OffsetComparator_ = new OffsetComparator(); > you can probably do: Done http://gerrit.cloudera.org:8080/#/c/13193/3/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java@331 PS3, Line 331: expBuilder.append("Slot " + getId() + " "); > Consider using Objects.toStringHelper, eg: Decided to use a new format where we have a tuple header {Tuple=<id>, byteSize=<bytes>, path=<tuple_path>, Slot format=<offset, nullable, type, pth>} The slot field names which we were earlier printing for every slot are now only printed once per tuple in the tuple header. http://gerrit.cloudera.org:8080/#/c/13193/3/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java@333 PS3, Line 333: getType > Would be better to call toSql() or whatever other method explicitly instead I have added a getExplainString() for the Type class. http://gerrit.cloudera.org:8080/#/c/13193/3/fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java@334 PS3, Line 334: expBuilder.append("path=" + getPath() + " "); > Would be better to call getExplainString() or whatever other method explici Added Path::getExplainString(). http://gerrit.cloudera.org:8080/#/c/13193/3/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test File testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test: http://gerrit.cloudera.org:8080/#/c/13193/3/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test@78 PS3, Line 78: Tuple 0: > Might be worth including the tuple byte size in the header? It's otherwise Done http://gerrit.cloudera.org:8080/#/c/13193/3/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test@79 PS3, Line 79: Slot 0 offset=0 type=ARRAY<STRUCT<o_orderkey:BIGINT,o_orderstatus:STRING,o_totalprice:DECIMAL(12,2),o_orderdate:STRING,o_orderpriority:STRING,o_clerk:STRING,o_shippriority:INT,o_comment:STRING,o_lineitems:ARRAY<STRUCT<l_partkey:BIGINT,l_suppkey:BIGINT,l_linenumber:INT,l_quantity:DECIMAL(12,2),l_extendedprice:DECIMAL(12,2),l_discount:DECIMAL(12,2),l_tax:DECIMAL(12,2),l_returnflag:STRING,l_linestatus:STRING,l_shipdate:STRING,l_commitdate:STRING,l_receiptdate:STRING,l_shipinstruct:STRING,l_shipmode:STRING,l_comment:STRING>>>> path=c.c_orders nullable=true > As discussed offline, I think we want to avoid including the type of the ne Nested/Complex Structs are now being represented as STRUCT<...> http://gerrit.cloudera.org:8080/#/c/13193/3/testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test@80 PS3, Line 80: Slot 2 offset=12 type=BIGINT path=c.c_custkey nullable=true > Maybe we can drop the "Slot <x>" prefix to make it a little denser? I don't I ended up dropping Slot <x> and other filed names. The tuple header is also changed as follows: Tuple=<id> byteSize=<sz> path=<tuple_path> Slot Format=<offset nullable type relative-path> The "path.relative-path" returns the proper path for a slot. This was done because for most slots the path had a common prefix (TupleDescriptor's path) http://gerrit.cloudera.org:8080/#/c/13193/3/testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test File testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test: http://gerrit.cloudera.org:8080/#/c/13193/3/testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test@53 PS3, Line 53: Slot 10 offset=44 type=INT path=tpcds.store_sales.ss_quantity nullable=true : Slot 11 offset=48 type=DECIMAL(7,2) path=tpcds.store_sales.ss_wholesale_cost nullable=true > These tables could be nicer to look at if the columns would be aligned with Good point. I added padding for better readability and moved the repeated filed names to Tuple's header. Also moved path to the end. Please look at the new changes in "tuple-layout.test" http://gerrit.cloudera.org:8080/#/c/13193/3/testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test@55 PS3, Line 55: true > Maybe replace true/false with 1/0 just to make it shorter? Done http://gerrit.cloudera.org:8080/#/c/13193/3/testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test@56 PS3, Line 56: type= > "type=" seems redundant to me - the SQL types with capital letters are reco Done. Also the other field names were also redundant and so moved them to the Tuple's header. -- To view, visit http://gerrit.cloudera.org:8080/13193 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iac38290217a219e75400e4206835992a7ea31dba Gerrit-Change-Number: 13193 Gerrit-PatchSet: 3 Gerrit-Owner: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Todd Lipcon <t...@apache.org> Gerrit-Comment-Date: Wed, 08 May 2019 00:27:24 +0000 Gerrit-HasComments: Yes