Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21927 )
Change subject: IMPALA-13445: Ignore num partition for unpartitioned writes
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/21927/4/fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
File fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java:

http://gerrit.cloudera.org:8080/#/c/21927/4/fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java@446
PS4, Line 446: totalNumPartitions > 0
> Static partition INSERTs are probably not uncommon, so I think it's worth s
Done. Tested with a planner test that creates table customer_address_1_huge_part.


http://gerrit.cloudera.org:8080/#/c/21927/4/fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java@450
PS4, Line 450: Math.
> nit: this cast is unnecessary
Done


http://gerrit.cloudera.org:8080/#/c/21927/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test
File testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test:

http://gerrit.cloudera.org:8080/#/c/21927/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test@122
PS4, Line 122: ---- PARALLELPLANS
: Max Per-Host Resource Reservation: Memory=168.00MB Threads=24
: Per-Host Resource Estimates: Memory=78.24GB
: F01:PLAN FRAGMENT [HASH(tpcds_partitioned_parquet_snap.store_sales.ss_sold_date_sk)] hosts=10 instances=120
: |  Per-Instance Resources: mem-estimate=6.46GB mem-reservation=6.00MB thread-reservation=1
: |  max-parallelism=120 segment-costs=[44372981821, 97047706322]
: WRITE TO HDFS [tpcds_partitioned_parquet_snap.store_sales_duplicate, OVERWRITE=false, PARTITION-KEYS=(ss_sold_date_sk)]
: |  output exprs: ss_sold_time_sk, ss_item_sk, ss_customer_sk, ss_cdemo_sk, ss_hdemo_sk, ss_addr_sk, ss_store_sk, ss_promo_sk, ss_ticket_number, ss_quantity, ss_wholesale_cost, ss_list_price, ss_sales_price, ss_ext_discount_amt, ss_ext_sales_price, ss_ext_wholesale_cost, ss_ext_list_price, ss_ext_tax, ss_coupon_amt, ss_net_paid, ss_net_paid_inc_tax, ss_net_profit, ss_sold_date_sk
: |  mem-estimate=100.00KB mem-reservation=0B thread-reservation=0 cost=97047706322
: |
: 02:SORT
: |  order by: ss_sold_date_sk ASC NULLS LAST
: |  mem-estimate=6.44GB mem-reservation=6.00MB spill-buffer=2.00MB thread-reservation=0
: |  tuple-ids=2 row-size=96B cardinality=8.64G cost=39597689640
: |  in pipelines: 02(GETNEXT), 00(OPEN)
: |
: 01:EXCHANGE [HASH(tpcds_partitioned_parquet_snap.store_sales.ss_sold_date_sk)]
: |  mem-estimate=21.72MB mem-reservation=0B thread-reservation=0
: |  tuple-ids=0 row-size=96B cardinality=8.64G cost=4775292181
: |  in pipelines: 00(GETNEXT)
: |
: F00:PLAN FRAGMENT [RANDOM] hosts=10 instances=120
: Per-Instance Re
> I think Planner insist that I list column names when I declare "partitioned
Done


http://gerrit.cloudera.org:8080/#/c/21927/4/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-ddl-iceberg.test@311
PS4, Line 311: |  |  tuple-ids=1 row-size=0B cardinality=15.00M cost=19934990
: |  |  in pipelines: 01(GETNEXT)
> It went that high in my dev environment and I found it strange because I do
The max_fragment_instances_per_node option should be the safeguard here.

  public int getMaxParallelismPerNode() {
    if (getQueryOptions().isCompute_processing_cost()) {
      return Math.max(getMinParallelismPerNode(),
          Math.min(getQueryOptions().getMax_fragment_instances_per_node(),
              getAvailableCoresPerNode()));
    } else if (getQueryOptions().getMt_dop() > 0) {
      return getQueryOptions().getMt_dop();
    } else {
      return 1;
    }
  }

Downstream, we fix it as a default query option. I can look into bounding this further with information from the executor group set config, but that will involve wider changes that are worth their own patch.

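To make the bound concrete, here is a minimal standalone sketch of the clamping that the quoted getMaxParallelismPerNode() performs. It is not Impala code: the class name ParallelismBoundSketch and the plain int/boolean parameters are hypothetical stand-ins for the query-option getters and the per-node core count, and the input values are made up for illustration.

```java
// Standalone sketch of the clamping logic only; NOT the Impala implementation.
// The class name and all parameters are hypothetical stand-ins for the
// query-option getters used in the quoted method.
final class ParallelismBoundSketch {
  private ParallelismBoundSketch() {}

  static int maxParallelismPerNode(boolean computeProcessingCost,
      int minParallelismPerNode, int maxFragmentInstancesPerNode,
      int availableCoresPerNode, int mtDop) {
    if (computeProcessingCost) {
      // Upper bound: the smaller of the query option and the core count.
      int upperBound = Math.min(maxFragmentInstancesPerNode, availableCoresPerNode);
      // The minimum parallelism wins if it exceeds that upper bound.
      return Math.max(minParallelismPerNode, upperBound);
    } else if (mtDop > 0) {
      // Without processing-cost planning, mt_dop is used directly.
      return mtDop;
    }
    // Neither option is set: a single instance per node.
    return 1;
  }

  public static void main(String[] args) {
    // Option caps a 64-core node at 8 instances.
    System.out.println(maxParallelismPerNode(true, 1, 8, 64, 0));
    // Core count (12) is the tighter bound when the option is larger.
    System.out.println(maxParallelismPerNode(true, 1, 128, 12, 0));
    // Falls back to mt_dop when processing cost is not computed.
    System.out.println(maxParallelismPerNode(false, 1, 8, 64, 12));
  }
}
```

With these made-up inputs, the option is the effective cap only when it is smaller than the available core count; otherwise the core count bounds the result, which is why a large-core dev machine can still produce a high instance count unless the option is lowered.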
--
To view, visit http://gerrit.cloudera.org:8080/21927
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I51ab8fc35a5489351a88d372b28642b35449acfc
Gerrit-Change-Number: 21927
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto
Gerrit-Reviewer: Abhishek Rawat
Gerrit-Reviewer: David Rorke
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Riza Suminto
Gerrit-Reviewer: Wenzhe Zhou
Gerrit-Reviewer: Zoltan Borok-Nagy
Gerrit-Comment-Date: Fri, 18 Oct 2024 17:14:47 +0000
Gerrit-HasComments: Yes
