[
https://issues.apache.org/jira/browse/TEZ-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rajesh Balamohan updated TEZ-2496:
----------------------------------
Attachment: (was: TEZ-2496.9.branch-0.7.patch)
> Consider scheduling tasks in ShuffleVertexManager based on the partition
> sizes from the source
> ----------------------------------------------------------------------------------------------
>
> Key: TEZ-2496
> URL: https://issues.apache.org/jira/browse/TEZ-2496
> Project: Apache Tez
> Issue Type: Improvement
> Reporter: Rajesh Balamohan
> Assignee: Rajesh Balamohan
> Fix For: 0.8.0-alpha
>
> Attachments: TEZ-2496.1.patch, TEZ-2496.2.patch, TEZ-2496.3.patch,
> TEZ-2496.4.patch, TEZ-2496.5.patch, TEZ-2496.6.patch, TEZ-2496.7.patch,
> TEZ-2496.8.patch, TEZ-2496.9.addendum.patch,
> TEZ-2496.9.branch-0.7.patch, TEZ-2496.9.patch
>
>
> Consider scheduling tasks in ShuffleVertexManager based on the partition
> sizes from the source. This would be helpful in scenarios where resources
> are limited (or concurrent jobs are running, or there are multiple waves)
> with data skew, and the task that receives a large amount of data gets
> scheduled much later.
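> A minimal sketch of the ordering idea (plain Java, not the actual patch;
> the partitionSizes map is a hypothetical stand-in for whatever
> per-partition byte counts ShuffleVertexManager aggregates from the
> source tasks):
> {noformat}
> import java.util.List;
> import java.util.Map;
> import java.util.stream.Collectors;
>
> public class SkewAwareOrdering {
>
>   /**
>    * Orders pending reduce task (partition) indices so that tasks whose
>    * input partitions are largest get scheduled first. partitionSizes
>    * maps a partition index to the total bytes reported for it by the
>    * upstream (source) tasks.
>    */
>   static List<Integer> orderBySize(Map<Integer, Long> partitionSizes) {
>     return partitionSizes.entrySet().stream()
>         .sorted(Map.Entry.<Integer, Long>comparingByValue().reversed())
>         .map(Map.Entry::getKey)
>         .collect(Collectors.toList());
>   }
>
>   public static void main(String[] args) {
>     // Partition 3 is heavily skewed; size-based ordering hands it to
>     // the first wave instead of leaving it for the last one.
>     Map<Integer, Long> sizes = Map.of(0, 10L, 1, 12L, 2, 9L, 3, 500L);
>     System.out.println(orderBySize(sizes)); // prints [3, 1, 0, 2]
>   }
> }
> {noformat}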
> e.g. Consider the following Hive query running in a queue with limited
> capacity (42 slots in total) @ 200 GB scale:
> {noformat}
> CREATE TEMPORARY TABLE sampleData AS
> SELECT CASE
>            WHEN ss_sold_time_sk IS NULL THEN 70429
>            ELSE ss_sold_time_sk
>        END AS ss_sold_time_sk,
>        ss_item_sk,
>        ss_customer_sk,
>        ss_cdemo_sk,
>        ss_hdemo_sk,
>        ss_addr_sk,
>        ss_store_sk,
>        ss_promo_sk,
>        ss_ticket_number,
>        ss_quantity,
>        ss_wholesale_cost,
>        ss_list_price,
>        ss_sales_price,
>        ss_ext_discount_amt,
>        ss_ext_sales_price,
>        ss_ext_wholesale_cost,
>        ss_ext_list_price,
>        ss_ext_tax,
>        ss_coupon_amt,
>        ss_net_paid,
>        ss_net_paid_inc_tax,
>        ss_net_profit,
>        ss_sold_date_sk
> FROM store_sales distribute by ss_sold_time_sk;
> {noformat}
> This generated 39 maps and 134 reduce slots (3 reduce waves). When there
> are lots of nulls for ss_sold_time_sk, the data tends to skew towards
> 70429. If the reducer that receives this data were scheduled much earlier
> (i.e. in the first wave itself), the entire job would finish faster.
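> As a rough cost model (wave time t and skewed-task time T are assumed
> here, not measured): with w reduce waves, leaving the skewed task for
> the last wave costs roughly
> {noformat}
> skewed task in last wave:   total ≈ (w - 1) * t + T
> skewed task in first wave:  total ≈ max(w * t, T) ≈ T   (since T >> t)
> {noformat}
> so pulling the large partition into the first wave saves about
> (w - 1) * t of job runtime.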
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)