[jira] [Commented] (SPARK-30186) support Dynamic Partition Pruning in Adaptive Execution

2021-05-10 Thread weixiuli (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17342266#comment-17342266
 ] 

weixiuli commented on SPARK-30186:
--

https://github.com/apache/spark/pull/31941

> support Dynamic Partition Pruning in Adaptive Execution
> ---
>
> Key: SPARK-30186
> URL: https://issues.apache.org/jira/browse/SPARK-30186
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Affects Versions: 3.1.0
>Reporter: Xiaoju Wu
>Priority: Major
>
> Currently Adaptive Execution cannot work if Dynamic Partition Pruning is 
> applied.
> private def supportAdaptive(plan: SparkPlan): Boolean = {
>  // TODO migrate dynamic-partition-pruning onto adaptive execution.
>  sanityCheck(plan) &&
>  !plan.logicalLink.exists(_.isStreaming) &&
>  
> *!plan.expressions.exists(_.find(_.isInstanceOf[DynamicPruningSubquery]).isDefined)*
>  &&
>  plan.children.forall(supportAdaptive)
> }
> It means we cannot benefit the performance from both AE and DPP.
> This ticket is target to make DPP + AE works together.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-30186) support Dynamic Partition Pruning in Adaptive Execution

2020-10-01 Thread Yuming Wang (Jira)


[ 
https://issues.apache.org/jira/browse/SPARK-30186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205606#comment-17205606
 ] 

Yuming Wang commented on SPARK-30186:
-

For our internal tpcds q77, enable AQE and enable DPP cannot work properly:
{code:sql}
WITH ss AS
(
 SELECT   s_store_sk,
  Sum(ss_ext_sales_price) AS sales,
  Sum(ss_net_profit)  AS profit
 FROM store_sales,
  date_dim,
  store
 WHEREss_sold_date_sk = d_date_sk
 AND  d_date BETWEEN Cast('2000-08-23' AS DATE) AND  (
   Cast('2000-08-23' AS DATE) + interval '30' day)
 AND  ss_store_sk = s_store_sk
 GROUP BY s_store_sk), sr AS
(
 SELECT   s_store_sk,
  sum(sr_return_amt) AS returns,
  sum(sr_net_loss)   AS profit_loss
 FROM store_returns,
  date_dim,
  store
 WHEREsr_returned_date_sk = d_date_sk
 AND  d_date BETWEEN cast('2000-08-23' AS date) AND  (
   cast('2000-08-23' AS date) + interval '30' day)
 AND  sr_store_sk = s_store_sk
 GROUP BY s_store_sk), cs AS
(
 SELECT   cs_call_center_sk,
  sum(cs_ext_sales_price) AS sales,
  sum(cs_net_profit)  AS profit
 FROM catalog_sales,
  date_dim
 WHEREcs_sold_date_sk = d_date_sk
 AND  d_date BETWEEN cast('2000-08-23' AS date) AND  (
   cast('2000-08-23' AS date) + interval '30' day)
 GROUP BY cs_call_center_sk), cr AS
(
 SELECT   cr_call_center_sk,
  sum(cr_return_amount) AS returns,
  sum(cr_net_loss)  AS profit_loss
 FROM catalog_returns,
  date_dim
 WHEREcr_returned_date_sk = d_date_sk
 AND  d_date BETWEEN cast('2000-08-23' AS date) AND  (
   cast('2000-08-23' AS date) + interval '30' day)
 GROUP BY cr_call_center_sk), ws AS
(
 SELECT   wp_web_page_sk,
  sum(ws_ext_sales_price) AS sales,
  sum(ws_net_profit)  AS profit
 FROM web_sales,
  date_dim,
  web_page
 WHEREws_sold_date_sk = d_date_sk
 AND  d_date BETWEEN cast('2000-08-23' AS date) AND  (
   cast('2000-08-23' AS date) + interval '30' day)
 AND  ws_web_page_sk = wp_web_page_sk
 GROUP BY wp_web_page_sk), wr AS
(
 SELECT   wp_web_page_sk,
  sum(wr_return_amt) AS returns,
  sum(wr_net_loss)   AS profit_loss
 FROM web_returns,
  date_dim,
  web_page
 WHEREwr_returned_date_sk = d_date_sk
 AND  d_date BETWEEN cast('2000-08-23' AS date) AND  (
   cast('2000-08-23' AS date) + interval '30' day)
 AND  wr_web_page_sk = wp_web_page_sk
 GROUP BY wp_web_page_sk)
SELECT   channel,
 id,
 sum(sales)   AS sales,
 sum(returns) AS returns,
 sum(profit)  AS profit
FROM (
   SELECT'store channel' AS channel,
 ss.s_store_sk   AS id,
 sales,
 COALESCE(returns, 0)   AS returns,
 (profit - COALESCE(profit_loss,0)) AS profit
   FROM  ss
   LEFT JOIN sr
   ONss.s_store_sk = sr.s_store_sk
   UNION ALL
   SELECT 'catalog channel' AS channel,
  cs_call_center_sk AS id,
  sales,
  returns,
  (profit - profit_loss) AS profit
   FROM   cs
   CROSS JOIN cr
   UNION ALL
   SELECT'web channel' AS channel,
 ws.wp_web_page_sk AS id,
 sales,
 COALESCE(returns, 0)  returns,
 (profit - COALESCE(profit_loss,0)) AS profit
   FROM  ws
   LEFT JOIN wr
   ONws.wp_web_page_sk = wr.wp_web_page_sk ) x
GROUP BY rollup(channel, id)
ORDER BY channel,
 id limit 100
{code}


> support Dynamic Partition Pruning in Adaptive Execution
> ---
>
> Key: SPARK-30186
> URL: https://issues.apache.org/jira/browse/SPARK-30186
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL