[ 
https://issues.apache.org/jira/browse/SPARK-14946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15266274#comment-15266274
 ] 

Raymond Honderdors commented on SPARK-14946:
--------------------------------------------

version 2.0 query plan:
== Parsed Logical Plan ==
'Project [*]
+- 'Join Inner, Some(('sd.campaignid = 'c.campaign_id))
   :- 'UnresolvedRelation `pe_servingdata`, Some(sd)
   +- 'UnresolvedRelation `pe_campaigns_gzip`, Some(c)

== Analyzed Logical Plan ==
originaltime: string, pluid: string, sdg: string, type: bigint, useragent: string, utctime: string, diorigin: string, dbid: string, timeid: string, browser: string, brandid: bigint, time: string, zip: string, dma: string, ad_id: int, ismobile: string, privacy: string, df: string, userip: string, agencyid: bigint, ta: string, mb: string, advertiserid: bigint, campaignid: bigint, os: string, usr: string, isdefaultimg: string, isuserinit: string, impressiontype: string, referrer: string, city: string, masteradid: bigint, state: string, val: string, isclick: string, flightid: bigint, siteid: string, intrn: string, asset: string, sid: string, account_id: bigint, event_time: bigint, campaign_id: bigint, campaign_type_id: int, campaign_name: string, version: int, account_id: bigint
Project [originaltime#194,pluid#195,sdg#196,type#197L,useragent#198,utctime#199,diorigin#200,dbid#201,timeid#202,browser#203,brandid#204L,time#205,zip#206,dma#207,ad_id#208,ismobile#209,privacy#210,df#211,userip#212,agencyid#213L,ta#214,mb#215,advertiserid#216L,campaignid#217L,os#218,usr#219,isdefaultimg#220,isuserinit#221,impressiontype#222,referrer#223,city#224,masteradid#225L,state#226,val#227,isclick#228,flightid#229L,siteid#230,intrn#231,asset#232,sid#233,account_id#192L,event_time#193L,campaign_id#235L,campaign_type_id#236,campaign_name#237,version#238,account_id#234L]
+- Join Inner, Some((campaignid#217L = campaign_id#235L))
   :- SubqueryAlias sd
   :  +- Relation[originaltime#194,pluid#195,sdg#196,type#197L,useragent#198,utctime#199,diorigin#200,dbid#201,timeid#202,browser#203,brandid#204L,time#205,zip#206,dma#207,ad_id#208,ismobile#209,privacy#210,df#211,userip#212,agencyid#213L,ta#214,mb#215,advertiserid#216L,campaignid#217L,os#218,usr#219,isdefaultimg#220,isuserinit#221,impressiontype#222,referrer#223,city#224,masteradid#225L,state#226,val#227,isclick#228,flightid#229L,siteid#230,intrn#231,asset#232,sid#233,account_id#192L,event_time#193L] HadoopFiles
   +- SubqueryAlias c
      +- Relation[campaign_id#235L,campaign_type_id#236,campaign_name#237,version#238,account_id#234L] HadoopFiles

== Optimized Logical Plan ==
Join Inner, Some((campaignid#217L = campaign_id#235L))
:- Filter isnotnull(campaignid#217L)
:  +- Relation[originaltime#194,pluid#195,sdg#196,type#197L,useragent#198,utctime#199,diorigin#200,dbid#201,timeid#202,browser#203,brandid#204L,time#205,zip#206,dma#207,ad_id#208,ismobile#209,privacy#210,df#211,userip#212,agencyid#213L,ta#214,mb#215,advertiserid#216L,campaignid#217L,os#218,usr#219,isdefaultimg#220,isuserinit#221,impressiontype#222,referrer#223,city#224,masteradid#225L,state#226,val#227,isclick#228,flightid#229L,siteid#230,intrn#231,asset#232,sid#233,account_id#192L,event_time#193L] HadoopFiles
+- Filter isnotnull(campaign_id#235L)
   +- Relation[campaign_id#235L,campaign_type_id#236,campaign_name#237,version#238,account_id#234L] HadoopFiles

== Physical Plan ==
WholeStageCodegen
:  +- BroadcastHashJoin [campaignid#217L], [campaign_id#235L], Inner, BuildRight, None
:     :- Project [originaltime#194,pluid#195,sdg#196,type#197L,useragent#198,utctime#199,diorigin#200,dbid#201,timeid#202,browser#203,brandid#204L,time#205,zip#206,dma#207,ad_id#208,ismobile#209,privacy#210,df#211,userip#212,agencyid#213L,ta#214,mb#215,advertiserid#216L,campaignid#217L,os#218,usr#219,isdefaultimg#220,isuserinit#221,impressiontype#222,referrer#223,city#224,masteradid#225L,state#226,val#227,isclick#228,flightid#229L,siteid#230,intrn#231,asset#232,sid#233,account_id#192L,event_time#193L]
:     :  +- Filter isnotnull(campaignid#217L)
:     :     +- BatchedScan HadoopFiles[originaltime#194,pluid#195,sdg#196,type#197L,useragent#198,utctime#199,diorigin#200,dbid#201,timeid#202,browser#203,brandid#204L,time#205,zip#206,dma#207,ad_id#208,ismobile#209,privacy#210,df#211,userip#212,agencyid#213L,ta#214,mb#215,advertiserid#216L,campaignid#217L,os#218,usr#219,isdefaultimg#220,isuserinit#221,impressiontype#222,referrer#223,city#224,masteradid#225L,state#226,val#227,isclick#228,flightid#229L,siteid#230,intrn#231,asset#232,sid#233,account_id#192L,event_time#193L] Format: ParquetFormat, PushedFilters: [IsNotNull(campaignid)], ReadSchema: struct<originaltime:string,pluid:string,sdg:string,type:bigint,useragent:string,utctime:string,diorigin:string,dbid:string,timeid:string,browser:string,brandid:bigint,time:string,zip:string,dma:string,ad_id:int,ismobile:string,privacy:string,df:string,userip:string,agencyid:bigint,ta:string,mb:string,advertiserid:bigint,campaignid:bigint,os:string,usr:string,isdefaultimg:string,isuserinit:string,impressiontype:string,referrer:string,city:string,masteradid:bigint,state:string,val:string,isclick:string,flightid:bigint,siteid:string,intrn:string,asset:string,sid:string>
:     +- INPUT
+- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint]))
   +- WholeStageCodegen
      :  +- Project [campaign_id#235L,campaign_type_id#236,campaign_name#237,version#238,account_id#234L]
      :     +- Filter isnotnull(campaign_id#235L)
      :        +- BatchedScan HadoopFiles[campaign_id#235L,campaign_type_id#236,campaign_name#237,version#238,account_id#234L] Format: ParquetFormat, PushedFilters: [IsNotNull(campaign_id)], ReadSchema: struct<campaign_id:bigint,campaign_type_id:int,campaign_name:string,version:int>
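
For anyone trying to reproduce this, a plan dump like the one above can be requested with EXPLAIN EXTENDED. A minimal sketch follows. The query text is reconstructed from the parsed logical plan (a SELECT * with an inner join on sd.campaignid = c.campaign_id), so it is an assumption and may differ from the statement actually sent over JDBC.

```sql
-- Reconstructed from the parsed logical plan above; the original statement
-- is not shown in this comment, so treat the query text as an assumption.
EXPLAIN EXTENDED
SELECT *
FROM pe_servingdata sd
JOIN pe_campaigns_gzip c
  ON sd.campaignid = c.campaign_id
```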

> Spark 2.0 vs 1.6.1 Query Time(out)
> ----------------------------------
>
>                 Key: SPARK-14946
>                 URL: https://issues.apache.org/jira/browse/SPARK-14946
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Raymond Honderdors
>            Priority: Critical
>         Attachments: Query Plan 1.6.1.png, screenshot-spark_2.0.png, 
> spark-defaults.conf, spark-env.sh
>
>
> I run a query through the JDBC driver. On version 1.6.1 it returns after 
> 5–6 min; the same query against version 2.0 fails after 2 h (due to a timeout).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
