More details :
Execution plan for Original query
select distinct pge.portfolio_code
from table1 pge join table2 p
on p.perm_group = pge.anc_port_group
join table3 uge
on p.user_group=uge.anc_user_group
where uge.user_name = 'user' and p.perm_type = 'TEST'
== Physical Plan ==
TungstenAggregate(key=[portfolio_code#14119], functions=[],
output=[portfolio_code#14119])
TungstenExchange hashpartitioning(portfolio_code#14119)
TungstenAggregate(key=[portfolio_code#14119], functions=[],
output=[portfolio_code#14119])
TungstenProject [portfolio_code#14119]
BroadcastHashJoin [user_group#13665], [anc_user_group#13658], BuildRight
TungstenProject [portfolio_code#14119,user_group#13665]
BroadcastHashJoin [anc_port_group#14117], [perm_group#13667],
BuildRight
ConvertToUnsafe
Scan
ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table1.parquet][portfolio_code#14119,anc_port_group#14117]
ConvertToUnsafe
Project [user_group#13665,perm_group#13667]
Filter (perm_type#13666 = TEST)
Scan
ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table2.parquet][user_group#13665,perm_group#13667,perm_type#13666]
ConvertToUnsafe
Project [anc_user_group#13658]
Filter (user_name#13659 = user)
Scan
ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table3.parquet][anc_user_group#13658,user_name#13659]
Execution plan for optimized query
select distinct pge.portfolio_code
from table1 uge, table2 p, table3 pge
where uge.user_name = 'user' and p.perm_type = 'TEST'
and p.perm_group = pge.anc_port_group
and p.user_group=uge.anc_user_group
== Physical Plan ==
TungstenAggregate(key=[portfolio_code#14119], functions=[],
output=[portfolio_code#14119])
TungstenExchange hashpartitioning(portfolio_code#14119)
TungstenAggregate(key=[portfolio_code#14119], functions=[],
output=[portfolio_code#14119])
TungstenProject [portfolio_code#14119]
BroadcastHashJoin [perm_group#13667], [anc_port_group#14117], BuildRight
TungstenProject [perm_group#13667]
BroadcastHashJoin [anc_user_group#13658], [user_group#13665],
BuildRight
ConvertToUnsafe
Project [anc_user_group#13658]
Filter (user_name#13659 = user)
Scan
ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table3.parquet][anc_user_group#13658,user_name#13659]
ConvertToUnsafe
Project [perm_group#13667,user_group#13665]
Filter (perm_type#13666 = TEST)
Scan
ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table2.parquet][perm_group#13667,user_group#13665,perm_type#13666]
ConvertToUnsafe
Scan
ParquetRelation[snackfs://shared:9042/aladdin_data_beta/table1.parquet][portfolio_code#14119,anc_port_group#14117]
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-Optimization-tp26548p26553.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]