Yadong Qi created SPARK-12352:
---------------------------------
Summary: Reuse the result of split in SQL
Key: SPARK-12352
URL: https://issues.apache.org/jira/browse/SPARK-12352
Project: Spark
Issue Type: Improvement
Components: SQL
Affects Versions: 1.5.2
Reporter: Yadong Qi
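When split is used in SQL and several values are read by index from the same
array, the same row is split again for every index access. In the optimized and
physical plans below, split(value#20,,) appears three times, once per projected
column.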
{code}
spark-sql> explain extended select array[0] as a, array[1] as b, array[2] as c from (select split(value, ',') as array from src_split) t;
== Parsed Logical Plan ==
'Project [unresolvedalias('array[0] AS a#16),unresolvedalias('array[1] AS b#17),unresolvedalias('array[2] AS c#18)]
'Subquery t
'Project [unresolvedalias('split('value,,) AS array#15)]
'UnresolvedRelation [src_split], None
== Analyzed Logical Plan ==
a: string, b: string, c: string
Project [array#15[0] AS a#16,array#15[1] AS b#17,array#15[2] AS c#18]
Subquery t
Project [split(value#20,,) AS array#15]
MetastoreRelation default, src_split, None
== Optimized Logical Plan ==
Project [split(value#20,,)[0] AS a#16,split(value#20,,)[1] AS b#17,split(value#20,,)[2] AS c#18]
MetastoreRelation default, src_split, None
== Physical Plan ==
Project [split(value#20,,)[0] AS a#16,split(value#20,,)[1] AS b#17,split(value#20,,)[2] AS c#18]
HiveTableScan [value#20], (MetastoreRelation default, src_split, None)
{code}
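To make the cost concrete, here is a minimal sketch in plain Python (not Spark code) that counts how often a split runs under each strategy. The names CountingSplit, project_inlined and project_reused are hypothetical and only illustrate the idea; they do not reflect how Catalyst evaluates expressions internally.

```python
class CountingSplit:
    """Hypothetical stand-in for SQL split() that records how often it runs."""

    def __init__(self):
        self.calls = 0

    def __call__(self, value, sep):
        self.calls += 1
        return value.split(sep)


def project_inlined(rows, split):
    # Mirrors the optimized plan above: split(value)[0], split(value)[1],
    # split(value)[2], so the split is evaluated once per projected column.
    return [(split(v, ",")[0], split(v, ",")[1], split(v, ",")[2]) for v in rows]


def project_reused(rows, split):
    # The improvement asked for: split once per row and index the result.
    out = []
    for v in rows:
        parts = split(v, ",")
        out.append((parts[0], parts[1], parts[2]))
    return out


rows = ["a,b,c", "d,e,f"]

inlined_split = CountingSplit()
inlined_result = project_inlined(rows, inlined_split)

reused_split = CountingSplit()
reused_result = project_reused(rows, reused_split)

print(inlined_split.calls, reused_split.calls)
```

Both projections return the same tuples; only the number of split evaluations per row differs (three versus one in this sketch).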