[ https://issues.apache.org/jira/browse/SPARK-12441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Xiao Li updated SPARK-12441:
----------------------------
Description:
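The value of missingInput in Generate/MapPartitions/AppendColumns/MapGroups/CoGroup is incorrect.
In the example below, the physical plan prints Generate with a leading "!", the marker that explain uses for an operator with non-empty missingInput, although the only attribute absent from the child's output, _1#8, appears to be one that the operator itself produces.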
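{code}
import org.apache.spark.sql.Row  // needed for the Row(...) pattern below
// toDF and the 'letters syntax assume sqlContext.implicits._ is in scope (as in spark-shell)
val df = Seq((1, "a b c"), (2, "a b"), (3, "a")).toDF("number", "letters")
// Split each "letters" value on spaces, emitting one output row per token
val df2 =
  df.explode('letters) {
    case Row(letters: String) => letters.split(" ").map(Tuple1(_)).toSeq
  }
// Print the parsed, analyzed, optimized and physical plans
df2.explain(true)
{code}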
{code}
== Parsed Logical Plan ==
'Generate UserDefinedGenerator('letters), true, false, None
+- Project [_1#0 AS number#2,_2#1 AS letters#3]
   +- LocalRelation [_1#0,_2#1], [[1,a b c],[2,a b],[3,a]]
== Analyzed Logical Plan ==
number: int, letters: string, _1: string
Generate UserDefinedGenerator(letters#3), true, false, None, [_1#8]
+- Project [_1#0 AS number#2,_2#1 AS letters#3]
   +- LocalRelation [_1#0,_2#1], [[1,a b c],[2,a b],[3,a]]
== Optimized Logical Plan ==
Generate UserDefinedGenerator(letters#3), true, false, None, [_1#8]
+- LocalRelation [number#2,letters#3], [[1,a b c],[2,a b],[3,a]]
== Physical Plan ==
!Generate UserDefinedGenerator(letters#3), true, false, [number#2,letters#3,_1#8]
+- LocalTableScan [number#2,letters#3], [[1,a b c],[2,a b],[3,a]]
{code}
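To make the symptom concrete, here is a minimal, self-contained sketch of how a missing-input check of this kind can misfire. It is a toy model, not Spark's code: the types Attr and Op, and the methods missingInputNaive, missingInputFixed and prefix, are hypothetical names introduced only for illustration. The remedy it shows (subtracting the attributes an operator produces itself) is an assumption about one plausible fix and is not stated in this issue.

```scala
// Simplified stand-in for a Catalyst attribute such as letters#3 (hypothetical type).
final case class Attr(name: String, id: Int) {
  override def toString: String = s"$name#$id"
}

// Simplified stand-in for a plan operator (hypothetical type, not Spark's API).
final case class Op(
    name: String,
    references: Set[Attr],            // attributes the operator's expressions mention
    produced: Set[Attr],              // attributes the operator creates itself
    children: Seq[Op] = Nil,
    leafOutput: Set[Attr] = Set.empty // output of a leaf operator such as a scan
) {
  // Attributes made available by the children.
  def inputSet: Set[Attr] = children.flatMap(_.output).toSet

  // A leaf exposes its own output; other operators pass through their input plus what they produce.
  def output: Set[Attr] =
    if (children.isEmpty) leafOutput else inputSet ++ produced

  // Behaviour matching the symptom: self-produced attributes are reported as missing.
  def missingInputNaive: Set[Attr] = references -- inputSet

  // Assumed remedy: do not count attributes the operator produces itself.
  def missingInputFixed: Set[Attr] = references -- inputSet -- produced

  // Mimics the "!" marker that explain prints for operators with missing input.
  def prefix(missing: Set[Attr]): String =
    if (missing.nonEmpty && children.nonEmpty) "!" else ""
}

object MissingInputSketch {
  val number    = Attr("number", 2)
  val letters   = Attr("letters", 3)
  val generated = Attr("_1", 8)

  val scan = Op("LocalTableScan", Set.empty, Set.empty, leafOutput = Set(number, letters))
  val generate =
    Op("Generate", Set(number, letters, generated), Set(generated), Seq(scan))

  def main(args: Array[String]): Unit = {
    // prints "!Generate": _1#8 is wrongly treated as missing
    println(generate.prefix(generate.missingInputNaive) + generate.name)
    // prints "Generate": nothing is missing once produced attributes are excluded
    println(generate.prefix(generate.missingInputFixed) + generate.name)
  }
}
```

In this model the naive check flags _1#8 because it is referenced by Generate but not supplied by LocalTableScan, which matches the "!" seen in the physical plan above.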
was:
The value of missingInput in Generate is incorrect.
> Fixing missingInput in Generate/MapPartitions/AppendColumns/MapGroups/CoGroup
> -----------------------------------------------------------------------------
>
> Key: SPARK-12441
> URL: https://issues.apache.org/jira/browse/SPARK-12441
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.5.2, 1.6.0
> Reporter: Xiao Li
>