[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2017-02-22 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15878111#comment-15878111
 ] 

Apache Spark commented on SPARK-13721:
--

User 'bogdanrdc' has created a pull request for this issue:
https://github.com/apache/spark/pull/17026

> Add support for LATERAL VIEW OUTER explode()
> 
>
> Key: SPARK-13721
> URL: https://issues.apache.org/jira/browse/SPARK-13721
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Ian Hellstrom
> Assignee: Bogdan Raducanu
> Fix For: 2.2.0
>
>
> Hive supports the [LATERAL VIEW OUTER|https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView#LanguageManualLateralView-OuterLateralViews]
> syntax to make sure that when an array is empty, the content from the outer
> table is still returned.
> Within Spark, this is currently only possible by using a HiveContext and
> executing HiveQL statements. It would be nice if the standard explode()
> DataFrame method allowed the same. A possible signature would be:
> {code:scala}
> explode[A, B](inputColumn: String, outputColumn: String, outer: Boolean = false)
> {code}
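
To make the requested behaviour concrete, here is a minimal sketch in plain Scala collections (not Spark code) of the difference between an inner and an outer explode. The names `Row`, `explodeInner` and `explodeOuter` are hypothetical and exist only for this illustration; a missing element is modelled as `None`.

```scala
// Plain-Scala model of LATERAL VIEW explode vs. LATERAL VIEW OUTER explode.
// Row, explodeInner and explodeOuter are hypothetical names for illustration.
object OuterExplodeModel {
  final case class Row(id: Int, values: Seq[String])

  // Inner behaviour: a row with an empty array produces no output rows.
  def explodeInner(rows: Seq[Row]): Seq[(Int, Option[String])] =
    rows.flatMap(r => r.values.map(v => (r.id, Option(v))))

  // Outer behaviour: a row with an empty array still produces one output row,
  // with the element missing.
  def explodeOuter(rows: Seq[Row]): Seq[(Int, Option[String])] =
    rows.flatMap { r =>
      if (r.values.isEmpty) Seq((r.id, None))
      else r.values.map(v => (r.id, Option(v)))
    }

  def main(args: Array[String]): Unit = {
    val rows = Seq(Row(1, Seq("a", "b")), Row(2, Seq.empty))
    println(explodeInner(rows)) // row 2 is dropped
    println(explodeOuter(rows)) // row 2 is kept with a missing element
  }
}
```

In this model the row with id 2 disappears from the inner result and survives in the outer result, which is the property the ticket asks for.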



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2017-02-16 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15870106#comment-15870106
 ] 

Apache Spark commented on SPARK-13721:
--

User 'rxin' has created a pull request for this issue:
https://github.com/apache/spark/pull/16958




[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2017-01-16 Thread Apache Spark (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15824308#comment-15824308
 ] 

Apache Spark commented on SPARK-13721:
--

User 'bogdanrdc' has created a pull request for this issue:
https://github.com/apache/spark/pull/16608




[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2016-09-05 Thread Ewan Leith (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15464427#comment-15464427
 ] 

Ewan Leith commented on SPARK-13721:


Assuming Don's use case is the same as ours, we have to write odd-looking 
queries like this pseudo-code to get the full set of entries when using explode 
on records where the nested array is not always populated (the .filter calls 
are there to make it explicit what's happening):

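import org.apache.spark.sql.functions.{col, explode, lit}

// Rows whose nested array is populated: one output row per element.
val df1 = df
  .filter("column.nested_array is not null")
  .withColumn("element", explode(col("column.nested_array")))
  .select("other_column", "element")

// Rows whose nested array is null: keep them with a placeholder element.
val df2 = df
  .filter("column.nested_array is null")
  .select(col("other_column"), lit("").as("element"))

// union replaces unionAll, which is deprecated as of Spark 2.0.
df1.union(df2)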






[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2016-09-02 Thread Herman van Hovell (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15459539#comment-15459539
 ] 

Herman van Hovell commented on SPARK-13721:
---

Could you explain what this would look like? I am asking because adding 
{outer} to {explode()} is a bit weird, since outer is a property of the 
generate process and not of the generator.
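
One way to picture that distinction is the following plain-Scala sketch (not Spark's actual internals). The names `Generator`, `Explode` and `generate` are hypothetical and chosen only to mirror the wording above: the generator maps one input value to zero or more outputs, while the generate step decides what happens to an input row whose generator output is empty.

```scala
// Plain-Scala sketch: "outer" belongs to the generate step, not the generator.
// All names here are hypothetical and only mirror the terms used in the comment.
object GenerateModel {
  // A generator turns one input value into zero or more output values.
  trait Generator[A, B] {
    def eval(input: A): Seq[B]
  }

  // Explode is a generator over sequences; it has no notion of "outer".
  final class Explode[B] extends Generator[Seq[B], B] {
    def eval(input: Seq[B]): Seq[B] = if (input == null) Seq.empty else input
  }

  // The generate step pairs each input row with its generated values.
  // The `outer` flag lives here: it keeps rows whose generator output is empty.
  def generate[R, A, B](
      rows: Seq[R],
      extract: R => A,
      gen: Generator[A, B],
      outer: Boolean): Seq[(R, Option[B])] =
    rows.flatMap { row =>
      val out = gen.eval(extract(row))
      if (out.isEmpty && outer) Seq((row, None))
      else out.map(b => (row, Option(b)))
    }
}
```

With this shape the same `Explode` instance can be used with `outer = true` or `outer = false`, and any other generator would get outer behaviour without changes of its own.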




[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2016-09-02 Thread Don Drake (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15459490#comment-15459490
 ] 

Don Drake commented on SPARK-13721:
---

My nested structures aren't simple types; they are structs (case classes), so 
this existing method works great for me. 

This ticket is about modifying the explode() call to support outer, not adding 
outer to the data frame API.




[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2016-08-31 Thread Herman van Hovell (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452926#comment-15452926
 ] 

Herman van Hovell commented on SPARK-13721:
---

You can follow the suggestion in the deprecation warning and do this:
{noformat}
scala> val df = spark.range(1000).select($"id", array($"id" % 2, $"id" % 3).as("values"))
scala> df.select($"id", explode($"values")).show
+---+---+
| id|col|
+---+---+
|  0|  0|
|  0|  0|
|  1|  1|
|  1|  1|
|  2|  0|
|  2|  2|
|  3|  1|
|  3|  0|
|  4|  0|
|  4|  1|
|  5|  1|
|  5|  2|
|  6|  0|
|  6|  0|
|  7|  1|
|  7|  1|
|  8|  0|
|  8|  2|
|  9|  1|
|  9|  0|
+---+---+
only showing top 20 rows
{noformat}

This is not what the ticket is about, though; the ticket is about adding 
`outer` to the data frame API.




[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2016-08-31 Thread Don Drake (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15452846#comment-15452846
 ] 

Don Drake commented on SPARK-13721:
---

Spark 2.0 has deprecated this function. What workarounds are suggested?




[jira] [Commented] (SPARK-13721) Add support for LATERAL VIEW OUTER explode()

2016-03-07 Thread Xiao Li (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184213#comment-15184213
 ] 

Xiao Li commented on SPARK-13721:
-

That sounds reasonable. Maybe we can wait until the DataFrame and Dataset APIs 
are combined. 
