Re: flatMap for dataframe

2022-02-09 Thread Khalid Mammadov
One way is to chain split -> explode -> pivot.
These are column functions and DataFrame methods.
Here are some quick examples from the web:
https://www.google.com/amp/s/sparkbyexamples.com/spark/spark-split-dataframe-column-into-multiple-columns/amp/


https://www.google.com/amp/s/sparkbyexamples.com/spark/explode-spark-array-and-map-dataframe-column/amp/
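
For the PySpark case asked about in this thread, the split and explode steps can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the helper name flat_map_words and the output column name "word" are made up for illustration, and the input is assumed to have a single string column called "value". The pivot step is left out, since it only matters if the words should be turned back into columns.

```python
def flat_map_words(df, column="value"):
    """Return a DataFrame with one row per space-separated word in `column`.

    `flat_map_words` and the output column name "word" are hypothetical
    names chosen for this sketch.
    """
    # Imported inside the function so that defining the helper does not
    # itself require a Spark installation.
    from pyspark.sql import functions as F

    # split() turns each string into an array of words;
    # explode() then emits one output row per array element.
    return df.select(F.explode(F.split(F.col(column), " ")).alias("word"))
```

With an active SparkSession, flat_map_words(df).show() should list each word of the "value" column on its own row, which is the DataFrame counterpart of the RDD flatMap call in the question.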

On Wed, 9 Feb 2022, 01:55 frakass wrote:

> Hello
>
> for the RDD I can apply flatMap method:
>
> >>> sc.parallelize(["a few words", "ba na ba na"]).flatMap(lambda x: x.split(" ")).collect()
> ['a', 'few', 'words', 'ba', 'na', 'ba', 'na']
>
>
> But for a dataframe table how can I flatMap that as above?
>
> >>> df.show()
> +----------------+
> |           value|
> +----------------+
> |     a few lines|
> |hello world here|
> |     ba na ba na|
> +----------------+
>
>
> Thanks
>
> -
> To unsubscribe e-mail: user-unsubscr...@spark.apache.org
>
>


Re: flatMap for dataframe

2022-02-08 Thread frakass

Is this Scala syntax?
Yes, in Scala I know how to do it by converting the DataFrame to a Dataset.
How do I do it in PySpark?
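
For PySpark, one possible approach is to drop down to the DataFrame's underlying RDD, which does have flatMap, and then build a DataFrame again from the result. The sketch below assumes a SparkSession is passed in as spark and that the column is named "value"; the helper names split_row and flat_map_via_rdd and the output column name "word" are hypothetical. Schema inference in createDataFrame needs at least one row, so an empty input would need an explicit schema.

```python
def split_row(row, column="value"):
    """Split the string stored in `column` of a Row-like object into words."""
    return getattr(row, column).split(" ")


def flat_map_via_rdd(spark, df, column="value"):
    """Apply RDD.flatMap to a DataFrame and return a one-column DataFrame.

    `spark` is assumed to be an active SparkSession.
    """
    # df.rdd is an RDD of Row objects, so the familiar RDD flatMap applies.
    words = df.rdd.flatMap(lambda row: split_row(row, column))
    # Wrap each word in a 1-tuple so that createDataFrame sees one column.
    return spark.createDataFrame(words.map(lambda w: (w,)), ["word"])
```

Going through the RDD runs the lambda in Python worker processes, so for large data the built-in column functions are usually the cheaper route.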

Thanks

On 2022/2/9 10:24, oliver dd wrote:

> df.flatMap(row => row.getAs[String]("value").split(" "))





Re: flatMap for dataframe

2022-02-08 Thread oliver dd
Hi,

You can achieve your goal by:

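import spark.implicits._  // assumes a SparkSession named `spark`; supplies the Encoder flatMap needs
df.flatMap(row => row.getAs[String]("value").split(" "))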

--
Best Regards,
oliverdding

flatMap for dataframe

2022-02-08 Thread frakass

Hello

for the RDD I can apply flatMap method:

>>> sc.parallelize(["a few words", "ba na ba na"]).flatMap(lambda x: x.split(" ")).collect()

['a', 'few', 'words', 'ba', 'na', 'ba', 'na']


But for a dataframe table how can I flatMap that as above?

>>> df.show()
+----------------+
|           value|
+----------------+
|     a few lines|
|hello world here|
|     ba na ba na|
+----------------+


Thanks




Re: Examples of flatMap in dataFrame

2015-06-08 Thread Ram Sriharsha
Hi

You are looking for the explode method (in the DataFrame API since 1.3, I
believe):
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala#L1002
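
For the word count asked about in the question quoted below, explode can be combined with groupBy and count. The following is only a sketch under a few assumptions: it is written in PySpark rather than Java, it uses the explode column function from pyspark.sql.functions (available in later Spark releases) instead of the DataFrame.explode method linked above, and the helper name word_count, the default column name "value" and the output column name "word" are hypothetical.

```python
def word_count(df, column="value"):
    """Count how often each space-separated word occurs in `column`."""
    # Imported inside the function so that defining the helper does not
    # itself require a Spark installation.
    from pyspark.sql import functions as F

    # One row per word: split each string into an array, then explode it.
    words = df.select(F.explode(F.split(F.col(column), " ")).alias("word"))
    # groupBy(...).count() adds a "count" column holding the occurrences.
    return words.groupBy("word").count()
```

The result has one row per distinct word, with the columns "word" and "count"; the row order is not guaranteed unless an orderBy is added.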

Ram

On Sun, Jun 7, 2015 at 9:22 PM, Dimp Bhat <dimp201...@gmail.com> wrote:

> Hi,
> I'm trying to write a custom transformer in Spark ML, and since that uses
> DataFrames, I am trying to use the flatMap function of the DataFrame class
> in Java. Can you share a simple example of how to use flatMap to do a word
> count on a single column of the DataFrame? Thanks.
>
> Dimple



FlatMap in DataFrame

2015-06-07 Thread dimple
Hi,
I'm trying to write a custom transformer in Spark ML, and since that uses
DataFrames, I am trying to use the flatMap function of the DataFrame class in
Java. Can you share a simple example of how to use flatMap to do a word count
on a single column of the DataFrame? Thanks.

Dimple






Examples of flatMap in dataFrame

2015-06-07 Thread Dimp Bhat
Hi,
I'm trying to write a custom transformer in Spark ML, and since that uses
DataFrames, I am trying to use the flatMap function of the DataFrame class in
Java. Can you share a simple example of how to use flatMap to do a word count
on a single column of the DataFrame? Thanks.


Dimple