Examples of flatMap in dataFrame

2015-06-07 Thread Dimp Bhat
Hi, I'm trying to write a custom transformer in Spark ML, and since that uses DataFrames, I am trying to use the flatMap function of the DataFrame class in Java. Can you share a simple example of how to use flatMap to do a word count on a single column of a DataFrame? Thanks, Dimple
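
A possible shape for such a word count, as a minimal sketch in plain Java. It applies the java.util.stream flatMap to a List<String> that stands in for the values of one string column; it does not use Spark's own API, so it runs without a cluster. The class name WordCountSketch, the helper countWords, and the sample rows are hypothetical. The same split-then-count steps would apply to the rows of a real DataFrame column, but the exact Spark calls depend on the version in use.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

final class WordCountSketch {

    // Hypothetical helper: 'column' stands in for the values of a single
    // string column; this is not a Spark DataFrame.
    static Map<String, Long> countWords(List<String> column) {
        return column.stream()
                // skip null cells, as a nullable column might contain them
                .filter(line -> line != null)
                // flatMap: each line becomes zero or more words
                .flatMap(line -> Arrays.stream(line.trim().split("\\s+")))
                // an empty line splits into one empty string; drop it
                .filter(word -> !word.isEmpty())
                // count occurrences per word; TreeMap keeps keys sorted
                .collect(Collectors.groupingBy(
                        word -> word, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Sample rows, made up for illustration
        List<String> column = Arrays.asList(
                "spark ml pipeline", "spark dataframe", "ml transformer");
        System.out.println(countWords(column));
    }
}
```

The flatMap step is what turns one row into many words; the grouping step afterwards plays the role that a reduce-by-key would play in a distributed word count.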

Re: Embedding your own transformer in Spark.ml Pipeline

2015-06-02 Thread Dimp Bhat
This API was added to 1.3.0 AFAIK.
On 2015-06-02 21:40, Dimp Bhat wrote:
> Thanks Peter. Can you share the Tokenizer.java class for Spark 1.2.1.
> Dimple
> On Tue, Jun 2, 2015 at 10:51 AM, Peter Rudenko wrote:
>> Hi Dimple,
>> t

Re: Embedding your own transformer in Spark.ml Pipeline

2015-06-02 Thread Dimp Bhat
Thanks Peter. Can you share the Tokenizer.java class for Spark 1.2.1?
Dimple
On Tue, Jun 2, 2015 at 10:51 AM, Peter Rudenko wrote:
> Hi Dimple,
> take a look at existing transformers:
> https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/OneHotEncoder

Re: Embedding your own transformer in Spark.ml Pipeline

2015-06-02 Thread Dimp Bhat
Thanks for the quick reply Ram. Will take a look at the Tokenizer code and try it out.
Dimple
On Tue, Jun 2, 2015 at 10:42 AM, Ram Sriharsha wrote:
> Hi
> We are in the process of adding examples for feature transformations (https://issues.apache.org/jira/browse/SPARK-7546) and this shoul