Re: Dataframe, Java: How to convert String to Vector ?

2016-10-02 Thread Yan Facai
Hi, Peter. It's interesting that `DecisionTreeRegressor.transformImpl` also uses a udf to transform the dataframe, instead of using map: https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/regression/DecisionTreeRegressor.scala#L175
On Wed, Sep 21, 2016 at 10:22 PM,

Re: Dataframe, Java: How to convert String to Vector ?

2016-09-21 Thread Peter Figliozzi
I'm sure there's another way to do it; I hope someone can show us. I couldn't figure out how to use `map` either.
On Wed, Sep 21, 2016 at 3:32 AM, 颜发才(Yan Facai) wrote:
> Thanks, Peter.
> It works!
>
> Why is a udf needed?
>
> On Wed, Sep 21, 2016 at 12:00 AM, Peter

Re: Dataframe, Java: How to convert String to Vector ?

2016-09-21 Thread Yan Facai
Thanks, Peter. It works! Why is a udf needed?
On Wed, Sep 21, 2016 at 12:00 AM, Peter Figliozzi wrote:
> Hi Yan, I agree, it IS really confusing. Here is the technique for
> transforming a column. It is very general because you can make "myConvert"
> do whatever

Re: Dataframe, Java: How to convert String to Vector ?

2016-09-20 Thread Peter Figliozzi
Hi Yan, I agree, it IS really confusing. Here is the technique for transforming a column. It is very general because you can make "myConvert" do whatever you want.
    import org.apache.spark.mllib.linalg.Vectors
    val df = Seq((0, "[1,3,5]"), (1, "[2,4,6]")).toDF
    df.show()
    // The columns were named
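The message above is cut off in this archive, so the following is only a minimal sketch of the udf technique it describes, not Peter's actual code. It assumes Spark 2.x with the `spark-mllib` dependency on the classpath; the column names `id`, `raw`, and `features` and the helper `withVectorColumn` are hypothetical, and `myConvert` here simply delegates to `Vectors.parse`.

```scala
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.{col, udf}

object StringToVectorSketch {
  // "myConvert" is the name used in the thread; this body is an assumption.
  // It turns a string such as "[1,3,5]" into an mllib Vector.
  val myConvert = udf((s: String) => Vectors.parse(s))

  // Hypothetical helper: adds a vector column derived from a string column.
  def withVectorColumn(df: DataFrame, inputCol: String, outputCol: String): DataFrame =
    df.withColumn(outputCol, myConvert(col(inputCol)))

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("StringToVectorSketch")
      .getOrCreate()
    import spark.implicits._

    // Same sample data as in the thread, with explicit (assumed) column names.
    val df = Seq((0, "[1,3,5]"), (1, "[2,4,6]")).toDF("id", "raw")

    val converted = withVectorColumn(df, "raw", "features")
    converted.printSchema()
    converted.show()

    spark.stop()
  }
}
```

Because the conversion lives in an ordinary Scala function wrapped by `udf`, the body of `myConvert` can be swapped for any other String-to-Vector logic without touching the `withColumn` call.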

Re: Dataframe, Java: How to convert String to Vector ?

2016-09-19 Thread Yan Facai
Hi, all. I find it really confusing. I can use Vectors.parse to create a DataFrame that contains a Vector column:
    scala> val dataVec = Seq((0, Vectors.parse("[1,3,5]")), (1, Vectors.parse("[2,4,6]"))).toDF
    dataVec: org.apache.spark.sql.DataFrame = [_1: int, _2: vector]
But using map to

Re: Dataframe, Java: How to convert String to Vector ?

2016-09-08 Thread Yan Facai
Many thanks, Peter.
On Wed, Sep 7, 2016 at 10:14 PM, Peter Figliozzi wrote:
> Here's a decent GitHub book: Mastering Apache Spark.
>
> I'm new at Scala too. I found it very helpful to

Re: Dataframe, Java: How to convert String to Vector ?

2016-09-07 Thread Peter Figliozzi
Here's a decent GitHub book: Mastering Apache Spark. I'm new at Scala too. I found it very helpful to study the Scala language without Spark. The documentation found here is

Re: Dataframe, Java: How to convert String to Vector ?

2016-09-07 Thread Yan Facai
Hi Peter, I'm familiar with Pandas / Numpy in Python, while Spark / Scala is totally new to me. Pandas provides detailed documentation, such as how to slice data, parse files, and use the apply and filter functions. Does Spark have more detailed documentation?
On Tue, Sep 6, 2016 at 9:58 PM, Peter Figliozzi

Re: Dataframe, Java: How to convert String to Vector ?

2016-09-06 Thread Peter Figliozzi
Hi Yan, I think you'll have to map the features column to a new numerical features column. Here's one way to do the individual transform:
    scala> val x = "[1, 2, 3, 4, 5]"
    x: String = [1, 2, 3, 4, 5]
    scala> val y:Array[Int] = x slice(1, x.length - 1) replace(",", "") split(" ") map(_.toInt)
    y:
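The one-liner above depends on a space following every comma: for an input like "[1,3,5]" it would strip the commas and parse the single number 135. A slightly more tolerant variant, offered only as a sketch in plain Scala with no Spark involved, splits on the commas and trims each piece; the object and function names are hypothetical.

```scala
object ParseBracketedInts {
  // Hypothetical helper: parses "[1, 2, 3]" or "[1,2,3]" into an Array[Int].
  def parseInts(s: String): Array[Int] =
    s.trim
      .stripPrefix("[")
      .stripSuffix("]")
      .split(",")          // split on commas rather than on spaces
      .map(_.trim)         // tolerate optional whitespace around each number
      .filter(_.nonEmpty)  // an empty "[]" yields an empty array
      .map(_.toInt)

  def main(args: Array[String]): Unit = {
    println(parseInts("[1, 2, 3, 4, 5]").mkString(", ")) // prints "1, 2, 3, 4, 5"
    println(parseInts("[1,3,5]").mkString(", "))         // prints "1, 3, 5"
  }
}
```

A non-numeric entry still throws a `NumberFormatException` from `toInt`, the same as in the original expression.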