Re: LabeledPoint creation

2016-09-08 Thread 市场部
{ val featureVector = Vectors.dense(x.getAs[org.apache.spark.mllib.linalg.SparseVector]("categoryVec").toArray) val label = x.getAs[java.lang.Integer]("id").toDouble LabeledPoint(label, featureVector) } } var result = sqlContext.createDataFrame(data)

Re: LabeledPoint creation

2016-09-07 Thread Madabhattula Rajesh Kumar
"category").toString() val id = line.getAs[java.lang.Integer]("id").toDouble var i = -1 categories.foreach { x => i += 1; categoriesList(i) = if (x == values) 1.0 else 0.0 } val denseVector = Vectors.dense(categoriesList) LabeledPoint(id, d

Re: LabeledPoint creation

2016-09-07 Thread aka.fe2s
t 5:40 PM, Madabhattula Rajesh Kumar < > mrajaf...@gmail.com> wrote: > >> Hi, >> >> I am new to Spark ML, trying to create a LabeledPoint from categorical >> dataset(example code from spark). For this, I am using One-hot encoding >> <h

Re: LabeledPoint creation

2016-09-07 Thread Madabhattula Rajesh Kumar
Hi, Any help on above mail use case ? Regards, Rajesh On Tue, Sep 6, 2016 at 5:40 PM, Madabhattula Rajesh Kumar < mrajaf...@gmail.com> wrote: > Hi, > > I am new to Spark ML, trying to create a LabeledPoint from categorical > dataset(example code from spark). For this,

LabeledPoint creation

2016-09-06 Thread Madabhattula Rajesh Kumar
Hi, I am new to Spark ML, trying to create a LabeledPoint from categorical dataset(example code from spark). For this, I am using One-hot encoding <http://en.wikipedia.org/wiki/One-hot> feature. Below is my code val df = sparkSession.createDataFrame(Seq( (0, "a"), (1
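The one-hot step described in this thread can be sketched without Spark. This is only an illustration of the encoding logic, not the poster's code: the category list, the input row, and the `one_hot` helper are hypothetical, and a plain tuple stands in for a `LabeledPoint`.

```python
def one_hot(value, categories):
    """Return a list with 1.0 at the position of `value` and 0.0 elsewhere."""
    return [1.0 if c == value else 0.0 for c in categories]

categories = ["a", "b", "c"]        # hypothetical list of distinct categories
row = {"id": 1, "category": "b"}    # hypothetical input row

label = float(row["id"])            # the thread uses the id column as the label
features = one_hot(row["category"], categories)
labeled_point = (label, features)   # stand-in for LabeledPoint(label, features)
```

A value that is not in `categories` yields an all-zero vector here; whether that is acceptable depends on the model being trained.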

Pls assist: need to create an udf that returns a LabeledPoint in pyspark

2016-07-28 Thread Marco Mistroni
Hi all, could anyone assist? I need to create a UDF that returns a LabeledPoint. I read that in PySpark (1.6) LabeledPoint is not supported and that I have to create a StructType. Can anyone point me in the right direction? kr marco

Change spark dataframe to LabeledPoint in Java

2016-06-30 Thread Abhishek Anand
Hi, I have a DataFrame which I want to convert to labeled points. DataFrame labeleddf = model.transform(newdf).select("label","features"); How can I convert this to a LabeledPoint to use in my Logistic Regression model? I could do this in Scala using val trainData

Re: Labeledpoint

2016-06-21 Thread Ndjido Ardo BAR
To answer your question more accurately, the model.fit(df) method takes in a DataFrame of Row(label=double, feature=Vectors.dense([...])). Cheers, Ardo. On Tue, Jun 21, 2016 at 6:44 PM, Ndjido Ardo BAR wrote: > Hi, > > You can use an RDD of LabeledPoints to fit your model. Check the doc for m

Re: Labeledpoint

2016-06-21 Thread Ndjido Ardo BAR
Hi, You can use an RDD of LabeledPoints to fit your model. Check the doc for more examples: http://spark.apache.org/docs/latest/api/python/pyspark.ml.html?highlight=transform#pyspark.ml.classification.RandomForestClassificationModel.transform Cheers, Ardo. On Tue, Jun 21, 2016 at 6:12 PM, pseudo od

Labeledpoint

2016-06-21 Thread pseudo oduesp
Hi, I am a PySpark user and I want to test RandomForest. I have a DataFrame with 100 columns. I have to give an RDD or a DataFrame to the algorithm. I transformed my DataFrame to only two columns, label and features: df.label df.features 0(517,(0,1,2,333,56 ... 1 (517,(0,11,0,3

Re: LabeledPoint with features in matrix form (word2vec matrix)

2016-04-07 Thread jamborta
ted matrix like indexedRowMatrix (http://spark.apache.org/docs/latest/mllib-data-types.html#indexedrowmatrix). -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/LabeledPoint-with-features-in-matrix-form-word2vec-matrix-tp26629p26696.html Sent from the Apache Spark

Re: LabeledPoint with features in matrix form (word2vec matrix)

2016-04-06 Thread jamborta
this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/LabeledPoint-with-features-in-matrix-form-word2vec-matrix-tp26629p26690.html Sent from the Apache Spark User List mailing list archive at Nabble.com. ---

Spark MLLlib Ideal way to convert categorical features into LabeledPoint RDD?

2016-02-01 Thread unk1102
in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-MLLlib-Ideal-way-to-convert-categorical-features-into-LabeledPoint-RDD-tp26125.html Sent from the Apache Spark User List mailing list archive at Nabble.com

issue creating pyspark Transformer UDF that creates a LabeledPoint: AttributeError: 'DataFrame' object has no attribute '_get_object_id'

2015-12-07 Thread Andy Davidson
Hi, I am running into a strange error. I am trying to write a transformer that takes in two columns and creates a LabeledPoint. I cannot figure out why I am getting AttributeError: 'DataFrame' object has no attribute '_get_object_id' I am using spark-1.5.1-bin-hadoop2.6 Any idea

How to deal with null values on LabeledPoint

2015-07-07 Thread Saif.A.Ellafi
Hello, reading from spark-csv, got some lines with missing data (not invalid). applying map() to create a LabeledPoint with denseVector. Using map( Row => Row.getDouble(col_index) ) To this point: res173: org.apache.spark.mllib.regression.LabeledPoint = (-1.53013269

RE: How to create a LabeledPoint RDD from a Data Frame

2015-07-06 Thread Mohammed Guller
LabeledPoint RDD from a Data Frame Hi, I have a DataFrame which I want to use for creating a RandomForest model using MLlib. The RandomForest model needs an RDD of LabeledPoints. How do I convert the DataFrame to an RDD of LabeledPoints? Regards, Sourav

How to create a LabeledPoint RDD from a Data Frame

2015-07-06 Thread Sourav Mazumder
Hi, I have a DataFrame which I want to use for creating a RandomForest model using MLlib. The RandomForest model needs an RDD of LabeledPoints. How do I convert the DataFrame to an RDD of LabeledPoints? Regards, Sourav

Re: From DataFrame to LabeledPoint

2015-04-07 Thread Sergio Jiménez Barrio
some columns with null >> values. >> >> This is the first row of Dataframe: >> scala> dataDF.take(1) >> res11: Array[org.apache.spark.sql.Row] = >> Array([null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null,null

Re: From DataFrame to LabeledPoint

2015-04-06 Thread Joseph Bradley
ull,null,null,null,null,null,null,null,null,null,null]) > > > > This is the RDD[LabeledPoint] created: > scala> data.take(1) > 15/04/06 15:46:31 ERROR TaskSetManager: Task 0 in stage 6.0 failed 4 > times; aborting job > org.apache.spark.SparkException: Job aborted due to st

Re: From DataFrame to LabeledPoint

2015-04-02 Thread Joseph Bradley
Peter's suggestion sounds good, but watch out for the match case since I believe you'll have to match on: case (Row(feature1, feature2, ...), Row(label)) => On Thu, Apr 2, 2015 at 7:57 AM, Peter Rudenko wrote: > Hi try next code: > > val labeledPoints: RDD[LabeledPoint]
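The point about matching on a pair can be illustrated without Spark: zipping two row collections yields (feature row, label row) tuples, so each element has to be unpacked as a pair rather than as one flat row. A minimal sketch, assuming hypothetical two-feature rows and one-column label rows, with tuples standing in for `Row` and `LabeledPoint`:

```python
feature_rows = [(0.5, 1.5), (2.0, 3.0)]   # hypothetical rows of feature columns
label_rows = [(1.0,), (0.0,)]             # hypothetical one-column label rows

# zip yields ((f1, f2), (label,)) pairs, so the pattern unpacks both sides.
labeled_points = [
    (label, [f1, f2])                     # stand-in for LabeledPoint(label, Vectors.dense(f1, f2))
    for (f1, f2), (label,) in zip(feature_rows, label_rows)
]
```

In Spark itself, `RDD.zip` additionally requires both RDDs to have the same number of partitions and the same number of elements per partition, which holds when both are selected from the same DataFrame without reshuffling.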

Re: From DataFrame to LabeledPoint

2015-04-02 Thread Peter Rudenko
Hi, try this code: val labeledPoints: RDD[LabeledPoint] = features.zip(labels).map { case Row(feature1, feature2, ..., label) => LabeledPoint(label, Vectors.dense(feature1, feature2, ...)) } Thanks, Peter Rudenko On 2015-04-02 17:17, drarse wrote: Hello!, I have a questions since days

From DataFrame to LabeledPoint

2015-04-02 Thread drarse
e2","feature3",...); val labels = df.select("cassification") But now I don't know how to create a LabeledPoint for RandomForest. I tried some solutions without success. Can you help me? Thanks for all! -- View this message in context: http://apache-spark-user-list.10015

Re: Extracting an element from the feature vector in LabeledPoint

2014-08-22 Thread LPG
-list.1001560.n3.nabble.com/Extracting-an-element-from-the-feature-vector-in-LabeledPoint-tp0p12644.html Sent from the Apache Spark User List mailing list archive at Nabble.com. - To unsubscribe, e-mail: user-unsubscr

Re: Extracting an element from the feature vector in LabeledPoint

2014-08-01 Thread Sean Owen
Oh I'm sorry, I somehow misread your email as looking for the label. I read too fast. That was pretty silly. This works for me though: scala> val point = LabeledPoint(1,Vectors.dense(2,3,4)) point: org.apache.spark.mllib.regression.LabeledPoint = (1.0,[2.0,3.0,4.0]) scala> point

Re: Extracting an element from the feature vector in LabeledPoint

2014-08-01 Thread SK
org.apache.spark.mllib.linalg.Vector -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Extracting-an-element-from-the-feature-vector-in-LabeledPoint-tp0p11181.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

Re: Extracting an element from the feature vector in LabeledPoint

2014-08-01 Thread Sean Owen
If you look at the class LabeledPoint, you'll see it has a field called "label": data.label data.features(1) would access the second element of features, which is not the same thing. On Fri, Aug 1, 2014 at 3:01 AM, SK wrote: > > Hi, > > I want to extract the indiv
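The difference between the label and an indexed feature can be sketched as follows. This is a Spark-free illustration: the `LabeledPoint` namedtuple below is a hypothetical stand-in, not the MLlib class, and the values mirror the `LabeledPoint(1, Vectors.dense(2, 3, 4))` example used elsewhere in this thread.

```python
from collections import namedtuple

# Hypothetical stand-in for MLlib's LabeledPoint: a label plus a feature vector.
LabeledPoint = namedtuple("LabeledPoint", ["label", "features"])

point = LabeledPoint(label=1.0, features=[2.0, 3.0, 4.0])

label = point.label                  # the label field
second_feature = point.features[1]   # index 1 is the second feature, not the label
```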

Re: Extracting an element from the feature vector in LabeledPoint

2014-07-31 Thread Yanbo Liang
Which version are you using? data.features(1) is OK for Spark 1.0. 2014-08-01 10:01 GMT+08:00 SK : > > Hi, > > I want to extract the individual elements of a feature vector that is part > of a LabeledPoint. I tried the following: > > data.features._1 > data.features(1)

Extracting an element from the feature vector in LabeledPoint

2014-07-31 Thread SK
Hi, I want to extract the individual elements of a feature vector that is part of a LabeledPoint. I tried the following: data.features._1 data.features(1) data.features.map(_.1) data is a LabeledPoint with a feature vector containing 3 features. All of the above resulted in compilation

Re: Decision Tree requires regression LabeledPoint

2014-07-30 Thread SK
I have also used LabeledPoint or libSVM format (for sparse data) for DecisionTree. When I had categorical labels (not features), I mapped the categories to numerical data as part of the data transformation step (i.e. before creating the LabeledPoint). -- View this message in context: http
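Mapping categorical labels to numbers before building the labeled points might look like the sketch below. It is Spark-free and all names are hypothetical: `index_labels`, the sample records, and the tuple used in place of a `LabeledPoint` are assumptions for illustration only.

```python
def index_labels(labels):
    """Map each distinct label to a float index, in order of first appearance."""
    mapping = {}
    for lab in labels:
        if lab not in mapping:
            mapping[lab] = float(len(mapping))
    return mapping

# Hypothetical records: (categorical label, feature vector).
raw = [("spam", [1.0, 0.0]), ("ham", [0.0, 1.0]), ("spam", [1.0, 1.0])]

mapping = index_labels(lab for lab, _ in raw)
points = [(mapping[lab], feats) for lab, feats in raw]  # stand-ins for LabeledPoint
```

Keeping `mapping` around makes it possible to translate numeric predictions back to the original category names afterwards.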

Re: LabeledPoint with weight

2014-07-21 Thread Xiangrui Meng
xt: > http://apache-spark-user-list.1001560.n3.nabble.com/LabeledPoint-with-weight-tp10291.html > Sent from the Apache Spark User List mailing list archive at Nabble.com.

LabeledPoint with weight

2014-07-21 Thread Jiusheng Chen
alue1 index2:value2 ... -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/LabeledPoint-with-weight-tp10291.html Sent from the Apache Spark User List mailing list archive at Nabble.com.