[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-16 Thread Dulaj Rajitha (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422597#comment-15422597
 ] 

Dulaj Rajitha commented on SPARK-16993:
---

The issue is solved; it was not a bug.
Thank you.
There was an error in my withColumn statement: I had used a column from the 
wrong DataFrame.
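
For readers who hit the same message, here is a hypothetical reconstruction of this kind of mistake. The actual statement was never posted in this thread, so the DataFrame contents, the `labelCopy` column name, and the local SparkSession setup are all assumptions made for illustration. The sketch only shows that referring to `label` by name on a DataFrame whose columns are id, features and prediction fails at analysis time, independently of model.transform.

```scala
import org.apache.spark.sql.{AnalysisException, SparkSession}
import org.apache.spark.sql.functions.col

object WrongColumnSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("wrong-column-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical stand-in for the output of model.transform on unlabeled data.
    // "features" is a plain string here to keep the sketch small.
    val predictions = Seq((1L, "f1", 0.5), (2L, "f2", 1.5))
      .toDF("id", "features", "prediction")

    try {
      // "label" does not exist on this DataFrame, so the failure comes from
      // this statement, not from model.transform.
      predictions.withColumn("labelCopy", col("label")).show()
    } catch {
      case e: AnalysisException =>
        println(s"AnalysisException: ${e.getMessage}")
    }

    // Printing the available columns first makes the mistake easy to spot.
    println(predictions.columns.mkString(", "))

    spark.stop()
  }
}
```

Checking `columns` (or `printSchema()`) on the DataFrame you are about to modify is a cheap way to confirm which DataFrame a column really belongs to.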

> model.transform without label column in random forest regression
> 
>
> Key: SPARK-16993
> URL: https://issues.apache.org/jira/browse/SPARK-16993
> Project: Spark
> Issue Type: Question
> Components: Java API, ML
> Reporter: Dulaj Rajitha
>
> I need to use a separate data set for prediction (not, as shown in the 
> example, a split of the training data).
> But that data does not have the label column (since it is the data whose 
> label needs to be predicted).
> But model.transform reports that the label column is missing.
> org.apache.spark.sql.AnalysisException: cannot resolve 'label' given input 
> columns: [id,features,prediction]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-16 Thread Yanbo Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15422384#comment-15422384
 ] 

Yanbo Liang commented on SPARK-16993:
-

[~dulajrajitha] I cannot reproduce the issue you reported; the following code 
works as expected.
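{code}
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.VectorIndexer
import org.apache.spark.ml.regression.RandomForestRegressor

// Assumes a spark-shell session where `spark` is the active SparkSession.
val data = spark.read.format("libsvm")
  .load("/Users/yliang/data/trunk0/spark/data/mllib/sample_libsvm_data.txt")

// Treat features with at most 4 distinct values as categorical.
val featureIndexer = new VectorIndexer()
  .setInputCol("features")
  .setOutputCol("indexedFeatures")
  .setMaxCategories(4)
  .fit(data)

// Train on labeled data; predict on the same data with the label dropped.
val trainingData = data
val testData = data.drop("label")

val rf = new RandomForestRegressor()
  .setLabelCol("label")
  .setFeaturesCol("indexedFeatures")

val pipeline = new Pipeline()
  .setStages(Array(featureIndexer, rf))

val model = pipeline.fit(trainingData)

// transform succeeds even though testData has no label column.
val predictions = model.transform(testData)

predictions.select("prediction", "features").show(5)
{code}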
Could you tell me whether this code snippet matches your use case? If so, 
I think it's not a bug. Thanks!




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-11 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417892#comment-15417892
 ] 

Sean Owen commented on SPARK-16993:
---

You would need to show some code or more about the error.




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-11 Thread Dulaj Rajitha (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417847#comment-15417847
 ] 

Dulaj Rajitha commented on SPARK-16993:
---

But the thing is, if I add a dummy column as the label column, the process 
works fine.
I could not continue without adding a dummy label column to the data set that 
needs the predictions.




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-11 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417192#comment-15417192
 ] 

Sean Owen commented on SPARK-16993:
---

Yes, that's clear. You haven't said what the error is, and I expect it's coming 
from some other misunderstanding, because the class in question does not use 
the label column in transform()




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-11 Thread Dulaj Rajitha (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15417144#comment-15417144
 ] 

Dulaj Rajitha commented on SPARK-16993:
---

Here is the scenario.
My training data set has features and label columns.
Using that, I train and get a model. (I also do an evaluation using a split 
of the training data.)
Using the above model, I need to predict for a data set that has only id and 
features columns.
But when using the second data frame, I get the error.
So how do we use the same model on a different data frame for prediction after 
evaluation?
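
The scenario above can be sketched as follows. This is a minimal illustration with made-up data, not the reporter's code: the values, the object name and the local SparkSession are assumptions, and it uses the Spark 2.x `org.apache.spark.ml` API. It trains on a DataFrame with features and label columns and then calls transform on a DataFrame that has only id and features.

```scala
import org.apache.spark.ml.linalg.Vectors
import org.apache.spark.ml.regression.RandomForestRegressor
import org.apache.spark.sql.SparkSession

object PredictWithoutLabelSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("predict-without-label")
      .getOrCreate()
    import spark.implicits._

    // Made-up training data with features and label columns.
    val train = Seq(
      (Vectors.dense(0.0, 1.0), 1.0),
      (Vectors.dense(1.0, 0.0), 2.0),
      (Vectors.dense(1.0, 1.0), 3.0),
      (Vectors.dense(0.0, 0.0), 0.0)
    ).toDF("features", "label")

    // Made-up data to score: only id and features, no label column.
    val toScore = Seq(
      (1L, Vectors.dense(0.5, 0.5)),
      (2L, Vectors.dense(1.0, 0.5))
    ).toDF("id", "features")

    val regressor = new RandomForestRegressor()
      .setLabelCol("label")
      .setFeaturesCol("features")

    // The label column is only needed here, during fitting.
    val model = regressor.fit(train)

    // transform reads the features column and appends a prediction column.
    val predictions = model.transform(toScore)
    predictions.select("id", "prediction").show()

    spark.stop()
  }
}
```

Extra columns such as id are carried through to the output, so predictions can be joined back to the original records.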




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-10 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415222#comment-15415222
 ] 

Sean Owen commented on SPARK-16993:
---

You need a label for training and evaluation. You do not need one for 
prediction, of course. But I do not see any use of labelCol in transform 
methods. That's why I'm asking for more detail, like where this exception 
occurs. I'm still not sure you're actually making predictions in your code.
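
To illustrate the distinction, here is a minimal sketch of where a label is actually required. The numbers, the object name and the local SparkSession are assumptions for illustration; `RegressionEvaluator` is the standard Spark ML evaluator. Evaluation compares the prediction column against the label column, so it needs both; dropping the label makes the evaluation step fail, even though producing the predictions did not need it.

```scala
import org.apache.spark.ml.evaluation.RegressionEvaluator
import org.apache.spark.sql.SparkSession

object EvaluateNeedsLabelSketch {
  def main(args: Array[String]): Unit = {
    // Assumption: a local session, used only for this illustration.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("evaluate-needs-label")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical predictions on a held-out set that still has its labels.
    val labeled = Seq((1.0, 1.1), (2.0, 1.8), (3.0, 3.2))
      .toDF("label", "prediction")

    val evaluator = new RegressionEvaluator()
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("rmse")

    println(s"RMSE = ${evaluator.evaluate(labeled)}")

    // Without a label column there is nothing to compare predictions against.
    val unlabeled = labeled.drop("label")
    try {
      evaluator.evaluate(unlabeled)
    } catch {
      case e: Exception =>
        println(s"Evaluation without a label fails: ${e.getClass.getSimpleName}")
    }

    spark.stop()
  }
}
```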




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-10 Thread Dulaj Rajitha (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415102#comment-15415102
 ] 

Dulaj Rajitha commented on SPARK-16993:
---

Is there a way to do prediction for non-evaluation purposes (just 
predictions)?




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-10 Thread Dulaj Rajitha (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415081#comment-15415081
 ] 

Dulaj Rajitha commented on SPARK-16993:
---

I do not want to evaluate. I just need to predict using the model I got from 
the regressor.fit(dataframe) method.




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-10 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415070#comment-15415070
 ] 

Sean Owen commented on SPARK-16993:
---

You certainly need labels in your held out test set for evaluation. But you 
seem to be talking about model.transform which is different. It is not clear 
what you are describing.




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-10 Thread Dulaj Rajitha (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415063#comment-15415063
 ] 

Dulaj Rajitha commented on SPARK-16993:
---

This happens when using RandomForestRegressor.
I trained on a DataFrame with the label column and got a model by:
model = regressor.fit(trainData)

But my test data does not have a label column (since this is the column I need 
to be predicted).
Therefore, when transforming, I got an error:
model.transform(test)




[jira] [Commented] (SPARK-16993) model.transform without label column in random forest regression

2016-08-10 Thread Sean Owen (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15415050#comment-15415050
 ] 

Sean Owen commented on SPARK-16993:
---

Questions should go to user@.
Can you clarify? Where do you get this exception? The transform method does not 
require a label, no.
