[
https://issues.apache.org/jira/browse/SPARK-16993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15422384#comment-15422384
]
Yanbo Liang commented on SPARK-16993:
-------------------------------------
[~dulajrajitha] I cannot reproduce the issue you reported; the following code
works as expected.
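{code}
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.feature.VectorIndexer
import org.apache.spark.ml.regression.RandomForestRegressor

// `spark` is the SparkSession provided by spark-shell.
val data = spark.read.format("libsvm").load("/Users/yliang/data/trunk0/spark/data/mllib/sample_libsvm_data.txt")
// Treat features with at most 4 distinct values as categorical.
val featureIndexer = new VectorIndexer()
  .setInputCol("features")
  .setOutputCol("indexedFeatures")
  .setMaxCategories(4)
  .fit(data)
val trainingData = data
// The test set deliberately has no "label" column.
val testData = data.drop("label")
val rf = new RandomForestRegressor()
  .setLabelCol("label")
  .setFeaturesCol("indexedFeatures")
val pipeline = new Pipeline()
  .setStages(Array(featureIndexer, rf))
val model = pipeline.fit(trainingData)
// transform succeeds even though testData lacks "label".
val predictions = model.transform(testData)
predictions.select("prediction", "features").show(5)
{code}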
Could you tell me whether this code snippet matches your use case? If yes,
I think it's not a bug. Thanks!
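The AnalysisException quoted in the issue lists `prediction` among the input columns, which suggests (this is an assumption, since the reporter's code is not shown) that the failure happens in a step after model.transform that still refers to `label`, such as a select. A minimal sketch of guarding such a select follows; `selectExisting` and the toy DataFrame are hypothetical illustrations, not part of Spark or of the reporter's code.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SelectExistingColumns {
  // Hypothetical helper (not a Spark API): keep only the requested
  // columns that are actually present in the DataFrame.
  def selectExisting(df: DataFrame, wanted: Seq[String]): DataFrame = {
    val present = wanted.filter(name => df.columns.contains(name))
    df.select(present.map(col): _*)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("select-existing")
      .getOrCreate()
    import spark.implicits._

    // Stand-in for the output of model.transform(testData):
    // it has a "prediction" column but no "label" column.
    val predictions = Seq((1L, 0.5), (2L, 1.5)).toDF("id", "prediction")

    // Selecting "label" directly would raise an AnalysisException here;
    // the helper silently skips the missing column instead.
    selectExisting(predictions, Seq("prediction", "label")).show()

    spark.stop()
  }
}
```

Any other step that reads `label`, for example an evaluator configured with setLabelCol("label"), would need the same care when applied to unlabeled data.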
> model.transform without label column in random forest regression
> ----------------------------------------------------------------
>
> Key: SPARK-16993
> URL: https://issues.apache.org/jira/browse/SPARK-16993
> Project: Spark
> Issue Type: Question
> Components: Java API, ML
> Reporter: Dulaj Rajitha
>
> I need to run predictions on a separate data set (not a split of the
> training data, as shown in the example).
> That data set does not have a label column, since the label is what needs
> to be predicted.
> However, model.transform reports that the label column is missing:
> org.apache.spark.sql.AnalysisException: cannot resolve 'label' given input
> columns: [id,features,prediction]