[PySpark Pipeline XGBoost] How to use XGBoost in a PySpark Pipeline

2018-05-31 Thread Daniel Du
Dear all, 

I want to update my PySpark code. In PySpark, the base model has to be
placed in a pipeline, and the official pipeline demo uses LogisticRegression
as the base model. However, it does not seem possible to use an XGBoost
model in the Pipeline API. How can I use PySpark like this:

from pyspark.ml import Pipeline
from xgboost import XGBClassifier
...
model = XGBClassifier()
model.fit(X_train, y_train)
pipeline = Pipeline(stages=[..., model, ...])

The Pipeline API is convenient to use, so can anybody give some advice?
Thank you!

Daniel
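
A note on why the snippet above is rejected: a Spark ML Pipeline only accepts
stages that are Estimators or Transformers, and the scikit-learn style
XGBClassifier is neither. The sketch below is a toy model of that contract in
plain Python, not PySpark code. ToyTransformer, ToyEstimator, ToyPipeline,
MajorityClassModel and SklearnStyleAdapter are hypothetical names made up for
illustration, and the stage handling is simplified compared with the real
Pipeline.

```python
from collections import Counter

# Toy stand-ins for the Spark ML stage types (hypothetical, not PySpark classes).
class ToyTransformer:
    """A stage that maps a dataset to a new dataset."""
    def transform(self, rows):
        raise NotImplementedError


class ToyEstimator:
    """A stage that learns from a dataset and returns a ToyTransformer."""
    def fit(self, rows):
        raise NotImplementedError


class ToyPipeline:
    """Simplified pipeline: fits each stage in order, rejecting unknown types."""
    def __init__(self, stages):
        self.stages = stages

    def fit(self, rows):
        fitted = []
        for stage in self.stages:
            if isinstance(stage, ToyEstimator):
                model = stage.fit(rows)
                fitted.append(model)
                rows = model.transform(rows)
            elif isinstance(stage, ToyTransformer):
                fitted.append(stage)
                rows = stage.transform(rows)
            else:
                raise TypeError(
                    "Cannot recognize a pipeline stage of type "
                    + type(stage).__name__
                )
        return fitted


class MajorityClassModel:
    """Stand-in for a scikit-learn style model: fit(X, y) and predict(X)."""
    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


class _AdaptedModel(ToyTransformer):
    """Fitted stage: appends a prediction to every (features, label) row."""
    def __init__(self, model):
        self.model = model

    def transform(self, rows):
        preds = self.model.predict([features for features, _ in rows])
        return [(f, y, p) for (f, y), p in zip(rows, preds)]


class SklearnStyleAdapter(ToyEstimator):
    """Wraps a fit(X, y)/predict(X) model so that it satisfies the stage contract."""
    def __init__(self, model):
        self.model = model

    def fit(self, rows):
        X = [features for features, _ in rows]
        y = [label for _, label in rows]
        return _AdaptedModel(self.model.fit(X, y))


rows = [([0.0], 1), ([1.0], 1), ([2.0], 0)]

try:
    # The bare model is neither an estimator nor a transformer.
    ToyPipeline([MajorityClassModel()]).fit(rows)
except TypeError as err:
    print("rejected:", err)

fitted = ToyPipeline([SklearnStyleAdapter(MajorityClassModel())]).fit(rows)
print(fitted[0].transform(rows))
```

This only illustrates the interface. An adapter of this kind would still train
on a single machine, so it does not make the model distributed. Newer XGBoost
releases (1.7 and later) ship an xgboost.spark module whose SparkXGBClassifier
is an estimator intended for use as a Pipeline stage, which may be worth
checking against your installed version.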



--
Sent from: http://apache-spark-user-list.1001560.n3.nabble.com/




Fwd: XGBoost on PySpark

2018-05-23 Thread Aakash Basu
Guys, any insight on the below?

-- Forwarded message --
From: Aakash Basu <aakash.spark@gmail.com>
Date: Sat, May 19, 2018 at 12:21 PM
Subject: XGBoost on PySpark
To: user <user@spark.apache.org>


Hi guys,

I need help implementing XGBoost in PySpark.

According to a popular thread on XGBoost, it is available for Scala and
Java but not for Python. However, we have to implement a Pythonic
distributed solution (on Spark), maybe using DMLC or similar, to go ahead
with an XGBoost-based solution.

Has anybody implemented this? If so, please share some insight on how to
go about it.

Thanks,
Aakash.


XGBoost on PySpark

2018-05-19 Thread Aakash Basu
Hi guys,

I need help implementing XGBoost in PySpark.

According to a popular thread on XGBoost, it is available for Scala and
Java but not for Python. However, we have to implement a Pythonic
distributed solution (on Spark), maybe using DMLC or similar, to go ahead
with an XGBoost-based solution.

Has anybody implemented this? If so, please share some insight on how to
go about it.

Thanks,
Aakash.