[PySpark Pipeline XGBoost] How to use XGBoost in a PySpark Pipeline
Dear all,

I want to update my PySpark code. In PySpark, the base model must be placed in a pipeline; the official pipeline demo uses LogisticRegression as the base model. However, it does not seem possible to use an XGBoost model with the Pipeline API. How can I use PySpark like this:

    from xgboost import XGBClassifier
    ...
    model = XGBClassifier()
    model.fit(X_train, y_train)
    pipeline = Pipeline(stages=[..., model, ...])

The Pipeline API is convenient to use, so can anybody give some advice?

Thank you!
Daniel
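One possible direction, offered as a sketch rather than a confirmed answer from this thread: newer XGBoost releases (1.7 and later, which postdate this message) ship an `xgboost.spark` module whose `SparkXGBClassifier` is a `pyspark.ml` estimator, so it can be used as a Pipeline stage where the plain `XGBClassifier` cannot. The column names `f1`, `f2` and `label`, the helper names `build_pipeline` and `demo`, and the toy data are all hypothetical and only there for illustration.

```python
# Sketch only: assumes pyspark and xgboost >= 1.7 are installed.
# Column names and helper names are hypothetical.

FEATURE_COLS = ["f1", "f2"]  # hypothetical feature columns


def build_pipeline(num_workers=1):
    """Return an unfitted Pipeline: feature assembly followed by XGBoost."""
    # Imported lazily so the module can be loaded without Spark present.
    from pyspark.ml import Pipeline
    from pyspark.ml.feature import VectorAssembler
    from xgboost.spark import SparkXGBClassifier

    # Pack the raw feature columns into the single vector column
    # that Spark ML estimators expect.
    assembler = VectorAssembler(inputCols=FEATURE_COLS, outputCol="features")

    # num_workers sets how many Spark tasks train in parallel.
    xgb = SparkXGBClassifier(
        features_col="features",
        label_col="label",
        num_workers=num_workers,
    )
    return Pipeline(stages=[assembler, xgb])


def demo():
    """Fit the pipeline on a tiny in-memory DataFrame and show predictions."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[1]")
        .appName("xgb-pipeline-sketch")
        .getOrCreate()
    )
    train_df = spark.createDataFrame(
        [(0.0, 1.0, 0), (1.0, 0.0, 1), (0.2, 0.9, 0), (0.8, 0.1, 1)],
        FEATURE_COLS + ["label"],
    )
    model = build_pipeline().fit(train_df)
    model.transform(train_df).select("f1", "f2", "label", "prediction").show()
    spark.stop()
```

Calling `demo()` on a machine with a local Spark installation fits the pipeline and prints the predictions. Unlike the snippet in the question, the estimator is not fitted before it goes into the pipeline; `Pipeline.fit` trains every stage in order.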
Fwd: XGBoost on PySpark
Guys, any insight on the below?

---------- Forwarded message ----------
From: Aakash Basu <aakash.spark@gmail.com>
Date: Sat, May 19, 2018 at 12:21 PM
Subject: XGBoost on PySpark
To: user <user@spark.apache.org>

Hi guys,

I need help implementing XGBoost in PySpark. According to a popular thread about XGBoost, it is available for Scala and Java but not for Python. However, we have to implement a Pythonic distributed solution on Spark, perhaps using DMLC or similar, to go ahead with an XGBoost-based solution.

Has anybody implemented this? If so, please share some insight on how to go about it.

Thanks,
Aakash.