Re: RandomSplit with Spark-ML and Dataframe

2015-05-19 Thread Olivier Girardot
Thank you !

On Tue, May 19, 2015 at 21:08, Xiangrui Meng men...@gmail.com wrote:

 In 1.4, we added RAND as a DataFrame expression, which can be used for
 random split. Please check the example here:

 https://github.com/apache/spark/blob/master/python/pyspark/ml/tuning.py#L214.
 -Xiangrui

 On Thu, May 7, 2015 at 8:39 AM, Olivier Girardot
 o.girar...@lateral-thoughts.com wrote:
  Hi,
  is there a best practice for doing a randomSplit into training and
  cross-validation sets, as in MLlib, with DataFrames and the Pipeline API?
 
  Regards
 
  Olivier.



Re: RandomSplit with Spark-ML and Dataframe

2015-05-19 Thread Xiangrui Meng
In 1.4, we added RAND as a DataFrame expression, which can be used for
random split. Please check the example here:
https://github.com/apache/spark/blob/master/python/pyspark/ml/tuning.py#L214.
-Xiangrui
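
The approach described above can be sketched roughly as follows. This is an
illustrative sketch, not the code at the linked line: it assumes PySpark 1.4
or later, and the names fold_bounds, k_fold_split and the "_rand" column are
hypothetical. Each row gets a uniform random number from rand(), and each
fold's validation set is the slice of rows whose number falls in that fold's
interval.

```python
def fold_bounds(n_folds):
    """Return (lower, upper) pairs that split [0, 1) into n_folds equal-width slices."""
    h = 1.0 / n_folds
    return [(i * h, (i + 1) * h) for i in range(n_folds)]


def k_fold_split(df, n_folds=3, seed=42, rand_col="_rand"):
    """Yield one (train, validation) pair of DataFrames per fold."""
    # Imported here so fold_bounds stays usable without a Spark installation.
    from pyspark.sql.functions import rand

    # Attach one uniform random number in [0, 1) to every row.
    with_rand = df.select("*", rand(seed).alias(rand_col))
    for lower, upper in fold_bounds(n_folds):
        # Rows whose random number lands in [lower, upper) form the validation set.
        in_fold = (with_rand[rand_col] >= lower) & (with_rand[rand_col] < upper)
        validation = with_rand.filter(in_fold).drop(rand_col)
        train = with_rand.filter(~in_fold).drop(rand_col)
        yield train, validation
```

For a single train/validation split, the same idea applies with one threshold
on the random column (for example, rows below 0.8 for training and the rest
for validation).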

On Thu, May 7, 2015 at 8:39 AM, Olivier Girardot
o.girar...@lateral-thoughts.com wrote:
 Hi,
 is there a best practice for doing a randomSplit into training and
 cross-validation sets, as in MLlib, with DataFrames and the Pipeline API?

 Regards

 Olivier.




RandomSplit with Spark-ML and Dataframe

2015-05-07 Thread Olivier Girardot
Hi,
is there a best practice for doing a randomSplit into training and
cross-validation sets, as in MLlib, with DataFrames and the Pipeline API?

Regards

Olivier.