Re: Random forest binary classification H2O difference Spark

2016-08-11 Thread Bedrytski Aliaksandr
Hi Samir,

either use the *dataframe.na.fill()* method or the *nvl()* UDF when
selecting features:

val train = sqlContext.sql("SELECT ... nvl(Field, 1.0) AS Field ... FROM test")
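
A minimal Scala sketch of the na.fill() route, assuming Spark 2.x and a local SparkSession. The column names f1, f2, label and the toy rows are hypothetical, and 1.0 is only the placeholder fill value from the query above, not a recommended default:

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

// Assumption: local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("fill-nulls-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical toy data: None becomes NULL in the resulting DataFrame.
val rows: Seq[(Option[Double], Option[Double], Double)] = Seq(
  (Some(1.0), Some(2.0), 0.0),
  (None,      Some(3.0), 1.0),
  (Some(4.0), None,      0.0)
)
val raw = rows.toDF("f1", "f2", "label")

// Replace NULLs in the feature columns with 1.0 before assembling.
val filled = raw.na.fill(1.0, Seq("f1", "f2"))

// VectorAssembler no longer sees NULL inputs, so transform() succeeds.
val assembler = new VectorAssembler()
  .setInputCols(Array("f1", "f2"))
  .setOutputCol("features")
val assembled = assembler.transform(filled)
assembled.show()
```

Whether a constant fill is appropriate depends on the data; a per-column mean or median may suit a random forest better than a fixed 1.0.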

--
  Bedrytski Aliaksandr
  sp...@bedryt.ski



On Wed, Aug 10, 2016, at 11:19, Yanbo Liang wrote:
> Hi Samir,
>
> Did you use VectorAssembler to assemble some columns into the feature
> column? If there are NULLs in your dataset, VectorAssembler will throw
> this exception. You can use DataFrame.na.drop() or DataFrame.na.fill()
> to drop or substitute NULL values.
>
> Thanks
> Yanbo
>
> 2016-08-07 19:51 GMT-07:00 Javier Rey :
>> Hi everybody.
>> I ran Random Forest on H2O and had no trouble with null values. By
>> contrast, in Spark, using DataFrames and the ML library, I get the
>> error below. I know my DataFrame contains nulls, but I understood
>> that Random Forest supports null values:
>>
>> "Values to assemble cannot be null"
>>
>> Any advice on how that framework can handle this issue?
>>
>> Regards,
>> Samir


Re: Random forest binary classification H2O difference Spark

2016-08-10 Thread Yanbo Liang
Hi Samir,

Did you use VectorAssembler to assemble some columns into the feature
column? If there are NULLs in your dataset, VectorAssembler will throw this
exception. You can use DataFrame.na.drop() or DataFrame.na.fill() to
drop or substitute NULL values.
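
A minimal Scala sketch of dropping the rows that contain NULLs, assuming Spark 2.x and a local SparkSession. In the Scala API the null-handling methods live under DataFrame.na; the column names f1, f2, label and the toy rows are hypothetical:

```scala
import org.apache.spark.ml.feature.VectorAssembler
import org.apache.spark.sql.SparkSession

// Assumption: local session for illustration; in spark-shell `spark` already exists.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("drop-nulls-sketch")
  .getOrCreate()
import spark.implicits._

// Hypothetical toy data: None becomes NULL in the resulting DataFrame.
val rows: Seq[(Option[Double], Option[Double], Double)] = Seq(
  (Some(1.0), Some(2.0), 0.0),
  (None,      Some(3.0), 1.0),
  (Some(4.0), None,      0.0)
)
val raw = rows.toDF("f1", "f2", "label")

// Keep only the rows where none of the listed feature columns is NULL.
val cleaned = raw.na.drop(Seq("f1", "f2"))

// With the NULL rows gone, VectorAssembler can build the feature vector.
val assembler = new VectorAssembler()
  .setInputCols(Array("f1", "f2"))
  .setOutputCol("features")
val assembled = assembler.transform(cleaned)
assembled.show()
```

Dropping rows loses training data, so this fits best when only a small share of rows has missing values.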

Thanks
Yanbo

2016-08-07 19:51 GMT-07:00 Javier Rey :

> Hi everybody.
>
> I ran Random Forest on H2O and had no trouble with null values. By
> contrast, in Spark, using DataFrames and the ML library, I get the error
> below. I know my DataFrame contains nulls, but I understood that Random
> Forest supports null values:
>
> "Values to assemble cannot be null"
>
> Any advice on how that framework can handle this issue?
>
> Regards,
>
> Samir
>


Random forest binary classification H2O difference Spark

2016-08-07 Thread Javier Rey
Hi everybody.

I ran Random Forest on H2O and had no trouble with null values. By
contrast, in Spark, using DataFrames and the ML library, I get the error
below. I know my DataFrame contains nulls, but I understood that Random
Forest supports null values:

"Values to assemble cannot be null"

Any advice on how that framework can handle this issue?

Regards,

Samir