@CD, is this something we could fix? Can we list the features in the order
of their indices?
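
A rough sketch of what "in the order of the indices" could mean, assuming
the feature names follow the V<index> pattern mentioned below (V1, V10, ...).
A plain string sort puts V10 right after V1; comparing the numeric suffix
avoids that. FeatureOrder and its methods are hypothetical names for
illustration, not existing WSO2 ML code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper for illustration; not part of WSO2 ML or H2O.
class FeatureOrder {

    // Returns the number formed by the trailing digits of a name such as "V10".
    // Names without trailing digits get Integer.MAX_VALUE so they sort last.
    // Assumes the suffix fits in an int.
    static int trailingIndex(String name) {
        int end = name.length();
        int start = end;
        while (start > 0 && Character.isDigit(name.charAt(start - 1))) {
            start--;
        }
        if (start == end) {
            return Integer.MAX_VALUE;
        }
        return Integer.parseInt(name.substring(start, end));
    }

    // Returns a copy sorted by numeric suffix; ties are broken alphabetically.
    static List<String> sortByIndex(List<String> names) {
        List<String> sorted = new ArrayList<>(names);
        sorted.sort(Comparator.<String>comparingInt(FeatureOrder::trailingIndex)
                .thenComparing(Comparator.<String>naturalOrder()));
        return sorted;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("V1", "V10", "V2", "V11", "V3");
        // prints [V1, V2, V3, V10, V11]
        System.out.println(sortByIndex(names));
    }
}
```

If the features already carry an explicit integer index, sorting on that
index directly would be simpler than parsing it back out of the name.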

On Tue, Aug 11, 2015 at 12:25 PM, Thushan Ganegedara <thu...@gmail.com>
wrote:

> Hi,
>
> I noticed that, in certain cases, the features don't follow the correct
> ordering. Any idea why this is happening?
>
> For example, in this image, V10 appears after V1.
>
> On Tue, Aug 11, 2015 at 12:10 PM, Thushan Ganegedara <thu...@gmail.com>
> wrote:
>
>> Hi all,
>>
>> After a long struggle, I was able to pin down the cause of the poor
>> accuracy on the leaf dataset. The dataset has classes numbered 1 to 36.
>> However, classes 16 to 22 are missing, i.e. the classes go as
>> 1,2,...,14,15,23,24,...,35,36
>>
>> Converting these class labels to enums in H2O (combined with the fact
>> that there's very little data for each class) confuses H2O and causes it
>> to *assign different enum values to the same classes in different
>> datasets*, which manifests itself as poor accuracy.
>>
>> I also suspect that there's a mismatch between the labels provided by the
>> JavaRDD and the enums produced by H2O. I'm looking into this issue right now.
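
One way to rule out this kind of mismatch, sketched under assumptions: build
a single label-to-index mapping from the union of labels in all datasets and
apply that same mapping everywhere, instead of letting each dataset derive
its own enum values. LabelIndexer is a hypothetical helper and does not use
any H2O or Spark API; it only illustrates the idea on plain integer labels.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for illustration; not an H2O or WSO2 ML class.
class LabelIndexer {

    // Builds ONE label -> index map from the union of labels in all datasets.
    // Labels are sorted first, so gaps (e.g. no classes 16 to 22) are harmless.
    static Map<Integer, Integer> buildMapping(List<List<Integer>> datasets) {
        TreeSet<Integer> distinct = new TreeSet<>();
        for (List<Integer> labels : datasets) {
            distinct.addAll(labels);
        }
        Map<Integer, Integer> mapping = new LinkedHashMap<>();
        for (Integer label : distinct) {
            mapping.put(label, mapping.size());
        }
        return mapping;
    }

    // Encodes labels with the shared mapping; unknown labels are an error.
    static int[] encode(List<Integer> labels, Map<Integer, Integer> mapping) {
        int[] encoded = new int[labels.size()];
        for (int i = 0; i < labels.size(); i++) {
            Integer index = mapping.get(labels.get(i));
            if (index == null) {
                throw new IllegalArgumentException("Unknown label: " + labels.get(i));
            }
            encoded[i] = index;
        }
        return encoded;
    }

    public static void main(String[] args) {
        List<Integer> train = Arrays.asList(1, 2, 15, 23, 36);
        List<Integer> test = Arrays.asList(23, 36, 15); // classes 1 and 2 absent
        Map<Integer, Integer> mapping = buildMapping(Arrays.asList(train, test));
        // prints [3, 4, 2]: the same indices these classes get in the training set
        System.out.println(Arrays.toString(encode(test, mapping)));
    }
}
```

Had the test set been indexed on its own, class 15 would have been given
index 0 instead of 2, which is the kind of disagreement described above.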
>>
>> Thank you
>>
>> On Mon, Aug 10, 2015 at 11:16 AM, Thushan Ganegedara <thu...@gmail.com>
>> wrote:
>>
>>> Hi all,
>>>
>>> I've been testing the new Deeplearning component with a few different
>>> datasets (mainly the leaf dataset), and the leaf dataset does not seem to
>>> work as expected, for an unknown reason.
>>>
>>> However, I tested the Deeplearning component extensively with the leaf
>>> dataset and identified several potential problems that might be causing the
>>> poor accuracy.
>>>
>>> 1. A higher number of epochs (compared to other datasets) is needed to
>>> produce reasonable accuracy.
>>>
>>> 2. Too many neurons cause overfitting, which leads to poor accuracy.
>>>
>>> 3. Some classes have quite closely related features (the latter classes
>>> in particular are often misclassified).
>>>
>>> I was able to get an accuracy of 86% with Logistic Regression L-BFGS,
>>> which is quite reasonable. But I'm having trouble reaching that accuracy
>>> with Deeplearning (which should be quite easy). The highest accuracy I
>>> have reached so far is 71.xx%.
>>>
>>> So I'm still looking for any definite issues causing the poor accuracy.
>>>
>>> Thank you.
>>>
>>>
>>> --
>>> Regards,
>>>
>>> Thushan Ganegedara
>>> School of IT
>>> University of Sydney, Australia
>>>



-- 

Thanks & regards,
Nirmal

Team Lead - WSO2 Machine Learner
Associate Technical Lead - Data Technologies Team, WSO2 Inc.
Mobile: +94715779733
Blog: http://nirmalfdo.blogspot.com/
_______________________________________________
Dev mailing list
Dev@wso2.org
http://wso2.org/cgi-bin/mailman/listinfo/dev
