On Fri, Aug 14, 2015 at 10:01 AM, Thushan Ganegedara <[email protected]> wrote:
> Hi,
>
> This was mainly due to the detection of a numerical feature as a
> categorical one.
> Oh, it makes sense now. Why don't we try taking a sample of data and if
> the sample contains only integers (or doubles without any decimals) or
> strings, consider it as a categorical variable.
>

I tried that approach too, but there are some datasets, like the automobile dataset's normalized-losses feature, which has integer values (0-164) but is probably not categorical.

>
> We suggested increasing the categorical threshold as a work-around.
> @thushan did it work?
> Yes, it worked. After increasing the threshold to 40.
>
> On Fri, Aug 14, 2015 at 2:21 PM, Nirmal Fernando <[email protected]> wrote:
>
>> This was mainly due to the detection of a numerical feature as a
>> categorical one.
>>
>> We suggested increasing the categorical threshold as a work-around.
>> @thushan did it work?
>>
>> On Tue, Aug 11, 2015 at 5:50 PM, Thushan Ganegedara <[email protected]>
>> wrote:
>>
>>> This issue occurs, if I turn the response variable to a categorical
>>> variable. If I get the variable as a numerical variable, the values are
>>> read correctly.
>>>
>>> So I presume there is a fault in categorical conversion of the variable.
>>>
>>> On Tue, Aug 11, 2015 at 7:11 PM, Thushan Ganegedara <[email protected]>
>>> wrote:
>>>
>>>> I still get the same result
>>>>
>>>> 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
>>>> 1.0 1.0 1.0 12.0 12.0 12.0 12.0 12.0 12.0
>>>> 12.0 12.0 12.0 12.0 13.0 13.0 13.0 13.0 13.0
>>>> 13.0
>>>> 13.0 13.0 13.0 13.0 14.0 14.0 14.0 14.0 14.0
>>>> 14.0 14.0 14.0 15.0 15.0 15.0 15.0 15.0 15.0
>>>> 15.0 15.0 15.0 15.0 15.0 15.0 16.0 16.0 16.0
>>>> 16.0
>>>> 16.0 16.0 16.0 16.0 17.0 17.0 17.0 17.0 17.0
>>>> 17.0 17.0 17.0 17.0 17.0 18.0 18.0 18.0 18.0
>>>> 18.0 18.0 18.0 18.0 18.0 18.0 18.0 19.0 19.0
>>>> 19.0
>>>> 19.0 19.0 19.0 19.0 19.0 19.0 19.0 19.0 19.0
>>>> 19.0 19.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0
>>>> 2.0 2.0 2.0 2.0 2.0 2.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 4.0 4.0 4.0 4.0 4.0 4.0
>>>> 4.0 4.0 4.0 4.0 4.0 4.0 5.0 5.0 5.0 5.0
>>>> 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0
>>>> 6.0 6.0 6.0 6.0 6.0 6.0 6.0 6.0 6.0
>>>> 6.0 6.0 6.0 7.0 7.0 7.0 7.0 7.0 7.0 7.0
>>>> 7.0 7.0 7.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>> 3.0 3.0 3.0 3.0
>>>>
>>>> On Tue, Aug 11, 2015 at 7:05 PM, Nirmal Fernando <[email protected]>
>>>> wrote:
>>>>
>>>>> Can you use following code and try;
>>>>>
>>>>> List<LabeledPoint> points = labeledPoints.collect();
>>>>> for(int i=0;i<points.size();i++){
>>>>>     System.out.print(points.get(i).label() + "\t");
>>>>> }
>>>>>
>>>>> On Tue, Aug 11, 2015 at 2:30 PM, Thushan Ganegedara <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I used the following snippet
>>>>>>
>>>>>> for(int i=0;i<labeledPoints.collect().size();i++){
>>>>>>     System.out.print(labeledPoints.collect().get(i).label()
>>>>>>         + "\t");
>>>>>> }
>>>>>>
>>>>>> in the public MLModel build() throws MLModelBuilderException in
>>>>>> DeeplearningModelBuilder.java
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 11, 2015 at 6:17 PM, Nirmal Fernando <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi thushan,
>>>>>>>
>>>>>>> We need more info. What did you exactly print and where?
>>>>>>>
>>>>>>> On Tue, Aug 11, 2015 at 12:47 PM, Thushan Ganegedara <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I found the potential cause of the poor accuracy for the leaf
>>>>>>>> dataset. It seems the data read into ML is wrong.
>>>>>>>>
>>>>>>>> I have attached the data file as a CSV (classes are in the last
>>>>>>>> column)
>>>>>>>>
>>>>>>>> However, when I print out the labels of the read data (classes), it
>>>>>>>> looks something like below. Clearly there aren't this many "3.0"
>>>>>>>> classes and there should be classes up to 36.0.
>>>>>>>>
>>>>>>>> Is this caused by a bug?
>>>>>>>>
>>>>>>>> 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0
>>>>>>>> 1.0 1.0 1.0 1.0 12.0 12.0 12.0 12.0 12.0
>>>>>>>> 12.0 12.0 12.0 12.0 12.0 13.0 13.0 13.0 13.0
>>>>>>>> 13.0 13.0
>>>>>>>> 13.0 13.0 13.0 13.0 14.0 14.0 14.0 14.0
>>>>>>>> 14.0 14.0 14.0 14.0 15.0 15.0 15.0 15.0 15.0
>>>>>>>> 15.0 15.0 15.0 15.0 15.0 15.0 15.0 16.0 16.0
>>>>>>>> 16.0 16.0
>>>>>>>> 16.0 16.0 16.0 16.0 17.0 17.0 17.0 17.0
>>>>>>>> 17.0 17.0 17.0 17.0 17.0 17.0 18.0 18.0 18.0
>>>>>>>> 18.0 18.0 18.0 18.0 18.0 18.0 18.0 18.0 19.0
>>>>>>>> 19.0 19.0
>>>>>>>> 19.0 19.0 19.0 19.0 19.0 19.0 19.0 19.0
>>>>>>>> 19.0 19.0 19.0 2.0 2.0 2.0 2.0 2.0 2.0
>>>>>>>> 2.0 2.0 2.0 2.0 2.0 2.0 2.0 3.0 3.0
>>>>>>>> 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 4.0 4.0 4.0 4.0 4.0
>>>>>>>> 4.0 4.0 4.0 4.0 4.0 4.0 4.0 5.0 5.0
>>>>>>>> 5.0 5.0
>>>>>>>> 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0
>>>>>>>> 5.0 6.0 6.0 6.0 6.0 6.0 6.0 6.0 6.0
>>>>>>>> 6.0 6.0 6.0 6.0 7.0 7.0 7.0 7.0 7.0
>>>>>>>> 7.0 7.0
>>>>>>>> 7.0 7.0 7.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0
>>>>>>>> 3.0 3.0
>>>>>>>> 3.0 3.0 3.0 3.0
>>>>>>>>
>>>>>>>> --
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> Thushan Ganegedara
>>>>>>>> School of IT
>>>>>>>> University of Sydney, Australia
>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> Thanks & regards,
>>>>>>> Nirmal
>>>>>>>
>>>>>>> Team Lead - WSO2 Machine Learner
>>>>>>> Associate Technical Lead - Data Technologies Team, WSO2 Inc.
>>>>>>> Mobile: +94715779733
>>>>>>> Blog: http://nirmalfdo.blogspot.com/
>>>>>>>

--

Thanks & regards,
Nirmal
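
The threshold work-around discussed in this thread can be sketched in plain Java. This is only an illustration, not WSO2 ML's actual code: the class FeatureTypeGuess, both method names, the lower threshold of 20, and the sample values are all hypothetical. It assumes the "categorical threshold" is the maximum number of distinct values a sampled column may hold and still be treated as categorical. The second method mirrors the "only integers, or doubles without any decimals" idea, to show why it would also flag an integer-valued numerical feature.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of WSO2 ML.
final class FeatureTypeGuess {

    // Assumed rule: a column is categorical when its sample holds at most
    // maxCategories distinct values.
    static boolean looksCategorical(List<String> sample, int maxCategories) {
        Set<String> distinct = new HashSet<>(sample);
        return distinct.size() <= maxCategories;
    }

    // True when every value parses as a whole number, e.g. "3" or "3.0".
    static boolean allWholeNumbers(List<String> sample) {
        for (String value : sample) {
            try {
                double d = Double.parseDouble(value.trim());
                if (Double.isNaN(d) || Double.isInfinite(d) || d != Math.rint(d)) {
                    return false;
                }
            } catch (NumberFormatException e) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for a class column with labels 1..36 (a simplification;
        // the real file may not use every value in that range).
        List<String> labels = new ArrayList<>();
        for (int i = 1; i <= 36; i++) {
            labels.add(String.valueOf(i));
        }
        System.out.println(looksCategorical(labels, 20)); // 20 is a made-up lower threshold
        System.out.println(looksCategorical(labels, 40)); // 40 is the value used in this thread

        // Made-up integer readings in the 0-164 range mentioned above.
        List<String> readings = Arrays.asList("0", "85", "103", "164");
        System.out.println(allWholeNumbers(readings));
    }
}
```

Run as written, this prints false, true, true: under the assumed rule the 36-label column only counts as categorical once the threshold is at least 36, and the whole-number check alone cannot tell integer-coded classes from integer-valued measurements.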
_______________________________________________
Dev mailing list
[email protected]
http://wso2.org/cgi-bin/mailman/listinfo/dev
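
For label dumps like the ones quoted in this thread, a per-label count may be easier to scan than the raw tab-separated output. Below is a minimal, dependency-free sketch; LabelHistogram and countLabels are hypothetical names, and the input values are made up. In the Spark code from the thread, the list would presumably be built from labeledPoints.collect() and label(), which is left out here so the example runs on its own.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper for illustration only; not part of WSO2 ML.
final class LabelHistogram {

    // Counts how often each label occurs; TreeMap keeps labels in numeric order.
    static Map<Double, Integer> countLabels(List<Double> labels) {
        Map<Double, Integer> counts = new TreeMap<>();
        for (Double label : labels) {
            counts.merge(label, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up labels standing in for the collected label() values.
        List<Double> labels = Arrays.asList(1.0, 1.0, 12.0, 3.0, 3.0, 3.0);
        System.out.println(countLabels(labels)); // prints {1.0=2, 3.0=3, 12.0=1}
    }
}
```

A count that is far larger for one label than for the others, as with "3.0" in the output above, would stand out immediately in this form.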
