[ https://issues.apache.org/jira/browse/MADLIB-605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Frank McQuillan closed MADLIB-605.
----------------------------------
    Resolution: Fixed

Resolved by writing a new SVM from scratch for v1.9.

Closing this JIRA.

> SVM Regression: Accuracy should be improved for some data sets
> --------------------------------------------------------------
>
>                 Key: MADLIB-605
>                 URL: https://issues.apache.org/jira/browse/MADLIB-605
>             Project: Apache MADlib
>          Issue Type: Bug
>            Reporter: Jiali Yao
>            Assignee: Rahul Iyer
>            Priority: Critical
>              Labels: severity_set
>             Fix For: v1.9
>
>
> We ran comparable test cases in both MADlib and libsvm and compared the mean
> squared error (MSE).
> We found that MADlib scores worse than libsvm on the data sets below.
> 1. Kernel function is dot:
> {code}
> Data Sets   MADlib(Parallel = true)   MADlib(Parallel = false)   libsvm     Madlib/libsvm
> bodyfat     249962.1847               4397613.616                4.68E-05   5336898594
> mpg         239380954.3               1.89706E+11                22.5239    10627864.37
> Test case:
> SELECT madlib.svm_regression
>                         ( 'madlibtestdata.svm_bodyfat'::text     --input_table
>                         , 'madlibtestresult.reg_model_table'::text    --model_table
>                         , 'true'::boolean       --parallel
>                         , 'madlib.svm_dot'::text    --kernel_func
>                         , 'false'::boolean        --verbose
>                         , '0.1'::float8            --eta
>                         , '0.005'::float8             --nu
>                         , '0.05'::float8        --slambda
>                    ) AS q;
> SELECT madlib.svm_regression
>                         ( 'madlibtestdata.svm_mpg'::text     --input_table
>                         , 'madlibtestresult.reg_model_table'::text    --model_table
>                         , 'true'::boolean       --parallel
>                         , 'madlib.svm_dot'::text    --kernel_func
>                         , 'false'::boolean        --verbose
>                         , '0.1'::float8            --eta
>                         , '0.005'::float8             --nu
>                         , '0.05'::float8        --slambda
>                    ) AS q;
> {code}
> 2. Polynomial
> {code}
> Data Sets   MADlib(Parallel = true)   MADlib(Parallel = false)   libsvm       Madlib/libsvm
> bodyfat     4.07E+26                  1.86E+27                   0.00143458   2.83446E+29
> cpusmall    2.38E+71                  4.41E+72                   1.42E+42     1.67986E+29
> housing     9.31E+29                  6.79E+31                   249267       3.73671E+24
> mpg         2.25E+37                  9.89E+39                   610.474      3.68346E+34
> Test case example:
> SELECT madlib.svm_regression
>                         ( 'madlibtestdata.svm_bodyfat'::text     --input_table
>                         , 'madlibtestresult.reg_model_table'::text    --model_table
>                         , 'true'::boolean       --parallel
>                         , 'madlibtestdata.svm_polynomial'::text    --kernel_func
>                         , 'false'::boolean        --verbose
>                         , '0.1'::float8            --eta
>                         , '0.005'::float8             --nu
>                         , '0.05'::float8        --slambda
>                    ) AS q;
> {code}
> 3. Data sets
> {code}
> Data Sets Name   TrainSize   Attr   URL
> abalone          4,177       8      http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#abalone
> bodyfat          252         14     http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#bodyfat
> cpusmall         8,192       12     http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#cpusmalll
> housing          506         13     http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/regression.html#housing
> {code}
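
The Madlib/libsvm column in the tables above appears to be the MADlib (Parallel = true) mean squared error divided by the libsvm one. The following is a minimal Python sketch of that comparison, not the code used for the original tests. The function names `mean_squared_error` and `mse_ratio` are hypothetical helpers and are not part of MADlib or libsvm.

```python
def mean_squared_error(actual, predicted):
    """Return the mean of squared residuals for two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if len(actual) == 0:
        raise ValueError("need at least one observation")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mse_ratio(madlib_mse, libsvm_mse):
    """Return how many times larger the MADlib MSE is than the libsvm MSE."""
    # Assumption: the reported ratio uses the Parallel = true figure.
    return madlib_mse / libsvm_mse


# MSE figures taken from the dot-kernel table (mpg, Parallel = true).
ratio = mse_ratio(239380954.3, 22.5239)
print(f"mpg, dot kernel: MADlib MSE is {ratio:.2f} times the libsvm MSE")
```

A ratio near 1 would mean the two implementations fit about equally well. The ratios reported in this issue are many orders of magnitude above that.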



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
