GitHub user fmcquillan99 commented on the issue:

    https://github.com/apache/madlib/pull/295
  
    Should impurity_var_importance always add up to 100?
    From the regression example in the user docs:
    
    ```
    DROP TABLE IF EXISTS mt_imp_output;
    SELECT madlib.get_var_importance('mt_cars_output','mt_imp_output');
    SELECT am, impurity_var_importance FROM mt_imp_output ORDER BY am, impurity_var_importance DESC;
    ```
    results in
    ```
    
     am | impurity_var_importance 
    ----+-------------------------
      0 |        35.7664683110879
      0 |        24.7481977075922
      0 |        12.4401197123678
      0 |        12.1559096708347
      0 |        4.88929809351791
      1 |        31.7259035495099
      1 |        29.6146492693988
      1 |        14.9602257795489
      1 |        7.01369118455985
      1 |        6.68552870777581
    (10 rows)
    ```
    which does not add up to 100 within either group (about 90 in each):
    ```
                grp 0                   grp 1
                35.76646831             31.72590355
                24.74819771             29.61464927
                12.44011971             14.96022578
                12.15590967             7.013691185
                4.889298094             6.685528708
    total       89.9999935              89.99999849
    ```
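    The per-group totals could also be checked directly in the database rather than by hand. A minimal sketch, assuming the `mt_imp_output` table from the example above still exists with the `am` and `impurity_var_importance` columns used in the query:
    ```
    -- Sum the impurity-based importance within each group (am).
    -- Assumes mt_imp_output was created by the get_var_importance() call above.
    SELECT am,
           sum(impurity_var_importance) AS total_impurity_var_importance
    FROM mt_imp_output
    GROUP BY am
    ORDER BY am;
    ```
    For the output shown above, this would be expected to return roughly 90 for each group rather than 100.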

