[ 
https://issues.apache.org/jira/browse/MAHOUT-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ikumasa Mukai updated MAHOUT-840:
---------------------------------

    Attachment: MAHOUT-840-additional.patch

Hi Hakim-san.

I made a patch that makes the decisionTreeBuilder the default.

Here is the resulting help output:
{noformat}
Usage:                                                                          
 [--data <path> --dataset <dataset> --selection <m> --no-complete --minsplit    
<minsplit> --minprop <minprop> --seed <seed> --partial --nbtrees <nbtrees>      
--output <path> --help]                                                         
Options                                                                         
  --data (-d) path             Data path                                        
  --dataset (-ds) dataset      Dataset path                                     
  --selection (-sl) m          Optional, Number of variables to select randomly 
                               at each tree-node.                               
                               For classification problem, the default is       
                               square root of the number of explanatory         
                               variables.                                       
                               For regression problem, the default is 1/3 of    
                               the number of explanatory variables.             
  --no-complete (-nc)          Optional, The tree is not complemented           
  --minsplit (-ms) minsplit    Optional, The tree-node is not divided, if the   
                               branching data size is smaller than this value.  
                               The default is 2.                                
  --minprop (-mp) minprop      Optional, The tree-node is not divided, if the   
                               proportion of the variance of branching data is  
                               smaller than this value.                         
                               The default is 1/1000(0.001).                    
  --seed (-sd) seed            Optional, seed value used to initialise the      
                               Random number generator                          
  --partial (-p)               Optional, use the Partial Data implementation    
  --nbtrees (-t) nbtrees       Number of trees to grow                          
  --output (-o) path           Output path, will contain the Decision Forest    
  --help (-h)                  Print out help
{noformat}
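
To illustrate how the --minsplit and --minprop descriptions above could act as stopping rules, here is a minimal sketch. It is not code from the patch: the class name, the method name, and the reading of "proportion of the variance" as a fraction of the variance of the whole training data are assumptions made for illustration only.

```java
// Hypothetical sketch of the two stopping rules described in the help text.
// None of these names come from the patch.
final class SplitRuleSketch {
  // Defaults as stated in the help output.
  static final int DEFAULT_MIN_SPLIT = 2;
  static final double DEFAULT_MIN_PROP = 0.001;

  private SplitRuleSketch() {}

  /**
   * Returns true if a tree node may still be divided.
   *
   * @param nodeSize      number of instances reaching the node
   * @param nodeVariance  variance of the target among those instances
   * @param totalVariance variance of the target over the whole data (assumed reference)
   */
  static boolean shouldSplit(int nodeSize, double nodeVariance, double totalVariance,
                             int minSplit, double minProp) {
    // --minsplit: do not divide when the node holds fewer instances than minSplit.
    if (nodeSize < minSplit) {
      return false;
    }
    // --minprop: do not divide when the node variance falls below the given
    // proportion of the overall variance (assumed interpretation).
    if (nodeVariance < totalVariance * minProp) {
      return false;
    }
    return true;
  }
}
```

With the defaults, a call such as SplitRuleSketch.shouldSplit(n, var, totalVar, SplitRuleSketch.DEFAULT_MIN_SPLIT, SplitRuleSketch.DEFAULT_MIN_PROP) would stop at nodes with a single instance or with almost no remaining variance.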

In summary, I added these 3 options

{noformat}
--no-complete (-nc)
--minsplit (-ms) minsplit
--minprop (-mp) minprop
{noformat}

I also changed the "--selection (-sl)" option from required to optional,
because an appropriate default can be calculated from the number of
explanatory variables.
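
As a rough illustration of that default, following the rule stated in the help text (square root of the number of explanatory variables for classification, one third for regression), a sketch might look like the following. The class and method names are hypothetical, and rounding up to a whole number is an assumption; the help text does not say how the value is rounded.

```java
// Hypothetical helper, not taken from the patch: derives a default for
// --selection from the number of explanatory variables.
final class SelectionDefaultSketch {
  private SelectionDefaultSketch() {}

  static int defaultSelection(int nbExplanatory, boolean regression) {
    if (nbExplanatory < 1) {
      throw new IllegalArgumentException("need at least one explanatory variable");
    }
    // Regression: 1/3 of the variables. Classification: their square root.
    double m = regression ? nbExplanatory / 3.0 : Math.sqrt(nbExplanatory);
    // Rounding up is an assumption; it also guarantees at least one variable.
    return Math.max(1, (int) Math.ceil(m));
  }
}
```

For a dataset with 16 explanatory variables this gives 4 for classification, and for 9 explanatory variables it gives 3 for regression.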

Would you please check my patch?
(This patch does not include the removal of defaultBuilder.)

Regards,
                
> Decision Forests should support Regression problems
> ---------------------------------------------------
>
>                 Key: MAHOUT-840
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-840
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>            Reporter: Deneche A. Hakim
>            Assignee: Deneche A. Hakim
>         Attachments: DecisionTreeBuilderTest.java, 
> MAHOUT-840-additional.patch, MAHOUT-840.patch, regression.patch, 
> regression.patch, regression.patch
>
>
> Improve Decision Forest code in order to handle numerical targets, thus 
> supporting regression problems


