[ https://issues.apache.org/jira/browse/MAHOUT-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ikumasa Mukai updated MAHOUT-840:
---------------------------------
Attachment: MAHOUT-840-additional.patch
Hi Hakim-san.
I made a patch that makes the decisionTreeBuilder the default.
Here is the resulting help output.
{noformat}
Usage:
[--data <path> --dataset <dataset> --selection <m> --no-complete --minsplit
<minsplit> --minprop <minprop> --seed <seed> --partial --nbtrees <nbtrees>
--output <path> --help]
Options
  --data (-d) path             Data path
  --dataset (-ds) dataset      Dataset path
  --selection (-sl) m          Optional, Number of variables to select randomly
                               at each tree-node.
                               For classification problem, the default is
                               square root of the number of explanatory
                               variables.
                               For regression problem, the default is 1/3 of
                               the number of explanatory variables.
  --no-complete (-nc)          Optional, The tree is not complemented
  --minsplit (-ms) minsplit    Optional, The tree-node is not divided, if the
                               branching data size is smaller than this value.
                               The default is 2.
  --minprop (-mp) minprop      Optional, The tree-node is not divided, if the
                               proportion of the variance of branching data is
                               smaller than this value.
                               The default is 1/1000(0.001).
  --seed (-sd) seed            Optional, seed value used to initialise the
                               Random number generator
  --partial (-p)               Optional, use the Partial Data implementation
  --nbtrees (-t) nbtrees       Number of trees to grow
  --output (-o) path           Output path, will contain the Decision Forest
  --help (-h)                  Print out help
{noformat}
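To illustrate how the --minsplit and --minprop descriptions in the help could work together, here is a minimal sketch. It is not the code from the patch: the class SplitStop and its methods are hypothetical names, and reading "proportion of the variance" as the node's variance relative to the variance of the full training data is my assumption.

```java
final class SplitStop {
    /** Population variance of the target values; 0 for an empty array. */
    static double variance(double[] values) {
        if (values.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        double mean = sum / values.length;
        double squared = 0.0;
        for (double v : values) {
            squared += (v - mean) * (v - mean);
        }
        return squared / values.length;
    }

    /**
     * Hypothetical stop test for a tree-node.
     * ASSUMPTION: minProp is compared against nodeVariance / fullVariance.
     */
    static boolean shouldStop(double[] nodeTargets, double fullVariance,
                              int minSplit, double minProp) {
        // --minsplit: too few rows reach this node, so do not divide it.
        if (nodeTargets.length < minSplit) {
            return true;
        }
        // --minprop: the node's variance is a negligible share of the total.
        return variance(nodeTargets) < fullVariance * minProp;
    }

    public static void main(String[] args) {
        // Uses the defaults from the help: minsplit = 2, minprop = 0.001.
        System.out.println(shouldStop(new double[] {1.0}, 4.0, 2, 0.001));
        System.out.println(shouldStop(new double[] {0.0, 4.0}, 4.0, 2, 0.001));
    }
}
```

With the defaults, a node holding a single row stops on the size check, while a node whose targets still vary as much as the full data keeps dividing.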
In summary, I added these 3 options
{noformat}
--no-complete (-nc)
--minsplit (-ms) minsplit
--minprop (-mp) minprop
{noformat}
and changed the "--selection (-sl)" option from Required to Optional,
because I think an appropriate value can be calculated from the number of explanatory variables.
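As a rough sketch of how the default for --selection could be calculated from the rule in the help text (square root for classification, 1/3 for regression): the class and method names below are hypothetical, and rounding up is my assumption, since the help does not say how fractional values are handled.

```java
final class DefaultSelection {
    /**
     * Hypothetical helper: default number of variables to try at each
     * tree-node when --selection is not given.
     */
    static int defaultSelection(int explanatoryVariables, boolean regression) {
        if (explanatoryVariables < 1) {
            throw new IllegalArgumentException("need at least one explanatory variable");
        }
        double m = regression
            ? explanatoryVariables / 3.0             // regression: 1/3 of the variables
            : Math.sqrt(explanatoryVariables);       // classification: square root
        // ASSUMPTION: round up, so the result is never smaller than 1.
        return (int) Math.ceil(m);
    }

    public static void main(String[] args) {
        System.out.println(defaultSelection(16, false)); // prints 4
        System.out.println(defaultSelection(16, true));  // prints 6
    }
}
```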
Would you please check my patch?
(This patch does not include the removal of defaultBuilder.)
Regards,
> Decision Forests should support Regression problems
> ---------------------------------------------------
>
> Key: MAHOUT-840
> URL: https://issues.apache.org/jira/browse/MAHOUT-840
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Reporter: Deneche A. Hakim
> Assignee: Deneche A. Hakim
> Attachments: DecisionTreeBuilderTest.java,
> MAHOUT-840-additional.patch, MAHOUT-840.patch, regression.patch,
> regression.patch, regression.patch
>
>
> Improve Decision Forest code in order to handle numerical targets, thus
> supporting regression problems