[ https://issues.apache.org/jira/browse/MAHOUT-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742274#action_12742274 ]
Deneche A. Hakim commented on MAHOUT-145:
-----------------------------------------
Results on KDD 50%:
|| Num Map Tasks || Num Trees || Oob Error || Build Time || Step 1 || Step 1-2 || Step 2 || Step 2-2 || Step 3 ||
| 10 | 100 | 0.1911 | 0h 2m 39s 73 | 1m 23s | 9s | 27s | 40s | 1m 7s |
| 10 | 200 | 0.1902 | 0h 4m 57s 268 | 2m 39s | 17s | 40s | 1m 21s | 1m 4s |
| 10 | 400 | 0.1880 | 0h 9m 1s 400 | 4m 37s | 34s | 1m 5s | 2m 46s | 1m 6s |
| 20 | 100 | 0.1905 | 0h 1m 44s 853 | 32s | 5s | 24s | 44s | 1m 5s |
| 20 | 200 | 0.1853 | 0h 2m 58s 462 | 48s | 9s | 30s | 1m 32s | 1m 3s |
| 20 | 400 | 0.1856 | 0h 5m 20s 231 | 1m 26s | 17s | 47s | 2m 50s | 1m 5s |
| 50 | 100 | 0.4738 | 0h 1m 23s 989 | 19s | 2s | 24s | 39s | 1m 3s |
| 50 | 200 | 0.4738 | 0h 2m 10s 921 | 21s | 4s | 30s | 1m 16s | 1m 3s |
| 50 | 400 | 0.4738 | 0h 3m 52s 98 | 25s | 7s | 44s | 2m 36s | 1m 2s |
> PartialData mapreduce Random Forests
> ------------------------------------
>
> Key: MAHOUT-145
> URL: https://issues.apache.org/jira/browse/MAHOUT-145
> Project: Mahout
> Issue Type: New Feature
> Components: Classification
> Reporter: Deneche A. Hakim
> Priority: Minor
> Attachments: partial_August_10.patch, partial_August_2.patch, partial_August_9.patch
>
>
> This implementation is based on a suggestion by Ted:
> "modify the original algorithm to build multiple trees for different portions
> of the data. That loses some of the solidity of the original method, but
> could actually do better if the splits exposed non-stationary behavior."
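
The idea in the quoted suggestion can be illustrated with a minimal sketch. It is not the code from the attached patches: it assumes each portion of the data gets an equal share of the trees, uses a one-split stump as a stand-in for a full decision tree, and combines the trees by majority vote. The names train_stump, build_partial_forest and predict are hypothetical, and the partitions presumably play the role of the map tasks in the table above.

```python
import random
from collections import Counter


def train_stump(rows, rng):
    """Fit a one-split stump on a bootstrap sample of `rows`.

    `rows` is a list of (features, label) pairs. The stump is only a
    stand-in for a full decision tree (assumption for this sketch).
    """
    sample = [rng.choice(rows) for _ in rows]  # bootstrap sample
    default = Counter(label for _, label in sample).most_common(1)[0][0]
    best = None
    for f in range(len(sample[0][0])):
        for x, _ in sample:
            t = x[f]  # candidate threshold taken from the sample
            left = [lab for feats, lab in sample if feats[f] <= t]
            right = [lab for feats, lab in sample if feats[f] > t]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = (sum(lab != l_lab for lab in left)
                      + sum(lab != r_lab for lab in right))
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    if best is None:
        return lambda x: default
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab


def build_partial_forest(rows, num_partitions, num_trees, seed=0):
    """Build trees separately on each portion of the data.

    Each tree only ever sees the rows of its own partition.
    """
    rng = random.Random(seed)
    size = -(-len(rows) // num_partitions)  # ceiling division
    partitions = [rows[i:i + size] for i in range(0, len(rows), size)]
    forest = []
    for p, part in enumerate(partitions):
        # Spread the trees as evenly as possible over the partitions.
        n = num_trees // len(partitions)
        if p < num_trees % len(partitions):
            n += 1
        forest.extend(train_stump(part, rng) for _ in range(n))
    return forest


def predict(forest, x):
    """Classify `x` by majority vote over all trees in the forest."""
    return Counter(tree(x) for tree in forest).most_common(1)[0][0]


if __name__ == "__main__":
    # Toy data: class 0 has feature values in [0, 1), class 1 in [10, 11).
    rows = [([i % 2 * 10 + i / 100], i % 2) for i in range(100)]
    forest = build_partial_forest(rows, num_partitions=4, num_trees=20)
    print(len(forest), predict(forest, [-1.0]), predict(forest, [20.0]))
```

Because every tree is trained on a single partition, adding partitions shrinks the data behind each tree. That is the loss of "solidity" the quote mentions.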