[ 
https://issues.apache.org/jira/browse/MAHOUT-145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742262#action_12742262
 ] 

Deneche A. Hakim edited comment on MAHOUT-145 at 8/12/09 3:39 AM:
------------------------------------------------------------------

Update: I re-ran the tests with 50 map tasks; the new results are more coherent.

KDD 25%
|| Num Map Tasks || Num Trees || Oob Error || Build Time || Step 1 || Step 1-2 || Step 2 || Step 2-2 || Step 3 ||
| 10 | 100 | 0.0194 | 0h 1m 23s 210 | 39s | 4s | 20s | 20s | 33s |
| 10 | 200 | 0.0203 | 0h 2m 16s 510 | 1m 1s | 9s | 26s | 41s | 33s |
| 10 | 400 | 0.0195 | 0h 4m 10s 9 | 1m 53s | 18s | 39s | 1m 20s | 32s |
| 20 | 100 | 0.3875 | 0h 1m 5s 288 | 20s | 2s | 18s | 25s | 31s |
| 20 | 200 | 0.3626 | 0h 1m 29s 145 | 23s | 5s | 22s | 39s | 33s |
| 20 | 400 | 0.5003 | 0h 2m 30s 789 | 35s | 8s | 28s | 1m 19s | 32s |
| 50 | 100 | 0.5041 | 0h 1m 1s 375 | 19s | 3s | 19s | 21s | 32s |
| 50 | 200 | 0.5041 | 0h 1m 19s 202 | 19s | 2s | 22s | 36s | 32s |
| 50 | 400 | 0.5041 | 0h 2m 2s 250 | 18s | 4s | 28s | 1m 12s | 33s |


      was (Author: adeneche):
    KDD 25%
|| Num Map Tasks || Num Trees || Oob Error || Build Time || Step 1 || Step 1-2 || Step 2 || Step 2-2 || Step 3 ||
| 10 | 100 | 0.0194 | 0h 1m 23s 210 | 39s | 4s | 20s | 20s | 33s |
| 10 | 200 | 0.0203 | 0h 2m 16s 510 | 1m 1s | 9s | 26s | 41s | 33s |
| 10 | 400 | 0.0195 | 0h 4m 10s 9 | 1m 53s | 18s | 39s | 1m 20s | 32s |
| 20 | 100 | 0.3875 | 0h 1m 5s 288 | 20s | 2s | 18s | 25s | 31s |
| 20 | 200 | 0.3626 | 0h 1m 29s 145 | 23s | 5s | 22s | 39s | 33s |
| 20 | 400 | 0.5003 | 0h 2m 30s 789 | 35s | 8s | 28s | 1m 19s | 32s |
| 50 | 100 | 0.5041 | 0h 1m 46s 717 | 14s | 2s | 1m 11s | 20s | 33s |
| 50 | 200 | 0.5041 | 0h 1m 20s 977 | 17s | 2s | 22s | 40s | 33s |
| 50 | 400 | 0.5041 | 0h 2m 1s 714 | 16s | 4s | 29s | 1m 12s | 34s |

  
> PartialData mapreduce Random Forests
> ------------------------------------
>
>                 Key: MAHOUT-145
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-145
>             Project: Mahout
>          Issue Type: New Feature
>          Components: Classification
>            Reporter: Deneche A. Hakim
>            Priority: Minor
>         Attachments: partial_August_10.patch, partial_August_2.patch, partial_August_9.patch
>
>
> This implementation is based on a suggestion by Ted:
> "modify the original algorithm to build multiple trees for different portions 
> of the data. That loses some of the solidity of the original method, but 
> could actually do better if the splits exposed non-stationary behavior."
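
The idea quoted above (build separate trees on separate portions of the data, then combine them) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in the attached patches: the function names (train_stump, build_partial_forest, predict_forest) are hypothetical, the "trees" are one-level decision stumps rather than full trees, and the partitions are simple contiguous slices standing in for the input splits that map tasks would receive.

```python
import random
from collections import Counter


def train_stump(rows, labels, rng):
    """Fit a one-feature threshold classifier on a bootstrap sample.

    Hypothetical stand-in for a full decision tree.
    """
    n = len(rows)
    idx = [rng.randrange(n) for _ in range(n)]   # bootstrap: sample with replacement
    feature = rng.randrange(len(rows[0]))        # pick one random feature
    best = None
    for i in idx:
        threshold = rows[i][feature]
        left = [labels[j] for j in idx if rows[j][feature] <= threshold]
        right = [labels[j] for j in idx if rows[j][feature] > threshold]
        # "left" is never empty because row i itself satisfies <= threshold.
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(l != left_label for l in left)
                  + sum(l != right_label for l in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def build_partial_forest(rows, labels, num_partitions, num_trees, seed=0):
    """Build num_trees stumps in total, each seeing only one partition."""
    if not 0 < num_partitions <= len(rows):
        raise ValueError("need between 1 and len(rows) partitions")
    rng = random.Random(seed)
    forest = []
    size = len(rows)
    for p in range(num_partitions):
        # Contiguous slice of the data (assumption: mimics one input split).
        start = p * size // num_partitions
        end = (p + 1) * size // num_partitions
        part_rows, part_labels = rows[start:end], labels[start:end]
        # Spread the trees as evenly as possible over the partitions.
        trees_here = num_trees // num_partitions
        if p < num_trees % num_partitions:
            trees_here += 1
        for _ in range(trees_here):
            forest.append(train_stump(part_rows, part_labels, rng))
    return forest


def predict_forest(forest, row):
    """Majority vote over all stumps, regardless of which partition built them."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]
```

Each stump only ever sees the rows of its own partition, so with more partitions each learner has less data to train on; in this sketch that is the trade-off between per-tree data and parallelism.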

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
