[ 
https://issues.apache.org/jira/browse/MAHOUT-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785985#action_12785985
 ] 

Isabel Drost commented on MAHOUT-11:
------------------------------------

Applies cleanly and builds w/o unit test failures here.

The changes all look good to me. Great work, Drew.

One question though: in the TestMeanShift test (lines 301 and 304) you removed 
the canopyId adjustments. Could you please explain why this was necessary?

I would like to commit this patch next week if no one objects.

> Static fields used throughout clustering code (Canopy, K-Means).
> ----------------------------------------------------------------
>
>                 Key: MAHOUT-11
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-11
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.1
>            Reporter: Dawid Weiss
>             Fix For: 0.3
>
>         Attachments: MAHOUT-11-all-cleanup-20091128.patch, 
> MAHOUT-11-kmeans-cleanup.patch, MAHOUT-11-RandomSeedGenerator.patch, 
> MAHOUT-11.patch
>
>
> I file this as a bug, even though I'm not 100% sure it is one. In the current 
> code the information is exchanged via static fields (for example, the distance 
> measure and thresholds for Canopies are static fields). Is it always true in 
> Hadoop that one job runs inside one JVM with exclusive access? I haven't seen 
> this stated anywhere in the Hadoop documentation, and my impression was that 
> everything uses JobConf to pass configuration to jobs, but jobs are configured 
> on a per-object basis (a job is an object, a mapper is an object and 
> everything else is basically an object).
> If it's possible for two jobs to run in parallel inside one JVM, then this is 
> a limitation and a bug in our code that needs to be addressed.
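
The concern in the description above can be illustrated with a minimal,
Hadoop-free sketch. All names here (StaticConfigSketch, StaticCanopy,
InstanceCanopy, t1, covers) are hypothetical and are not taken from the
Mahout code or the attached patches. The sketch only assumes that two jobs
end up configured inside the same JVM, which is exactly the open question
raised in the issue.

```java
public class StaticConfigSketch {

    // Static-field style: a single threshold per JVM, shared by every job.
    // (Hypothetical class, for illustration only.)
    static class StaticCanopy {
        static double t1;

        static void configure(double threshold) {
            t1 = threshold;
        }

        boolean covers(double distance) {
            return distance < t1;
        }
    }

    // Per-object style: each job instance carries its own threshold.
    // (Hypothetical class, for illustration only.)
    static class InstanceCanopy {
        private final double t1;

        InstanceCanopy(double threshold) {
            this.t1 = threshold;
        }

        boolean covers(double distance) {
            return distance < t1;
        }
    }

    public static void main(String[] args) {
        // Job A is configured with threshold 3.0, then job B with 1.0.
        StaticCanopy.configure(3.0);
        StaticCanopy staticA = new StaticCanopy();
        StaticCanopy.configure(1.0); // overwrites the value job A relies on
        StaticCanopy staticB = new StaticCanopy();

        InstanceCanopy instanceA = new InstanceCanopy(3.0);
        InstanceCanopy instanceB = new InstanceCanopy(1.0);

        // With static state, job A now silently uses job B's threshold.
        System.out.println("static   A covers 2.0: " + staticA.covers(2.0));
        System.out.println("static   B covers 2.0: " + staticB.covers(2.0));
        // With per-object state, each job keeps its own threshold.
        System.out.println("instance A covers 2.0: " + instanceA.covers(2.0));
        System.out.println("instance B covers 2.0: " + instanceB.covers(2.0));
    }
}
```

If jobs never share a JVM, the static variant behaves the same as the
per-object one. The per-object variant simply does not depend on that
assumption.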

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
