Also, the collector may not be called at all, and in that case the types
get buggered.

On Tue, Oct 14, 2008 at 7:04 PM, Jeff Eastman (JIRA) <[EMAIL PROTECTED]> wrote:

>
>    [
> https://issues.apache.org/jira/browse/MAHOUT-82?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639691#action_12639691]
>
> Jeff Eastman commented on MAHOUT-82:
> ------------------------------------
>
> This assertion needs a lot more justification before I would agree with it.
> The canopy reducer obtains cluster centroids from many mappers - each seeing
> only a portion of the input data - and attempts to coalesce them. Each
> mapper/combiner generates its own independent set of canopies and so there
> would be no common canopyIds to use in the reducer.
>
> > Canopy map intermediate file structure should be keyed by canopyId.
> > -------------------------------------------------------------------
> >
> >                 Key: MAHOUT-82
> >                 URL: https://issues.apache.org/jira/browse/MAHOUT-82
> >             Project: Mahout
> >          Issue Type: Bug
> >          Components: Clustering
> >    Affects Versions: 0.1
> >            Reporter: Edward J. Yoon
> >             Fix For: 0.1
> >
> >
> >  When emitting the point to the collector, it should be keyed by canopyId
> > w/o the computed centroid (or use another key datum instead of hadoop.IO.Text).
>
> --
> This message is automatically generated by JIRA.
> -
> You can reply to this email to add a comment to the issue online.
>
>
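To make Jeff's point about per-mapper canopy ids concrete, here is a minimal plain-Java sketch. It does not use Hadoop and is not Mahout's actual code: the class CanopySketch, the one-dimensional points, and the thresholds T1 and T2 are all assumptions chosen for illustration. Each "mapper" numbers its canopies locally, so the same id can refer to different regions in different splits, and the "reducer" has to re-cluster the emitted centroids rather than join on ids.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the canopy mapper/reducer flow.
final class CanopySketch {
    static final double T1 = 3.0; // loose threshold: point joins the canopy
    static final double T2 = 1.5; // tight threshold: point starts no new canopy

    static final class Canopy {
        final int id;        // only meaningful within one pass over one split
        final double center; // the point that founded the canopy
        double sum;
        int count;

        Canopy(int id, double center) {
            this.id = id;
            this.center = center;
        }

        void add(double p) {
            sum += p;
            count++;
        }

        double centroid() {
            return sum / count;
        }
    }

    // One canopy pass over a list of points (a mapper's split, or the
    // centroids collected by the reducer).
    static List<Canopy> canopies(List<Double> points) {
        List<Canopy> result = new ArrayList<>();
        for (double p : points) {
            boolean stronglyBound = false;
            for (Canopy c : result) {
                double d = Math.abs(p - c.center);
                if (d < T1) {
                    c.add(p);
                }
                if (d < T2) {
                    stronglyBound = true;
                }
            }
            if (!stronglyBound) {
                Canopy c = new Canopy(result.size(), p); // id assigned locally
                c.add(p);
                result.add(c);
            }
        }
        return result;
    }

    static List<Double> centroids(List<Canopy> canopies) {
        List<Double> out = new ArrayList<>();
        for (Canopy c : canopies) {
            out.add(c.centroid());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two splits of the same data, seen by two independent mappers.
        List<Double> fromA = centroids(canopies(List.of(1.0, 1.2, 9.0)));
        List<Double> fromB = centroids(canopies(List.of(9.1, 0.9, 1.1)));

        // The reducer sees every centroid under one shared key and re-clusters.
        List<Double> all = new ArrayList<>(fromA);
        all.addAll(fromB);
        System.out.println("mapper A centroids: " + fromA);
        System.out.println("mapper B centroids: " + fromB);
        System.out.println("reducer centroids:  " + centroids(canopies(all)));
    }
}
```

In this toy run, local id 0 is the canopy near 1 in the first split and the canopy near 9 in the second, so keying the intermediate output by that id would send unrelated centroids to the same reduce call.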


-- 
ted
