So what are you guys doing to get from an unpredictable number of
canopies to a 'k' value for k-means and an initial assignment of each
item to one cluster?


On Thu, Jun 11, 2009 at 12:49 PM, Adil Aijaz<[email protected]> wrote:
> Jeff,
>
> Thanks for the quick turnaround on this issue. Just tested it and the canopy
> creation and kmeans both work now on syntheticcontroldata. I get 7 canopies
> and 7 clusters. Collection logic in close() is not pretty but can't think of
> a workaround myself.
>
> adil
>
> Jeff Eastman wrote:
>>
>> r783617 removed the CanopyCombiner and refactored its semantics back into
>> the reducer. Updated unit tests pass and Synthetic Control with Canopy
>> produces 6 clusters. Kmeans also runs and produces 6 clusters. I really
>> don't like doing stuff in close() but see no practical alternative. Ideas
>> are still welcomed.
>>
>> Jeff
>>
>>
>> Jeff Eastman wrote:
>>>
>>> Adil Aijaz wrote:
>>>>
>>>> 2. There is a bug in
>>>> examples/src/main/java/org/apache/mahout/clustering/syntheticcontrol/kmeans/Job.java
>>>> that called runJob from main function with my provided arguments
>>>> transposed.
>>>> So, my convergenceDelta was interpreted as t1, t1 as t2, and t2 as
>>>> convergenceDelta. I will commit a patch as soon as I get approval for
>>>> opensource commits from my employer, however, I thought I'd put it out
>>>> there in case someone else is going through the same issue.
>>>>
>>> r783585 fixed the parameter ordering bug. Still working on the Combiner
>>> problem.
>>>
>>> Thanks Adil,
>>> Jeff
>>>
>>>
>>
>
>
