>
>
> In this case, the document needs to be paired with the nearest cluster,
> right? Something like Canopy clustering should give a partially connected
> graph.
>

Just populate similarity values for pairs of documents that share a canopy.
The graph is very sparse but still connected, due to the overlapping nature
of canopy clustering.
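
A minimal sketch of that idea in Python, not Mahout code. The function
names, the Euclidean distance, the 1/(1+d) similarity, and the thresholds
t1 > t2 are all assumptions for illustration. Whether the resulting graph
ends up connected depends on how much the canopies overlap for the chosen
thresholds.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def canopies(points, t1, t2, distance=euclidean):
    """Return canopies as lists of point indices (loose t1 > tight t2 > 0)."""
    if not t1 > t2 > 0:
        raise ValueError("thresholds must satisfy t1 > t2 > 0")
    remaining = list(range(len(points)))
    result = []
    while remaining:
        center = remaining[0]
        # Every remaining point within the loose threshold joins the canopy.
        members = [i for i in remaining
                   if distance(points[center], points[i]) < t1]
        result.append(members)
        # Only points within the tight threshold are removed, so points
        # between t2 and t1 can show up again in later canopies (overlap).
        remaining = [i for i in remaining
                     if distance(points[center], points[i]) >= t2]
    return result


def sparse_similarity(points, canopy_list, distance=euclidean):
    """Similarity for pairs sharing a canopy; all other pairs are absent."""
    sim = {}
    for members in canopy_list:
        for i, j in combinations(sorted(members), 2):
            if (i, j) not in sim:
                # Hypothetical distance-to-similarity mapping: 1 / (1 + d).
                sim[(i, j)] = 1.0 / (1.0 + distance(points[i], points[j]))
    return sim


if __name__ == "__main__":
    docs = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
    groups = canopies(docs, t1=1.5, t2=0.5)
    for pair, value in sorted(sparse_similarity(docs, groups).items()):
        print(pair, round(value, 3))
```

Only pairs inside a shared canopy get an entry, so the number of similarity
computations is bounded by the canopy sizes rather than by the square of the
number of documents.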

>
> Robin
>
>
>
>
>>  On Mon, Aug 16, 2010 at 7:00 AM, Robin Anil <[email protected]>
>> wrote:
>>
>> > From a GSOC angle, it needn't be done; it's up to your mentor to decide.
>> > I am more interested in getting this completed and pushed out so that
>> > people can really use it. If you can spare time after GSOC, stay around
>> > the community, and help get this polished, that would be great.
>> >
>> > To create your pairwise similarity matrix (values in 0-1, where 1 means
>> > dissimilar; it can be the other way around as well), see the
>> > DistanceMeasure implementations. Creating the pairwise matrix is
>> > non-trivial from a scalability standpoint.
>> >
>> > A complete spectral clustering package should take an input set of
>> > documents, create the matrix, run clustering, and output the clusters.
>> > To get an idea of your work so far, which blocks are missing from this
>> > ideal package scenario?
>> >
>> >
>> > Robin
>> >
>>
>
>
