Yes, the evidence accumulation approach is simple yet popular, and it
can be built on top of the methods already in sklearn. However, I have a
concern about the current implementation of agglomerative hierarchical
clustering. For this approach I would need to obtain the final partition
by inspecting the tree and specifying the height at which to cut the
dendrogram, not the number of clusters (I see there is an open issue
<https://github.com/scikit-learn/scikit-learn/issues/3796> about this).
That way the approach does not require the number of clusters to be
specified in advance.
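
To make the idea concrete, here is a minimal sketch of evidence
accumulation as I understand it, not a proposed implementation. It assumes
KMeans with a random number of clusters as the ensemble members and uses
SciPy for the hierarchical step, since that lets me cut the dendrogram by
height. The function names (co_association, eac_partition) and the
parameters (n_runs, k_range, height) are hypothetical, and single linkage
is just my choice for the sketch.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs


def co_association(X, n_runs=30, k_range=(5, 15), seed=0):
    """Fraction of runs in which each pair of samples shares a cluster."""
    rng = np.random.RandomState(seed)
    n = X.shape[0]
    coassoc = np.zeros((n, n))
    for _ in range(n_runs):
        # Vary k and the initialization to get a diverse ensemble.
        k = rng.randint(k_range[0], k_range[1] + 1)
        km = KMeans(n_clusters=k, n_init=1,
                    random_state=rng.randint(2**31 - 1))
        labels = km.fit_predict(X)
        # Add 1 for every pair of samples placed in the same cluster.
        coassoc += labels[:, None] == labels[None, :]
    return coassoc / n_runs


def eac_partition(X, height=0.5, **kwargs):
    """Cut the single-link dendrogram of (1 - co-association) at `height`."""
    dist = 1.0 - co_association(X, **kwargs)
    np.fill_diagonal(dist, 0.0)
    Z = linkage(squareform(dist, checks=False), method="single")
    # criterion="distance" cuts by height, so no cluster count is given.
    return fcluster(Z, t=height, criterion="distance")


X, _ = make_blobs(n_samples=150, centers=3, cluster_std=0.5, random_state=0)
labels = eac_partition(X, height=0.5)
print(np.unique(labels).size, "clusters found")
```

The number of clusters falls out of the cut height rather than being an
input, which is the behaviour I would like to have available in sklearn.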
Ronnie, these methods are applied in a wide range of areas, such as
bioinformatics, business and software engineering, among others. You can
search for "consensus clustering", "cluster ensemble" or "clustering
aggregation". There are good accuracy/performance reports (in general, an
ensemble method outperforms its individual members), as well as articles
<http://www.nature.com/srep/2014/140827/srep06207/full/srep06207.html> that
question its benefits (although this one does not seem to take into
account the diversity of the input ensemble, which is essential to get good
results).
Regarding GSoC, I have never participated before. I read that it is a
full-time job (40 hours a week). Unfortunately, I can't commit to that now,
so I think it would be better to contribute outside the program. Sorry, I
had the wrong idea of it and should have read up on it first.
Regards,
Milton.
2015-02-22 19:02 GMT-03:00 Andy <t3k...@gmail.com>:
> the paper is quite well cited (500):
> http://scholar.google.com/scholar?q=Combining%20multiple%20clusterings%20using%20evidence%20accumulation&btnG=Search&as_sdt=800000000001&as_sdtp=on
>
> I thought the idea was to add (some of) the ensemble methods described in
> the paper, which are meta-algorithms that could build on any of the methods
> we already have,
> as far as I understand.
>
>
>
> On 02/18/2015 04:06 AM, Ronnie Ghose wrote:
>
> is there clear use for this clustering method and a sizable number of
> citations and obviously performance/accuracy/something benefits that
> warrant the time & maintenance cost then?
>
> Also by that @andreas, I'm not seeing a list of clustering methods to be
> added, right now it seems unbounded - i don't like unbounded scopes, they
> don't make for good project topics
>
>
>
> On Wed, Feb 18, 2015 at 5:22 AM, Gael Varoquaux <
> gael.varoqu...@normalesup.org> wrote:
>
>> On Tue, Feb 17, 2015 at 04:42:11PM -0800, Andy wrote:
>> > On 02/13/2015 07:08 AM, Ronnie Ghose wrote:
>> > > -1 we would have to build in support for more clustering methods,
>> > > sounds like a not-very-standalone proj
>> > Why? We already have a bunch, right?
>>
>> I agree with Andreas that any addition should be motivated: the new
>> clustering method should bring something to the existing ones. It should
>> be different in some way, and have a clear benefit (pointing to a paper
>> isn't enough to demonstrate a benefit, the benefit should be easy to
>> explain and demonstrated many times).
>>
>> Gaël
>>
>>
>> ------------------------------------------------------------------------------
>> Download BIRT iHub F-Type - The Free Enterprise-Grade BIRT Server
>> from Actuate! Instantly Supercharge Your Business Reports and Dashboards
>> with Interactivity, Sharing, Native Excel Exports, App Integration & more
>> Get technology previously reserved for billion-dollar corporations, FREE
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=190641631&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
--
Milton Pividori
Blog: www.miltonpividori.com.ar
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general