Yeah, I hear you there. I have a project I am working on that will
require me to generate examples, but it is a couple of weeks away.
The gene expression stuff is great. Text-based ones would be really
cool too. I haven't done too much clustering work (other than using
Dawid's excellent Carrot2 project), so it is a learning experience for
me, and demos and tutorials would be great.
-Grant
On Mar 18, 2008, at 5:31 AM, Dawid Weiss wrote:
This is absolutely necessary, if not for just showing off with the
project, then certainly for verification of correctness of
algorithms inside it.
I will certainly pitch in on such a subtask as far as my currently
available time allows (not much, sadly).
D.
Grant Ingersoll wrote:
Now that we have some code in place for clustering, I think it
would be cool to put together some examples/demos of real-world
problems. Things like clustering text (perhaps we can use the
Wikipedia download or the Reuters download that Lucene's
contrib/benchmark uses) or clustering other pieces of data.
We could set up a demo area of code and use Lucene's analysis code
to create document vectors.
Ideas and/or thoughts or volunteers?
Cheers,
Grant
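
For reference, the document-vector idea from the original message could be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not Lucene's API or any code from the project: the class name TermVectorSketch and its methods are hypothetical, and a plain lowercase-and-split tokenizer stands in for a real Lucene analyzer. Each text becomes a term-frequency map, and cosine similarity is one common way a clustering algorithm might compare two such vectors.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch: names here are illustrative, not part of Lucene or the project.
public class TermVectorSketch {

    // Stand-in for an analyzer: lowercase the text and split on anything
    // that is not a letter or digit, then count each remaining token.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                tf.merge(token, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Euclidean length of a sparse term-frequency vector.
    static double norm(Map<String, Integer> v) {
        double sum = 0.0;
        for (int x : v.values()) {
            sum += (double) x * x;
        }
        return Math.sqrt(sum);
    }

    // Cosine similarity between two sparse vectors; 0.0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0.0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += (double) e.getValue() * other;
            }
        }
        double na = norm(a);
        double nb = norm(b);
        return (na == 0.0 || nb == 0.0) ? 0.0 : dot / (na * nb);
    }

    public static void main(String[] args) {
        Map<String, Integer> d1 = termFrequencies("Clustering text with Lucene");
        Map<String, Integer> d2 = termFrequencies("Clustering gene expression data");
        System.out.println(d1);
        System.out.println(cosine(d1, d2));
    }
}
```

In a real demo the tokenizer would presumably be replaced by one of Lucene's analyzers (stemming, stop words and so on), and raw counts by some weighting such as TF-IDF; both are left out here to keep the sketch self-contained.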