Yes, feature extraction is exactly the point.

First and second derivatives are the approach I'm pursuing right now. Do you know of a library to calculate the roots of the derivatives?

I found that apache.commons.math can do something like this, but the documentation on how to use the solvers is quite poor.
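
For sampled data with only a handful of points, the roots of the first derivative can be approximated without a solver library: take finite differences and look for sign changes. The following is a minimal Java sketch of that idea, not the Apache Commons Math API. The class and method names (DerivativeRoots, firstDerivative, midpoints, roots) are made up for illustration, and it assumes strictly increasing x values.

```java
import java.util.ArrayList;
import java.util.List;

final class DerivativeRoots {

    /** Forward differences: d[i] = (y[i+1] - y[i]) / (x[i+1] - x[i]). */
    static double[] firstDerivative(double[] x, double[] y) {
        double[] d = new double[y.length - 1];
        for (int i = 0; i < d.length; i++) {
            d[i] = (y[i + 1] - y[i]) / (x[i + 1] - x[i]);
        }
        return d;
    }

    /** Each forward difference is best attributed to the midpoint of its interval. */
    static double[] midpoints(double[] x) {
        double[] m = new double[x.length - 1];
        for (int i = 0; i < m.length; i++) {
            m[i] = 0.5 * (x[i] + x[i + 1]);
        }
        return m;
    }

    /**
     * Positions where the sampled values f change sign, located by linear
     * interpolation between neighbouring samples. Exact zeros are reported
     * at their own position, so a flat stretch yields several entries.
     */
    static List<Double> roots(double[] x, double[] f) {
        List<Double> found = new ArrayList<>();
        for (int i = 0; i + 1 < f.length; i++) {
            if (f[i] == 0.0) {
                found.add(x[i]);
            } else if (f[i] * f[i + 1] < 0.0) {
                double t = f[i] / (f[i] - f[i + 1]);
                found.add(x[i] + t * (x[i + 1] - x[i]));
            }
        }
        if (f.length > 0 && f[f.length - 1] == 0.0) {
            found.add(x[f.length - 1]);
        }
        return found;
    }
}
```

For y = -(x - 2)^2 sampled at x = 0, 1, 2, 3, 4, calling roots(midpoints(x), firstDerivative(x, y)) gives a single root at x = 2, the position of the maximum. Applying the same two steps to the first derivative gives the roots of the second derivative. A solver library becomes relevant once a continuous function (for example a fitted polynomial or spline) is available instead of raw samples.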

On 08/16/2011 08:18 PM, Ted Dunning wrote:
OK.

This is closer to time series analysis, even if the horizontal axis
isn't time.

You need to extract features from these graphs before doing clustering.
Something like the extreme values of the smoothed second derivative might be
useful.  Spectral or cepstral features might be useful as well, but with so
few data points, these probably devolve to something very much like the
smoothed derivatives.
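
The smoothed-second-derivative features suggested above could be computed roughly as follows. This is only a sketch under stated assumptions: evenly spaced points (so the second difference is proportional to the second derivative), a plain moving average as the smoother, and hypothetical names (CurvatureFeatures, movingAverage, secondDifference, minMaxCurvature). The window half-width is a tuning parameter that is not given in this thread.

```java
final class CurvatureFeatures {

    /** Centered moving average; the window shrinks at the borders. */
    static double[] movingAverage(double[] y, int halfWidth) {
        double[] s = new double[y.length];
        for (int i = 0; i < y.length; i++) {
            int lo = Math.max(0, i - halfWidth);
            int hi = Math.min(y.length - 1, i + halfWidth);
            double sum = 0.0;
            for (int j = lo; j <= hi; j++) {
                sum += y[j];
            }
            s[i] = sum / (hi - lo + 1);
        }
        return s;
    }

    /** Central second difference y[i-1] - 2*y[i] + y[i+1] (unit spacing assumed). */
    static double[] secondDifference(double[] y) {
        if (y.length < 3) {
            return new double[0];
        }
        double[] d2 = new double[y.length - 2];
        for (int i = 1; i + 1 < y.length; i++) {
            d2[i - 1] = y[i - 1] - 2.0 * y[i] + y[i + 1];
        }
        return d2;
    }

    /** Feature vector {min, max} of the smoothed second difference. */
    static double[] minMaxCurvature(double[] y, int halfWidth) {
        if (y.length < 3) {
            throw new IllegalArgumentException("need at least 3 points");
        }
        double[] d2 = secondDifference(movingAverage(y, halfWidth));
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : d2) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return new double[] {min, max};
    }
}
```

Each graph then reduces to a two-element vector that a clustering algorithm can work on. A flat series gives {0, 0}, while a single arch gives a clearly negative minimum; more features (for example the positions of the extremes) can be appended in the same way.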

On Tue, Aug 16, 2011 at 1:47 AM, Alexander Kerner wrote:

Hello Ted,

thanks for your help!

To give you more details:
Clustering in this case is closer to pattern recognition:

For the first graph, I am looking for the following pattern:

     *   *
   *          *
*                   *

For the second graph, I basically want the following "pattern":

*  *  *  *  *  *

What I want to detect now is "overlaid" data or, at the most basic level,
just deviations from the expected pattern:

e.g.:

first case:


    *  *       *
  *       *  *   *
*

should be clustered into two groups

second case:

*  *  *
           *  *  *

should also be clustered into two groups.
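
For the second case, one very simple feature is the position of the largest jump between neighbouring points. A sketch, assuming a single step in otherwise flat data; the class and method names are hypothetical and not taken from any library:

```java
final class StepSplit {

    /**
     * Returns the index i for which |y[i+1] - y[i]| is largest.
     * The series then splits into the groups [0..i] and [i+1..n-1].
     */
    static int largestJumpIndex(double[] y) {
        if (y.length < 2) {
            throw new IllegalArgumentException("need at least 2 points");
        }
        int best = 0;
        double bestJump = Math.abs(y[1] - y[0]);
        for (int i = 1; i + 1 < y.length; i++) {
            double jump = Math.abs(y[i + 1] - y[i]);
            if (jump > bestJump) {
                bestJump = jump;
                best = i;
            }
        }
        return best;
    }
}
```

For the values {5, 5, 5, 2, 2, 2} the returned index is 2, which separates the first three points from the last three. Whether a jump is large enough to count as a real split still needs a threshold chosen for the data at hand.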

In general, I am working with very few data points (5 - 50).

I hope this makes it a bit clearer.

Many thanks,
Alex


On 08/15/2011 05:08 PM, Ted Dunning wrote:

Well, Weka still stands as an option.  And frankly, you can call R from
Java pretty easily.

But more importantly, *experimenting* with these alternatives doesn't need
to be in Java.  You can noodle around with all the clustering algorithms
in the world, select one and port it into Java or find an implementation.

And if you don't describe your problem in a bit more detail, we can't help
you.  Clustering specifically and machine learning in general are domain
dependent.

Your graphs don't explain what your data is, what you are trying to do,
what results you expect to get, or why you don't like the results you are
getting.  It is not obvious.

I note that I asked essentially these same questions 10 days ago.

On Sun, Aug 14, 2011 at 11:35 PM, Alexander Kerner wrote:

Matlab or R is not an option, since I need to integrate this clustering
into an existing Java program.

On 08/05/2011 06:02 PM, Jeff Eastman wrote:

You may be better off experimenting with Weka (or MatLab or R) to try out
various clustering algorithms on your data. Unless you have billions of
points this sort of low-dimension clustering can all be done in memory
and you don't need Mahout.


-----Original Message-----
From: Alexander Kerner [mailto:a.kerner@dkfz-heidelberg.de]
Sent: Friday, August 05, 2011 7:28 AM
To: [email protected]
Subject: Re: Clustering Data

Here is a link:

Clustering data <http://kerner.cc/box.tightening.challenges.png>

On 08/05/2011 02:31 PM, Sean Owen wrote:

(Attachments don't come through on apache.org mailing lists. Can you post
it elsewhere, or describe it?)

On Fri, Aug 5, 2011 at 1:30 PM, Alexander Kerner wrote:

     Hi all,

     I would like to cluster the following data (see attached picture) into
     three groups (light blue, dark blue, black).
     Can I use Apache Mahout for this? I want to integrate clustering
     within my existing Java application.
     What algorithm would I need to use and how do I set this up
     programmatically?

     Many thanks,
     Alex




--
Alexander Kerner
PhD Student

Division of Stem Cells and Cancer A010
German Cancer Research Center, DKFZ
and
Heidelberg Institute for Stem Cell Technology
and Experimental Medicine
HI-STEM GmbH

Neuenheimer Feld 280
69120 Heidelberg

Tel.: +49(0)6221/42-3922
Fax: +49(0)6221/42-3902

Email: [email protected]


