This post is related to my earlier post
<http://sourceforge.net/mailarchive/message.php?msg_id=31727995> on the
mailing list today about using KMeans on large 1D data.

I just read this answer <http://stats.stackexchange.com/a/40475/22441> on
Cross Validated, which suggests using Kernel Density Estimation for 1D
clustering. I tried my hand at it, and from the looks of it the result seems
correct. According to this answer, the local maxima of the density estimate
should give the locations of the cluster centers. Having also read Jake's
recent post on KDE implementations and the post he cited, KDE looks well
suited to the job. Does anyone have an opinion on this?

Does the KDE implementation expose the locations and values of these optima?
I could not find a way to get them.
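
For what it's worth, here is a minimal sketch of how such optima could be read off a KDE by hand, assuming the estimator meant is KernelDensity from sklearn.neighbors. As far as I can tell it only returns log-densities through score_samples, so the sketch evaluates it on a grid and looks for local extrema with scipy.signal.argrelextrema. The helper name kde_modes, the bandwidth of 5.0, and the synthetic data are placeholders of mine, not something taken from the linked answer.

```python
import numpy as np
from scipy.signal import argrelextrema
from sklearn.neighbors import KernelDensity


def kde_modes(x, bandwidth, n_grid=2000):
    """Return (maxima, minima) locations of a Gaussian KDE of 1D data.

    Hypothetical helper: the grid size and padding are arbitrary choices.
    """
    x = np.asarray(x, dtype=float)
    pad = 3 * bandwidth
    grid = np.linspace(x.min() - pad, x.max() + pad, n_grid)

    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth)
    kde.fit(x[:, None])  # the estimator expects a 2D array
    # score_samples returns the log-density; the log is monotone, so its
    # extrema sit at the same grid points and it avoids underflow in the tails.
    log_density = kde.score_samples(grid[:, None])

    maxima = argrelextrema(log_density, np.greater)[0]
    minima = argrelextrema(log_density, np.less)[0]
    return grid[maxima], grid[minima]


# Synthetic stand-in for appliance power readings with three states.
rng = np.random.default_rng(0)
x = np.concatenate([
    rng.normal(0, 1, 500),
    rng.normal(50, 2, 300),
    rng.normal(200, 2, 100),
])

centers, boundaries = kde_modes(x, bandwidth=5.0)
# Each point gets the index of the interval between minima it falls into.
labels = np.searchsorted(boundaries, x)
print(centers, boundaries, np.bincount(labels))
```

The maxima act as cluster centers and the minima between them as split points, so the outcome depends heavily on the chosen bandwidth.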

I therefore resorted, for now, to a quick-fix solution from here
<http://stackoverflow.com/questions/3986345/how-to-find-the-local-minima-of-a-smooth-multidimensional-array-in-numpy-efficie>,
which detects more optima than it ideally should.

The plot below shows the KDE evaluation (in red) and the local optima found
by the above method (in blue). NB: I changed local_minima to local_maxima.
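
The extra optima are presumably small ripples in the estimated density. One possible way to drop them, sketched here as an assumption rather than something I have validated on the REDD data, is to keep only the maxima whose prominence exceeds a fraction of the tallest peak, using scipy.signal.find_peaks. The helper name prominent_maxima, the 5% threshold, and the synthetic curve standing in for the KDE are arbitrary choices of mine.

```python
import numpy as np
from scipy.signal import find_peaks


def prominent_maxima(grid, density, min_rel_prominence=0.05):
    """Keep local maxima whose prominence is at least a fraction of the peak.

    Hypothetical helper: the 5% default is an arbitrary starting point.
    """
    threshold = min_rel_prominence * density.max()
    peaks, props = find_peaks(density, prominence=threshold)
    return grid[peaks], props["prominences"]


# Synthetic stand-in for a wiggly KDE: two real bumps plus a small ripple.
grid = np.linspace(0, 100, 2001)
smooth = (np.exp(-0.5 * ((grid - 20) / 3) ** 2)
          + 0.6 * np.exp(-0.5 * ((grid - 70) / 4) ** 2))
wiggly = smooth + 0.01 * np.sin(2.0 * grid)

all_peaks, _ = find_peaks(wiggly)             # every local maximum
centers, prominences = prominent_maxima(grid, wiggly)
print(len(all_peaks), centers)
```

A larger bandwidth in the KDE itself would also smooth out the ripples, at the cost of possibly merging nearby states.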

[image: Inline image 1]






On Thu, Mar 7, 2013 at 7:52 PM, nipun batra <nipunredde...@gmail.com> wrote:

> Hi,
> Sorry for the late reply. I was playing with the REDD
> <http://redd.csail.mit.edu/> dataset for Non-Intrusive Load Monitoring. I ran
> into two issues with k-means:
>
>    1. Because of the random initialization inherent to the algorithm, I
>    keep ending up with different clusters across runs. These clusters
>    correspond to different states of an electrical appliance (ON, OFF,
>    heater on, etc.).
>    2. When the number of points is very large, it tends to ignore states
>    with relatively few points.
>
> I think that with careful manual intervention I can figure out the right
> cluster centroids (among the ones generated on re-runs). But in general
> it would be good to have something more specific for 1D data.
>
> The Jenks method looks very close to K-Means. I haven't tried it yet; I
> wanted to know the intuition behind why it could be better than K-Means.
>
> On Tue, Mar 5, 2013 at 8:45 PM, Ronnie Ghose <ronnie.gh...@gmail.com> wrote:
>
>> interesting posts :).
>>
>> so
>> 1) do we want a natural breaks method?
>> https://en.wikipedia.org/wiki/Jenks_natural_breaks_optimization
>> 2) have you considered looking at the distribution of the variable as
>> they suggest? any small-d tends to allow this rather than the usual giant-d
>> space.
>>
>> Do you have any general data set you could release w.r.t. this variable
>> nipun? the first question has very clear breaks if you use a histogram alone
>>
>>
>>
>>
>> On Tue, Mar 5, 2013 at 9:59 AM, nipun batra <nipunredde...@gmail.com> wrote:
>>
>>> It should. I would have straight away tried it, but read the following 2
>>> posts:
>>>
>>>    1. http://stackoverflow.com/questions/11513484/1d-number-array-clustering
>>>    2. http://stats.stackexchange.com/questions/13781/clustering-1d-data
>>>
>>> Any thoughts?
>>>
>>> On Tue, Mar 5, 2013 at 8:24 PM, Ronnie Ghose <ronnie.gh...@gmail.com> wrote:
>>>
>>>> ..........does kmeans not work?
>>>>
>>>>
>>>> On Tue, Mar 5, 2013 at 9:51 AM, nipun batra <nipunredde...@gmail.com> wrote:
>>>>
>>>>> Hi,
>>>>> What clustering technique (with implementation in sklearn) is
>>>>> recommended for 1d data?
>>>>>
>>>>>
>>>>> ------------------------------------------------------------------------------
>>>>> Everyone hates slow websites. So do we.
>>>>> Make your web apps faster with AppDynamics
>>>>> Download AppDynamics Lite for free today:
>>>>> http://p.sf.net/sfu/appdyn_d2d_feb
>>>>> _______________________________________________
>>>>> Scikit-learn-general mailing list
>>>>> Scikit-learn-general@lists.sourceforge.net
>>>>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>

<<kde.png>>

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
