On Mon, Dec 04, 2023 at 01:19:42PM +0200, Ryan Mitchley wrote:
> Hi all,
> 
> I am aware of some options in Armadillo for Gaussian Mixture Model
> clustering. Is anyone aware of performant algorithms (in mlpack or
> elsewhere) for iterative / online clustering (also called streaming
> clustering)? My interest is in iterative cluster estimation with
> downdating of samples (i.e. data expires).
> 
> This particular combination of requirements has seemed to be challenging.
> 
> I am aware of xokde++, which seems to be very promising (online-KDE):
> https://arxiv.org/abs/1606.02608
> When I examined the associated code, though, it seemed to be very much a
> research demonstrator artifact. It looks like it needs a fair amount of
> development and refinement.

Hey Ryan,

I don't know if this satisfies your requirements fully, but the GMM
class in mlpack does have the option to use an existing model as a
starting point for training.  So although it may not be the most
efficient way to do things, you could imagine training a model on your
original data, then removing some of the original data that has expired,
adding new data, and then training again.  It isn't *quite* online GMMs
in the way you were thinking, but it might manage to at least be
something in the right direction.
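
To make the idea concrete, here is a rough sketch of the
expire / append / retrain loop.  It deliberately does not use mlpack: it is
a tiny 1-D, two-component EM in plain C++ so that it stands alone, and the
names `Gmm1D`, `emFit()`, `appendBatch()` and `slidingWindowDemo()` are made
up purely for illustration.  In mlpack itself the corresponding knob is, as
far as I recall, the `useExistingModel` argument to `GMM::Train()`.  The
data, window size and iteration counts are arbitrary assumptions.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical 1-D Gaussian mixture; NOT mlpack's GMM class.
struct Gmm1D {
    std::vector<double> weight, mean, var;
};

// Run EM starting from whatever parameters `model` already holds.
// Calling this again after the data changes is the "warm start".
void emFit(Gmm1D& model, const std::deque<double>& data, int iterations) {
    assert(!data.empty());
    const std::size_t k = model.mean.size();
    const double pi = std::acos(-1.0);
    std::vector<double> resp(k);
    for (int it = 0; it < iterations; ++it) {
        std::vector<double> sumR(k, 0.0), sumX(k, 0.0), sumXX(k, 0.0);
        double sumAll = 0.0;
        for (double x : data) {
            // E-step: unnormalized responsibility of each component for x.
            double total = 0.0;
            for (std::size_t j = 0; j < k; ++j) {
                const double d = x - model.mean[j];
                resp[j] = model.weight[j] *
                          std::exp(-0.5 * d * d / model.var[j]) /
                          std::sqrt(2.0 * pi * model.var[j]);
                total += resp[j];
            }
            if (total <= 0.0) continue;  // underflow: point far from everything
            for (std::size_t j = 0; j < k; ++j) {
                const double r = resp[j] / total;
                sumR[j] += r;
                sumX[j] += r * x;
                sumXX[j] += r * x * x;
                sumAll += r;
            }
        }
        // M-step: re-estimate weights, means and variances.
        for (std::size_t j = 0; j < k; ++j) {
            if (sumR[j] < 1e-9) continue;  // leave an empty component as-is
            model.weight[j] = sumR[j] / sumAll;
            model.mean[j] = sumX[j] / sumR[j];
            const double v = sumXX[j] / sumR[j] - model.mean[j] * model.mean[j];
            model.var[j] = v > 1e-6 ? v : 1e-6;  // floor to avoid collapse
        }
    }
}

// Deterministic stand-in for a data stream: alternates between a cluster
// at 0 and a cluster at `secondCenter`, with a small fixed jitter pattern.
void appendBatch(std::deque<double>& window, double secondCenter, int count) {
    const double jitter[5] = {-0.5, -0.25, 0.0, 0.25, 0.5};
    for (int i = 0; i < count; ++i)
        window.push_back((i % 2 == 0 ? 0.0 : secondCenter) + jitter[i % 5]);
}

Gmm1D slidingWindowDemo() {
    std::deque<double> window;
    appendBatch(window, 10.0, 40);                    // original data
    Gmm1D model{{0.5, 0.5}, {1.0, 8.0}, {4.0, 4.0}};  // rough initial guess
    emFit(model, window, 20);                         // initial training
    for (int batch = 0; batch < 2; ++batch) {
        window.erase(window.begin(), window.begin() + 20);  // expire oldest
        appendBatch(window, 12.0, 20);  // new data; second cluster has moved
        emFit(model, window, 20);       // retrain from the existing model
    }
    return model;
}
```

Calling `slidingWindowDemo()` returns the model after the old samples have
fully expired; the second component should have followed the cluster from
around 10 to around 12.  Note that each retraining pass still touches every
sample in the window, so this is warm-started batch EM, not a true
incremental update with downdating.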

Hope that helps.  At least personally I don't know xokde++, but perhaps
some of the ideas there could be adapted and cleaned up into production
code too.

Thanks,

Ryan

-- 
Ryan Curtin    | "Don't fight it son. Confess quickly! If you hold out too
[email protected] | long you could jeopardize your credit rating."  - Guard
_______________________________________________
mlpack mailing list -- [email protected]
To unsubscribe send an email to [email protected]