There are two issues here:

1. We store all radius neighborhoods of all points in memory at once. This is a problem if each point has a large radius neighborhood. DBSCAN only requires that you store the radius neighbors of the point you are currently examining. We could provide a memory-efficient mode that would do so.
2. Given that we store all neighborhoods at once, a brute force nearest neighbors search will take O(n^2) memory, which can be reduced by chunking the operation.

Both solutions have patches available already, but they have not been reviewed.

On 18 May 2018 at 00:37, Mauricio Reis <rei...@gmail.com> wrote:

> I'm not used to the terms used here. So I understood that the package had
> memory management, which was removed. But you could make the code available
> with memory management implementations. Is it?! :-)
> The problem is that I do not know what I would do with the code, because I
> only know how to work with the SciKitLearn package ready. :-(
>
> Att.,
> Mauricio Reis
>
> 2018-05-16 20:33 GMT-03:00 Joel Nothman <joel.noth...@gmail.com>:
>
>> Implemented in a previous version of #10280
>> <https://github.com/scikit-learn/scikit-learn/pull/10280>, but removed
>> for now to simplify reviews
>> <https://github.com/scikit-learn/scikit-learn/pull/10280#pullrequestreview-95622713>.
>> If others would like to review #10280, I'm happy to follow up with the
>> changes requested here, which have already been implemented by Aman Dalmia
>> and myself.
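
For anyone following along, here is a rough sketch of what the two issues at the top of this message amount to. It uses NumPy only and is not the code in the pending patches; the helper names `radius_neighbors_chunked` and `core_point_mask` are made up for illustration, and Euclidean distance is assumed. The generator computes distances one block of rows at a time (issue 2), and the caller only ever holds the neighborhood of the point it is currently examining (issue 1).

```python
import numpy as np


def radius_neighbors_chunked(X, eps, chunk_size=256):
    """Yield (i, neighbor_indices) for each row i of X.

    Hypothetical helper, not part of scikit-learn. Only a
    (chunk_size, n_samples) block of distances exists at any moment,
    instead of the full (n_samples, n_samples) matrix.
    """
    X = np.asarray(X, dtype=float)
    n_samples = X.shape[0]
    x_sq_norms = (X ** 2).sum(axis=1)
    for start in range(0, n_samples, chunk_size):
        block = X[start:start + chunk_size]
        # Squared Euclidean distances from this block to every point,
        # via ||a||^2 + ||b||^2 - 2 a.b
        sq_dist = (
            (block ** 2).sum(axis=1)[:, None]
            + x_sq_norms[None, :]
            - 2.0 * block @ X.T
        )
        within = sq_dist <= eps ** 2
        for offset, row in enumerate(within):
            # Hand back one neighborhood at a time; nothing is accumulated here.
            yield start + offset, np.flatnonzero(row)


def core_point_mask(X, eps, min_samples, chunk_size=256):
    """Mark DBSCAN-style core points without storing all neighborhoods.

    A point counts as its own neighbor here, so min_samples includes
    the point itself.
    """
    X = np.asarray(X, dtype=float)
    is_core = np.zeros(X.shape[0], dtype=bool)
    for i, neighbors in radius_neighbors_chunked(X, eps, chunk_size):
        is_core[i] = neighbors.size >= min_samples
    return is_core
```

The running time is still quadratic in the number of points; only the peak memory changes, from roughly n^2 distances down to chunk_size * n.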
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn