Could you please check memory usage while running DBSCAN, to confirm that
the freeze is caused by running out of memory and not by something else?
Which parameters do you run DBSCAN with? Changing the algorithm and
leaf_size parameters and ensuring n_jobs=1 could help.
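As a rough sketch of such a check (not a definitive diagnosis), the snippet below runs DBSCAN with explicit algorithm, leaf_size and n_jobs=1 and reports the peak memory seen by the standard-library tracemalloc module. The data, eps and min_samples values are made-up placeholders, and tracemalloc may miss allocations made outside the Python/NumPy allocators, so a system monitor is still worth watching alongside it.

```python
import tracemalloc

import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical stand-in data; replace with the real feature matrix.
rng = np.random.RandomState(0)
X = rng.rand(2000, 5)

tracemalloc.start()

# eps and min_samples are illustrative guesses, not recommendations.
model = DBSCAN(
    eps=0.1,
    min_samples=5,
    algorithm="ball_tree",  # try "kd_tree" or "brute" as well
    leaf_size=30,
    n_jobs=1,               # avoid parallel workers while debugging
)
labels = model.fit_predict(X)

# get_traced_memory() returns (current, peak) in bytes.
_, peak_bytes = tracemalloc.get_traced_memory()
tracemalloc.stop()

n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print("peak traced memory: %.1f MiB" % (peak_bytes / 2 ** 20))
print("clusters found:", n_clusters)
```

Running this on a small subsample first, then on growing subsamples, should show whether memory grows faster than expected before the full data set is tried.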
Assuming eps is reasonable, I think it shouldn't be an issue to run
DBSCAN on L2-normalized data: with the default euclidean metric, this
should produce somewhat similar results to clustering unnormalized
data with metric='cosine', since for unit vectors the squared euclidean
distance equals twice the cosine distance.
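To illustrate the relation between the two setups, here is a minimal sketch on random placeholder data. It checks that, for unit vectors, squared euclidean distance is twice the cosine distance, and then converts a cosine eps into the matching euclidean eps. The data, eps_cosine and min_samples are assumptions chosen only for the example.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.metrics import adjusted_rand_score
from sklearn.preprocessing import normalize

# Hypothetical data; the offset keeps every row away from the zero vector.
rng = np.random.RandomState(0)
X = rng.rand(300, 10) + 0.1
X_unit = normalize(X, norm="l2")

# For unit vectors: ||a - b||^2 = 2 * (1 - cos(a, b)).
cosine_dist = 1.0 - X_unit @ X_unit.T
diff = X_unit[:, None, :] - X_unit[None, :, :]
euclid_sq = (diff ** 2).sum(axis=-1)
identity_holds = np.allclose(euclid_sq, 2.0 * cosine_dist)
print("identity holds:", identity_holds)

# A cosine-distance eps therefore maps to a euclidean eps of sqrt(2 * eps).
eps_cosine = 0.02  # illustrative value only
eps_euclid = np.sqrt(2.0 * eps_cosine)

labels_cos = DBSCAN(eps=eps_cosine, min_samples=5, metric="cosine").fit_predict(X)
labels_euc = DBSCAN(eps=eps_euclid, min_samples=5).fit_predict(X_unit)

# Adjusted Rand index: 1.0 means the two labelings agree exactly.
agreement = adjusted_rand_score(labels_cos, labels_euc)
print("agreement between the two runs:", agreement)
```

If eps is not converted this way, the two runs use different neighborhood sizes and the results can differ noticeably.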
On 13/05/18 00:20, Andrew Nystrom wrote:
If you’re L2-normalizing your data, you’re making it live on the surface of
a hypersphere. That surface will have a high density of points and may
not have areas of low density, in which case the entire surface could be
recognized as a single cluster if epsilon is high enough and the minimum
number of neighbors (min_samples) is low enough. I’d suggest not using
L2 normalization with DBSCAN.
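A small sketch of that effect, on made-up random data: after L2 normalization all points sit on the unit sphere, and with a generous eps and a low min_samples DBSCAN merges them into one cluster. The eps values here are arbitrary and only meant to contrast a large neighborhood with a small one.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import normalize

# Hypothetical data: 500 random points projected onto the unit sphere.
rng = np.random.RandomState(0)
X_unit = normalize(rng.rand(500, 3), norm="l2")


def count_clusters(labels):
    """Number of clusters, ignoring the noise label -1."""
    return len(set(labels)) - (1 if -1 in labels else 0)


# Large eps, low min_samples: neighborhoods overlap across the whole surface.
labels_large = DBSCAN(eps=0.5, min_samples=3).fit_predict(X_unit)

# Much smaller eps for comparison; the outcome depends on the data.
labels_small = DBSCAN(eps=0.05, min_samples=3).fit_predict(X_unit)

print("clusters with eps=0.5: ", count_clusters(labels_large))
print("clusters with eps=0.05:", count_clusters(labels_small))
```

Whether this is a problem in practice depends on how eps compares with the typical spacing of the points on the sphere.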
On Sat, May 12, 2018 at 7:27 AM Mauricio Reis <rei...@gmail.com> wrote:
The DBSCAN "fit" method (in scikit-learn v0.19.1) is freezing my
computer without any warning message!
I am using WinPython 3.6.5 64 bit.
The method works normally with the original data, but freezes when I
use the normalized data (between 0 and 1).
What should I do?
Att.,
Mauricio Reis
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn