Hello J.B.,

Thank you for your quick reply.

If you try with a very small (e.g., 100 sample) data file, does your code
employing MDS work?
As you increase the number of samples, does the script continue to work?
I tried the same script with increasing numbers of samples (100, 1000 and
10000), and it does work without swapping on my workstation.

That is 4,900,000,000 entries, plus overhead for a data structure.
I thought that even 4.9 billion entries of doubles could be processed
with 64G of RAM. Is there anything I need to configure to allow this
computation?

The typical datasets I use can have around 200-300k rows with a few columns
(usually up to 3).
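
As a back-of-the-envelope check, the memory needed for a dense n x n matrix of 8-byte doubles can be estimated as sketched below. The row counts are the ones mentioned in this thread; the premise that the full pairwise distance matrix is held in memory at once is an assumption, and the helper name `dense_matrix_gib` is purely illustrative.

```python
# Rough memory estimate for a dense n x n matrix of 8-byte floats.
# Assumption: the full pairwise distance matrix is materialized in memory.

BYTES_PER_DOUBLE = 8


def dense_matrix_gib(n_samples: int) -> float:
    """Return the size in GiB of a dense n_samples x n_samples float64 matrix."""
    return n_samples * n_samples * BYTES_PER_DOUBLE / 2**30


# Row counts taken from the thread: a small test, dragon.csv, a typical dataset.
for n in (10_000, 69_827, 300_000):
    entries = n * n
    print(f"n={n:>7}: {entries:>15,} entries, {dense_matrix_gib(n):8.1f} GiB")
```

For 69827 rows a single copy already comes to roughly 36 GiB, so a couple of temporaries of the same shape would be enough to exceed 64 GB. At 300k rows one copy alone is several hundred GiB.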

Best regards,

Guillaume

Quoting "Brown J.B. via scikit-learn" <scikit-learn@python.org>:

Hello Guillaume,

You are computing a distance matrix of shape 70000x70000 to generate MDS
coordinates.
That is 4,900,000,000 entries, plus overhead for a data structure.

If you try with a very small (e.g., 100 sample) data file, does your code
employing MDS work?
As you increase the number of samples, does the script continue to work?
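
One way to run this kind of scaling check is sketched below. It is only an illustration: the random points stand in for the real data, and the helper `time_mds`, the subsample sizes, and the MDS settings (`n_init=1`, `max_iter=100`) are assumptions chosen to keep the run short, not values from the original script.

```python
import time

import numpy as np
from sklearn.manifold import MDS


def time_mds(points: np.ndarray, sizes=(100, 200, 400), seed: int = 0):
    """Fit MDS on random subsamples of increasing size and report timings."""
    rng = np.random.default_rng(seed)
    results = []
    for n in sizes:
        # Draw n distinct rows so each run sees a subsample of the same data.
        idx = rng.choice(len(points), size=n, replace=False)
        start = time.perf_counter()
        embedding = MDS(
            n_components=2, n_init=1, max_iter=100, random_state=seed
        ).fit_transform(points[idx])
        elapsed = time.perf_counter() - start
        results.append((n, embedding.shape, elapsed))
        print(f"n={n:>6}: embedding {embedding.shape}, {elapsed:.2f} s")
    return results


# Stand-in data: random (x, y, z) points, NOT the dragon.csv from this thread.
points = np.random.default_rng(0).random((1_000, 3))
results = time_mds(points)
```

With real data, the sizes can be raised step by step while watching memory use, which should show where the run stops being feasible.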

Hope this helps you get started.
J.B.

On Tue, Oct 9, 2018 at 18:22, Guillaume Favelier <guillaume.favel...@lip6.fr> wrote:

Hi everyone,

I'm trying to use a dimension reduction algorithm [1] on my dataset [2] in a
Python script [3], but Python consumes most of my main memory and even
starts swapping on my configuration [4], so instead of the expected result
I get a memory error.

I have the impression that this behaviour is not intended. Could you help me
find out what I did wrong or what I missed, please?

[1]: MDS -
http://scikit-learn.org/stable/modules/generated/sklearn.manifold.MDS.html
[2]: dragon.csv - 69827 rows, 3 columns (x,y,z)
[3]: dragon.py - 10 lines
[4]: dragon_swap.png - htop on my workstation

TAR archive:
https://drive.google.com/open?id=1d1S99XeI7wNEq131wkBUCBrctPQRgpxn

Best regards,

Guillaume Favelier

_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn



