Solving this issue in a generic way would be nice:
https://github.com/scikit-learn/scikit-learn/issues/325
On Mon, Jul 30, 2012 at 6:43 PM, Olivier Grisel <[email protected]> wrote:
> Actually I think the KNeighborsClassifier implementation in
> scikit-learn has a real memory usage issue in "brute" mode (which
> is selected).
>
> I suspect it is materializing the whole (n_samples_train,
> n_samples_predict) distances array in memory before computing the
> (n_samples_predict * k) minimum values.
>
> When both n_samples_train and n_samples_predict are big, this is an issue.
>
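To put a rough number on that, here is a back-of-the-envelope sketch. It assumes the distances are stored as a dense float64 array (8 bytes per entry), which is my assumption and not something confirmed from the implementation. The helper name and the sample sizes are made up for illustration.

```python
def brute_distance_bytes(n_samples_train, n_samples_predict, itemsize=8):
    """Bytes needed for a dense (n_samples_train, n_samples_predict) array.

    itemsize=8 assumes float64 entries (an assumption, not a confirmed detail).
    """
    return n_samples_train * n_samples_predict * itemsize


# Hypothetical sizes, for illustration only.
n_bytes = brute_distance_bytes(100_000, 50_000)
print("%.1f GB" % (n_bytes / 1e9))
```

The cost grows with the product of the two sample counts, so it becomes a problem only when both are large.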
> This could be worked around by chunking the data argument of the
> predict calls instead.
>
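A minimal sketch of that workaround on the user side, assuming only that the estimator follows the usual scikit-learn predict(X) convention. The helper name predict_in_chunks and the chunk_size default are hypothetical, not part of scikit-learn.

```python
import numpy as np


def predict_in_chunks(estimator, X, chunk_size=1000):
    """Call estimator.predict on consecutive row slices of X and join the results.

    Hypothetical helper: works with any object exposing predict(X).
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    X = np.asarray(X)
    if X.shape[0] == 0:
        # Nothing to split; let the estimator handle the empty input itself.
        return estimator.predict(X)
    predictions = [
        estimator.predict(X[start:start + chunk_size])
        for start in range(0, X.shape[0], chunk_size)
    ]
    # Each piece holds the predictions for one slice, in the original row order.
    return np.concatenate(predictions)
```

If the suspicion above is right, each predict call would then only need about (n_samples_train, chunk_size) distances at a time, so chunk_size trades peak memory against the number of calls. Dense array input is assumed here; sparse input would need different handling.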
> --
> Olivier
> http://twitter.com/ogrisel - http://github.com/ogrisel
>
>
> ------------------------------------------------------------------------------
> Live Security Virtual Conference
> Exclusive live event will cover all the ways today's security and
> threat landscape has changed and how IT managers can respond. Discussions
> will include endpoint security, mobile security and the latest in malware
> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>