Hi Spark Devs,

An idea came out of a recent scikit-learn mailing list discussion (
http://sourceforge.net/mailarchive/forum.php?thread_name=CAFvE7K5HGKYH9Myp7imrJ-nU%3DpJgeGqcCn3JC0m4MmGWZi35Hw%40mail.gmail.com&forum_name=scikit-learn-general)
to hold a coding sprint around Strata in February, focused on integration
between scikit-learn and PySpark for large-scale machine learning tasks.

Cloudera has kindly agreed to host the sprint, most likely in San
Francisco. Ideally it would be focused and capped at around 10 people. It
is meant to be a prototyping session rather than a teaching workshop for
newcomers, so it would be great to have developers and users with deep
knowledge of PySpark (Josh especially :) and/or scikit-learn attend.

Hopefully we can get some people from the Spark community involved, and
Olivier will drum up support from the scikit-learn community.

All the best and hope to see you there (though likely I will only be able
to join remotely).
Nick
