Hi,

We are looking for a Scikit-Learn/Python fan interested in helping us
implement native persistence of Scikit-Learn estimators and data.
The technology we plan to use is called NEO (http://www.neoppod.org/).
It is a distributed object database that can store serialized python
objects on a redundant array of inexpensive computers. It is based on
the ZODB protocol but adds high performance and a distributed
architecture. NEO is already used by ERP5, an open source ERP/CRM that
powers large companies and governments.

The main task will be to make NumPy ndarrays implement ZODB's
Persistent class interface. We are also considering adding a
meta-object protocol that can be used to extend the representation of
NumPy arrays and distribute them transparently across multiple
storage nodes.
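
To give a rough idea of what "implementing the Persistent interface"
involves, here is a minimal sketch. It is not the planned design: it
uses only the standard library (array and pickle stand in for NumPy
and ZODB) so that it runs anywhere, and the class name
PersistentArraySketch and its layout are hypothetical. It only mimics
two conventions ZODB's Persistent relies on: state is exported and
restored through __getstate__/__setstate__, and changes are signalled
by setting _p_changed.

```python
import pickle
from array import array


class PersistentArraySketch:
    """Hypothetical stand-in for an ndarray following Persistent conventions."""

    def __init__(self, shape, typecode="d"):
        self.shape = tuple(shape)
        size = 1
        for dim in self.shape:
            size *= dim
        # Flat, zero-filled buffer; a real ndarray would own this memory.
        self._data = array(typecode, [0] * size)
        # ZODB's name for the "modified since last save" flag.
        self._p_changed = False

    def _offset(self, index):
        # Row-major (C order) flattening of a multi-dimensional index.
        if len(index) != len(self.shape):
            raise IndexError("wrong number of indices")
        offset = 0
        for i, dim in zip(index, self.shape):
            if not 0 <= i < dim:
                raise IndexError("index out of bounds")
            offset = offset * dim + i
        return offset

    def __getitem__(self, index):
        return self._data[self._offset(index)]

    def __setitem__(self, index, value):
        self._data[self._offset(index)] = value
        # In-place writes must be signalled explicitly, otherwise the
        # storage layer cannot know the object needs to be saved again.
        self._p_changed = True

    def __getstate__(self):
        # Export shape, element type and raw bytes instead of the object.
        return {
            "shape": self.shape,
            "typecode": self._data.typecode,
            "raw": self._data.tobytes(),
        }

    def __setstate__(self, state):
        self.shape = tuple(state["shape"])
        self._data = array(state["typecode"])
        self._data.frombytes(state["raw"])
        # A freshly loaded object is in sync with its stored state.
        self._p_changed = False


a = PersistentArraySketch((2, 3))
a[1, 2] = 4.5
restored = pickle.loads(pickle.dumps(a))
```

Here pickle.dumps/loads plays the role of a store and load cycle. In
the real project the storage would be NEO, and the open question is how
to get the same behaviour from ndarray itself rather than a wrapper.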

One application of this project is to use Scikit-Learn to analyze
large collections of logs from a Cloud Computing infrastructure in
order to implement predictive decisions that can help increase
resiliency: predict process migration, predict disaster recovery,
etc. However, the general goal of this work goes beyond this initial
application: we intend to create a native distributed storage for
Scikit-Learn that is flexible enough for a wide range of
applications.

Future applications under consideration include: internet of things,
large scale scientific data processing (neuroimaging, chemistry,
genomics, physics, etc.), finance, discrete simulation, etc.

If you're interested, please send me a resume. If you have a GitHub
account, please share it.

The position is available now for a period of 8 months (renewal
possible).

It will be located at Telecom ParisTech in downtown Paris.

Prior experience with sklearn is a must. Experience with object
databases is a plus.

Best,
Alex

_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
