Hi Jaganadh,

I once used Hadoop streaming to implement grid search / multi-task
learning. The setup was fairly simple: I put the serialized dataset
(a joblib dump) on HDFS and created an input file with one line per
parameter setting of the grid search. The map script deserialized the
dataset from HDFS once (in the init of the script), and for each map
task (= one parameter setting) it trained a model, computed the
prediction error and emitted it. You can find some of the code here [1].
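
For illustration, here is a rough sketch of that pattern. It is not the
code from [1]: the function names, the JSON line format, and the toy 1-D
ridge model are all assumptions, chosen so the sketch runs without a
cluster. In a real job, load_dataset() would deserialize the joblib dump
fetched from HDFS and train_and_score() would fit an actual estimator.

```python
import itertools
import json


def make_input_lines(grid):
    """Yield one JSON line per parameter setting (the mapper input file)."""
    names = sorted(grid)
    for values in itertools.product(*(grid[name] for name in names)):
        yield json.dumps(dict(zip(names, values)), sort_keys=True)


def load_dataset():
    # Hypothetical stand-in: the real mapper would deserialize the joblib
    # dump from HDFS here, once per mapper process.
    X = [0.0, 1.0, 2.0, 3.0]
    y = [0.1, 0.9, 2.1, 2.9]
    return X, y


def train_and_score(params, X, y):
    # Hypothetical stand-in for model training: 1-D ridge regression
    # without intercept, solved in closed form, scored by mean squared
    # error on the training data.
    alpha = params["alpha"]
    w = sum(a * b for a, b in zip(X, y)) / (sum(a * a for a in X) + alpha)
    return sum((w * a - b) ** 2 for a, b in zip(X, y)) / len(X)


def run_mapper(stdin, stdout):
    """Read parameter settings line by line, emit 'params<TAB>error'."""
    X, y = load_dataset()  # loaded once, reused for every input line
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        params = json.loads(line)
        error = train_and_score(params, X, y)
        stdout.write("%s\t%.6f\n" % (line, error))

# As a streaming mapper script, one would finish with:
#   import sys; run_mapper(sys.stdin, sys.stdout)
```

The input file is then just the output of make_input_lines() for the
grid, one setting per line, and each emitted line is a tab-separated
key/value pair as Hadoop streaming expects from a mapper.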

I used Hadoop because I had a Hadoop cluster at my disposal - nowadays
I'd use IPython.parallel and starcluster instead - much simpler IMHO.

best,
 Peter

[1] https://github.com/pprett/nut/blob/master/nut/structlearn/dumbomapper.py
 (this is the mapper script; the code which creates the input files
and puts everything onto HDFS is in the auxstrategy.py file)

2013/1/23 JAGANADH G <[email protected]>:
> Hi All,
>
> Has anybody tried using sklearn with Hadoop/Dumbo or Hadoop streaming?
> Please share your thoughts and experience.
>
> Best regards
>
> --
> **********************************
> JAGANADH G
> http://jaganadhg.in
> ILUGCBE
> http://ilugcbe.org.in
>
> _______________________________________________
> Scikit-learn-general mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>



-- 
Peter Prettenhofer
