Hi Peter,

Thanks for sharing the experience and code. I will try the same.

@Jaques: Thanks for the link. My plan is to use sklearn only. If I have
to use Mahout, the entire project has to be converted to Java. I would
like to accomplish it in Python only!
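
To check my understanding of the setup Peter describes in the quoted
message, here is a rough sketch using plain Hadoop streaming (not Dumbo,
and not Peter's actual code). The file name dataset.joblib, the tuple
layout of the dump, the JSON line format and the choice of SGDClassifier
are all my own assumptions for illustration, and I assume the dump has
already been made available in the task's working directory.

```python
import itertools
import json
import sys


def write_grid(param_grid, path):
    """Write the streaming input file: one JSON parameter setting per line."""
    keys = sorted(param_grid)
    with open(path, "w") as fh:
        for values in itertools.product(*(param_grid[k] for k in keys)):
            fh.write(json.dumps(dict(zip(keys, values))) + "\n")


def load_dataset(path="dataset.joblib"):
    """Deserialize the dataset once per mapper process.

    The file name and the (X_train, y_train, X_test, y_test) layout are
    assumptions made for this sketch.
    """
    import joblib  # lazy import: only the mapper needs it
    return joblib.load(path)


def evaluate(params, data):
    """Train one model for one parameter setting and return its test error."""
    from sklearn.linear_model import SGDClassifier  # placeholder estimator
    X_train, y_train, X_test, y_test = data
    clf = SGDClassifier(**params).fit(X_train, y_train)
    return 1.0 - clf.score(X_test, y_test)


def run_mapper(lines, data, evaluate_fn=evaluate, out=sys.stdout):
    """Mapper body: each input line is one parameter setting."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        params = json.loads(line)
        error = evaluate_fn(params, data)
        # Emit a tab-separated key/value pair: parameters, then error.
        out.write("%s\t%.6f\n" % (json.dumps(params, sort_keys=True), error))


def main():
    """Entry point for the mapper script; reads settings from stdin."""
    run_mapper(sys.stdin, load_dataset())
```

The mapper script would call main() at the bottom, and the input file
would be created beforehand with something like
write_grid({"alpha": [0.0001, 0.001], "penalty": ["l1", "l2"]}, "grid.txt").
Picking the best setting from the emitted lines could then be done in a
reducer or simply offline.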

Best regards

jaganadh



On Wed, Jan 23, 2013 at 6:43 PM, Peter Prettenhofer <
[email protected]> wrote:

> Hi Jaganadh,
>
> I once implemented grid search / multi-task learning with Hadoop
> streaming. The setup was fairly simple: I put the serialized
> dataset (joblib dump) on HDFS and created an input file - one line for
> each parameter setting for grid search. The map script deserialized
> the dataset from HDFS (in the init of the script) and for each map
> task (=parameter setting) it trained a model, computed the prediction
> error and emitted it. You can find some of the code here [1].
>
> I used Hadoop because I had a Hadoop cluster at my disposal - nowadays
> I'd use IPython.parallel and starcluster instead - much simpler IMHO.
>
> best,
>  Peter
>
> [1]
> https://github.com/pprett/nut/blob/master/nut/structlearn/dumbomapper.py
>  (this is the mapper script; the code which creates the input files
> and puts everything onto HDFS is in the auxstrategy.py file)
>
> 2013/1/23 JAGANADH G <[email protected]>:
> > Hi All,
> >
> > Has anybody tried using sklearn with Hadoop/Dumbo or Hadoop streaming?
> > Please share your thoughts and experience.
> >
> > Best regards
> >
> > --
> > **********************************
> > JAGANADH G
> > http://jaganadhg.in
> > ILUGCBE
> > http://ilugcbe.org.in
> >
> >
> > ------------------------------------------------------------------------------
> > Master Visual Studio, SharePoint, SQL, ASP.NET, C# 2012, HTML5, CSS,
> > MVC, Windows 8 Apps, JavaScript and much more. Keep your skills current
> > with LearnDevNow - 3,200 step-by-step video tutorials by Microsoft
> > MVPs and experts. ON SALE this month only -- learn more at:
> > http://p.sf.net/sfu/learnnow-d2d
> > _______________________________________________
> > Scikit-learn-general mailing list
> > [email protected]
> > https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> >
>
>
>
> --
> Peter Prettenhofer



-- 
**********************************
JAGANADH G
http://jaganadhg.in
*ILUGCBE*
http://ilugcbe.org.in