Guijarro, Julio wrote:
> *From:* Cai Cai [mailto:[email protected]]
> *Sent:* 21 September 2009 08:00
> *To:* Guijarro, Julio
> *Subject:* Re: [Smartfrog-users] smartfrog and hadoop
> 
>  
> 
> Hi Julio,
>  Currently I'm doing some simple experiments with Hadoop with some 
> classmates, and we are interested in building our own cloud platform 
> (if our idea proves workable, we may get more support from our 
> institute). But we only have some personal computers with ordinary 
> configurations. We have installed Xen (Xen 3.2-1-amd64) on our 
> computers, running Debian lenny, to set up simple clusters, and we 
> have also installed Hadoop on them. Now we are thinking about how to 
> do this with SmartFrog, since we think SmartFrog is suitable for 
> automatic deployment and may be helpful for our platform in the 
> future (e.g. for application deployment).
>  I have downloaded "smartfrog.3.17.014_dist.tar.gz" and 
> "smartfrog-rpm-bundle-3.17.014.tar.gz". With the former, I have tested 
> the examples in the SmartFrog User Manual (sfRun 
> org/smartfrog/examples/arithnet/example1.sf etc.), but there is 
> nothing related to Hadoop. I found the following on your website:
>   Steps to deployability
>   1 Configure Hadoop from a SmartFrog description
>   2 Write components for the Hadoop nodes
>   3 Write the functional tests
>   4 Add workflow components to work with the filesystem; submit jobs
>   5 Get the tests to pass
> Should I add the Hadoop JAR files from 
> "smartfrog-rpm-bundle-3.17.014.tar.gz" to "smartfrog.3.17.014_dist"'s 
> dist/lib folder, or could you please tell me what I should do next?
> 
>  I hope the text above supplies the information you want; if not, 
> please email me.
>  Thank you very much; we're looking forward to your help.
>  Best wishes,
>  Cai
> 
>  

If things aren't written up, it's my fault. We have tests and examples, 
but things are unstable and, yes, I need to write all this up.

JARs

The sf-hadoop RPM contains all the JARs needed to bring up Hadoop under 
SmartFrog. In an RPM-based installation they all go into SFHOME/lib 
automatically; if you are installing on Debian then alien ought to be 
able to handle everything. A native Debian package is something we would 
like to do at some point; it's just that there are lots of other things 
on my todo list too.
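A sketch of the Debian route, with the caveat that the exact RPM and 
.deb filenames and the default SFHOME are assumptions here; adjust them 
to match your actual bundle:

```shell
# Hypothetical sketch: the package filenames and SFHOME path are
# assumptions, not the exact names from the 3.17.014 bundle.
# Convert the sf-hadoop RPM to a .deb and install it:
#   alien --to-deb smartfrog-hadoop-3.17.014.noarch.rpm
#   dpkg -i smartfrog-hadoop_3.17.014_all.deb
# Then check that the Hadoop support JARs landed on SmartFrog's classpath:
SFHOME="${SFHOME:-/opt/smartfrog}"
ls "$SFHOME/lib"/*hadoop*.jar 2>/dev/null || echo "no Hadoop JARs under $SFHOME/lib yet"
```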


Installations

Once the JARs are in place, SmartFrog can bring up a node as a namenode, 
datanode, job tracker, task tracker, or all of the above; you just have 
to push out the right .sf file to each node to tell it what you want it 
to be. Which leads to the question: where do those .sf files live? They 
are in our SVN repository.

-The hadoop-cluster package contains everything needed to configure a 
Hadoop cluster, plus some Ant targets to help push these descriptions 
out. This is something I am still busy developing.

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/extras/hadoop-cluster/

Although it is not released as an RPM, it does generate a JAR file 
containing nothing but .sf files for different parts of a system. I'm 
busy working on this to drive it more dynamically, expanding templates 
with late-binding information (URLs of the master servers, number of 
task slots per host, etc.), so that when a cluster of machines is 
dynamically created, it's easy to push out the configurations. Hadoop is 
tricky in that while the workers will all spin waiting for the master 
nodes to come up, they do all need to know the URLs of the namenode and 
jobtracker before they start spinning; so you need to know the 
hostname/port of the master nodes before trying to start any workers.
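As a hedged sketch of what that late binding looks like in a 
description (the component and attribute names below are invented for 
illustration, not the real template names): the worker templates 
reference the master's host/port, so those values must be resolvable 
before any worker deploys:

```
// Hypothetical sketch: component and attribute names are assumptions
master extends Compound {
    namenodeHost "master1.example.org";
    namenodePort 8020;
    jobtrackerHost "master1.example.org";
    jobtrackerPort 8021;
}

worker extends TaskTracker {
    // these references resolve against the master block at deploy
    // time, so the master addresses must be known before workers start
    namenodeHost LAZY master:namenodeHost;
    namenodePort LAZY master:namenodePort;
    jobtrackerHost LAZY master:jobtrackerHost;
    jobtrackerPort LAZY master:jobtrackerPort;
}
```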

The templates are here:
http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/extras/hadoop-cluster/src/org/smartfrog/extras/hadoop/cluster/services/bondable/components.sf

These are what we push out to dynamically allocated machines once they 
are up, as part of my long-haul cluster management and job submission 
work:
http://www.slideshare.net/steve_l/long-haul-hadoop


-The citerank package contains everything needed to run a 
PageRank-style algorithm on citation data; this is a basic example of a 
complex Hadoop MapReduce sequence. It is what I use for testing that 
clusters work.

http://smartfrog.svn.sourceforge.net/viewvc/smartfrog/trunk/core/extras/citerank/

This is designed to build and run standalone (via standalone.xml) as 
well as part of our bigger build process. It uses the classic MapReduce 
APIs, so it can run against older versions, provided you compile it 
against whichever version of Hadoop you intend to use.
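A standalone build might look something like this; the property name 
and the Hadoop version are assumptions on my part, so check 
standalone.xml for the real configurable values:

```shell
# Hypothetical sketch: the hadoop.version property and default version
# are assumptions; inspect standalone.xml for the real ones.
HADOOP_VERSION="${HADOOP_VERSION:-0.18.3}"
if [ -f standalone.xml ]; then
  ant -f standalone.xml -Dhadoop.version="$HADOOP_VERSION"
else
  echo "run this from the citerank directory; would build against Hadoop $HADOOP_VERSION"
fi
```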

My recommendations, then, are:
  * check out the main SmartFrog core source tree, including the 
extras/hadoop-cluster and citerank areas
  * have a look at how we bring up test clusters in components/hadoop 
and hadoop-cluster
  * if there are bits that are confusing, or where you want some 
documentation, email this list and it will force me to write things up.
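Concretely, the checkout might look like the sketch below; the svnroot 
path is an assumption inferred from the viewvc URLs above, so adjust it 
if the checkout fails:

```shell
# Hypothetical sketch: the repository path is an assumption inferred
# from the viewvc URLs; verify it before checking out.
SF_SVN="https://smartfrog.svn.sourceforge.net/svnroot/smartfrog/trunk/core"
# A full checkout brings in extras/hadoop-cluster and extras/citerank:
#   svn checkout "$SF_SVN" smartfrog-core
echo "checkout URL: $SF_SVN"
```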

-Steve






_______________________________________________
Smartfrog-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/smartfrog-users
