I do not totally understand the job you are running, but if each simulation
can run independently of the others, then you could run a MapReduce job that
spreads the simulations over many servers, so each server runs one or more
at a time. This gives you some protection against servers going down and
takes care of spreading the work across servers. It should also handle more
than the 100K simulation mark you stated you would like to run. You would
just need to write the input code to split the simulations into splits that
the MR framework can work with.
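For what it's worth, here is a rough, untested sketch of what that input
code might look like, assuming you keep one parameter vector per line of a
text file and shell out to the simulation jar from each map task. The class
name SimulationSweep, the file simulation.jar, and the command line are all
placeholders, and the job only records the exit status instead of collecting
the real output:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.NLineInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

// Map-only job: each map task receives one line of the input file
// (one parameter vector) and runs a single simulation for it.
public class SimulationSweep {

  public static class SimulationMapper
      extends Mapper<LongWritable, Text, Text, Text> {

    @Override
    protected void map(LongWritable offset, Text paramLine, Context context)
        throws IOException, InterruptedException {
      // Placeholder command: run one simulation with this parameter
      // vector; its console output goes to the task logs.
      Process sim = new ProcessBuilder("java", "-jar", "simulation.jar",
          paramLine.toString()).inheritIO().start();
      int status = sim.waitFor();
      // Emit the parameter vector and the exit status; a real job would
      // also copy the simulation's raw output somewhere durable (e.g. HDFS).
      context.write(paramLine, new Text("exit=" + status));
    }
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    Job job = Job.getInstance(conf, "simulation parameter sweep");
    job.setJarByClass(SimulationSweep.class);
    job.setMapperClass(SimulationMapper.class);
    job.setNumReduceTasks(0); // map-only: no reduce step is needed
    job.setInputFormatClass(NLineInputFormat.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(Text.class);
    // One input line, i.e. one simulation, per map task.
    NLineInputFormat.setNumLinesPerSplit(job, 1);
    NLineInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}

The nice part is that the framework handles retries for you: if a node dies
mid-run, its simulations get rescheduled on another machine, which is the
failure protection I mentioned above.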
Billy
"Igor Nikolic" <[EMAIL PROTECTED]> wrote in
message news:[EMAIL PROTECTED]
Thank you for your comment, it did confirm my suspicions.
You framed the problem correctly. I will probably invest a bit of time
studying the framework anyway, to see if a rewrite is interesting, since
we are hitting scaling limitations on our agent scheduler framework. Our
main computational load is the massive amount of agent reasoning (think
JBoss Rules) and inter-agent communication (they need to buy and sell
stuff to each other), so I am not sure it is at all possible to break it
down into small tasks, especially if this needs to happen across CPUs;
the latency is going to kill us.
Thanks
igor
John Martyniak wrote:
I am new to Hadoop, so take this information with a grain of salt.
The power of Hadoop is in breaking big problems down into small pieces and
spreading them across many (thousands of) machines, in effect creating a
massively parallel processing engine. But in order to take advantage of
that functionality, you must write your application against the Hadoop
frameworks.
So if I understand your dilemma correctly, I do not think that Hadoop is
for you, unless you want to rewrite your app to take advantage of it. And
I suspect that if you have access to a traditional cluster, that will be a
better alternative for you.
Hope that this helps some.
-John
On Wed, Jun 25, 2008 at 7:33 AM, Igor Nikolic <[EMAIL PROTECTED]> wrote:
Hello list
We will be getting access to a cluster soon, and I was wondering whether I
should use Hadoop for this, or whether I am better off with the usual batch
schedulers such as ProActive. I am not a CS/CE person, and from reading the
website I cannot get a sense of whether Hadoop is for me.
A little background:
We have a relatively large agent-based simulation (a 20+ MB jar) that needs
to be swept across very large parameter spaces. Agents communicate only
within the simulation, so there is no interprocess communication. The
parameter vector is at most 20 long, a simulation may take 5-10 minutes on
a normal desktop, and it might return a few MB of raw data. We need
10k-100K runs, more if possible.
Thanks for any advice; even a short yes/no is welcome.
Greetings
Igor
--
ir. Igor Nikolic
PhD Researcher
Section Energy & Industry
Faculty of Technology, Policy and Management
Delft University of Technology, The Netherlands
Tel: +31152781135
Email: [EMAIL PROTECTED]
Web: http://www.igornikolic.com
wiki server: http://wiki.tudelft.nl