Hi Nathan,
I want to do real-time computation using Storm. Which is the better choice,
plain Storm or Trident? I need to handle a huge amount of data with
exactly-once processing. Please help me.

Thanks!
Date: Wed, 27 May 2015 12:40:43 -0400
Subject: Re: Status of running storm on yarn (the yahoo project)
From: [email protected]
To: [email protected]
CC: [email protected]; [email protected]

Mesosphere has official support for Storm on Mesos: 
https://github.com/mesos/storm
On Wed, May 27, 2015 at 11:14 AM,  <[email protected]> wrote:
Thanks, Bobby, for the detailed answer. So it sounds like it is better not to
combine Storm with batch workloads at this point (YARN, Mesos or EC2), due to
the network saturation and timeout threats. Is this behavior also seen in other
streaming frameworks, like Spark Streaming running on YARN?

From: Bobby Evans [mailto:[email protected]]
Sent: Wednesday, May 27, 2015 9:07 AM
To: Jeffery Maass; [email protected]
Subject: Re: Status of running storm on yarn (the yahoo project)

Mesos is very similar to YARN. It is a resource scheduler. Storm in the past
had support for Mesos through a separate repo,
https://github.com/nathanmarz/storm-mesos. It might still work with the latest
versions of Storm; I don't know. The concept here is that there was a special
layer installed that would look for when the cluster had outstanding requests
and not enough resources to meet those requests. It would then request that
many resources from Mesos, launch supervisors on those nodes and let the
scheduler do the rest.

It works quite well for elasticity at a small scale, or when you have a lot
more network bandwidth than you need. The problem is that if Mesos, or YARN, or
OpenStack, or EC2, or ... collocates your Storm topology with some big batch
job that suddenly saturates the network for a few seconds to a minute,
heartbeats could start to time out, traffic would not flow from one worker to
another, etc. For some topologies all you do is tune your timeouts so workers
don't get shot and relaunched too frequently, and live with the noise from
other stuff happening on the network. For us, though, we have some very tight
SLAs: if the data is 5 seconds old, throw it away; I cannot use it any more.

My current goal with Storm in this area is to have it be aware of the resources
that your topology is using, the SLAs that it has, its desired budget for
resources, how far over that budget it is willing to go, where it could
possibly get other resources if needed (i.e. YARN, Mesos, OpenStack), and any
other constraints it might have. Storm would then take all of this into account
and adjust the scheduling of your topology so that it can grow and shrink with
the resources it needs to meet the SLAs it has, optionally taking some of those
resources from other systems if needed. This is still a ways out, but looking
at the research that is being done in this area it should be doable in the next
year or so.

- Bobby
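
As an illustration of the timeout tuning described above (not taken from the
thread), here is a minimal sketch of how a heartbeat timeout decides whether a
worker gets restarted. The function names, worker ids and all numbers are
hypothetical; this models the trade-off only and is not Storm's actual
implementation.

```python
def is_timed_out(last_heartbeat_secs, now_secs, timeout_secs):
    """Return True if the last heartbeat is older than the allowed timeout."""
    return (now_secs - last_heartbeat_secs) > timeout_secs


def workers_to_restart(last_heartbeats, now_secs, timeout_secs):
    """Return the sorted ids of workers whose heartbeats have timed out."""
    return sorted(
        worker_id
        for worker_id, last_seen in last_heartbeats.items()
        if is_timed_out(last_seen, now_secs, timeout_secs)
    )


# Hypothetical scenario: a batch job saturates the network for 45 seconds,
# so worker-2's last heartbeat arrived at t=100 while worker-1 is unaffected.
last_heartbeats = {"worker-1": 144.0, "worker-2": 100.0}
now = 145.0

# With a 30-second timeout the stalled worker is restarted.
short_timeout = workers_to_restart(last_heartbeats, now, timeout_secs=30)
# With a 90-second timeout the same stall is tolerated.
long_timeout = workers_to_restart(last_heartbeats, now, timeout_secs=90)

print(short_timeout)  # ['worker-2']
print(long_timeout)   # []
```

The cost of the longer timeout is that a genuinely dead worker also takes
longer to detect, which is why this only suits topologies without tight SLAs.
In Storm, the settings involved should include supervisor.worker.timeout.secs
and nimbus.task.timeout.secs; check the defaults for your version before
changing them.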
On Wednesday, May 27, 2015 8:38 AM, Jeffery Maass <[email protected]> wrote:

I have heard Nathan Marz mention Mesos. How is YARN / storm-yarn / slider-yarn
different from Mesos?

These are the links I found to Mesos:
https://github.com/mesos/storm
https://github.com/nathanmarz/storm-mesos
http://mesos.apache.org/

Thank you for your time!

+++++++++++++++++++++
Jeff Maass
linkedin.com/in/jeffmaass
stackoverflow.com/users/373418/maassql
+++++++++++++++++++++

On Wed, May 27, 2015 at 8:28 AM, Bobby Evans <[email protected]> wrote:

storm-yarn was originally done as a proof of concept. We had plans to take it
further, but the amount of work required to make it production ready on a very
heavily used cluster was more than we were willing to invest at the time. Most
of that work was around network scheduling, isolation and prioritization,
mainly in YARN itself. There has been some work looking into this, but nothing
much has happened with it. At the same time http://slider.incubator.apache.org/
showed up and is now the preferred way to run Storm on YARN. To get around the
networking issues, most people will tag a subset of their cluster, a few racks,
and only schedule Storm to run on those nodes.

Long term I really would like to revive Storm on YARN and integrate it directly
into Storm. Giving Storm and the scheduler the ability to request new resources
with specific constraints opens up a lot of new possibilities. If you want to
help out, or if anyone else wants to help out with this work, I would be very
happy to file some JIRA issues in open source and help direct what needs to be
done.

- Bobby

On Wednesday, May 27, 2015 4:59 AM, Spico Florin <[email protected]> wrote:

Hello!

I'm interested in running Storm topologies on YARN. I was looking at the Yahoo
project https://github.com/yahoo/storm-yarn, and I observed that there has been
no activity for 7 months. Also, the issue and request lists are not updated.
Therefore I have some questions:

1. Is there any plan to evolve this project?
2. Is there any plan to integrate this project into the main branch?
3. Is anyone using this approach in production?

I look forward to your answers.

Regards,
Florin

-- 
Twitter: @nathanmarz
http://nathanmarz.com
                                          
