Hi Diego, do you already plan to benchmark your results on the Mesos platform vs. bare-metal servers? It would be really interesting for enterprise evangelism; they love benchmarks and metrics.
I'm impressed by your project and how fast it is progressing. I'm a fan of Golang myself, but why did you choose it?

2015-03-03 3:03 GMT+01:00 Diego Medina <[email protected]>:

> Hi everyone, based on all the great feedback I got here I updated the code
> and now I have one scheduler and two executors, one for fetching html and a
> second one that extracts links and text from the html.
> I also run the actual work on their own goroutines (like threads for those
> not familiar with Go) and it's working great.
>
> I wrote about the changes here
> http://blog.fmpwizard.com/blog/owlcrawler-multiple-executors-using-meso
> and you can find the updated code here
> https://github.com/fmpwizard/owlcrawler
>
> Again, thanks everyone for your input.
>
> Diego
>
> On Fri, Feb 27, 2015 at 1:52 PM, Diego Medina <[email protected]> wrote:
>
>> Thanks for looking at the code and the feedback Alex. I'll be working on
>> those changes later tonight!
>>
>> Diego
>>
>> On Fri, Feb 27, 2015 at 12:15 PM, Alex Rukletsov <[email protected]>
>> wrote:
>>
>>> Diego,
>>>
>>> I've checked your code, nice effort! Great to see people hacking with
>>> mesos and go bindings!
>>>
>>> One thing though. You do the actual job in the launchTask() of your
>>> executor. This prevents you from running multiple tasks in parallel on
>>> one executor. That means you can't have more simultaneous tasks than
>>> executors in your cluster. You may want to spawn a thread for every
>>> incoming task and do the job there, while launchTask() will do solely
>>> task initialization (basically, starting a thread). Check the project
>>> John referred to: https://github.com/mesosphere/RENDLER.
>>>
>>> Best,
>>> Alex
>>>
>>> On Fri, Feb 27, 2015 at 11:03 AM, Diego Medina <[email protected]>
>>> wrote:
>>>
>>>> Hi Billy,
>>>>
>>>> comments inline:
>>>>
>>>> On Fri, Feb 27, 2015 at 4:07 AM, Billy Bones <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Diego, as a real fan of Golang, kudos and applause for your work
>>>>> on this distributed crawler; I hope you'll finally release it ;-)
>>>>>
>>>>
>>>> Thanks! my 3 month old baby is making sure I don't sleep much and have
>>>> plenty of time to work on this project :)
>>>>
>>>>> About your question, the common architecture is to have one scheduler
>>>>> and multiple executors rather than one big executor.
>>>>> The basic idea of Mesos is to take any resources, put them together in
>>>>> a pool, and then spread tasks across this pool. So the architecture of
>>>>> your application should share this philosophy: decouple your
>>>>> application as much as possible, but be careful not to lock yourself
>>>>> in a loop on threads and tasks if they depend on each other.
>>>>>
>>>>> I don't know if I'm explaining myself correctly, so do not hesitate to
>>>>> ask if you need more clarification.
>>>>>
>>>>
>>>> Your answer was very clear. Today I started to split the executor into
>>>> two: one that simply fetches the html and a second one that extracts
>>>> text without tags from it. This second executor gets the data from a
>>>> database. So far it seems like a natural way to split the tasks. I was
>>>> going with the idea of also having two schedulers, but I think I just
>>>> figured out how to use just one.
>>>>
>>>> Thanks!
>>>>
>>>> Diego
>>>>
>>>>>
>>>>> 2015-02-26 21:50 GMT+01:00 Diego Medina <[email protected]>:
>>>>>
>>>>>> @John: thanks for the link, I see that RENDLER uses the ExecutorId
>>>>>> from ExecutorInfo to decide what to do, I'll give this a try
>>>>>> @Craig: you are right, after I sent the email I continued to read
>>>>>> more of the mesos docs and saw that I used the wrong term: I wrote
>>>>>> framework where I meant scheduler, thanks.
>>>>>>
>>>>>> Thanks and looking forward to any other feedback you may all have.
>>>>>>
>>>>>> Diego
>>>>>>
>>>>>> On Thu, Feb 26, 2015 at 5:24 AM, craig w <[email protected]> wrote:
>>>>>>
>>>>>>> Diego,
>>>>>>>
>>>>>>> I'm also interested in hearing feedback to your question. One minor
>>>>>>> thing I'd point out is that a Framework is made up of a Scheduler and
>>>>>>> Executor(s), so I think it's more correct to say you've created a
>>>>>>> Scheduler (instead of "one big framework") and an Executor.
>>>>>>>
>>>>>>> Anyhow, for what it's worth, the Aurora framework has multiple
>>>>>>> executors (
>>>>>>> https://github.com/apache/incubator-aurora/blob/master/examples/vagrant/aurorabuild.sh#L61).
>>>>>>> You might pop into the #aurora IRC chat room and ask; usually a few
>>>>>>> Aurora contributors are in there answering questions when they can.
>>>>>>>
>>>>>>> On Wed, Feb 25, 2015 at 9:02 PM, John Pampuch <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Diego-
>>>>>>>>
>>>>>>>> You might want to look at this project for some insights:
>>>>>>>>
>>>>>>>> https://github.com/mesosphere/RENDLER
>>>>>>>>
>>>>>>>> -John
>>>>>>>>
>>>>>>>> On Wed, Feb 25, 2015 at 5:27 PM, Diego Medina <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> Short: Is it better to have one big framework and executor with if
>>>>>>>>> statements to select what to do or several smaller framework <->
>>>>>>>>> executors when writing a Mesos app?
>>>>>>>>>
>>>>>>>>> Longer question:
>>>>>>>>>
>>>>>>>>> Last week I started a side project based on mesos (using Go),
>>>>>>>>>
>>>>>>>>> http://blog.fmpwizard.com/blog/web-crawler-using-mesos-and-golang
>>>>>>>>> https://github.com/fmpwizard/owlcrawler
>>>>>>>>>
>>>>>>>>> It's a web crawler written on top of Mesos. The very first version
>>>>>>>>> of it had a framework that sent a task to an executor and that single
>>>>>>>>> executor would fetch the page, extract links from the html and then
>>>>>>>>> send them to a message queue.
>>>>>>>>>
>>>>>>>>> Then the framework reads that queue and starts again, runs the
>>>>>>>>> executor, etc, etc.
>>>>>>>>>
>>>>>>>>> Now I'm splitting fetching the html and extracting links into two
>>>>>>>>> separate tasks, and putting those two tasks in the same executor
>>>>>>>>> doesn't feel right, so I'm thinking that I need at least two
>>>>>>>>> different executors and one framework, but then I wonder if people
>>>>>>>>> more experienced with mesos would normally write several pairs of
>>>>>>>>> framework <-> executors to keep the design cleaner.
>>>>>>>>>
>>>>>>>>> In this particular case, I can see the project growing into even
>>>>>>>>> more tasks that can be decoupled.
>>>>>>>>>
>>>>>>>>> Any feedback on the design would be great and let me know if I
>>>>>>>>> should explain this better.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>> Diego
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Diego Medina
>>>>>>>>> Lift/Scala consultant
>>>>>>>>> [email protected]
>>>>>>>>> http://fmpwizard.telegr.am
>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> https://github.com/mindscratch
>>>>>>> https://www.google.com/+CraigWickesser
>>>>>>> https://twitter.com/mind_scratch
>>>>>>> https://twitter.com/craig_links
>>>>>>>
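
For readers following the thread, here is a minimal sketch of the pattern Alex describes above: the launch callback only starts a goroutine, and the real work happens there, so one executor can run several tasks at once. Note that the Task and Executor types, NewExecutor, LaunchTask and Wait are all hypothetical names made up for this illustration. They are not the mesos-go bindings and not the owlcrawler code.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical stand-in for the task description an executor
// receives. It is NOT a type from the mesos-go bindings.
type Task struct {
	ID  string
	URL string
}

// Executor is a hypothetical executor that runs every task on its own goroutine.
type Executor struct {
	wg      sync.WaitGroup
	mu      sync.Mutex
	results map[string]string
	work    func(Task) string // the actual job, e.g. fetching a page
}

// NewExecutor builds an Executor around the given work function.
func NewExecutor(work func(Task) string) *Executor {
	return &Executor{results: make(map[string]string), work: work}
}

// LaunchTask does task initialization only: it starts a goroutine and
// returns immediately, so further tasks can be accepted in parallel.
func (e *Executor) LaunchTask(t Task) {
	e.wg.Add(1)
	go func() {
		defer e.wg.Done()
		out := e.work(t)
		e.mu.Lock()
		e.results[t.ID] = out
		e.mu.Unlock()
	}()
}

// Wait blocks until all launched tasks have finished and returns a copy
// of their results keyed by task ID.
func (e *Executor) Wait() map[string]string {
	e.wg.Wait()
	e.mu.Lock()
	defer e.mu.Unlock()
	out := make(map[string]string, len(e.results))
	for k, v := range e.results {
		out[k] = v
	}
	return out
}

func main() {
	// The work function here only pretends to fetch; a real one would do I/O.
	ex := NewExecutor(func(t Task) string { return "fetched " + t.URL })
	ex.LaunchTask(Task{ID: "1", URL: "http://example.com/a"})
	ex.LaunchTask(Task{ID: "2", URL: "http://example.com/b"})
	for id, out := range ex.Wait() {
		fmt.Println(id, out)
	}
}
```

In a real executor the goroutine would presumably also report task status back to the scheduler when it starts and finishes; that part is left out here because it depends on the bindings in use.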

