Re: multiple frameworks or one big one

Diego Medina Wed, 04 Mar 2015 02:19:54 -0800

> Well, I strongly believe that, in addition to the architectural and
> organisational concerns, Mesos needs to provide some enterprise-oriented
> benchmarks and "proof" to be really ready for prime time in the enterprise
> world and not only in "startup-style enterprises", but that's not the
> topic of your post, so I'll start my own thread on that specific topic.
>
>
Looking forward to that discussion.




> Anyway, thank you very much for your answers.
>
> Regarding your choice of Golang over Scala because of some pain
> points, could you send me some examples (other than compile time)? Even in
> private if you do not want to hijack the thread, as I'm really torn
> between those two.
>
>

I'll send you a separate private message with the reply. I don't mind
talking about it, but I wouldn't want to distract this mailing list with the
topic.

Thanks

Diego



> 2015-03-03 14:26 GMT+01:00 Diego Medina <[email protected]>:
>
>> Hi Alex,
>>
>> On Tue, Mar 3, 2015 at 7:37 AM, Alex Rukletsov <[email protected]>
>> wrote:
>>
>>> The next big improvement would be to handle task state updates. Instead of
>>> dying on TASK_LOST, you may want to retry the task several times.
>>>
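A minimal Go sketch of the retry idea above, for readers following along. It is not the owlcrawler code and does not use the mesos-go API: `RetryTracker`, `OnStatusUpdate`, and the job key are hypothetical names. Only the state names `TASK_LOST` and `TASK_FINISHED` come from Mesos. The assumption is that the scheduler calls `OnStatusUpdate` from its status-update callback and re-queues the job when it returns true.

```go
package main

import "fmt"

// State names as reported by Mesos; kept as plain strings in this sketch.
const (
	TaskLost     = "TASK_LOST"
	TaskFinished = "TASK_FINISHED"
)

// RetryTracker (hypothetical) counts how often each job was lost.
// Jobs are keyed by a caller-chosen job key (for a crawler, e.g. the URL),
// because a relaunched task usually gets a fresh task ID.
type RetryTracker struct {
	maxRetries int
	attempts   map[string]int
}

// NewRetryTracker returns a tracker that allows maxRetries relaunches per job.
func NewRetryTracker(maxRetries int) *RetryTracker {
	return &RetryTracker{maxRetries: maxRetries, attempts: make(map[string]int)}
}

// OnStatusUpdate reports whether the job should be re-queued.
func (r *RetryTracker) OnStatusUpdate(jobKey, state string) bool {
	switch state {
	case TaskLost:
		if r.attempts[jobKey] < r.maxRetries {
			r.attempts[jobKey]++
			return true
		}
		// Retries exhausted; the caller decides how to report the failure.
		return false
	case TaskFinished:
		// The job succeeded, so forget its retry history.
		delete(r.attempts, jobKey)
	}
	return false
}

func main() {
	tracker := NewRetryTracker(2)
	for i := 0; i < 3; i++ {
		retry := tracker.OnStatusUpdate("http://example.com", TaskLost)
		fmt.Println("lost, retry:", retry)
	}
}
```

With this shape the scheduler can keep running after a lost task and give up only once the retry budget for that job is spent.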
>>
>> Yes, this is definitely something I need to address. For now I use it to
>> help me find bugs in the code: if the app stops, I know I did something
>> wrong :)
>> I also need to find out why some tasks stay in status "Staging" in the
>> Mesos UI, but I'll start a separate thread for that.
>>
>> Thanks
>>
>> Diego
>>
>>
>>
>>
>>>
>>> On Tue, Mar 3, 2015 at 10:38 AM, Billy Bones <[email protected]>
>>> wrote:
>>>
>>>> Oh, and you've got a glitch in one of your executor names in your first
>>>> code block.
>>>>
>>>> You've got:
>>>>
>>>> extractorExe := &mesos.ExecutorInfo{
>>>>    ExecutorId: util.NewExecutorID("owl-cralwer-extractor"),
>>>>    Name:       proto.String("OwlCralwer Fetcher"),
>>>>    Source:     proto.String("owl-cralwer"),
>>>>    Command: &mesos.CommandInfo{
>>>>            Value: proto.String(extractorExecutorCommand),
>>>>            Uris:  executorUris,
>>>>    },
>>>> }
>>>>
>>>> It should rather be:
>>>>
>>>> extractorExe := &mesos.ExecutorInfo{
>>>>    ExecutorId: util.NewExecutorID("owl-cralwer-extractor"),
>>>>    Name:       proto.String("OwlCralwer Extractor"),
>>>>    Source:     proto.String("owl-cralwer"),
>>>>    Command: &mesos.CommandInfo{
>>>>            Value: proto.String(extractorExecutorCommand),
>>>>            Uris:  executorUris,
>>>>    },
>>>> }
>>>>
>>>>
>>>>
>>>> 2015-03-03 10:28 GMT+01:00 Billy Bones <[email protected]>:
>>>>
>>>>> Hi Diego, do you already plan to benchmark your results on
>>>>> the Mesos platform vs. bare-metal servers?
>>>>> It would be really interesting for enterprise evangelism; they love
>>>>> benchmarks and metrics.
>>>>>
>>>>> I'm impressed by your project and how fast it's progressing. I'm a fan of
>>>>> Golang myself, but why did you choose it?
>>>>>
>>>>> 2015-03-03 3:03 GMT+01:00 Diego Medina <[email protected]>:
>>>>>
>>>>>> Hi everyone, based on all the great feedback I got here I updated the
>>>>>> code and now I have one scheduler and two executors, one for fetching
>>>>>> html and a second one that extracts links and text from the html.
>>>>>> I also run the actual work in its own goroutines (like threads, for
>>>>>> those not familiar with Go) and it's working great.
>>>>>>
>>>>>> I wrote about the changes here
>>>>>>
>>>>>> http://blog.fmpwizard.com/blog/owlcrawler-multiple-executors-using-meso
>>>>>> and you can find the updated code here
>>>>>> https://github.com/fmpwizard/owlcrawler
>>>>>>
>>>>>> Again, thanks everyone for your input.
>>>>>>
>>>>>> Diego
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Feb 27, 2015 at 1:52 PM, Diego Medina <[email protected]>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks for looking at the code and the feedback Alex. I'll be
>>>>>>> working on those changes later tonight!
>>>>>>>
>>>>>>> Diego
>>>>>>>
>>>>>>> On Fri, Feb 27, 2015 at 12:15 PM, Alex Rukletsov <[email protected]
>>>>>>> > wrote:
>>>>>>>
>>>>>>>> Diego,
>>>>>>>>
>>>>>>>> I've checked your code, nice effort! Great to see people hacking
>>>>>>>> with mesos and go bindings!
>>>>>>>>
>>>>>>>> One thing though. You do the actual job in the launchTask() of your
>>>>>>>> executor. This prevents you from running multiple tasks in parallel on
>>>>>>>> one executor. That means you can't have more simultaneous tasks than
>>>>>>>> executors in your cluster. You may want to spawn a thread for every
>>>>>>>> incoming task and do the job there, while launchTask() does solely task
>>>>>>>> initialization (basically, starting a thread). Check the project John
>>>>>>>> referenced:
>>>>>>>> https://github.com/mesosphere/RENDLER.
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Alex
>>>>>>>>
>>>>>>>> On Fri, Feb 27, 2015 at 11:03 AM, Diego Medina <[email protected]
>>>>>>>> > wrote:
>>>>>>>>
>>>>>>>>> Hi Billy,
>>>>>>>>>
>>>>>>>>> comments inline:
>>>>>>>>>
>>>>>>>>> On Fri, Feb 27, 2015 at 4:07 AM, Billy Bones <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Diego, as a real fan of Golang, kudos and applause for
>>>>>>>>>> your work on this distributed crawler; I hope you'll finally
>>>>>>>>>> release it ;-)
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks! My 3-month-old baby is making sure I don't sleep much and
>>>>>>>>> have plenty of time to work on this project :)
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> About your question, the common architecture is to have one
>>>>>>>>>> scheduler and multiple executors rather than one big executor.
>>>>>>>>>> The basic idea of Mesos is to take any resources, put them together
>>>>>>>>>> in a pool, and then spread tasks across this pool. So the
>>>>>>>>>> architecture of your application should share this philosophy:
>>>>>>>>>> decouple your application as much as possible, but be careful not to
>>>>>>>>>> deadlock yourself on threads and tasks if they depend on each other.
>>>>>>>>>>
>>>>>>>>>> I don't know if I'm explaining myself correctly, so do not
>>>>>>>>>> hesitate to ask if you need more clarification.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Your answer was very clear. Today I started to split the executor
>>>>>>>>> into two: one that simply fetches the html and a second one that
>>>>>>>>> extracts text without tags from it. This second executor gets the
>>>>>>>>> data from a database; so far it seems like a natural way to split the
>>>>>>>>> tasks. I was going with the idea of also having two schedulers, but I
>>>>>>>>> think I just figured out how to use just one.
>>>>>>>>>
>>>>>>>>> Thanks!
>>>>>>>>>
>>>>>>>>> Diego
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2015-02-26 21:50 GMT+01:00 Diego Medina <[email protected]>:
>>>>>>>>>>
>>>>>>>>>>> @John: thanks for the link, I see that RENDLER uses the
>>>>>>>>>>> ExecutorId from ExecutorInfo to decide what to do, I'll give this a
>>>>>>>>>>> try.
>>>>>>>>>>> @Craig: you are right, after I sent the email I continued to
>>>>>>>>>>> read more of the Mesos docs and saw that I used the wrong term: I
>>>>>>>>>>> said framework where I meant scheduler, thanks.
>>>>>>>>>>>
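The idea of choosing behavior from the executor ID can be sketched in Go as a lookup table. This is an illustration only and not RENDLER's or owlcrawler's code: the IDs "fetcher" and "extractor", the `Handler` type, and `Dispatch` are hypothetical, and the handlers fake their work with string operations.

```go
package main

import (
	"fmt"
	"strings"
)

// Handler (hypothetical) does the work for one kind of executor.
type Handler func(payload string) string

// handlers maps a hypothetical executor ID to the work it performs.
var handlers = map[string]Handler{
	// Pretends to download a page; a real fetcher would do an HTTP GET.
	"fetcher": func(url string) string {
		return "<html>page at " + url + "</html>"
	},
	// Strips the outer <html> wrapper; a real extractor would parse the markup.
	"extractor": func(html string) string {
		return strings.TrimSuffix(strings.TrimPrefix(html, "<html>"), "</html>")
	},
}

// Dispatch picks the handler that matches the executor ID and runs it.
func Dispatch(executorID, payload string) (string, error) {
	h, ok := handlers[executorID]
	if !ok {
		return "", fmt.Errorf("unknown executor ID %q", executorID)
	}
	return h(payload), nil
}

func main() {
	html, err := Dispatch("fetcher", "http://example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	text, err := Dispatch("extractor", html)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(text)
}
```

A table like this keeps a single scheduler free of long if/else chains as more executor kinds are added.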
>>>>>>>>>>> Thanks and looking forward to any other feedback you may all
>>>>>>>>>>> have.
>>>>>>>>>>>
>>>>>>>>>>> Diego
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Feb 26, 2015 at 5:24 AM, craig w <[email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Diego,
>>>>>>>>>>>>
>>>>>>>>>>>> I'm also interested in hearing feedback on your question. One
>>>>>>>>>>>> minor thing I'd point out is that a Framework is made up of a
>>>>>>>>>>>> Scheduler and Executor(s), so I think it's more correct to say
>>>>>>>>>>>> you've created a Scheduler (instead of "one big framework") and an
>>>>>>>>>>>> Executor.
>>>>>>>>>>>>
>>>>>>>>>>>> Anyhow, for what it's worth, the Aurora framework has multiple
>>>>>>>>>>>> executors (
>>>>>>>>>>>> https://github.com/apache/incubator-aurora/blob/master/examples/vagrant/aurorabuild.sh#L61).
>>>>>>>>>>>> You might pop into the #aurora IRC chat room and ask; usually a
>>>>>>>>>>>> few Aurora contributors are in there answering questions when they
>>>>>>>>>>>> can.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Feb 25, 2015 at 9:02 PM, John Pampuch <
>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Diego-
>>>>>>>>>>>>>
>>>>>>>>>>>>> You might want to look at this project for some insights:
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://github.com/mesosphere/RENDLER
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -John
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Feb 25, 2015 at 5:27 PM, Diego Medina <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Short: Is it better to have one big framework and executor
>>>>>>>>>>>>>> with if statements to select what to do, or several smaller
>>>>>>>>>>>>>> framework <-> executor pairs, when writing a Mesos app?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Longer question:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Last week I started a side project based on mesos (using Go),
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> http://blog.fmpwizard.com/blog/web-crawler-using-mesos-and-golang
>>>>>>>>>>>>>> https://github.com/fmpwizard/owlcrawler
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It's a web crawler written on top of Mesos. The very first
>>>>>>>>>>>>>> version of it had a framework that sent a task to an executor,
>>>>>>>>>>>>>> and that single executor would fetch the page, extract links from
>>>>>>>>>>>>>> the html and then send them to a message queue.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Then the framework reads that queue and starts again, running the
>>>>>>>>>>>>>> executor, etc.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Now I'm splitting fetching the html and extracting links into
>>>>>>>>>>>>>> two separate tasks, and putting those two tasks in the same
>>>>>>>>>>>>>> executor doesn't feel right, so I'm thinking that I need at least
>>>>>>>>>>>>>> two different executors and one framework. But then I wonder if
>>>>>>>>>>>>>> people more experienced with Mesos would normally write several
>>>>>>>>>>>>>> framework <-> executor pairs to keep the design cleaner.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In this particular case, I can see the project growing into
>>>>>>>>>>>>>> even more tasks that can be decoupled.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Any feedback on the design would be great and let me know if
>>>>>>>>>>>>>> I should explain this better.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Diego
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Diego Medina
>>>>>>>>>>>>>> Lift/Scala consultant
>>>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>>> http://fmpwizard.telegr.am
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>> https://github.com/mindscratch
>>>>>>>>>>>> https://www.google.com/+CraigWickesser
>>>>>>>>>>>> https://twitter.com/mind_scratch
>>>>>>>>>>>> https://twitter.com/craig_links
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> Diego Medina
>>>>>>>>>>> Lift/Scala consultant
>>>>>>>>>>> [email protected]
>>>>>>>>>>> http://fmpwizard.telegr.am
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Diego Medina
>>>>>>>>> Lift/Scala consultant
>>>>>>>>> [email protected]
>>>>>>>>> http://fmpwizard.telegr.am
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> Diego Medina
>>>>>>> Lift/Scala consultant
>>>>>>> [email protected]
>>>>>>> http://fmpwizard.telegr.am
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Diego Medina
>>>>>> Lift/Scala consultant
>>>>>> [email protected]
>>>>>> http://fmpwizard.telegr.am
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>>
>> --
>> Diego Medina
>> Lift/Scala consultant
>> [email protected]
>> http://fmpwizard.telegr.am
>>
>
>


-- 
Diego Medina
Lift/Scala consultant
[email protected]
http://fmpwizard.telegr.am
