Hi Diego, do you plan to benchmark your results on the Mesos platform
vs. bare-metal servers?
It would be really interesting for enterprise evangelism; they love
benchmarks and metrics.

I'm impressed by your project and how fast it's moving. I'm a fan of
Golang myself, but why did you choose it?

2015-03-03 3:03 GMT+01:00 Diego Medina <[email protected]>:

> Hi everyone, based on all the great feedback I got here I updated the code
> and now I have one scheduler and two executors, one for fetching html and a
> second one that extracts links and text from the html.
> I also run the actual work in its own goroutines (like threads, for those
> not familiar with Go) and it's working great.
>
> I wrote about the changes here
> http://blog.fmpwizard.com/blog/owlcrawler-multiple-executors-using-meso
> and you can find the updated code here
> https://github.com/fmpwizard/owlcrawler
>
> Again, thanks everyone for your input.
>
> Diego
>
>
>
>
> On Fri, Feb 27, 2015 at 1:52 PM, Diego Medina <[email protected]> wrote:
>
>> Thanks for looking at the code and the feedback Alex. I'll be working on
>> those changes later tonight!
>>
>> Diego
>>
>> On Fri, Feb 27, 2015 at 12:15 PM, Alex Rukletsov <[email protected]>
>> wrote:
>>
>>> Diego,
>>>
>>> I've checked your code, nice effort! Great to see people hacking with
>>> mesos and go bindings!
>>>
>>> One thing though. You do the actual job in the launchTask() of your
>>> executor. This prevents you from running multiple tasks in parallel on one
>>> executor. That means you can't have more simultaneous tasks than executors
>>> in your cluster. You may want to spawn a thread for every incoming task and
>>> do the job there, while launchTask() does solely task initialization
>>> (basically, starting a thread). Check the project John referenced:
>>> https://github.com/mesosphere/RENDLER.
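
Alex's suggestion can be sketched roughly as follows. This is a minimal, self-contained illustration and not the mesos-go executor interface: the `Task` and `Executor` types, the `LaunchTask` signature, and the `doWork` helper are all hypothetical names invented for this example. The point is only that `LaunchTask` returns right after starting a goroutine, so one executor can have several tasks in flight.

```go
package main

import (
	"fmt"
	"sync"
)

// Task is a hypothetical stand-in for the task description an executor receives.
type Task struct {
	ID  string
	URL string
}

// Executor is a hypothetical executor; it is NOT the mesos-go interface.
type Executor struct {
	wg      sync.WaitGroup
	mu      sync.Mutex
	results map[string]string
}

// NewExecutor returns an executor with an empty result map.
func NewExecutor() *Executor {
	return &Executor{results: make(map[string]string)}
}

// LaunchTask only starts a goroutine and returns immediately,
// so the executor is free to accept the next task right away.
func (e *Executor) LaunchTask(t Task) {
	e.wg.Add(1)
	go func() {
		defer e.wg.Done()
		out := doWork(t)
		e.mu.Lock()
		e.results[t.ID] = out
		e.mu.Unlock()
	}()
}

// doWork stands in for the real job (for example, fetching a page).
func doWork(t Task) string {
	return "fetched " + t.URL
}

// Wait blocks until every launched task has finished and returns a copy of the results.
func (e *Executor) Wait() map[string]string {
	e.wg.Wait()
	e.mu.Lock()
	defer e.mu.Unlock()
	out := make(map[string]string, len(e.results))
	for k, v := range e.results {
		out[k] = v
	}
	return out
}

func main() {
	e := NewExecutor()
	for i := 0; i < 3; i++ {
		e.LaunchTask(Task{
			ID:  fmt.Sprintf("task-%d", i),
			URL: fmt.Sprintf("http://example.com/%d", i),
		})
	}
	fmt.Println(len(e.Wait()), "tasks finished")
}
```

In a real executor the goroutine would also report task status back to the driver; that part is omitted here because it depends on the bindings in use.
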
>>>
>>> Best,
>>> Alex
>>>
>>> On Fri, Feb 27, 2015 at 11:03 AM, Diego Medina <[email protected]>
>>> wrote:
>>>
>>>> Hi Billy,
>>>>
>>>> comments inline:
>>>>
>>>> On Fri, Feb 27, 2015 at 4:07 AM, Billy Bones <[email protected]>
>>>> wrote:
>>>>
>>>>> Hi Diego, as a real fan of Golang, kudos and applause for your
>>>>> work on this distributed crawler; I hope you'll finally release it ;-)
>>>>>
>>>>>
>>>>
>>>> Thanks! My 3-month-old baby is making sure I don't sleep much and have
>>>> plenty of time to work on this project :)
>>>>
>>>>
>>>>> About your question, the common architecture is to have one scheduler
>>>>> and multiple executors rather than one big executor.
>>>>> The basic idea of Mesos is to take any resources, put them together in a
>>>>> pool, and then swarm tasks onto this pool. So the architecture of
>>>>> your application should share this philosophy: split up / decouple
>>>>> your application as much as possible, but be careful not to deadlock
>>>>> yourself on threads and tasks if they depend on each other.
>>>>>
>>>>> I don't know if I'm explaining myself correctly so do not hesitate if
>>>>> you need more clarification.
>>>>>
>>>>>
>>>>
>>>> Your answer was very clear. Today I started to split the executor into
>>>> two: one that simply fetches the html and a second one that extracts
>>>> text without tags from it. This second executor gets the data from a
>>>> database. So far it seems like a natural way to split the tasks. I was
>>>> going with the idea of also having two schedulers, but I think I just
>>>> figured out how to use just one.
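
One way a single scheduler might drive two executors is to pick the executor per task based on the stage the work item is in. The sketch below is an assumption about the shape of such logic, not code from owlcrawler or RENDLER: the executor IDs, the `WorkItem` and `TaskSpec` types, and the `Schedule` function are hypothetical and use no Mesos bindings.

```go
package main

import "fmt"

// Hypothetical executor IDs; the real project may name them differently.
const (
	FetcherID   = "owl-fetcher"
	ExtractorID = "owl-extractor"
)

// WorkItem is a hypothetical unit of work tracked by the scheduler.
type WorkItem struct {
	URL     string
	Fetched bool // true once the HTML has been stored and is ready for extraction
}

// TaskSpec is a hypothetical, simplified task description.
type TaskSpec struct {
	ExecutorID string
	URL        string
}

// Schedule builds one task per work item and picks the executor
// according to the item's stage: fetch first, then extract.
func Schedule(items []WorkItem) []TaskSpec {
	tasks := make([]TaskSpec, 0, len(items))
	for _, it := range items {
		id := FetcherID
		if it.Fetched {
			id = ExtractorID
		}
		tasks = append(tasks, TaskSpec{ExecutorID: id, URL: it.URL})
	}
	return tasks
}

func main() {
	items := []WorkItem{
		{URL: "http://example.com/"},
		{URL: "http://example.com/about", Fetched: true},
	}
	for _, t := range Schedule(items) {
		fmt.Printf("%s -> %s\n", t.URL, t.ExecutorID)
	}
}
```

Adding a third stage later would mean one more executor ID and one more branch in `Schedule`, while the scheduler itself stays a single component.
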
>>>>
>>>> Thanks!
>>>>
>>>> Diego
>>>>
>>>>
>>>>
>>>>>
>>>>> 2015-02-26 21:50 GMT+01:00 Diego Medina <[email protected]>:
>>>>>
>>>>>> @John: thanks for the link, I see that RENDLER uses the ExecutorId
>>>>>> from ExecutorInfo to decide what to do, I'll give this a try.
>>>>>> @Craig: you are right, after I sent the email I continued to read
>>>>>> more of the mesos docs and saw that I used the wrong term: I said
>>>>>> framework where I meant scheduler, thanks.
>>>>>>
>>>>>> Thanks and looking forward to any other feedback you may all have.
>>>>>>
>>>>>> Diego
>>>>>>
>>>>>>
>>>>>> On Thu, Feb 26, 2015 at 5:24 AM, craig w <[email protected]> wrote:
>>>>>>
>>>>>>> Diego,
>>>>>>>
>>>>>>> I'm also interested in hearing feedback on your question. One minor
>>>>>>> thing I'd point out is that a Framework is made up of a Scheduler and
>>>>>>> Executor(s), so I think it's more correct to say you've created a
>>>>>>> Scheduler (instead of "one big framework") and an Executor.
>>>>>>>
>>>>>>> Anyhow, for what it's worth, the Aurora framework has multiple
>>>>>>> executors (
>>>>>>> https://github.com/apache/incubator-aurora/blob/master/examples/vagrant/aurorabuild.sh#L61).
>>>>>>> You might pop into the #aurora IRC chat room and ask; usually a few
>>>>>>> Aurora contributors are in there answering questions when they can.
>>>>>>>
>>>>>>> On Wed, Feb 25, 2015 at 9:02 PM, John Pampuch <[email protected]>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Diego-
>>>>>>>>
>>>>>>>> You might want to look at this project for some insights:
>>>>>>>>
>>>>>>>> https://github.com/mesosphere/RENDLER
>>>>>>>>
>>>>>>>>
>>>>>>>> -John
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Feb 25, 2015 at 5:27 PM, Diego Medina <[email protected]>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Short: Is it better to have one big framework and executor with if
>>>>>>>>> statements to select what to do or several smaller framework <->
>>>>>>>>> executors when writing a Mesos app?
>>>>>>>>>
>>>>>>>>> Longer question:
>>>>>>>>>
>>>>>>>>> Last week I started a side project based on mesos (using Go),
>>>>>>>>>
>>>>>>>>> http://blog.fmpwizard.com/blog/web-crawler-using-mesos-and-golang
>>>>>>>>> https://github.com/fmpwizard/owlcrawler
>>>>>>>>>
>>>>>>>>> It's a web crawler written on top of Mesos. The very first version
>>>>>>>>> of it had a framework that sent a task to an executor, and that single
>>>>>>>>> executor would fetch the page, extract links from the html and then
>>>>>>>>> send them to a message queue.
>>>>>>>>>
>>>>>>>>> Then the framework reads that queue and starts again: it runs the
>>>>>>>>> executor, etc, etc.
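
The fetch, extract, enqueue loop described above can be illustrated without Mesos at all. The following is only a sketch under stated assumptions: the queue is an in-memory slice rather than a message queue, `fetch` is injected so the example runs without network access, the `seen` set is an addition of this example (the original does not mention deduplication), and the regular expression is a deliberately naive stand-in for a real HTML parser. `ExtractLinks` and `Crawl` are hypothetical names.

```go
package main

import (
	"fmt"
	"regexp"
)

// hrefRe is a deliberately naive pattern for double-quoted href attributes;
// a real crawler would use an HTML parser instead.
var hrefRe = regexp.MustCompile(`href="([^"]+)"`)

// ExtractLinks returns the href values found in html, in order of appearance.
func ExtractLinks(html string) []string {
	var links []string
	for _, m := range hrefRe.FindAllStringSubmatch(html, -1) {
		links = append(links, m[1])
	}
	return links
}

// Crawl takes URLs from the queue one at a time, fetches each page,
// and enqueues every link that has not been seen before.
// It returns the URLs in the order they were visited.
func Crawl(start string, fetch func(url string) string) []string {
	queue := []string{start}
	seen := map[string]bool{start: true} // assumption: skip already-queued URLs
	var visited []string
	for len(queue) > 0 {
		url := queue[0]
		queue = queue[1:]
		visited = append(visited, url)
		for _, link := range ExtractLinks(fetch(url)) {
			if !seen[link] {
				seen[link] = true
				queue = append(queue, link)
			}
		}
	}
	return visited
}

func main() {
	// A tiny in-memory "web" so the sketch runs offline.
	pages := map[string]string{
		"/":  `<a href="/a">A</a> <a href="/b">B</a>`,
		"/a": `<a href="/">home</a>`,
		"/b": `<a href="/a">A</a>`,
	}
	fetch := func(url string) string { return pages[url] }
	fmt.Println(Crawl("/", fetch))
}
```

In the Mesos version of this loop, the body of `Crawl` is what gets split apart: the scheduler owns the queue, and the fetch and extract steps run as tasks on executors.
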
>>>>>>>>>
>>>>>>>>> Now I'm splitting fetching the html and extracting links into two
>>>>>>>>> separate tasks, and putting those two tasks in the same executor
>>>>>>>>> doesn't feel right, so I'm thinking that I need at least two different
>>>>>>>>> executors and one framework. But then I wonder if people more
>>>>>>>>> experienced with mesos would normally write several pairs of
>>>>>>>>> framework <-> executors to keep the design cleaner.
>>>>>>>>>
>>>>>>>>> In this particular case, I can see the project growing into even
>>>>>>>>> more tasks that can be decoupled.
>>>>>>>>>
>>>>>>>>> Any feedback on the design would be great and let me know if I
>>>>>>>>> should explain this better.
>>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Diego
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> Diego Medina
>>>>>>>>> Lift/Scala consultant
>>>>>>>>> [email protected]
>>>>>>>>> http://fmpwizard.telegr.am
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>>
>>>>>>> https://github.com/mindscratch
>>>>>>> https://www.google.com/+CraigWickesser
>>>>>>> https://twitter.com/mind_scratch
>>>>>>> https://twitter.com/craig_links
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Diego Medina
>>>>>> Lift/Scala consultant
>>>>>> [email protected]
>>>>>> http://fmpwizard.telegr.am
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>> Diego Medina
>>>> Lift/Scala consultant
>>>> [email protected]
>>>> http://fmpwizard.telegr.am
>>>>
>>>
>>>
>>
>>
>> --
>> Diego Medina
>> Lift/Scala consultant
>> [email protected]
>> http://fmpwizard.telegr.am
>>
>
>
>
> --
> Diego Medina
> Lift/Scala consultant
> [email protected]
> http://fmpwizard.telegr.am
>
