+1 to the two jobs idea.

Not only do you get the benefits that Stephan mentioned, but Aurora already
assumes that a job is a pool of like tasks. This assumption is used right
now by the maintenance tooling (where it tries to keep 95% of the instances
up at any given time), and it may be relied on elsewhere in the future. By
going the two-jobs route you avoid the case where Aurora reschedules shard
0 and disrupts all of the work done.

It is possible to make shard 0 the coordinator/driver task and I have seen
production systems which do this. However, I advise against it for the
reasons mentioned. If you do find creating 2 jobs is more difficult to
orchestrate, could you please detail why?
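For reference, both jobs can live in a single .aurora config file, so there
isn't much extra to orchestrate. The sketch below is illustrative only: the
cluster/role names, shell commands, and resource sizes are placeholders, not
a tested Spark setup.

```python
# spark.aurora -- sketch of the two-job layout (hypothetical names/commands)
driver_proc = Process(name='driver', cmdline='run-spark-driver.sh')        # placeholder command
executor_proc = Process(name='executor', cmdline='run-spark-executor.sh')  # placeholder command

resources = Resources(cpu=2.0, ram=4*GB, disk=8*GB)  # illustrative sizing

jobs = [
  # A single driver instance...
  Job(cluster='devcluster', role='spark', environment='prod',
      name='spark_driver', instances=1,
      task=SequentialTask(processes=[driver_proc], resources=resources)),
  # ...and a separate pool of executors that Aurora can reschedule freely.
  Job(cluster='devcluster', role='spark', environment='prod',
      name='spark_executor', instances=10,
      task=SequentialTask(processes=[executor_proc], resources=resources)),
]
```

You would then create each job with its own client invocation, e.g.
`aurora job create devcluster/spark/prod/spark_driver spark.aurora` and the
same for spark_executor.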

On Tue, Dec 22, 2015 at 9:38 AM, Erb, Stephan <[email protected]>
wrote:

> Hi Chris,
>
>
> we are running an internal batch processing framework on
> Aurora, consisting of a single master and multiple workers.
>
>
> We opted for the 2 jobs idea. The main advantage I see with this approach
> is that you actually keep separate things separate, without having to teach
> all external systems (service discovery, load balancer, your monitoring
> solution, etc.) that the first instance is different.
>
>
> Best Regards,
>
> Stephan
>
>
>
>
> ------------------------------
> *From:* Chris Bannister <[email protected]>
> *Sent:* Tuesday, December 22, 2015 2:47 PM
> *To:* [email protected]
> *Subject:* Launching master/slave jobs in Aurora
>
> Hi, I'm doing some work to get Apache Spark running in Aurora and it seems
> to work reasonably well without many changes to Spark, the only issue I'm
> running into is launching it over many instances.
>
> Spark runs in a driver/executor model, where the driver coordinates work
> on the executors. The problem I have is that I want to launch the executors
> and driver independently, i.e. I want to have 10 executors and 1 driver. I
> can accomplish this by having 2 jobs, a driver job and an executor job, but
> launching this seems a bit complicated to orchestrate. Another option would
> be to declare the job with 2 tasks, have the driver run on shard 0 and the
> executors on the rest.
>
> Has anyone had any experience with running similar systems in Aurora? I
> imagine Heron must have to do something similar, launching the topology
> master and workers.
>
> Chris
>



-- 
Zameer Manji
