I didn't know about KAFKA-1215, thanks. I'm not sure it would fully address my concern about a producer writing to a partition leader in a different AZ, though.
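For what it's worth, if KAFKA-1215-style rack awareness does land, I'd guess it ends up as a per-broker tag that the controller uses when assigning replicas. A purely hypothetical sketch (the `broker.rack` property doesn't exist in any released Kafka as of this writing; the value mapping rack to AZ is my assumption):

```properties
# server.properties (hypothetical sketch) — per-broker rack tag so that
# replica assignment could spread each partition's replicas across AZs.
broker.id=1
broker.rack=us-east-1a
```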
To answer your question, I was thinking ephemerals with replication, yes. With a reservation, it's pretty easy to get e.g. two i2.xlarge for an amortized cost below a single m2.2xlarge with the same amount of EBS storage and provisioned IOPS.

On Mon, Sep 29, 2014 at 9:40 PM, Philip O'Toole <[email protected]> wrote:

> If only Kafka had rack awareness... you could run one cluster and set up
> the replicas in different AZs.
>
> https://issues.apache.org/jira/browse/KAFKA-1215
>
> As for your question about ephemeral versus EBS, I presume you are
> proposing to use ephemeral *with* replicas, right?
>
> Philip
>
> -----------------------------------------
> http://www.philipotoole.com
>
> On Monday, September 29, 2014 9:45 PM, Joe Crobak <[email protected]>
> wrote:
>
> We're planning a deploy to AWS EC2, and I was hoping to get some advice
> on best practices. I've seen the Loggly presentation [1], which has some
> good recommendations on instance types and EBS setup. Aside from that,
> there seem to be several options for a multi-Availability Zone (AZ)
> deployment. The ones we're considering are:
>
> 1) Treat each AZ as a separate data center. Producers write to the Kafka
> cluster in the same AZ. For consumption, two options:
> 1a) Designate one cluster the "master" cluster and use MirrorMaker. This
> was discussed here [2], where some gotchas related to offset management
> were raised.
> 1b) Build consumers that consume from both clusters (e.g., two Camus
> jobs, one for each cluster).
>
> Pros:
> * If there's a network partition between AZs (or extra latency), the
> consumer(s) will catch up once the event is resolved.
> * If an AZ goes offline, only unprocessed data in that AZ is lost until
> the AZ comes back online; the other AZ is unaffected. (Consumer failover
> is more complicated in 1a, it seems.)
> Cons:
> * Duplicate infrastructure, and either more moving parts (1a) or more
> complicated consumers (1b).
> * It's unclear how this scales if one wants to add a second region to
> the mix.
>
> 2) The second option is to treat the AZs as a single data center. In
> this case, there's no guarantee that a producer is writing to a node in
> the same AZ.
>
> Pros:
> * Simplified setup; all data is in one place.
> Cons:
> * Harder to design for availability. What if the leader of the partition
> is in a different AZ than the producer, and there's a network partition
> between AZs? If latency is high or throughput is low between AZs, write
> throughput suffers when `request.required.acks` = -1.
>
> Some other considerations:
> * ZooKeeper deploy: the best practice seems to be a 3-node ensemble
> spread across 3 AZs, but options 1a/1b would let us run a separate
> ensemble per AZ.
> * EBS / provisioned IOPS: the Loggly presentation predates Kafka 0.8
> replication. Are folks using ephemeral storage instead of EBS now?
> Provisioned IOPS can get expensive pretty quickly.
>
> Any suggestions/experience along these lines (or others!) would be
> greatly appreciated. If there's good feedback, I'd be happy to put
> together a wiki page with the details.
>
> Thanks,
> Joe
>
> [1] http://search-hadoop.com/m/4TaT4BQRJy
> [2] http://search-hadoop.com/m/4TaT49l0Gh/AWS+availability+zone/v=plain
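To make the acks tradeoff in option 2 concrete, this is roughly the 0.8 producer config in question (a sketch, not a recommended config; the timeout value is just the default):

```properties
# producer.properties (0.8-era producer) — sketch of the
# durability/latency tradeoff discussed above.
# -1: the leader waits for all in-sync replicas to acknowledge the write.
#     Safest, but a slow cross-AZ follower (or a partition between AZs)
#     stalls produce requests.
request.required.acks=-1
# How long the broker waits for those acks before failing the request.
request.timeout.ms=10000
```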
