Re: [DISCUSS] KIP-405: Kafka Tiered Storage

Tom Bentley Fri, 08 Nov 2019 00:06:50 -0800

Thanks for those insights Ying.

On Thu, Nov 7, 2019 at 9:26 PM Ying Zheng <yi...@uber.com.invalid> wrote:


> >
> >
> >
> > Thanks, I missed that point. However, there's still a point at which the
> > consumer fetches start getting served from remote storage (even if that
> > point isn't as soon as the local log retention time/size). This represents
> > a kind of performance cliff edge and what I'm really interested in is how
> > easy it is for a consumer which falls off that cliff to catch up and so its
> > fetches again come from local storage. Obviously this can depend on all
> > sorts of factors (like production rate, consumption rate), so it's not
> > guaranteed (just like it's not guaranteed for Kafka today), but this would
> > represent a new failure mode.
> >
>
> As I explained in the last mail, it's very rare for a consumer to need
> to read remote data. In our experience at Uber, this only happens
> when the consumer service has had an outage for several hours.
>
> There is not a "performance cliff" as you assume. The remote storage is
> even faster than local disks in terms of bandwidth. Reading from remote
> storage is going to have higher latency than local disk. But since the
> consumer is catching up on several hours of data, it's not sensitive to
> sub-second latency, and each remote read request will read a large amount
> of data to make the overall performance better than reading from local disks.
>
>
>
> > Another aspect I'd like to understand better is the effect that serving
> > fetch requests from remote storage has on the broker's network utilization.
> > If we're just trimming the amount of data held locally (without increasing
> > the overall local+remote retention), then we're effectively trading disk
> > bandwidth for network bandwidth when serving fetch requests from remote
> > storage (which I understand to be a good thing, since brokers are
> > often/usually disk bound). But if we're increasing the overall local+remote
> > retention then it's more likely that network itself becomes the bottleneck.
> > I appreciate this is all rather hand wavy, I'm just trying to understand
> > how this would affect broker performance, so I'd be grateful for any
> > insights you can offer.
> >
> >
> Network bandwidth is a function of produce speed; it has nothing to do with
> remote retention. Once the data is shipped to remote storage, you can
> keep it there for 1 day or 1 year or 100 years, and it doesn't consume
> any network resources.
>
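
To make the catch-up question above a bit more concrete, here is a rough
back-of-the-envelope sketch. It is only an illustration: the function name,
the rates, the outage length and the size of the local window are all made-up
assumptions, not measurements from Uber or from any real cluster, and it
ignores batching, throttling and everything else a real broker does. It just
encodes the observation that a lagging consumer closes its lag at
(consume rate - produce rate), and is back on local storage once its lag is
smaller than whatever is still held locally.

```python
def seconds_until_local(lag_bytes, produce_rate, consume_rate, local_window_bytes=0):
    """Estimate how long a lagging consumer needs before its fetches are
    served from local storage again.

    lag_bytes          -- how far behind the log end the consumer is (hypothetical)
    produce_rate       -- bytes/s being appended to the partition (hypothetical)
    consume_rate       -- bytes/s the consumer can read while catching up (hypothetical)
    local_window_bytes -- newest bytes still held on local disk; 0 means
                          "catch up all the way to the log end"

    Returns None when the consumer is not faster than the producers,
    i.e. it never catches up.
    """
    net_rate = consume_rate - produce_rate
    if net_rate <= 0:
        return None
    remaining = max(lag_bytes - local_window_bytes, 0)
    return remaining / net_rate


MB = 1024 * 1024

# Assumed scenario: a 4-hour consumer outage while producers write 10 MB/s.
lag = 4 * 3600 * 10 * MB

# Consumer reads at 30 MB/s while catching up, so the lag shrinks at 20 MB/s.
to_log_end = seconds_until_local(lag, 10 * MB, 30 * MB)

# Same scenario, but the newest hour of data (at 10 MB/s) is still local.
to_local = seconds_until_local(lag, 10 * MB, 30 * MB, local_window_bytes=3600 * 10 * MB)

# A consumer that is no faster than the producers never gets back.
never = seconds_until_local(lag, 10 * MB, 10 * MB)
```

With these assumed numbers the consumer needs two hours to reach the log end,
and an hour and a half to get back within the local window. The model says
nothing about which side (remote bandwidth, broker network, consumer) actually
limits the consume rate, which is the part I'm still trying to understand.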
