On Tue, Nov 21, 2017, at 22:11, Jun Rao wrote:
> Hi, Jay,
> 
> I guess in your proposal the leader has to cache the last offset given
> back for each partition so that it knows from which offset to serve the next
> fetch request.

Hi Jun,

Just to clarify, the leader has to cache the last offset for each
follower / UUID in the original KIP-227 proposal as well.  Sorry if that
wasn't clear.
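
To make that concrete, here is a rough sketch of the kind of per-session
bookkeeping I mean (hypothetical names, Java, not code from the KIP):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.UUID;

    import org.apache.kafka.common.TopicPartition;

    // Hypothetical sketch (names not from the KIP): state the leader
    // keeps per fetch session (follower / UUID) so it knows where the
    // next incremental fetch for each partition should start.
    class FetchSessionSketch {
        final UUID followerUuid;
        // Last offset handed back to this follower, per partition.
        final Map<TopicPartition, Long> lastOffsetReturned = new HashMap<>();

        FetchSessionSketch(UUID followerUuid) {
            this.followerUuid = followerUuid;
        }

        // Record the offset we served, so the next incremental fetch
        // for this partition can resume without the follower resending it.
        void onFetchResponse(TopicPartition tp, long nextOffset) {
            lastOffsetReturned.put(tp, nextOffset);
        }

        long nextFetchOffset(TopicPartition tp, long defaultOffset) {
            return lastOffsetReturned.getOrDefault(tp, defaultOffset);
        }
    }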

> This is doable but it means that the leader needs to do an
> additional index lookup per partition to serve a fetch request. Not sure
> whether the benefit from the lighter fetch request clearly outweighs the
> additional index lookup, though.

The runtime impact should be a small constant factor at most, right? 
You would just have a mapping between UUID and the latest offset in each
partition data structure.  Looking up the fetch offset in a hash table
(or small array) inside the in-memory partition data structure should
cost about the same as reading it out of the FetchRequest itself.
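
To illustrate, here is a hypothetical sketch (Java, not from the KIP) of
the two lookups side by side; either way it is a constant-time operation:

    import java.util.HashMap;
    import java.util.Map;
    import java.util.UUID;

    // Hypothetical sketch: the fetch offset can come from the request
    // itself (the current protocol) or from a small per-partition map
    // keyed by follower UUID (the proposal).  Both are O(1) lookups.
    class PartitionStateSketch {
        // One entry per fetching follower UUID, so O(num_brokers)
        // entries per partition.
        private final Map<UUID, Long> lastFetchedOffset = new HashMap<>();

        // Current protocol: the offset arrives in every FetchRequest.
        long offsetFromRequest(long fetchOffsetInRequest) {
            return fetchOffsetInRequest;
        }

        // Proposal: the leader remembers the offset it served last time.
        long offsetFromCache(UUID followerUuid, long logStartOffset) {
            return lastFetchedOffset.getOrDefault(followerUuid, logStartOffset);
        }

        void update(UUID followerUuid, long nextOffset) {
            lastFetchedOffset.put(followerUuid, nextOffset);
        }
    }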

The extra memory consumption per partition is O(num_brokers), which is
essentially a small constant.  (The fact that brokers can have multiple
UUIDs due to parallel fetches is a small wrinkle.  But we can place an
upper bound on the number of UUIDs permitted per broker.)
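
One hypothetical way to enforce that bound (again just a sketch, not
part of the KIP): keep an LRU map of UUIDs per broker and evict the
least recently used entry when the cap is exceeded:

    import java.util.LinkedHashMap;
    import java.util.Map;
    import java.util.UUID;

    // Hypothetical sketch: bound the number of fetch UUIDs tracked per
    // broker by evicting the least recently used one at the cap.
    // Values are last-used timestamps (millis).
    class BrokerUuidCacheSketch extends LinkedHashMap<UUID, Long> {
        static final int MAX_UUIDS_PER_BROKER = 16; // assumed cap, not from the KIP

        BrokerUuidCacheSketch() {
            super(16, 0.75f, true); // access-order iteration gives LRU behavior
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<UUID, Long> eldest) {
            return size() > MAX_UUIDS_PER_BROKER;
        }
    }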

best,
Colin

> 
> Thanks,
> 
> Jun
> 
> On Tue, Nov 21, 2017 at 7:03 PM, Jay Kreps <j...@confluent.io> wrote:
> 
> > I think the general thrust of this makes a ton of sense.
> >
> > I don't love that we're introducing a second type of fetch request. I think
> > the motivation is for compatibility, right? But isn't that what versioning
> > is for? Basically, to me, although the modification we're making makes
> > sense, the resulting protocol doesn't really seem like something you
> > would design this way if you were starting from scratch.
> >
> > I think I may be misunderstanding the semantics of the partitions in
> > IncrementalFetchRequest. I think the intention is that the server remembers
> > the partitions you last requested, and the partitions you specify in the
> > request are added to this set. This is a bit odd though because you can add
> > partitions but I don't see how you remove them, so it doesn't really let
> > you fully make changes incrementally. I suspect I'm misunderstanding that
> > somehow, though. You'd also need to be a little bit careful that the
> > server's idea of what the client is interested in could never diverge
> > from the client's idea as you made these modifications over time (due
> > to bugs or whatever).
> >
> > It seems like an alternative would be to not add a second request, but
> > instead change the fetch api and implementation
> >
> >    1. We save the partitions you last fetched on that connection in the
> >    session for the connection (as I think you are proposing)
> >    2. It only gives you back info on partitions that have data or have
> >    changed (no reason you need the others, right?)
> >    3. Not specifying any partitions means "give me the usual", as defined
> >    by whatever you requested before attached to the session.
> >
> > This would be a new version of the fetch API, so compatibility would be
> > retained by retaining the older version as is.
> >
> > This seems conceptually simpler to me. It's true that you have to resend
> > the full set whenever you want to change it, but that actually seems less
> > error-prone, and changes should be rare.
> >
> > I suspect you guys thought about this and it doesn't quite work, but maybe
> > you could explain why?
> >
> > -Jay
> >
> > On Tue, Nov 21, 2017 at 1:02 PM, Colin McCabe <cmcc...@apache.org> wrote:
> >
> > > Hi all,
> > >
> > > I created a KIP to improve the scalability and latency of FetchRequest:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
> > >
> > > Please take a look.
> > >
> > > cheers,
> > > Colin
> > >
> >
