Thanks for the KIP, Colin. As others have said, I think something along
these lines is definitely useful, we just need to work out the various
details.

Jun and Jay, I think you are discussing 2 orthogonal things:

1. Should we have a separate IncrementalFetchRequest or do we reuse
FetchRequest? Like Jay, I have a preference for the latter (with
appropriate modifications). If we want to make it explicit, we could add a
`type` field to differentiate between Full and Incremental requests. I am
willing to reconsider this if there are significant advantages in using a
separate request type.

2. Whether the follower selectively sends partitions with the fetch offset
for incremental fetches or whether the leader keeps track of the offsets.
As Jun said, this is a tradeoff between making the fetch requests even
lighter and doing a bit more work on the leader.

Ismael

On Wed, Nov 22, 2017 at 6:11 AM, Jun Rao <j...@confluent.io> wrote:

> Hi, Jay,
>
> I guess in your proposal the leader has to cache the last offset given back
> for each partition so that it knows from which offset to serve the next
> fetch request. This is doable but it means that the leader needs to do an
> additional index lookup per partition to serve a fetch request. Not sure if
> the benefit from the lighter fetch request obviously offsets the additional
> index lookup though.
>
> Thanks,
>
> Jun
>
> On Tue, Nov 21, 2017 at 7:03 PM, Jay Kreps <j...@confluent.io> wrote:
>
> > I think the general thrust of this makes a ton of sense.
> >
> > I don't love that we're introducing a second type of fetch request. I
> > think the motivation is for compatibility, right? But isn't that what
> > versioning is for? Basically to me although the modification we're making
> > makes sense, the resulting protocol doesn't really seem like something
> > you would design this way from scratch.
> >
> > I think I may be misunderstanding the semantics of the partitions in
> > IncrementalFetchRequest. I think the intention is that the server
> > remembers the partitions you last requested, and the partitions you
> > specify in the request are added to this set. This is a bit odd though
> > because you can add partitions but I don't see how you remove them, so it
> > doesn't really let you fully make changes incrementally. I suspect I'm
> > misunderstanding that somehow, though. You'd also need to be a little bit
> > careful that there was no way for the server's idea of what the client is
> > interested in and the client's idea to ever diverge as you made these
> > modifications over time (due to bugs or whatever).
> >
> > It seems like an alternative would be to not add a second request, but
> > instead change the fetch api and implementation
> >
> >    1. We save the partitions you last fetched on that connection in the
> >    session for the connection (as I think you are proposing)
> >    2. It only gives you back info on partitions that have data or have
> >    changed (no reason you need the others, right?)
> >    3. Not specifying any partitions means "give me the usual", as defined
> >    by whatever you requested before attached to the session.
> >
> > This would be a new version of the fetch API, so compatibility would be
> > retained by retaining the older version as is.
> >
> > This seems conceptually simpler to me. It's true that you have to resend
> > the full set whenever you want to change it, but that actually seems less
> > error prone and that should be rare.
> >
> > I suspect you guys thought about this and it doesn't quite work, but
> > maybe you could explain why?
> >
> > -Jay
> >
> > On Tue, Nov 21, 2017 at 1:02 PM, Colin McCabe <cmcc...@apache.org>
> > wrote:
> >
> > > Hi all,
> > >
> > > I created a KIP to improve the scalability and latency of FetchRequest:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-227%3A+Introduce+Incremental+FetchRequests+to+Increase+Partition+Scalability
> > >
> > > Please take a look.
> > >
> > > cheers,
> > > Colin
> > >
> >
>