+1 on Joe's suggestions, glad to see it happening!

On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira <gshap...@cloudera.com> wrote:

> The errors are part of the KIP process now, so I think the clients are
> safe :)
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin <steve.mo...@gmail.com>
> wrote:
> > Agree errors should be part of the protocol
> >
> >> On Jan 15, 2015, at 17:59, Gwen Shapira <gshap...@cloudera.com> wrote:
> >>
> >> Hi,
> >>
> >> I got convinced by Joe and Dana that errors are indeed part of the
> >> protocol and can't be randomly added.
> >>
> >> So, it looks like we need to bump version of ProduceRequest in the
> >> following way:
> >> Version 0 -> accept acks >1. I think we should keep the existing
> >> behavior too (i.e. not replace it with -1) to avoid surprising
> >> clients, but I'm willing to hear other opinions.
> >> Version 1 -> do not accept acks >1 and return an error.
> >> Are we ok with the error I added in KAFKA-1697? We can use something
> >> less specific like InvalidRequestParameter. This error can be reused
> >> in the future and reduce the need to add errors, but will also be less
> >> clear to the client and its users. Maybe even add the error message
> >> string to the protocol in addition to the error code? (since we are
> >> bumping versions....)
> >>
> >> I think maintaining the old version throughout 0.8.X makes sense. IMO
> >> dropping it for 0.9 is feasible, but I'll let client owners help make
> >> that call.
> >>
> >> Am I missing anything? Should I start a KIP? It seems like a KIP-type
> >> discussion :)
> >>
> >> Gwen
> >>
> >>
> >> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava
> >> <e...@confluent.io> wrote:
> >>> Gwen,
> >>>
> >>> I think the only option that wouldn't require a protocol version
> change is
> >>> the one where acks > 1 is converted to acks = -1 since it's the only
> one
> >>> that doesn't potentially break older clients. The protocol guide says
> that
> >>> the expected upgrade path is servers first, then clients, so old
> clients,
> >>> including non-java clients, that may be using acks > 1 should be able
> to
> >>> work with a new broker version.
> >>>
> >>> It's more work, but I think dealing with the protocol change is the
> right
> >>> thing to do since it eventually gets us to the behavior I think is
> better --
> >>> the broker should reject requests with invalid values. I think Joe and
> I
> >>> were basically in agreement. In my mind the major piece missing from
> his
> >>> description is how long we're going to maintain his "case 0" behavior.
> It's
> >>> impractical to maintain old versions forever, but it sounds like there
> >>> hasn't been a decision on how long to maintain them. Maybe that's
> another
> >>> item to add to KIPs -- protocol versions and behavior need to be
> listed as
> >>> deprecated and the earliest version in which they'll be removed should
> be
> >>> specified so users can understand which versions are guaranteed to be
> >>> compatible, even if they're using (well-written) non-java clients.
> >>>
> >>> -Ewen
> >>>
> >>>
> >>>> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers <dana.pow...@gmail.com>
> wrote:
> >>>>
> >>>>> clients don't break on unknown errors
> >>>>
> >>>> maybe true for the official java clients, but I dont think the
> assumption
> >>>> holds true for community-maintained clients and users of those
> clients.
> >>>> kafka-python generally follows the fail-fast philosophy and raises an
> >>>> exception on any unrecognized error code in any server response.  in
> this
> >>>> case, kafka-python allows users to set their own required-acks policy
> when
> >>>> creating a producer instance.  It is possible that users of
> kafka-python
> >>>> have deployed producer code that uses ack>1 -- perhaps in production
> >>>> environments -- and for those users the new error code will crash
> their
> >>>> producer code.  I would not be surprised if the same were true of
> other
> >>>> community clients.
> >>>>
> >>>> *one reason for the fail-fast approach is that there isn't great
> >>>> documentation on what errors to expect for each request / response --
> so
> >>>> we
> >>>> use failures to alert that some error case is not handled properly.
> and
> >>>> because of that, introducing new error cases without bumping the api
> >>>> version is likely to cause those errors to get raised/thrown all the
> way
> >>>> back up to the user.  of course we (client maintainers) can fix the
> issues
> >>>> in the client libraries and suggest users upgrade, but it's not the
> ideal
> >>>> situation.
> >>>>
> >>>>
> >>>> long-winded way of saying: I agree w/ Joe.
> >>>>
> >>>> -Dana
> >>>>
> >>>>
> >>>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira <gshap...@cloudera.com
> >
> >>>> wrote:
> >>>>
> >>>>> Is the protocol bump caused by the behavior change or the new error
> >>>>> code?
> >>>>>
> >>>>> 1) IMO, error_codes are data, and clients can expect to receive
> errors
> >>>>> that they don't understand (i.e. unknown errors). AFAIK, clients
> don't
> >>>>> break on unknown errors, they are simple more challenging to debug.
> If
> >>>>> we document the new behavior, then its definitely debuggable and
> >>>>> fixable.
> >>>>>
> >>>>> 2) The behavior change is basically a deprecation - i.e. acks > 1
> were
> >>>>> never documented, and are not supported by Kafka clients starting
> with
> >>>>> version 0.8.2. I'm not sure this requires a protocol bump either,
> >>>>> although its a better case than new error codes.
> >>>>>
> >>>>> Thanks,
> >>>>> Gwen
> >>>>>
> >>>>> On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein <joe.st...@stealth.ly>
> >>>>> wrote:
> >>>>>> Looping in the mailing list that the client developers live on
> because
> >>>>> they
> >>>>>> are all not on dev (though they should be if they want to be helping
> >>>>>> to
> >>>>>> build the best client libraries they can).
> >>>>>>
> >>>>>> I whole hardily believe that we need to not break existing
> >>>>>> functionality
> >>>>> of
> >>>>>> the client protocol, ever.
> >>>>>>
> >>>>>> There are many reasons for this and we have other threads on the
> >>>>>> mailing
> >>>>>> list where we are discussing that topic (no pun intended) that I
> don't
> >>>>> want
> >>>>>> to re-hash here.
> >>>>>>
> >>>>>> If we change wire protocol functionality OR the binary format
> (either)
> >>>>>> we
> >>>>>> must bump version AND treat version as a feature flag with backward
> >>>>>> compatibility support until it is deprecated for some time for folks
> >>>>>> to
> >>>>> deal
> >>>>>> with it.
> >>>>>>
> >>>>>> match version = {
> >>>>>> case 0: keepDoingWhatWeWereDoing()
> >>>>>> case 1: doNewStuff()
> >>>>>> case 2: doEvenMoreNewStuff()
> >>>>>> }
> >>>>>>
> >>>>>> has to be a practice we adopt imho ... I know feature flags can be
> >>>>> construed
> >>>>>> as "messy code" but I am eager to hear another (better? different?)
> >>>>> solution
> >>>>>> to this.
> >>>>>>
> >>>>>> If we don't do a feature flag like this specifically with this
> change
> >>>>> then
> >>>>>> what happens is that someone upgrades their brokers with a rolling
> >>>>> restart
> >>>>>> in 0.8.3 and every single one of their producer requests start to
> fail
> >>>>> and
> >>>>>> they have a major production outage. eeeek!!!!
> >>>>>>
> >>>>>> I do 100% agree that > 1 makes no sense and we *REALLY* need people
> to
> >>>>> start
> >>>>>> using 0,1,-1 but we need to-do that in a way that is going to work
> for
> >>>>>> everyone.
> >>>>>>
> >>>>>> Old producers and consumers must keep working with new brokers and
> if
> >>>>>> we
> >>>>> are
> >>>>>> not going to support that then I am unclear what the use of
> "version"
> >>>>>> is
> >>>>>> based on our original intentions of having it because of the
> >>>>>> 0.7=>-0.8.
> >>>>> We
> >>>>>> said no more breaking changes when we did that.
> >>>>>>
> >>>>>> - Joe Stein
> >>>>>>
> >>>>>> On Thu, Jan 15, 2015 at 12:38 PM, Ewen Cheslack-Postava <
> >>>>> e...@confluent.io>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Right, so this looks like it could create an issue similar to
> what's
> >>>>>>> currently being discussed in
> >>>>>>> https://issues.apache.org/jira/browse/KAFKA-1649 where users now
> get
> >>>>>>> errors
> >>>>>>> under conditions when they previously wouldn't. Old clients won't
> >>>>>>> even
> >>>>>>> know
> >>>>>>> about the error code, so besides failing they won't even be able to
> >>>>>>> log
> >>>>>>> any
> >>>>>>> meaningful error messages.
> >>>>>>>
> >>>>>>> I think there are two options for compatibility:
> >>>>>>>
> >>>>>>> 1. An alternative change is to remove the ack > 1 code, but
> silently
> >>>>>>> "upgrade" requests with acks > 1 to acks = -1. This isn't the same
> as
> >>>>>>> other
> >>>>>>> changes to behavior since the interaction between the client and
> >>>>>>> server
> >>>>>>> remains the same, no error codes change, etc. The client might just
> >>>>>>> see
> >>>>>>> some increased latency since the message might need to be
> replicated
> >>>>>>> to
> >>>>>>> more brokers than they requested.
> >>>>>>> 2. Split this into two patches, one that bumps the protocol version
> >>>>>>> on
> >>>>>>> that
> >>>>>>> message to include the new error code but maintains both old (now
> >>>>>>> deprecated) and new behavior, then a second that would be applied
> in
> >>>>>>> a
> >>>>>>> later release that removes the old protocol + code for handling
> acks
> >>>>> 1.
> >>>>>>>
> >>>>>>> 2 is probably the right thing to do. If we specify the release when
> >>>>> we'll
> >>>>>>> remove the deprecated protocol at the time of deprecation it makes
> >>>>> things
> >>>>>>> a
> >>>>>>> lot easier for people writing non-java clients and could give users
> >>>>> better
> >>>>>>> predictability (e.g. if clients are at most 1 major release behind
> >>>>>>> brokers,
> >>>>>>> they'll remain compatible but possibly use deprecated features).
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira <
> gshap...@cloudera.com>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi Kafka Devs,
> >>>>>>>>
> >>>>>>>> We are working on KAFKA-1697 - remove code related to ack>1 on the
> >>>>>>>> broker. Per Neha's suggestion, I'd like to give everyone a heads
> up
> >>>>>>>> on
> >>>>>>>> what these changes mean.
> >>>>>>>>
> >>>>>>>> Once this patch is included, any produce requests that include
> >>>>>>>> request.required.acks > 1 will result in an exception. This will
> be
> >>>>>>>> InvalidRequiredAcks in new versions (0.8.3 and up, I assume) and
> >>>>>>>> UnknownException in existing versions (sorry, but I can't add
> error
> >>>>>>>> codes retroactively).
> >>>>>>>>
> >>>>>>>> This behavior is already enforced by 0.8.2 producers (sync and
> >>>>>>>> new),
> >>>>>>>> but we expect impact on users with older producers that relied on
> >>>>>>>> acks
> >>>>>>>>> 1 and external clients (i.e python, go, etc).
> >>>>>>>>
> >>>>>>>> Users who relied on acks > 1 are expected to switch to using acks
> =
> >>>>>>>> -1
> >>>>>>>> and a min.isr parameter than matches their user case.
> >>>>>>>>
> >>>>>>>> This change was discussed in the past in the context of KAFKA-1555
> >>>>>>>> (min.isr), but let us know if you have any questions or concerns
> >>>>>>>> regarding this change.
> >>>>>>>>
> >>>>>>>> Gwen
> >>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>> --
> >>>>>>> Thanks,
> >>>>>>> Ewen
> >>>>>>
> >>>>>>
> >>>>>> --
> >>>>>> You received this message because you are subscribed to the Google
> >>>>>> Groups
> >>>>>> "kafka-clients" group.
> >>>>>> To unsubscribe from this group and stop receiving emails from it,
> send
> >>>>>> an
> >>>>>> email to kafka-clients+unsubscr...@googlegroups.com.
> >>>>>> To post to this group, send email to kafka-clie...@googlegroups.com
> .
> >>>>>> Visit this group at http://groups.google.com/group/kafka-clients.
> >>>>>> To view this discussion on the web visit
> >>>>>
> >>>>>
> https://groups.google.com/d/msgid/kafka-clients/CAA7ooCBtH2JjyQsArdx_%3DV25B4O1QJk0YvOu9U6kYt9sB4aqng%40mail.gmail.com
> >>>>> .
> >>>>>>
> >>>>>> For more options, visit https://groups.google.com/d/optout.
> >>>>>
> >>>>> --
> >>>>> You received this message because you are subscribed to the Google
> >>>>> Groups
> >>>>> "kafka-clients" group.
> >>>>> To unsubscribe from this group and stop receiving emails from it,
> send
> >>>>> an
> >>>>> email to kafka-clients+unsubscr...@googlegroups.com.
> >>>>> To post to this group, send email to kafka-clie...@googlegroups.com.
> >>>>> Visit this group at http://groups.google.com/group/kafka-clients.
> >>>>> To view this discussion on the web visit
> >>>>>
> >>>>>
> https://groups.google.com/d/msgid/kafka-clients/CAHBV8WeUebxi%2B%2BSbjz8E9Yf4u4hkcPJ80Xsj0XTKcTac%3D%2B613A%40mail.gmail.com
> >>>>> .
> >>>>> For more options, visit https://groups.google.com/d/optout.
> >>>
> >>>
> >>>
> >>>
> >>> --
> >>> Thanks,
> >>> Ewen
>



-- 
-- Guozhang

Reply via email to