Sounds good to me. I'll open a new JIRA for 0.8.2 with just an extra log warning, to avoid making KAFKA-1697 any more confusing.
On Mon, Jan 19, 2015 at 9:46 AM, Joe Stein <joe.st...@stealth.ly> wrote: > << For 2, how about we make a change to log a warning for ack > 1 in 0.8.2 > and then drop the ack > 1 support in trunk (w/o bumping up the protocol > version)? > > +1 > > > On Mon, Jan 19, 2015 at 12:35 PM, Jun Rao <j...@confluent.io> wrote: >> >> For 2, how about we make a change to log a warning for ack > 1 in 0.8.2 >> and then drop the ack > 1 support in trunk (w/o bumping up the protocol >> version)? Thanks, >> >> Jun >> >> On Sun, Jan 18, 2015 at 8:24 PM, Gwen Shapira <gshap...@cloudera.com> >> wrote: >>> >>> Overall, agree on point #1, less sure on point #2. >>> >>> 1. Some protocols never ever add new errors, while others add errors >>> without bumping versions. HTTP is a good example of the second type. >>> HTTP-451 was added fairly recently, there are some errors specific to >>> NGINX, etc. No one cares. I think we should properly document in the >>> wire-protocol doc that new errors can be added, and I think we should >>> strongly suggest (and implement ourselves) that unknown error codes >>> should be shown to users (or at least logged), so they can be googled >>> and understood through our documentation. >>> In addition, hierarchy of error codes, so clients will know if an >>> error is retry-able just by looking at the code could be nice. Same >>> for adding an error string to the protocol. These are future >>> enhancements that should be discussed separately. >>> >>> 2. I think we want to allow admins to upgrade their Kafka brokers >>> without having to chase down clients in their organization and without >>> getting blamed if clients break. I think it makes sense to have one >>> version that will support existing behavior, but log warnings, so >>> admins will know about misbehaving clients and can track them down >>> before an upgrade that breaks them (or before the broken config causes >>> them to lose data!). Hopefully this is indeed a very rare behavior and >>> we are taking extra precaution for nothing, but I have customers where >>> one traumatic upgrade means they will never upgrade a Kafka again, so >>> I'm being conservative. >>> >>> Gwen >>> >>> >>> On Sun, Jan 18, 2015 at 3:50 PM, Jun Rao <j...@confluent.io> wrote: >>> > Overall, I agree with Jay on both points. >>> > >>> > 1. I think it's reasonable to add new error codes w/o bumping up the >>> > protocol version. In most cases, by adding new error codes, we are just >>> > refining the categorization of those unknown errors. So, a client >>> > shouldn't >>> > behave worse than before as long as unknown errors have been properly >>> > handled. >>> > >>> > 2. I think it's reasonable to just document that 0.8.2 will be the last >>> > release that will support ack > 1 and remove the support completely in >>> > trunk >>> > w/o bumping up the protocol. This is because (a) we never included ack >>> > > 1 >>> > explicitly in the documentation and so the usage should be limited; (2) >>> > ack >>> >> 1 doesn't provide the guarantee that people really want and so it >>> > shouldn't really be used. >>> > >>> > Thanks, >>> > >>> > Jun >>> > >>> > >>> > On Sun, Jan 18, 2015 at 11:03 AM, Jay Kreps <jay.kr...@gmail.com> >>> > wrote: >>> >> >>> >> Hey guys, >>> >> >>> >> I really think we are discussing two things here: >>> >> >>> >> How should we generally handle changes to the set of errors? Should >>> >> introducing new errors be considered a protocol change or should we >>> >> reserve >>> >> the right to introduce new error codes? >>> >> Given that this particular change is possibly incompatible, how should >>> >> we >>> >> handle it? >>> >> >>> >> I think it would be good for people who are responding here to be >>> >> specific >>> >> about which they are addressing. >>> >> >>> >> Here is what I think: >>> >> >>> >> 1. Errors should be extensible within a protocol version. >>> >> >>> >> We should change the protocol documentation to list the errors that >>> >> can be >>> >> given back from each api, their meaning, and how to handle them, BUT >>> >> we >>> >> should explicitly state that the set of errors are open ended. That is >>> >> we >>> >> should reserve the right to introduce new errors and explicitly state >>> >> that >>> >> clients need a blanket "unknown error" handling mechanism. The error >>> >> can >>> >> link to the protocol definition (something like "Unknown error 42, see >>> >> protocol definition at http://link"). We could make this work really >>> >> well by >>> >> instructing all the clients to report the error in a very googlable >>> >> way as >>> >> Oracle does with their error format (e.g. "ORA-32") so that if you >>> >> ever get >>> >> the raw error google will take you to the definition. >>> >> >>> >> I agree that a more rigid definition seems like "right thing", but >>> >> having >>> >> just implemented two clients and spent a bunch of time on the server >>> >> side, I >>> >> think, it will work out poorly in practice. Here is why: >>> >> >>> >> I think we will make a lot of mistakes in nailing down the set of >>> >> error >>> >> codes up front and we will end up going through 3-4 churns of the >>> >> protocol >>> >> definition just realizing the set of errors that can be thrown. I >>> >> think this >>> >> churn will actually make life worse for clients that now have to >>> >> figure out >>> >> 7 identical versions of the protocol and will be a mess in terms of >>> >> testing >>> >> on the server side. I actually know this to be true because while >>> >> implementing the clients I tried to guess the errors that could be >>> >> thrown, >>> >> then checked my guess by close code inspection. It turned out that I >>> >> always >>> >> missed things in my belief about errors, but more importantly even >>> >> after >>> >> close code inspection I found tons of other errors in my stress >>> >> testing. >>> >> In practice error handling always involves calling out one or two >>> >> meaningful failures that have special recovery and then a blanket case >>> >> that >>> >> just handles everything else. It's true that some clients may not have >>> >> done >>> >> this well, but I think it is for the best if they fix that. >>> >> Reserving the right to add errors doesn't mean we will do it without >>> >> care. >>> >> We will think through each change and decide whether giving a little >>> >> more >>> >> precision in the error is worth the overhead and churn of a protocol >>> >> version >>> >> bump. >>> >> >>> >> 2. In this case in particular we should not introduce a new protocol >>> >> version >>> >> >>> >> In this particular case we are saying that acks > 1 doesn't make sense >>> >> and >>> >> we want to give an error to people specifying this so that they change >>> >> their >>> >> configuration. This is a configuration that few people use and we want >>> >> to >>> >> just make it an error. The bad behavior will just be that the error >>> >> will not >>> >> be as good as it could be. I think that is a better tradeoff than >>> >> introducing a separate protocol version (this may be true of the java >>> >> clients too). >>> >> >>> >> We will have lots of cases like this in the future and we aren't going >>> >> to >>> >> want to churn the protocol for each of them. For example we previously >>> >> had >>> >> to get more precise about which characters were legal and which >>> >> illegal in >>> >> topic names. >>> >> >>> >> -Jay >>> >> >>> >> On Fri, Jan 16, 2015 at 11:55 AM, Gwen Shapira <gshap...@cloudera.com> >>> >> wrote: >>> >>> >>> >>> I updated the KIP: Using acks > 1 in version 0 will log a WARN >>> >>> message >>> >>> in the broker about client using deprecated behavior (suggested by >>> >>> Joe >>> >>> in the JIRA, and I think it makes sense). >>> >>> >>> >>> Gwen >>> >>> >>> >>> On Fri, Jan 16, 2015 at 10:40 AM, Gwen Shapira >>> >>> <gshap...@cloudera.com> >>> >>> wrote: >>> >>> > How about we continue the discussion on this thread, so we won't >>> >>> > lose >>> >>> > the context of this discussion, and put it up for VOTE when this >>> >>> > has >>> >>> > been finalized? >>> >>> > >>> >>> > On Fri, Jan 16, 2015 at 10:22 AM, Neha Narkhede <n...@confluent.io> >>> >>> > wrote: >>> >>> >> Gwen, >>> >>> >> >>> >>> >> KIP write-up looks good. According to the rest of the KIP process >>> >>> >> proposal, >>> >>> >> would you like to start a DISCUSS/VOTE thread for it? >>> >>> >> >>> >>> >> Thanks, >>> >>> >> Neha >>> >>> >> >>> >>> >> On Fri, Jan 16, 2015 at 9:37 AM, Ewen Cheslack-Postava >>> >>> >> <e...@confluent.io> >>> >>> >> wrote: >>> >>> >> >>> >>> >>> Gwen -- KIP write up looks good. Deprecation schedule probably >>> >>> >>> needs >>> >>> >>> to be >>> >>> >>> more specific, but I think that discussion probably needs to >>> >>> >>> happen >>> >>> >>> after a >>> >>> >>> solution is agreed upon. >>> >>> >>> >>> >>> >>> Jay -- I think "older clients will get a bad error message >>> >>> >>> instead of >>> >>> >>> a >>> >>> >>> good one" isn't what would be happening with this change. >>> >>> >>> Previously >>> >>> >>> they >>> >>> >>> wouldn't have received an error and they would have been able to >>> >>> >>> produce >>> >>> >>> messages. After the change they'll just receive this new error >>> >>> >>> message >>> >>> >>> which their clients can't possibly handle gracefully since it >>> >>> >>> didn't >>> >>> >>> exist >>> >>> >>> when the client was written. Whether the acks > 1 setting was >>> >>> >>> actually >>> >>> >>> accomplishing what they thought doesn't matter. Someone could >>> >>> >>> have >>> >>> >>> reasonably read the docs on 0.8.1.1, thought acks = 2 is an ok >>> >>> >>> setting for >>> >>> >>> their applications, set it as a default across a bunch of apps, >>> >>> >>> then >>> >>> >>> follow >>> >>> >>> the recommended upgrade path of updating brokers to 0.8.2 and all >>> >>> >>> their >>> >>> >>> apps will start failing on produce requests. >>> >>> >>> >>> >>> >>> >>> >>> >>> On Thu, Jan 15, 2015 at 8:27 PM, Jay Kreps <jay.kr...@gmail.com> >>> >>> >>> wrote: >>> >>> >>> >>> >>> >>> > This is a good case to discuss. >>> >>> >>> > >>> >>> >>> > Let's figure the general case of how we want to handle errors >>> >>> >>> > and >>> >>> >>> > get >>> >>> >>> that >>> >>> >>> > documented in the protocol. The problem right now is that we >>> >>> >>> > give >>> >>> >>> > no >>> >>> >>> > guidance on this. I actually thought Gwen's suggestion made >>> >>> >>> > sense >>> >>> >>> > on the >>> >>> >>> > guidance we should have given which is that we will enumerate a >>> >>> >>> > set >>> >>> >>> > of >>> >>> >>> > errors and their meaning for each API but it is possible that >>> >>> >>> > other >>> >>> >>> errors >>> >>> >>> > will occur and they should be handled (maybe poorly) in the >>> >>> >>> > same >>> >>> >>> > way >>> >>> >>> > UNKNOWN_ERROR is handled which is our normal escape hatch for >>> >>> >>> > things like >>> >>> >>> > OOMException. >>> >>> >>> > >>> >>> >>> > I really do think we shouldn't be dogmatic here: In considering >>> >>> >>> > a >>> >>> >>> > change >>> >>> >>> > to errors we should consider the potential ill-effect vs the >>> >>> >>> > complexity >>> >>> >>> of >>> >>> >>> > yet another protocol version. >>> >>> >>> > >>> >>> >>> > In this case I actually am not sure we need to bump the >>> >>> >>> > protocol >>> >>> >>> > because >>> >>> >>> > the whole point of the change was to make a setting we think >>> >>> >>> > doesn't make >>> >>> >>> > sense break, right? Well this will break it. It seems like the >>> >>> >>> > only >>> >>> >>> > downside is that older clients will get a bad error message >>> >>> >>> > instead >>> >>> >>> > of a >>> >>> >>> > good one. But it isn't like we will have rendered a client >>> >>> >>> > unusable, it >>> >>> >>> is >>> >>> >>> > just that they will need to change their config. >>> >>> >>> > >>> >>> >>> > -Jay >>> >>> >>> > >>> >>> >>> > On Thu, Jan 15, 2015 at 6:14 PM, Gwen Shapira >>> >>> >>> > <gshap...@cloudera.com> >>> >>> >>> > wrote: >>> >>> >>> > >>> >>> >>> >> I created a KIP for this suggestion: >>> >>> >>> >> >>> >>> >>> >> >>> >>> >>> >> >>> >>> >>> >>> >>> >>> >>> >>> >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1+-+Remove+support+of+request.required.acks >>> >>> >>> >> >>> >>> >>> >> Basically documenting what was already discussed here. >>> >>> >>> >> Comments >>> >>> >>> >> will >>> >>> >>> >> be awesome! >>> >>> >>> >> >>> >>> >>> >> Gwen >>> >>> >>> >> >>> >>> >>> >> On Thu, Jan 15, 2015 at 5:19 PM, Gwen Shapira >>> >>> >>> >> <gshap...@cloudera.com> >>> >>> >>> >> wrote: >>> >>> >>> >> > The errors are part of the KIP process now, so I think the >>> >>> >>> >> > clients are >>> >>> >>> >> safe :) >>> >>> >>> >> > >>> >>> >>> >> > >>> >>> >>> >> >>> >>> >>> >>> >>> >>> >>> >>> >>> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals >>> >>> >>> >> > >>> >>> >>> >> > On Thu, Jan 15, 2015 at 5:12 PM, Steve Morin >>> >>> >>> >> > <steve.mo...@gmail.com> >>> >>> >>> >> wrote: >>> >>> >>> >> >> Agree errors should be part of the protocol >>> >>> >>> >> >> >>> >>> >>> >> >>> On Jan 15, 2015, at 17:59, Gwen Shapira >>> >>> >>> >> >>> <gshap...@cloudera.com> >>> >>> >>> >> wrote: >>> >>> >>> >> >>> >>> >>> >>> >> >>> Hi, >>> >>> >>> >> >>> >>> >>> >>> >> >>> I got convinced by Joe and Dana that errors are indeed >>> >>> >>> >> >>> part of >>> >>> >>> >> >>> the >>> >>> >>> >> >>> protocol and can't be randomly added. >>> >>> >>> >> >>> >>> >>> >>> >> >>> So, it looks like we need to bump version of >>> >>> >>> >> >>> ProduceRequest in >>> >>> >>> >> >>> the >>> >>> >>> >> >>> following way: >>> >>> >>> >> >>> Version 0 -> accept acks >1. I think we should keep the >>> >>> >>> >> >>> existing >>> >>> >>> >> >>> behavior too (i.e. not replace it with -1) to avoid >>> >>> >>> >> >>> surprising >>> >>> >>> >> >>> clients, but I'm willing to hear other opinions. >>> >>> >>> >> >>> Version 1 -> do not accept acks >1 and return an error. >>> >>> >>> >> >>> Are we ok with the error I added in KAFKA-1697? We can use >>> >>> >>> >> >>> something >>> >>> >>> >> >>> less specific like InvalidRequestParameter. This error can >>> >>> >>> >> >>> be >>> >>> >>> >> >>> reused >>> >>> >>> >> >>> in the future and reduce the need to add errors, but will >>> >>> >>> >> >>> also >>> >>> >>> >> >>> be >>> >>> >>> less >>> >>> >>> >> >>> clear to the client and its users. Maybe even add the >>> >>> >>> >> >>> error >>> >>> >>> >> >>> message >>> >>> >>> >> >>> string to the protocol in addition to the error code? >>> >>> >>> >> >>> (since >>> >>> >>> >> >>> we are >>> >>> >>> >> >>> bumping versions....) >>> >>> >>> >> >>> >>> >>> >>> >> >>> I think maintaining the old version throughout 0.8.X makes >>> >>> >>> >> >>> sense. >>> >>> >>> IMO >>> >>> >>> >> >>> dropping it for 0.9 is feasible, but I'll let client >>> >>> >>> >> >>> owners >>> >>> >>> >> >>> help >>> >>> >>> make >>> >>> >>> >> >>> that call. >>> >>> >>> >> >>> >>> >>> >>> >> >>> Am I missing anything? Should I start a KIP? It seems like >>> >>> >>> >> >>> a >>> >>> >>> KIP-type >>> >>> >>> >> >>> discussion :) >>> >>> >>> >> >>> >>> >>> >>> >> >>> Gwen >>> >>> >>> >> >>> >>> >>> >>> >> >>> >>> >>> >>> >> >>> On Thu, Jan 15, 2015 at 2:31 PM, Ewen Cheslack-Postava >>> >>> >>> >> >>> <e...@confluent.io> wrote: >>> >>> >>> >> >>>> Gwen, >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> I think the only option that wouldn't require a protocol >>> >>> >>> >> >>>> version >>> >>> >>> >> change is >>> >>> >>> >> >>>> the one where acks > 1 is converted to acks = -1 since >>> >>> >>> >> >>>> it's >>> >>> >>> >> >>>> the >>> >>> >>> only >>> >>> >>> >> one >>> >>> >>> >> >>>> that doesn't potentially break older clients. The >>> >>> >>> >> >>>> protocol >>> >>> >>> >> >>>> guide >>> >>> >>> >> says that >>> >>> >>> >> >>>> the expected upgrade path is servers first, then clients, >>> >>> >>> >> >>>> so >>> >>> >>> >> >>>> old >>> >>> >>> >> clients, >>> >>> >>> >> >>>> including non-java clients, that may be using acks > 1 >>> >>> >>> >> >>>> should >>> >>> >>> >> >>>> be >>> >>> >>> >> able to >>> >>> >>> >> >>>> work with a new broker version. >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> It's more work, but I think dealing with the protocol >>> >>> >>> >> >>>> change >>> >>> >>> >> >>>> is the >>> >>> >>> >> right >>> >>> >>> >> >>>> thing to do since it eventually gets us to the behavior I >>> >>> >>> >> >>>> think is >>> >>> >>> >> better -- >>> >>> >>> >> >>>> the broker should reject requests with invalid values. I >>> >>> >>> >> >>>> think Joe >>> >>> >>> >> and I >>> >>> >>> >> >>>> were basically in agreement. In my mind the major piece >>> >>> >>> >> >>>> missing >>> >>> >>> from >>> >>> >>> >> his >>> >>> >>> >> >>>> description is how long we're going to maintain his "case >>> >>> >>> >> >>>> 0" >>> >>> >>> >> behavior. It's >>> >>> >>> >> >>>> impractical to maintain old versions forever, but it >>> >>> >>> >> >>>> sounds >>> >>> >>> >> >>>> like >>> >>> >>> >> there >>> >>> >>> >> >>>> hasn't been a decision on how long to maintain them. >>> >>> >>> >> >>>> Maybe >>> >>> >>> >> >>>> that's >>> >>> >>> >> another >>> >>> >>> >> >>>> item to add to KIPs -- protocol versions and behavior >>> >>> >>> >> >>>> need to >>> >>> >>> >> >>>> be >>> >>> >>> >> listed as >>> >>> >>> >> >>>> deprecated and the earliest version in which they'll be >>> >>> >>> >> >>>> removed >>> >>> >>> >> should be >>> >>> >>> >> >>>> specified so users can understand which versions are >>> >>> >>> >> >>>> guaranteed to >>> >>> >>> be >>> >>> >>> >> >>>> compatible, even if they're using (well-written) non-java >>> >>> >>> >> >>>> clients. >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> -Ewen >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>>> On Thu, Jan 15, 2015 at 12:52 PM, Dana Powers < >>> >>> >>> >> dana.pow...@gmail.com> wrote: >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>>> clients don't break on unknown errors >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>> maybe true for the official java clients, but I dont >>> >>> >>> >> >>>>> think >>> >>> >>> >> >>>>> the >>> >>> >>> >> assumption >>> >>> >>> >> >>>>> holds true for community-maintained clients and users of >>> >>> >>> >> >>>>> those >>> >>> >>> >> clients. >>> >>> >>> >> >>>>> kafka-python generally follows the fail-fast philosophy >>> >>> >>> >> >>>>> and >>> >>> >>> >> >>>>> raises >>> >>> >>> >> an >>> >>> >>> >> >>>>> exception on any unrecognized error code in any server >>> >>> >>> >> >>>>> response. >>> >>> >>> >> in this >>> >>> >>> >> >>>>> case, kafka-python allows users to set their own >>> >>> >>> >> >>>>> required-acks >>> >>> >>> >> policy when >>> >>> >>> >> >>>>> creating a producer instance. It is possible that users >>> >>> >>> >> >>>>> of >>> >>> >>> >> kafka-python >>> >>> >>> >> >>>>> have deployed producer code that uses ack>1 -- perhaps >>> >>> >>> >> >>>>> in >>> >>> >>> production >>> >>> >>> >> >>>>> environments -- and for those users the new error code >>> >>> >>> >> >>>>> will >>> >>> >>> >> >>>>> crash >>> >>> >>> >> their >>> >>> >>> >> >>>>> producer code. I would not be surprised if the same >>> >>> >>> >> >>>>> were >>> >>> >>> >> >>>>> true of >>> >>> >>> >> other >>> >>> >>> >> >>>>> community clients. >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>> *one reason for the fail-fast approach is that there >>> >>> >>> >> >>>>> isn't >>> >>> >>> >> >>>>> great >>> >>> >>> >> >>>>> documentation on what errors to expect for each request >>> >>> >>> >> >>>>> / >>> >>> >>> >> >>>>> response >>> >>> >>> >> -- so >>> >>> >>> >> >>>>> we >>> >>> >>> >> >>>>> use failures to alert that some error case is not >>> >>> >>> >> >>>>> handled >>> >>> >>> >> properly. and >>> >>> >>> >> >>>>> because of that, introducing new error cases without >>> >>> >>> >> >>>>> bumping >>> >>> >>> >> >>>>> the >>> >>> >>> api >>> >>> >>> >> >>>>> version is likely to cause those errors to get >>> >>> >>> >> >>>>> raised/thrown >>> >>> >>> >> >>>>> all >>> >>> >>> >> the way >>> >>> >>> >> >>>>> back up to the user. of course we (client maintainers) >>> >>> >>> >> >>>>> can >>> >>> >>> >> >>>>> fix >>> >>> >>> the >>> >>> >>> >> issues >>> >>> >>> >> >>>>> in the client libraries and suggest users upgrade, but >>> >>> >>> >> >>>>> it's >>> >>> >>> >> >>>>> not >>> >>> >>> the >>> >>> >>> >> ideal >>> >>> >>> >> >>>>> situation. >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>> long-winded way of saying: I agree w/ Joe. >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>> -Dana >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>> On Thu, Jan 15, 2015 at 12:07 PM, Gwen Shapira < >>> >>> >>> >> gshap...@cloudera.com> >>> >>> >>> >> >>>>> wrote: >>> >>> >>> >> >>>>> >>> >>> >>> >> >>>>>> Is the protocol bump caused by the behavior change or >>> >>> >>> >> >>>>>> the >>> >>> >>> >> >>>>>> new >>> >>> >>> error >>> >>> >>> >> >>>>>> code? >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>>>>> 1) IMO, error_codes are data, and clients can expect to >>> >>> >>> >> >>>>>> receive >>> >>> >>> >> errors >>> >>> >>> >> >>>>>> that they don't understand (i.e. unknown errors). >>> >>> >>> >> >>>>>> AFAIK, >>> >>> >>> >> >>>>>> clients >>> >>> >>> >> don't >>> >>> >>> >> >>>>>> break on unknown errors, they are simple more >>> >>> >>> >> >>>>>> challenging >>> >>> >>> >> >>>>>> to >>> >>> >>> >> debug. If >>> >>> >>> >> >>>>>> we document the new behavior, then its definitely >>> >>> >>> >> >>>>>> debuggable and >>> >>> >>> >> >>>>>> fixable. >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>>>>> 2) The behavior change is basically a deprecation - >>> >>> >>> >> >>>>>> i.e. >>> >>> >>> >> >>>>>> acks > 1 >>> >>> >>> >> were >>> >>> >>> >> >>>>>> never documented, and are not supported by Kafka >>> >>> >>> >> >>>>>> clients >>> >>> >>> >> >>>>>> starting >>> >>> >>> >> with >>> >>> >>> >> >>>>>> version 0.8.2. I'm not sure this requires a protocol >>> >>> >>> >> >>>>>> bump >>> >>> >>> >> >>>>>> either, >>> >>> >>> >> >>>>>> although its a better case than new error codes. >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>>>>> Thanks, >>> >>> >>> >> >>>>>> Gwen >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>>>>> On Thu, Jan 15, 2015 at 10:10 AM, Joe Stein < >>> >>> >>> joe.st...@stealth.ly> >>> >>> >>> >> >>>>>> wrote: >>> >>> >>> >> >>>>>>> Looping in the mailing list that the client developers >>> >>> >>> >> >>>>>>> live on >>> >>> >>> >> because >>> >>> >>> >> >>>>>> they >>> >>> >>> >> >>>>>>> are all not on dev (though they should be if they want >>> >>> >>> >> >>>>>>> to >>> >>> >>> >> >>>>>>> be >>> >>> >>> >> helping >>> >>> >>> >> >>>>>>> to >>> >>> >>> >> >>>>>>> build the best client libraries they can). >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> I whole hardily believe that we need to not break >>> >>> >>> >> >>>>>>> existing >>> >>> >>> >> >>>>>>> functionality >>> >>> >>> >> >>>>>> of >>> >>> >>> >> >>>>>>> the client protocol, ever. >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> There are many reasons for this and we have other >>> >>> >>> >> >>>>>>> threads >>> >>> >>> >> >>>>>>> on the >>> >>> >>> >> >>>>>>> mailing >>> >>> >>> >> >>>>>>> list where we are discussing that topic (no pun >>> >>> >>> >> >>>>>>> intended) >>> >>> >>> >> >>>>>>> that I >>> >>> >>> >> don't >>> >>> >>> >> >>>>>> want >>> >>> >>> >> >>>>>>> to re-hash here. >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> If we change wire protocol functionality OR the binary >>> >>> >>> >> >>>>>>> format >>> >>> >>> >> (either) >>> >>> >>> >> >>>>>>> we >>> >>> >>> >> >>>>>>> must bump version AND treat version as a feature flag >>> >>> >>> >> >>>>>>> with >>> >>> >>> >> backward >>> >>> >>> >> >>>>>>> compatibility support until it is deprecated for some >>> >>> >>> >> >>>>>>> time >>> >>> >>> >> >>>>>>> for >>> >>> >>> >> folks >>> >>> >>> >> >>>>>>> to >>> >>> >>> >> >>>>>> deal >>> >>> >>> >> >>>>>>> with it. >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> match version = { >>> >>> >>> >> >>>>>>> case 0: keepDoingWhatWeWereDoing() >>> >>> >>> >> >>>>>>> case 1: doNewStuff() >>> >>> >>> >> >>>>>>> case 2: doEvenMoreNewStuff() >>> >>> >>> >> >>>>>>> } >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> has to be a practice we adopt imho ... I know feature >>> >>> >>> >> >>>>>>> flags can >>> >>> >>> be >>> >>> >>> >> >>>>>> construed >>> >>> >>> >> >>>>>>> as "messy code" but I am eager to hear another >>> >>> >>> >> >>>>>>> (better? >>> >>> >>> >> different?) >>> >>> >>> >> >>>>>> solution >>> >>> >>> >> >>>>>>> to this. >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> If we don't do a feature flag like this specifically >>> >>> >>> >> >>>>>>> with >>> >>> >>> >> >>>>>>> this >>> >>> >>> >> change >>> >>> >>> >> >>>>>> then >>> >>> >>> >> >>>>>>> what happens is that someone upgrades their brokers >>> >>> >>> >> >>>>>>> with a >>> >>> >>> rolling >>> >>> >>> >> >>>>>> restart >>> >>> >>> >> >>>>>>> in 0.8.3 and every single one of their producer >>> >>> >>> >> >>>>>>> requests >>> >>> >>> >> >>>>>>> start >>> >>> >>> to >>> >>> >>> >> fail >>> >>> >>> >> >>>>>> and >>> >>> >>> >> >>>>>>> they have a major production outage. eeeek!!!! >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> I do 100% agree that > 1 makes no sense and we >>> >>> >>> >> >>>>>>> *REALLY* >>> >>> >>> >> >>>>>>> need >>> >>> >>> >> people to >>> >>> >>> >> >>>>>> start >>> >>> >>> >> >>>>>>> using 0,1,-1 but we need to-do that in a way that is >>> >>> >>> >> >>>>>>> going >>> >>> >>> >> >>>>>>> to >>> >>> >>> >> work for >>> >>> >>> >> >>>>>>> everyone. >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> Old producers and consumers must keep working with new >>> >>> >>> >> >>>>>>> brokers >>> >>> >>> >> and if >>> >>> >>> >> >>>>>>> we >>> >>> >>> >> >>>>>> are >>> >>> >>> >> >>>>>>> not going to support that then I am unclear what the >>> >>> >>> >> >>>>>>> use >>> >>> >>> >> >>>>>>> of >>> >>> >>> >> "version" >>> >>> >>> >> >>>>>>> is >>> >>> >>> >> >>>>>>> based on our original intentions of having it because >>> >>> >>> >> >>>>>>> of >>> >>> >>> >> >>>>>>> the >>> >>> >>> >> >>>>>>> 0.7=>-0.8. >>> >>> >>> >> >>>>>> We >>> >>> >>> >> >>>>>>> said no more breaking changes when we did that. >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> - Joe Stein >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> On Thu, Jan 15, 2015 at 12:38 PM, Ewen >>> >>> >>> >> >>>>>>> Cheslack-Postava < >>> >>> >>> >> >>>>>> e...@confluent.io> >>> >>> >>> >> >>>>>>> wrote: >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> Right, so this looks like it could create an issue >>> >>> >>> >> >>>>>>>> similar to >>> >>> >>> >> what's >>> >>> >>> >> >>>>>>>> currently being discussed in >>> >>> >>> >> >>>>>>>> https://issues.apache.org/jira/browse/KAFKA-1649 >>> >>> >>> >> >>>>>>>> where >>> >>> >>> >> >>>>>>>> users >>> >>> >>> >> now get >>> >>> >>> >> >>>>>>>> errors >>> >>> >>> >> >>>>>>>> under conditions when they previously wouldn't. Old >>> >>> >>> >> >>>>>>>> clients >>> >>> >>> won't >>> >>> >>> >> >>>>>>>> even >>> >>> >>> >> >>>>>>>> know >>> >>> >>> >> >>>>>>>> about the error code, so besides failing they won't >>> >>> >>> >> >>>>>>>> even >>> >>> >>> >> >>>>>>>> be >>> >>> >>> able >>> >>> >>> >> to >>> >>> >>> >> >>>>>>>> log >>> >>> >>> >> >>>>>>>> any >>> >>> >>> >> >>>>>>>> meaningful error messages. >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> I think there are two options for compatibility: >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> 1. An alternative change is to remove the ack > 1 >>> >>> >>> >> >>>>>>>> code, >>> >>> >>> >> >>>>>>>> but >>> >>> >>> >> silently >>> >>> >>> >> >>>>>>>> "upgrade" requests with acks > 1 to acks = -1. This >>> >>> >>> >> >>>>>>>> isn't >>> >>> >>> >> >>>>>>>> the >>> >>> >>> >> same as >>> >>> >>> >> >>>>>>>> other >>> >>> >>> >> >>>>>>>> changes to behavior since the interaction between the >>> >>> >>> >> >>>>>>>> client >>> >>> >>> and >>> >>> >>> >> >>>>>>>> server >>> >>> >>> >> >>>>>>>> remains the same, no error codes change, etc. The >>> >>> >>> >> >>>>>>>> client >>> >>> >>> >> >>>>>>>> might >>> >>> >>> >> just >>> >>> >>> >> >>>>>>>> see >>> >>> >>> >> >>>>>>>> some increased latency since the message might need >>> >>> >>> >> >>>>>>>> to be >>> >>> >>> >> replicated >>> >>> >>> >> >>>>>>>> to >>> >>> >>> >> >>>>>>>> more brokers than they requested. >>> >>> >>> >> >>>>>>>> 2. Split this into two patches, one that bumps the >>> >>> >>> >> >>>>>>>> protocol >>> >>> >>> >> version >>> >>> >>> >> >>>>>>>> on >>> >>> >>> >> >>>>>>>> that >>> >>> >>> >> >>>>>>>> message to include the new error code but maintains >>> >>> >>> >> >>>>>>>> both >>> >>> >>> >> >>>>>>>> old >>> >>> >>> (now >>> >>> >>> >> >>>>>>>> deprecated) and new behavior, then a second that >>> >>> >>> >> >>>>>>>> would be >>> >>> >>> >> applied in >>> >>> >>> >> >>>>>>>> a >>> >>> >>> >> >>>>>>>> later release that removes the old protocol + code >>> >>> >>> >> >>>>>>>> for >>> >>> >>> >> >>>>>>>> handling >>> >>> >>> >> acks >>> >>> >>> >> >>>>>> 1. >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> 2 is probably the right thing to do. If we specify >>> >>> >>> >> >>>>>>>> the >>> >>> >>> >> >>>>>>>> release >>> >>> >>> >> when >>> >>> >>> >> >>>>>> we'll >>> >>> >>> >> >>>>>>>> remove the deprecated protocol at the time of >>> >>> >>> >> >>>>>>>> deprecation >>> >>> >>> >> >>>>>>>> it >>> >>> >>> >> makes >>> >>> >>> >> >>>>>> things >>> >>> >>> >> >>>>>>>> a >>> >>> >>> >> >>>>>>>> lot easier for people writing non-java clients and >>> >>> >>> >> >>>>>>>> could >>> >>> >>> >> >>>>>>>> give >>> >>> >>> >> users >>> >>> >>> >> >>>>>> better >>> >>> >>> >> >>>>>>>> predictability (e.g. if clients are at most 1 major >>> >>> >>> >> >>>>>>>> release >>> >>> >>> >> behind >>> >>> >>> >> >>>>>>>> brokers, >>> >>> >>> >> >>>>>>>> they'll remain compatible but possibly use deprecated >>> >>> >>> features). >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> On Wed, Jan 14, 2015 at 3:51 PM, Gwen Shapira < >>> >>> >>> >> gshap...@cloudera.com> >>> >>> >>> >> >>>>>>>> wrote: >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>>> Hi Kafka Devs, >>> >>> >>> >> >>>>>>>>> >>> >>> >>> >> >>>>>>>>> We are working on KAFKA-1697 - remove code related >>> >>> >>> >> >>>>>>>>> to >>> >>> >>> >> >>>>>>>>> ack>1 on >>> >>> >>> >> the >>> >>> >>> >> >>>>>>>>> broker. Per Neha's suggestion, I'd like to give >>> >>> >>> >> >>>>>>>>> everyone >>> >>> >>> >> >>>>>>>>> a >>> >>> >>> >> heads up >>> >>> >>> >> >>>>>>>>> on >>> >>> >>> >> >>>>>>>>> what these changes mean. >>> >>> >>> >> >>>>>>>>> >>> >>> >>> >> >>>>>>>>> Once this patch is included, any produce requests >>> >>> >>> >> >>>>>>>>> that >>> >>> >>> >> >>>>>>>>> include >>> >>> >>> >> >>>>>>>>> request.required.acks > 1 will result in an >>> >>> >>> >> >>>>>>>>> exception. >>> >>> >>> >> >>>>>>>>> This >>> >>> >>> >> will be >>> >>> >>> >> >>>>>>>>> InvalidRequiredAcks in new versions (0.8.3 and up, I >>> >>> >>> >> >>>>>>>>> assume) >>> >>> >>> and >>> >>> >>> >> >>>>>>>>> UnknownException in existing versions (sorry, but I >>> >>> >>> >> >>>>>>>>> can't add >>> >>> >>> >> error >>> >>> >>> >> >>>>>>>>> codes retroactively). >>> >>> >>> >> >>>>>>>>> >>> >>> >>> >> >>>>>>>>> This behavior is already enforced by 0.8.2 producers >>> >>> >>> >> >>>>>>>>> (sync and >>> >>> >>> >> >>>>>>>>> new), >>> >>> >>> >> >>>>>>>>> but we expect impact on users with older producers >>> >>> >>> >> >>>>>>>>> that >>> >>> >>> >> >>>>>>>>> relied >>> >>> >>> >> on >>> >>> >>> >> >>>>>>>>> acks >>> >>> >>> >> >>>>>>>>>> 1 and external clients (i.e python, go, etc). >>> >>> >>> >> >>>>>>>>> >>> >>> >>> >> >>>>>>>>> Users who relied on acks > 1 are expected to switch >>> >>> >>> >> >>>>>>>>> to >>> >>> >>> >> >>>>>>>>> using >>> >>> >>> >> acks = >>> >>> >>> >> >>>>>>>>> -1 >>> >>> >>> >> >>>>>>>>> and a min.isr parameter than matches their user >>> >>> >>> >> >>>>>>>>> case. >>> >>> >>> >> >>>>>>>>> >>> >>> >>> >> >>>>>>>>> This change was discussed in the past in the context >>> >>> >>> >> >>>>>>>>> of >>> >>> >>> >> KAFKA-1555 >>> >>> >>> >> >>>>>>>>> (min.isr), but let us know if you have any questions >>> >>> >>> >> >>>>>>>>> or >>> >>> >>> concerns >>> >>> >>> >> >>>>>>>>> regarding this change. >>> >>> >>> >> >>>>>>>>> >>> >>> >>> >> >>>>>>>>> Gwen >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> >>> >>> >>> >> >>>>>>>> -- >>> >>> >>> >> >>>>>>>> Thanks, >>> >>> >>> >> >>>>>>>> Ewen >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> -- >>> >>> >>> >> >>>>>>> You received this message because you are subscribed >>> >>> >>> >> >>>>>>> to >>> >>> >>> >> >>>>>>> the >>> >>> >>> Google >>> >>> >>> >> >>>>>>> Groups >>> >>> >>> >> >>>>>>> "kafka-clients" group. >>> >>> >>> >> >>>>>>> To unsubscribe from this group and stop receiving >>> >>> >>> >> >>>>>>> emails >>> >>> >>> >> >>>>>>> from >>> >>> >>> it, >>> >>> >>> >> send >>> >>> >>> >> >>>>>>> an >>> >>> >>> >> >>>>>>> email to kafka-clients+unsubscr...@googlegroups.com. >>> >>> >>> >> >>>>>>> To post to this group, send email to >>> >>> >>> >> kafka-clie...@googlegroups.com. >>> >>> >>> >> >>>>>>> Visit this group at >>> >>> >>> http://groups.google.com/group/kafka-clients. >>> >>> >>> >> >>>>>>> To view this discussion on the web visit >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>> >>> >>> >>> >>> >>> >>> >>> >>> https://groups.google.com/d/msgid/kafka-clients/CAA7ooCBtH2JjyQsArdx_%3DV25B4O1QJk0YvOu9U6kYt9sB4aqng%40mail.gmail.com >>> >>> >>> >> >>>>>> . >>> >>> >>> >> >>>>>>> >>> >>> >>> >> >>>>>>> For more options, visit >>> >>> >>> >> >>>>>>> https://groups.google.com/d/optout. >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>>>>> -- >>> >>> >>> >> >>>>>> You received this message because you are subscribed to >>> >>> >>> >> >>>>>> the >>> >>> >>> Google >>> >>> >>> >> >>>>>> Groups >>> >>> >>> >> >>>>>> "kafka-clients" group. >>> >>> >>> >> >>>>>> To unsubscribe from this group and stop receiving >>> >>> >>> >> >>>>>> emails >>> >>> >>> >> >>>>>> from it, >>> >>> >>> >> send >>> >>> >>> >> >>>>>> an >>> >>> >>> >> >>>>>> email to kafka-clients+unsubscr...@googlegroups.com. >>> >>> >>> >> >>>>>> To post to this group, send email to >>> >>> >>> >> kafka-clie...@googlegroups.com. >>> >>> >>> >> >>>>>> Visit this group at >>> >>> >>> >> >>>>>> http://groups.google.com/group/kafka-clients >>> >>> >>> . >>> >>> >>> >> >>>>>> To view this discussion on the web visit >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>>>>> >>> >>> >>> >> >>> >>> >>> >>> >>> >>> >>> >>> >>> https://groups.google.com/d/msgid/kafka-clients/CAHBV8WeUebxi%2B%2BSbjz8E9Yf4u4hkcPJ80Xsj0XTKcTac%3D%2B613A%40mail.gmail.com >>> >>> >>> >> >>>>>> . >>> >>> >>> >> >>>>>> For more options, visit >>> >>> >>> >> >>>>>> https://groups.google.com/d/optout. >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> >>> >>> >>> >> >>>> -- >>> >>> >>> >> >>>> Thanks, >>> >>> >>> >> >>>> Ewen >>> >>> >>> >> >>> >>> >>> > >>> >>> >>> > >>> >>> >>> >>> >>> >>> >>> >>> >>> -- >>> >>> >>> Thanks, >>> >>> >>> Ewen >>> >>> >>> >>> >>> >> >>> >>> >> >>> >>> >> >>> >>> >> -- >>> >>> >> Thanks, >>> >>> >> Neha >>> >> >>> >> >>> >> -- >>> >> You received this message because you are subscribed to the Google >>> >> Groups >>> >> "kafka-clients" group. >>> >> To unsubscribe from this group and stop receiving emails from it, send >>> >> an >>> >> email to kafka-clients+unsubscr...@googlegroups.com. >>> >> To post to this group, send email to kafka-clie...@googlegroups.com. >>> >> Visit this group at http://groups.google.com/group/kafka-clients. >>> >> To view this discussion on the web visit >>> >> >>> >> https://groups.google.com/d/msgid/kafka-clients/CAOeJiJh17CYq%3D-qgPu9rnArsPW%3D7RL9AAW_h%3DrrXx0%2BKhhKgNQ%40mail.gmail.com. >>> >> >>> >> For more options, visit https://groups.google.com/d/optout. >>> > >>> > >> >> >