Hi Jason,

The reasons for this change are:

(1) This is probably the most convenient setting for people migrating from 0.7 to 0.8. The process is to build a 0.8 shadow cluster using our migration tool, upgrade all consumers to 0.8, and finally upgrade all producers to 0.8. Since most consumers are likely real-time, it's better for them to pick up the latest offset when they move to 0.8 so that they don't get too many duplicates (those consumers could lose a small number of messages).

(2) This matches the default behavior of the console consumer, which is the first thing most new users experience.

Does that make sense?
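For consumers that need the old behavior, the default can be overridden explicitly. Below is a minimal sketch, assuming the 0.8 high-level consumer property names (zookeeper.connect, group.id, auto.offset.reset). The class name, ZooKeeper address, and group name are placeholders for illustration only.

```java
import java.util.Properties;

// Hypothetical helper: builds consumer properties with the offset reset
// policy set explicitly instead of relying on the library default.
final class ConsumerProps {

    static Properties build(String zkConnect, String groupId) {
        Properties props = new Properties();
        props.put("zookeeper.connect", zkConnect);
        props.put("group.id", groupId);
        // "smallest" starts from the oldest available message when the
        // group has no valid stored offset; "largest" starts from the newest.
        props.put("auto.offset.reset", "smallest");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder connection values; adjust for a real cluster.
        Properties props = build("localhost:2181", "example-group");
        // With the Kafka 0.8 jar on the classpath, these properties would be
        // passed to new kafka.consumer.ConsumerConfig(props).
        System.out.println(props.getProperty("auto.offset.reset"));
    }
}
```

The setting only takes effect when the consumer group has no valid stored offset, so a group that already has committed offsets resumes from them regardless of this value.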
Thanks,
Jun

On Tue, Jun 18, 2013 at 9:02 AM, Jason Rosenberg <j...@squareup.com> wrote:

> I'm wondering why the default setting for auto.offset.reset in the
> ConsumerConfig class was changed from 'smallest' to 'largest', so late in
> the game (looks like a commit on June 3 changed the default). This is an
> extremely major change, I should think. Consumers now by default only get
> messages newer than when the consumer starts?
>
> What use cases are there for that? I can think of one off cases, where you
> just want to start consuming the latest feed, etc., to bootstrap things.
> But in the normal case, where you want to take a consumer down for an
> update, and bring it back up, you'd always be losing messages in that case.
>
> The default of the old (now renamed) "autooffset.reset" property in 0.7.2
> is "smallest", so this is a major change.
>
> Sadly, this one change caused me many hours of consternation with some
> broken tests (e.g. KAFKA-945). I don't see any mention of this change
> listed in any messages to the group, etc.
>
> It might be good to have a configuration migration page outlining changes
> from 0.7.2. This change is particularly difficult since it is in a default
> setting that in most cases people had been using the default (and now will
> override the default in most cases).
>
> Jason