Re: [DISCUSS] KIP-455: Create an Administrative API for Replica Reassignment

2019-06-25 Thread Colin McCabe
On Wed, Jun 19, 2019, at 03:36, Stanislav Kozlovski wrote:
> Hey there Colin,
> 
> Thanks for the work on this KIP. It is a much-needed improvement and I'm
> excited to see it. Sorry for coming in so late to the discussion, I have
> one question to better understand the change and a small suggestion.
> 
> I see we allow reassignment cancellation at the partition level - what is
> the motivation behind that? I think that having the back-end structures
> support it is a good choice since it allows us more flexibility in the
> future but what are the reasons for allowing a user to cancel at a
> partition level? I think allowing it might let users shoot themselves in
> the foot more easily and make tools harder to implement (needing to guard
> against it).

Hi Stanislav,

Thanks for taking a look.

I'm not sure I follow the comment about cancellation.  Why would we need to 
guard against someone cancelling a reassignment of a single partition?  Why 
would cancelling a single partition's reassignment be "shooting yourself in the 
foot"?

> Regardless, what do you think about an ease-of-use improvement where we
> allow a user to cancel all reassignments for a topic without specifying its
> partitions? Essentially, we could cancel all reassignments for a topic if
> the Partitions field in AlterPartitionAssignmentsRequest is null.

I'm not sure why we would want to cancel all reassignments for a particular 
topic.  In general, I would expect reassignments to be cancelled if the 
situation on a broker changed, for example if it became overloaded instead of 
underloaded.  Or, some reassignments might be cancelled if they created more 
overhead for users of the system than anticipated.  In both of these cases, 
topics are not really relevant.

After all, the partitions for a particular topic are probably spread across the 
whole system.  Topics are a useful administrative concept, but not really that 
relevant to the world of partition reassignment (or maybe I'm missing 
something?)

best,
Colin
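
For concreteness, here is a minimal Java sketch of the topic-wide cancellation 
proposed above. Everything in it is an assumption modelled on the KIP draft; 
there is no alterPartitionAssignments method or PartitionAssignment type in 
the released AdminClient:

    // Hypothetical sketch only: a null partition list for a topic is
    // interpreted as "cancel all reassignments for this topic".
    Map<String, List<PartitionAssignment>> alterations = new HashMap<>();
    alterations.put("my-topic", null);  // null => every partition of the topic
    admin.alterPartitionAssignments(alterations).all().get();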


> 
> Best,
> Stanislav
> 
> On Mon, May 6, 2019 at 5:42 PM Colin McCabe  wrote:
> 
> > On Mon, May 6, 2019, at 07:39, Ismael Juma wrote:
> > > Hi Colin,
> > >
> > > A quick comment.
> > >
> > > On Sat, May 4, 2019 at 11:18 PM Colin McCabe  wrote:
> > >
> > > > The big advantage of doing batching on the controller is that the
> > > > controller has more information about what is going on in the
> > cluster.  So
> > > > it can schedule reassignments in a more optimal way.  For instance, it
> > can
> > > > schedule reassignments so that the load is distributed evenly across
> > > > nodes.  This advantage is lost if we have to adhere to a rigid ordering
> > > > that is set up in advance.  We don't know exactly when anything will
> > > > complete in any case.  Just because one partition reassignment was
> > started
> > > > before another doesn't mean it will finish before another.
> > >
> > >
> > > This is not quite true, right? The Controller doesn't know about
> > partition
> > > sizes, throughput per partition and other such information that external
> > > tools like Cruise Control track.
> >
> > Hi Ismael,
> >
> > That's a good point, and one I should have included.
> >
> > I guess when I think about "do batching in the controller" versus "do
> > batching in an external system" I tend to think about the information the
> > controller could theoretically collect, rather than what it actually does
> > :)  But certainly, adding this information to the controller would be a
> > significant change, and maybe one we don't want to do if the external
> > systems work well enough.
> >
> > Thinking about this a little bit more, I can see three advantages to
> > controller-side batching.  Firstly, doing batching in the controller saves
> > memory because we don't use a separate JVM, and don't duplicate the
> > in-memory map of all the partitions.  Secondly, the information we're
> > acting on would also be more up-to-date.  (I'm not sure how important this
> > would be.)  Finally, it's one less thing to deploy.  I don't know if those
> > are really enough to motivate switching now, but in a greenfield system I
> > would probably choose controller-side rebalancing.
> >
> > In any case, this KIP is orthogonal to controller-side rebalancing versus
> > external rebalancing.  That's why the KIP states that we will continue to
> > perform all the given partition rebalances immediately.  I was just
> > responding to the idea that maybe we should have an "ordering" of
> > rebalancing partitions.  I don't think we want that, for controller-side
> > rebalancing or externally batched rebalancing.
> >
> > best,
> > Colin
> >
> 
> 
> -- 
> Best,
> Stanislav
>


Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-25 Thread Satish Duggana
+1 (non-binding)

On Wed, Jun 26, 2019 at 8:37 AM Satish Duggana  wrote:
>
> +1 Matthias/Andy.
> IMHO, interface is about the contract, it should not have/expose any
> implementation. I am fine with either way as it is more of taste or
> preference.
>
> Agree with Ismael/Colin/Ryanne on not deprecating for good reasons.
>
>
> On Mon, Jun 24, 2019 at 8:33 PM Andy Coates  wrote:
> >
> > I agree Matthias.
> >
> > (In Scala, such factory methods are on a companion object. As Java doesn't
> > have the concept of a companion object, an equivalent would be a utility
> > class with a similar name...)
> >
> > However, I'll update the KIP to include the factory method on the interface.
> >
> > On Fri, 21 Jun 2019 at 23:40, Matthias J. Sax  wrote:
> >
> > > I still think, that an interface does not need to know anything about
> > > its implementation. But I am also fine if we add a factory method to the
> > > new interface if that is preferred by most people.
> > >
> > >
> > > -Matthias
> > >
> > > On 6/21/19 7:10 AM, Ismael Juma wrote:
> > > > This is even more reason not to deprecate immediately, there is very
> > > little
> > > > maintenance cost for us. We should be mindful that many of our users (eg
> > > > Spark, Flink, etc.) typically allow users to specify the kafka clients
> > > > version and hence avoid using new classes/interfaces for some time. They
> > > > would get a bunch of warnings they cannot do anything about apart from
> > > > suppressing.
> > > >
> > > > Ismael
> > > >
> > > > On Fri, Jun 21, 2019 at 4:00 AM Andy Coates  wrote:
> > > >
> > > >> Hi Ismael,
> > > >>
> > > >> I’m happy enough to not deprecate the existing `AdminClient` class as
> > > part
> > > >> of this change.
> > > >>
> > > >> However, note that, the class will likely be empty, i.e. all methods 
> > > >> and
> > > >> implementations will be inherited from the interface:
> > > >>
> > > >> public abstract class AdminClient implements Admin {
> > > >> }
> > > >>
> > > >> Not marking it as deprecated has the benefit that users won’t see any
> > > >> deprecation warnings on the next release. Conversely, deprecating it
> > > will
> > > >> mean we can choose to remove this, now pointless class, in the future
> > > if we
> > > >> choose.
> > > >>
> > > >> That’s my thinking for deprecation, but as I’ve said I’m happy either
> > > way.
> > > >>
> > > >> Andy
> > > >>
> > > >>> On 18 Jun 2019, at 16:09, Ismael Juma  wrote:
> > > >>>
> > > >>> I agree with Ryanne, I think we should avoid deprecating AdminClient
> > > and
> > > >>> causing so much churn for users who don't actually care about this
> > > niche
> > > >>> use case.
> > > >>>
> > > >>> Ismael
> > > >>>
> > > >>> On Tue, Jun 18, 2019 at 6:43 AM Andy Coates  wrote:
> > > >>>
> > >  Hi Ryanne,
> > > 
> > >  If we don't change the client code, then everywhere will still expect
> > >  subclasses of `AdminClient`, so the interface will be of no use, i.e.
> > > I
> > >  can't write a class that implements the new interface and pass it to
> > > the
> > >  client code.
> > > 
> > >  Thanks,
> > > 
> > >  Andy
> > > 
> > >  On Mon, 17 Jun 2019 at 19:01, Ryanne Dolan 
> > > >> wrote:
> > > 
> > > > Andy, while I agree that the new interface is useful, I'm not
> > > convinced
> > > > adding an interface requires deprecating AdminClient and changing so
> > > >> much
> > > > client code. Why not just add the Admin interface, have AdminClient
> > > > implement it, and have done?
> > > >
> > > > Ryanne
> > > >
> > > > On Mon, Jun 17, 2019 at 12:09 PM Andy Coates 
> > > >> wrote:
> > > >
> > > >> Hi all,
> > > >>
> > > >> I think I've addressed all concerns. Let me know if I've not.  Can 
> > > >> I
> > >  call
> > > >> another round of votes please?
> > > >>
> > > >> Thanks,
> > > >>
> > > >> Andy
> > > >>
> > > >> On Fri, 14 Jun 2019 at 04:55, Satish Duggana <
> > > >> satish.dugg...@gmail.com
> > > >
> > > >> wrote:
> > > >>
> > > >>> Hi Andy,
> > > >>> Thanks for the KIP. This is a good change and it gives the user a
> > > > better
> > > >>> handle on Admin client usage. I agree with the proposal except the
> > >  new
> > > >>> `Admin` interface having all the methods from `AdminClient`
> > > abstract
> > > >> class.
> > > >>> It should be kept clean having only the admin operations as 
> > > >>> methods
> > > > from
> > > >>> KafkaClient abstract class but not the factory methods as 
> > > >>> mentioned
> > >  in
> > > >> the
> > > >>> earlier mail.
> > > >>>
> > > >>> I know about dynamic proxies(which were widely used in RMI/EJB
> > >  world).
> > > > I
> > > >> am
> > > >>> curious about the usecase using dynamic proxies with Admin client
> > > >>> interface. Dynamic proxy can have performance penalty if it is 
> > > >>> used
> > >  in
> > > 

Build failed in Jenkins: kafka-trunk-jdk8 #3752

2019-06-25 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Fix flaky test case for compact/delete topics (#6975)

--
[...truncated 4.82 MB...]

(Test log omitted: SuppressSuite runs of SuppressTopologyTest, SuppressedTest, 
InMemoryTimeOrderedKeyValueBufferTest, and TimeOrderedKeyValueBufferTest; every 
case captured before the log was cut off PASSED.)

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-25 Thread Satish Duggana
+1 Matthias/Andy.
IMHO, an interface is about the contract; it should not have or expose any
implementation. I am fine with either way, as it is more a matter of taste or
preference.

Agree with Ismael/Colin/Ryanne on not deprecating for good reasons.
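
For concreteness, a minimal sketch of the factory-method option under 
discussion, i.e. a static factory on the new interface itself. The 
createInternal hook is assumed for illustration; the actual implementation 
detail may differ:

    import java.util.Properties;

    public interface Admin extends AutoCloseable {
        // Static factory on the interface: callers depend only on Admin,
        // never on a concrete class or a companion-style utility class.
        static Admin create(Properties props) {
            return KafkaAdminClient.createInternal(props);  // assumed hook
        }
        // ... admin operations such as createTopics, describeCluster ...
    }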


On Mon, Jun 24, 2019 at 8:33 PM Andy Coates  wrote:
>
> I agree Matthias.
>
> (In Scala, such factory methods are on a companion object. As Java doesn't
> have the concept of a companion object, an equivalent would be a utility
> class with a similar name...)
>
> However, I'll update the KIP to include the factory method on the interface.
>
> On Fri, 21 Jun 2019 at 23:40, Matthias J. Sax  wrote:
>
> > I still think, that an interface does not need to know anything about
> > its implementation. But I am also fine if we add a factory method to the
> > new interface if that is preferred by most people.
> >
> >
> > -Matthias
> >
> > On 6/21/19 7:10 AM, Ismael Juma wrote:
> > > This is even more reason not to deprecate immediately, there is very
> > little
> > > maintenance cost for us. We should be mindful that many of our users (eg
> > > Spark, Flink, etc.) typically allow users to specify the kafka clients
> > > version and hence avoid using new classes/interfaces for some time. They
> > > would get a bunch of warnings they cannot do anything about apart from
> > > suppressing.
> > >
> > > Ismael
> > >
> > > On Fri, Jun 21, 2019 at 4:00 AM Andy Coates  wrote:
> > >
> > >> Hi Ismael,
> > >>
> > >> I’m happy enough to not deprecate the existing `AdminClient` class as
> > part
> > >> of this change.
> > >>
> > >> However, note that, the class will likely be empty, i.e. all methods and
> > >> implementations will be inherited from the interface:
> > >>
> > >> public abstract class AdminClient implements Admin {
> > >> }
> > >>
> > >> Not marking it as deprecated has the benefit that users won’t see any
> > >> deprecation warnings on the next release. Conversely, deprecating it
> > will
> > >> mean we can choose to remove this, now pointless class, in the future
> > if we
> > >> choose.
> > >>
> > >> That’s my thinking for deprecation, but as I’ve said I’m happy either
> > way.
> > >>
> > >> Andy
> > >>
> > >>> On 18 Jun 2019, at 16:09, Ismael Juma  wrote:
> > >>>
> > >>> I agree with Ryanne, I think we should avoid deprecating AdminClient
> > and
> > >>> causing so much churn for users who don't actually care about this
> > niche
> > >>> use case.
> > >>>
> > >>> Ismael
> > >>>
> > >>> On Tue, Jun 18, 2019 at 6:43 AM Andy Coates  wrote:
> > >>>
> >  Hi Ryanne,
> > 
> >  If we don't change the client code, then everywhere will still expect
> >  subclasses of `AdminClient`, so the interface will be of no use, i.e.
> > I
> >  can't write a class that implements the new interface and pass it to
> > the
> >  client code.
> > 
> >  Thanks,
> > 
> >  Andy
> > 
> >  On Mon, 17 Jun 2019 at 19:01, Ryanne Dolan 
> > >> wrote:
> > 
> > > Andy, while I agree that the new interface is useful, I'm not
> > convinced
> > > adding an interface requires deprecating AdminClient and changing so
> > >> much
> > > client code. Why not just add the Admin interface, have AdminClient
> > > implement it, and have done?
> > >
> > > Ryanne
> > >
> > > On Mon, Jun 17, 2019 at 12:09 PM Andy Coates 
> > >> wrote:
> > >
> > >> Hi all,
> > >>
> > >> I think I've addressed all concerns. Let me know if I've not.  Can I
> >  call
> > >> another round of votes please?
> > >>
> > >> Thanks,
> > >>
> > >> Andy
> > >>
> > >> On Fri, 14 Jun 2019 at 04:55, Satish Duggana <
> > >> satish.dugg...@gmail.com
> > >
> > >> wrote:
> > >>
> > >>> Hi Andy,
> > >>> Thanks for the KIP. This is a good change and it gives the user a
> > > better
> > >>> handle on Admin client usage. I agree with the proposal except the
> >  new
> > >>> `Admin` interface having all the methods from `AdminClient`
> > abstract
> > >> class.
> > >>> It should be kept clean having only the admin operations as methods
> > > from
> > >>> KafkaClient abstract class but not the factory methods as mentioned
> >  in
> > >> the
> > >>> earlier mail.
> > >>>
> > >>> I know about dynamic proxies(which were widely used in RMI/EJB
> >  world).
> > > I
> > >> am
> > >>> curious about the usecase using dynamic proxies with Admin client
> > >>> interface. Dynamic proxy can have performance penalty if it is used
> >  in
> > >>> critical path. Is that the primary motivation for creating the KIP?
> > >>>
> > >>> Thanks,
> > >>> Satish.
> > >>>
> > >>> On Wed, Jun 12, 2019 at 8:43 PM Andy Coates 
> >  wrote:
> > >>>
> >  I'm not married to that part.  That was only done to keep it more
> >  or
> > >> less
> >  inline with what's already there, (an abstract class that has a
> > > factory
> 

Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-06-25 Thread Jason Gustafson
For reference, we have BrokerApiVersionCommand already as a public
interface. We have a bit of tech debt at the moment because it uses a
custom AdminClient. It would be nice to clean that up. In general, I think
it is reasonable to expose this from AdminClient. It can be used by management
tools to inspect running Kafka versions, for example.

-Jason

On Tue, Jun 25, 2019 at 4:37 PM Boyang Chen 
wrote:

> Thank you for the context Colin. The groupId was indeed a copy-paste error.
> Our use case here for 447 is (Quoted from Guozhang):
> '''
> I think if we can do something else to
> avoid this config though, for example we can use the embedded AdminClient
> to send the APIVersion request upon starting up, and based on the returned
> value decides whether to go to the old code path or the new behavior.
> '''
> The benefit we get is to avoid adding a new configuration to make a
> decision simply based on the broker version. If you have concerns with
> exposing ApiVersion for clients, we could try to think of alternative
> solutions too.
>
> Boyang
>
>
>
> On Tue, Jun 25, 2019 at 4:20 PM Colin McCabe  wrote:
>
> > kafka.api.ApiVersion is an internal class, not suitable for exposing
> > through AdminClient.  That class is not even accessible without having the
> > broker jars on your CLASSPATH.
> >
> > Another question is, what is the groupId parameter doing in the call?
> The
> > API versions are the same no matter what consumer group we use, right?
> > Perhaps this was a copy and paste error?
> >
> > This is not the first time we have discussed having a method in
> > AdminClient to retrieve API version information.  In fact, the original
> KIP
> > which created KafkaAdminClient specified an API for fetching version
> > information.  It was called apiVersions and it is still there on the
> wiki.
> > See
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-117%3A+Add+a+public+AdminClient+API+for+Kafka+admin+operations
> >
> > However, this API wasn't ready in time for 0.11.0 so we shipped without
> > it.  There was a JIRA to implement it for later versions,
> > https://issues.apache.org/jira/browse/KAFKA-5214 , as well as a PR,
> > https://github.com/apache/kafka/pull/3012 .  However, we started to
> > rethink whether this AdminClient function was even necessary.  Most of
> the
> > use-cases we could think of seemed like horrible hacks.  So it has never
> > really been implemented (yet?).
> >
> > best,
> > Colin
> >
> >
> > On Tue, Jun 25, 2019, at 15:46, Boyang Chen wrote:
> > > Actually, after a second thought, I think it makes sense to
> > > support auto upgrade through the admin client to help us get the api
> > > version from the broker.
> > > A draft KIP is here:
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-483%3A++Add+Broker+Version+API+in+Admin+Client
> > >
> > > Boyang
> > >
> > > On Tue, Jun 25, 2019 at 2:57 PM Boyang Chen <
> reluctanthero...@gmail.com>
> > > wrote:
> > >
> > > > Thank you Guozhang, some of my understandings are inline below.
> > > >
> > > > On Tue, Jun 25, 2019 at 11:05 AM Jason Gustafson  >
> > > > wrote:
> > > >
> > > >> >
> > > >> > I think co-locating does have some merits here, i.e. letting the
> > > >> > ConsumerCoordinator which has the source-of-truth of assignment to
> > act
> > > >> as
> > > >> > the TxnCoordinator as well; but I agree there's also some cons of
> > > >> coupling
> > > >> > them together. I'm still a bit inclining towards colocation but if
> > there
> > > >> > are good rationales not to do so I can be convinced as well.
> > > >>
> > > >>
> > > >> The good rationale is that we have no mechanism to colocate
> > partitions ;).
> > > >> Are you suggesting we store the group and transaction state in the
> > same
> > > >> log? Can you be more concrete about the benefit?
> > > >>
> > > >> -Jason
> > > >>
> > > >> On Tue, Jun 25, 2019 at 10:51 AM Guozhang Wang 
> > > >> wrote:
> > > >>
> > > >> > Hi Boyang,
> > > >> >
> > > >> > 1. One advantage of retry against on-hold is that it will not
> > tie-up a
> > > >> > handler thread (of course the latter could do the same but that
> > involves
> > > >> > using a purgatory which is more complicated), and also it is less
> > > >> likely to
> > > >> > violate request timeout. So I think there are some rationales to
> > prefer
> > > >> > retries.
> > > >> >
> > > >>
> > > >  That sounds fair to me, also we are avoiding usage of another
> > purgatory
> > > > instance. Usually for one back-off
> > > > we are only delaying 50ms during startup which is trivial cost. This
> > > > behavior shouldn't be changed.
> > > >
> > > > > 2. Regarding "ConsumerRebalanceListener": both
> > ConsumerRebalanceListener
> > > >> > and PartitionAssignors are user-customizable modules, and only
> > > >> difference
> > > >> > is that the former is specified via code and the latter is
> > specified via
> > > >> > config.
> > > >> >
> > > >> > Regarding Jason's proposal of ConsumerAssignment, one thing to
> note
> > > >> though

Re: [DISCUSS] KIP-455: Create an Administrative API for Replica Reassignment

2019-06-25 Thread Jason Gustafson
Hi Colin,

Took another pass on the KIP. Looks good overall. A few questions below:

1. I wasn't clear why `currentReplicas` is an optional field. Wouldn't we
always have a current set of replicas?

2. It seems the only option is to list all active partition reassignments? I
think we have tended to regret these APIs. Should we at least be able to
specify a subset of topics or partitions?

3. Can you elaborate a bit on the handling of /admin/reassign_partitions?
Does this alter the target_replicas of the leader and ISR znode?

4. I think it would be helpful to provide an example of the rebalance
process for a given partition. Specifically I am wondering whether the
replica set is updated incrementally or if we follow the current behavior.
Possibly some decisions can be deferred to implementation, but it would be
helpful to work through a case of changing the replication factor just to
make sure there are reasonable options.

5. Are we changing the semantics of the URP and UnderMinIsr metrics in this
KIP or in a follow-up?

6. We have both "TargetBrokers" and "PendingReplicas" as names in the
proposal. Perhaps we should try to be consistent?

7. I am not sure specifying `targetReplicas` as empty is the clearest way
to cancel a reassignment. Whether we implement it this way or not in the
protocol is a separate issue, but perhaps we should have an explicit
`cancelReassignment` method in AdminClient?

Thanks,
Jason
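
For illustration of point 7, one shape an explicit cancellation method could 
take on the Admin API, instead of overloading an empty targetReplicas list. 
The method name and semantics below are assumptions for the discussion, not an 
agreed API:

    // Hypothetical helper mirroring the suggestion above.  Internally it
    // could still be expressed as clearing the partition's target replicas.
    admin.cancelReassignment(
            Collections.singleton(new TopicPartition("my-topic", 0)))
         .all().get();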




On Wed, Jun 19, 2019 at 3:36 AM Stanislav Kozlovski 
wrote:

> Hey there Colin,
>
> Thanks for the work on this KIP. It is a much-needed improvement and I'm
> excited to see it. Sorry for coming in so late to the discussion, I have
> one question to better understand the change and a small suggestion.
>
> I see we allow reassignment cancellation at the partition level - what is
> the motivation behind that? I think that having the back-end structures
> support it is a good choice since it allows us more flexibility in the
> future but what are the reasons for allowing a user to cancel at a
> partition level? I think allowing it might let users shoot themselves in
> the foot more easily and make tools harder to implement (needing to guard
> against it).
>
> Regardless, what do you think about an ease-of-use improvement where we
> allow a user to cancel all reassignments for a topic without specifying its
> partitions? Essentially, we could cancel all reassignments for a topic if
> the Partitions field in AlterPartitionAssignmentsRequest is null.
>
> Best,
> Stanislav
>
> On Mon, May 6, 2019 at 5:42 PM Colin McCabe  wrote:
>
> > On Mon, May 6, 2019, at 07:39, Ismael Juma wrote:
> > > Hi Colin,
> > >
> > > A quick comment.
> > >
> > > On Sat, May 4, 2019 at 11:18 PM Colin McCabe 
> wrote:
> > >
> > > > The big advantage of doing batching on the controller is that the
> > > > controller has more information about what is going on in the
> > cluster.  So
> > > > it can schedule reassignments in a more optimal way.  For instance,
> it
> > can
> > > > schedule reassignments so that the load is distributed evenly across
> > > > nodes.  This advantage is lost if we have to adhere to a rigid
> ordering
> > > > that is set up in advance.  We don't know exactly when anything will
> > > > complete in any case.  Just because one partition reassignment was
> > started
> > > > before another doesn't mean it will finish before another.
> > >
> > >
> > > This is not quite true, right? The Controller doesn't know about
> > partition
> > > sizes, throughput per partition and other such information that
> external
> > > tools like Cruise Control track.
> >
> > Hi Ismael,
> >
> > That's a good point, and one I should have included.
> >
> > I guess when I think about "do batching in the controller" versus "do
> > batching in an external system" I tend to think about the information the
> > controller could theoretically collect, rather than what it actually does
> > :)  But certainly, adding this information to the controller would be a
> > significant change, and maybe one we don't want to do if the external
> > systems work well enough.
> >
> > Thinking about this a little bit more, I can see three advantages to
> > controller-side batching.  Firstly, doing batching in the controller
> saves
> > memory because we don't use a separate JVM, and don't duplicate the
> > in-memory map of all the partitions.  Secondly, the information we're
> > acting on would also be more up-to-date.  (I'm not sure how important
> this
> > would be.)  Finally, it's one less thing to deploy.  I don't know if
> those
> > are really enough to motivate switching now, but in a greenfield system I
> > would probably choose controller-side rebalancing.
> >
> > In any case, this KIP is orthogonal to controller-side rebalancing versus
> > external rebalancing.  That's why the KIP states that we will continue to
> > perform all the given partition rebalances immediately.  I was just
> > responding to the idea that maybe we should have an 

[jira] [Created] (KAFKA-8603) Document upgrade path

2019-06-25 Thread Sophie Blee-Goldman (JIRA)
Sophie Blee-Goldman created KAFKA-8603:
--

 Summary: Document upgrade path
 Key: KAFKA-8603
 URL: https://issues.apache.org/jira/browse/KAFKA-8603
 Project: Kafka
  Issue Type: Sub-task
  Components: consumer, streams
Reporter: Sophie Blee-Goldman


Users need to follow a specific upgrade path in order to smoothly and safely 
perform a live upgrade. We should very clearly document the steps needed to 
upgrade a Consumer app and a Streams app (note they will be different).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[DISCUSS] KIP-484: Expose metrics for group and transaction metadata loading duration

2019-06-25 Thread Anastasia Vela
Hi all,

I'd like to discuss KIP-484:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-484%3A+Expose+metrics+for+group+and+transaction+metadata+loading+duration

Let me know what you think!

Thanks,
Anastasia
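
Since the KIP proposes new duration metrics, here is a small sketch of how a 
loading-time metric can be recorded with Kafka's Metrics API. The sensor, 
group, and metric names are illustrative, not the names the KIP proposes:

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;
    import org.apache.kafka.common.metrics.stats.Max;

    Metrics metrics = new Metrics();
    Sensor loadSensor = metrics.sensor("group-metadata-load");
    loadSensor.add(metrics.metricName("partition-load-time-avg",
            "group-coordinator-metrics",
            "Average time to load group metadata, in ms"), new Avg());
    loadSensor.add(metrics.metricName("partition-load-time-max",
            "group-coordinator-metrics",
            "Max time to load group metadata, in ms"), new Max());

    long startMs = System.currentTimeMillis();
    // ... load the __consumer_offsets partition ...
    loadSensor.record(System.currentTimeMillis() - startMs);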


Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-06-25 Thread Boyang Chen
Thank you for the context Colin. The groupId was indeed a copy-paste error.
Our use case here for 447 is (Quoted from Guozhang):
'''
I think if we can do something else to
avoid this config though, for example we can use the embedded AdminClient
to send the APIVersion request upon starting up, and based on the returned
value decides whether to go to the old code path or the new behavior.
'''
The benefit we get is to avoid adding a new configuration to make a
decision simply based on the broker version. If you have concerns with
exposing ApiVersion for clients, we could try to think of alternative
solutions too.

Boyang
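
A rough sketch of the startup check described above, gating the new code path 
on the broker's advertised API versions. The apiVersions() accessor is 
hypothetical (that gap is exactly what the reply below discusses), and the 
version-3 threshold is an assumed example:

    // Hypothetical: admin.apiVersions(node) stands in for whatever
    // mechanism eventually exposes ApiVersions to clients.
    NodeApiVersions versions = admin.apiVersions(broker).get();  // not a real method
    boolean useNewCodePath =
            versions.latestUsableVersion(ApiKeys.TXN_OFFSET_COMMIT) >= 3;
    // Decide once, at startup, between the old and new transactional paths.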



On Tue, Jun 25, 2019 at 4:20 PM Colin McCabe  wrote:

> kafka.api.ApiVersion is an internal class, not suitable for exposing
> through AdminClient.  That class is not even accessible without having the
> broker jars on your CLASSPATH.
>
> Another question is, what is the groupId parameter doing in the call?  The
> API versions are the same no matter what consumer group we use, right?
> Perhaps this was a copy and paste error?
>
> This is not the first time we have discussed having a method in
> AdminClient to retrieve API version information.  In fact, the original KIP
> which created KafkaAdminClient specified an API for fetching version
> information.  It was called apiVersions and it is still there on the wiki.
> See
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-117%3A+Add+a+public+AdminClient+API+for+Kafka+admin+operations
>
> However, this API wasn't ready in time for 0.11.0 so we shipped without
> it.  There was a JIRA to implement it for later versions,
> https://issues.apache.org/jira/browse/KAFKA-5214 , as well as a PR,
> https://github.com/apache/kafka/pull/3012 .  However, we started to
> rethink whether this AdminClient function was even necessary.  Most of the
> use-cases we could think of seemed like horrible hacks.  So it has never
> really been implemented (yet?).
>
> best,
> Colin
>
>
> On Tue, Jun 25, 2019, at 15:46, Boyang Chen wrote:
> > Actually, after a second thought, I think it makes sense to
> > support auto upgrade through the admin client to help us get the api
> > version from the broker.
> > A draft KIP is here:
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-483%3A++Add+Broker+Version+API+in+Admin+Client
> >
> > Boyang
> >
> > On Tue, Jun 25, 2019 at 2:57 PM Boyang Chen 
> > wrote:
> >
> > > Thank you Guozhang, some of my understandings are inline below.
> > >
> > > On Tue, Jun 25, 2019 at 11:05 AM Jason Gustafson 
> > > wrote:
> > >
> > >> >
> > >> > I think co-locating does have some merits here, i.e. letting the
> > >> > ConsumerCoordinator which has the source-of-truth of assignment to
> act
> > >> as
> > >> > the TxnCoordinator as well; but I agree there's also some cons of
> > >> coupling
> > >> > them together. I'm still a bit inclining towards colocation but if
> there
> > >> > are good rationales not to do so I can be convinced as well.
> > >>
> > >>
> > >> The good rationale is that we have no mechanism to colocate
> partitions ;).
> > >> Are you suggesting we store the group and transaction state in the
> same
> > >> log? Can you be more concrete about the benefit?
> > >>
> > >> -Jason
> > >>
> > >> On Tue, Jun 25, 2019 at 10:51 AM Guozhang Wang 
> > >> wrote:
> > >>
> > >> > Hi Boyang,
> > >> >
> > >> > 1. One advantage of retry against on-hold is that it will not
> tie-up a
> > >> > handler thread (of course the latter could do the same but that
> involves
> > >> > using a purgatory which is more complicated), and also it is less
> > >> likely to
> > >> > violate request timeout. So I think there are some rationales to
> prefer
> > >> > retries.
> > >> >
> > >>
> > >  That sounds fair to me, also we are avoiding usage of another
> purgatory
> > > instance. Usually for one back-off
> > > we are only delaying 50ms during startup which is trivial cost. This
> > > behavior shouldn't be changed.
> > >
> > > > 2. Regarding "ConsumerRebalanceListener": both
> ConsumerRebalanceListener
> > >> > and PartitionAssignors are user-customizable modules, and only
> > >> difference
> > >> > is that the former is specified via code and the latter is
> specified via
> > >> > config.
> > >> >
> > >> > Regarding Jason's proposal of ConsumerAssignment, one thing to note
> > >> though
> > >> > with KIP-429 the onPartitionAssigned may not be called if the
> assignment
> > >> > does not change, whereas onAssignment would always be called at the
> end
> > >> of
> > >> > sync-group response. My proposed semantics is that
> > >> > `RebalanceListener#onPartitionsXXX` are used for notifications to
> user,
> > >> and
> > >> > hence if there's no changes these will not be called, whereas
> > >> > `PartitionAssignor` is used for assignor logic, whose callback would
> > >> always
> > >> > be called no matter if the partitions have changed or not.
> > >>
> > >> I think a third option is to gracefully expose generation id as part
> of
> > > consumer API, so that 

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-06-25 Thread Colin McCabe
+1 (binding).

C.

On Mon, Jun 24, 2019, at 08:10, Andy Coates wrote:
> Hi all,
> 
> KIP updated:
> - No deprecation
> - Factory method back onto Admin interface
> 
> I'd like to kick off another round of voting please.
> 
> Thanks,
> 
> Andy
> 
> On Mon, 24 Jun 2019 at 16:03, Andy Coates  wrote:
> 
> > I agree Matthias.
> >
> > (In Scala, such factory methods are on a companion object. As Java doesn't
> > have the concept of a companion object, an equivalent would be a utility
> > class with a similar name...)
> >
> > However, I'll update the KIP to include the factory method on the
> > interface.
> >
> > On Fri, 21 Jun 2019 at 23:40, Matthias J. Sax 
> > wrote:
> >
> >> I still think, that an interface does not need to know anything about
> >> its implementation. But I am also fine if we add a factory method to the
> >> new interface if that is preferred by most people.
> >>
> >>
> >> -Matthias
> >>
> >> On 6/21/19 7:10 AM, Ismael Juma wrote:
> >> > This is even more reason not to deprecate immediately, there is very
> >> little
> >> > maintenance cost for us. We should be mindful that many of our users (eg
> >> > Spark, Flink, etc.) typically allow users to specify the kafka clients
> >> > version and hence avoid using new classes/interfaces for some time. They
> >> > would get a bunch of warnings they cannot do anything about apart from
> >> > suppressing.
> >> >
> >> > Ismael
> >> >
> >> > On Fri, Jun 21, 2019 at 4:00 AM Andy Coates  wrote:
> >> >
> >> >> Hi Ismael,
> >> >>
> >> >> I’m happy enough to not deprecate the existing `AdminClient` class as
> >> part
> >> >> of this change.
> >> >>
> >> >> However, note that, the class will likely be empty, i.e. all methods
> >> and
> >> >> implementations will be inherited from the interface:
> >> >>
> >> >> public abstract class AdminClient implements Admin {
> >> >> }
> >> >>
> >> >> Not marking it as deprecated has the benefit that users won’t see any
> >> >> deprecation warnings on the next release. Conversely, deprecating it
> >> will
> >> >> mean we can choose to remove this, now pointless class, in the future
> >> if we
> >> >> choose.
> >> >>
> >> >> That’s my thinking for deprecation, but as I’ve said I’m happy either
> >> way.
> >> >>
> >> >> Andy
> >> >>
> >> >>> On 18 Jun 2019, at 16:09, Ismael Juma  wrote:
> >> >>>
> >> >>> I agree with Ryanne, I think we should avoid deprecating AdminClient
> >> and
> >> >>> causing so much churn for users who don't actually care about this
> >> niche
> >> >>> use case.
> >> >>>
> >> >>> Ismael
> >> >>>
> >> >>> On Tue, Jun 18, 2019 at 6:43 AM Andy Coates 
> >> wrote:
> >> >>>
> >>  Hi Ryanne,
> >> 
> >>  If we don't change the client code, then everywhere will still expect
> >>  subclasses of `AdminClient`, so the interface will be of no use,
> >> i.e. I
> >>  can't write a class that implements the new interface and pass it to
> >> the
> >>  client code.
> >> 
> >>  Thanks,
> >> 
> >>  Andy
> >> 
> >>  On Mon, 17 Jun 2019 at 19:01, Ryanne Dolan 
> >> >> wrote:
> >> 
> >> > Andy, while I agree that the new interface is useful, I'm not
> >> convinced
> >> > adding an interface requires deprecating AdminClient and changing so
> >> >> much
> >> > client code. Why not just add the Admin interface, have AdminClient
> >> > implement it, and have done?
> >> >
> >> > Ryanne
> >> >
> >> > On Mon, Jun 17, 2019 at 12:09 PM Andy Coates 
> >> >> wrote:
> >> >
> >> >> Hi all,
> >> >>
> >> >> I think I've addressed all concerns. Let me know if I've not.  Can
> >> I
> >>  call
> >> >> another round of votes please?
> >> >>
> >> >> Thanks,
> >> >>
> >> >> Andy
> >> >>
> >> >> On Fri, 14 Jun 2019 at 04:55, Satish Duggana <
> >> >> satish.dugg...@gmail.com
> >> >
> >> >> wrote:
> >> >>
> >> >>> Hi Andy,
> >> >>> Thanks for the KIP. This is a good change and it gives the user a
> >> > better
> >> >>> handle on Admin client usage. I agree with the proposal except the
> >>  new
> >> >>> `Admin` interface having all the methods from `AdminClient`
> >> abstract
> >> >> class.
> >> >>> It should be kept clean having only the admin operations as
> >> methods
> >> > from
> >> >>> KafkaClient abstract class but not the factory methods as
> >> mentioned
> >>  in
> >> >> the
> >> >>> earlier mail.
> >> >>>
> >> >>> I know about dynamic proxies(which were widely used in RMI/EJB
> >>  world).
> >> > I
> >> >> am
> >> >>> curious about the usecase using dynamic proxies with Admin client
> >> >>> interface. Dynamic proxy can have performance penalty if it is
> >> used
> >>  in
> >> >>> critical path. Is that the primary motivation for creating the
> >> KIP?
> >> >>>
> >> >>> Thanks,
> >> >>> Satish.
> >> >>>
> >> >>> On Wed, Jun 12, 2019 at 8:43 PM Andy Coates 
> >>  wrote:
> >> >>>
> 

Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-06-25 Thread Colin McCabe
kafka.api.ApiVersion is an internal class, not suitable for exposing through 
AdminClient.  That class is not even accessible without having the broker jars 
on your CLASSPATH.

Another question is, what is the groupId parameter doing in the call?  The API 
versions are the same no matter what consumer group we use, right?  Perhaps 
this was a copy and paste error?

This is not the first time we have discussed having a method in AdminClient to 
retrieve API version information.  In fact, the original KIP which created 
KafkaAdminClient specified an API for fetching version information.  It was 
called apiVersions and it is still there on the wiki.  See 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-117%3A+Add+a+public+AdminClient+API+for+Kafka+admin+operations

However, this API wasn't ready in time for 0.11.0 so we shipped without it.  
There was a JIRA to implement it for later versions, 
https://issues.apache.org/jira/browse/KAFKA-5214 , as well as a PR, 
https://github.com/apache/kafka/pull/3012 .  However, we started to rethink 
whether this AdminClient function was even necessary.  Most of the use-cases we 
could think of seemed like horrible hacks.  So it has never really been 
implemented (yet?).

best,
Colin


On Tue, Jun 25, 2019, at 15:46, Boyang Chen wrote:
> Actually, after a second thought, I think it makes sense to
> support auto upgrade through the admin client to help us get the api
> version from the broker.
> A draft KIP is here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-483%3A++Add+Broker+Version+API+in+Admin+Client
> 
> Boyang
> 
> On Tue, Jun 25, 2019 at 2:57 PM Boyang Chen 
> wrote:
> 
> > Thank you Guozhang, some of my understandings are inline below.
> >
> > On Tue, Jun 25, 2019 at 11:05 AM Jason Gustafson 
> > wrote:
> >
> >> >
> >> > I think co-locating does have some merits here, i.e. letting the
> >> > ConsumerCoordinator which has the source-of-truth of assignment to act
> >> as
> >> > the TxnCoordinator as well; but I agree there's also some cons of
> >> coupling
> >> > them together. I'm still a bit inclining towards colocation but if there
> >> > are good rationales not to do so I can be convinced as well.
> >>
> >>
> >> The good rationale is that we have no mechanism to colocate partitions ;).
> >> Are you suggesting we store the group and transaction state in the same
> >> log? Can you be more concrete about the benefit?
> >>
> >> -Jason
> >>
> >> On Tue, Jun 25, 2019 at 10:51 AM Guozhang Wang 
> >> wrote:
> >>
> >> > Hi Boyang,
> >> >
> >> > 1. One advantage of retry against on-hold is that it will not tie-up a
> >> > handler thread (of course the latter could do the same but that involves
> >> > using a purgatory which is more complicated), and also it is less
> >> likely to
> >> > violate request timeout. So I think there are some rationales to prefer
> >> > retries.
> >> >
> >>
> >  That sounds fair to me, also we are avoiding usage of another purgatory
> > instance. Usually for one back-off
> > we are only delaying 50ms during startup which is trivial cost. This
> > behavior shouldn't be changed.
> >
> > > 2. Regarding "ConsumerRebalanceListener": both ConsumerRebalanceListener
> >> > and PartitionAssignors are user-customizable modules, and only
> >> difference
> >> > is that the former is specified via code and the latter is specified via
> >> > config.
> >> >
> >> > Regarding Jason's proposal of ConsumerAssignment, one thing to note
> >> though
> >> > with KIP-429 the onPartitionAssigned may not be called if the assignment
> >> > does not change, whereas onAssignment would always be called at the end
> >> of
> >> > sync-group response. My proposed semantics is that
> >> > `RebalanceListener#onPartitionsXXX` are used for notifications to user,
> >> and
> >> > hence if there's no changes these will not be called, whereas
> >> > `PartitionAssignor` is used for assignor logic, whose callback would
> >> always
> >> > be called no matter if the partitions have changed or not.
> >>
> >> I think a third option is to gracefully expose generation id as part of
> > consumer API, so that we don't need to
> > bother overloading various callbacks. Of course, this builds upon the
> > assumption that topic partitions
> > will not be included in new initTransaction API.
> >
> > > 3. I feel it is a bit awkward to let the TxnCoordinator keeping partition
> >> > assignments since it is sort of taking over the job of the
> >> > ConsumerCoordinator, and may likely cause a split-brain problem as two
> >> > coordinators keep a copy of this assignment which may be different.
> >> >
> >> > I think co-locating does have some merits here, i.e. letting the
> >> > ConsumerCoordinator which has the source-of-truth of assignment to act
> >> as
> >> > the TxnCoordinator as well; but I agree there's also some cons of
> >> coupling
> >> > them together. I'm still a bit inclining towards colocation but if there
> >> > are good rationales not to do so I can be convinced as 

Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-06-25 Thread Boyang Chen
Actually, after a second thought, I think it makes sense to
support auto upgrade through the admin client to help us get the api version
from the broker.
A draft KIP is here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-483%3A++Add+Broker+Version+API+in+Admin+Client

Boyang

On Tue, Jun 25, 2019 at 2:57 PM Boyang Chen 
wrote:

> Thank you Guozhang, some of my understandings are inline below.
>
> On Tue, Jun 25, 2019 at 11:05 AM Jason Gustafson 
> wrote:
>
>> >
>> > I think co-locating does have some merits here, i.e. letting the
>> > ConsumerCoordinator which has the source-of-truth of assignment to act
>> as
>> > the TxnCoordinator as well; but I agree there's also some cons of
>> coupling
>> > them together. I'm still a bit inclining towards colocation but if there
>> > are good rationales not to do so I can be convinced as well.
>>
>>
>> The good rationale is that we have no mechanism to colocate partitions ;).
>> Are you suggesting we store the group and transaction state in the same
>> log? Can you be more concrete about the benefit?
>>
>> -Jason
>>
>> On Tue, Jun 25, 2019 at 10:51 AM Guozhang Wang 
>> wrote:
>>
>> > Hi Boyang,
>> >
>> > 1. One advantage of retry against on-hold is that it will not tie-up a
>> > handler thread (of course the latter could do the same but that involves
>> > using a purgatory which is more complicated), and also it is less
>> likely to
>> > violate request timeout. So I think there are some rationales to prefer
>> > retries.
>> >
>>
>  That sounds fair to me, also we are avoiding usage of another purgatory
> instance. Usually for one back-off
> we are only delaying 50ms during startup which is trivial cost. This
> behavior shouldn't be changed.
>
> > 2. Regarding "ConsumerRebalanceListener": both ConsumerRebalanceListener
>> > and PartitionAssignors are user-customizable modules, and only
>> difference
>> > is that the former is specified via code and the latter is specified via
>> > config.
>> >
>> > Regarding Jason's proposal of ConsumerAssignment, one thing to note
>> though
>> > with KIP-429 the onPartitionAssigned may not be called if the assignment
>> > does not change, whereas onAssignment would always be called at the end
>> of
>> > sync-group response. My proposed semantics is that
>> > `RebalanceListener#onPartitionsXXX` are used for notifications to user,
>> and
>> > hence if there's no changes these will not be called, whereas
>> > `PartitionAssignor` is used for assignor logic, whose callback would
>> always
>> > be called no matter if the partitions have changed or not.
>>
>> I think a third option is to gracefully expose generation id as part of
> consumer API, so that we don't need to
> bother overloading various callbacks. Of course, this builds upon the
> assumption that topic partitions
> will not be included in new initTransaction API.
>
> > 3. I feel it is a bit awkward to let the TxnCoordinator keeping partition
>> > assignments since it is sort of taking over the job of the
>> > ConsumerCoordinator, and may likely cause a split-brain problem as two
>> > coordinators keep a copy of this assignment which may be different.
>> >
>> > I think co-locating does have some merits here, i.e. letting the
>> > ConsumerCoordinator which has the source-of-truth of assignment to act
>> as
>> > the TxnCoordinator as well; but I agree there's also some cons of
>> coupling
>> > them together. I'm still a bit inclining towards colocation but if there
>> > are good rationales not to do so I can be convinced as well.
>> >
>>
> The purpose of co-location is to let the txn coordinator see the group
> assignment. This priority is weakened now that we already have a defense on
> the consumer offset fetch, so I guess it's not super important anymore.
>
>
>> > 4. I guess I'm preferring the philosophy of "only add configs if
>> there's no
>> > other ways", since more and more configs would make it less and less
>> > intuitive out of the box to use.
>> >
>> > I think it's a valid point that checks upon starting up does not cope
>> with
>> > brokers downgrading but even with a config, but it is still hard for
>> users
>> > to determine when they can be ensured the broker would never downgrade
>> > anymore and hence can safely switch the config. So my feeling is that
>> this
>> > config would not be helping too much still. If we want to be at the
>> safer
>> > side, then I'd suggest we modify the Coordinator -> NetworkClient
>> hierarchy
>> > to allow the NetworkClient being able to pass the APIVersion metadata to
>> > Coordinator, so that Coordinator can rely on that logic to change its
>> > behavior dynamically.
>>
> The stream thread init could not be supported by a client coordinator
> behavior change on the fly; we only lose possibilities after we have
> initialized (the main thread exits and no thread has the global picture
> anymore). If we do want to support auto version detection, an admin client
> request would be easier in this sense.
>
>
>> >
>> > 5. I do not have a 

Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-25 Thread Colin McCabe
No worries.  Thanks for fixing it!
C.

On Tue, Jun 25, 2019, at 13:47, Justine Olshan wrote:
> Also apologies on the late link to the jira, but apparently https links do
> not work and it kept defaulting to an image on my desktop even when it
> looked like I put the correct link in. Weird...
> 
> On Tue, Jun 25, 2019 at 1:41 PM Justine Olshan  wrote:
> 
> > I came up with a good solution for this and will push the commit soon. The
> > repartition method will be called only when a partition is not manually set.
> >
> > On Tue, Jun 25, 2019 at 1:39 PM Colin McCabe  wrote:
> >
> >> Well, this is a generic partitioner method, so it shouldn't dictate any
> >> particular behavior.
> >>
> >> Colin
> >>
> >>
> >> On Tue, Jun 25, 2019, at 12:04, Justine Olshan wrote:
> >> > I also just noticed that if we want to use this method on the keyed
> >> record
> >> > case, I will need to move the method outside of the sticky (no key, no
> >> set
> >> > partition) check. Not a big problem, but something to keep in mind.
> >> > Perhaps, we should encapsulate the sticky vs. not behavior inside the
> >> > method? More things to think about.
> >> >
> >> > On Tue, Jun 25, 2019 at 11:55 AM Colin McCabe 
> >> wrote:
> >> >
> >> > > Hi Justine,
> >> > >
> >> > > The KIP discusses adding a new method to the partitioner interface.
> >> > >
> >> > > > default public Integer onNewBatch(String topic, Cluster cluster) {
> >> ... }
> >> > >
> >> > > However, this new method doesn't give the partitioner access to the
> >> key
> >> > > and value of the message.  While this works for the case described
> >> here (no
> >> > > key), in general we might need this information when re-assigning a
> >> > > partition based on the batch completing.  So I think we should add
> >> these
> >> > > methods to onNewBatch.
> >> > >
> >> > > Also, it would be nice to call this something like
> >> "repartitionOnNewBatch"
> >> > > or something, to make it clearer what is going on.
> >> > >
> >> > > best,
> >> > > Colin
> >> > >
> >> > > On Mon, Jun 24, 2019, at 18:32, Boyang Chen wrote:
> >> > > > Thank you Justine for the KIP! Do you mind creating a corresponding
> >> JIRA
> >> > > > ticket too?
> >> > > >
> >> > > > On Mon, Jun 24, 2019 at 4:51 PM Colin McCabe 
> >> wrote:
> >> > > >
> >> > > > > Hi Justine,
> >> > > > >
> >> > > > > Thanks for the KIP.  This looks great!
> >> > > > >
> >> > > > > In one place in the KIP, you write: "Remove
> >> > > > > testRoundRobinWithUnavailablePartitions() and testRoundRobin()
> >> since
> >> > > the
> >> > > > > round robin functionality of the partitioner has been removed."
> >> You
> >> > > can
> >> > > > > skip this and similar lines.  We don't need to describe changes to
> >> > > internal
> >> > > > > test classes in the KIP since they're not visible to users or
> >> external
> >> > > > > developers.
> >> > > > >
> >> > > > > It seems like maybe the performance tests should get their own
> >> section.
> >> > > > > Right now, the way the layout is makes it look like they are part
> >> of
> >> > > the
> >> > > > > "Compatibility, Deprecation, and Migration Plan"
> >> > > > >
> >> > > > > best,
> >> > > > > Colin
> >> > > > >
> >> > > > >
> >> > > > > On Mon, Jun 24, 2019, at 14:04, Justine Olshan wrote:
> >> > > > > > Hello,
> >> > > > > > This is the discussion thread for KIP-480: Sticky Partitioner.
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > >
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> >> > > > > >
> >> > > > > > Thank you,
> >> > > > > > Justine Olshan
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
>
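
For reference, a rough sketch of the partitioner callback debated above, with 
the key/value parameters Colin suggests folded in. The exact signature was 
still being settled in the thread, so treat this as illustrative only:

    // The producer's Partitioner SPI, with the proposed addition sketched in.
    public interface Partitioner extends Configurable, Closeable {
        int partition(String topic, Object key, byte[] keyBytes,
                      Object value, byte[] valueBytes, Cluster cluster);

        // Proposed addition (names assumed): called when the producer
        // starts a new batch, so a sticky partitioner can pick the next
        // partition.  Returning null keeps the current choice.
        default Integer repartitionOnNewBatch(String topic, Object key,
                                              Object value, Cluster cluster) {
            return null;
        }
    }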


Re: Contribution access to Kafka Confluence page

2019-06-25 Thread Bill Bejeck
Hi Anastasia,

You're all set now!

-Bill

On Tue, Jun 25, 2019 at 1:50 PM Anastasia Vela  wrote:

> Hi,
>
> Could I get access to create KIPs on the Kafka Confluence page?
> Email: av...@confleunt.io
> UserID: avela
>
> Thanks,
> Anastasia
>


Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-06-25 Thread Boyang Chen
Thank you Guozhang, some of my understandings are inline below.

On Tue, Jun 25, 2019 at 11:05 AM Jason Gustafson  wrote:

> >
> > I think co-locating does have some merits here, i.e. letting the
> > ConsumerCoordinator which has the source-of-truth of assignment to act as
> > the TxnCoordinator as well; but I agree there's also some cons of
> coupling
> > them together. I'm still a bit inclining towards colocation but if there
> > are good rationales not to do so I can be convinced as well.
>
>
> The good rationale is that we have no mechanism to colocate partitions ;).
> Are you suggesting we store the group and transaction state in the same
> log? Can you be more concrete about the benefit?
>
> -Jason
>
> On Tue, Jun 25, 2019 at 10:51 AM Guozhang Wang  wrote:
>
> > Hi Boyang,
> >
> > 1. One advantage of retry against on-hold is that it will not tie-up a
> > handler thread (of course the latter could do the same but that involves
> > using a purgatory which is more complicated), and also it is less likely
> to
> > violate request timeout. So I think there are some rationales to prefer
> > retries.
> >
>
 That sounds fair to me; also we are avoiding usage of another purgatory
instance. Usually for one back-off we are only delaying 50ms during startup,
which is a trivial cost. This behavior shouldn't be changed.

> 2. Regarding "ConsumerRebalanceListener": both ConsumerRebalanceListener
> > and PartitionAssignors are user-customizable modules, and only difference
> > is that the former is specified via code and the latter is specified via
> > config.
> >
> > Regarding Jason's proposal of ConsumerAssignment, one thing to note
> though
> > with KIP-429 the onPartitionAssigned may not be called if the assignment
> > does not change, whereas onAssignment would always be called at the end
> of
> > sync-group response. My proposed semantics is that
> > `RebalanceListener#onPartitionsXXX` are used for notifications to user,
> and
> > hence if there's no changes these will not be called, whereas
> > `PartitionAssignor` is used for assignor logic, whose callback would
> always
> > be called no matter if the partitions have changed or not.
>
I think a third option is to gracefully expose the generation id as part of
the consumer API, so that we don't need to
bother overloading various callbacks. Of course, this builds on the
assumption that topic partitions
will not be included in the new initTransactions API.

> 3. I feel it is a bit awkward to let the TxnCoordinator keeping partition
> > assignments since it is sort of taking over the job of the
> > ConsumerCoordinator, and may likely cause a split-brain problem as two
> > coordinators keep a copy of this assignment which may be different.
> >
> > I think co-locating does have some merits here, i.e. letting the
> > ConsumerCoordinator which has the source-of-truth of assignment to act as
> > the TxnCoordinator as well; but I agree there's also some cons of
> coupling
> > them together. I'm still a bit inclining towards colocation but if there
> > are good rationales not to do so I can be convinced as well.
> >
>
The purpose of co-location is to let the txn coordinator see the group
assignment. That matters less
now that we already have a defense on the consumer offset fetch, so I guess
it's not super important anymore.


> > 4. I guess I'm preferring the philosophy of "only add configs if there's
> no
> > other ways", since more and more configs would make it less and less
> > intuitive out of the box to use.
> >
> > I think it's a valid point that a check upon starting up does not cope
> with
> > brokers downgrading, but even with a config it is still hard for
> users
> > to determine when they can be sure the broker would never downgrade
> > anymore and hence can safely switch the config. So my feeling is that
> this
> > config would still not help much. If we want to be on the safer
> > side, then I'd suggest we modify the Coordinator -> NetworkClient
> hierarchy
> > to allow the NetworkClient being able to pass the APIVersion metadata to
> > Coordinator, so that Coordinator can rely on that logic to change its
> > behavior dynamically.
>
Stream thread init could not be supported by an on-the-fly change in client
coordinator behavior;
we only lose possibilities after we have initialized (the main thread
exits and no thread has the global picture anymore).
If we do want to support auto version detection, an admin client request
would be easier in this sense.


> >
> > 5. I do not have a concrete idea about how the impact on Connect would
> > make, maybe Randall or Konstantine can help here?
> >
>
Sounds good, let's see their thoughts.


> > Guozhang
> >
> > On Mon, Jun 24, 2019 at 10:26 PM Boyang Chen  >
> > wrote:
> >
> > > Hey Jason,
> > >
> > > thank you for the proposal here. Some of my thoughts below.
> > >
> > > On Mon, Jun 24, 2019 at 8:58 PM Jason Gustafson 
> > > wrote:
> > >
> > > > Hi Boyang,
> > > >
> > > > Thanks for picking this up! Still reading 

Re: Preliminary blog post for the Apache Kafka 2.3.0 release

2019-06-25 Thread Stephane Maarek
Here it is: https://youtu.be/YutjYKSGd64

Thanks in advance!!

On Tue., 25 Jun. 2019, 12:39 am Colin McCabe,  wrote:

> Hi Stephane,
>
> Sounds interesting!  Do you have a link to the video you made for 2.3?
>
> best,
> Colin
>
>
> On Sun, Jun 23, 2019, at 15:10, Stephane Maarek wrote:
> > The video is ready :) waiting for the release of Kafka 2.3 to make it
> > public. @colin if you want to link it in the blog at some point that'd be
> > great!
> >
> > On Wed., 19 Jun. 2019, 4:03 pm Ron Dagostino,  wrote:
> >
> > > Looks great, Colin.
> > >
> > > I have also enjoyed Stephane Maarek's "What's New in Kafka..." series
> of
> > > videos (e.g. https://www.youtube.com/watch?v=kaWbp1Cnfo4=10s).
> Having
> > > summaries like this in both formats -- blog and video -- for every
> release
> > > would be helpful as different people have different preferences.
> > >
> > > Ron
> > >
> > > On Tue, Jun 18, 2019 at 8:20 PM Colin McCabe 
> wrote:
> > >
> > > > Thanks, Konstantine.  I reworked the wording a bit -- take a look.
> > > >
> > > > best,
> > > > C.
> > > >
> > > > On Tue, Jun 18, 2019, at 14:31, Konstantine Karantasis wrote:
> > > > > Thanks Colin.
> > > > > Great initiative!
> > > > >
> > > > > Here's a small correction (between **) for KIP-415 with a small
> > > > suggestion
> > > > > as well (between _ _):
> > > > >
> > > > > In Kafka Connect, worker tasks are distributed among the available
> > > worker
> > > > > nodes. When a connector is reconfigured or a new connector is
> deployed
> > > > _as
> > > > > well as when a worker is added or removed_, the *tasks* must be
> > > > rebalanced
> > > > > across the Connect cluster to help ensure that all of the worker
> nodes
> > > > are
> > > > > doing a fair share of the Connect work. In 2.2 and earlier, a
> Connect
> > > > > rebalance caused all worker threads to pause while the rebalance
> > > > proceeded.
> > > > > As of KIP-415, rebalancing is no longer a stop-the-world affair,
> making
> > > > > configuration changes a more pleasant thing.
> > > > >
> > > > > Cheers,
> > > > > Konstantine
> > > > >
> > > > > On Tue, Jun 18, 2019 at 1:50 PM Swen Moczarski <
> > > swen.moczar...@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Nice overview!
> > > > > >
> > > > > > I found some typos:
> > > > > > rbmainder
> > > > > > bmits
> > > > > > implbmentation
> > > > > >
> > > > > > Am Di., 18. Juni 2019 um 22:43 Uhr schrieb Boyang Chen <
> > > > > > bche...@outlook.com
> > > > > > >:
> > > > > >
> > > > > > > One typo:
> > > > > > > KIP-428: Add in-mbmory window store
> > > > > > > should be
> > > > > > > KIP-428: Add in-memory window store
> > > > > > >
> > > > > > >
> > > > > > > 
> > > > > > > From: Colin McCabe 
> > > > > > > Sent: Wednesday, June 19, 2019 4:22 AM
> > > > > > > To: dev@kafka.apache.org
> > > > > > > Subject: Re: Preliminary blog post for the Apache Kafka 2.3.0
> > > release
> > > > > > >
> > > > > > > Sorry, I copied the wrong URL at first.  Try this URL instead:
> > > > > > >
> > > > > >
> > > >
> > >
> https://blogs.apache.org/preview/kafka/?previewEntry=what-s-new-in-apache
> > > > > > >
> > > > > > > best,
> > > > > > > Colin
> > > > > > >
> > > > > > > On Tue, Jun 18, 2019, at 13:17, Colin McCabe wrote:
> > > > > > > > Hmm.  I'm looking to see if there's any way to open up the
> > > > > > > permissions... :|
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Jun 18, 2019, at 13:12, M. Manna wrote:
> > > > > > > > > It’s asking for credentials...?
> > > > > > > > >
> > > > > > > > > On Tue, 18 Jun 2019 at 15:10, Colin McCabe <
> cmcc...@apache.org
> > > >
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > I've written up a preliminary blog post about the
> upcoming
> > > > Apache
> > > > > > > Kafka
> > > > > > > > > > 2.3.0 release.  Take a look and let me know what you
> think.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > >
> > >
> https://blogs.apache.org/roller-ui/authoring/preview/kafka/?previewEntry=what-s-new-in-apache
> > > > > > > > > >
> > > > > > > > > > cheers,
> > > > > > > > > > Colin
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-25 Thread Justine Olshan
Also apologies on the late link to the jira, but apparently https links do
not work and it kept defaulting to an image on my desktop even when it
looked like I put the correct link in. Weird...

On Tue, Jun 25, 2019 at 1:41 PM Justine Olshan  wrote:

> I came up with a good solution for this and will push the commit soon. The
> repartition will be called only when a partition is not manually set.
>
> On Tue, Jun 25, 2019 at 1:39 PM Colin McCabe  wrote:
>
>> Well, this is a generic partitioner method, so it shouldn't dictate any
>> particular behavior.
>>
>> Colin
>>
>>
>> On Tue, Jun 25, 2019, at 12:04, Justine Olshan wrote:
>> > I also just noticed that if we want to use this method on the keyed
>> record
>> > case, I will need to move the method outside of the sticky (no key, no
>> set
>> > partition) check. Not a big problem, but something to keep in mind.
>> > Perhaps, we should encapsulate the sticky vs. not behavior inside the
>> > method? More things to think about.
>> >
>> > On Tue, Jun 25, 2019 at 11:55 AM Colin McCabe 
>> wrote:
>> >
>> > > Hi Justine,
>> > >
>> > > The KIP discusses adding a new method to the partitioner interface.
>> > >
>> > > > default public Integer onNewBatch(String topic, Cluster cluster) {
>> ... }
>> > >
>> > > However, this new method doesn't give the partitioner access to the
>> key
>> > > and value of the message.  While this works for the case described
>> here (no
>> > > key), in general we might need this information when re-assigning a
>> > > partition based on the batch completing.  So I think we should add
>> these
>> > > methods to onNewBatch.
>> > >
>> > > Also, it would be nice to call this something like
>> "repartitionOnNewBatch"
>> > > or something, to make it clearer what is going on.
>> > >
>> > > best,
>> > > Colin
>> > >
>> > > On Mon, Jun 24, 2019, at 18:32, Boyang Chen wrote:
>> > > > Thank you Justine for the KIP! Do you mind creating a corresponding
>> JIRA
>> > > > ticket too?
>> > > >
>> > > > On Mon, Jun 24, 2019 at 4:51 PM Colin McCabe 
>> wrote:
>> > > >
>> > > > > Hi Justine,
>> > > > >
>> > > > > Thanks for the KIP.  This looks great!
>> > > > >
>> > > > > In one place in the KIP, you write: "Remove
>> > > > > testRoundRobinWithUnavailablePartitions() and testRoundRobin()
>> since
>> > > the
>> > > > > round robin functionality of the partitioner has been removed."
>> You
>> > > can
>> > > > > skip this and similar lines.  We don't need to describe changes to
>> > > internal
>> > > > > test classes in the KIP since they're not visible to users or
>> external
>> > > > > developers.
>> > > > >
>> > > > > It seems like maybe the performance tests should get their own
>> section.
>> > > > > Right now, the way the layout is makes it look like they are part
>> of
>> > > the
>> > > > > "Compatibility, Deprecation, and Migration Plan"
>> > > > >
>> > > > > best,
>> > > > > Colin
>> > > > >
>> > > > >
>> > > > > On Mon, Jun 24, 2019, at 14:04, Justine Olshan wrote:
>> > > > > > Hello,
>> > > > > > This is the discussion thread for KIP-480: Sticky Partitioner.
>> > > > > >
>> > > > > >
>> > > > >
>> > >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
>> > > > > >
>> > > > > > Thank you,
>> > > > > > Justine Olshan
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>


Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-25 Thread Justine Olshan
I came up with a good solution for this and will push the commit soon. The
repartition will be called only when a partition is not manually set.

On Tue, Jun 25, 2019 at 1:39 PM Colin McCabe  wrote:

> Well, this is a generic partitioner method, so it shouldn't dictate any
> particular behavior.
>
> Colin
>
>
> On Tue, Jun 25, 2019, at 12:04, Justine Olshan wrote:
> > I also just noticed that if we want to use this method on the keyed
> record
> > case, I will need to move the method outside of the sticky (no key, no
> set
> > partition) check. Not a big problem, but something to keep in mind.
> > Perhaps, we should encapsulate the sticky vs. not behavior inside the
> > method? More things to think about.
> >
> > On Tue, Jun 25, 2019 at 11:55 AM Colin McCabe 
> wrote:
> >
> > > Hi Justine,
> > >
> > > The KIP discusses adding a new method to the partitioner interface.
> > >
> > > > default public Integer onNewBatch(String topic, Cluster cluster) {
> ... }
> > >
> > > However, this new method doesn't give the partitioner access to the key
> > > and value of the message.  While this works for the case described
> here (no
> > > key), in general we might need this information when re-assigning a
> > > partition based on the batch completing.  So I think we should add
> these
> > > methods to onNewBatch.
> > >
> > > Also, it would be nice to call this something like
> "repartitionOnNewBatch"
> > > or something, to make it clearer what is going on.
> > >
> > > best,
> > > Colin
> > >
> > > On Mon, Jun 24, 2019, at 18:32, Boyang Chen wrote:
> > > > Thank you Justine for the KIP! Do you mind creating a corresponding
> JIRA
> > > > ticket too?
> > > >
> > > > On Mon, Jun 24, 2019 at 4:51 PM Colin McCabe 
> wrote:
> > > >
> > > > > Hi Justine,
> > > > >
> > > > > Thanks for the KIP.  This looks great!
> > > > >
> > > > > In one place in the KIP, you write: "Remove
> > > > > testRoundRobinWithUnavailablePartitions() and testRoundRobin()
> since
> > > the
> > > > > round robin functionality of the partitioner has been removed."
> You
> > > can
> > > > > skip this and similar lines.  We don't need to describe changes to
> > > internal
> > > > > test classes in the KIP since they're not visible to users or
> external
> > > > > developers.
> > > > >
> > > > > It seems like maybe the performance tests should get their own
> section.
> > > > > Right now, the way the layout is makes it look like they are part
> of
> > > the
> > > > > "Compatibility, Deprecation, and Migration Plan"
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > >
> > > > > On Mon, Jun 24, 2019, at 14:04, Justine Olshan wrote:
> > > > > > Hello,
> > > > > > This is the discussion thread for KIP-480: Sticky Partitioner.
> > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > > > > >
> > > > > > Thank you,
> > > > > > Justine Olshan
> > > > > >
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-25 Thread Colin McCabe
Well, this is a generic partitioner method, so it shouldn't dictate any 
particular behavior.

Colin


On Tue, Jun 25, 2019, at 12:04, Justine Olshan wrote:
> I also just noticed that if we want to use this method on the keyed record
> case, I will need to move the method outside of the sticky (no key, no set
> partition) check. Not a big problem, but something to keep in mind.
> Perhaps, we should encapsulate the sticky vs. not behavior inside the
> method? More things to think about.
> 
> On Tue, Jun 25, 2019 at 11:55 AM Colin McCabe  wrote:
> 
> > Hi Justine,
> >
> > The KIP discusses adding a new method to the partitioner interface.
> >
> > > default public Integer onNewBatch(String topic, Cluster cluster) { ... }
> >
> > However, this new method doesn't give the partitioner access to the key
> > and value of the message.  While this works for the case described here (no
> > key), in general we might need this information when re-assigning a
> > partition based on the batch completing.  So I think we should add these
> > methods to onNewBatch.
> >
> > Also, it would be nice to call this something like "repartitionOnNewBatch"
> > or something, to make it clearer what is going on.
> >
> > best,
> > Colin
> >
> > On Mon, Jun 24, 2019, at 18:32, Boyang Chen wrote:
> > > Thank you Justine for the KIP! Do you mind creating a corresponding JIRA
> > > ticket too?
> > >
> > > On Mon, Jun 24, 2019 at 4:51 PM Colin McCabe  wrote:
> > >
> > > > Hi Justine,
> > > >
> > > > Thanks for the KIP.  This looks great!
> > > >
> > > > In one place in the KIP, you write: "Remove
> > > > testRoundRobinWithUnavailablePartitions() and testRoundRobin() since
> > the
> > > > round robin functionality of the partitioner has been removed."  You
> > can
> > > > skip this and similar lines.  We don't need to describe changes to
> > internal
> > > > test classes in the KIP since they're not visible to users or external
> > > > developers.
> > > >
> > > > It seems like maybe the performance tests should get their own section.
> > > > Right now, the way the layout is makes it look like they are part of
> > the
> > > > "Compatibility, Deprecation, and Migration Plan"
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Mon, Jun 24, 2019, at 14:04, Justine Olshan wrote:
> > > > > Hello,
> > > > > This is the discussion thread for KIP-480: Sticky Partitioner.
> > > > >
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > > > >
> > > > > Thank you,
> > > > > Justine Olshan
> > > > >
> > > >
> > >
> >
>


[jira] [Created] (KAFKA-8602) StreamThread Dies Because Restore Consumer is not Subscribed to Any Topic

2019-06-25 Thread Bruno Cadonna (JIRA)
Bruno Cadonna created KAFKA-8602:


 Summary: StreamThread Dies Because Restore Consumer is not 
Subscribed to Any Topic
 Key: KAFKA-8602
 URL: https://issues.apache.org/jira/browse/KAFKA-8602
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.1.1
Reporter: Bruno Cadonna


StreamThread dies with the following exception:
{code:java}
java.lang.IllegalStateException: Consumer is not subscribed to any topics or 
assigned any partitions
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1206)
at 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1199)
at 
org.apache.kafka.streams.processor.internals.StreamThread.maybeUpdateStandbyTasks(StreamThread.java:1126)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:923)
at 
org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:805)
at 
org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:774)
{code}
The reason is that the restore consumer is not subscribed to any topic. This 
happens when a StreamThread gets assigned standby tasks for sub-topologies
that contain only state stores with logging disabled.

To reproduce the bug, start two applications with one StreamThread and one
standby replica each and the following topology. The input topic should have 
two partitions:
{code:java}
final StreamsBuilder builder = new StreamsBuilder();
final String stateStoreName = "myTransformState";
final StoreBuilder<KeyValueStore<Integer, Integer>> keyValueStoreBuilder =
    Stores.keyValueStoreBuilder(Stores.persistentKeyValueStore(stateStoreName),
            Serdes.Integer(),
            Serdes.Integer())
        .withLoggingDisabled();
builder.addStateStore(keyValueStoreBuilder);
builder.stream(INPUT_TOPIC, Consumed.with(Serdes.Integer(), Serdes.Integer()))
    .transform(() -> new Transformer<Integer, Integer, KeyValue<Integer, Integer>>() {
        private KeyValueStore<Integer, Integer> state;

        @SuppressWarnings("unchecked")
        @Override
        public void init(final ProcessorContext context) {
            state = (KeyValueStore<Integer, Integer>)
                context.getStateStore(stateStoreName);
        }

        @Override
        public KeyValue<Integer, Integer> transform(final Integer key,
                                                    final Integer value) {
            final KeyValue<Integer, Integer> result = new KeyValue<>(key, value);
            return result;
        }

        @Override
        public void close() {}
    }, stateStoreName)
    .to(OUTPUT_TOPIC);
{code}
Both StreamThreads should die with the above exception.

The root cause is that standby tasks are created although all state stores of 
the sub-topology have logging disabled.





Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-25 Thread Justine Olshan
I also just noticed that if we want to use this method on the keyed record
case, I will need to move the method outside of the sticky (no key, no set
partition) check. Not a big problem, but something to keep in mind.
Perhaps, we should encapsulate the sticky vs. not behavior inside the
method? More things to think about.

On Tue, Jun 25, 2019 at 11:55 AM Colin McCabe  wrote:

> Hi Justine,
>
> The KIP discusses adding a new method to the partitioner interface.
>
> > default public Integer onNewBatch(String topic, Cluster cluster) { ... }
>
> However, this new method doesn't give the partitioner access to the key
> and value of the message.  While this works for the case described here (no
> key), in general we might need this information when re-assigning a
> partition based on the batch completing.  So I think we should add these
> methods to onNewBatch.
>
> Also, it would be nice to call this something like "repartitionOnNewBatch"
> or something, to make it clearer what is going on.
>
> best,
> Colin
>
> On Mon, Jun 24, 2019, at 18:32, Boyang Chen wrote:
> > Thank you Justine for the KIP! Do you mind creating a corresponding JIRA
> > ticket too?
> >
> > On Mon, Jun 24, 2019 at 4:51 PM Colin McCabe  wrote:
> >
> > > Hi Justine,
> > >
> > > Thanks for the KIP.  This looks great!
> > >
> > > In one place in the KIP, you write: "Remove
> > > testRoundRobinWithUnavailablePartitions() and testRoundRobin() since
> the
> > > round robin functionality of the partitioner has been removed."  You
> can
> > > skip this and similar lines.  We don't need to describe changes to
> internal
> > > test classes in the KIP since they're not visible to users or external
> > > developers.
> > >
> > > It seems like maybe the performance tests should get their own section.
> > > Right now, the way the layout is makes it look like they are part of
> the
> > > "Compatibility, Deprecation, and Migration Plan"
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Mon, Jun 24, 2019, at 14:04, Justine Olshan wrote:
> > > > Hello,
> > > > This is the discussion thread for KIP-480: Sticky Partitioner.
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > > >
> > > > Thank you,
> > > > Justine Olshan
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-25 Thread Colin McCabe
Thanks Justine!  You might want to update the "JIRA" link on the KIP so that it 
links to this.

cheers,
Colin

On Tue, Jun 25, 2019, at 11:59, Justine Olshan wrote:
> Thank you for looking at my KIP!
> 
> I will get to work on these changes.
> 
> In addition, here is the JIRA ticket:
> https://issues.apache.org/jira/browse/KAFKA-8601
> 
> Thanks again,
> Justine
> 
> On Tue, Jun 25, 2019 at 11:55 AM Colin McCabe  wrote:
> 
> > Hi Justine,
> >
> > The KIP discusses adding a new method to the partitioner interface.
> >
> > > default public Integer onNewBatch(String topic, Cluster cluster) { ... }
> >
> > However, this new method doesn't give the partitioner access to the key
> > and value of the message.  While this works for the case described here (no
> > key), in general we might need this information when re-assigning a
> > partition based on the batch completing.  So I think we should add these
> > methods to onNewBatch.
> >
> > Also, it would be nice to call this something like "repartitionOnNewBatch"
> > or something, to make it clearer what is going on.
> >
> > best,
> > Colin
> >
> > On Mon, Jun 24, 2019, at 18:32, Boyang Chen wrote:
> > > Thank you Justine for the KIP! Do you mind creating a corresponding JIRA
> > > ticket too?
> > >
> > > On Mon, Jun 24, 2019 at 4:51 PM Colin McCabe  wrote:
> > >
> > > > Hi Justine,
> > > >
> > > > Thanks for the KIP.  This looks great!
> > > >
> > > > In one place in the KIP, you write: "Remove
> > > > testRoundRobinWithUnavailablePartitions() and testRoundRobin() since
> > the
> > > > round robin functionality of the partitioner has been removed."  You
> > can
> > > > skip this and similar lines.  We don't need to describe changes to
> > internal
> > > > test classes in the KIP since they're not visible to users or external
> > > > developers.
> > > >
> > > > It seems like maybe the performance tests should get their own section.
> > > > Right now, the way the layout is makes it look like they are part of
> > the
> > > > "Compatibility, Deprecation, and Migration Plan"
> > > >
> > > > best,
> > > > Colin
> > > >
> > > >
> > > > On Mon, Jun 24, 2019, at 14:04, Justine Olshan wrote:
> > > > > Hello,
> > > > > This is the discussion thread for KIP-480: Sticky Partitioner.
> > > > >
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > > > >
> > > > > Thank you,
> > > > > Justine Olshan
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-435: Incremental Partition Reassignment

2019-06-25 Thread Colin McCabe
Hi Viktor,

Now that the 2.3 release is over, we're going to be turning our attention back 
to working on KIP-455, which provides an API for partition reassignment, and 
also solves the incremental reassignment problem.  Sorry about the pause, but I 
had to focus on the stuff that was going into 2.3.

I think last time we talked about this, the consensus was that KIP-455 
supersedes KIP-435, since KIP-455 supports incremental reassignment.  We also 
don't want to add more technical debt in the form of a new ZooKeeper-based API 
that we'll have to support for a while.  So let's focus on KIP-455 here.  We 
have more resources now so I think we'll be able to get it done soonish.

best,
Colin


On Tue, Jun 25, 2019, at 08:09, Viktor Somogyi-Vass wrote:
> Hi All,
> 
> I have added another improvement to this, which is to limit the parallel
> leader movements. I think I'll soon (maybe late this week or early next)
> start a vote on this too if there are no additional feedback.
> 
> Thanks,
> Viktor
> 
> On Mon, Apr 29, 2019 at 1:26 PM Viktor Somogyi-Vass 
> wrote:
> 
> > Hi Folks,
> >
> > I've updated the KIP with the batching which would work on both replica
> > and partition level. To explain it briefly: for instance if the replica
> > level is set to 2 and partition level is set to 3, then 2x3=6 replica
> > reassignment would be in progress at the same time. In case of reassignment
> > for a single partition from (0, 1, 2, 3, 4) to (5, 6, 7, 8, 9) we would
> > form the batches (0, 1) → (5, 6); (2, 3) → (7, 8) and 4 → 9 and would
> > execute the reassignment in this order.
> >
> > Let me know what you think.
> >
> > Best,
> > Viktor
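
To make the batching scheme above concrete, here is a minimal sketch (not
KIP-435's actual implementation) of how replica-level batches could be
formed for a single partition moving from (0, 1, 2, 3, 4) to (5, 6, 7, 8, 9)
with the proposed reassignment.parallel.replica.count set to 2:

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ReassignmentBatcher {
    // Splits one partition's reassignment into replica-level batches, e.g.
    // (0, 1) -> (5, 6); (2, 3) -> (7, 8); (4) -> (9) for a batch size of 2.
    public static List<int[][]> batches(int[] current, int[] target,
                                        int replicasPerBatch) {
        if (current.length != target.length) {
            throw new IllegalArgumentException("Replica lists must have equal length");
        }
        List<int[][]> result = new ArrayList<>();
        for (int i = 0; i < current.length; i += replicasPerBatch) {
            int end = Math.min(i + replicasPerBatch, current.length);
            result.add(new int[][] {
                Arrays.copyOfRange(current, i, end),  // replicas to drop
                Arrays.copyOfRange(target, i, end)    // replicas to add
            });
        }
        return result;
    }
}

Each batch would be executed to completion before the next one starts, which
is what makes the overall order predictable.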
> >
> > On Mon, Apr 15, 2019 at 7:01 PM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com> wrote:
> >
> >> A follow up on the batching topic to clarify my points above.
> >>
> >> Generally I think that batching should be a core feature since, as Colin
> >> said, the controller possesses all the related information.
> >> Also Cruise Control (or really any 3rd party admin system) might build
> >> upon this to give a more holistic approach to balancing brokers. We may
> >> cater to them with APIs that act like building blocks to make their life
> >> easier, such as incrementalization, batching, cancellation and rollback,
> >> but I think the more advanced we go, the more advanced a control surface
> >> we'll need, and Kafka's basic tooling might not be suitable for that.
> >>
> >> Best,
> >> Viktor
> >>
> >>
> >> On Mon, 15 Apr 2019, 18:22 Viktor Somogyi-Vass, 
> >> wrote:
> >>
> >>> Hey Guys,
> >>>
> >>> I'll reply to you all in this email:
> >>>
> >>> @Jun:
> >>> 1. yes, it'd be a good idea to add this feature, I'll write this into
> >>> the KIP. I was actually thinking about introducing dynamic configs called
> >>> reassignment.parallel.partition.count and
> >>> reassignment.parallel.replica.count. The first property would control how
> >>> many partition reassignments we can do concurrently. The second would go
> >>> one
> >>> level deeper in granularity and would control how many replicas we want to
> >>> move
> >>> for a given partition. Also one more thing that'd be useful to fix is that
> >>> a given list of partition -> replica list would be executed in the same
> >>> order (from first to last) so it's overall predictable, and the user would
> >>> have some control over the order in which reassignments are executed, since
> >>> the JSON is still assembled by the user.
> >>> 2. the /kafka/brokers/topics/{topic} znode to be specific. I'll update
> >>> the KIP to contain this.
> >>>
> >>> @Jason:
> >>> I think building this functionality into Kafka would definitely benefit
> >>> all users, CC included, as it'd simplify their software as you
> >>> said. As I understand it, the main advantage of CC and other similar tools
> >>> is to give high-level features for automatic load balancing. Reliability,
> >>> stability and predictability of the reassignment should be a core feature
> >>> of Kafka. I think the incrementalization feature would make it more 
> >>> stable.
> >>> I would consider cancellation too as a core feature and we can leave the
> >>> gate open for external tools to feed in their reassignment json as they
> >>> want. I was also thinking about what are the set of features we can 
> >>> provide
> >>> for Kafka but I think the more advanced we go the more need there is for 
> >>> an
> >>> administrative UI component.
> >>> Regarding KIP-352: Thanks for pointing this out, I didn't see this
> >>> although lately I was also thinking about the throttling aspect of it.
> >>> Would be a nice add-on to Kafka: though the above configs provide some
> >>> level of control, it'd be nice to put an upper cap on the bandwidth and
> >>> make it monitorable.
> >>>
> >>> Viktor
> >>>
> >>> On Wed, Apr 10, 2019 at 2:57 AM Jason Gustafson 
> >>> wrote:
> >>>
>  Hi Colin,
> 
>  On a related note, what do you think about the idea of storing the
>  > reassigning replicas in

Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-25 Thread Justine Olshan
Thank you for looking at my KIP!

I will get to work on these changes.

In addition, here is the JIRA ticket:
https://issues.apache.org/jira/browse/KAFKA-8601

Thanks again,
Justine

On Tue, Jun 25, 2019 at 11:55 AM Colin McCabe  wrote:

> Hi Justine,
>
> The KIP discusses adding a new method to the partitioner interface.
>
> > default public Integer onNewBatch(String topic, Cluster cluster) { ... }
>
> However, this new method doesn't give the partitioner access to the key
> and value of the message.  While this works for the case described here (no
> key), in general we might need this information when re-assigning a
> partitition based on the batch completing.  So I think we should add these
> methods to onNewBatch.
>
> Also, it would be nice to call this something like "repartitionOnNewBatch"
> or something, to make it clearer what is going on.
>
> best,
> Colin
>
> On Mon, Jun 24, 2019, at 18:32, Boyang Chen wrote:
> > Thank you Justine for the KIP! Do you mind creating a corresponding JIRA
> > ticket too?
> >
> > On Mon, Jun 24, 2019 at 4:51 PM Colin McCabe  wrote:
> >
> > > Hi Justine,
> > >
> > > Thanks for the KIP.  This looks great!
> > >
> > > In one place in the KIP, you write: "Remove
> > > testRoundRobinWithUnavailablePartitions() and testRoundRobin() since
> the
> > > round robin functionality of the partitioner has been removed."  You
> can
> > > skip this and similar lines.  We don't need to describe changes to
> internal
> > > test classes in the KIP since they're not visible to users or external
> > > developers.
> > >
> > > It seems like maybe the performance tests should get their own section.
> > > Right now, the way the layout is makes it look like they are part of
> the
> > > "Compatibility, Deprecation, and Migration Plan"
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Mon, Jun 24, 2019, at 14:04, Justine Olshan wrote:
> > > > Hello,
> > > > This is the discussion thread for KIP-480: Sticky Partitioner.
> > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > > >
> > > > Thank you,
> > > > Justine Olshan
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-480 : Sticky Partitioner

2019-06-25 Thread Colin McCabe
Hi Justine,

The KIP discusses adding a new method to the partitioner interface.

> default public Integer onNewBatch(String topic, Cluster cluster) { ... }

However, this new method doesn't give the partitioner access to the key and 
value of the message.  While this works for the case described here (no key), 
in general we might need this information when re-assigning a partition based
on the batch completing.  So I think we should add these methods to onNewBatch.

Also, it would be nice to call this something like "repartitionOnNewBatch" or 
something, to make it clearer what is going on.
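
As a rough sketch (the final KIP signature may well differ), the extended
hook could look something like the following, where the key/value parameters
and the repartitionOnNewBatch name are the suggestions above:

import java.io.Closeable;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.Configurable;

public interface Partitioner extends Configurable, Closeable {
    // Existing behavior: choose a partition for the given record.
    int partition(String topic, Object key, byte[] keyBytes,
                  Object value, byte[] valueBytes, Cluster cluster);

    // Proposed hook, called when the current batch completes.  Returning a
    // partition lets a sticky implementation move to a new one; returning
    // null could mean "keep the current choice".
    default Integer repartitionOnNewBatch(String topic, Object key,
                                          Object value, Cluster cluster) {
        return null;
    }
}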

best,
Colin

On Mon, Jun 24, 2019, at 18:32, Boyang Chen wrote:
> Thank you Justine for the KIP! Do you mind creating a corresponding JIRA
> ticket too?
> 
> On Mon, Jun 24, 2019 at 4:51 PM Colin McCabe  wrote:
> 
> > Hi Justine,
> >
> > Thanks for the KIP.  This looks great!
> >
> > In one place in the KIP, you write: "Remove
> > testRoundRobinWithUnavailablePartitions() and testRoundRobin() since the
> > round robin functionality of the partitioner has been removed."  You can
> > skip this and similar lines.  We don't need to describe changes to internal
> > test classes in the KIP since they're not visible to users or external
> > developers.
> >
> > It seems like maybe the performance tests should get their own section.
> > Right now, the way the layout is makes it look like they are part of the
> > "Compatibility, Deprecation, and Migration Plan"
> >
> > best,
> > Colin
> >
> >
> > On Mon, Jun 24, 2019, at 14:04, Justine Olshan wrote:
> > > Hello,
> > > This is the discussion thread for KIP-480: Sticky Partitioner.
> > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-480%3A+Sticky+Partitioner
> > >
> > > Thank you,
> > > Justine Olshan
> > >
> >
>


Re: [RESULT] [VOTE] 2.3.0 RC3

2019-06-25 Thread Adam Bellemare
Thanks for the hard work Colin, and everyone else who helped get this out!

On Mon, Jun 24, 2019 at 1:38 PM Colin McCabe  wrote:

> Hi all,
>
> This vote passes with 6 +1 votes (3 of which are binding) and no 0 or -1
> votes.  Thanks to everyone who voted.
>
> +1 votes
> PMC Members:
> * Ismael Juma
> * Guozhang Wang
> * Gwen Shapira
>
> Community:
> * Kamal Chandraprakash
> * Jakub Scholz
> * Mickael Maison
>
> 0 votes
> * No votes
>
> -1 votes
> * No votes
>
> Vote thread:
> https://www.mail-archive.com/dev@kafka.apache.org/msg98814.html
>
> I'll continue with the release process and the release announcement will
> follow in the next few days.
>
> thanks,
> Colin
>
>
> On Mon, Jun 24, 2019, at 01:17, Mickael Maison wrote:
> > Thanks Colin for making a new RC for KAFKA-8564.
> > +1 (non binding)
> > I checked signatures and ran quickstart on the 2.12 binary
> >
> > On Mon, Jun 24, 2019 at 6:03 AM Gwen Shapira  wrote:
> > >
> > > +1 (binding)
> > > Verified signatures, verified good build on jenkins, built from
> > > sources anyway and ran quickstart on the 2.11 binary.
> > >
> > > Looks good!
> > >
> > > On Sun, Jun 23, 2019 at 3:06 PM Jakub Scholz  wrote:
> > > >
> > > > +1 (non-binding). I used the binaries and run some of my tests
> against them.
> > > >
> > > > On Thu, Jun 20, 2019 at 12:03 AM Colin McCabe 
> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > We discovered some problems with the second release candidate
> (RC2) of
> > > > > 2.3.0.  Specifically, KAFKA-8564.  I've created a new RC which
> includes the
> > > > > fix for this issue.
> > > > >
> > > > > Check out the release notes for the 2.3.0 release here:
> > > > >
> https://home.apache.org/~cmccabe/kafka-2.3.0-rc3/RELEASE_NOTES.html
> > > > >
> > > > > The vote will go until Saturday, June 22nd, or until we create
> another RC.
> > > > >
> > > > > * Kafka's KEYS file containing PGP keys we use to sign the release
> can be
> > > > > found here:
> > > > > https://kafka.apache.org/KEYS
> > > > >
> > > > > * The release artifacts to be voted upon (source and binary) are
> here:
> > > > > https://home.apache.org/~cmccabe/kafka-2.3.0-rc3/
> > > > >
> > > > > * Maven artifacts to be voted upon:
> > > > >
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
> > > > >
> > > > > * Javadoc:
> > > > > https://home.apache.org/~cmccabe/kafka-2.3.0-rc3/javadoc/
> > > > >
> > > > > * The tag to be voted upon (off the 2.3 branch) is the 2.3.0 tag:
> > > > > https://github.com/apache/kafka/releases/tag/2.3.0-rc3
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > > C.
> > > > >
> > >
> > >
> > >
> > > --
> > > Gwen Shapira
> > > Product Manager | Confluent
> > > 650.450.2760 | @gwenshap
> > > Follow us: Twitter | blog
> >
>


Re: [ANNOUNCE] Apache Kafka 2.3.0

2019-06-25 Thread Colin McCabe
Thanks to everyone who reviewed the Apache blog post about 2.3.  It's live now 
at https://blogs.apache.org/kafka/date/20190624

Plus, Tim Berglund made a video about what's new in this release.  
https://www.youtube.com/watch?v=sNqwJT2WguQ

Finally, check out Stéphane Maarek's video about 2.3 here: 
https://www.youtube.com/watch?v=YutjYKSGd64

cheers,
Colin


On Tue, Jun 25, 2019, at 09:40, Colin McCabe wrote:
> The Apache Kafka community is pleased to announce the release for 
> Apache Kafka 2.3.0.
> This release includes several new features, including:
> 
> - There have been several improvements to the Kafka Connect REST API.
> - Kafka Connect now supports incremental cooperative rebalancing. 
> - Kafka Streams now supports an in-memory session store and window 
> store.
> - The AdminClient now allows users to determine what operations they 
> are authorized to perform on topics.
> - There is a new broker start time metric.
> - JMXTool can now connect to secured RMI ports.
> - An incremental AlterConfigs API has been added.  The old AlterConfigs 
> API has been deprecated.
> - We now track partitions which are under their min ISR count.
> - Consumers can now opt-out of automatic topic creation, even when it 
> is enabled on the broker.
> - Kafka components can now use external configuration stores (KIP-421)
> - We have implemented improved replica fetcher behavior when errors are 
> encountered
> 
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/2.3.0/RELEASE_NOTES.html
> 
> 
> You can download the source and binary release (Scala 2.11 and 2.12) from:
> https://kafka.apache.org/downloads#2.3.0
> 
> 
> ---
> 
> Apache Kafka is a distributed streaming platform with four core APIs:
> 
> ** The Producer API allows an application to publish a stream of records 
> to one or more Kafka topics.
> 
> ** The Consumer API allows an application to subscribe to one or more 
> topics and process the stream of records produced to them.
> 
> ** The Streams API allows an application to act as a stream processor, 
> consuming an input stream from one or more topics and producing an 
> output stream to one or more output topics, effectively transforming 
> the input streams to output streams.
> 
> ** The Connector API allows building and running reusable producers or 
> consumers that connect Kafka topics to existing applications or data 
> systems. For example, a connector to a relational database might 
> capture every change to a table.
> 
> 
> With these APIs, Kafka can be used for two broad classes of application:
> ** Building real-time streaming data pipelines that reliably get data 
> between systems or applications.
> ** Building real-time streaming applications that transform or react to 
> the streams of data.
> 
> Apache Kafka is in use at large and small companies worldwide, 
> including Capital One, Goldman Sachs, ING, LinkedIn, Netflix, 
> Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and 
> Zalando, among others.
> 
> A big thank you for the following 101 contributors to this release!
> 
> Aishwarya Gune, Alex Diachenko, Alex Dunayevsky, Anna Povzner, Arabelle 
> Hou, Arjun Satish, A. Sophie Blee-Goldman, asutosh936, Bill Bejeck, Bob 
> Barrett, Boyang Chen, Brian Bushree, cadonna, Casey Green, Chase 
> Walden, Chia-Ping Tsai, Chris Egerton, Chris Steingen, Colin Hicks, 
> Colin Patrick McCabe, commandini, cwildman, Cyrus Vafadari, Dan 
> Norwood, David Arthur, Dejan Stojadinović, Dhruvil Shah, Doroszlai, 
> Attila, Ewen Cheslack-Postava, Fangbin Sun, Filipe Agapito, Florian 
> Hussonnois, Gardner Vickers, Guozhang Wang, Gwen Shapira, Hai-Dang Dam, 
> highluck, huxi, huxihx, Ismael Juma, Ivan Yurchenko, Jarrod Urban, 
> Jason Gustafson, John Roesler, José Armando García Sancio, Joyce Fee, 
> Jun Rao, KartikVK, Kengo Seki, Kevin Lu, khairy, Konstantine 
> Karantasis, Kristian Aurlien, Kyle Ambroff-Kao, lambdaliu, Lee Dongjin, 
> Lifei Chen, Lucas Bradstreet, Lysss, lzh3636, Magesh Nandakumar, 
> Manikumar Reddy, Mark Cho, Massimo Siani, Matthias J. Sax, Michael 
> Gruben Trejo, Mickael Maison, Murad, Nicholas Parker, opera443399, Paul 
> Davidson, pierDipi, pkleindl, Radai Rosenblatt, Rajini Sivaram, Randall 
> Hauch, Rohan, Rohan Desai, Ron Dagostino, Ryan Chen, saisandeep, 
> sandmannn, sdreynolds, Sebastián Ortega, Shaobo Liu, Sönke Liebau, 
> Stanislav Kozlovski, Suman BN, tadsul, Tcsalist, Ted Yu, Vahid 
> Hashemian, Victoria Bialas, Viktor Somogyi, Viktor Somogyi-Vass, Vito 
> Jeng, wenhoujx, Xiongqi Wu, Yaroslav Klymko, Zhanxiang (Patrick) Huang
> 
> We welcome your help and 

Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-06-25 Thread Jason Gustafson
>
> I think co-locating does have some merits here, i.e. letting the
> ConsumerCoordinator which has the source-of-truth of assignment to act as
> the TxnCoordinator as well; but I agree there's also some cons of coupling
> them together. I'm still a bit inclining towards colocation but if there
> are good rationales not to do so I can be convinced as well.


The good rationale is that we have no mechanism to colocate partitions ;).
Are you suggesting we store the group and transaction state in the same
log? Can you be more concrete about the benefit?

-Jason

On Tue, Jun 25, 2019 at 10:51 AM Guozhang Wang  wrote:

> Hi Boyang,
>
> 1. One advantage of retry over on-hold is that it will not tie up a
> handler thread (of course the latter could do the same but that involves
> using a purgatory which is more complicated), and also it is less likely to
> violate request timeout. So I think there are some rationales to prefer
> retries.
>
> 2. Regarding "ConsumerRebalanceListener": both ConsumerRebalanceListener
> and PartitionAssignors are user-customizable modules, and the only difference
> is that the former is specified via code and the latter is specified via
> config.
>
> Regarding Jason's proposal of ConsumerAssignment, one thing to note though
> with KIP-429 the onPartitionAssigned may not be called if the assignment
> does not change, whereas onAssignment would always be called at the end of
> sync-group response. My proposed semantics is that
> `RebalanceListener#onPartitionsXXX` are used for notifications to user, and
> hence if there's no changes these will not be called, whereas
> `PartitionAssignor` is used for assignor logic, whose callback would always
> be called no matter if the partitions have changed or not.
>
> 3. I feel it is a bit awkward to let the TxnCoordinator keeping partition
> assignments since it is sort of taking over the job of the
> ConsumerCoordinator, and may likely cause a split-brain problem as two
> coordinators keep a copy of this assignment which may be different.
>
> I think co-locating does have some merits here, i.e. letting the
> ConsumerCoordinator which has the source-of-truth of assignment to act as
> the TxnCoordinator as well; but I agree there's also some cons of coupling
> them together. I'm still a bit inclining towards colocation but if there
> are good rationales not to do so I can be convinced as well.
>
> 4. I guess I'm preferring the philosophy of "only add configs if there's no
> other ways", since more and more configs would make it less and less
> intuitive out of the box to use.
>
> I think it's a valid point that a check upon starting up does not cope with
> brokers downgrading, but even with a config it is still hard for users
> to determine when they can be sure the broker would never downgrade
> anymore and hence can safely switch the config. So my feeling is that this
> config would still not help much. If we want to be on the safer
> side, then I'd suggest we modify the Coordinator -> NetworkClient hierarchy
> to allow the NetworkClient being able to pass the APIVersion metadata to
> Coordinator, so that Coordinator can rely on that logic to change its
> behavior dynamically.
>
> 5. I do not have a concrete idea about how the impact on Connect would
> make, maybe Randall or Konstantine can help here?
>
>
> Guozhang
>
> On Mon, Jun 24, 2019 at 10:26 PM Boyang Chen 
> wrote:
>
> > Hey Jason,
> >
> > thank you for the proposal here. Some of my thoughts below.
> >
> > On Mon, Jun 24, 2019 at 8:58 PM Jason Gustafson 
> > wrote:
> >
> > > Hi Boyang,
> > >
> > > Thanks for picking this up! Still reading through the updates, but here
> > are
> > > a couple initial comments on the APIs:
> > >
> > > 1. The `TxnProducerIdentity` class is a bit awkward. I think we are
> > trying
> > > to encapsulate state from the current group assignment. Maybe something
> > > like `ConsumerAssignment` would be clearer? If we make the usage
> > consistent
> > > across the consumer and producer, then we can avoid exposing internal
> > state
> > > like the generationId.
> > >
> > > For example:
> > >
> > > // Public API
> > > interface ConsumerAssignment {
> > >   Set<TopicPartition> partitions();
> > > }
> > >
> > > // Not a public API
> > > class InternalConsumerAssignment implements ConsumerAssignment {
> > >   Set<TopicPartition> partitions;
> > >   int generationId;
> > > }
> > >
> > > Then we can change the rebalance listener to something like this:
> > > onPartitionsAssigned(ConsumerAssignment assignment)
> > >
> > > And on the producer:
> > > void initTransactions(String groupId, ConsumerAssignment assignment);
> > >
> > > 2. Another bit of awkwardness is the fact that we have to pass the
> > groupId
> > > through both initTransactions() and sendOffsetsToTransaction(). We
> could
> > > consider a config instead. Maybe something like `
> transactional.group.id
> > `?
> > > Then we could simplify the producer APIs, potentially even deprecating
> > the
> > > current 

Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-06-25 Thread Guozhang Wang
Hi Boyang,

1. One advantage of retry over on-hold is that it will not tie up a
handler thread (of course the latter could do the same but that involves
using a purgatory which is more complicated), and also it is less likely to
violate request timeout. So I think there are some rationales to prefer
retries.

2. Regarding "ConsumerRebalanceListener": both ConsumerRebalanceListener
and PartitionAssignors are user-customizable modules, and the only difference
is that the former is specified via code and the latter is specified via
config.

Regarding Jason's proposal of ConsumerAssignment, one thing to note though
with KIP-429 the onPartitionAssigned may not be called if the assignment
does not change, whereas onAssignment would always be called at the end of
sync-group response. My proposed semantics is that
`RebalanceListener#onPartitionsXXX` are used for notifications to user, and
hence if there's no changes these will not be called, whereas
`PartitionAssignor` is used for assignor logic, whose callback would always
be called no matter if the partitions have changed or not.
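
To illustrate the distinction with the existing public API, here is a
minimal sketch of a listener used purely for user notification (the KIP-429
skipping behavior described above is the assumption):

import java.util.Collection;
import org.apache.kafka.clients.consumer.ConsumerRebalanceListener;
import org.apache.kafka.common.TopicPartition;

public class LoggingRebalanceListener implements ConsumerRebalanceListener {
    // Under KIP-429 these callbacks may be skipped when the assignment is
    // unchanged, whereas an assignor's onAssignment would still run after
    // every sync-group response.
    @Override
    public void onPartitionsRevoked(Collection<TopicPartition> partitions) {
        System.out.println("Revoked: " + partitions);
    }

    @Override
    public void onPartitionsAssigned(Collection<TopicPartition> partitions) {
        System.out.println("Assigned: " + partitions);
    }
}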

3. I feel it is a bit awkward to let the TxnCoordinator keeping partition
assignments since it is sort of taking over the job of the
ConsumerCoordinator, and may likely cause a split-brain problem as two
coordinators keep a copy of this assignment which may be different.

I think co-locating does have some merits here, i.e. letting the
ConsumerCoordinator which has the source-of-truth of assignment to act as
the TxnCoordinator as well; but I agree there's also some cons of coupling
them together. I'm still a bit inclining towards colocation but if there
are good rationales not to do so I can be convinced as well.

4. I guess I'm preferring the philosophy of "only add configs if there's no
other ways", since more and more configs would make it less and less
intuitive out of the box to use.

I think it's a valid point that a check upon starting up does not cope with
brokers downgrading, but even with a config it is still hard for users
to determine when they can be sure the broker would never downgrade
anymore and hence can safely switch the config. So my feeling is that this
config would still not help much. If we want to be on the safer
side, then I'd suggest we modify the Coordinator -> NetworkClient hierarchy
to allow the NetworkClient being able to pass the APIVersion metadata to
Coordinator, so that Coordinator can rely on that logic to change its
behavior dynamically.

5. I do not have a concrete idea about how the impact on Connect would
make, maybe Randall or Konstantine can help here?


Guozhang

On Mon, Jun 24, 2019 at 10:26 PM Boyang Chen 
wrote:

> Hey Jason,
>
> thank you for the proposal here. Some of my thoughts below.
>
> On Mon, Jun 24, 2019 at 8:58 PM Jason Gustafson 
> wrote:
>
> > Hi Boyang,
> >
> > Thanks for picking this up! Still reading through the updates, but here
> are
> > a couple initial comments on the APIs:
> >
> > 1. The `TxnProducerIdentity` class is a bit awkward. I think we are
> trying
> > to encapsulate state from the current group assignment. Maybe something
> > like `ConsumerAssignment` would be clearer? If we make the usage
> consistent
> > across the consumer and producer, then we can avoid exposing internal
> state
> > like the generationId.
> >
> > For example:
> >
> > // Public API
> > interface ConsumerAssignment {
> >   Set<TopicPartition> partitions();
> > }
> >
> > // Not a public API
> > class InternalConsumerAssignment implements ConsumerAssignment {
> >   Set<TopicPartition> partitions;
> >   int generationId;
> > }
> >
> > Then we can change the rebalance listener to something like this:
> > onPartitionsAssigned(ConsumerAssignment assignment)
> >
> > And on the producer:
> > void initTransactions(String groupId, ConsumerAssignment assignment);
> >
> > 2. Another bit of awkwardness is the fact that we have to pass the
> groupId
> > through both initTransactions() and sendOffsetsToTransaction(). We could
> > consider a config instead. Maybe something like `transactional.group.id
> `?
> > Then we could simplify the producer APIs, potentially even deprecating
> the
> > current sendOffsetsToTransaction. In fact, for this new usage, the `
> > transational.id` config is not needed. It would be nice if we don't have
> > to
> > provide it.
> >
> > I like the idea of combining 1 and 2. We could definitely pass in a
> group.id config
> so that we could avoid exposing that information in a public API. The
> question I have
> is whether we should name the interface `GroupAssignment` instead, so
> that Connect
> could also extend the same interface later, just to echo Guozhang's point
> here. Also, the base interface
> would be better defined as empty for easy extension, or as an abstract
> type called `Resource` that can be shared
> later, IMHO.
>
>
> > By the way, I'm a bit confused about discussion above about colocating
> the
> > txn and group coordinators. That is not actually necessary, is it?
> >
> > 

Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-06-25 Thread Jason Gustafson
>
> The question I have is whether we should name the interface
> `GroupAssignment` instead, so
> that Connect could also extend the same interface later, just to echo
> Guozhang's point here,


Are you referring to the API used for initTransactions? There would be no
reason to use a more generic interface in ConsumerRebalanceListener since
that is already tied to the consumer. Possibly you can have GroupAssignment >
ConsumerAssignment to try and leave the door open for a ConnectAssignment
implementation in the future. Then we could have the following APIs:

// ConsumerRebalanceListener
onPartitionsAssigned(ConsumerAssignment assignment)
// Producer
void initTransactions(String groupId, GroupAssignment assignment);

The mechanism Connect uses for offsets is quite different. They are stored
in a separate topic, so we don't have the convenience of the group
coordinator to gate access. This begs another question. The current
proposal still has my initial suggestion to include the partition
assignment in the InitProducerId API. Do we still need this since we are
doing fencing in the group coordinator? To handle Connect, we would need to
generalize the notion of an assignment in the transaction coordinator.
Alternatively, we can rely on group coordinator and leave the assignment
out of transaction coordinator for now. This could always be revisited in
the future.
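
For illustration only, a consume-transform-produce loop under this proposal
might look roughly like the sketch below; initTransactions(String,
GroupAssignment) is the proposal above, not a shipped API, and groupId,
currentAssignment, outputTopic, transform() and offsetsFrom() are
placeholder names:

// Fence stale producers using the group metadata (proposed API).
producer.initTransactions(groupId, currentAssignment);
while (true) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    producer.beginTransaction();
    for (ConsumerRecord<String, String> record : records) {
        producer.send(new ProducerRecord<>(outputTopic, record.key(),
                transform(record.value())));
    }
    // Commit the consumed offsets atomically with the produced records.
    producer.sendOffsetsToTransaction(offsetsFrom(records), groupId);
    producer.commitTransaction();
}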

-Jason


On Mon, Jun 24, 2019 at 10:26 PM Boyang Chen 
wrote:

> Hey Jason,
>
> thank you for the proposal here. Some of my thoughts below.
>
> On Mon, Jun 24, 2019 at 8:58 PM Jason Gustafson 
> wrote:
>
> > Hi Boyang,
> >
> > Thanks for picking this up! Still reading through the updates, but here
> are
> > a couple initial comments on the APIs:
> >
> > 1. The `TxnProducerIdentity` class is a bit awkward. I think we are
> trying
> > to encapsulate state from the current group assignment. Maybe something
> > like `ConsumerAssignment` would be clearer? If we make the usage
> consistent
> > across the consumer and producer, then we can avoid exposing internal
> state
> > like the generationId.
> >
> > For example:
> >
> > // Public API
> > interface ConsumerAssignment {
> >   Set<TopicPartition> partitions();
> > }
> >
> > // Not a public API
> > class InternalConsumerAssignment implements ConsumerAssignment {
> >   Set<TopicPartition> partitions;
> >   int generationId;
> > }
> >
> > Then we can change the rebalance listener to something like this:
> > onPartitionsAssigned(ConsumerAssignment assignment)
> >
> > And on the producer:
> > void initTransactions(String groupId, ConsumerAssignment assignment);
> >
> > 2. Another bit of awkwardness is the fact that we have to pass the
> groupId
> > through both initTransactions() and sendOffsetsToTransaction(). We could
> > consider a config instead. Maybe something like `transactional.group.id
> `?
> > Then we could simplify the producer APIs, potentially even deprecating
> the
> > current sendOffsetsToTransaction. In fact, for this new usage, the `
> > transational.id` config is not needed. It would be nice if we don't have
> > to
> > provide it.
> >
> > I like the idea of combining 1 and 2. We could definitely pass in a
> group.id config
> so that we could avoid exposing that information in a public API. The
> question I have
> is whether we should name the interface `GroupAssignment` instead, so
> that Connect
> could also extend the same interface later, just to echo Guozhang's point
> here. Also, the base interface
> would be better defined as empty for easy extension, or as an abstract
> type called `Resource` that can be shared
> later, IMHO.
>
>
> > By the way, I'm a bit confused about discussion above about colocating
> the
> > txn and group coordinators. That is not actually necessary, is it?
> >
> Yes, this is not a requirement for this KIP, because it is inherently
> impossible to
> co-locate the topic partitions of the transaction log and the consumed
> offset topics.
>
>
> > Thanks,
> > Jason
> >
> On Mon, Jun 24, 2019 at 10:07 AM Boyang Chen 
> > wrote:
> >
> > > Thank you Ismael for the suggestion. We will attempt to address it by
> > > giving more details in the rejected alternatives section.
> > >
> > >
> > > Thank you for the comment Guozhang! Answers are inline below.
> > >
> > >
> > >
> > > On Sun, Jun 23, 2019 at 6:33 PM Guozhang Wang 
> > wrote:
> > >
> > > > Hello Boyang,
> > > >
> > > > Thanks for the KIP, I have some comments below:
> > > >
> > > > 1. "Once transactions are complete, the call will return." This seems
> > > > different from the existing behavior, in which we would return a
> > > retriable
> > > CONCURRENT_TRANSACTIONS and let the client retry, is this
> > intentional?
> > > >
> > >
> > > I don’t think it is intentional, and I will defer this question to
> Jason
> > > when he gets time to answer, since from what I understood retry and
> > > on-hold
> > > both seem like valid approaches.
> > >
> > >
> > > > 2. "an overload to onPartitionsAssigned in the consumer's rebalance
> > > > listener 

Contribution access to Kafka Confluence page

2019-06-25 Thread Anastasia Vela
Hi,

Could I get access to create KIPs on the Kafka Confluence page?
Email: av...@confleunt.io
UserID: avela

Thanks,
Anastasia


Re: [DISCUSS] KIP-477: Add PATCH method for connector config in Connect REST API

2019-06-25 Thread Ivan Yurchenko
Hi,

Since Kafka 2.3 has just been release and more people may have time to look
at this now, I'd like to bump this discussion.
Thanks.

Ivan


On Thu, 13 Jun 2019 at 17:20, Ivan Yurchenko 
wrote:

> Hello,
>
> I'd like to start the discussion of KIP-477: Add PATCH method for
> connector config in Connect REST API.
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-477%3A+Add+PATCH+method+for+connector+config+in+Connect+REST+API
>
> There is also a draft PR: https://github.com/apache/kafka/pull/6934.
>
> Thank you.
>
> Ivan
>
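
For context: today, PUT /connectors/{name}/config replaces the whole
connector config, while the proposed PATCH would apply only the submitted
keys and leave the rest untouched. A minimal sketch of the intended usage
(the endpoint shape follows the KIP; the host, connector name, and config
key are just examples):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PatchConnectorConfigExample {
    public static void main(String[] args) throws Exception {
        // Update only tasks.max; all other connector configs keep their values.
        HttpRequest request = HttpRequest.newBuilder()
            .uri(URI.create("http://localhost:8083/connectors/my-sink/config"))
            .header("Content-Type", "application/json")
            .method("PATCH", HttpRequest.BodyPublishers.ofString(
                "{\"tasks.max\": \"4\"}"))
            .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
            .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}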


[jira] [Created] (KAFKA-8601) Producer Improvement: Sticky Partitioner

2019-06-25 Thread Justine Olshan (JIRA)
Justine Olshan created KAFKA-8601:
-

 Summary: Producer Improvement: Sticky Partitioner
 Key: KAFKA-8601
 URL: https://issues.apache.org/jira/browse/KAFKA-8601
 Project: Kafka
  Issue Type: Improvement
Reporter: Justine Olshan
Assignee: Justine Olshan


Currently the default partitioner uses a round-robin strategy to partition 
non-keyed values. The idea is to implement a "sticky partitioner" that chooses 
a partition for a topic and sends all records to that partition until the batch 
is sent. Then a new partition is chosen. This new partitioner will increase 
batching and decrease latency. 
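
A minimal sketch of the idea (illustrative only, not the eventual
implementation; the class and method names here are made up):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ThreadLocalRandom;

// Keeps one "sticky" partition per topic for non-keyed records, and only
// switches to a new random partition when the current batch is sent.
public class StickyPartitionCache {
    private final Map<String, Integer> stickyPartitions = new ConcurrentHashMap<>();

    // Partition to use for the next non-keyed record on this topic.
    public int partition(String topic, int numPartitions) {
        return stickyPartitions.computeIfAbsent(
            topic, t -> ThreadLocalRandom.current().nextInt(numPartitions));
    }

    // Called once the in-flight batch for the topic completes: pick anew.
    public int onNewBatch(String topic, int numPartitions) {
        int next = ThreadLocalRandom.current().nextInt(numPartitions);
        stickyPartitions.put(topic, next);
        return next;
    }
}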





[ANNOUNCE] Apache Kafka 2.3.0

2019-06-25 Thread Colin McCabe
The Apache Kafka community is pleased to announce the release for Apache Kafka 
2.3.0.
This release includes several new features:

- There have been several improvements to the Kafka Connect REST API.
- Kafka Connect now supports incremental cooperative rebalancing. 
- Kafka Streams now supports an in-memory session store and window store.
- The AdminClient now allows users to determine what operations they are 
authorized to perform on topics.
- There is a new broker start time metric.
- JMXTool can now connect to secured RMI ports.
- An incremental AlterConfigs API has been added.  The old AlterConfigs API has 
been deprecated.
- We now track partitions which are under their min ISR count.
- Consumers can now opt-out of automatic topic creation, even when it is 
enabled on the broker.
- Kafka components can now use external configuration stores (KIP-421).
- We have implemented improved replica fetcher behavior when errors are
encountered.
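
As a quick illustration of the new incremental API (a sketch; the broker
address, topic name, and config value are just examples):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class IncrementalAlterExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            ConfigResource topic =
                new ConfigResource(ConfigResource.Type.TOPIC, "my-topic");
            // SET updates just this one config; all others are untouched,
            // unlike the old AlterConfigs API, which replaced the full set.
            AlterConfigOp op = new AlterConfigOp(
                new ConfigEntry("retention.ms", "86400000"),
                AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(
                Collections.singletonMap(topic, Collections.singletonList(op)))
                .all().get();
        }
    }
}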

All of the changes in this release can be found in the release notes:
https://www.apache.org/dist/kafka/2.3.0/RELEASE_NOTES.html


You can download the source and binary release (Scala 2.11 and 2.12) from:
https://kafka.apache.org/downloads#2.3.0


---

Apache Kafka is a distributed streaming platform with four core APIs:

** The Producer API allows an application to publish a stream of records to 
one or more Kafka topics.

** The Consumer API allows an application to subscribe to one or more topics 
and process the stream of records produced to them.

** The Streams API allows an application to act as a stream processor, 
consuming an input stream from one or more topics and producing an output 
stream to one or more output topics, effectively transforming the input streams 
to output streams.

** The Connector API allows building and running reusable producers or 
consumers that connect Kafka topics to existing applications or data systems. 
For example, a connector to a relational database might capture every change to 
a table.


With these APIs, Kafka can be used for two broad classes of applications:
** Building real-time streaming data pipelines that reliably get data between 
systems or applications.
** Building real-time streaming applications that transform or react to the 
streams of data.

Apache Kafka is in use at large and small companies worldwide, including 
Capital One, Goldman Sachs, ING, LinkedIn, Netflix, Pinterest, Rabobank, 
Target, The New York Times, Uber, Yelp, and Zalando, among others.

A big thank you to the following 101 contributors to this release!

Aishwarya Gune, Alex Diachenko, Alex Dunayevsky, Anna Povzner, Arabelle Hou, 
Arjun Satish, A. Sophie Blee-Goldman, asutosh936, Bill Bejeck, Bob Barrett, 
Boyang Chen, Brian Bushree, cadonna, Casey Green, Chase Walden, Chia-Ping Tsai, 
Chris Egerton, Chris Steingen, Colin Hicks, Colin Patrick McCabe, commandini, 
cwildman, Cyrus Vafadari, Dan Norwood, David Arthur, Dejan Stojadinović, 
Dhruvil Shah, Doroszlai, Attila, Ewen Cheslack-Postava, Fangbin Sun, Filipe 
Agapito, Florian Hussonnois, Gardner Vickers, Guozhang Wang, Gwen Shapira, 
Hai-Dang Dam, highluck, huxi, huxihx, Ismael Juma, Ivan Yurchenko, Jarrod 
Urban, Jason Gustafson, John Roesler, José Armando García Sancio, Joyce Fee, 
Jun Rao, KartikVK, Kengo Seki, Kevin Lu, khairy, Konstantine Karantasis, 
Kristian Aurlien, Kyle Ambroff-Kao, lambdaliu, Lee Dongjin, Lifei Chen, Lucas 
Bradstreet, Lysss, lzh3636, Magesh Nandakumar, Manikumar Reddy, Mark Cho, 
Massimo Siani, Matthias J. Sax, Michael Gruben Trejo, Mickael Maison, Murad, 
Nicholas Parker, opera443399, Paul Davidson, pierDipi, pkleindl, Radai 
Rosenblatt, Rajini Sivaram, Randall Hauch, Rohan, Rohan Desai, Ron Dagostino, 
Ryan Chen, saisandeep, sandmannn, sdreynolds, Sebastián Ortega, Shaobo Liu, 
Sönke Liebau, Stanislav Kozlovski, Suman BN, tadsul, Tcsalist, Ted Yu, Vahid 
Hashemian, Victoria Bialas, Viktor Somogyi, Viktor Somogyi-Vass, Vito Jeng, 
wenhoujx, Xiongqi Wu, Yaroslav Klymko, Zhanxiang (Patrick) Huang

We welcome your help and feedback. For more information on how to report 
problems, and to get involved, visit the project website at 
https://kafka.apache.org/

Thank you!

Regards,
Colin


Build failed in Jenkins: kafka-trunk-jdk8 #3751

2019-06-25 Thread Apache Jenkins Server
See 


Changes:

[manikumar.reddy] KAFKA-8390: Use automatic RPC generation in 
CreateDelegationToken

--
[...truncated 2.55 MB...]

kafka.security.auth.PermissionTypeTest > testJavaConversions PASSED

kafka.security.auth.PermissionTypeTest > testFromString STARTED

kafka.security.auth.PermissionTypeTest > testFromString PASSED

2420 tests completed, 1 failed, 2 skipped

> Task :core:test FAILED

> Task :streams:streams-scala:test

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWordsMaterialized 
STARTED

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWordsMaterialized 
PASSED

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWordsJava STARTED

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWordsJava PASSED

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWords STARTED

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWords PASSED

org.apache.kafka.streams.scala.kstream.ProducedTest > Create a Produced should 
create a Produced with Serdes STARTED

org.apache.kafka.streams.scala.kstream.ProducedTest > Create a Produced should 
create a Produced with Serdes PASSED

org.apache.kafka.streams.scala.kstream.ProducedTest > Create a Produced with 
timestampExtractor and resetPolicy should create a Consumed with Serdes, 
timestampExtractor and resetPolicy STARTED

org.apache.kafka.streams.scala.kstream.ProducedTest > Create a Produced with 
timestampExtractor and resetPolicy should create a Consumed with Serdes, 
timestampExtractor and resetPolicy PASSED

org.apache.kafka.streams.scala.kstream.GroupedTest > Create a Grouped should 
create a Grouped with Serdes STARTED

org.apache.kafka.streams.scala.kstream.GroupedTest > Create a Grouped should 
create a Grouped with Serdes PASSED

org.apache.kafka.streams.scala.kstream.GroupedTest > Create a Grouped with 
repartition topic name should create a Grouped with Serdes, and repartition 
topic name STARTED

org.apache.kafka.streams.scala.kstream.GroupedTest > Create a Grouped with 
repartition topic name should create a Grouped with Serdes, and repartition 
topic name PASSED

org.apache.kafka.streams.scala.kstream.JoinedTest > Create a Joined should 
create a Joined with Serdes STARTED

org.apache.kafka.streams.scala.kstream.JoinedTest > Create a Joined should 
create a Joined with Serdes PASSED

org.apache.kafka.streams.scala.kstream.JoinedTest > Create a Joined should 
create a Joined with Serdes and repartition topic name STARTED

org.apache.kafka.streams.scala.kstream.JoinedTest > Create a Joined should 
create a Joined with Serdes and repartition topic name PASSED

org.apache.kafka.streams.scala.kstream.SuppressedTest > 
Suppressed.untilWindowCloses should produce the correct suppression STARTED

org.apache.kafka.streams.scala.kstream.SuppressedTest > 
Suppressed.untilWindowCloses should produce the correct suppression PASSED

org.apache.kafka.streams.scala.kstream.SuppressedTest > 
Suppressed.untilTimeLimit should produce the correct suppression STARTED

org.apache.kafka.streams.scala.kstream.SuppressedTest > 
Suppressed.untilTimeLimit should produce the correct suppression PASSED

org.apache.kafka.streams.scala.kstream.SuppressedTest > BufferConfig.maxRecords 
should produce the correct buffer config STARTED

org.apache.kafka.streams.scala.kstream.SuppressedTest > BufferConfig.maxRecords 
should produce the correct buffer config PASSED

org.apache.kafka.streams.scala.kstream.SuppressedTest > BufferConfig.maxBytes 
should produce the correct buffer config STARTED

org.apache.kafka.streams.scala.kstream.SuppressedTest > BufferConfig.maxBytes 
should produce the correct buffer config PASSED

org.apache.kafka.streams.scala.kstream.SuppressedTest > BufferConfig.unbounded 
should produce the correct buffer config STARTED

org.apache.kafka.streams.scala.kstream.SuppressedTest > BufferConfig.unbounded 
should produce the correct buffer config PASSED

org.apache.kafka.streams.scala.kstream.SuppressedTest > BufferConfig should 
support very long chains of factory methods STARTED

org.apache.kafka.streams.scala.kstream.SuppressedTest > BufferConfig should 
support very long chains of factory methods PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialized 
should create a Materialized with Serdes STARTED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialized 
should create a Materialized with Serdes PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a store name should create a Materialized with Serdes and a store name 
STARTED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 
with a store name should create a Materialized with Serdes and a store name 
PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialize 

Jenkins build is back to normal : kafka-trunk-jdk11 #658

2019-06-25 Thread Apache Jenkins Server
See 




[jira] [Created] (KAFKA-8600) Replace DescribeDelegationToken request/response with automated protocol

2019-06-25 Thread Mickael Maison (JIRA)
Mickael Maison created KAFKA-8600:
-

 Summary: Replace DescribeDelegationToken request/response with 
automated protocol
 Key: KAFKA-8600
 URL: https://issues.apache.org/jira/browse/KAFKA-8600
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison
Assignee: Mickael Maison








[jira] [Created] (KAFKA-8598) Replace RenewDelegationToken request/response with automated protocol

2019-06-25 Thread Mickael Maison (JIRA)
Mickael Maison created KAFKA-8598:
-

 Summary: Replace RenewDelegationToken request/response with 
automated protocol
 Key: KAFKA-8598
 URL: https://issues.apache.org/jira/browse/KAFKA-8598
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison
Assignee: Mickael Maison








[jira] [Created] (KAFKA-8599) Replace ExpireDelegationToken request/response with automated protocol

2019-06-25 Thread Mickael Maison (JIRA)
Mickael Maison created KAFKA-8599:
-

 Summary: Replace ExpireDelegationToken request/response with 
automated protocol
 Key: KAFKA-8599
 URL: https://issues.apache.org/jira/browse/KAFKA-8599
 Project: Kafka
  Issue Type: Sub-task
Reporter: Mickael Maison
Assignee: Mickael Maison








Re: [DISCUSS] KIP-435: Incremental Partition Reassignment

2019-06-25 Thread Viktor Somogyi-Vass
Hi All,

I have added another improvement to this, which is to limit the parallel
leader movements. I think I'll soon (maybe late this week or early next)
start a vote on this too if there is no additional feedback.

Thanks,
Viktor

On Mon, Apr 29, 2019 at 1:26 PM Viktor Somogyi-Vass 
wrote:

> Hi Folks,
>
> I've updated the KIP with the batching which would work on both replica
> and partition level. To explain it briefly: for instance if the replica
> level is set to 2 and partition level is set to 3, then 2x3=6 replica
> reassignments would be in progress at the same time. In case of reassignment
> for a single partition from (0, 1, 2, 3, 4) to (5, 6, 7, 8, 9) we would
> form the batches (0, 1) → (5, 6); (2, 3) → (7, 8) and 4 → 9 and would
> execute the reassignment in this order.
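>
> As a rough illustration of how such batches could be formed (a sketch
> only, not the actual controller logic; it assumes the current and target
> replica lists have equal length, as in the example above):
>
> import java.util.ArrayList;
> import java.util.List;
>
> class ReassignmentBatcher {
>     // Splits one partition's move (current -> target) into consecutive
>     // batches of at most replicaBatchSize single-replica movements.
>     static List<List<int[]>> formBatches(List<Integer> current,
>                                          List<Integer> target,
>                                          int replicaBatchSize) {
>         List<List<int[]>> batches = new ArrayList<>();
>         List<int[]> batch = new ArrayList<>();
>         for (int i = 0; i < target.size(); i++) {
>             batch.add(new int[] { current.get(i), target.get(i) });
>             if (batch.size() == replicaBatchSize || i == target.size() - 1) {
>                 batches.add(batch);
>                 batch = new ArrayList<>();
>             }
>         }
>         return batches;
>     }
> }
>
> // formBatches([0, 1, 2, 3, 4], [5, 6, 7, 8, 9], 2) yields
> // [(0->5, 1->6), (2->7, 3->8), (4->9)], matching the batches above.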
>
> Let me know what you think.
>
> Best,
> Viktor
>
> On Mon, Apr 15, 2019 at 7:01 PM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com> wrote:
>
>> A follow up on the batching topic to clarify my points above.
>>
>> Generally I think that batching should be a core feature since, as Colin
>> said, the controller possesses all the related information.
>> Also, Cruise Control (or really any 3rd-party admin system) might build
>> upon this to give a more holistic approach to balancing brokers. We may
>> cater to them with APIs that act like building blocks to make their life
>> easier, like incrementalization, batching, cancellation and rollback, but
>> I think the more advanced we go, the more advanced a control surface we'll
>> need, and Kafka's basic tooling might not be suitable for that.
>>
>> Best,
>> Viktor
>>
>>
>> On Mon, 15 Apr 2019, 18:22 Viktor Somogyi-Vass, 
>> wrote:
>>
>>> Hey Guys,
>>>
>>> I'll reply to you all in this email:
>>>
>>> @Jun:
>>> 1. yes, it'd be a good idea to add this feature, I'll write this into
>>> the KIP. I was actually thinking about introducing dynamic configs called
>>> reassignment.parallel.partition.count and
>>> reassignment.parallel.replica.count. The first property would control how
>>> many partition reassignments we can do concurrently. The second would go
>>> one level deeper in granularity and would control how many replicas we
>>> want to move for a given partition. Also, one more thing that'd be useful
>>> to fix is that a given list of partition -> replica assignments would be
>>> executed in the same order (from first to last), so it's overall
>>> predictable and the user has some control over the order in which
>>> reassignments are executed, since the JSON is still assembled by the user.
>>> 2. the /kafka/brokers/topics/{topic} znode to be specific. I'll update
>>> the KIP to contain this.
>>>
>>> @Jason:
>>> I think building this functionality into Kafka would definitely benefit
>>> all users, CC included, as it'd simplify their software as you said. As I
>>> understand it, the main advantage of CC and other similar software is to
>>> provide high-level features for automatic load balancing. Reliability,
>>> stability and predictability of the reassignment should be a core feature
>>> of Kafka, and I think the incrementalization feature would make it more
>>> stable. I would consider cancellation a core feature too, and we can
>>> leave the gate open for external tools to feed in their reassignment JSON
>>> as they want. I was also thinking about what set of features we could
>>> provide in Kafka, but I think the more advanced we go, the more need
>>> there is for an administrative UI component.
>>> Regarding KIP-352: thanks for pointing this out, I hadn't seen it,
>>> although lately I was also thinking about the throttling aspect. It would
>>> be a nice add-on to Kafka: though the above configs provide some level of
>>> control, it'd be nice to put an upper cap on the bandwidth and make it
>>> monitorable.
>>>
>>> Viktor
>>>
>>> On Wed, Apr 10, 2019 at 2:57 AM Jason Gustafson 
>>> wrote:
>>>
 Hi Colin,

> On a related note, what do you think about the idea of storing the
> reassigning replicas in
> /brokers/topics/[topic]/partitions/[partitionId]/state, rather than in the
> reassignment znode?  I don't think this requires a major change to the
> proposal-- when the controller becomes aware that it should do a
> reassignment, the controller could make the changes.  This also helps keep
> the reassignment znode from getting larger, which has been a problem.


Yeah, I think it's a good idea to store the reassignment state at a finer
level. I'm not sure the LeaderAndIsr znode is the right one though. Another
option is /brokers/topics/{topic}. That is where we currently store the
replica assignment. I think we basically want to represent both the current
state and the desired state. This would also open the door to a cleaner way
to update a reassignment while it is still in progress.
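
For illustration, the combined state per partition could look something
like this (field names are hypothetical, not from the KIP):

import java.util.Collections;
import java.util.List;

// Hypothetical sketch of what /brokers/topics/{topic} could hold for each
// partition: the current replicas plus, while a reassignment is running,
// the desired target replicas.
class PartitionAssignmentState {
    List<Integer> replicas;                 // current assignment
    List<Integer> targetReplicas =
        Collections.emptyList();            // desired assignment; empty when
                                            // no reassignment is in progress
}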

 -Jason




 On Mon, Apr 8, 2019 at 11:14 PM George Li >>> .invalid>

[jira] [Resolved] (KAFKA-8390) Replace CreateDelegationToken request/response with automated protocol

2019-06-25 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-8390.
--
   Resolution: Fixed
Fix Version/s: 2.4.0

> Replace CreateDelegationToken request/response with automated protocol
> --
>
> Key: KAFKA-8390
> URL: https://issues.apache.org/jira/browse/KAFKA-8390
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Mickael Maison
>Assignee: Mickael Maison
>Priority: Major
> Fix For: 2.4.0
>
>






Build failed in Jenkins: kafka-trunk-jdk8 #3750

2019-06-25 Thread Apache Jenkins Server
See 


Changes:

[manikumar.reddy] MINOR: add unit test for Utils.murmur2 (#5926)

--
[...truncated 4.69 MB...]

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldRestoreGlobalInMemoryKTableOnRestart STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldRestoreGlobalInMemoryKTableOnRestart PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableEOSIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableEOSIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableEOSIntegrationTest > 
shouldNotRestoreAbortedMessages STARTED

org.apache.kafka.streams.integration.GlobalKTableEOSIntegrationTest > 
shouldNotRestoreAbortedMessages PASSED

org.apache.kafka.streams.integration.GlobalKTableEOSIntegrationTest > 
shouldRestoreTransactionalMessages STARTED

org.apache.kafka.streams.integration.GlobalKTableEOSIntegrationTest > 
shouldRestoreTransactionalMessages PASSED

org.apache.kafka.streams.integration.GlobalKTableEOSIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableEOSIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.RepartitionWithMergeOptimizingIntegrationTest
 > shouldSendCorrectRecords_OPTIMIZED STARTED

org.apache.kafka.streams.integration.RepartitionWithMergeOptimizingIntegrationTest
 > shouldSendCorrectRecords_OPTIMIZED PASSED

org.apache.kafka.streams.integration.RepartitionWithMergeOptimizingIntegrationTest
 > shouldSendCorrectResults_NO_OPTIMIZATION STARTED

org.apache.kafka.streams.integration.RepartitionWithMergeOptimizingIntegrationTest
 > shouldSendCorrectResults_NO_OPTIMIZATION PASSED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldFlatTransform STARTED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldFlatTransform PASSED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldFlatTransformValuesWithValueTransformerWithoutKey STARTED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldFlatTransformValuesWithValueTransformerWithoutKey PASSED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldFlatTransformValuesWithKey STARTED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldFlatTransformValuesWithKey PASSED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldTransform STARTED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldTransform PASSED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldTransformValuesWithValueTransformerWithKey STARTED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldTransformValuesWithValueTransformerWithKey PASSED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldTransformValuesWithValueTransformerWithoutKey STARTED

org.apache.kafka.streams.integration.KStreamTransformIntegrationTest > 
shouldTransformValuesWithValueTransformerWithoutKey PASSED

org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest > 
shouldRestoreState STARTED

org.apache.kafka.streams.integration.PurgeRepartitionTopicIntegrationTest > 
shouldRestoreState PASSED

org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest > 
shouldWorkWithRebalance STARTED

org.apache.kafka.streams.integration.SmokeTestDriverIntegrationTest > 
shouldWorkWithRebalance PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED


Build failed in Jenkins: kafka-trunk-jdk11 #657

2019-06-25 Thread Apache Jenkins Server
See 


Changes:

[manikumar.reddy] MINOR: add unit test for Utils.murmur2 (#5926)

--
[...truncated 2.88 MB...]

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
timestampToConnectWithDefaultValue STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
timestampToConnectWithDefaultValue PASSED

org.apache.kafka.connect.json.JsonConverterTest > timeToConnectOptional STARTED

org.apache.kafka.connect.json.JsonConverterTest > timeToConnectOptional PASSED

org.apache.kafka.connect.json.JsonConverterTest > dateToConnectWithDefaultValue 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > dateToConnectWithDefaultValue 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullValueToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullValueToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > decimalToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > decimalToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > stringHeaderToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > stringHeaderToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonNonStringKeys STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonNonStringKeys PASSED

org.apache.kafka.connect.json.JsonConverterTest > longToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > longToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > mismatchSchemaJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > mismatchSchemaJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToConnectNonStringKeys 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > mapToConnectNonStringKeys 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
testJsonSchemaMetadataTranslation PASSED

org.apache.kafka.connect.json.JsonConverterTest > bytesToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > bytesToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > shortToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > shortToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > intToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > intToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndNullValueToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndNullValueToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnectOptional 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > timestampToConnectOptional 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > structToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > structToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > stringToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > stringToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndArrayToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndArrayToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > byteToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > byteToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaPrimitiveToConnect 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaPrimitiveToConnect 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > byteToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > byteToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > intToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > intToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > dateToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > dateToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToConnect STARTED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToConnect PASSED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToJson STARTED

org.apache.kafka.connect.json.JsonConverterTest > noSchemaToJson PASSED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
STARTED

org.apache.kafka.connect.json.JsonConverterTest > nullSchemaAndPrimitiveToJson 
PASSED

org.apache.kafka.connect.json.JsonConverterTest > 
decimalToConnectOptionalWithDefaultValue STARTED

org.apache.kafka.connect.json.JsonConverterTest > 
decimalToConnectOptionalWithDefaultValue PASSED

org.apache.kafka.connect.json.JsonConverterTest > mapToJsonStringKeys STARTED

org.apache.kafka.connect.json.JsonConverterTest > 

[jira] [Created] (KAFKA-8597) Give access to the Dead Letter Queue APIs to Kafka Connect Developers

2019-06-25 Thread Andrea Santurbano (JIRA)
Andrea Santurbano created KAFKA-8597:


 Summary: Give access to the Dead Letter Queue APIs to Kafka 
Connect Developers
 Key: KAFKA-8597
 URL: https://issues.apache.org/jira/browse/KAFKA-8597
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Andrea Santurbano


It would be cool to have access to the DLQ APIs, to enable us (developers)
to use them.

For instance, if someone uses JSON as the message format with no schema and
is trying to import some data into a table, and the JSON contains a null
value for a NON-NULL table field, we want to move that event to the DLQ.

Thanks a lot!





[jira] [Created] (KAFKA-8596) Kafka topic pre-creation error message needs to be passed to application as an exception

2019-06-25 Thread ASHISH M VYAS (JIRA)
ASHISH M VYAS created KAFKA-8596:


 Summary: Kafka topic pre-creation error message needs to be passed 
to application as an exception
 Key: KAFKA-8596
 URL: https://issues.apache.org/jira/browse/KAFKA-8596
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 2.1.1
Reporter: ASHISH M VYAS


If I don't have a topic pre-created, I get an error log that reads "is
unknown yet during rebalance, please make sure they have been pre-created
before starting the Streams application." Ideally I expect an exception to
be thrown here that I can catch in my application, so I can decide what I
want to do.

Without this, my app keeps running but the actual functionality doesn't
work, making it time-consuming to debug.

 





Build failed in Jenkins: kafka-trunk-jdk8 #3749

2019-06-25 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] HOTFIX: removed extra footnote (#6996)

--
[...truncated 2.53 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest >