Re: [DISCUSS] KIP-486 Support for pluggable KeyStore and TrustStore

2019-08-19 Thread Maulin Vasavada
Instead of continuing to diverge in different directions, it would be helpful
if you took my detailed posts covering the 1st through 4th points I mentioned
and commented on each of them, indicating whether or not you agree.

On Mon, Aug 19, 2019 at 10:45 PM Maulin Vasavada 
wrote:

> Hi Colin
>
> When I refer to "standard" or "custom" algorithms I am following the Java
> Security Provider terminology. You can refer to the
> https://docs.oracle.com/javase/7/docs/technotes/guides/security/StandardNames.html#TrustManagerFactory
> link I provided earlier in this thread. It states that PKIX is the default
> algorithm for TrustManagerFactory.
>
> 1. For SPIFFE, I am not sure why you are saying 'it does not implement
> custom algorithms' because the following file clearly indicates that it
> does use a custom algorithm:
>
>
> https://github.com/spiffe/java-spiffe/blob/master/src/main/java/spiffe/provider/SpiffeProvider.java#L17
>
>
> Algorithm value:
> https://github.com/spiffe/java-spiffe/blob/master/src/main/java/spiffe/provider/SpiffeProviderConstants.java#L6
>
> @Harsha do you want to chime in since you use that provider?
>
> 2. I already mentioned in my 3rd point, in my previous post, why using
> ssl.provider does NOT work. I also updated KIP-486's "Rejected Alternatives"
> section to explain why ssl.provider does not work.
>
> 3. My Security.insertProviderAt() comments were based on the assumption that
> the KIP-492 changes are done and we use that mechanism to configure providers
> instead of the ssl.provider configuration.
>
> Could you read all the points I mentioned in my previous post very
> carefully? I have tried to cover all the aspects in my explanation, and I am
> open to further discussion to clarify any doubts.
>
> Thanks
> Maulin
>
>
>
> On Mon, Aug 19, 2019 at 9:52 AM Colin McCabe  wrote:
>
>> Hi Maulin,
>>
>> A lot of JSSE providers don't implement custom algorithms.  Spire is a
>> good example of a JSSE provider that doesn't, and yet is still useful to
>> many people.  Your JSSE provider can work fine even if it doesn't implement
>> a custom algorithm.
>>
>> Maybe I'm missing something, but I don't understand the discussion of
>> Security.insertProviderAt() that you included.  SslEngineBuilder doesn't
>> use that API to get the security provider.  Instead, it calls
>> "SSLContext.getInstance(protocol, provider)", where provider is the name of
>> the provider.
>>
>> best,
>> Colin
>>
>>
>> On Sat, Aug 17, 2019, at 20:13, Maulin Vasavada wrote:
>> > On top of everything above, I feel strongly about adding a 4th point, which
>> > is based on the Java APIs TrustManagerFactory.init(KeyStore) (
>> > https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/TrustManagerFactory.html#init(java.security.KeyStore)
>> > ) and KeyManagerFactory.init(KeyStore, char[]) (
>> > https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/KeyManagerFactory.html#init(java.security.KeyStore,%20char[])
>> > ).
>> >
>> > 4. The above APIs are intended to support providing "trust/key material"
>> > from the user without having to write their own
>> TrustManager/KeyManagers.
>> > To quote from the TrustManagerFactory.init()'s documentation
>> "Initializes
>> > this factory with a source of certificate authorities and related trust
>> > material."
>> > To quote from the KeyManagerFactory.init()'s documentation "Initializes
>> > this factory with a source of key material."
>> >
>> > Based on this, it is clear that Java provides the flexibility to enable
>> > developers to provide the required trust/key material loaded from
>> > "anywhere" without requiring them to write a custom Provider OR trust/key
>> > managers. The same flexibility is reflected in the Kafka code, which loads
>> > the trust/keys from a local file and doesn't necessarily require writing a
>> > Provider. If we do NOT have a custom algorithm, it makes less sense to
>> > write a Provider.
>> >
>> > Thanks
>> > Maulin
>> >
>> >
>> >
>> >
>> >
>> >
>> > On Thu, Aug 15, 2019 at 11:45 PM Maulin Vasavada <
>> maulin.vasav...@gmail.com>
>> > wrote:
>> >
>> > > Hi Harsha/Colin
>> > >
>> > > I did the sample with a custom Provider for TrustStoreManager and tried
>> > > using the ssl.provider Kafka config AND the way KIP-492 is suggesting (by
>> > > adding the Provider programmatically instead of relying on
>> > > ssl.provider+java.security). The sample below is followed by my detailed
>> > > findings. I'd appreciate it if you could go through it carefully and see
>> > > if you see my point.
>> > >
>> > > package providertest;
>> > >
>> > > import java.security.Provider;
>> > >
>> > > public class MyProvider extends Provider {
>> > >
>> > >     private static final String name = "MyProvider";
>> > >     private static double version = 1.0d;
>> > >     private static String info = "Maulin's SSL Provider v" + version;
>> > >
>> > >     public MyProvider() {
>> > >         super(name, version, info);
>> > >         this.put("TrustManagerFactory.PKIX", "providertest.MyTrustManagerFactory");
>> > >     }
>> > > }
>> 

Re: [DISCUSS] KIP-486 Support for pluggable KeyStore and TrustStore

2019-08-19 Thread Maulin Vasavada
Hi Colin

When I refer to "standard" or "custom" algorithms I am following the Java
Security Provider terminology. You can refer to the
https://docs.oracle.com/javase/7/docs/technotes/guides/security/StandardNames.html#TrustManagerFactory
link I provided earlier in this thread. It states that PKIX is the default
algorithm for TrustManagerFactory.
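
As a quick illustration of that default (a minimal sketch using only standard
JSSE APIs; not taken from the KIP itself):

import javax.net.ssl.TrustManagerFactory;

public class DefaultTmfAlgorithm {
    public static void main(String[] args) throws Exception {
        // Resolves to "PKIX" unless overridden via the
        // ssl.TrustManagerFactory.algorithm security property.
        String algorithm = TrustManagerFactory.getDefaultAlgorithm();
        TrustManagerFactory tmf = TrustManagerFactory.getInstance(algorithm);
        System.out.println(algorithm + " -> " + tmf.getProvider().getName());
    }
}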

1. For SPIFFE, I am not sure why you are saying 'it does not implement
custom algorithms' because the following file clearly indicates that it
does use a custom algorithm:

https://github.com/spiffe/java-spiffe/blob/master/src/main/java/spiffe/provider/SpiffeProvider.java#L17


Algorithm value:
https://github.com/spiffe/java-spiffe/blob/master/src/main/java/spiffe/provider/SpiffeProviderConstants.java#L6

@Harsha do you want to chime in since you use that provider?

2. I already mentioned in my 3rd point, in my previous post, why using
ssl.provider does NOT work. I also updated KIP-486's "Rejected Alternatives"
section to explain why ssl.provider does not work.

3. My Security.insertProviderAt() comments were based on the assumption that
the KIP-492 changes are done and we use that mechanism to configure providers
instead of the ssl.provider configuration.
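
For readers following along, the difference being discussed is roughly the
following (an illustrative sketch only; MyProvider is the sample class quoted
further down in this thread):

import java.security.Provider;
import java.security.Security;
import providertest.MyProvider;

public class ProviderRegistration {
    public static void main(String[] args) {
        Provider provider = new MyProvider();
        // Security.addProvider(provider) would append it at the END of the
        // installed provider list; insertProviderAt(provider, 1) gives it the
        // highest preference, which is what the KIP-492 discussion refers to.
        int position = Security.insertProviderAt(provider, 1);
        System.out.println("Installed at position " + position);
    }
}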

Could you read all the points I mentioned in my previous post very
carefully? I have tried to cover all the aspects in my explanation, and I am
open to further discussion to clarify any doubts.

Thanks
Maulin



On Mon, Aug 19, 2019 at 9:52 AM Colin McCabe  wrote:

> Hi Maulin,
>
> A lot of JSSE providers don't implement custom algorithms.  Spire is a
> good example of a JSSE provider that doesn't, and yet is still useful to
> many people.  Your JSSE provider can work fine even if it doesn't implement
> a custom algorithm.
>
> Maybe I'm missing something, but I don't understand the discussion of
> Security.insertProviderAt() that you included.  SslEngineBuilder doesn't
> use that API to get the security provider.  Instead, it calls
> "SSLContext.getInstance(protocol, provider)", where provider is the name of
> the provider.
>
> best,
> Colin
>
>
> On Sat, Aug 17, 2019, at 20:13, Maulin Vasavada wrote:
> > On top of everything above, I feel strongly about adding a 4th point, which
> > is based on the Java APIs TrustManagerFactory.init(KeyStore) (
> > https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/TrustManagerFactory.html#init(java.security.KeyStore)
> > ) and KeyManagerFactory.init(KeyStore, char[]) (
> > https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/KeyManagerFactory.html#init(java.security.KeyStore,%20char[])
> > ).
> >
> > 4. The above APIs are intended to support providing "trust/key material"
> > from the user without having to write their own TrustManager/KeyManagers.
> > To quote from the TrustManagerFactory.init()'s documentation "Initializes
> > this factory with a source of certificate authorities and related trust
> > material."
> > To quote from the KeyManagerFactory.init()'s documentation "Initializes
> > this factory with a source of key material."
> >
> > Based on this, it is clear that Java provides the flexibility to enable
> > developers to provide the required trust/key material loaded from
> > "anywhere" without requiring them to write a custom Provider OR trust/key
> > managers. The same flexibility is reflected in the Kafka code, which loads
> > the trust/keys from a local file and doesn't necessarily require writing a
> > Provider. If we do NOT have a custom algorithm, it makes less sense to
> > write a Provider.
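
As an illustrative sketch of the flexibility described above (standard JSSE
APIs only; the keystore path and password below are made-up placeholders):

import java.io.FileInputStream;
import java.security.KeyStore;
import javax.net.ssl.KeyManagerFactory;
import javax.net.ssl.SSLContext;
import javax.net.ssl.TrustManagerFactory;

public class CustomMaterialSslContext {
    public static SSLContext build() throws Exception {
        char[] password = "changeit".toCharArray();

        // The key/trust material could be loaded from anywhere (a remote
        // service, a vault, etc.); a local file is used here only as a
        // placeholder.
        KeyStore keyStore = KeyStore.getInstance("PKCS12");
        try (FileInputStream in = new FileInputStream("/path/to/material.p12")) {
            keyStore.load(in, password);
        }

        KeyManagerFactory kmf =
            KeyManagerFactory.getInstance(KeyManagerFactory.getDefaultAlgorithm());
        kmf.init(keyStore, password);

        TrustManagerFactory tmf =
            TrustManagerFactory.getInstance(TrustManagerFactory.getDefaultAlgorithm());
        tmf.init(keyStore);

        // No custom Provider or custom TrustManager/KeyManager implementation
        // is needed for this.
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(kmf.getKeyManagers(), tmf.getTrustManagers(), null);
        return context;
    }
}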
> >
> > Thanks
> > Maulin
> >
> >
> >
> >
> >
> >
> > On Thu, Aug 15, 2019 at 11:45 PM Maulin Vasavada <
> maulin.vasav...@gmail.com>
> > wrote:
> >
> > > Hi Harsha/Colin
> > >
> > > I did the sample with a custom Provider for TrustStoreManager and tried
> > > using the ssl.provider Kafka config AND the way KIP-492 is suggesting (by
> > > adding the Provider programmatically instead of relying on
> > > ssl.provider+java.security). The sample below is followed by my detailed
> > > findings. I'd appreciate it if you could go through it carefully and see
> > > if you see my point.
> > >
> > > package providertest;
> > >
> > > import java.security.Provider;
> > >
> > > public class MyProvider extends Provider {
> > >
> > >     private static final String name = "MyProvider";
> > >     private static double version = 1.0d;
> > >     private static String info = "Maulin's SSL Provider v" + version;
> > >
> > >     public MyProvider() {
> > >         super(name, version, info);
> > >         this.put("TrustManagerFactory.PKIX", "providertest.MyTrustManagerFactory");
> > >     }
> > > }
> > >
> > >
> > >
> > > *Details:*
> > >
> > > KIP-492 documents that it will use Security.addProvider(), assuming the
> > > provider will be added at position '0', which is not a correct assumption.
> > > The addProvider() documentation says the provider is added at the last
> > > available position. You may want to correct that to say
> > > Security.insertProviderAt(provider, 1).
> > >
> > > Now coming back to our specific discussion,
> > >
> > > 1. SPIFFE 

Build failed in Jenkins: kafka-trunk-jdk11 #766

2019-08-19 Thread Apache Jenkins Server
See 


Changes:

[me] MINOR: Upgrade ducktape to 0.7.6

--
[...truncated 2.61 MB...]


[jira] [Created] (KAFKA-8820) Use Admin API of Replica Reassignment in CLI tools

2019-08-19 Thread Gwen Shapira (Jira)
Gwen Shapira created KAFKA-8820:
---

 Summary: Use Admin API of Replica Reassignment in CLI tools
 Key: KAFKA-8820
 URL: https://issues.apache.org/jira/browse/KAFKA-8820
 Project: Kafka
  Issue Type: Bug
Reporter: Gwen Shapira
Assignee: Steve Rodrigues


KIP-455 and KAFKA-8345 add a protocol and Admin API that will be used for 
replica reassignments. We need to update the reassignment tool to use this new 
API rather than working with ZK directly.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] KIP-495: Dynamically Adjust Log Levels in Connect

2019-08-19 Thread Arjun Satish
Thanks, Konstantine.

Updated the KIP with the restrictions around log4j and added references to
similar KIPs.
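
For readers who have not followed the log4j details further down in the thread,
the restriction is roughly this: with log4j 1.x on the classpath the level can
be changed programmatically through the public Logger API, whereas log4j 2.x
requires the Configurator class. A minimal sketch of the log4j 1.x case (the
logger name here is just an example, not necessarily what the KIP uses):

import org.apache.log4j.Level;
import org.apache.log4j.LogManager;

public class LogLevelExample {

    // log4j 1.x exposes setLevel() directly on the Logger API; in log4j 2.x
    // the (non-public-API) Configurator class would be needed instead.
    public static void setLevel(String loggerName, String level) {
        LogManager.getLogger(loggerName).setLevel(Level.toLevel(level));
    }

    public static void main(String[] args) {
        setLevel("org.apache.kafka.connect", "DEBUG");
    }
}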

Best,

On Mon, Aug 19, 2019 at 3:20 PM Konstantine Karantasis <
konstant...@confluent.io> wrote:

> Thanks Arjun, the example is useful!
>
> My point when I mentioned the restrictions around log4j is that this
> information is significant and IMO needs to be included in the KIP.
>
> Speaking of its relevance to KIP-412, I think a reference would be nice
> too.
>
> Konstantine
>
>
>
> On Thu, Aug 15, 2019 at 4:00 PM Arjun Satish 
> wrote:
>
> > Hey Konstantine,
> >
> > Thanks for the feedback.
> >
> > re: the use of log4j, yes, the proposed changes will only work if log4j is
> > available at runtime. We will not add the mBean if log4j is not available
> > in classpath. If we change from log4j 1 to 2, that would involve another
> > KIP, and it would need to update the changes proposed in this KIP and
> > others (KIP-412, for instance).
> >
> > re: use of Object types, I've changed it from Boolean to the primitive
> type
> > for setLogLevel. We are changing the signature of the old method this
> way,
> > but since it never returned null, this should be fine.
> >
> > re: example usage, I've added some screenshots showing how this feature
> > would be used with jconsole.
> >
> > Hope this works!
> >
> > Thanks very much,
> > Arjun
> >
> > On Wed, Aug 14, 2019 at 6:42 AM Konstantine Karantasis <
> > konstant...@confluent.io> wrote:
> >
> > > And one thing I forgot is also related to Chris's comment above. I
> agree
> > > that an example on how a user is expected to set the log level (for
> > > instance to DEBUG) would be nice, even if it's showing only one out of
> > the
> > > many possible ways to achieve that.
> > >
> > > - Konstantine
> > >
> > > On Wed, Aug 14, 2019 at 4:38 PM Konstantine Karantasis <
> > > konstant...@confluent.io> wrote:
> > >
> > > >
> > > > Thanks Arjun for tackling the need to support this very useful
> feature.
> > > >
> > > > One thing I noticed while reading the KIP is that I would have loved
> to
> > > > see more info regarding how this proposal depends on the underlying
> > > logging
> > > > APIs and implementations. For instance, my understanding is that
> slf4j
> > > can
> > > > not be leveraged and that the logging framework needs to be pegged to
> > > log4j
> > > > explicitly (or another logging implementation). Correct me if I'm
> > wrong,
> > > > but if such a dependency is introduced I believe it's worth
> mentioning.
> > > >
> > > > Additionally, if the above is correct, there are differences in
> log4j's
> > > > APIs between version 1 and version 2. In version 2, Logger#setLevel
> > > method
> > > > has been removed from the Logger interface, and in order to set the log
> > > > level programmatically the Configurator class needs to be used, which, as
> > > > stated in the FAQ (
> > > >
> https://logging.apache.org/log4j/2.x/faq.html#reconfig_level_from_code
> > )
> > > > it's not part of log4j2's public API. Is this a concern? I believe
> that
> > > > even if these are implementation specific details for the wrappers
> > > > introduced by this KIP (which to a certain extent they are), a
> mention
> > in
> > > > the KIP text and a few references would be useful to understand the
> > > changes
> > > > and the dependencies introduced by this proposal.
> > > >
> > > > And a few minor comments:
> > > > - Is there any specific reason that object types were preferred in
> the
> > > > proposed interface compared to primitive types? My understanding is
> > that
> > > > `null` is not expected as a return value.
> > > > - Related to the above, I think it'd be nice for the javadoc to
> mention
> > > > when a parameter is not expected to be `null` with an appropriate
> > comment
> > > > (e.g. foo bar etc; may not be null)
> > > >
> > > > Cheers,
> > > > Konstantine
> > > >
> > > > On Tue, Aug 6, 2019 at 9:34 AM Cyrus Vafadari 
> > > wrote:
> > > >
> > > >> This looks like a useful feature, the strategy makes sense, and the
> > KIP
> > > is
> > > >> thorough and nicely written. Thanks!
> > > >>
> > > >> Cyrus
> > > >>
> > > >> On Thu, Aug 1, 2019, 12:40 PM Chris Egerton 
> > > wrote:
> > > >>
> > > >> > Thanks Arjun! Looks good to me.
> > > >> >
> > > >> > On Thu, Aug 1, 2019 at 12:33 PM Arjun Satish <
> > arjun.sat...@gmail.com>
> > > >> > wrote:
> > > >> >
> > > >> > > Thanks for the feedback, Chris!
> > > >> > >
> > > >> > > Yes, the example is pretty much how Connect will use the new
> > > feature.
> > > >> > > Tweaked the section to make this more clear.
> > > >> > >
> > > >> > > Best,
> > > >> > >
> > > >> > > On Fri, Jul 26, 2019 at 11:52 AM Chris Egerton <
> > chr...@confluent.io
> > > >
> > > >> > > wrote:
> > > >> > >
> > > >> > > > Hi Arjun,
> > > >> > > >
> > > >> > > > This looks great. The changes to public interface are pretty
> > small
> > > >> and
> > > >> > > > moving the Log4jController class into the clients package
> seems
> > > like
> > > >> > the
> > > >> 

Re: [DISCUSS] KIP-444: Augment Metrics for Kafka Streams

2019-08-19 Thread Guozhang Wang
Hi Bruno,

Just realized that for `addRateSensor` and `addLatencyAndRateSensor` we've
actually added the total invocation metric already.
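
(For context, a minimal sketch of recording a rate and a cumulative total on a
single sensor with the plain org.apache.kafka.common.metrics API; the metric
and group names are made up, and this is not the StreamsMetrics API itself.)

import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Rate;
import org.apache.kafka.common.metrics.stats.Total;

public class RateAndTotalExample {
    public static void main(String[] args) {
        Metrics metrics = new Metrics();
        Sensor sensor = metrics.sensor("process");
        // One sensor can back both a rate metric and a cumulative total.
        sensor.add(metrics.metricName("process-rate", "example-group"), new Rate());
        sensor.add(metrics.metricName("process-total", "example-group"), new Total());
        sensor.record();  // a single invocation updates both metrics
        metrics.close();
    }
}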


Guozhang

On Mon, Aug 19, 2019 at 4:11 PM Guozhang Wang  wrote:

> Hi Bruno,
>
>
> On Tue, Aug 6, 2019 at 1:51 AM Bruno Cadonna  wrote:
>
>> Hi Guozhang,
>>
>> I left my comments inline.
>>
>> On Thu, Jul 18, 2019 at 8:28 PM Guozhang Wang  wrote:
>> >
>> > Hello Bruno,
>> >
>> > Thanks for the feedbacks, replied inline.
>> >
>> > On Mon, Jul 1, 2019 at 7:08 AM Bruno Cadonna 
>> wrote:
>> >
>> > > Hi Guozhang,
>> > >
>> > > Thank you for the KIP.
>> > >
>> > > 1) As far as I understand, the StreamsMetrics interface is there for
>> > > user-defined processors. Would it make sense to also add a method to
>> > > the interface to specify a sensor that records skipped records?
>> > >
>> > > Not sure I follow.. if users want to add a specific skipped records
>> > sensor, she can still do that as a "throughput" sensor via "
>> > addThroughputSensor" and then "record" right?
>> >
>> > As an after-thought, maybe it's better to rename `throughput` to `rate`
>> in
>> > the public APIs since it is really meant for the latter semantics. I did
>> > not change it just to make less API changes / deprecate fewer functions.
>> > But if we feel it is important we can change it as well.
>> >
>>
>> I see now that a user can record the rate of skipped records. However,
>> I was referring to the total number of skipped records. Maybe my
>> question should be more general: should we allow the user to also
>> specify sensors for totals or combinations of rate and totals?
>>
>> Sounds good to me, I will add it to the wiki page as well for
> StreamsMetrics.
>
>
>
>> Regarding the naming, I like `rate` more than `throughput`, but I
>> would not fight for it.
>>
>> >
>> > > 2) What are the semantics of active-task-process and
>> standby-task-process
>> > >
>> > > Ah good catch, I think I made it in the wrong column. Just some
>> > explanations here: within a thread's looped iterations, it will first try
>> > to process some records from the active tasks, and then see if there are
>> > any standby tasks that can be processed as well (i.e. just reading from the
>> > restore consumer and applying the records to the local stores). The ratio
>> > metrics indicate 1) which tasks (active or standby) this thread owns so far,
>> > and 2) what percentage of its time it spends on each of them.
>> >
>> > But this metric should really be a task-level one that includes both the
>> > thread-id and task-id, and upon task migrations they will be dynamically
>> > deleted / (re)-created. For each task-id it may be owned by multiple
>> > threads as one active and others standby, and hence the separation of
>> > active / standby seems still necessary.
>> >
>>
>> Makes sense.
>>
>>
>> >
>> >
>> > > 3) How do dropped-late-records and expired-window-record-drop relate
>> > > to each other? I guess the former is for records that fall outside the
>> > > grace period and the latter is for records that are processed after
>> > > the retention period of the window. Is this correct?
>> > >
>> > > Yes, that's correct. The names are indeed a bit confusing since they were
>> > added in different releases historically.
>> >
>> > More precisely, the `grace period` is a notion of the operator (hence
>> the
>> > metric is node-level, though it would only be used for DSL operators)
>> while
>> > the `retention` is a notion of the store (hence the metric is
>> store-level).
>> > Usually grace period will be smaller than store retention though.
>> >
>> > The processor node is aware of the `grace period`, and when it receives a
>> > record that is older than the grace deadline, the record will be dropped
>> > immediately; otherwise it will still be processed and maybe a new update is
>> > "put" into the store. The store is aware of its `retention period`, and upon
>> > a "put" call, if it realizes the record is older than the retention
>> > deadline, that put call will be ignored and the metric is recorded.
>> >
>> > We have to separate them here since the window store can be used in both
>> > DSL and PAPI, and in the former case the record would likely already be
>> > ignored at the processor node level due to the grace period, which is
>> > usually smaller than retention; but for PAPI there's no grace period and
>> > hence the processor would likely still process and call "put" on the store.
>> >
>>
>> Alright! Got it!
>>
>> >
>> > > 4) Is there an actual difference between skipped and dropped records?
>> > > If not, shall we unify the terminology?
>> > >
>> > >
>> > There is. Dropped records are only due to lateness; whereas skipped
>> > records can be due to serde errors (where the user's error handling
>> > indicates "skip and continue"), timestamp errors, etc.
>> >
>> > I've considered that maybe a better (more extensible) way would be
>> > defining a single metric name, say skipped-records, but using different
>> > tags to indicate the 

Re: [DISCUSS] KIP-444: Augment Metrics for Kafka Streams

2019-08-19 Thread Guozhang Wang
Hi Bruno,


On Tue, Aug 6, 2019 at 1:51 AM Bruno Cadonna  wrote:

> Hi Guozhang,
>
> I left my comments inline.
>
> On Thu, Jul 18, 2019 at 8:28 PM Guozhang Wang  wrote:
> >
> > Hello Bruno,
> >
> > Thanks for the feedbacks, replied inline.
> >
> > On Mon, Jul 1, 2019 at 7:08 AM Bruno Cadonna  wrote:
> >
> > > Hi Guozhang,
> > >
> > > Thank you for the KIP.
> > >
> > > 1) As far as I understand, the StreamsMetrics interface is there for
> > > user-defined processors. Would it make sense to also add a method to
> > > the interface to specify a sensor that records skipped records?
> > >
> > > Not sure I follow.. if users want to add a specific skipped records
> > sensor, she can still do that as a "throughput" sensor via "
> > addThroughputSensor" and then "record" right?
> >
> > As an after-thought, maybe it's better to rename `throughput` to `rate`
> in
> > the public APIs since it is really meant for the latter semantics. I did
> > not change it just to make less API changes / deprecate fewer functions.
> > But if we feel it is important we can change it as well.
> >
>
> I see now that a user can record the rate of skipped records. However,
> I was referring to the total number of skipped records. Maybe my
> question should be more general: should we allow the user to also
> specify sensors for totals or combinations of rate and totals?
>
> Sounds good to me, I will add it to the wiki page as well for
StreamsMetrics.



> Regarding the naming, I like `rate` more than `throughput`, but I
> would not fight for it.
>
> >
> > > 2) What are the semantics of active-task-process and
> standby-task-process
> > >
> > > Ah good catch, I think I made it in the wrong column. Just some
> > explanations here: within a thread's looped iterations, it will first try
> > to process some records from the active tasks, and then see if there are
> > any standby tasks that can be processed as well (i.e. just reading from the
> > restore consumer and applying the records to the local stores). The ratio
> > metrics indicate 1) which tasks (active or standby) this thread owns so far,
> > and 2) what percentage of its time it spends on each of them.
> >
> > But this metric should really be a task-level one that includes both the
> > thread-id and task-id, and upon task migrations they will be dynamically
> > deleted / (re)-created. For each task-id it may be owned by multiple
> > threads as one active and others standby, and hence the separation of
> > active / standby seems still necessary.
> >
>
> Makes sense.
>
>
> >
> >
> > > 3) How do dropped-late-records and expired-window-record-drop relate
> > > to each other? I guess the former is for records that fall outside the
> > > grace period and the latter is for records that are processed after
> > > the retention period of the window. Is this correct?
> > >
> > > Yes, that's correct. The names are indeed a bit confusing since they were
> > added in different releases historically.
> >
> > More precisely, the `grace period` is a notion of the operator (hence the
> > metric is node-level, though it would only be used for DSL operators)
> while
> > the `retention` is a notion of the store (hence the metric is
> store-level).
> > Usually grace period will be smaller than store retention though.
> >
> > The processor node is aware of the `grace period`, and when it receives a
> > record that is older than the grace deadline, the record will be dropped
> > immediately; otherwise it will still be processed and maybe a new update is
> > "put" into the store. The store is aware of its `retention period`, and upon
> > a "put" call, if it realizes the record is older than the retention
> > deadline, that put call will be ignored and the metric is recorded.
> >
> > We have to separate them here since the window store can be used in both
> > DSL and PAPI, and in the former case the record would likely already be
> > ignored at the processor node level due to the grace period, which is
> > usually smaller than retention; but for PAPI there's no grace period and
> > hence the processor would likely still process and call "put" on the store.
> >
>
> Alright! Got it!
>
> >
> > > 4) Is there an actual difference between skipped and dropped records?
> > > If not, shall we unify the terminology?
> > >
> > >
> > There is. Dropped records are only due to lateness; whereas skipped
> > records can be due to serde errors (where the user's error handling
> > indicates "skip and continue"), timestamp errors, etc.
> >
> > I've considered that maybe a better (more extensible) way would be defining
> > a single metric name, say skipped-records, but using different tags to
> > indicate the skipping reason (errors, windowing semantics, etc.). But
> > there's still a tricky difference: for serde-caused skipping, for example,
> > records are skipped at the very beginning and no effects are taken at all.
> > For some others, e.g. a null key/value at the reduce operator, the record is
> > only skipped in the middle of the processing, i.e. some 

Re: [DISCUSS] KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum

2019-08-19 Thread Colin McCabe
Hi all,

The KIP has been out for a while, so I'm thinking about calling a vote some 
time this week.

best,
Colin

On Mon, Aug 19, 2019, at 15:52, Colin McCabe wrote:
> On Mon, Aug 19, 2019, at 12:52, David Arthur wrote:
> > Thanks for the KIP, Colin. This looks great!
> > 
> > I really like the idea of separating the Controller and Broker JVMs.
> > 
> > As you alluded to above, it might be nice to have a separate
> > broker-registration API to avoid overloading the metadata fetch API.
> >
> 
> Hi David,
> 
> Thanks for taking a look.
> 
> I removed the sentence about MetadataFetch also serving as the broker 
> registration API.  I think I agree that we will probably want a 
> separate RPC to fill this role.  We will have a follow-on KIP that will 
> go into more detail about metadata propagation and registration in the 
> post-ZK world.  That KIP will also have a full description of the 
> registration RPC, etc.  For now, I think the important part for KIP-500 
> is that the broker registers with the controller quorum.  On 
> registration, the controller quorum assigns it a new broker epoch, 
> which can distinguish successive broker incarnations.
> 
> > 
> > When a broker gets a metadata delta, will it be a sequence of deltas since
> > the last update or a cumulative delta since the last update?
> >
> 
> It will be a sequence of deltas.  Basically, the broker will be reading 
> from the metadata log.
> 
> >
> > Will we include any kind of integrity check on the deltas to ensure the 
> > brokers
> > have applied them correctly? Perhaps this will be addressed in one of the
> > follow-on KIPs.
> > 
> 
> In general, we will have checksums on the metadata that we fetch.  This 
> is similar to how we have checksums on regular data.  Or if the 
> question is about catching logic errors in the metadata handling code, 
> that sounds more like something that should be caught by test cases.
> 
> best,
> Colin
> 
> 
> > Thanks!
> > 
> > On Fri, Aug 9, 2019 at 1:17 PM Colin McCabe  wrote:
> > 
> > > Hi Mickael,
> > >
> > > Thanks for taking a look.
> > >
> > > I don't think we want to support that kind of multi-tenancy at the
> > > controller level.  If the cluster is small enough that we want to pack the
> > > controller(s) with something else, we could run them alongside the 
> > > brokers,
> > > or possibly inside three of the broker JVMs.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > On Wed, Aug 7, 2019, at 10:37, Mickael Maison wrote:
> > > > Thanks Colin for kickstarting this initiative.
> > > >
> > > > Just one question.
> > > > - A nice feature of Zookeeper is the ability to use chroots and have
> > > > several Kafka clusters use the same Zookeeper ensemble. Is this
> > > > something we should keep?
> > > >
> > > > Thanks
> > > >
> > > > On Mon, Aug 5, 2019 at 7:44 PM Colin McCabe  wrote:
> > > > >
> > > > > On Mon, Aug 5, 2019, at 10:02, Tom Bentley wrote:
> > > > > > Hi Colin,
> > > > > >
> > > > > > Thanks for the KIP.
> > > > > >
> > > > > > Currently ZooKeeper provides a convenient notification mechanism for
> > > > > > knowing that broker and topic configuration has changed. While
> > > KIP-500 does
> > > > > > suggest that incremental metadata update is expected to come to
> > > clients
> > > > > > eventually, that would seem to imply that for some number of
> > > releases there
> > > > > > would be no equivalent mechanism for knowing about config changes.
> > > Is there
> > > > > > any thinking at this point about how a similar notification might be
> > > > > > provided in the future?
> > > > >
> > > > > We could eventually have some inotify-like mechanism where clients
> > > could register interest in various types of events and get notified when
> > > they happen.  Reading the metadata log is conceptually simple.  The main
> > > complexity would be in setting up an API that made sense and that didn't
> > > unduly constrain future implementations.  We'd have to think carefully
> > > about what the real use-cases for this were, though.
> > > > >
> > > > > best,
> > > > > Colin
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Tom
> > > > > >
> > > > > > On Mon, Aug 5, 2019 at 3:49 PM Viktor Somogyi-Vass <
> > > viktorsomo...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Colin,
> > > > > > >
> > > > > > > I think this is a long-awaited KIP, thanks for driving it. I'm
> > > excited to
> > > > > > > see this in Kafka once. I collected my questions (and I accept the
> > > "TBD"
> > > > > > > answer as they might be a bit deep for this high level :) ).
> > > > > > > 1.) Are there any specific reasons for the Controller just
> > > > > > > persisting its state on disk periodically instead of
> > > asynchronously with
> > > > > > > every update? Wouldn't less frequent saves increase the chance for
> > > missing
> > > > > > > a state change if the controller crashes between two saves?
> > > > > > > 2.) Why can't we allow brokers to fetch metadata from the 

Re: [DISCUSS] KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum

2019-08-19 Thread Colin McCabe
On Mon, Aug 19, 2019, at 12:52, David Arthur wrote:
> Thanks for the KIP, Colin. This looks great!
> 
> I really like the idea of separating the Controller and Broker JVMs.
> 
> As you alluded to above, it might be nice to have a separate
> broker-registration API to avoid overloading the metadata fetch API.
>

Hi David,

Thanks for taking a look.

I removed the sentence about MetadataFetch also serving as the broker 
registration API.  I think I agree that we will probably want a separate RPC to 
fill this role.  We will have a follow-on KIP that will go into more detail 
about metadata propagation and registration in the post-ZK world.  That KIP 
will also have a full description of the registration RPC, etc.  For now, I 
think the important part for KIP-500 is that the broker registers with the 
controller quorum.  On registration, the controller quorum assigns it a new 
broker epoch, which can distinguish successive broker incarnations.

> 
> When a broker gets a metadata delta, will it be a sequence of deltas since
> the last update or a cumulative delta since the last update?
>

It will be a sequence of deltas.  Basically, the broker will be reading from 
the metadata log.
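
(A purely hypothetical sketch of what applying a sequence of deltas in log
order could look like; none of these names come from KIP-500 or the Kafka code
base.)

import java.util.List;

interface MetadataDelta {
    long offset();                      // position in the metadata log
    void applyTo(MetadataImage image);  // mutate the broker's local view
}

class MetadataImage {
    private long lastAppliedOffset = -1L;

    void apply(List<MetadataDelta> deltas) {
        for (MetadataDelta delta : deltas) {
            // Deltas are applied strictly in log order rather than being
            // collapsed into one cumulative delta.
            if (delta.offset() <= lastAppliedOffset) {
                continue;  // already applied
            }
            delta.applyTo(this);
            lastAppliedOffset = delta.offset();
        }
    }
}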

>
> Will we include any kind of integrity check on the deltas to ensure the 
> brokers
> have applied them correctly? Perhaps this will be addressed in one of the
> follow-on KIPs.
> 

In general, we will have checksums on the metadata that we fetch.  This is 
similar to how we have checksums on regular data.  Or if the question is about 
catching logic errors in the metadata handling code, that sounds more like 
something that should be caught by test cases.

best,
Colin


> Thanks!
> 
> On Fri, Aug 9, 2019 at 1:17 PM Colin McCabe  wrote:
> 
> > Hi Mickael,
> >
> > Thanks for taking a look.
> >
> > I don't think we want to support that kind of multi-tenancy at the
> > controller level.  If the cluster is small enough that we want to pack the
> > controller(s) with something else, we could run them alongside the brokers,
> > or possibly inside three of the broker JVMs.
> >
> > best,
> > Colin
> >
> >
> > On Wed, Aug 7, 2019, at 10:37, Mickael Maison wrote:
> > > Thanks Colin for kickstarting this initiative.
> > >
> > > Just one question.
> > > - A nice feature of Zookeeper is the ability to use chroots and have
> > > several Kafka clusters use the same Zookeeper ensemble. Is this
> > > something we should keep?
> > >
> > > Thanks
> > >
> > > On Mon, Aug 5, 2019 at 7:44 PM Colin McCabe  wrote:
> > > >
> > > > On Mon, Aug 5, 2019, at 10:02, Tom Bentley wrote:
> > > > > Hi Colin,
> > > > >
> > > > > Thanks for the KIP.
> > > > >
> > > > > Currently ZooKeeper provides a convenient notification mechanism for
> > > > > knowing that broker and topic configuration has changed. While
> > KIP-500 does
> > > > > suggest that incremental metadata update is expected to come to
> > clients
> > > > > eventually, that would seem to imply that for some number of
> > releases there
> > > > > would be no equivalent mechanism for knowing about config changes.
> > Is there
> > > > > any thinking at this point about how a similar notification might be
> > > > > provided in the future?
> > > >
> > > > We could eventually have some inotify-like mechanism where clients
> > could register interest in various types of events and get notified when
> > they happen.  Reading the metadata log is conceptually simple.  The main
> > complexity would be in setting up an API that made sense and that didn't
> > unduly constrain future implementations.  We'd have to think carefully
> > about what the real use-cases for this were, though.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Tom
> > > > >
> > > > > On Mon, Aug 5, 2019 at 3:49 PM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hey Colin,
> > > > > >
> > > > > > I think this is a long-awaited KIP, thanks for driving it. I'm
> > excited to
> > > > > > see this in Kafka once. I collected my questions (and I accept the
> > "TBD"
> > > > > > answer as they might be a bit deep for this high level :) ).
> > > > > > 1.) Are there any specific reasons for the Controller just
> > > > > > persisting its state on disk periodically instead of
> > asynchronously with
> > > > > > every update? Wouldn't less frequent saves increase the chance for
> > missing
> > > > > > a state change if the controller crashes between two saves?
> > > > > > 2.) Why can't we allow brokers to fetch metadata from the follower
> > > > > > controllers? I assume that followers would have up-to-date
> > information
> > > > > > therefore brokers could fetch from there in theory.
> > > > > >
> > > > > > Thanks,
> > > > > > Viktor
> > > > > >
> > > > > > On Sun, Aug 4, 2019 at 6:58 AM Boyang Chen <
> > reluctanthero...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks for explaining Ismael! Breaking down into follow-up KIPs
> > sounds

Re: [DISCUSS] KIP-495: Dynamically Adjust Log Levels in Connect

2019-08-19 Thread Konstantine Karantasis
Thanks Arjun, the example is useful!

My point when I mentioned the restrictions around log4j is that this
information is significant and IMO needs to be included in the KIP.

Speaking of its relevance to KIP-412, I think a reference would be nice
too.

Konstantine



On Thu, Aug 15, 2019 at 4:00 PM Arjun Satish  wrote:

> Hey Konstantine,
>
> Thanks for the feedback.
>
> re: the use of log4j, yes, the proposed changes will only work if log4j is
> available at runtime. We will not add the mBean if log4j is not available
> in classpath. If we change from log4j 1 to 2, that would involve another
> KIP, and it would need to update the changes proposed in this KIP and
> others (KIP-412, for instance).
>
> re: use of Object types, I've changed it from Boolean to the primitive type
> for setLogLevel. We are changing the signature of the old method this way,
> but since it never returned null, this should be fine.
>
> re: example usage, I've added some screenshots showing how this feature
> would be used with jconsole.
>
> Hope this works!
>
> Thanks very much,
> Arjun
>
> On Wed, Aug 14, 2019 at 6:42 AM Konstantine Karantasis <
> konstant...@confluent.io> wrote:
>
> > And one thing I forgot is also related to Chris's comment above. I agree
> > that an example on how a user is expected to set the log level (for
> > instance to DEBUG) would be nice, even if it's showing only one out of
> the
> > many possible ways to achieve that.
> >
> > - Konstantine
> >
> > On Wed, Aug 14, 2019 at 4:38 PM Konstantine Karantasis <
> > konstant...@confluent.io> wrote:
> >
> > >
> > > Thanks Arjun for tackling the need to support this very useful feature.
> > >
> > > One thing I noticed while reading the KIP is that I would have loved to
> > > see more info regarding how this proposal depends on the underlying
> > logging
> > > APIs and implementations. For instance, my understanding is that slf4j
> > can
> > > not be leveraged and that the logging framework needs to be pegged to
> > log4j
> > > explicitly (or another logging implementation). Correct me if I'm
> wrong,
> > > but if such a dependency is introduced I believe it's worth mentioning.
> > >
> > > Additionally, if the above is correct, there are differences in log4j's
> > > APIs between version 1 and version 2. In version 2, Logger#setLevel
> > method
> > > has been removed from the Logger interface, and in order to set the log
> > > level programmatically the Configurator class needs to be used, which, as
> > > stated in the FAQ (
> > > https://logging.apache.org/log4j/2.x/faq.html#reconfig_level_from_code
> )
> > > it's not part of log4j2's public API. Is this a concern? I believe that
> > > even if these are implementation specific details for the wrappers
> > > introduced by this KIP (which to a certain extent they are), a mention
> in
> > > the KIP text and a few references would be useful to understand the
> > changes
> > > and the dependencies introduced by this proposal.
> > >
> > > And a few minor comments:
> > > - Is there any specific reason that object types were preferred in the
> > > proposed interface compared to primitive types? My understanding is
> that
> > > `null` is not expected as a return value.
> > > - Related to the above, I think it'd be nice for the javadoc to mention
> > > when a parameter is not expected to be `null` with an appropriate
> comment
> > > (e.g. foo bar etc; may not be null)
> > >
> > > Cheers,
> > > Konstantine
> > >
> > > On Tue, Aug 6, 2019 at 9:34 AM Cyrus Vafadari 
> > wrote:
> > >
> > >> This looks like a useful feature, the strategy makes sense, and the
> KIP
> > is
> > >> thorough and nicely written. Thanks!
> > >>
> > >> Cyrus
> > >>
> > >> On Thu, Aug 1, 2019, 12:40 PM Chris Egerton 
> > wrote:
> > >>
> > >> > Thanks Arjun! Looks good to me.
> > >> >
> > >> > On Thu, Aug 1, 2019 at 12:33 PM Arjun Satish <
> arjun.sat...@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Thanks for the feedback, Chris!
> > >> > >
> > >> > > Yes, the example is pretty much how Connect will use the new
> > feature.
> > >> > > Tweaked the section to make this more clear.
> > >> > >
> > >> > > Best,
> > >> > >
> > >> > > On Fri, Jul 26, 2019 at 11:52 AM Chris Egerton <
> chr...@confluent.io
> > >
> > >> > > wrote:
> > >> > >
> > >> > > > Hi Arjun,
> > >> > > >
> > >> > > > This looks great. The changes to public interface are pretty
> small
> > >> and
> > >> > > > moving the Log4jController class into the clients package seems
> > like
> > >> > the
> > >> > > > right way to go. One question I have--it looks like the purpose
> of
> > >> this
> > >> > > KIP
> > >> > > > is to enable dynamic setting of log levels in the Connect
> > framework,
> > >> > but
> > >> > > > it's not clear how the Connect framework will use that new
> > utility.
> > >> Is
> > >> > > the
> > >> > > > "Example Usage" section (which involves invoking the utility
> with
> > a
> > >> > > > namespace of "kafka.connect") actually meant to be part of the
> > >> proposed
> > >> > > > 

[jira] [Created] (KAFKA-8819) Plugin path for converters not working as intended

2019-08-19 Thread Magesh kumar Nandakumar (Jira)
Magesh kumar Nandakumar created KAFKA-8819:
--

 Summary: Plugin path for converters not working as intended
 Key: KAFKA-8819
 URL: https://issues.apache.org/jira/browse/KAFKA-8819
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Reporter: Magesh kumar Nandakumar
Assignee: Magesh kumar Nandakumar


Kafka Connect allows all plugins to be made available via the plugin path 
mechanism, which provides classpath isolation. This is not working as designed 
for Converters under the following circumstances.

I have two directories under the plugin path, `connector1` and `connector2`. I 
intend to use AvroConverter, and it is available in both plugin directories. 
Under these circumstances, the Worker attempts to create the Converter 
available in the first plugin directory, which should ideally be deterministic 
but is not, for the following reasons:

[https://github.com/apache/kafka/blob/aa4ba8eee8e6f52a9d80a98fb2530b5bcc1b9a11/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/Worker.java#L421]
 would lead to all configs of type Class being loaded, which means they are not 
loaded in the context of the connector's plugin loader. IIUC, the current 
loader would be the DelegatingClassLoader. This means the AvroConverter could 
potentially be loaded from the connector2 plugin path while loading the class; 
this should be made deterministic as intended. Also, there could be instances 
where the Converter itself uses ServiceLoader, and for that to work, when the 
Converter is created and configured the current class loader should be set to 
the one corresponding to the converter.
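
A sketch of the kind of fix being described, i.e. swapping the thread context
class loader to the connector's plugin loader around converter creation and
configuration (the pluginLoader parameter is schematic and stands in for
whatever loader the plugin framework resolves; this is not the actual Worker
code):

import java.util.Map;
import org.apache.kafka.connect.storage.Converter;

public class ConverterInstantiation {

    public static Converter createConverter(ClassLoader pluginLoader,
                                             String converterClass,
                                             Map<String, ?> config) throws Exception {
        ClassLoader previous = Thread.currentThread().getContextClassLoader();
        Thread.currentThread().setContextClassLoader(pluginLoader);
        try {
            // Loading and configuring inside the plugin's loader keeps
            // Class.forName and ServiceLoader lookups deterministic.
            Converter converter = (Converter) Class
                    .forName(converterClass, true, pluginLoader)
                    .getDeclaredConstructor()
                    .newInstance();
            converter.configure(config, false);
            return converter;
        } finally {
            Thread.currentThread().setContextClassLoader(previous);
        }
    }
}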

 

 

 



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Build failed in Jenkins: kafka-trunk-jdk11 #765

2019-08-19 Thread Apache Jenkins Server
See 

--
[...truncated 2.09 MB...]



[DISCUSS] KIP-509: Rebalance and restart Producers

2019-08-19 Thread W. D.
At the moment producers do not get any support when restarting:
- Where should the producer start producing data from?
- What data has been committed successfully in the past?

There is also no support in the Kafka client itself for load distribution:
- How many producers for a single source are currently active?
- Which producer reads which tables or database table partitions?

By doing something similar to what the consumer provides, this can be solved
and made more convenient for the producer developer.
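
For comparison, the consumer-side support being referenced (group membership
plus committed offsets) looks roughly like this with the existing client API;
the bootstrap server, group id, and topic name are just examples:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ConsumerSideSupport {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "example-group");
        props.put("key.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
            "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // The group coordinator decides which partitions this instance owns,
            // and committed offsets tell a restarted instance where to resume;
            // this is the kind of support the KIP asks for on the producer side.
            consumer.subscribe(Collections.singletonList("example-topic"));
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
            records.forEach(r -> System.out.println(r.key() + " -> " + r.value()));
            consumer.commitSync();
        }
    }
}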


Re: [DISCUSS] KIP-500: Replace ZooKeeper with a Self-Managed Metadata Quorum

2019-08-19 Thread David Arthur
Thanks for the KIP, Colin. This looks great!

I really like the idea of separating the Controller and Broker JVMs.

As you alluded to above, it might be nice to have a separate
broker-registration API to avoid overloading the metadata fetch API.

When a broker gets a metadata delta, will it be a sequence of deltas since
the last update or a cumulative delta since the last update? Will we
include any kind of integrity check on the deltas to ensure the brokers
have applied them correctly? Perhaps this will be addressed in one of the
follow-on KIPs.

Thanks!

On Fri, Aug 9, 2019 at 1:17 PM Colin McCabe  wrote:

> Hi Mickael,
>
> Thanks for taking a look.
>
> I don't think we want to support that kind of multi-tenancy at the
> controller level.  If the cluster is small enough that we want to pack the
> controller(s) with something else, we could run them alongside the brokers,
> or possibly inside three of the broker JVMs.
>
> best,
> Colin
>
>
> On Wed, Aug 7, 2019, at 10:37, Mickael Maison wrote:
> > Thanks Colin for kickstarting this initiative.
> >
> > Just one question.
> > - A nice feature of Zookeeper is the ability to use chroots and have
> > several Kafka clusters use the same Zookeeper ensemble. Is this
> > something we should keep?
> >
> > Thanks
> >
> > On Mon, Aug 5, 2019 at 7:44 PM Colin McCabe  wrote:
> > >
> > > On Mon, Aug 5, 2019, at 10:02, Tom Bentley wrote:
> > > > Hi Colin,
> > > >
> > > > Thanks for the KIP.
> > > >
> > > > Currently ZooKeeper provides a convenient notification mechanism for
> > > > knowing that broker and topic configuration has changed. While
> KIP-500 does
> > > > suggest that incremental metadata update is expected to come to
> clients
> > > > eventually, that would seem to imply that for some number of
> releases there
> > > > would be no equivalent mechanism for knowing about config changes.
> Is there
> > > > any thinking at this point about how a similar notification might be
> > > > provided in the future?
> > >
> > > We could eventually have some inotify-like mechanism where clients
> could register interest in various types of events and get notified when
> they happen.  Reading the metadata log is conceptually simple.  The main
> complexity would be in setting up an API that made sense and that didn't
> unduly constrain future implementations.  We'd have to think carefully
> about what the real use-cases for this were, though.
> > >
> > > best,
> > > Colin
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Tom
> > > >
> > > > On Mon, Aug 5, 2019 at 3:49 PM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hey Colin,
> > > > >
> > > > > I think this is a long-awaited KIP, thanks for driving it. I'm
> excited to
> > > > > see this in Kafka once. I collected my questions (and I accept the
> "TBD"
> > > > > answer as they might be a bit deep for this high level :) ).
> > > > > 1.) Are there any specific reasons for the Controller just
> > > > > persisting its state on disk periodically instead of
> asynchronously with
> > > > > every update? Wouldn't less frequent saves increase the chance for
> missing
> > > > > a state change if the controller crashes between two saves?
> > > > > 2.) Why can't we allow brokers to fetch metadata from the follower
> > > > > controllers? I assume that followers would have up-to-date
> information
> > > > > therefore brokers could fetch from there in theory.
> > > > >
> > > > > Thanks,
> > > > > Viktor
> > > > >
> > > > > On Sun, Aug 4, 2019 at 6:58 AM Boyang Chen <
> reluctanthero...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thanks for explaining Ismael! Breaking down into follow-up KIPs
> sounds
> > > > > like
> > > > > > a good idea.
> > > > > >
> > > > > > On Sat, Aug 3, 2019 at 10:14 AM Ismael Juma 
> wrote:
> > > > > >
> > > > > > > Hi Boyang,
> > > > > > >
> > > > > > > Yes, there will be several KIPs that will discuss the items you
> > > > > describe
> > > > > > in
> > > > > > > detail. Colin, it may be helpful to make this clear in the KIP
> 500
> > > > > > > description.
> > > > > > >
> > > > > > > Ismael
> > > > > > >
> > > > > > > On Sat, Aug 3, 2019 at 9:32 AM Boyang Chen <
> reluctanthero...@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks Colin for initiating this important effort!
> > > > > > > >
> > > > > > > > One question I have is whether we have a session discussing
> the
> > > > > > > controller
> > > > > > > > failover in the new architecture? I know we are using Raft
> protocol
> > > > > to
> > > > > > > > failover, yet it's still valuable to discuss the steps new
> cluster is
> > > > > > > going
> > > > > > > > to take to reach the stable stage again, so that we could
> easily
> > > > > > measure
> > > > > > > > the availability of the metadata servers.
> > > > > > > >
> > > > > > > > Another suggestion I have is to write a step-by-step design
> doc like
> > > > > > what
> > > > > > > > we did in KIP-98
> > > > > > > > <
> > > 

[jira] [Created] (KAFKA-8818) CreatePartitions Request protocol documentation

2019-08-19 Thread Jira
Fábio Silva created KAFKA-8818:
--

 Summary: CreatePartitions Request protocol documentation
 Key: KAFKA-8818
 URL: https://issues.apache.org/jira/browse/KAFKA-8818
 Project: Kafka
  Issue Type: Bug
Reporter: Fábio Silva






--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] KIP-507: Securing Internal Connect REST Endpoints

2019-08-19 Thread Chris Egerton
Hi Ryanne,

The reasoning for this is provided in the KIP: "There would be no clear way
to achieve consensus amongst the workers in a cluster on whether to switch
to this new behavior." To elaborate on this--it would be bad if a follower
worker began writing task configurations to the topic while the leader
continued to only listen on the internal REST endpoint. It's necessary to
ensure that every worker in a cluster supports this new behavior before
switching it on.

To be fair, it is technically possible to achieve consensus without using
the group coordination protocol by adding new configurations and using a
multi-phase rolling upgrade, but the operational overhead of this
approach would significantly complicate the default case of just
wanting to bump your Connect cluster to the next version without having to
alter your existing worker config files and with only a single restart per
worker. The proposed approach provides a much simpler user experience for
the most common use case and doesn't impose much additional complexity for
other use cases.

I've updated the KIP to expand on points from your last emails; let me know
what you think.

Cheers,

Chris

On Thu, Aug 15, 2019 at 4:53 PM Ryanne Dolan  wrote:

> Thanks Chris, that makes sense.
>
> I know you have already considered this, but I'm not convinced we should
> rule out using Kafka topics for this purpose. That would enable the same
> level of security without introducing any new authentication or
> authorization mechanisms (your keys). And as you say, you'd need to lock
> down Connect's topics and groups anyway.
>
> Can you explain what you mean when you say using Kafka topics would require
> "reworking the group coordination protocol"? I don't see how these are
> related. Why would it matter if the workers sent Kafka messages vs POST
> requests among themselves?
>
> Ryanne
>
> On Thu, Aug 15, 2019 at 3:57 PM Chris Egerton  wrote:
>
> > Hi Ryanne,
> >
> > Yes, if the Connect group is left unsecured then that is a potential
> > vulnerability. However, in a truly secure Connect cluster, the group
> would
> > need to be secured anyways to prevent attackers from joining the group
> with
> > the intent to either snoop on connector/task configurations or bring the
> > cluster to a halt by spamming the group with membership requests and then
> > not running the assigned connectors/tasks. Additionally, for a Connect
> > cluster to be secure, access to internal topics (for configs, offsets,
> and
> > statuses) would also need to be restricted so that attackers could not,
> > e.g., write arbitrary connector/task configurations to the configs topic.
> > This is all currently possible in Kafka with the use of ACLs.
> >
> > I think the bottom line here is that there's a number of steps that need
> to
> > be taken to effectively lock down a Connect cluster; the point of this
> KIP
> > is to close a loophole that exists even after all of those steps are
> taken,
> > not to completely secure this one vulnerability even when no other
> security
> > measures are taken.
> >
> > Cheers,
> >
> > Chris
> >
> > On Wed, Aug 14, 2019 at 10:56 PM Ryanne Dolan 
> > wrote:
> >
> > > Chris, I don't understand how the rebalance protocol can be used to
> give
> > > out session tokens in a secure way. It seems that any attacker could
> just
> > > join the group and sign requests with the provided token. Am I missing
> > > something?
> > >
> > > Ryanne
> > >
> > > On Wed, Aug 14, 2019, 2:31 PM Chris Egerton 
> wrote:
> > >
> > > > The KIP page can be found at
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-507%3A+Securing+Internal+Connect+REST+Endpoints
> > > > ,
> > > > by the way. Apologies for neglecting to include it in my initial
> email!
> > > >
> > > > On Wed, Aug 14, 2019 at 12:29 PM Chris Egerton 
> > > > wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I'd like to start discussion on a KIP to secure the internal "POST
> > > > > /connectors//tasks" endpoint for the Connect framework. The
> > > > proposed
> > > > > changes address a vulnerability in the framework in its current
> state
> > > > that
> > > > > allows malicious users to write arbitrary task configurations for
> > > > > connectors; it is vital that this issue be addressed in order for
> any
> > > > > Connect cluster to be secure.
> > > > >
> > > > > Looking forward to your thoughts,
> > > > >
> > > > > Chris
> > > > >
> > > >
> > >
> >
>


ACL for group creation?

2019-08-19 Thread Adam Bellemare
Hi All

I am looking through the Confluent docs and core Kafka docs and don't see
an ACL for group creation:
https://docs.confluent.io/current/kafka/authorization.html#acl-format
and
https://kafka.apache.org/documentation/#security_authz

My scenario is simple: We use the consumer group as the means of
identifying a single application, including tooling for managing
application resets, offset management, lag monitoring, etc. We often have
situations where someone resets their consumer group by appending an
incremented integer ("cg" to "cg1"), but it throws the rest of the
monitoring and management tooling out of whack.

Is there a reason why we do not have ACL-based CREATE restrictions to a
particular consumer group? I am willing to do the work to implement this
and test it out, but I wanted to validate that there isn't a reason I am
missing.
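
For reference, the closest existing control is the Read operation on the
Group resource, which a consumer already needs in order to join a group at
all; there is no Create operation scoped to the Group resource today. Below
is a minimal sketch with the Java AdminClient showing what can be locked
down right now (the broker address, principal, and group name are just
placeholders):

import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;

public class GroupAclExample {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Allow the application principal to use (Read) this specific group;
            // consumers must have Read on a group in order to join it. There is
            // no separate Create operation on the Group resource, which is the
            // gap discussed above.
            AclBinding binding = new AclBinding(
                new ResourcePattern(ResourceType.GROUP, "cg", PatternType.LITERAL),
                new AccessControlEntry("User:my-app", "*",
                    AclOperation.READ, AclPermissionType.ALLOW));
            admin.createAcls(Collections.singleton(binding)).all().get();
        }
    }
}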

Thanks
Adam


Re: Fwd: Warning from dev@kafka.apache.org

2019-08-19 Thread Matthias J. Sax
It seems the ezmlm bot was not able to deliver some emails from the
mailing list to you.

It's unclear what the root cause is though.

Ezmlm sends you the notification email below to make you aware that you
may have missed some emails. Check the archive to read those emails.
Otherwise you can ignore the notification.


-Matthias

On 8/16/19 10:20 PM, Manish G wrote:
> ...why do I get this mail?
> 
> -- Forwarded message -
> From: 
> Date: Sat, Aug 17, 2019 at 10:25 AM
> Subject: Warning from dev@kafka.apache.org
> To: 
> 
> 
> Hi! This is the ezmlm program. I'm managing the
> dev@kafka.apache.org mailing list.
> 
> I'm working for my owner, who can be reached
> at dev-ow...@kafka.apache.org.
> 
> 
> Messages to you from the dev mailing list seem to
> have been bouncing. I've attached a copy of the first bounce
> message I received.
> 
> If this message bounces too, I will send you a probe. If the probe bounces,
> I will remove your address from the dev mailing list,
> without further notice.
> 
> 
> I've kept a list of which messages from the dev mailing list have
> bounced from your address.
> 
> Copies of these messages may be in the archive.
> To retrieve a set of messages 123-145 (a maximum of 100 per request),
> send a short message to:
>
> 
> To receive a subject and author list for the last 100 or so messages,
> send a short message to:
>
> 
> Here are the message numbers:
> 
>106348
>106424
>106493
>106491
>106380
> 
> --- Enclosed is a copy of the bounce message I received.
> 
> Return-Path: <>
> Received: (qmail 61167 invoked for bounce); 6 Aug 2019 23:28:32 -
> Date: 6 Aug 2019 23:28:32 -
> From: mailer-dae...@apache.org
> To: dev-return-1063...@kafka.apache.org
> Subject: failure notice
> 



signature.asc
Description: OpenPGP digital signature


[jira] [Created] (KAFKA-8817) Flaky Test KafkaProducerTest.testCloseIsForcedOnPendingAddOffsetRequest

2019-08-19 Thread Chris Pettitt (Jira)
Chris Pettitt created KAFKA-8817:


 Summary: Flaky Test 
KafkaProducerTest.testCloseIsForcedOnPendingAddOffsetRequest
 Key: KAFKA-8817
 URL: https://issues.apache.org/jira/browse/KAFKA-8817
 Project: Kafka
  Issue Type: Bug
  Components: core
Reporter: Chris Pettitt


Error:
{code:java}
org.junit.runners.model.TestTimedOutException: test timed out after 5000 
milliseconds
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1260)
at 
org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1190)
at 
org.apache.kafka.clients.producer.KafkaProducer.close(KafkaProducer.java:1167)
at 
org.apache.kafka.clients.producer.KafkaProducerTest.testCloseIsForcedOnPendingAddOffsetRequest(KafkaProducerTest.java:894)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:748)
 {code}

Currently 100% reproducible locally when running the whole test suite. Does not 
repro when running this test class individually.



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: [DISCUSS] KIP-482: The Kafka Protocol should Support Optional Fields

2019-08-19 Thread Colin McCabe
Hi Satish,

I wasn't originally going to propose supporting the struct type, although 
perhaps we could consider it.

In general, supporting a struct containing an array has the same serialization 
issues as just supporting the array.

Probably, we should just have a two-pass serialization mechanism where we 
calculate lengths first and then write out bytes.  If we do that, we can avoid 
having any restrictions on tagged fields vs. regular fields.  I will take a 
look at how complex this would be.
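
As a rough illustration of the two-pass idea (this is only a sketch, not the
generated protocol code; the interface and field names below are made up):

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Pass 1 computes the encoded size of every field; pass 2 writes the bytes
// into a buffer that was allocated with the exact size already known.
interface Writable {
    int size();
    void write(ByteBuffer buffer);
}

final class StringField implements Writable {
    private final byte[] utf8;

    StringField(String value) {
        this.utf8 = value.getBytes(StandardCharsets.UTF_8);
    }

    @Override
    public int size() {
        return 2 + utf8.length;            // 2-byte length prefix, for simplicity
    }

    @Override
    public void write(ByteBuffer buffer) {
        buffer.putShort((short) utf8.length);
        buffer.put(utf8);
    }
}

public class TwoPassDemo {
    public static void main(String[] args) {
        Writable field = new StringField("hello");
        ByteBuffer buffer = ByteBuffer.allocate(field.size()); // length known up front
        field.write(buffer);
        System.out.println("encoded " + buffer.position() + " bytes");
    }
}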

best,
Colin


On Sun, Aug 18, 2019, at 22:27, Satish Duggana wrote:
> Please read struct type as a complex record type in my earlier mail.
> The complex type seems to be defined as Schema[1] in the protocol
> types.
> 
> 1. 
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/protocol/types/Schema.java#L27
> 
> 
> On Mon, Aug 19, 2019 at 9:46 AM Satish Duggana  
> wrote:
> >
> > Sorry! Colin, I may not have been clear in my earlier query about
> > optional field type restriction. It is mentioned in one of your
> > replies "optional fields are serialized starting with their total
> > length". This brings the question of whether optional fields support
> > struct types (with or without array values). It seems struct types are
> > currently not serialized with total length. I may be missing something
> > here.
> >
> > Thanks,
> > Satish.
> >
> >
> > On Wed, Aug 14, 2019 at 8:03 AM Satish Duggana  
> > wrote:
> > >
> > > Hi Colin,
> > > Thanks for the KIP. Optional fields and var length encoding support is a 
> > > great
> > > improvement for the protocol.
> > >
> > > >>Optional fields can have any type, except that they cannot be arrays.
> > > Note that the restriction against having tagged arrays is just to simplify
> > > serialization.  We can relax this restriction in the future without 
> > > changing
> > > the protocol on the wire.
> > >
> > > Can an Optional field have a struct type which internally contains an 
> > > array
> > > field at any level?
> > >
> > > Thanks,
> > > Satish.
> > >
> > >
> > >
> > > On Tue, Aug 13, 2019 at 11:49 PM David Jacot  wrote:
> > > >
> > > > Hi Colin,
> > > >
> > > > Thank you for the KIP! Things are well explained! It is a huge
> > > > improvement
> > > > for the Kafka protocol. I have a few comments on the proposal:
> > > >
> > > > 1. The interleaved tag/length header sounds like a great optimisation 
> > > > as it
> > > > would be shorter on average. The downside, as
> > > > you already pointed out, is that it makes the decoding and the specs 
> > > > more
> > > > complex. Personally, I would also favour using two
> > > > varints in this particular case to keep things simple.
> > > >
> > > > 2. As discussed, I wonder if it would make sense to extend to KIP to 
> > > > also
> > > > support optional fields in the Record Header. I think
> > > > that it could be interesting to have such capability for common fields
> > > > across all the requests or responses (e.g. tracing id).
> > > >
> > > > Regards,
> > > > David
> > > >
> > > >
> > > >
> > > > On Tue, Aug 13, 2019 at 10:00 AM Jason Gustafson  
> > > > wrote:
> > > >
> > > > > > Right, I was planning on doing exactly that for all the 
> > > > > > auto-generated
> > > > > RPCs. For the manual RPCs, it would be a lot of work. It’s probably a
> > > > > better use of time to convert the manual ones to auto gen first (with 
> > > > > the
> > > > > possible exception of Fetch/Produce, where the ROI may be higher for 
> > > > > the
> > > > > manual work)
> > > > >
> > > > > Yeah, that makes sense. Maybe we can include the version bump for all 
> > > > > RPCs
> > > > > in this KIP, but we can implement it lazily as the protocols are 
> > > > > converted.
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Mon, Aug 12, 2019 at 7:16 PM Colin McCabe  
> > > > > wrote:
> > > > >
> > > > > > On Mon, Aug 12, 2019, at 11:22, Jason Gustafson wrote:
> > > > > > > Hi Colin,
> > > > > > >
> > > > > > > Thanks for the KIP! This is a significant improvement. One of my
> > > > > personal
> > > > > > > interests in this proposal is solving the compatibility problems 
> > > > > > > we
> > > > > have
> > > > > > > with the internal schemas used to define consumer offsets and
> > > > > transaction
> > > > > > > metadata. Currently we have to guard schema bumps with the 
> > > > > > > inter-broker
> > > > > > > protocol format. Once the format is bumped, there is no way to
> > > > > downgrade.
> > > > > > > By fixing this, we can potentially begin using the new schemas 
> > > > > > > before
> > > > > the
> > > > > > > IBP is bumped while still allowing downgrade.
> > > > > > >
> > > > > > > There are a surprising number of other situations we have 
> > > > > > > encountered
> > > > > > this
> > > > > > > sort of problem. We have hacked around it in special cases by 
> > > > > > > allowing
> > > > > > > nullable fields to the end of the schema, but this is not really 
> > > > > > > an
> > > > > > > extensible 

Re: [VOTE] KIP-503: deleted topics metric

2019-08-19 Thread David Arthur
Hello everyone, I'm going to close out the voting on this KIP. The results
follow:

* 3 binding +1 votes from Harsha, Manikumar, and Gwen
* 5 non-binding +1 votes from Stanislov, Mickael, Robert, David Jacot, and
Satish
* No -1 votes

Which gives us a passing vote. Thanks, everyone!

-David

On Sun, Aug 18, 2019 at 1:22 PM Gwen Shapira  wrote:

> +1 (binding)
> This will be most useful. Thank you.
>
> On Tue, Aug 13, 2019 at 12:08 PM David Arthur 
> wrote:
> >
> > Hello all,
> >
> > I'd like to start the vote on KIP-503
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-503%3A+Add+metric+for+number+of+topics+marked+for+deletion
> >
> > Thanks!
> > David
>
>
>
> --
> Gwen Shapira
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>


-- 
David Arthur


Re: [DISCUSS] KIP-482: The Kafka Protocol should Support Optional Fields

2019-08-19 Thread Colin McCabe
Hi David,

Good point.  I think it makes the most sense to put the tagged fields sections 
at the end of the header.  For consistency's sake, we can also put them at the 
end of the structures / requests / responses as well.  I've updated the KIP.
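
Purely to illustrate the shape of such a section (the exact wire format is
whatever the KIP ends up specifying; the varint encoding and the layout of
count/tag/size/payload below are assumptions, not the final spec):

import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;

public class TaggedSectionSketch {
    // Write an unsigned varint: 7 bits per byte, high bit set on continuation bytes.
    static void writeUnsignedVarint(int value, ByteArrayOutputStream out) {
        while ((value & 0xffffff80) != 0) {
            out.write((value & 0x7f) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Append a tagged-field section at the end of a structure as
    // <numTaggedFields><tag><size><bytes>... with tags in ascending order.
    static byte[] writeTaggedSection(Map<Integer, byte[]> taggedFields) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeUnsignedVarint(taggedFields.size(), out);
        for (Map.Entry<Integer, byte[]> e : new TreeMap<>(taggedFields).entrySet()) {
            writeUnsignedVarint(e.getKey(), out);            // tag
            writeUnsignedVarint(e.getValue().length, out);   // size
            out.write(e.getValue(), 0, e.getValue().length); // payload
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        Map<Integer, byte[]> fields = new TreeMap<>();
        fields.put(0, "client-rack-a".getBytes(StandardCharsets.UTF_8));
        fields.put(5, new byte[] {1, 2, 3});
        System.out.println("tagged section is " + writeTaggedSection(fields).length + " bytes");
    }
}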

best,
Colin

On Mon, Aug 19, 2019, at 02:23, David Jacot wrote:
> Hi Colin,
> 
> Thank you for the updated KIP.
> 
> wrt. Request and Response Headers v1, where do you plan to put the optional
> fields? You mention that all the Requests and Responses will start with a
> tagged field buffer. Is it for that purpose?
> 
> Best,
> David
> 
> On Mon, Aug 19, 2019 at 7:27 AM Satish Duggana 
> wrote:
> 
> > Please read struct type as a complex record type in my earlier mail.
> > The complex type seems to be defined as Schema[1] in the protocol
> > types.
> >
> > 1.
> > https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/protocol/types/Schema.java#L27
> >
> >
> > On Mon, Aug 19, 2019 at 9:46 AM Satish Duggana 
> > wrote:
> > >
> > > Sorry! Colin, I may not have been clear in my earlier query about
> > > optional field type restriction. It is mentioned in one of your
> > > replies "optional fields are serialized starting with their total
> > > length". This brings the question of whether optional fields support
> > > struct types (with or without array values). It seems struct types are
> > > currently not serialized with total length. I may be missing something
> > > here.
> > >
> > > Thanks,
> > > Satish.
> > >
> > >
> > > On Wed, Aug 14, 2019 at 8:03 AM Satish Duggana 
> > wrote:
> > > >
> > > > Hi Colin,
> > > > Thanks for the KIP. Optional fields and var length encoding support is
> > a great
> > > > improvement for the protocol.
> > > >
> > > > >>Optional fields can have any type, except that they cannot be arrays.
> > > > Note that the restriction against having tagged arrays is just to
> > simplify
> > > > serialization.  We can relax this restriction in the future without
> > changing
> > > > the protocol on the wire.
> > > >
> > > > Can an Optional field have a struct type which internally contains an
> > array
> > > > field at any level?
> > > >
> > > > Thanks,
> > > > Satish.
> > > >
> > > >
> > > >
> > > > On Tue, Aug 13, 2019 at 11:49 PM David Jacot 
> > wrote:
> > > > >
> > > > > Hi Colin,
> > > > >
> > > > > Thank you for the KIP! Things are well explained! It is a huge
> > improvement
> > > > > for the Kafka protocol. I have a few comments on the proposal:
> > > > >
> > > > > 1. The interleaved tag/length header sounds like a great
> > optimisation as it
> > > > > would be shorter on average. The downside, as
> > > > > you already pointed out, is that it makes the decoding and the specs
> > more
> > > > > complex. Personally, I would also favour using two
> > > > > varints in this particular case to keep things simple.
> > > > >
> > > > > 2. As discussed, I wonder if it would make sense to extend to KIP to
> > also
> > > > > support optional fields in the Record Header. I think
> > > > > that it could be interesting to have such capability for common
> > fields
> > > > > across all the requests or responses (e.g. tracing id).
> > > > >
> > > > > Regards,
> > > > > David
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Aug 13, 2019 at 10:00 AM Jason Gustafson 
> > wrote:
> > > > >
> > > > > > > Right, I was planning on doing exactly that for all the
> > auto-generated
> > > > > > RPCs. For the manual RPCs, it would be a lot of work. It’s
> > probably a
> > > > > > better use of time to convert the manual ones to auto gen first
> > (with the
> > > > > > possible exception of Fetch/Produce, where the ROI may be higher
> > for the
> > > > > > manual work)
> > > > > >
> > > > > > Yeah, that makes sense. Maybe we can include the version bump for
> > all RPCs
> > > > > > in this KIP, but we can implement it lazily as the protocols are
> > converted.
> > > > > >
> > > > > > -Jason
> > > > > >
> > > > > > On Mon, Aug 12, 2019 at 7:16 PM Colin McCabe 
> > wrote:
> > > > > >
> > > > > > > On Mon, Aug 12, 2019, at 11:22, Jason Gustafson wrote:
> > > > > > > > Hi Colin,
> > > > > > > >
> > > > > > > > Thanks for the KIP! This is a significant improvement. One of
> > my
> > > > > > personal
> > > > > > > > interests in this proposal is solving the compatibility
> > problems we
> > > > > > have
> > > > > > > > with the internal schemas used to define consumer offsets and
> > > > > > transaction
> > > > > > > > metadata. Currently we have to guard schema bumps with the
> > inter-broker
> > > > > > > > protocol format. Once the format is bumped, there is no way to
> > > > > > downgrade.
> > > > > > > > By fixing this, we can potentially begin using the new schemas
> > before
> > > > > > the
> > > > > > > > IBP is bumped while still allowing downgrade.
> > > > > > > >
> > > > > > > > There are a surprising number of other situations we have
> > encountered
> > > > > > > this
> > > > > > > > sort of problem. We have hacked around it in 

Re: [DISCUSS] KIP-486 Support for pluggable KeyStore and TrustStore

2019-08-19 Thread Colin McCabe
Hi Maulin,

A lot of JSSE providers don't implement custom algorithms.  Spire is a good 
example of a JSSE provider that doesn't, and yet is still useful to many 
people.  Your JSSE provider can work fine even if it doesn't implement a custom 
algorithm.

Maybe I'm missing something, but I don't understand the discussion of 
Security.insertProviderAt() that you included.  SslEngineBuilder doesn't use 
that API to get the security provider.  Instead, it calls 
"SSLContext.getInstance(protocol, provider)", where provider is the name of the 
provider.
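
For anyone following along, here is a tiny example of the lookup being
described; the protocol and provider names are just placeholders for
whatever is configured:

import javax.net.ssl.SSLContext;

public class SslContextLookup {
    public static void main(String[] args) throws Exception {
        // Looks up an SSLContext implementation by provider *name*, which only
        // works if a provider with that name is already registered.
        SSLContext byName = SSLContext.getInstance("TLSv1.2", "SunJSSE");

        // Without a provider name, the highest-priority registered provider
        // supporting the protocol wins, which is why provider ordering comes
        // up elsewhere in this thread.
        SSLContext byPriority = SSLContext.getInstance("TLSv1.2");

        System.out.println(byName.getProvider().getName());
        System.out.println(byPriority.getProvider().getName());
    }
}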

best,
Colin


On Sat, Aug 17, 2019, at 20:13, Maulin Vasavada wrote:
> On top of everything above I feel strongly to add the 4th point which is
> based on Java APIs for TrustManagerFactory.init(KeyStore) (
> https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/TrustManagerFactory.html#init(java.security.KeyStore))
> and KeyManagerFactory.init(KeyStore, char[]) (
> https://docs.oracle.com/javase/7/docs/api/javax/net/ssl/KeyManagerFactory.html#init(java.security.KeyStore,%20char[])
> ).
> 
> 4. The above APIs are intended to support providing "trust/key material"
> from the user without having to write their own TrustManager/KeyManagers.
> To quote from the TrustManagerFactory.init()'s documentation "Initializes
> this factory with a source of certificate authorities and related trust
> material."
> To quote from the KeyManagerFactory.init()'s documentation "Initializes
> this factory with a source of key material."
> 
> Based on this it is clear that there is a flexibility provided by Java to
> to enable developers to provide the required trust/key material loaded from
> "anywhere" without requiring them to write custom provider OR trust/key
> managers. This same flexibility is reflected in Kafka code also where it
> loads the trust/keys from a local file and doesn't require writing a
> Provider necessarily. If we do NOT have a custom algorithm, it makes less
> sense to write a Provider.
> 
> Thanks
> Maulin
> 
> 
> 
> 
> 
> 
> On Thu, Aug 15, 2019 at 11:45 PM Maulin Vasavada 
> wrote:
> 
> > Hi Harsha/Colin
> >
> > I did the sample with a custom Provider for TrustStoreManager and tried
> > using ssl.provider Kafka config AND the way KIP-492 is suggesting (by
> > adding Provider programmatically instead of relying on
> > ssl.provider+java.security. The below sample is followed by my detailed
> > findings. I'll appreciate if you can go through it carefully and see if you
> > see my point.
> >
> > package providertest;
> >
> > import java.security.Provider;
> >
> > public class MyProvider extends Provider {
> >
> > private static final String name = "MyProvider";
> > private static double version = 1.0d;
> > private static String info = "Maulin's SSL Provider v"+version;
> >
> > public MyProvider() {
> > super(name, version, info);
> > this.put("TrustManagerFactory.PKIX", 
> > "providertest.MyTrustManagerFactory");
> > }
> > }
> >
> >
> >
> > *Details:*
> >
> > KIP-492 documents that it will use Security.addProvider() assuming it will
> > add it as position '0' which is not a correct assumption. The
> > addProvider()'s documentation says it will add it to the last available
> > position. You may want to correct that to say
> > Security.insertProviderAt(provider, 1).
> >
> > Now coming back to our specific discussion,
> >
> > 1. SPIFFE example uses Custom Algorithm - spiffe. Hence when you add that
> > provider in the provider list via Security.addProvider() the position where
> > it gets added doesn't matter (even if you don't end up adding it as first
> > entry) since that is the ONLY provider for SPIFFE specific algorithm you
> > might have.
> >
> > We do *not* have custom algorithm for Key/Trust StoreMangers. Which means
> > we have to use X509, PKIX etc "Standard Algorithms" ((
> > https://docs.oracle.com/javase/7/docs/technotes/guides/security/StandardNames.html))
> > in our provider to override the TrustStoreManager (see my sample code) and
> > KeyStoreManger and KeyManager. This creates another challenge mentioned in
> > the below point.
> >
> > 2. In order to make our Provider for loading custom TrustStore work, we
> > have to add the provider as 'first' in the list since there are others with
> > the same algorithm.
> >
> > However, the programatic way of adding provider
> > (Security.insertProviderAt()) is *not* deterministic for ordering since
> > different code can call the method for a different provider and depending
> > upon the order of the call our provider can be first or pushed down the
> > list. This can happen very well in any client application using Kafka. This
> > is specially problematic for a case when you want to guarantee order for a
> > Provider having "Standard Algorithms".
> >
> > If we add our provider in java.security file that definitely guarantees
> > the order(unless somebody calls removeProvider() which is less likely). But
> > if we add our provider in java.security file it will defeat 

[jira] [Created] (KAFKA-8816) RecordCollector offsets updated indirectly by StreamTask

2019-08-19 Thread Chris Pettitt (Jira)
Chris Pettitt created KAFKA-8816:


 Summary: RecordCollector offsets updated indirectly by StreamTask
 Key: KAFKA-8816
 URL: https://issues.apache.org/jira/browse/KAFKA-8816
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Chris Pettitt
Assignee: Chris Pettitt


Currently it is possible to indirectly update the offsets in 
RecordCollectorImpl via the offset read function:
{code:java}
@Override
public Map<TopicPartition, Long> offsets() {
    return offsets;
}
{code}
The offsets map here is a private final field in RecordCollectorImpl. It
appears that the intent is for this field to be updated only when the producer 
acknowledges an offset. However, because it is handed back in a mutable form, 
it is possible to update offsets through this call, as actually happens today 
in StreamTask:
{code:java}
protected Map<TopicPartition, Long> activeTaskCheckpointableOffsets() {
    final Map<TopicPartition, Long> checkpointableOffsets =
        recordCollector.offsets();
    for (final Map.Entry<TopicPartition, Long> entry :
        consumedOffsets.entrySet()) {
        checkpointableOffsets.putIfAbsent(entry.getKey(), entry.getValue());
    }

    return checkpointableOffsets;
}
{code}
Here it is possible to set a new checkpoint if the topic partition is not 
already in the offsets map, which happens for the input topic when we're using 
optimized topologies and a KTable. The effect is that we continue to checkpoint 
the first offset seen (putIfAbsent).

It seems the correct behavior would be to return a read only view of the 
offsets from RecordCollectorImpl and create a copy of the returned map in 
activeTaskCheckpointableOffsets before we mutate it.
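
A rough sketch of that suggestion, reusing the field and method names above
(imports of java.util.Collections and java.util.HashMap are assumed, and the
surrounding classes are as in the snippets above):
{code:java}
// Sketch only: hand out an unmodifiable view from RecordCollectorImpl...
@Override
public Map<TopicPartition, Long> offsets() {
    return Collections.unmodifiableMap(offsets);
}

// ...and copy before mutating in StreamTask.
protected Map<TopicPartition, Long> activeTaskCheckpointableOffsets() {
    final Map<TopicPartition, Long> checkpointableOffsets =
        new HashMap<>(recordCollector.offsets());
    for (final Map.Entry<TopicPartition, Long> entry : consumedOffsets.entrySet()) {
        checkpointableOffsets.putIfAbsent(entry.getKey(), entry.getValue());
    }
    return checkpointableOffsets;
}
{code}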



--
This message was sent by Atlassian Jira
(v8.3.2#803003)


Re: all tests passed but still got ‘Some checks were not successful’

2019-08-19 Thread Colin McCabe
I think we've been having some test runs hit the 4-hour time limit recently.  I 
saw a few of those as well.

This is just a guess, but I bet that there is a test that is hanging, and has 
no time limit configured.

We should probably have time limits on every test.  The person writing the test 
should know what the most appropriate time limit is (30 seconds, 5 minutes, 
whatever) and should put in a timeout with code like this:

>@Rule
>final public Timeout globalTimeout = Timeout.millis(12);

That will avoid wasting a lot of Jenkins (and developer!) time.
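
For example, a self-contained JUnit 4 sketch of the suggestion (the class
name and the 2-minute limit below are just placeholders):

import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;

public class ExampleTimeoutTest {

    // Applies to every test method in this class; the test fails with a
    // timeout instead of hanging the whole Jenkins run.
    @Rule
    public final Timeout globalTimeout = Timeout.seconds(120);

    @Test
    public void shouldFinishQuickly() {
        // real assertions would go here
    }
}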

best,
Colin


On Sun, Aug 18, 2019, at 08:19, M. Manna wrote:
> Could you send a retest? Add a “Retest this please” comment. It’ll kick off.
> 
> On Sun, 18 Aug 2019 at 16:16, hong mao  wrote:
> 
> > Hi all,
> > I submitted a PR and all test cases passed in Jenkins, but I still got a
> > 'Some checks were not successful' message. Could anybody give some advice?
> > Here's the PR link https://github.com/apache/kafka/pull/7153
> > [image: image.png]
> >
> > Thanks!
> >
>


Re: [VOTE] KIP-504 - Add new Java Authorizer Interface

2019-08-19 Thread David Jacot
+1 (non-binding)

Thanks for the KIP, Rajini.

Best,
David

On Sun, Aug 18, 2019 at 9:42 PM Mickael Maison 
wrote:

> +1 (non binding)
> Thanks for the KIP!
>
> On Sun, Aug 18, 2019 at 3:05 PM Ron Dagostino  wrote:
> >
> > +1 (non-binding)
> >
> > Thanks, Rajini.
> >
> > Ron
> >
> > On Sat, Aug 17, 2019 at 10:16 AM Harsha Chintalapani 
> > wrote:
> >
> > > +1 (binding).
> > >
> > > Thanks,
> > > Harsha
> > >
> > >
> > > On Sat, Aug 17, 2019 at 2:53 AM, Manikumar 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > +1 (binding).
> > > >
> > > > Thanks for the KIP. LGTM.
> > > >
> > > > Regards,
> > > > Manikumar
> > > >
> > > > On Sat, Aug 17, 2019 at 4:41 AM Colin McCabe 
> wrote:
> > > >
> > > > +1 (binding)
> > > >
> > > > Thanks, Rajini!
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > On Fri, Aug 16, 2019, at 09:52, Rajini Sivaram wrote:
> > > >
> > > > Hi all,
> > > >
> > > > I would like to start the vote for KIP-504:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/
> > > > KIP-504+-+Add+new+Java+Authorizer+Interface
> > > >
> > > > This KIP replaces the Scala Authorizer API with a new Java API
> similar to
> > > > other pluggable APIs in the broker and addresses known limitations
> in the
> > > > existing API.
> > > >
> > > > Thanks for all the feedback!
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > >
> > >
>


Re: [DISCUSS] KIP-482: The Kafka Protocol should Support Optional Fields

2019-08-19 Thread David Jacot
Hi Colin,

Thank you for the updated KIP.

wrt. Request and Response Headers v1, where do you plan to put the optional
fields? You mention that all the Requests and Responses will start with a
tagged field buffer. Is it for that purpose?

Best,
David

On Mon, Aug 19, 2019 at 7:27 AM Satish Duggana 
wrote:

> Please read struct type as a complex record type in my earlier mail.
> The complex type seems to be defined as Schema[1] in the protocol
> types.
>
> 1.
> https://github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/common/protocol/types/Schema.java#L27
>
>
> On Mon, Aug 19, 2019 at 9:46 AM Satish Duggana 
> wrote:
> >
> > Sorry! Colin, I may not have been clear in my earlier query about
> > optional field type restriction. It is mentioned in one of your
> > replies "optional fields are serialized starting with their total
> > length". This brings the question of whether optional fields support
> > struct types (with or without array values). It seems struct types are
> > currently not serialized with total length. I may be missing something
> > here.
> >
> > Thanks,
> > Satish.
> >
> >
> > On Wed, Aug 14, 2019 at 8:03 AM Satish Duggana 
> wrote:
> > >
> > > Hi Colin,
> > > Thanks for the KIP. Optional fields and var length encoding support is
> a great
> > > improvement for the protocol.
> > >
> > > >>Optional fields can have any type, except that they cannot be arrays.
> > > Note that the restriction against having tagged arrays is just to
> simplify
> > > serialization.  We can relax this restriction in the future without
> changing
> > > the protocol on the wire.
> > >
> > > Can an Optional field have a struct type which internally contains an
> array
> > > field at any level?
> > >
> > > Thanks,
> > > Satish.
> > >
> > >
> > >
> > > On Tue, Aug 13, 2019 at 11:49 PM David Jacot 
> wrote:
> > > >
> > > > Hi Colin,
> > > >
> > > > Thank you for the KIP! Things are well explained! It is a huge
> improvement
> > > > for the Kafka protocol. I have a few comments on the proposal:
> > > >
> > > > 1. The interleaved tag/length header sounds like a great
> optimisation as it
> > > > would be shorter on average. The downside, as
> > > > you already pointed out, is that it makes the decoding and the specs
> more
> > > > complex. Personally, I would also favour using two
> > > > varints in this particular case to keep things simple.
> > > >
> > > > 2. As discussed, I wonder if it would make sense to extend to KIP to
> also
> > > > support optional fields in the Record Header. I think
> > > > that it could be interesting to have such capability for common
> fields
> > > > across all the requests or responses (e.g. tracing id).
> > > >
> > > > Regards,
> > > > David
> > > >
> > > >
> > > >
> > > > On Tue, Aug 13, 2019 at 10:00 AM Jason Gustafson 
> wrote:
> > > >
> > > > > > Right, I was planning on doing exactly that for all the
> auto-generated
> > > > > RPCs. For the manual RPCs, it would be a lot of work. It’s
> probably a
> > > > > better use of time to convert the manual ones to auto gen first
> (with the
> > > > > possible exception of Fetch/Produce, where the ROI may be higher
> for the
> > > > > manual work)
> > > > >
> > > > > Yeah, that makes sense. Maybe we can include the version bump for
> all RPCs
> > > > > in this KIP, but we can implement it lazily as the protocols are
> converted.
> > > > >
> > > > > -Jason
> > > > >
> > > > > On Mon, Aug 12, 2019 at 7:16 PM Colin McCabe 
> wrote:
> > > > >
> > > > > > On Mon, Aug 12, 2019, at 11:22, Jason Gustafson wrote:
> > > > > > > Hi Colin,
> > > > > > >
> > > > > > > Thanks for the KIP! This is a significant improvement. One of
> my
> > > > > personal
> > > > > > > interests in this proposal is solving the compatibility
> problems we
> > > > > have
> > > > > > > with the internal schemas used to define consumer offsets and
> > > > > transaction
> > > > > > > metadata. Currently we have to guard schema bumps with the
> inter-broker
> > > > > > > protocol format. Once the format is bumped, there is no way to
> > > > > downgrade.
> > > > > > > By fixing this, we can potentially begin using the new schemas
> before
> > > > > the
> > > > > > > IBP is bumped while still allowing downgrade.
> > > > > > >
> > > > > > > There are a surprising number of other situations we have
> encountered
> > > > > > this
> > > > > > > sort of problem. We have hacked around it in special cases by
> allowing
> > > > > > > nullable fields to the end of the schema, but this is not
> really an
> > > > > > > extensible approach. I'm looking forward to having a better
> option.
> > > > > >
> > > > > > Yeah, this problem keeps coming up.
> > > > > >
> > > > > > >
> > > > > > > With that said, I have a couple questions on the proposal:
> > > > > > >
> > > > > > > 1. For each request API, we need one version bump to begin
> support for
> > > > > > > "flexible versions." Until then, we won't have the option of
> using
> > > > > tagged
> > > > > > > fields even if the 

Permissions to create a KIP

2019-08-19 Thread W. D.
Hi, I would like to create a KIP with my username, werner.daehn.

Thanks in advance

Werner