Re: [DISCUSS] KIP-491: Preferred Leader Deprioritized List (Temporary Blacklist)

2019-08-06 Thread George Li
 Hi Colin, Satish, Stanislav, 

Did I answer all your comments/concerns for KIP-491?  Please let me know if 
you have more questions regarding this feature.  I would like to start coding 
soon. I hope this feature can get into the open source trunk so that we don't 
need to cherry-pick it every time we upgrade Kafka in our environment.

BTW, I have added the following to KIP-491 to describe the 
auto.leader.rebalance.enable behavior with the new preferred leader "blacklist":

"When auto.leader.rebalance.enable is enabled, the broker(s) in the preferred 
leader "blacklist" should be excluded from being elected leaders."


Thanks,
George

On Friday, August 2, 2019, 08:02:07 PM PDT, George Li 
 wrote:  
 
  Hi Colin,
Thanks for looking into this KIP.  Sorry for the late response; I've been busy. 

If a cluster has MANY topic partitions, moving this "blacklisted" broker to the 
end of the replica list is still a rather "big" operation, since it involves 
submitting reassignments.  The KIP-491 style of blacklisting is much simpler and 
can be undone easily without changing the replica assignment ordering. 

The major use case for me: a failed broker gets swapped with new hardware and 
starts up empty (at the latest offsets, with no data for any partition). The 
retention SLA is 1 day, so until this broker has been in sync for 1 day we 
would like to blacklist it from serving leader traffic. After 1 day, the 
blacklist is removed and a preferred leader election is run.  This way there is 
no need to run reassignments before and after.  This is the "temporary" use case.
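
To make the "temporary" workflow concrete, after the blacklist entry is removed 
the preferred leader election could be kicked off with the AdminClient; a rough 
sketch (the topic, partitions and bootstrap address are placeholders, and in 
practice the affected partitions would be discovered via describeTopics()):

import java.util.Arrays;
import java.util.Collection;
import java.util.Properties;

import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.common.TopicPartition;

public class PreferredLeaderElectionSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // Partitions whose preferred leader is the broker that was rebuilt
            // and has now been in sync for the 1-day retention SLA.
            Collection<TopicPartition> partitions = Arrays.asList(
                    new TopicPartition("orders", 0),
                    new TopicPartition("orders", 1));
            // Move leadership back to the preferred replicas (KIP-183 API,
            // available since Kafka 2.2) and wait for the elections to finish.
            admin.electPreferredLeaders(partitions).partitions().get();
        }
    }
}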

There are also use cases where this preferred leader "blacklist" can be somewhat 
permanent, as I explained for the heterogeneous-hardware case (AWS data center 
instances vs. on-premises bare metal machines), where the AWS broker_ids would 
be blacklisted.  Then newly created topics, or existing topic expansions, would 
not have those brokers serve traffic even when they are the preferred leader. 

Please let me know if there are more questions. 


Thanks,
George

    On Thursday, July 25, 2019, 08:38:28 AM PDT, Colin McCabe 
 wrote:  
 
 We still want to give the "blacklisted" broker the leadership if nobody else 
is available.  Therefore, isn't putting a broker on the blacklist pretty much 
the same as moving it to the last entry in the replicas list and then 
triggering a preferred leader election?

If we want this to be undone after a certain amount of time, or under certain 
conditions, that seems like something that would be more effectively done by an 
external system, rather than putting all these policies into Kafka.

best,
Colin


On Fri, Jul 19, 2019, at 18:23, George Li wrote:
>  Hi Satish,
> Thanks for the reviews and feedbacks.
> 
> > > The following is the requirements this KIP is trying to accomplish:
> > This can be moved to the"Proposed changes" section.
> 
> Updated the KIP-491. 
> 
> > >>The logic to determine the priority/order of which broker should be
> > preferred leader should be modified.  The broker in the preferred leader
> > blacklist should be moved to the end (lowest priority) when
> > determining leadership.
> >
> > I believe there is no change required in the ordering of the preferred
> > replica list. Brokers in the preferred leader blacklist are skipped
> > until other brokers in the list are unavailable.
> 
> Yes. The partition assignment remains the same (replicas and their 
> ordering). The blacklist logic can be optimized during implementation. 
> 
> > >>The blacklist can be at the broker level. However, there might be use 
> > >>cases
> > where a specific topic should blacklist particular brokers, which
> > would be at the
> > Topic level Config. For this use cases of this KIP, it seems that broker 
> > level
> > blacklist would suffice.  Topic level preferred leader blacklist might
> > be future enhancement work.
> > 
> > I agree that the broker level preferred leader blacklist would be
> > sufficient. Do you have any use cases which require topic level
> > preferred blacklist?
> 
> 
> 
> I don't have any concrete use cases for a topic-level preferred leader 
> blacklist.  One scenario I can think of: when a broker has high CPU 
> usage, we try to identify the big topics (high MsgIn, high BytesIn, 
> etc.) and move their leaders away from this broker.  Before doing an 
> actual reassignment to change the preferred leaders, we could put this 
> broker in the topic-level preferred_leader_blacklist config, run a 
> preferred leader election, and see whether CPU decreases for this 
> broker.  If yes, then do the reassignments to make the preferred leader 
> change "permanent" (the topic may have many partitions, e.g. 256, with 
> quite a few of them having this broker as preferred leader).  So this 
> topic-level config is an easy way to do a trial and check the result. 
> 
> 
> > You can add the below workaround as an item in the rejected alternatives 
> > section
> > "Reassigning all the topic/partitions which the intended broker is a
> > replica for."
> 
> Updated the KIP-491. 
> 
> 
> 
> 

Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-08-06 Thread Boyang Chen
Thank you for the suggestions, Jason. And a side note for Guozhang: I
updated the KIP to reflect the dependency on 447.

On Tue, Aug 6, 2019 at 11:35 AM Jason Gustafson  wrote:

> Hi Boyang, thanks for the updates. I have a few more comments:
>
> 1. We are adding some new fields to TxnOffsetCommit to support group-based
> fencing. Do we need these fields to be persisted in the offsets topic to
> ensure that the fencing still works after a coordinator failover?
>
We already persist member.id, instance.id and generation.id in the offsets
topic; what extra fields do we need to store?


> 2. Since you are proposing a new `groupMetadata` API, have you considered
> whether we still need the `initTransactions` overload? Another way would be
> to pass it through the `sendOffsetsToTransaction` API:
>
> void sendOffsetsToTransaction(Map<TopicPartition, OffsetAndMetadata>
> offsets, GroupMetadata groupMetadata) throws
> ProducerFencedException, IllegalGenerationException;
>
> This seems a little more consistent with the current API and avoids the
> direct dependence on the Consumer in the producer.
>
Note that although we avoid one dependency on the consumer, the producer
needs to periodically update its group metadata; in this case the caller of
sendOffsetsToTransaction(Map<TopicPartition, OffsetAndMetadata> offsets,
GroupMetadata groupMetadata) is responsible for supplying the latest value
of the group metadata.
This should be easy on the Streams side, as StreamsPartitionAssignor can
reflect metadata changes upon #onAssignment(), but a non-Streams user has
to code the callback by hand. Do you think the convenience we sacrifice
here is worth the simplification benefit?
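
To illustrate, a minimal sketch of what a non-Streams caller would have to do
with the proposed overload (the overload itself is the API under discussion and
does not exist in released clients; ConsumerGroupMetadata is the existing class
from KIP-429 that Guozhang mentions below):

import java.util.Map;

import org.apache.kafka.clients.consumer.ConsumerGroupMetadata;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.common.TopicPartition;

public class Kip447CommitSketch {
    // The caller must pass the *latest* group metadata (e.g. refreshed from a
    // rebalance callback); keeping it current is the caller's responsibility
    // under this API shape, which is the convenience trade-off discussed above.
    static void commitWithinTransaction(KafkaProducer<byte[], byte[]> producer,
                                        Map<TopicPartition, OffsetAndMetadata> offsets,
                                        ConsumerGroupMetadata groupMetadata) {
        producer.beginTransaction();
        // ... send the output records that correspond to these offsets ...
        producer.sendOffsetsToTransaction(offsets, groupMetadata); // proposed overload
        producer.commitTransaction();
    }
}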


> 3. Can you clarify the behavior of the clients when the brokers do not
> support the latest API versions? This is both for the new TxnOffsetCommit
> and the OffsetFetch APIs. I guess the high level idea in streams is to
> detect broker support before instantiating the producer and consumer. I
> think that's reasonable, but we might need some approach for non-streams
> use cases. One option I was considering is enforcing the latest version
> through the new `sendOffsetsToTransaction` API. Basically when you use the
> new API, we require support for the latest TxnOffsetCommit version. This
> puts some burden on users, but it avoids breaking correctness assumptions
> when the new APIs are in use. What do you think?
>
Yes, I don't think we have covered this case; the plan is to crash the
non-Streams application when the job is using the new sendOffsets API.

>
> -Jason
>
>
>
>
> On Mon, Aug 5, 2019 at 6:06 PM Boyang Chen 
> wrote:
>
> > Yep, Guozhang I think that would be best as passing in an entire consumer
> > instance is indeed cumbersome.
> >
> > Just saw you updated KIP-429, I will follow-up to change 447 as well.
> >
> > Best,
> > Boyang
> >
> > On Mon, Aug 5, 2019 at 3:18 PM Guozhang Wang  wrote:
> >
> > > Okay, I think I understand your concerns about ConsumerGroupMetadata
> > > now: if we still want to only call initTxns once, then we should allow
> > > whatever passed-in parameter to reflect the latest value of the
> > > generation id whenever sending the offset fetch request.
> > >
> > > Whereas the current ConsumerGroupMetadata is a static object.
> > >
> > > Maybe we can consider having an extended class of ConsumerGroupMetadata
> > > whose values are updated from the consumer's rebalance callback?
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Mon, Aug 5, 2019 at 9:26 AM Boyang Chen  >
> > > wrote:
> > >
> > > > Thank you Guozhang for the reply! I'm curious whether KIP-429 has
> > > > reflected the latest change on ConsumerGroupMetadata? Also, regarding
> > > > question one, the group metadata needs to be accessed via a callback;
> > > > does that mean we need a separate producer API such as
> > > > "producer.refreshMetadata(groupMetadata)" to be able to access it
> > > > instead of passing in the consumer instance?
> > > >
> > > > Boyang
> > > >
> > > > On Fri, Aug 2, 2019 at 4:36 PM Guozhang Wang 
> > wrote:
> > > >
> > > > > Thanks Boyang,
> > > > >
> > > > > I've made another pass on KIP-447 as well as
> > > > > https://github.com/apache/kafka/pull/7078, and have some minor
> > > comments
> > > > > about the proposed API:
> > > > >
> > > > > 1. it seems instead of needing the whole KafkaConsumer object,
> you'd
> > > only
> > > > > need the "ConsumerGroupMetadata", in that case can we just pass in
> > that
> > > > > object into the initTxns call?
> > > > >
> > > > > 2. the current trunk already has a public class named
> > > > > (ConsumerGroupMetadata)
> > > > > under o.a.k.clients.consumer created by KIP-429. If we want to just
> > use
> > > > > that then maybe it makes less sense to declare a base GroupMetadata
> > as
> > > we
> > > > > are already leaking such information on the assignor anyways.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Tue, Jul 30, 2019 at 1:55 PM Boyang Chen <
> > > reluctanthero...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Thank 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Harsha Ch
Hi Colin,
"Hmm... I'm not sure I follow.  Users don't have to build their own
tooling, right?  They can use any of the shell scripts that we've shipped
in the last few releases.  For example, if any of your users run it, this
shell script will delete all of the topics from your non-security-enabled
cluster:

./bin/kafka-topics.sh --bootstrap-server localhost:9092 --list 2>/dev/null
| xargs -l ./bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete
--topic

They will need to fill in the correct bootstrap servers list, of course,
not localhost.  This deletion script will work on some pretty old brokers,
even back to the 0.10 releases.  It seems a little odd to trust your users
with this power, but not trust them to avoid changing a particular
configuration key."

The above will be blocked by the server if we set delete.topic.enable to false,
and that's exactly what I am asking for.

"The downside is that if we wanted to check a server side configuration
before sending the create topics request, the code would be more complex.
The behavior would also not be consistent with how topic auto-creation is
handled in Kafka Streams."

I am not sure why you need to check the server-side configuration before
sending the create topics request. A user enables the producer-side config to
create topics. The producer sends a request to the broker, and if the broker
has auto.create.topics.enable set to true (the default) it will allow creation
of the topic; if it is set to false, it returns an error back to the client.
I don't see how this behavior would be different in Kafka Streams. By default
the server allows topic creation, and with this KIP it will only allow creation
of a topic when both the producer and server side are turned on.
That is exactly the same behavior as in KIP-361.

"In general, it would be nice if we could keep the security and access
control model simple and not introduce a lot of special cases and
exceptions.  Kafka has basically converged on using ACLs and
CreateTopicPolicy classes to control who can create topics and where.
Adding more knobs seems like a step backwards, especially when the proposed
knobs don't work consistently across components, and don't provide true
security."
This is not about access control at all; shipping sane defaults should be
prioritized. We keep talking about CreateTopicPolicy, and yet we don't ship a
default one, and asking every user of Kafka to implement their own doesn't
make sense here. I am asking for exactly the behavior that we shipped for the
consumer side in
https://cwiki.apache.org/confluence/display/KAFKA/KIP-361%3A+Add+Consumer+Configuration+to+Disable+Auto+Topic+Creation
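
To be concrete about the default policy I keep asking for, it could be as small
as the sketch below (the partition and replication limits are made up purely for
illustration; a policy like this is wired in via the broker's
create.topic.policy.class.name setting):

import java.util.Map;

import org.apache.kafka.common.errors.PolicyViolationException;
import org.apache.kafka.server.policy.CreateTopicPolicy;

// Illustrative only: reject topics that ask for more than 64 partitions or a
// replication factor other than 3.
public class DefaultingCreateTopicPolicy implements CreateTopicPolicy {

    @Override
    public void configure(Map<String, ?> configs) {
        // Limits could be read from broker config here; hard-coded below for brevity.
    }

    @Override
    public void validate(RequestMetadata request) throws PolicyViolationException {
        Integer partitions = request.numPartitions();        // null when explicit assignments are given
        Short replicationFactor = request.replicationFactor();
        if (partitions != null && partitions > 64) {
            throw new PolicyViolationException("Topic " + request.topic()
                    + ": at most 64 partitions are allowed");
        }
        if (replicationFactor != null && replicationFactor != 3) {
            throw new PolicyViolationException("Topic " + request.topic()
                    + ": replication factor must be 3");
        }
    }

    @Override
    public void close() {
    }
}

Shipping something along these lines as a default, with configurable limits, is
the kind of sane default I mean.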

I still haven't gotten any response on why this behavior shouldn't be exactly
like the Kafka consumer's, and why we want to let the producer override the
server-side config while not allowing the consumer to do so. We are not even
keeping the same contract, and this will cause more confusion from the users'
standpoint.

Thanks,
Harsha



On Tue, Aug 6, 2019 at 9:09 PM Colin McCabe  wrote:

> On Tue, Aug 6, 2019, at 18:06, Harsha Ch wrote:
> > Not sure how the AdminClient applies here, It is an external API and
> > is not part of KafkaProducer so any user who updates to latest version of
> > Kafka with this feature get to create the topics.
> > They have to build a tooling around AdminClient allow themselves to
> create
> > topics.
>
> Hi Harsha,
>
> Hmm... I'm not sure I follow.  Users don't have to build their own
> tooling, right?  They can use any of the shell scripts that we've shipped
> in the last few releases.  For example, if any of your users run it, this
> shell script will delete all of the topics from your non-security-enabled
> cluster:
>
> ./bin/kafka-topics.sh --bootstrap-server localhost:9092 --list 2>/dev/null
> | xargs -l ./bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete
> --topic
>
> They will need to fill in the correct bootstrap servers list, of course,
> not localhost.  This deletion script will work on some pretty old brokers,
> even back to the 0.10 releases.  It seems a little odd to trust your users
> with this power, but not trust them to avoid changing a particular
> configuration key.
>
> > There is no behavior in Kafka producer that allowed users to
> > delete the topics or delete the records. So citing them as an
> > example doesn't makes sense in this context.
>
> I think Kafka Streams is relevant here.  After all, it's software that we
> ship as part of the official Kafka release.  And it auto-creates topics
> even when auto.create.topics.enable is set to false on the broker.
>
> I think that auto.create.topics.enable was never intended as a security
> setting to control access.  It was intended as a way to disable the
> broker-side auto-create feature specifically, because there were some
> problems with that specific feature.  Broker-side auto-creation is
> frustrating because it's on by default, and because it can auto-create
> topics even if the producer or consumer didn't explicitly ask for that.
> Neither of those 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Colin McCabe
On Tue, Aug 6, 2019, at 18:06, Harsha Ch wrote:
> Not sure how the AdminClient applies here, It is an external API and
> is not part of KafkaProducer so any user who updates to latest version of
> Kafka with this feature get to create the topics.
> They have to build a tooling around AdminClient allow themselves to create
> topics.

Hi Harsha,

Hmm... I'm not sure I follow.  Users don't have to build their own tooling, 
right?  They can use any of the shell scripts that we've shipped in the last 
few releases.  For example, if any of your users run it, this shell script will 
delete all of the topics from your non-security-enabled cluster:

./bin/kafka-topics.sh --bootstrap-server localhost:9092 --list 2>/dev/null | 
xargs -l ./bin/kafka-topics.sh --bootstrap-server localhost:9092 --delete 
--topic

They will need to fill in the correct bootstrap servers list, of course, not 
localhost.  This deletion script will work on some pretty old brokers, even 
back to the 0.10 releases.  It seems a little odd to trust your users with this 
power, but not trust them to avoid changing a particular configuration key.

> There is no behavior in Kafka producer that allowed users to
> delete the topics or delete the records. So citing them as an
> example doesn't makes sense in this context.

I think Kafka Streams is relevant here.  After all, it's software that we ship 
as part of the official Kafka release.  And it auto-creates topics even when 
auto.create.topics.enable is set to false on the broker.

I think that auto.create.topics.enable was never intended as a security setting 
to control access.  It was intended as a way to disable the broker-side 
auto-create feature specifically, because there were some problems with that 
specific feature.  Broker-side auto-creation is frustrating because it's on by 
default, and because it can auto-create topics even if the producer or consumer 
didn't explicitly ask for that.  Neither of those problems applies to this KIP: 
producers have to specifically opt in, and it won't be on by default.  
Basically, we think that client-side auto-creation is actually a lot better-- 
hence this KIP :)

> But there is
> a functionality which allowed creation of topics if they don't exist in the
> cluster and this behavior could be controlled by a config on the server
> side. Now with this KIP we are allowing producer to make that decision
> without any gateway on the server via configs. Any user who is not aware of
> such changes
> can accidentally create these topics and we are essentially removing a
> config that exists in brokers today to block this accidental creation and
> allowing clients to take control.

Again, I hope I'm not misinterpreting, but I don't see how this can be turned 
on accidentally.  The user would have to specifically turn this on in the 
producer by setting the configuration key.

>   I still didn't get any positives of not having server side configs?
> if you want to turn them on and allow any client to create topics set the
> default to true
> and allow users who doesn't want to have this feature let them turn them
> off. It will be the exact behavior as it is today, as far as producer is
> concerned. I am not
> understanding why not having server side configs to gateways such a hard
> requirement and this behavior exists today. As far I am concerned this
> breaks the backward compatibility.

The downside is that if we wanted to check a server side configuration before 
sending the create topics request, the code would be more complex.  The 
behavior would also not be consistent with how topic auto-creation is handled 
in Kafka Streams.

In general, it would be nice if we could keep the security and access control 
model simple and not introduce a lot of special cases and exceptions.  Kafka 
has basically converged on using ACLs and CreateTopicPolicy classes to control 
who can create topics and where.  Adding more knobs seems like a step 
backwards, especially when the proposed knobs don't work consistently across 
components, and don't provide true security.

Maybe we should support an insecure mode where there are still users and ACLs.  
That would help people who don't want to set up Kerberos, etc. but still want 
to protect against misconfigured clients or accidents.  Hadoop has something 
like this, and I think it was useful.  NFS also supported (supports?) a mode 
where you just pass whatever user ID you want and the system believes you.  
These things clearly don't protect against malicious users, but they can help 
set up policies when needed.

best,
Colin

> 
> Thanks,
> Harsha
> 
> 
> On Tue, Aug 6, 2019 at 4:02 PM Colin McCabe  wrote:
> 
> > Hi Harsha,
> >
> > Thanks for taking a look.
> >
> > I'm not sure I follow the discussion about AdminClient.  KafkaAdminClient
> > has been around for about 2 years at this point as a public class.  There
> > are many programs that use it to automatically create topics.  Kafka
> > Streams does this, for 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Satish Duggana
Hi Justine,
Thanks for the clarifications.

I understand that auto-creation of topics will happen through a
CreateTopic request instead of the metadata request. What I meant in
the earlier mail is that the producer client should not override the
broker config for auto-creation of topics. I agree with Harsha's other
mail about this behavior. If auto-creation is disabled on the broker,
producer clients should never be allowed to create topics even if
'allow.auto.create.topics' is true in the producer client.
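
To spell that out, the expectation is that the proposed producer-side flag only
opts the client in, with the broker config still acting as the gate; a rough
sketch (the producer property name is the one used in this thread for the
proposed KIP-487 config and is not in any released client):

import java.util.Properties;

import org.apache.kafka.clients.producer.ProducerConfig;

public class Kip487ProducerOptInSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // Hypothetical KIP-487 opt-in, named as in this thread; under the
        // behavior argued for here, the producer could still only create a
        // missing topic if the broker also has auto.create.topics.enable=true.
        props.put("allow.auto.create.topics", "true");
        // new KafkaProducer<>(props) would then issue a CreateTopics request
        // for a missing topic instead of relying on metadata-request creation.
    }
}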

On Wed, Aug 7, 2019 at 1:28 AM Justine Olshan  wrote:
>
> Hi Satish,
>
> Thanks for looking at the KIP.
>
> Yes, the producer will wait for the topic to be created before it can send
> any messages to it.
>
> I would like to clarify "overriding" broker behavior. If the client enables
> client-side autocreation, the only difference will be that the topic
> auto-creation will no longer occur in the metadata request and will instead
> come from a CreateTopic request on the producer.
> Partitions and replication factor will be determined by the broker configs.
>
> Is this similar to what you were thinking? Please let me know if there is
> something you think I missed.
>
> Thank you,
> Justine
>
> On Tue, Aug 6, 2019 at 12:01 PM Satish Duggana 
> wrote:
>
> > Hi Justine,
> > Thanks for the KIP. This is a nice addition to the producer client
> > without running admin-client’s create topic APIs. Does producer wait
> > for the topic to be created successfully before it tries to publish
> > messages to that topic? I assume that this will not throw an error
> > that the topic does not exist.
> >
> > As mentioned by others, overriding broker behavior by producer looks
> > to be a concern. IMHO, broker should have a way to use either default
> > constraints or configure custom constraints before these can be
> > overridden by clients but not vice versa. There should be an option on
> > brokers whether those constraints can be overridden by producers or
> > not.
> >
> > Thanks,
> > Satish.
> >
> > On Tue, Aug 6, 2019 at 11:39 PM Justine Olshan 
> > wrote:
> > >
> > > Hi Harsha,
> > >
> > > After taking this all into consideration, I've updated the KIP to no
> > longer
> > > allow client-side configuration of replication factor and partitions.
> > > Instead, the broker defaults will be used as long as the broker supports
> > > KIP 464.
> > > If the broker does not support this KIP, then the client can not create
> > > topics on its own. (Behavior that exists now)
> > >
> > > I think this will help with your concerns. Please let me know if you
> > > further feedback.
> > >
> > > Thank you,
> > > Justine
> > >
> > > On Tue, Aug 6, 2019 at 10:49 AM Harsha Chintalapani 
> > wrote:
> > >
> > > > Hi,
> > > > Even with policies one needs to implement that, so for every user
> > who
> > > > doesn't want a producer to create topics or have limits around
> > partitions &
> > > > replication factor they have to implement a policy.
> > > >   The KIP is changing the behavior , it might not be introducing
> > the
> > > > new functionality but it will enable producers to override the create
> > topic
> > > > config settings on the broker. What I am asking for to provide a config
> > > > that will disable auto creation of topics and if its enabled set some
> > sane
> > > > defaults so that clients can create a topic with in those limits. I
> > don't
> > > > see how this not related to this KIP.
> > > >  If the server config options as I suggested doesn't interest you
> > at
> > > > least have a default CreateTopicPolices in place.
> > > >To give an example, In our environment we disable the
> > > > auto.create.topic.enable and force users to go through a centralized
> > > > service as we want capture more details about the topic creation and
> > > > requirements. With this KIP, a producer can create a topic with no
> > bounds.
> > > >  Another example max.message.size we define that at cluster level and
> > one
> > > > can override max.messsage.size at topic level, any misconfiguration at
> > this
> > > > will cause service degradation.  Its not always about the rogue
> > clients,
> > > > users can easily misconfigure and can cause an outage.
> > > > Again we can talk about CreateTopicPolicy but without having a default
> > > > implementation and asking everyone to implement their own while
> > changing
> > > > the behavior in producer  doesn't make sense to me.
> > > >
> > > > Thanks,
> > > > Harsha
> > > >
> > > >
> > > > On Tue, Aug 06, 2019 at 7:41 AM, Ismael Juma 
> > wrote:
> > > >
> > > > > Hi Harsha,
> > > > >
> > > > > I mentioned policies and the authorizer. For example, with
> > > > > CreateTopicPolicy, you can implement the limits you describe. If you
> > have
> > > > > ideas of how that should be improved, please submit a KIP. My point
> > is
> > > > that
> > > > > this KIP is not introducing any new functionality with regards to
> > what
> > > > > rogue clients can do. It's using the existing protocol that is
> > already
> > > 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Harsha Chintalapani
Hi Justine,
  Thanks for making the changes. I still have a concern that we are
prioritizing the producer config over the server side, which breaks the
backward compatibility of the broker's auto.create.topics.enable as far as
the producer is concerned.
   Also,
https://cwiki.apache.org/confluence/display/KAFKA/KIP-361%3A+Add+Consumer+Configuration+to+Disable+Auto+Topic+Creation
only allows creation of a topic if both allow.auto.create.topics is set to true
and auto.create.topics.enable on the broker side is set to true.  This KIP is
changing that contract for producers and allows them to take control even when
the broker side is turned off. Why can't we have the same behavior?  You can
still achieve the goal of this KIP by allowing auto.create.topics.enable to
take precedence, giving behavior similar to that of the consumer.
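
For comparison, this is what the shipped KIP-361 consumer-side switch looks like
today (the bootstrap address and group id are placeholders); even with it set to
true, topic creation still requires auto.create.topics.enable=true on the broker:

import java.util.Properties;

import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class Kip361ConsumerConfigExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "example-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringDeserializer");
        // KIP-361 (shipped in Kafka 2.3): subscribing to a missing topic only
        // triggers broker-side auto-creation when this is true AND the broker
        // has auto.create.topics.enable=true; setting it to false opts out.
        props.put(ConsumerConfig.ALLOW_AUTO_CREATE_TOPICS_CONFIG, "false");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // consumer.subscribe(...) etc.
        }
    }
}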


Thanks,
Harsha


On Tue, Aug 06, 2019 at 6:06 PM, Harsha Ch  wrote:

> Hi Colin,
>  There is no behavior in Kafka producer that allowed users to
> delete the topics or delete the records. So
> citing them as an example doesn't makes sense in this context. But there
> is a functionality which allowed creation of topics if they don't exist in
> the cluster and this behavior could be controlled by a config on the server
> side. Now with this KIP we are allowing producer to make that decision
> without any gateway on the server via configs. Any user who is not aware of
> such changes
> can accidentally create these topics and we are essentially removing a
> config that exists in brokers today to block this accidental creation and
> allowing clients to take control.
>   Not sure how the AdminClient applies here, It is an external API and
> is not part of KafkaProducer so any user who updates to latest version of
> Kafka with this feature get to create the topics.
> They have to build a tooling around AdminClient allow themselves to create
> topics.
>   I still didn't get any positives of not having server side configs?
> if you want to turn them on and allow any client to create topics set the
> default to true
> and allow users who doesn't want to have this feature let them turn them
> off. It will be the exact behavior as it is today, as far as producer is
> concerned. I am not
> understanding why not having server side configs to gateways such a hard
> requirement and this behavior exists today. As far I am concerned this
> breaks the backward compatibility.
>
> Thanks,
> Harsha
>
>
> On Tue, Aug 6, 2019 at 4:02 PM Colin McCabe  wrote:
>
> Hi Harsha,
>
> Thanks for taking a look.
>
> I'm not sure I follow the discussion about AdminClient.  KafkaAdminClient
> has been around for about 2 years at this point as a public class.  There
> are many programs that use it to automatically create topics.  Kafka
> Streams does this, for example.  If any of your users run Kafka Streams,
> they will be auto-creating topics, regardless of what setting you use for
> auto.create.topics.enable.
>
> A big part of the annoyance of auto-topic creation right now is that it is
> on by default.  The new configuration proposed by KIP-487 wouldn't be.
> Users would have to explicitly opt in to the new behavior of client-side
> topic creation.  If you run without security, you're already putting a huge
> amount of trust in your users.  For example, you trust them not to delete
> records with the kafka-delete-records.sh command, or delete topics with
> kafka-topics.sh.  Trusting them not to set a certain config value seems
> minor in comparison, right?
>
> best,
> Colin
>
> On Tue, Aug 6, 2019, at 10:49, Harsha Chintalapani wrote:
> > Hi,
> > Even with policies one needs to implement that, so for every user who
> > doesn't want a producer to create topics or have limits around
> partitions &
> > replication factor they have to implement a policy.
> >   The KIP is changing the behavior , it might not be introducing the
> > new functionality but it will enable producers to override the create
> topic
> > config settings on the broker. What I am asking for to provide a config
> > that will disable auto creation of topics and if its enabled set some
> sane
> > defaults so that clients can create a topic with in those limits. I don't
> > see how this not related to this KIP.
> >  If the server config options as I suggested doesn't interest you at
> > least have a default CreateTopicPolices in place.
> >To give an example, In our environment we disable the
> > auto.create.topic.enable and force users to go through a centralized
> > service as we want capture more details about the topic creation and
> > requirements. With this KIP, a producer can create a topic with no
> bounds.
> >  Another example max.message.size we define that at cluster level and one
> > can override max.messsage.size at topic level, any misconfiguration at
> this
> > will cause service degradation.  Its not always about the rogue clients,
> > users can easily misconfigure and can cause an outage.
> > Again we can talk about 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Harsha Ch
Hi Colin,
 There is no behavior in the Kafka producer that allowed users to delete
topics or delete records, so citing them as an example doesn't make sense in
this context. But there is functionality that allows creation of topics if
they don't exist in the cluster, and this behavior can be controlled by a
config on the server side. Now with this KIP we are allowing the producer to
make that decision without any gateway on the server via configs. Any user who
is not aware of such changes can accidentally create these topics, and we are
essentially removing a config that exists in brokers today to block this
accidental creation, allowing clients to take control.
  Not sure how the AdminClient applies here. It is an external API and is
not part of KafkaProducer, so it is not the case that any user who updates to
the latest version of Kafka with this feature gets to create topics; they have
to build tooling around AdminClient to allow themselves to create topics.
  I still don't see any benefit to not having server-side configs. If you
want to turn this on and allow any client to create topics, set the default to
true, and let users who don't want this feature turn it off. That would be
exactly the behavior we have today, as far as the producer is concerned. I
don't understand why having server-side configs as gateways is such a hard
requirement, given that this behavior exists today. As far as I am concerned,
this breaks backward compatibility.

Thanks,
Harsha


On Tue, Aug 6, 2019 at 4:02 PM Colin McCabe  wrote:

> Hi Harsha,
>
> Thanks for taking a look.
>
> I'm not sure I follow the discussion about AdminClient.  KafkaAdminClient
> has been around for about 2 years at this point as a public class.  There
> are many programs that use it to automatically create topics.  Kafka
> Streams does this, for example.  If any of your users run Kafka Streams,
> they will be auto-creating topics, regardless of what setting you use for
> auto.create.topics.enable.
>
> A big part of the annoyance of auto-topic creation right now is that it is
> on by default.  The new configuration proposed by KIP-487 wouldn't be.
> Users would have to explicitly opt in to the new behavior of client-side
> topic creation.  If you run without security, you're already putting a huge
> amount of trust in your users.  For example, you trust them not to delete
> records with the kafka-delete-records.sh command, or delete topics with
> kafka-topics.sh.  Trusting them not to set a certain config value seems
> minor in comparison, right?
>
> best,
> Colin
>
> On Tue, Aug 6, 2019, at 10:49, Harsha Chintalapani wrote:
> > Hi,
> > Even with policies one needs to implement that, so for every user who
> > doesn't want a producer to create topics or have limits around
> partitions &
> > replication factor they have to implement a policy.
> >   The KIP is changing the behavior , it might not be introducing the
> > new functionality but it will enable producers to override the create
> topic
> > config settings on the broker. What I am asking for to provide a config
> > that will disable auto creation of topics and if its enabled set some
> sane
> > defaults so that clients can create a topic with in those limits. I don't
> > see how this not related to this KIP.
> >  If the server config options as I suggested doesn't interest you at
> > least have a default CreateTopicPolices in place.
> >To give an example, In our environment we disable the
> > auto.create.topic.enable and force users to go through a centralized
> > service as we want capture more details about the topic creation and
> > requirements. With this KIP, a producer can create a topic with no
> bounds.
> >  Another example max.message.size we define that at cluster level and one
> > can override max.messsage.size at topic level, any misconfiguration at
> this
> > will cause service degradation.  Its not always about the rogue clients,
> > users can easily misconfigure and can cause an outage.
> > Again we can talk about CreateTopicPolicy but without having a default
> > implementation and asking everyone to implement their own while changing
> > the behavior in producer  doesn't make sense to me.
> >
> > Thanks,
> > Harsha
> >
> >
> > On Tue, Aug 06, 2019 at 7:41 AM, Ismael Juma  wrote:
> >
> > > Hi Harsha,
> > >
> > > I mentioned policies and the authorizer. For example, with
> > > CreateTopicPolicy, you can implement the limits you describe. If you
> have
> > > ideas of how that should be improved, please submit a KIP. My point is
> that
> > > this KIP is not introducing any new functionality with regards to what
> > > rogue clients can do. It's using the existing protocol that is already
> > > exposed via the AdminClient. So, I don't think we need to address it in
> > > this KIP. Does that make sense?
> > >
> > > Ismael
> > >
> > > On Tue, Aug 6, 2019 at 7:13 AM Harsha Chintalapani 
> > > wrote:
> > >
> > > Ismael,
> > > Sure AdminClient can do that and 

Build failed in Jenkins: kafka-2.0-jdk8 #285

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[matthias] KAFKA-8736: Streams performance improvement, use isEmpty() rather 
than

--
[...truncated 893.65 KB...]
[Test output omitted: kafka.coordinator.group.GroupCoordinatorTest and 
kafka.network.SocketServerTest runs; every test shown in the truncated log 
PASSED.]

Build failed in Jenkins: kafka-2.1-jdk8 #220

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[matthias] KAFKA-8736: Streams performance improvement, use isEmpty() rather 
than

--
[...truncated 234.20 KB...]
[Test output omitted: kafka.api.AuthorizerIntegrationTest, 
RackAwareAutoTopicCreationTest, SaslClientsWithInvalidCredentialsTest, 
LogAppendTimeTest and PlaintextEndToEndAuthorizationTest runs; every test shown 
in the truncated log PASSED.]

Build failed in Jenkins: kafka-trunk-jdk11 #736

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[matthias] KAFKA-8736: Streams performance improvement, use isEmpty() rather 
than

[gwen] MINOR: some small style fixes to RoundRobinPartitioner

--
[...truncated 2.60 MB...]

[Test output omitted: org.apache.kafka.connect.transforms.TimestampConverterTest 
runs; every test shown in the truncated log PASSED.]

Build failed in Jenkins: kafka-trunk-jdk8 #3833

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[matthias] KAFKA-8736: Streams performance improvement, use isEmpty() rather 
than

[gwen] MINOR: some small style fixes to RoundRobinPartitioner

--
[...truncated 2.58 MB...]
[Test output omitted: org.apache.kafka.trogdor agent, coordinator and workload 
test runs (AgentTest, BasicPlatformTest, CoordinatorTest, CoordinatorClientTest, 
ConsumeBenchSpecTest, TopicsSpecTest, TimeIntervalTransactionsGeneratorTest, 
ThrottleTest); every test shown in the truncated log PASSED.]

Build failed in Jenkins: kafka-2.3-jdk8 #81

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[matthias] KAFKA-8736: Streams performance improvement, use isEmpty() rather 
than

--
[...truncated 2.95 MB...]
[Test output omitted: kafka.controller ControllerChannelManagerTest, 
TopicDeletionManagerTest, ControllerFailoverTest and ControllerIntegrationTest 
runs; every test shown in the truncated log PASSED.]

Re: [ANNOUNCE] Apache Kafka 2.3.0

2019-08-06 Thread Colin McCabe
Thanks for the bug report, Mickael.  As Matthias said, it should be fixed now.  
I also updated the CVEs page.

cheers,
Colin


On Wed, Jul 31, 2019, at 02:53, Mickael Maison wrote:
> Hi,
> 
> It looks like the protocol page was not updated. It still only lists 2.2 APIs.
> http://kafka.apache.org/protocol
> 
> Thanks
> 
> On Tue, Jul 2, 2019 at 2:05 AM Colin McCabe  wrote:
> >
> > Hi Mickael,
> >
> > Thanks for pointing this out.  It should be fixed now.
> >
> > best,
> > Colin
> >
> > On Mon, Jul 1, 2019, at 09:14, Mickael Maison wrote:
> > > Colin,
> > >
> > > The javadocs links are broken:
> > > The requested URL /23/javadoc/index.html was not found on this server.
> > >
> > > It's the 3rd time in a row this happens (2.1 and 2.2 had the same
> > > issue at release). Last time, Guozhang confirmed this step is in the
> > > release process but maybe this needs to be highlighted
> > >
> > > On Tue, Jun 25, 2019 at 8:22 PM Colin McCabe  wrote:
> > > >
> > > > Thanks to everyone who reviewed the Apache blog post about 2.3.  It's 
> > > > live now at https://blogs.apache.org/kafka/date/20190624
> > > >
> > > > Plus, Tim Berglund made a video about what's new in this release.  
> > > > https://www.youtube.com/watch?v=sNqwJT2WguQ
> > > >
> > > > Finally, check out Stéphane Maarek's video about 2.3 here: 
> > > > https://www.youtube.com/watch?v=YutjYKSGd64
> > > >
> > > > cheers,
> > > > Colin
> > > >
> > > >
> > > > On Tue, Jun 25, 2019, at 09:40, Colin McCabe wrote:
> > > > > The Apache Kafka community is pleased to announce the release for
> > > > > Apache Kafka 2.3.0.
> > > > > This release includes several new features, including:
> > > > >
> > > > > - There have been several improvements to the Kafka Connect REST API.
> > > > > - Kafka Connect now supports incremental cooperative rebalancing.
> > > > > - Kafka Streams now supports an in-memory session store and window
> > > > > store.
> > > > > - The AdminClient now allows users to determine what operations they
> > > > > are authorized to perform on topics.
> > > > > - There is a new broker start time metric.
> > > > > - JMXTool can now connect to secured RMI ports.
> > > > > - An incremental AlterConfigs API has been added.  The old 
> > > > > AlterConfigs
> > > > > API has been deprecated.
> > > > > - We now track partitions which are under their min ISR count.
> > > > > - Consumers can now opt-out of automatic topic creation, even when it
> > > > > is enabled on the broker.
> > > > > - Kafka components can now use external configuration stores (KIP-421)
> > > > > - We have implemented improved replica fetcher behavior when errors 
> > > > > are
> > > > > encountered
> > > > >
> > > > > All of the changes in this release can be found in the release notes:
> > > > > https://www.apache.org/dist/kafka/2.3.0/RELEASE_NOTES.html
> > > > >
> > > > >
> > > > > You can download the source and binary release (Scala 2.11 and 2.12) 
> > > > > from:
> > > > > https://kafka.apache.org/downloads#2.3.0
> > > > >
> > > > > ---
> > > > >
> > > > > Apache Kafka is a distributed streaming platform with four core APIs:
> > > > >
> > > > > ** The Producer API allows an application to publish a stream of records
> > > > > to one or more Kafka topics.
> > > > >
> > > > > ** The Consumer API allows an application to subscribe to one or more
> > > > > topics and process the stream of records produced to them.
> > > > >
> > > > > ** The Streams API allows an application to act as a stream processor,
> > > > > consuming an input stream from one or more topics and producing an
> > > > > output stream to one or more output topics, effectively transforming
> > > > > the input streams to output streams.
> > > > >
> > > > > ** The Connector API allows building and running reusable producers or
> > > > > consumers that connect Kafka topics to existing applications or data
> > > > > systems. For example, a connector to a relational database might
> > > > > capture every change to a table.
> > > > >
> > > > >
> > > > > With these APIs, Kafka can be used for two broad classes of 
> > > > > application:
> > > > > ** Building real-time streaming data pipelines that reliably get data
> > > > > between systems or applications.
> > > > > ** Building real-time streaming applications that transform or react 
> > > > > to
> > > > > the streams of data.
> > > > >
> > > > > Apache Kafka is in use at large and small companies worldwide,
> > > > > including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
> > > > > Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
> > > > > Zalando, 

Re: [VOTE] KIP-455: Create an Administrative API for Replica Reassignment

2019-08-06 Thread Colin McCabe
Hi Koushik,

Thanks for the idea.  This KIP is already pretty big, so I think we'll have to 
consider ideas like this in follow-on KIPs.

In general, figuring out what's wrong with replication is a pretty tough 
problem.  If we had an API for this, we'd probably want it to be unified, and 
not specific to reassigning partitions.

regards,
Colin


On Tue, Aug 6, 2019, at 10:57, Koushik Chitta wrote:
> Hey Colin,
> 
> Can the ListPartitionReassignmentsResult include the status of the 
> current reassignment progress of each partition? A reassignment can be 
> in progress for different reasons and the status can give the option to 
> alter the current reassignment.
> 
> Example - A LeaderAndIsrRequest for newly assigned replicas can be 
> ignored/errored because of a storage exception, and the reassignment batch 
> will then wait indefinitely for the newly assigned replicas to get in 
> sync with the leader of the partition.  
> Showing the status will give an option to alter the affected 
> partitions and allow the batch to complete reassignment.
> 
> OAR = {1, 2, 3} and RAR = {4, 5, 6}
> 
> AR = {1,2,3,4,5,6}, leader/ISR = 1/{1,2,3,4,6}  =>  the LeaderAndIsrRequest 
> was lost/skipped for broker 5, and the reassignment operation will wait 
> indefinitely for broker 5 to become in-sync.
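
A minimal sketch of what querying reassignment state via the API proposed in KIP-455 
could look like from the AdminClient once it lands; the class and method names follow 
the KIP discussion, the final API may differ, and the bootstrap address is illustrative:

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.PartitionReassignment;
    import org.apache.kafka.common.TopicPartition;

    import java.util.Map;
    import java.util.Properties;

    public class ListReassignmentsSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // List all ongoing reassignments and print the full replica set plus
                // the replicas still being added/removed, which hints at where a
                // reassignment is stuck (e.g. a new replica that never joins the ISR).
                Map<TopicPartition, PartitionReassignment> ongoing =
                        admin.listPartitionReassignments().reassignments().get();
                ongoing.forEach((tp, r) ->
                        System.out.printf("%s replicas=%s adding=%s removing=%s%n",
                                tp, r.replicas(), r.addingReplicas(), r.removingReplicas()));
            }
        }
    }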
> 
> 
> 
> Thanks,
> Koushik
> 
> -Original Message-
> From: Jun Rao  
> Sent: Friday, August 2, 2019 10:04 AM
> To: dev 
> Subject: Re: [VOTE] KIP-455: Create an Administrative API for Replica 
> Reassignment
> 
> Hi, Colin,
> 
> First, since we are changing the format of LeaderAndIsrRequest, which 
> is an inter broker request, it seems that we will need IBP during 
> rolling upgrade. Could we add that to the compatibility section?
> 
> Regarding UnsupportedVersionException, even without ZK node version 
> bump, we probably want to only use the new ZK value fields after all 
> brokers have been upgraded to the new binary. Otherwise, the 
> reassignment task may not be completed if the controller changes to a 
> broker still on the old binary.
> IBP is one way to achieve that. The main thing is that we need some way 
> for the controller to deal with the new ZK fields. Dealing with the 
> additional ZK node version bump seems a small thing on top of that?
> 
> Thanks,
> 
> Jun
> 
> On Thu, Aug 1, 2019 at 3:05 PM Colin McCabe  wrote:
> 
> > On Thu, Aug 1, 2019, at 12:00, Jun Rao wrote:
> > > Hi, Colin,
> > >
> > > 10. Sounds good.
> > >
> > > 13. Our current convention is to bump up the version of ZK value if 
> > > there is any format change. For example, we have bumped up the 
> > > version of the value in /brokers/ids/nnn multiple times and all of 
> > > those changes are compatible (just adding new fields). This has the 
> > > slight benefit that it makes it clear there is a format change. 
> > > Rolling upgrades and downgrades can still be supported with the 
> > > version bump. For example, if you
> > downgrade
> > > from a compatible change, you can leave the new format in ZK and the 
> > > old code will only pick up fields relevant to the old version. 
> > > Upgrade will
> > be
> > > controlled by inter broker protocol.
> >
> > Hmm.  If we bump that ZK node version, we will need a new inter-broker 
> > protocol version.  We also need to return UnsupportedVersionException 
> > from the alterPartitionReassignments and listPartitionReassignments 
> > APIs when the IBP is too low.  This sounds doable, although we might 
> > need a release note that upgrading the IBP is necessary to allow 
> > reassignment operations after an upgrade.
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Jul 31, 2019 at 1:22 PM Colin McCabe  wrote:
> > >
> > > > Hi Jun,
> > > >
> > > > Thanks for taking another look at this.
> > > >
> > > > On Wed, Jul 31, 2019, at 09:22, Jun Rao wrote:
> > > > > Hi, Stan,
> > > > >
> > > > > Thanks for the explanation.
> > > > >
> > > > > 10. If those new fields in LeaderAndIsr are only needed for 
> > > > > future
> > work,
> > > > > perhaps they should be added when we do the future work instead 
> > > > > of
> > now?
> > > >
> > > > I think this ties in with one of the big goals of this KIP, making 
> > > > it possible to distinguish reassigning replicas from normal replicas.
> > This is
> > > > the key to follow-on work like being able to ensure that 
> > > > partitions
> > with a
> > > > reassignment don't get falsely flagged as under-replicated in the
> > metrics,
> > > > or implementing reassignment quotas that don't accidentally affect
> > normal
> > > > replication traffic when a replica falls out of the ISR.
> > > >
> > > > For these follow-on improvements, we need to have that information 
> > > > in LeaderAndIsrRequest.  We could add the information in a 
> > > > follow-on KIP,
> > of
> > > > course, but then all the improvements are blocked on that 
> > > > follow-on
> > KIP.
> > > > That would slow things down for all of the 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Colin McCabe
Hi Harsha,

Thanks for taking a look.

I'm not sure I follow the discussion about AdminClient.  KafkaAdminClient has 
been around for about 2 years at this point as a public class.  There are many 
programs that use it to automatically create topics.  Kafka Streams does this, 
for example.  If any of your users run Kafka Streams, they will be 
auto-creating topics, regardless of what setting you use for 
auto.create.topics.enable.

A big part of the annoyance of auto-topic creation right now is that it is on 
by default.  The new configuration proposed by KIP-487 wouldn't be.  Users 
would have to explicitly opt in to the new behavior of client-side topic 
creation.  If you run without security, you're already putting a huge amount of 
trust in your users.  For example, you trust them not to delete records with 
the kafka-delete-records.sh command, or delete topics with kafka-topics.sh.  
Trusting them not to set a certain config value seems minor in comparison, 
right?

best,
Colin
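
For reference, the client-side topic creation through the AdminClient that Colin 
mentions looks roughly like this today (a minimal sketch; the topic name, partition 
count, replication factor and bootstrap address are illustrative):

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    import java.util.Collections;
    import java.util.Properties;

    public class CreateTopicSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // Any client with network access (and, if security is enabled, the
                // CREATE ACL) can already create topics this way, independently of
                // the broker's auto.create.topics.enable setting.
                NewTopic topic = new NewTopic("example-topic", 3, (short) 2);
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
        }
    }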

On Tue, Aug 6, 2019, at 10:49, Harsha Chintalapani wrote:
> Hi,
> Even with policies, someone needs to implement them, so every user who
> doesn't want a producer to create topics, or who wants limits around partitions &
> replication factor, has to implement a policy.
>   The KIP is changing the behavior; it might not be introducing
> new functionality, but it will enable producers to override the create topic
> config settings on the broker. What I am asking for is to provide a config
> that will disable auto creation of topics and, if it is enabled, set some sane
> defaults so that clients can create a topic within those limits. I don't
> see how this is not related to this KIP.
>  If the server config options as I suggested don't interest you, at
> least have a default CreateTopicPolicy in place.
>   To give an example, in our environment we disable
> auto.create.topics.enable and force users to go through a centralized
> service, as we want to capture more details about the topic creation and
> requirements. With this KIP, a producer can create a topic with no bounds.
>  Another example is max.message.size: we define that at the cluster level and one
> can override max.message.size at the topic level; any misconfiguration here
> will cause service degradation.  It's not always about rogue clients;
> users can easily misconfigure and cause an outage.
> Again, we can talk about CreateTopicPolicy, but without a default
> implementation, asking everyone to implement their own while changing
> the behavior in the producer doesn't make sense to me.
> 
> Thanks,
> Harsha
> 
> 
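
A minimal sketch of the kind of CreateTopicPolicy that Harsha and Ismael discuss here, 
capping partition count and replication factor; the limits and class name are 
illustrative, not a default shipped with Kafka:

    import java.util.Map;
    import org.apache.kafka.common.errors.PolicyViolationException;
    import org.apache.kafka.server.policy.CreateTopicPolicy;

    public class BoundedCreateTopicPolicy implements CreateTopicPolicy {
        private static final int MAX_PARTITIONS = 64;          // illustrative limit
        private static final short MAX_REPLICATION_FACTOR = 3; // illustrative limit

        @Override
        public void configure(Map<String, ?> configs) { }

        @Override
        public void validate(RequestMetadata request) throws PolicyViolationException {
            // numPartitions()/replicationFactor() can be null when explicit replica
            // assignments are supplied, so guard against that.
            Integer partitions = request.numPartitions();
            Short replicationFactor = request.replicationFactor();
            if (partitions != null && partitions > MAX_PARTITIONS)
                throw new PolicyViolationException("Too many partitions: " + partitions);
            if (replicationFactor != null && replicationFactor > MAX_REPLICATION_FACTOR)
                throw new PolicyViolationException("Replication factor too high: " + replicationFactor);
        }

        @Override
        public void close() { }
    }

On the broker, such a policy is wired in via the create.topic.policy.class.name config.
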
> On Tue, Aug 06, 2019 at 7:41 AM, Ismael Juma  wrote:
> 
> > Hi Harsha,
> >
> > I mentioned policies and the authorizer. For example, with
> > CreateTopicPolicy, you can implement the limits you describe. If you have
> > ideas of how that should be improved, please submit a KIP. My point is that
> > this KIP is not introducing any new functionality with regards to what
> > rogue clients can do. It's using the existing protocol that is already
> > exposed via the AdminClient. So, I don't think we need to address it in
> > this KIP. Does that make sense?
> >
> > Ismael
> >
> > On Tue, Aug 6, 2019 at 7:13 AM Harsha Chintalapani 
> > wrote:
> >
> > Ismael,
> > Sure, AdminClient can do that, and we should've shipped a config or used the
> > existing one to block that. Not all users have upgraded to AdminClient
> > and started using it to cause issues yet. In a shared environment we should
> > allow the server to set sane defaults and not allow every client to go ahead and
> > create an arbitrary number of topics/partitions and replication factor. Even if
> > users want to allow the topic creation proposed in the KIP, it makes sense to
> > have some guards on the number of partitions and replication factor.
> > The Authorizer is not always an answer to block requests, and having users set
> > server-side configs to protect a multi-tenant environment is required. In a
> > non-secure environment the Authorizer is a blunt instrument: either you end up
> > blocking everyone or allowing everyone.
> > I am asking to have a server-side config that allows clients to create topics or
> > not, and if they are allowed, sets a ceiling on the max number of partitions and
> > replication factor.
> >
> > -Harsha
> >
> > On Mon, Aug 5 2019 at 8:58 PM,  wrote:
> >
> > Harsha,
> >
> > Rogue clients can use the admin client to create topics and partitions.
> > ACLs and policies can help in that case as well as this one.
> >
> > Ismael
> >
> > On Mon, Aug 5, 2019, 7:48 PM Harsha Chintalapani 
> >
> > wrote:
> >
> > Hi Justine,
> > Thanks for the KIP.
> > "When server-side auto-creation is disabled, client-side auto-creation
> > will try to use client-side configurations"
> > If I understand correctly, this KIP is removing any server-side blocking
> > client auto creation of topic?
> > if so this will present potential issue of rogue client creating ton of
> > topic-partitions and potentially bringing down the 

Build failed in Jenkins: kafka-1.0-jdk7 #275

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[matthias] KAFKA-8602: Backport bugfix for standby task creation (#7148)

--
[...truncated 295.67 KB...]

kafka.api.SslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe STARTED

kafka.api.SslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaSubscribe PASSED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaAssign STARTED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaAssign PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoConsumeWithDescribeAclViaAssign 
STARTED

kafka.api.SslEndToEndAuthorizationTest > testNoConsumeWithDescribeAclViaAssign 
PASSED

kafka.api.SslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaSubscribe STARTED

kafka.api.SslEndToEndAuthorizationTest > 
testNoConsumeWithDescribeAclViaSubscribe PASSED

kafka.api.SslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaAssign STARTED

kafka.api.SslEndToEndAuthorizationTest > 
testNoConsumeWithoutDescribeAclViaAssign PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoGroupAcl STARTED

kafka.api.SslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoProduceWithDescribeAcl STARTED

kafka.api.SslEndToEndAuthorizationTest > testNoProduceWithDescribeAcl PASSED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaSubscribe STARTED

kafka.api.SslEndToEndAuthorizationTest > testProduceConsumeViaSubscribe PASSED

kafka.api.SslEndToEndAuthorizationTest > testNoProduceWithoutDescribeAcl STARTED

kafka.api.SslEndToEndAuthorizationTest > testNoProduceWithoutDescribeAcl PASSED

kafka.api.GroupCoordinatorIntegrationTest > 
testGroupCoordinatorPropagatesOfffsetsTopicCompressionCodec STARTED

kafka.api.GroupCoordinatorIntegrationTest > 
testGroupCoordinatorPropagatesOfffsetsTopicCompressionCodec PASSED

kafka.api.SaslSslConsumerTest > testCoordinatorFailover STARTED

kafka.api.SaslSslConsumerTest > testCoordinatorFailover PASSED

kafka.api.SaslSslConsumerTest > testSimpleConsumption STARTED

kafka.api.SaslSslConsumerTest > testSimpleConsumption PASSED

kafka.api.ConsumerBounceTest > testCloseDuringRebalance STARTED

kafka.api.ConsumerBounceTest > testCloseDuringRebalance PASSED

kafka.api.ConsumerBounceTest > testClose STARTED

kafka.api.ConsumerBounceTest > testClose PASSED

kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures STARTED

kafka.api.ConsumerBounceTest > testSeekAndCommitWithBrokerFailures PASSED

kafka.api.ConsumerBounceTest > testSubscribeWhenTopicUnavailable STARTED

kafka.api.ConsumerBounceTest > testSubscribeWhenTopicUnavailable PASSED

kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures STARTED

kafka.api.ConsumerBounceTest > testConsumptionWithBrokerFailures SKIPPED

kafka.api.AdminClientIntegrationTest > testDescribeReplicaLogDirs STARTED

kafka.api.AdminClientIntegrationTest > testDescribeReplicaLogDirs PASSED

kafka.api.AdminClientIntegrationTest > testInvalidAlterConfigs STARTED

kafka.api.AdminClientIntegrationTest > testInvalidAlterConfigs PASSED

kafka.api.AdminClientIntegrationTest > testClose STARTED

kafka.api.AdminClientIntegrationTest > testClose PASSED

kafka.api.AdminClientIntegrationTest > testMinimumRequestTimeouts STARTED

kafka.api.AdminClientIntegrationTest > testMinimumRequestTimeouts PASSED

kafka.api.AdminClientIntegrationTest > testForceClose STARTED

kafka.api.AdminClientIntegrationTest > testForceClose PASSED

kafka.api.AdminClientIntegrationTest > testListNodes STARTED

kafka.api.AdminClientIntegrationTest > testListNodes PASSED

kafka.api.AdminClientIntegrationTest > testDelayedClose STARTED

kafka.api.AdminClientIntegrationTest > testDelayedClose PASSED

kafka.api.AdminClientIntegrationTest > testCreateDeleteTopics STARTED

kafka.api.AdminClientIntegrationTest > testCreateDeleteTopics PASSED

kafka.api.AdminClientIntegrationTest > 
testAlterReplicaLogDirsBeforeTopicCreation STARTED

kafka.api.AdminClientIntegrationTest > 
testAlterReplicaLogDirsBeforeTopicCreation PASSED

kafka.api.AdminClientIntegrationTest > testDescribeLogDirs STARTED

kafka.api.AdminClientIntegrationTest > testDescribeLogDirs PASSED

kafka.api.AdminClientIntegrationTest > testAclOperations STARTED

kafka.api.AdminClientIntegrationTest > testAclOperations PASSED

kafka.api.AdminClientIntegrationTest > testDescribeCluster STARTED

kafka.api.AdminClientIntegrationTest > testDescribeCluster PASSED

kafka.api.AdminClientIntegrationTest > testCreatePartitions STARTED

kafka.api.AdminClientIntegrationTest > testCreatePartitions PASSED

kafka.api.AdminClientIntegrationTest > testDescribeNonExistingTopic STARTED

kafka.api.AdminClientIntegrationTest > testDescribeNonExistingTopic PASSED

kafka.api.AdminClientIntegrationTest > testDescribeAndAlterConfigs STARTED

kafka.api.AdminClientIntegrationTest > testDescribeAndAlterConfigs PASSED


[DISCUSS] KIP-504 - Add new Java Authorizer Interface

2019-08-06 Thread Rajini Sivaram
Hi all,

I have created a KIP to replace the Scala Authorizer API with a new Java
API:

   -
   
https://cwiki.apache.org/confluence/display/KAFKA/KIP-504+-+Add+new+Java+Authorizer+Interface

This is a replacement for KIP-50, which was accepted but never merged. Apart
from moving to a Java API consistent with other pluggable interfaces in the
broker, KIP-504 also attempts to address known limitations in the
authorizer. If you have come across other limitations that you would like
to see addressed in the new API, please raise these on the discussion
thread so that we can consider those too. All suggestions and feedback are
welcome.

Thank you,

Rajini


Re: [DISCUSS] KIP-441: Smooth Scaling Out for Kafka Streams

2019-08-06 Thread Guozhang Wang
Hello Sophie,

Thanks for the proposed KIP. I left some comments on the wiki itself, and I
think I'm still not very clear on a couple or those:

1. With this proposal, does that mean with num.standby.replicas == 0, we
may sometimes still have some standby tasks which may violate the config?

2. I think I understand the rationale for considering lags that are below the
specified threshold to be equal, rather than still considering 5000 to be
better than 5001 -- we do not want to "over-optimize" and potentially fall
into endless rebalances back and forth (see the sketch after this list).

But I'm not clear about the rationale of the second parameter of
constrainedBalancedAssignment(StatefulTasksToRankedCandidates,
balance_factor):

Does that mean, e.g. with a balance_factor of 3, we'd consider two
assignments, one resulting in balance_factor 0 and one resulting in balance_factor
3, to be equally optimized assignments and therefore may "stop early"? This
was not very convincing to me :P

3. There are a couple of minor comments about the algorithm itself, left on
the wiki page since it needs to refer to the exact line and better
displayed there.

3.a Another wild thought about the threshold itself: today the assignment
itself is memoryless, so we would not know if the reported `TaskLag` itself
is increasing or decreasing even if the current value is under the
threshold. I wonder if it is worth making it a bit more complicated to track
the task lag trend at the assignor? Practically it may not be very uncommon
that standby tasks are not keeping up because other active
tasks hosted on the same thread are starving the standby tasks.

4. There's a potential race condition risk when reporting `TaskLags` in the
subscription: right after reporting it to the leader, the cleanup thread
kicks in and deletes the state directory. If the task was assigned to the
host it would cause it to restore from beginning and effectively make the
seemingly optimized assignment very sub-optimal.

To be on the safer side we should consider either pruning out those tasks
that are "close to being cleaned up" from the subscription, or delaying
the cleanup right after we've included them in the subscription, in case
they have been selected as assigned tasks by the assignor.

5. This is a meta comment: I think it would be helpful to add some user
visibility on standby task lag as well, via metrics for example.
Today it is hard for us to observe how far our current standby tasks are
behind the active tasks and whether that lag is increasing or
decreasing. As a follow-up task, for example, a rebalance should also be
triggered if we realize that some standby task's lag is increasing
indefinitely, which means that it cannot keep up (another indicator that
either you need to add more resources for the num.standbys or you are
still not balanced enough).
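
A minimal sketch of the lag-threshold ranking idea from point 2 above (the sketch 
referenced there); the acceptable-lag value and class names are illustrative and not 
taken from the KIP:

    import java.util.Comparator;
    import java.util.List;
    import java.util.stream.Collectors;

    public class LagRankingSketch {
        // Illustrative threshold: candidates whose lags fall in the same bucket are treated as equal.
        static final long ACCEPTABLE_LAG = 10_000L;

        static final class Candidate {
            final String clientId;
            final long taskLag;
            Candidate(String clientId, long taskLag) {
                this.clientId = clientId;
                this.taskLag = taskLag;
            }
        }

        // Rank candidate clients for a stateful task: bucketing by ACCEPTABLE_LAG means a lag
        // of 5000 is not preferred over 5001, so small fluctuations cannot trigger endless
        // rebalances; ties are broken by client id to keep the ordering deterministic.
        static List<Candidate> rank(List<Candidate> candidates) {
            return candidates.stream()
                    .sorted(Comparator
                            .comparingLong((Candidate c) -> c.taskLag / ACCEPTABLE_LAG)
                            .thenComparing((Candidate c) -> c.clientId))
                    .collect(Collectors.toList());
        }
    }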


On Tue, Aug 6, 2019 at 1:32 PM Sophie Blee-Goldman 
wrote:

> Hey all,
>
> I'd like to kick off discussion on KIP-441, aimed at the long restore times
> in Streams during which further active processing and IQ are blocked.
> Please give it a read and let us know your thoughts
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-441:+Smooth+Scaling+Out+for+Kafka+Streams
>
> Cheers,
> Sophie
>


-- 
-- Guozhang


Build failed in Jenkins: kafka-2.2-jdk8 #156

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[matthias] KAFKA-8736: Streams performance improvement, use isEmpty() rather 
than

--
[...truncated 2.38 MB...]
> Task :kafka-2.2-jdk8:streams:processResources NO-SOURCE
> Task :kafka-2.2-jdk8:streams:classes UP-TO-DATE
> Task :kafka-2.2-jdk8:streams:copyDependantLibs
> Task :kafka-2.2-jdk8:streams:jar UP-TO-DATE
> Task :kafka-2.2-jdk8:streams:test-utils:compileJava UP-TO-DATE
> Task :kafka-2.2-jdk8:streams:test-utils:processResources NO-SOURCE
> Task :kafka-2.2-jdk8:streams:test-utils:classes UP-TO-DATE
> Task :kafka-2.2-jdk8:streams:test-utils:copyDependantLibs
> Task :kafka-2.2-jdk8:streams:test-utils:jar UP-TO-DATE
> Task :kafka-2.2-jdk8:streams:compileTestJava
> Task :kafka-2.2-jdk8:streams:processTestResources UP-TO-DATE
> Task :kafka-2.2-jdk8:streams:testClasses
> Task :kafka-2.2-jdk8:streams:streams-scala:compileJava NO-SOURCE

> Task :kafka-2.2-jdk8:streams:streams-scala:compileScala
Pruning sources from previous analysis, due to incompatible CompileSetup.

> Task :kafka-2.2-jdk8:streams:streams-scala:processResources NO-SOURCE
> Task :kafka-2.2-jdk8:streams:streams-scala:classes
> Task :kafka-2.2-jdk8:streams:streams-scala:checkstyleMain NO-SOURCE
> Task :kafka-2.2-jdk8:streams:streams-scala:compileTestJava NO-SOURCE

> Task :kafka-2.2-jdk8:streams:streams-scala:compileTestScala
Pruning sources from previous analysis, due to incompatible CompileSetup.
:76:
 local val actualClicksPerRegion in method testShouldCountClicksPerRegion is 
never used
val actualClicksPerRegion: java.util.List[KeyValue[String, Long]] =
^
:116:
 local val actualClicksPerRegion in method 
testShouldCountClicksPerRegionWithNamedRepartitionTopic is never used
val actualClicksPerRegion: java.util.List[KeyValue[String, Long]] =
^
:139:
 local val wordCounts in method getTopologyJava is never used
  val wordCounts: KTableJ[String, java.lang.Long] = grouped.count()
  ^
:160:
 local val clicksPerRegion in method getTopologyScala is never used
  val clicksPerRegion: KTable[String, Long] =
  ^
:204:
 local val clicksPerRegion in method getTopologyJava is never used
  val clicksPerRegion: KTableJ[String, JLong] = clicksByRegion
  ^
:274:
 local val wordCounts in method getTopologyJava is never used
  val wordCounts: KTableJ[String, java.lang.Long] = grouped.count()
  ^
6 warnings found

> Task :kafka-2.2-jdk8:streams:streams-scala:processTestResources UP-TO-DATE
> Task :kafka-2.2-jdk8:streams:streams-scala:testClasses
> Task :kafka-2.2-jdk8:streams:streams-scala:checkstyleTest NO-SOURCE
> Task :kafka-2.2-jdk8:streams:streams-scala:spotbugsMain

> Task :kafka-2.2-jdk8:streams:streams-scala:test

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWordsMaterialized 
STARTED

> Task :kafka-2.2-jdk8:core:spotbugsMain

> Task :kafka-2.2-jdk8:streams:streams-scala:test

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWordsMaterialized 
PASSED

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWordsJava STARTED

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWordsJava FAILED
java.lang.AssertionError: expected: 
but was:
at org.junit.Assert.fail(Assert.java:88)
at org.junit.Assert.failNotEquals(Assert.java:834)
at org.junit.Assert.assertEquals(Assert.java:118)
at org.junit.Assert.assertEquals(Assert.java:144)
at 
org.apache.kafka.streams.scala.WordCountTest.testShouldCountWordsJava(WordCountTest.scala:178)

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWords STARTED

> Task :kafka-2.2-jdk8:core:spotbugsMain
The following classes needed for analysis were missing:
  org.apache.log4j.Logger
  org.apache.log4j.Level
  org.apache.log4j.LogManager

> Task :kafka-2.2-jdk8:streams:streams-scala:test

org.apache.kafka.streams.scala.WordCountTest > testShouldCountWords PASSED

org.apache.kafka.streams.scala.kstream.MaterializedTest > Create a Materialized 

Can we get a 2.3.1 release?

2019-08-06 Thread Jonathan Gordon
Hi all,

2.3.0 was released June 25, 2019.  There are 22 fixes scheduled for inclusion 
in 2.3.1 so far, 18 of which are resolved/closed.

https://issues.apache.org/jira/browse/KAFKA-8736?jql=project%20%3D%20KAFKA%20AND%20status%20in%20(Resolved%2C%20Closed)%20AND%20fixVersion%20%3D%202.3.1

Do folks have a sense as to when they would like to see a release? There are at 
least two show-stoppers preventing my team from upgrading, which we're 
very eager to do. The hotfixes we've tried are very promising.

Thanks,

Jonathan



[jira] [Created] (KAFKA-8763) Flaky Test SaslSslAdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevels

2019-08-06 Thread Sophie Blee-Goldman (JIRA)
Sophie Blee-Goldman created KAFKA-8763:
--

 Summary: Flaky Test 
SaslSslAdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevels
 Key: KAFKA-8763
 URL: https://issues.apache.org/jira/browse/KAFKA-8763
 Project: Kafka
  Issue Type: Bug
  Components: admin, core, unit tests
Reporter: Sophie Blee-Goldman


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/6740/testReport/junit/kafka.api/SaslSslAdminClientIntegrationTest/testIncrementalAlterConfigsForLog4jLogLevels/]

 
h3. Error Message

org.junit.ComparisonFailure: expected:<[OFF]> but was:<[INFO]>
h3. Stacktrace

org.junit.ComparisonFailure: expected:<[OFF]> but was:<[INFO]> at 
org.junit.Assert.assertEquals(Assert.java:117) at 
org.junit.Assert.assertEquals(Assert.java:146) at 
kafka.api.AdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevels(AdminClientIntegrationTest.scala:1850)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
java.base/java.lang.Thread.run(Thread.java:834)
h3. Standard Output

Debug is true storeKey true useTicketCache false useKeyTab true doNotPrompt 
false ticketCache is null isInitiator true KeyTab is 
/tmp/kafka3326803823781197290.tmp refreshKrb5Config is false principal is 
kafka/localh...@example.com tryFirstPass is false useFirstPass is false 
storePass is false clearPass is false principal is kafka/localh...@example.com 
Will use keytab Commit Succeeded



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[VOTE] KIP-481: SerDe Improvements for Connect Decimal type in JSON

2019-08-06 Thread Almog Gavra
Hello Everyone,

After discussions on
https://lists.apache.org/thread.html/fa665a6dc59f73ca294a00bcbef2eaa3ad00cc69626e91c516fa4fca@%3Cdev.kafka.apache.org%3E
I've opened this KIP up for formal voting.

KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-481%3A+SerDe+Improvements+for+Connect+Decimal+type+in+JSON

Please update the DISCUSS thread with any concerns/comments.

Cheers!
Almog


[jira] [Created] (KAFKA-8762) Flaky Test AdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevelsCanResetLoggerToCurrentRoot

2019-08-06 Thread Sophie Blee-Goldman (JIRA)
Sophie Blee-Goldman created KAFKA-8762:
--

 Summary: Flaky Test 
AdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevelsCanResetLoggerToCurrentRoot
 Key: KAFKA-8762
 URL: https://issues.apache.org/jira/browse/KAFKA-8762
 Project: Kafka
  Issue Type: Bug
  Components: admin, core, unit tests
Reporter: Sophie Blee-Goldman


h3. Error Message

org.junit.ComparisonFailure: expected:<[TRACE]> but was:<[INFO]>
h3. Stacktrace

org.junit.ComparisonFailure: expected:<[TRACE]> but was:<[INFO]> at 
org.junit.Assert.assertEquals(Assert.java:117) at 
org.junit.Assert.assertEquals(Assert.java:146) at 
kafka.api.AdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevelsCanResetLoggerToCurrentRoot(AdminClientIntegrationTest.scala:1918)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
java.base/java.lang.Thread.run(Thread.java:834)

 

[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/6740/testReport/junit/kafka.api/AdminClientIntegrationTest/testIncrementalAlterConfigsForLog4jLogLevelsCanResetLoggerToCurrentRoot/]



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] KIP-481: SerDe Improvements for Connect Decimal type in JSON

2019-08-06 Thread Almog Gavra
Hello Everyone,

Summarizing an in-person discussion with Randall (this is copied from the
KIP):

The original KIP suggested supporting an additional representation - base10
encoded text (e.g. `{"asText":"10.2345"}`). This causes issues because it
is impossible to disambiguate between TEXT and BINARY without an additional
config. Furthermore, this makes the migration from one to the other nearly
impossible, because it would require that all consumers stop consuming and
producers stop producing and that the config be updated atomically on all of them
after deploying the new code, or that the full retention period
pass - neither option is viable. The suggestion in the KIP is strictly an
improvement over the existing behavior, even if it doesn't support all
combinations.

It seems that since most real-world use cases actually use the numeric
representation (not string) we can consider this an improvement. With the
new suggestion, we don't need a deserialization configuration (only a
serialization option) and the consumers will be able to always
automatically determine the serialization format.

Based on this, I'll be opening up the simplified version of the KIP to a
vote.

Almog
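
For context, the legacy BINARY (base16) representation discussed in this thread is 
produced roughly as follows; a minimal sketch of the unscaledValue-to-hex conversion 
described later in the thread, with an illustrative helper class:

    import java.math.BigDecimal;

    public class DecimalBase16Sketch {
        // The thread describes the current encoding as decimal.unscaledValue().toByteArray()
        // rendered as a hex string; this helper reproduces that base16 form.
        static String toBase16(BigDecimal value) {
            byte[] bytes = value.unscaledValue().toByteArray(); // signed, big-endian, two's complement
            StringBuilder sb = new StringBuilder(bytes.length * 2);
            for (byte b : bytes) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        }

        public static void main(String[] args) {
            // 10.2345 has unscaled value 102345 (scale 4), i.e. 0x018fc9.
            System.out.println(toBase16(new BigDecimal("10.2345"))); // prints 018fc9
        }
    }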

On Mon, Jul 29, 2019 at 9:29 AM Almog Gavra  wrote:

> I'm mostly happy with your current suggestion (two configs, one for
> serialization and one for deserialization) and your implementation
> suggestion. One thing to note:
>
> > We should _always_ be able to deserialize a standard JSON
> > number as a decimal
>
> I was doing some research into decimals and JSON, and I can imagine a
> compelling reason to require string representations to avoid losing
> precision and to be certain that whomever is sending the data isn't losing
> precision (e.g. https://stackoverflow.com/a/38357877/2258040).
>
> I'm okay with always allowing numerics, but thought it's worth raising the
> thought.
>
> On Mon, Jul 29, 2019 at 4:57 AM Andy Coates  wrote:
>
>> The way I see it, we need to control two seperate things:
>>
>> 1. How do we _deserialize_ a decimal type if we encounter a text node in
>> the JSON?(We should _always_ be able to deserialize a standard JSON
>> number as a decimal).
>> 2. How do we chose how we want decimals to be _serialized_.
>>
>> This looks to fit well with your second suggestion of slightly different
>> config names for serialization vs deserialization.
>> a. For deserialization we only care about how to handle text nodes: `
>> deserialization.decimal.*text*.format`, which should only have two valid
>> values: BINARY | TEXT.
>> b. For serialization we need all three: `serialization.decimal.format`,
>> which should support all three options: BINARY | TEXT | NUMERIC.
>>
>> Implementation wise, I think these should be two separate enums, rather
>> than one shared enum and throwing an error if the deserializer is set to
>> NUMERIC.  Mainly as this means the enums reflect the options available,
>> rather than this being hidden in config checking code.  But that's a minor
>> implementation detail.
>>
>> Personally, I'd be tempted to have the BINARY value named something like
>> `LEGACY` or `LEGACY_BINARY` as a way of encouraging users to move away
>> from
>> it.
>>
>> It's a real shame that both of these settings require a default of BINARY
>> for backwards compatibility, but I agree that discussions / plans around
>> switching the defaults should not block this KIP.
>>
>> Andy
>>
>>
>> On Thu, 25 Jul 2019 at 18:26, Almog Gavra  wrote:
>>
>> > Thanks for the replies Andy and Andrew (2x Andy?)!
>> >
>> > > Is the text decimal a base16 encoded number, or is it base16 encoded
>> > binary
>> > > form of the number?
>> >
>> > The conversion happens as decimal.unscaledValue().toByteArray() and then
>> > the byte array is converted to a hex string, so it's definitely the
>> binary
>> > form of the number converted to base16. Whether or not that's the same
>> as
>> > the base16 encoded number is a good question (toByteArray returns a byte
>> > array containing a signed, big-endian, two's complement representation
>> of
>> > the big integer).
>> >
>> > > One suggestion I have is to change the proposed new config to only
>> affect
>> > > decimals stored as text, i.e. to switch between the current base16 and
>> > the
>> > > more common base10.   Then add another config to the serializer only
>> that
>> > > controls if decimals should be serialized as text or numeric.
>> >
>> > I think we need to be able to handle all mappings from serialization
>> format
>> > to deserialization format (e.g. read in BINARY and output TEXT), which I
>> > think would be impossible with the alternative suggestion. I agree that
>> > automatically deserializing numerics is valuable. I see two other ways
>> to
>> > get this, both keeping the serialization.format config the same:
>> >
>> > - have json.decimal.deserialization.format accept all three formats. if
>> set
>> > to BINARY/TEXT, numerics would be automatically supported. If set to
>> > NUMERIC, then 

[jira] [Created] (KAFKA-8761) Flaky Test AdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevels

2019-08-06 Thread Sophie Blee-Goldman (JIRA)
Sophie Blee-Goldman created KAFKA-8761:
--

 Summary: Flaky Test 
AdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevels
 Key: KAFKA-8761
 URL: https://issues.apache.org/jira/browse/KAFKA-8761
 Project: Kafka
  Issue Type: Bug
  Components: admin, core, unit tests
Reporter: Sophie Blee-Goldman


[https://builds.apache.org/job/kafka-pr-jdk11-scala2.12/6740/testReport/junit/kafka.api/AdminClientIntegrationTest/testIncrementalAlterConfigsForLog4jLogLevels/]

 
h3. Error Message

org.junit.ComparisonFailure: expected:<[FATAL]> but was:<[INFO]>
h3. Stacktrace

org.junit.ComparisonFailure: expected:<[FATAL]> but was:<[INFO]> at 
org.junit.Assert.assertEquals(Assert.java:117) at 
org.junit.Assert.assertEquals(Assert.java:146) at 
kafka.api.AdminClientIntegrationTest.testIncrementalAlterConfigsForLog4jLogLevels(AdminClientIntegrationTest.scala:1850)
 at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native 
Method) at 
java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at 
java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.base/java.lang.reflect.Method.invoke(Method.java:566) at 
org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
 at 
org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
 at 
org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
 at 
org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
 at 
org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) 
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) 
at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:288)
 at 
org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:282)
 at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264) at 
java.base/java.lang.Thread.run(Thread.java:834)



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


[DISCUSS] KIP-441: Smooth Scaling Out for Kafka Streams

2019-08-06 Thread Sophie Blee-Goldman
Hey all,

I'd like to kick off discussion on KIP-441, aimed at the long restore times
in Streams during which further active processing and IQ are blocked.
Please give it a read and let us know your thoughts

https://cwiki.apache.org/confluence/display/KAFKA/KIP-441:+Smooth+Scaling+Out+for+Kafka+Streams

Cheers,
Sophie


Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Justine Olshan
Hi Satish,

Thanks for looking at the KIP.

Yes, the producer will wait for the topic to be created before it can send
any messages to it.

I would like to clarify "overriding" broker behavior. If the client enables
client-side autocreation, the only difference will be that the topic
auto-creation will no longer occur in the metadata request and will instead
come from a CreateTopic request on the producer.
Partitions and replication factor will be determined by the broker configs.

Is this similar to what you were thinking? Please let me know if there is
something you think I missed.

Thank you,
Justine
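
A minimal sketch of the KIP-464 style request that lets the broker fill in partition 
count and replication factor, which is what the client-side path described here would 
rely on; this assumes the Optional-based NewTopic constructor introduced alongside 
KIP-464, and the topic name and bootstrap address are illustrative:

    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.AdminClientConfig;
    import org.apache.kafka.clients.admin.NewTopic;

    import java.util.Collections;
    import java.util.Optional;
    import java.util.Properties;

    public class BrokerDefaultsTopicSketch {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            try (AdminClient admin = AdminClient.create(props)) {
                // Omitting both values asks a KIP-464 capable broker to apply its own
                // num.partitions and default.replication.factor settings.
                NewTopic topic = new NewTopic("example-topic", Optional.empty(), Optional.empty());
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
        }
    }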

On Tue, Aug 6, 2019 at 12:01 PM Satish Duggana 
wrote:

> Hi Justine,
> Thanks for the KIP. This is a nice addition to the producer client
> without running admin-client’s create topic APIs. Does producer wait
> for the topic to be created successfully before it tries to publish
> messages to that topic? I assume that this will not throw an error
> that the topic does not exist.
>
> As mentioned by others, overriding broker behavior by producer looks
> to be a concern. IMHO, broker should have a way to use either default
> constraints or configure custom constraints before these can be
> overridden by clients but not vice versa. There should be an option on
> brokers whether those constraints can be overridden by producers or
> not.
>
> Thanks,
> Satish.
>
> On Tue, Aug 6, 2019 at 11:39 PM Justine Olshan 
> wrote:
> >
> > Hi Harsha,
> >
> > After taking this all into consideration, I've updated the KIP to no
> longer
> > allow client-side configuration of replication factor and partitions.
> > Instead, the broker defaults will be used as long as the broker supports
> > KIP 464.
> > If the broker does not support this KIP, then the client can not create
> > topics on its own. (Behavior that exists now)
> >
> > I think this will help with your concerns. Please let me know if you
> > further feedback.
> >
> > Thank you,
> > Justine
> >
> > On Tue, Aug 6, 2019 at 10:49 AM Harsha Chintalapani 
> wrote:
> >
> > > Hi,
> > > Even with policies one needs to implement that, so for every user
> who
> > > doesn't want a producer to create topics or have limits around
> partitions &
> > > replication factor they have to implement a policy.
> > >   The KIP is changing the behavior , it might not be introducing
> the
> > > new functionality but it will enable producers to override the create
> topic
> > > config settings on the broker. What I am asking for to provide a config
> > > that will disable auto creation of topics and if its enabled set some
> sane
> > > defaults so that clients can create a topic with in those limits. I
> don't
> > > see how this not related to this KIP.
> > >  If the server config options as I suggested doesn't interest you
> at
> > > least have a default CreateTopicPolices in place.
> > >To give an example, In our environment we disable the
> > > auto.create.topic.enable and force users to go through a centralized
> > > service as we want capture more details about the topic creation and
> > > requirements. With this KIP, a producer can create a topic with no
> bounds.
> > >  Another example max.message.size we define that at cluster level and
> one
> > > can override max.messsage.size at topic level, any misconfiguration at
> this
> > > will cause service degradation.  Its not always about the rogue
> clients,
> > > users can easily misconfigure and can cause an outage.
> > > Again we can talk about CreateTopicPolicy but without having a default
> > > implementation and asking everyone to implement their own while
> changing
> > > the behavior in producer  doesn't make sense to me.
> > >
> > > Thanks,
> > > Harsha
> > >
> > >
> > > On Tue, Aug 06, 2019 at 7:41 AM, Ismael Juma 
> wrote:
> > >
> > > > Hi Harsha,
> > > >
> > > > I mentioned policies and the authorizer. For example, with
> > > > CreateTopicPolicy, you can implement the limits you describe. If you
> have
> > > > ideas of how that should be improved, please submit a KIP. My point
> is
> > > that
> > > > this KIP is not introducing any new functionality with regards to
> what
> > > > rogue clients can do. It's using the existing protocol that is
> already
> > > > exposed via the AdminClient. So, I don't think we need to address it
> in
> > > > this KIP. Does that make sense?
> > > >
> > > > Ismael
> > > >
> > > > On Tue, Aug 6, 2019 at 7:13 AM Harsha Chintalapani 
> > > > wrote:
> > > >
> > > > Ismael,
> > > > Sure AdminClient can do that and we should've shipped a config or
> use the
> > > > existing one to block that. Not all users are yet to upgrade to
> > > AdminClient
> > > > and start using that to cause issues yet. In shared environment we
> should
> > > > allow server to set sane defaults and not allow every client to go
> ahead
> > > > create random no.of topic/partitions and replication factor. Even if
> the
> > > > users want to allow topic creation proposed in the KIP , it makes
> sense
> > > to
> > 

Build failed in Jenkins: kafka-trunk-jdk8 #3832

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] Minor: Refactor methods to add metrics to sensor in 
`StreamsMetricsImpl`

--
[...truncated 2.59 MB...]

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testSchemalessUnixToTimestamp PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaTimestampToString STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testWithSchemaTimestampToString PASSED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testConfigInvalidFormat STARTED

org.apache.kafka.connect.transforms.TimestampConverterTest > 
testConfigInvalidFormat PASSED

org.apache.kafka.connect.transforms.util.NonEmptyListValidatorTest > 
testNullList STARTED

org.apache.kafka.connect.transforms.util.NonEmptyListValidatorTest > 
testNullList PASSED

org.apache.kafka.connect.transforms.util.NonEmptyListValidatorTest > 
testEmptyList STARTED

org.apache.kafka.connect.transforms.util.NonEmptyListValidatorTest > 
testEmptyList PASSED

org.apache.kafka.connect.transforms.util.NonEmptyListValidatorTest > 
testValidList STARTED

org.apache.kafka.connect.transforms.util.NonEmptyListValidatorTest > 
testValidList PASSED

org.apache.kafka.connect.transforms.ReplaceFieldTest > withSchema STARTED

org.apache.kafka.connect.transforms.ReplaceFieldTest > withSchema PASSED

org.apache.kafka.connect.transforms.ReplaceFieldTest > schemaless STARTED

org.apache.kafka.connect.transforms.ReplaceFieldTest > schemaless PASSED

org.apache.kafka.connect.transforms.TimestampRouterTest > defaultConfiguration 
STARTED

org.apache.kafka.connect.transforms.TimestampRouterTest > defaultConfiguration 
PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeDateRecordValueWithSchemaString STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeDateRecordValueWithSchemaString PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessBooleanTrue STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessBooleanTrue PASSED

org.apache.kafka.connect.transforms.CastTest > testConfigInvalidTargetType 
STARTED

org.apache.kafka.connect.transforms.CastTest > testConfigInvalidTargetType 
PASSED

org.apache.kafka.connect.transforms.CastTest > 
testConfigMixWholeAndFieldTransformation STARTED

org.apache.kafka.connect.transforms.CastTest > 
testConfigMixWholeAndFieldTransformation PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessFloat32 STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessFloat32 PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessFloat64 STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessFloat64 PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessUnsupportedType STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessUnsupportedType PASSED

org.apache.kafka.connect.transforms.CastTest > castWholeRecordKeySchemaless 
STARTED

org.apache.kafka.connect.transforms.CastTest > castWholeRecordKeySchemaless 
PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueWithSchemaFloat32 STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueWithSchemaFloat32 PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueWithSchemaFloat64 STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueWithSchemaFloat64 PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueWithSchemaString STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueWithSchemaString PASSED

org.apache.kafka.connect.transforms.CastTest > testConfigEmpty STARTED

org.apache.kafka.connect.transforms.CastTest > testConfigEmpty PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessInt16 STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessInt16 PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessInt32 STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessInt32 PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessInt64 STARTED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueSchemalessInt64 PASSED

org.apache.kafka.connect.transforms.CastTest > castFieldsSchemaless STARTED

org.apache.kafka.connect.transforms.CastTest > castFieldsSchemaless PASSED

org.apache.kafka.connect.transforms.CastTest > testUnsupportedTargetType STARTED

org.apache.kafka.connect.transforms.CastTest > testUnsupportedTargetType PASSED

org.apache.kafka.connect.transforms.CastTest > 
castWholeRecordValueWithSchemaInt16 STARTED


Build failed in Jenkins: kafka-trunk-jdk11 #735

2019-08-06 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] Minor: Refactor methods to add metrics to sensor in 
`StreamsMetricsImpl`

--
[...truncated 2.59 MB...]

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowConfigExceptionIfMaxInFlightRequestsPerConnectionIsInvalidStringIfEosEnabled
 PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultProducerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultProducerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfProducerConfig STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfProducerConfig PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedGlobalConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedGlobalConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldBeSupportNonPrefixedConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldBeSupportNonPrefixedConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
consumerConfigShouldContainAdminClientConfigsForRetriesAndRetryBackOffMsWithAdminPrefix
 STARTED

org.apache.kafka.streams.StreamsConfigTest > 
consumerConfigShouldContainAdminClientConfigsForRetriesAndRetryBackOffMsWithAdminPrefix
 PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConifgsOnRestoreConsumer STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConifgsOnRestoreConsumer PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfGlobalConsumerConfig STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfGlobalConsumerConfig PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptExactlyOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptExactlyOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
testGetGlobalConsumerConfigsWithGlobalConsumerOverridenPrefix STARTED

org.apache.kafka.streams.StreamsConfigTest > 
testGetGlobalConsumerConfigsWithGlobalConsumerOverridenPrefix PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowExceptionIfMaxInFlightRequestsGreaterThanFiveIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowExceptionIfMaxInFlightRequestsGreaterThanFiveIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetInternalLeaveGroupOnCloseConfigToFalseInConsumer STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetInternalLeaveGroupOnCloseConfigToFalseInConsumer PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfConsumerConfig STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedPropertiesThatAreNotPartOfConsumerConfig PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldForwardCustomConfigsWithNoPrefixToAllClients STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldForwardCustomConfigsWithNoPrefixToAllClients PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfRestoreConsumerAutoCommitIsOverridden STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfRestoreConsumerAutoCommitIsOverridden PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSpecifyCorrectValueSerdeClassOnError STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSpecifyCorrectValueSerdeClassOnError PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowToSpecifyMaxInFlightRequestsPerConnectionAsStringIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowToSpecifyMaxInFlightRequestsPerConnectionAsStringIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfGlobalConsumerAutoCommitIsOverridden STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfGlobalConsumerAutoCommitIsOverridden PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetGroupInstanceIdConfigs 
STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetGroupInstanceIdConfigs 
PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerIsolationLevelIsOverriddenIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerIsolationLevelIsOverriddenIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldBeSupportNonPrefixedGlobalConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldBeSupportNonPrefixedGlobalConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportNonPrefixedAdminConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportNonPrefixedAdminConfigs PASSED


Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Satish Duggana
Hi Justine,
Thanks for the KIP. This is a nice addition to the producer client, avoiding
the need to run the admin client's create-topic APIs. Does the producer wait
for the topic to be created successfully before it tries to publish messages
to that topic? I assume it will not throw an error saying that the topic does
not exist.

As mentioned by others, having the producer override broker behavior looks
to be a concern. IMHO, the broker should have a way to use either default
constraints or custom ones, and these should only be overridable by clients
if the broker allows it, not vice versa. There should be an option on brokers
controlling whether those constraints can be overridden by producers or not.

Thanks,
Satish.

On Tue, Aug 6, 2019 at 11:39 PM Justine Olshan  wrote:
>
> Hi Harsha,
>
> After taking this all into consideration, I've updated the KIP to no longer
> allow client-side configuration of replication factor and partitions.
> Instead, the broker defaults will be used as long as the broker supports
> KIP 464.
> If the broker does not support this KIP, then the client can not create
> topics on its own. (Behavior that exists now)
>
> I think this will help with your concerns. Please let me know if you have
> further feedback.
>
> Thank you,
> Justine
>
> On Tue, Aug 6, 2019 at 10:49 AM Harsha Chintalapani  wrote:
>
> > Hi,
> > Even with policies one needs to implement that, so for every user who
> > doesn't want a producer to create topics or have limits around partitions &
> > replication factor they have to implement a policy.
> >   The KIP is changing the behavior , it might not be introducing the
> > new functionality but it will enable producers to override the create topic
> > config settings on the broker. What I am asking for to provide a config
> > that will disable auto creation of topics and if its enabled set some sane
> > defaults so that clients can create a topic with in those limits. I don't
> > see how this not related to this KIP.
> >  If the server config options as I suggested doesn't interest you at
> > least have a default CreateTopicPolices in place.
> >To give an example, In our environment we disable the
> > auto.create.topic.enable and force users to go through a centralized
> > service as we want capture more details about the topic creation and
> > requirements. With this KIP, a producer can create a topic with no bounds.
> >  Another example max.message.size we define that at cluster level and one
> > can override max.messsage.size at topic level, any misconfiguration at this
> > will cause service degradation.  Its not always about the rogue clients,
> > users can easily misconfigure and can cause an outage.
> > Again we can talk about CreateTopicPolicy but without having a default
> > implementation and asking everyone to implement their own while changing
> > the behavior in producer  doesn't make sense to me.
> >
> > Thanks,
> > Harsha
> >
> >
> > On Tue, Aug 06, 2019 at 7:41 AM, Ismael Juma  wrote:
> >
> > > Hi Harsha,
> > >
> > > I mentioned policies and the authorizer. For example, with
> > > CreateTopicPolicy, you can implement the limits you describe. If you have
> > > ideas of how that should be improved, please submit a KIP. My point is
> > that
> > > this KIP is not introducing any new functionality with regards to what
> > > rogue clients can do. It's using the existing protocol that is already
> > > exposed via the AdminClient. So, I don't think we need to address it in
> > > this KIP. Does that make sense?
> > >
> > > Ismael
> > >
> > > On Tue, Aug 6, 2019 at 7:13 AM Harsha Chintalapani 
> > > wrote:
> > >
> > > Ismael,
> > > Sure AdminClient can do that and we should've shipped a config or use the
> > > existing one to block that. Not all users are yet to upgrade to
> > AdminClient
> > > and start using that to cause issues yet. In shared environment we should
> > > allow server to set sane defaults and not allow every client to go ahead
> > > create random no.of topic/partitions and replication factor. Even if the
> > > users want to allow topic creation proposed in the KIP , it makes sense
> > to
> > > have some guards against the no.of partitions and replication factor.
> > > Authorizer is not always an answer to block requests and having users set
> > > server side configs to protect a multi-tenant environment is required.
> > In a
> > > non-secure environment Authorizer is a blunt instrument either you end up
> > > blocking everyone or allowing everyone.
> > > I am asking to have server side that allow clients to create topics or
> > not
> > > , if they are allowed set a ceiling on max no.of partitions and
> > > replication-factor.
> > >
> > > -Harsha
> > >
> > > On Mon, Aug 5 2019 at 8:58 PM,  wrote:
> > >
> > > Harsha,
> > >
> > > Rogue clients can use the admin client to create topics and partitions.
> > > ACLs and policies can help in that case as well as this one.
> > >
> > > Ismael
> > >
> > > On Mon, Aug 5, 2019, 7:48 PM Harsha Chintalapani 
> > >
> > > wrote:
> > >
> > > 

Re: [DISCUSS] KIP-447: Producer scalability for exactly once semantics

2019-08-06 Thread Jason Gustafson
Hi Boyang, thanks for the updates. I have a few more comments:

1. We are adding some new fields to TxnOffsetCommit to support group-based
fencing. Do we need these fields to be persisted in the offsets topic to
ensure that the fencing still works after a coordinator failover?

2. Since you are proposing a new `groupMetadata` API, have you considered
whether we still need the `initTransactions` overload? Another way would be
to pass it through the `sendOffsetsToTransaction` API:

void sendOffsetsToTransaction(Map<TopicPartition, OffsetAndMetadata> offsets,
    GroupMetadata groupMetadata) throws ProducerFencedException,
    IllegalGenerationException;

This seems a little more consistent with the current API and avoids the
direct dependence on the Consumer in the producer.
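
For illustration, a consume-transform-produce pass using the proposed overload
might look roughly like the sketch below. This assumes the proposed
groupMetadata() accessor on the consumer as well as the overload above; all
names are illustrative, not final API:

import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.*;
import org.apache.kafka.clients.producer.*;
import org.apache.kafka.common.TopicPartition;

// Illustrative only: one transactional consume-transform-produce pass.
// producer.initTransactions() is assumed to have been called once at startup.
static void processOnce(KafkaConsumer<String, String> consumer,
                        KafkaProducer<String, String> producer) {
    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
    producer.beginTransaction();
    Map<TopicPartition, OffsetAndMetadata> offsets = new HashMap<>();
    for (ConsumerRecord<String, String> record : records) {
        producer.send(new ProducerRecord<>("output-topic", record.key(), record.value()));
        offsets.put(new TopicPartition(record.topic(), record.partition()),
                    new OffsetAndMetadata(record.offset() + 1));
    }
    // The proposed overload: pass the group metadata so the transaction
    // coordinator can fence zombies by group generation rather than by a
    // per-input-partition transactional.id.
    producer.sendOffsetsToTransaction(offsets, consumer.groupMetadata());
    producer.commitTransaction();
}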

3. Can you clarify the behavior of the clients when the brokers do not
support the latest API versions? This is both for the new TxnOffsetCommit
and the OffsetFetch APIs. I guess the high level idea in streams is to
detect broker support before instantiating the producer and consumer. I
think that's reasonable, but we might need some approach for non-streams
use cases. One option I was considering is enforcing the latest version
through the new `sendOffsetsToTransaction` API. Basically when you use the
new API, we require support for the latest TxnOffsetCommit version. This
puts some burden on users, but it avoids breaking correctness assumptions
when the new APIs are in use. What do you think?


-Jason






On Mon, Aug 5, 2019 at 6:06 PM Boyang Chen 
wrote:

> Yep, Guozhang I think that would be best as passing in an entire consumer
> instance is indeed cumbersome.
>
> Just saw you updated KIP-429, I will follow-up to change 447 as well.
>
> Best,
> Boyang
>
> On Mon, Aug 5, 2019 at 3:18 PM Guozhang Wang  wrote:
>
> > okay I think I understand your concerns about ConsumerGroupMetadata now:
> if
> > we still want to only call initTxns once, then we should allow the
> whatever
> > passed-in parameter to reflect the latest value of generation id whenever
> > sending the offset fetch request.
> >
> > Whereas the current ConsumerGroupMetadata is a static object.
> >
> > Maybe we can consider having an extended class of ConsumerGroupMetadata
> > whose values are updated from the consumer's rebalance callback?
> >
> >
> > Guozhang
> >
> >
> > On Mon, Aug 5, 2019 at 9:26 AM Boyang Chen 
> > wrote:
> >
> > > Thank you Guozhang for the reply! I'm curious whether KIP-429 has
> > reflected
> > > the latest change on ConsumerGroupMetadata? Also regarding question
> one,
> > > the group metadata needs to be accessed via callback, does that mean we
> > > need a separate producer API such like
> > > "producer.refreshMetadata(groupMetadata)" to be able to access it
> instead
> > > of passing in the consumer instance?
> > >
> > > Boyang
> > >
> > > On Fri, Aug 2, 2019 at 4:36 PM Guozhang Wang 
> wrote:
> > >
> > > > Thanks Boyang,
> > > >
> > > > I've made another pass on KIP-447 as well as
> > > > https://github.com/apache/kafka/pull/7078, and have some minor
> > comments
> > > > about the proposed API:
> > > >
> > > > 1. it seems instead of needing the whole KafkaConsumer object, you'd
> > only
> > > > need the "ConsumerGroupMetadata", in that case can we just pass in
> that
> > > > object into the initTxns call?
> > > >
> > > > 2. the current trunk already has a public class named
> > > > (ConsumerGroupMetadata)
> > > > under o.a.k.clients.consumer created by KIP-429. If we want to just
> use
> > > > that then maybe it makes less sense to declare a base GroupMetadata
> as
> > we
> > > > are already leaking such information on the assignor anyways.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Tue, Jul 30, 2019 at 1:55 PM Boyang Chen <
> > reluctanthero...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thank you Guozhang for the reply. We will consider the interface
> > change
> > > > > from 429 as a backup plan for 447.
> > > > >
> > > > > And bumping this thread for more discussion.
> > > > >
> > > > > On Mon, Jul 22, 2019 at 6:28 PM Guozhang Wang 
> > > > wrote:
> > > > >
> > > > > > On Sat, Jul 20, 2019 at 9:50 AM Boyang Chen <
> > > > reluctanthero...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Thank you Guozhang for the suggestion! I would normally prefer
> > > > naming a
> > > > > > > flag corresponding to its functionality. Seems to me
> > > > `isolation_level`
> > > > > > > makes us another hop on information track.
> > > > > > >
> > > > > > > Fair enough, let's use a separate flag name then :)
> > > > > >
> > > > > >
> > > > > > > As for the generation.id exposure, I'm fine leveraging the new
> > API
> > > > > from
> > > > > > > 429, but however is that design finalized yet, and whether the
> > API
> > > > will
> > > > > > be
> > > > > > > added on the generic Consumer interface?
> > > > > > >
> > > > > > > The current PartitionAssignor is inside `internals` package and
> > in
> > > > > > KIP-429
> > > > > > we are going to create a new interface out of `internals` 

[jira] [Created] (KAFKA-8760) KIP-504: Add new Java Authorizer API

2019-08-06 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-8760:
-

 Summary: KIP-504: Add new Java Authorizer API 
 Key: KAFKA-8760
 URL: https://issues.apache.org/jira/browse/KAFKA-8760
 Project: Kafka
  Issue Type: New Feature
  Components: security
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 2.4.0


See 
[https://cwiki.apache.org/confluence/display/KAFKA/KIP-504+-+Add+new+Java+Authorizer+Interface]
 for details.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [VOTE] KIP-492 Add java security providers in Kafka Security config

2019-08-06 Thread Sandeep Mopuri
Hi Jeff, can you make this comment in the PR
https://github.com/apache/kafka/pull/7090, instead of the vote thread.
Let's move the discussion there.

On Tue, Aug 6, 2019 at 10:26 AM Jeff Huang  wrote:

>
>
> On 2019/07/29 19:22:02, Sandeep Mopuri  wrote:
> > Hi all, after some good discussion
> >  about
> the
> > KIP
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-492%3A+Add+java+security+providers+in+Kafka+Security+config
> >,
> > I'm starting the voting.
> >
> > This KIP proposes adding new security configuration to accept custom
> > security providers that can provide algorithms for SSL or SASL.
> >
> > --
> > Thanks,
> > M.Sai Sandeep
> >
>
> Hello,
>
> How do we handle a scenario that some providers require more information
> for installing providers?
>
> For instance, Bouncy Castle(BC) provider requires input parameter
> "fips:BCFIPS" for enabling FIPS mode.
> Example:
> Static Configuration in java.security file:
> security.provider.1=org.bouncycastle.jsse.provider.BouncyCastleJsseProvider
> fips:BCFIPS
> Dynamic Installation
> Security.addProvider(new BouncyCastleJsseProvider(“fips:BCFIPS”))
>
> So I suggested we might consider providing more info for the new config
> property, example like:
> security.provider.info=classname of provider/name of provider/initial
> parameters,
> Example for BC case:
> security.provider.info
> =org.bouncycastle.jsse.provider.BouncyCastleJsseProvider/BC/fips:BCFIPS,sun.security.provider.Sun/SUN,
> Basically info for each provider will consist of three pieces information:
> name of class, name of provider(for unit testing purpose),initial parameter
> for instantiating class.
> Still use comma ","  for separating each provider info.
>
> Jeff Huang,
>
>
>

-- 
Thanks,
M.Sai Sandeep


Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Justine Olshan
Hi Harsha,

After taking this all into consideration, I've updated the KIP to no longer
allow client-side configuration of replication factor and partitions.
Instead, the broker defaults will be used as long as the broker supports
KIP 464.
If the broker does not support this KIP, then the client cannot create
topics on its own. (This is the behavior that exists now.)

I think this will help with your concerns. Please let me know if you have
further feedback.

Thank you,
Justine

On Tue, Aug 6, 2019 at 10:49 AM Harsha Chintalapani  wrote:

> Hi,
> Even with policies one needs to implement that, so for every user who
> doesn't want a producer to create topics or have limits around partitions &
> replication factor they have to implement a policy.
>   The KIP is changing the behavior , it might not be introducing the
> new functionality but it will enable producers to override the create topic
> config settings on the broker. What I am asking for to provide a config
> that will disable auto creation of topics and if its enabled set some sane
> defaults so that clients can create a topic with in those limits. I don't
> see how this not related to this KIP.
>  If the server config options as I suggested doesn't interest you at
> least have a default CreateTopicPolices in place.
>To give an example, In our environment we disable the
> auto.create.topic.enable and force users to go through a centralized
> service as we want capture more details about the topic creation and
> requirements. With this KIP, a producer can create a topic with no bounds.
>  Another example max.message.size we define that at cluster level and one
> can override max.messsage.size at topic level, any misconfiguration at this
> will cause service degradation.  Its not always about the rogue clients,
> users can easily misconfigure and can cause an outage.
> Again we can talk about CreateTopicPolicy but without having a default
> implementation and asking everyone to implement their own while changing
> the behavior in producer  doesn't make sense to me.
>
> Thanks,
> Harsha
>
>
> On Tue, Aug 06, 2019 at 7:41 AM, Ismael Juma  wrote:
>
> > Hi Harsha,
> >
> > I mentioned policies and the authorizer. For example, with
> > CreateTopicPolicy, you can implement the limits you describe. If you have
> > ideas of how that should be improved, please submit a KIP. My point is
> that
> > this KIP is not introducing any new functionality with regards to what
> > rogue clients can do. It's using the existing protocol that is already
> > exposed via the AdminClient. So, I don't think we need to address it in
> > this KIP. Does that make sense?
> >
> > Ismael
> >
> > On Tue, Aug 6, 2019 at 7:13 AM Harsha Chintalapani 
> > wrote:
> >
> > Ismael,
> > Sure AdminClient can do that and we should've shipped a config or use the
> > existing one to block that. Not all users are yet to upgrade to
> AdminClient
> > and start using that to cause issues yet. In shared environment we should
> > allow server to set sane defaults and not allow every client to go ahead
> > create random no.of topic/partitions and replication factor. Even if the
> > users want to allow topic creation proposed in the KIP , it makes sense
> to
> > have some guards against the no.of partitions and replication factor.
> > Authorizer is not always an answer to block requests and having users set
> > server side configs to protect a multi-tenant environment is required.
> In a
> > non-secure environment Authorizer is a blunt instrument either you end up
> > blocking everyone or allowing everyone.
> > I am asking to have server side that allow clients to create topics or
> not
> > , if they are allowed set a ceiling on max no.of partitions and
> > replication-factor.
> >
> > -Harsha
> >
> > On Mon, Aug 5 2019 at 8:58 PM,  wrote:
> >
> > Harsha,
> >
> > Rogue clients can use the admin client to create topics and partitions.
> > ACLs and policies can help in that case as well as this one.
> >
> > Ismael
> >
> > On Mon, Aug 5, 2019, 7:48 PM Harsha Chintalapani 
> >
> > wrote:
> >
> > Hi Justine,
> > Thanks for the KIP.
> > "When server-side auto-creation is disabled, client-side auto-creation
> > will try to use client-side configurations"
> > If I understand correctly, this KIP is removing any server-side blocking
> > client auto creation of topic?
> > if so this will present potential issue of rogue client creating ton of
> > topic-partitions and potentially bringing down the service for everyone
> >
> > or
> >
> > degrade the service itself.
> > By reading the KIP its not clear to me that there is a clear way to block
> > auto creation topics of all together from clients by server side config.
> > Server side configs of default topic, partitions should take higher
> > precedence and client shouldn't be able to create a topic with higher
> >
> > no.of
> >
> > partitions, replication than what server config specifies.
> >
> > Thanks,
> > Harsha
> >
> > On Mon, Aug 05, 2019 at 5:24 PM, Justine 

RE: [VOTE] KIP-455: Create an Administrative API for Replica Reassignment

2019-08-06 Thread Koushik Chitta
Hey Colin,

Can the ListPartitionReassignmentsResult include the status of the current
reassignment progress of each partition? A reassignment can be in progress for
different reasons, and the status would give the option to alter the current
reassignment.

Example: a LeaderAndIsrRequest for a newly assigned replica can be
ignored/errored because of a storage exception, and the reassignment batch will
then wait indefinitely for the newly assigned replicas to be in sync with the
leader of the partition. Showing the status gives an option to alter the
affected partitions and allow the batch to complete the reassignment.

OAR = {1, 2, 3} and RAR = {4, 5, 6}

    AR              leader/ISR
    {1,2,3,4,5,6}   1/{1,2,3,4,6}   =>  The LeaderAndIsrRequest was lost/skipped
    for replica 5, and the reassignment operation will wait indefinitely for 5
    to be in sync.



Thanks,
Koushik
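
To make the suggestion concrete, a per-partition status exposed by
ListPartitionReassignmentsResult might look roughly like the sketch below.
This is purely hypothetical and not part of KIP-455 as written; all names are
illustrative:

import java.util.List;

// Hypothetical sketch: per-partition reassignment status, so that a batch
// stuck on an out-of-sync adding replica (like replica 5 above) is visible
// to tooling and can be altered.
final class PartitionReassignmentStatus {
    enum State { IN_PROGRESS, STALLED }

    private final List<Integer> addingReplicas;    // replicas still catching up to the leader
    private final List<Integer> outOfSyncReplicas; // adding replicas not yet in the ISR
    private final State state;

    PartitionReassignmentStatus(List<Integer> addingReplicas,
                                List<Integer> outOfSyncReplicas,
                                State state) {
        this.addingReplicas = addingReplicas;
        this.outOfSyncReplicas = outOfSyncReplicas;
        this.state = state;
    }

    List<Integer> addingReplicas() { return addingReplicas; }
    List<Integer> outOfSyncReplicas() { return outOfSyncReplicas; }
    State state() { return state; }
}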

-Original Message-
From: Jun Rao  
Sent: Friday, August 2, 2019 10:04 AM
To: dev 
Subject: Re: [VOTE] KIP-455: Create an Administrative API for Replica 
Reassignment

Hi, Colin,

First, since we are changing the format of LeaderAndIsrRequest, which is an 
inter broker request, it seems that we will need IBP during rolling upgrade. 
Could we add that to the compatibility section?

Regarding UnsupportedVersionException, even without ZK node version bump, we 
probably want to only use the new ZK value fields after all brokers have been 
upgraded to the new binary. Otherwise, the reassignment task may not be 
completed if the controller changes to a broker still on the old binary.
IBP is one way to achieve that. The main thing is that we need some way for the 
controller to deal with the new ZK fields. Dealing with the additional ZK node 
version bump seems a small thing on top of that?

Thanks,

Jun

On Thu, Aug 1, 2019 at 3:05 PM Colin McCabe  wrote:

> On Thu, Aug 1, 2019, at 12:00, Jun Rao wrote:
> > Hi, Colin,
> >
> > 10. Sounds good.
> >
> > 13. Our current convention is to bump up the version of ZK value if 
> > there is any format change. For example, we have bumped up the 
> > version of the value in /brokers/ids/nnn multiple times and all of 
> > those changes are compatible (just adding new fields). This has the 
> > slight benefit that it makes it clear there is a format change. 
> > Rolling upgrades and downgrades can still be supported with the 
> > version bump. For example, if you
> downgrade
> > from a compatible change, you can leave the new format in ZK and the 
> > old code will only pick up fields relevant to the old version. 
> > Upgrade will
> be
> > controlled by inter broker protocol.
>
> Hmm.  If we bump that ZK node version, we will need a new inter-broker 
> protocol version.  We also need to return UnsupportedVersionException 
> from the alterPartitionReassignments and listPartitionReassignments 
> APIs when the IBP is too low.  This sounds doable, although we might 
> need a release note that upgrading the IBP is necessary to allow 
> reassignment operations after an upgrade.
>
> best,
> Colin
>
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Jul 31, 2019 at 1:22 PM Colin McCabe  wrote:
> >
> > > Hi Jun,
> > >
> > > Thanks for taking another look at this.
> > >
> > > On Wed, Jul 31, 2019, at 09:22, Jun Rao wrote:
> > > > Hi, Stan,
> > > >
> > > > Thanks for the explanation.
> > > >
> > > > 10. If those new fields in LeaderAndIsr are only needed for 
> > > > future
> work,
> > > > perhaps they should be added when we do the future work instead 
> > > > of
> now?
> > >
> > > I think this ties in with one of the big goals of this KIP, making 
> > > it possible to distinguish reassigning replicas from normal replicas.
> This is
> > > the key to follow-on work like being able to ensure that 
> > > partitions
> with a
> > > reassignment don't get falsely flagged as under-replicated in the
> metrics,
> > > or implementing reassignment quotas that don't accidentally affect
> normal
> > > replication traffic when a replica falls out of the ISR.
> > >
> > > For these follow-on improvements, we need to have that information 
> > > in LeaderAndIsrRequest.  We could add the information in a 
> > > follow-on KIP,
> of
> > > course, but then all the improvements are blocked on that 
> > > follow-on
> KIP.
> > > That would slow things down for all of the downstream KIPs that 
> > > are
> blocked
> > > on this.
> > >
> > > Also, to keep things consistent, I think it would be best if the
> format of
> > > the data in the LeaderAndIsrRequest matched the format of the data 
> > > in ZooKeeper.  Since we're deciding on the ZK format in this KIP, 
> > > I think
> it
> > > makes sense to also decide on the format in the LeaderAndIsrRequest.
> > >
> > > > > > Should we include those two fields in UpdateMetadata and
> potentially
> > > > > > Metadata requests too?
> > >
> > > We had some discussion earlier about how metadata responses to 
> > > clients
> are
> > > getting too large, in part because they 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Harsha Chintalapani
Hi,
Even with policies, one needs to implement them: every user who doesn't want a
producer to create topics, or who wants limits around partitions & replication
factor, has to implement a policy.
  The KIP is changing the behavior: it might not be introducing new
functionality, but it will enable producers to override the create-topic config
settings on the broker. What I am asking for is a config that will disable
auto-creation of topics and, if it is enabled, set some sane defaults so that
clients can create topics within those limits. I don't see how this is not
related to this KIP.
 If the server-side config options I suggested don't interest you, at least
have a default CreateTopicPolicy in place.
   To give an example, in our environment we disable auto.create.topics.enable
and force users to go through a centralized service, as we want to capture more
details about topic creation and requirements. With this KIP, a producer can
create a topic with no bounds.
 Another example is max.message.size: we define that at the cluster level and
one can override it at the topic level; any misconfiguration here will cause
service degradation. It's not always about rogue clients; users can easily
misconfigure things and cause an outage.
Again, we can talk about CreateTopicPolicy, but not having a default
implementation and asking everyone to implement their own, while changing the
behavior in the producer, doesn't make sense to me.

Thanks,
Harsha


On Tue, Aug 06, 2019 at 7:41 AM, Ismael Juma  wrote:

> Hi Harsha,
>
> I mentioned policies and the authorizer. For example, with
> CreateTopicPolicy, you can implement the limits you describe. If you have
> ideas of how that should be improved, please submit a KIP. My point is that
> this KIP is not introducing any new functionality with regards to what
> rogue clients can do. It's using the existing protocol that is already
> exposed via the AdminClient. So, I don't think we need to address it in
> this KIP. Does that make sense?
>
> Ismael
>
> On Tue, Aug 6, 2019 at 7:13 AM Harsha Chintalapani 
> wrote:
>
> Ismael,
> Sure AdminClient can do that and we should've shipped a config or use the
> existing one to block that. Not all users are yet to upgrade to AdminClient
> and start using that to cause issues yet. In shared environment we should
> allow server to set sane defaults and not allow every client to go ahead
> create random no.of topic/partitions and replication factor. Even if the
> users want to allow topic creation proposed in the KIP , it makes sense to
> have some guards against the no.of partitions and replication factor.
> Authorizer is not always an answer to block requests and having users set
> server side configs to protect a multi-tenant environment is required. In a
> non-secure environment Authorizer is a blunt instrument either you end up
> blocking everyone or allowing everyone.
> I am asking to have server side that allow clients to create topics or not
> , if they are allowed set a ceiling on max no.of partitions and
> replication-factor.
>
> -Harsha
>
> On Mon, Aug 5 2019 at 8:58 PM,  wrote:
>
> Harsha,
>
> Rogue clients can use the admin client to create topics and partitions.
> ACLs and policies can help in that case as well as this one.
>
> Ismael
>
> On Mon, Aug 5, 2019, 7:48 PM Harsha Chintalapani 
>
> wrote:
>
> Hi Justine,
> Thanks for the KIP.
> "When server-side auto-creation is disabled, client-side auto-creation
> will try to use client-side configurations"
> If I understand correctly, this KIP is removing any server-side blocking
> client auto creation of topic?
> if so this will present potential issue of rogue client creating ton of
> topic-partitions and potentially bringing down the service for everyone
>
> or
>
> degrade the service itself.
> By reading the KIP its not clear to me that there is a clear way to block
> auto creation topics of all together from clients by server side config.
> Server side configs of default topic, partitions should take higher
> precedence and client shouldn't be able to create a topic with higher
>
> no.of
>
> partitions, replication than what server config specifies.
>
> Thanks,
> Harsha
>
> On Mon, Aug 05, 2019 at 5:24 PM, Justine Olshan 
> wrote:
>
> Hi all,
> I made some changes to the KIP. Hopefully this configuration change
>
> will
>
> make things a little clearer.
>
> https://cwiki.apache.org/confluence/display/KAFKA/
> KIP-487%3A+Client-side+Automatic+Topic+Creation+on+Producer
>
> Please let me know if you have any feedback or questions!
>
> Thank you,
> Justine
>
> On Wed, Jul 31, 2019 at 1:44 PM Colin McCabe 
>
> wrote:
>
> Hi Mickael,
>
> I think you bring up a good point. It would be better if we didn't ever
> have to set up client-side configuration for this feature, and KIP-464
> would let us skip this entirely.
>
> On Wed, Jul 31, 2019, at 09:19, Justine Olshan wrote:
>
> Hi Mickael,
> I agree that KIP-464 works on newer brokers, but I was 

Re: [VOTE] KIP-396: Add Commit/List Offsets Operations to AdminClient

2019-08-06 Thread Jason Gustafson
Thanks for the KIP. This makes sense to me. Just a couple small comments:

1. Can the listOffsets API be used to get the start and end offsets? In the
consumer, we use separate APIs for this: `beginningOffsets` and
`endOffsets` to avoid the need for sentinels. An alternative would be to
introduce an `OffsetSpec` (or maybe `OffsetQuery`) object to customize the
query. For example:

public ListOffsetsResult listOffsets(Map<TopicPartition, OffsetSpec> partitionOffsetSpecs)

The benefit is that we can avoid sentinel values and we have an extension
point for additional query options in the future. What do you think?
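
From the caller's side, the OffsetSpec idea could look roughly like the sketch
below. This is only an illustration of the suggestion; the class and method
names (OffsetSpec, ListOffsetsResult, partitionResult) are assumptions, not a
final API:

import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.common.TopicPartition;

// Illustrative sketch: query start and end offsets without sentinel values.
static void printLogBounds(AdminClient admin) throws Exception {
    Map<TopicPartition, OffsetSpec> query = new HashMap<>();
    query.put(new TopicPartition("events", 0), OffsetSpec.earliest()); // start offset
    query.put(new TopicPartition("events", 1), OffsetSpec.latest());   // end offset
    ListOffsetsResult result = admin.listOffsets(query);
    System.out.println(result.partitionResult(new TopicPartition("events", 0)).get().offset());
    System.out.println(result.partitionResult(new TopicPartition("events", 1)).get().offset());
}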

2. The ListOffset response includes the leader epoch corresponding to the
offset that was found. This is useful for finer-grained reasoning about the
log. We expose this in the consumer in the OffsetAndTimestamp object which
is returned from `offsetsForTimes`. Does it make sense to add this to
`ListOffsetsResultInfo` as well?

3. If the group is still active, the call to reset offsets will fail.
Currently this would result in an UNKNOWN_MEMBER_ID error. I think it would
make sense to map this exception to a friendlier error before raising to
the user. For example, `NonEmptyGroupException` or something like that.

-Jason





On Tue, Aug 6, 2019 at 9:33 AM Mickael Maison 
wrote:

> Hi Colin,
>
> Thank you for taking a look!
> I agree, being able to set consumer group offsets via the AdminClient
> would be really useful, hence I created this KIP.
>
> With the total absence of binding votes, I guessed I needed to make
> some changes. Do you mean you preferred the previous naming
> (commitConsumerGroupOffsets) over "resetConsumerGroupOffsets"?
>
> Thanks
>
> On Mon, Aug 5, 2019 at 8:26 PM Colin McCabe  wrote:
> >
> > I think it would be useful to have this in AdminClient.  Especially if
> we implement KIP-496: Administrative API to delete consumer offsets.  It
> would be odd to have a way to delete consumer offsets in AdminClient, but
> not to create them.  What do you think?
> >
> > best,
> > Colin
> >
> >
> > On Sun, Aug 4, 2019, at 09:27, Mickael Maison wrote:
> > > Hi,
> > >
> > > In an attempt to unblock this KIP, I've made some adjustments:
> > > I've renamed the commitConsumerGroupOffsets() methods to
> > > resetConsumerGroupOffsets() to reduce confusion. That should better
> > > highlight the differences with the regular commit() operation from the
> > > Consumer API. I've also added some details to the motivation section.
> > >
> > > So we have +5 non binding votes and 0 binding votes
> > >
> > > On Mon, Mar 25, 2019 at 1:10 PM Mickael Maison <
> mickael.mai...@gmail.com> wrote:
> > > >
> > > > Bumping this thread once again
> > > >
> > > > Ismael, have I answered your questions?
> > > > While this has received a few non-binding +1s, no committers have
> > > > voted yet. If you have concerns or questions, please let me know.
> > > >
> > > > Thanks
> > > >
> > > > On Mon, Feb 11, 2019 at 11:51 AM Mickael Maison
> > > >  wrote:
> > > > >
> > > > > Bumping this thread as it's been a couple of weeks.
> > > > >
> > > > > On Tue, Jan 22, 2019 at 2:26 PM Mickael Maison <
> mickael.mai...@gmail.com> wrote:
> > > > > >
> > > > > > Thanks Ismael for the feedback. I think your point has 2 parts:
> > > > > > - Having the reset functionality in the AdminClient:
> > > > > > The fact we have a command line tool illustrate that this
> operation is
> > > > > > relatively common. I seems valuable to be able to perform this
> > > > > > operation directly via a proper API in addition of the CLI tool.
> > > > > >
> > > > > > - Sending an OffsetCommit directly instead of relying on
> KafkaConsumer:
> > > > > > The KafkaConsumer requires a lot of stuff to commit offsets. Its
> group
> > > > > > cannot change so you need to start a new Consumer every time,
> that
> > > > > > creates new connections and overal sends more requests. Also
> there are
> > > > > > already  a bunch of AdminClient APIs that have logic very close
> to
> > > > > > what needs to be done to send a commit request, keeping the code
> small
> > > > > > and consistent.
> > > > > >
> > > > > > I've updated the KIP with these details and moved the 2nd part to
> > > > > > "Proposed changes" as it's more an implementation detail.
> > > > > >
> > > > > > I hope this answers your question
> > > > > >
> > > > > > On Mon, Jan 21, 2019 at 7:41 PM Ismael Juma 
> wrote:
> > > > > > >
> > > > > > > The KIP doesn't discuss the option of using KafkaConsumer
> directly as far
> > > > > > > as I can tell. We have tried to avoid having the same
> functionality in
> > > > > > > multiple clients so it would be good to explain why this is
> necessary here
> > > > > > > (not saying it isn't).
> > > > > > >
> > > > > > > Ismael
> > > > > > >
> > > > > > > On Mon, Jan 21, 2019, 10:29 AM Mickael Maison <
> mickael.mai...@gmail.com
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks Ryanne for the feedback, all suggestions sounded
> good, I've
> > > > > > > > updated the KIP accordingly.
> > > 

Re: [VOTE] KIP-492 Add java security providers in Kafka Security config

2019-08-06 Thread Jeff Huang



On 2019/07/29 19:22:02, Sandeep Mopuri  wrote: 
> Hi all, after some good discussion
>  about the
> KIP
> ,
> I'm starting the voting.
> 
> This KIP proposes adding new security configuration to accept custom
> security providers that can provide algorithms for SSL or SASL.
> 
> -- 
> Thanks,
> M.Sai Sandeep
> 

Hello,

How do we handle the scenario where some providers require more information in
order to be installed?

For instance, the Bouncy Castle (BC) provider requires the input parameter
"fips:BCFIPS" to enable FIPS mode.
Example:
Static configuration in the java.security file:
security.provider.1=org.bouncycastle.jsse.provider.BouncyCastleJsseProvider fips:BCFIPS
Dynamic installation:
Security.addProvider(new BouncyCastleJsseProvider("fips:BCFIPS"))

So I suggest we consider providing more info in the new config property, for
example:
security.provider.info=classname of provider/name of provider/initial parameters,
Example for the BC case:
security.provider.info=org.bouncycastle.jsse.provider.BouncyCastleJsseProvider/BC/fips:BCFIPS,sun.security.provider.Sun/SUN,
Basically, the info for each provider consists of three pieces of information:
the class name, the provider name (for unit-testing purposes), and the initial
parameter for instantiating the class. A comma "," still separates each
provider's info.

Jeff Huang,
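
To illustrate the suggested format, a broker-side helper that parses such a
property and installs the providers might look roughly like the sketch below.
The property name and entry format follow the suggestion above and are not part
of KIP-492 as currently written:

import java.security.Provider;
import java.security.Security;

// Illustrative sketch: parse "className/providerName/initArgs" entries
// separated by commas and install each provider.
static void installProviders(String securityProviderInfo) throws Exception {
    for (String entry : securityProviderInfo.split(",")) {
        if (entry.trim().isEmpty()) continue;
        String[] parts = entry.trim().split("/");
        Class<?> clazz = Class.forName(parts[0]);
        Provider provider = parts.length > 2
            ? (Provider) clazz.getConstructor(String.class).newInstance(parts[2]) // e.g. "fips:BCFIPS"
            : (Provider) clazz.getDeclaredConstructor().newInstance();
        Security.addProvider(provider);
    }
}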




Re: [ANNOUNCE] Apache Kafka 2.3.0

2019-08-06 Thread Matthias J. Sax
Thanks for pointing this out!

Should be fixed now. Colin did a corresponding PR against kafka-site
repo. Feel free to do a PR directly for fixes like this.


-Matthias

On 7/31/19 2:53 AM, Mickael Maison wrote:
> Hi,
> 
> It looks like the protocol page was not updated. It still only lists 2.2 APIs.
> http://kafka.apache.org/protocol
> 
> Thanks
> 
> On Tue, Jul 2, 2019 at 2:05 AM Colin McCabe  wrote:
>>
>> Hi Mickael,
>>
>> Thanks for pointing this out.  It should be fixed now.
>>
>> best,
>> Colin
>>
>> On Mon, Jul 1, 2019, at 09:14, Mickael Maison wrote:
>>> Colin,
>>>
>>> The javadocs links are broken:
>>> The requested URL /23/javadoc/index.html was not found on this server.
>>>
>>> It's the 3rd time in a row this happens (2.1 and 2.2 had the same
>>> issue at release). Last time, Guozhang confirmed this step is in the
>>> release process but maybe this needs to be highlighted
>>>
>>> On Tue, Jun 25, 2019 at 8:22 PM Colin McCabe  wrote:

 Thanks to everyone who reviewed the Apache blog post about 2.3.  It's live 
 now at https://blogs.apache.org/kafka/date/20190624

 Plus, Tim Berglund made a video about what's new in this release.  
 https://www.youtube.com/watch?v=sNqwJT2WguQ

 Finally, check out Stéphane Maarek's video about 2.3 here: 
 https://www.youtube.com/watch?v=YutjYKSGd64

 cheers,
 Colin


 On Tue, Jun 25, 2019, at 09:40, Colin McCabe wrote:
> The Apache Kafka community is pleased to announce the release for
> Apache Kafka 2.3.0.
> This release includes several new features, including:
>
> - There have been several improvements to the Kafka Connect REST API.
> - Kafka Connect now supports incremental cooperative rebalancing.
> - Kafka Streams now supports an in-memory session store and window
> store.
> - The AdminClient now allows users to determine what operations they
> are authorized to perform on topics.
> - There is a new broker start time metric.
> - JMXTool can now connect to secured RMI ports.
> - An incremental AlterConfigs API has been added.  The old AlterConfigs
> API has been deprecated.
> - We now track partitions which are under their min ISR count.
> - Consumers can now opt-out of automatic topic creation, even when it
> is enabled on the broker.
> - Kafka components can now use external configuration stores (KIP-421)
> - We have implemented improved replica fetcher behavior when errors are
> encountered
>
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/2.3.0/RELEASE_NOTES.html
>
>
> You can download the source and binary release (Scala 2.11 and 2.12) from:
> https://kafka.apache.org/downloads#2.3.0
>
> All of the changes in this release can be found in the release notes:
> https://www.apache.org/dist/kafka/2.3.0/RELEASE_NOTES.html
>
> You can download the source and binary release (Scala 2.11 and 2.12) from:
> https://kafka.apache.org/downloads#2.3.0
>
> ---
>
> Apache Kafka is a distributed streaming platform with four core APIs:
>
> ** The Producer API allows an application to publish a stream records
> to one or more Kafka topics.
>
> ** The Consumer API allows an application to subscribe to one or more
> topics and process the stream of records produced to them.
>
> ** The Streams API allows an application to act as a stream processor,
> consuming an input stream from one or more topics and producing an
> output stream to one or more output topics, effectively transforming
> the input streams to output streams.
>
> ** The Connector API allows building and running reusable producers or
> consumers that connect Kafka topics to existing applications or data
> systems. For example, a connector to a relational database might
> capture every change to a table.
>
>
> With these APIs, Kafka can be used for two broad classes of application:
> ** Building real-time streaming data pipelines that reliably get data
> between systems or applications.
> ** Building real-time streaming applications that transform or react to
> the streams of data.
>
> Apache Kafka is in use at large and small companies worldwide,
> including Capital One, Goldman Sachs, ING, LinkedIn, Netflix,
> Pinterest, Rabobank, Target, The New York Times, Uber, Yelp, and
> Zalando, among others.
>
> A big thank you for the following 101 contributors to this release!
>
> Aishwarya Gune, Alex Diachenko, Alex Dunayevsky, Anna Povzner, Arabelle
> Hou, Arjun Satish, A. Sophie Blee-Goldman, asutosh936, Bill Bejeck, Bob
> Barrett, Boyang Chen, Brian Bushree, cadonna, Casey Green, Chase
> Walden, Chia-Ping Tsai, Chris Egerton, 

Re: [VOTE] KIP-396: Add Commit/List Offsets Operations to AdminClient

2019-08-06 Thread Mickael Maison
Hi Colin,

Thank you for taking a look!
I agree, being able to set consumer group offsets via the AdminClient
would be really useful, hence I created this KIP.

With the total absence of binding votes, I guessed I needed to make
some changes. Do you mean you preferred the previous naming
(commitConsumerGroupOffsets) over "resetConsumerGroupOffsets"?

Thanks

On Mon, Aug 5, 2019 at 8:26 PM Colin McCabe  wrote:
>
> I think it would be useful to have this in AdminClient.  Especially if we 
> implement KIP-496: Administrative API to delete consumer offsets.  It would 
> be odd to have a way to delete consumer offsets in AdminClient, but not to 
> create them.  What do you think?
>
> best,
> Colin
>
>
> On Sun, Aug 4, 2019, at 09:27, Mickael Maison wrote:
> > Hi,
> >
> > In an attempt to unblock this KIP, I've made some adjustments:
> > I've renamed the commitConsumerGroupOffsets() methods to
> > resetConsumerGroupOffsets() to reduce confusion. That should better
> > highlight the differences with the regular commit() operation from the
> > Consumer API. I've also added some details to the motivation section.
> >
> > So we have +5 non binding votes and 0 binding votes
> >
> > On Mon, Mar 25, 2019 at 1:10 PM Mickael Maison  
> > wrote:
> > >
> > > Bumping this thread once again
> > >
> > > Ismael, have I answered your questions?
> > > While this has received a few non-binding +1s, no committers have
> > > voted yet. If you have concerns or questions, please let me know.
> > >
> > > Thanks
> > >
> > > On Mon, Feb 11, 2019 at 11:51 AM Mickael Maison
> > >  wrote:
> > > >
> > > > Bumping this thread as it's been a couple of weeks.
> > > >
> > > > On Tue, Jan 22, 2019 at 2:26 PM Mickael Maison 
> > > >  wrote:
> > > > >
> > > > > Thanks Ismael for the feedback. I think your point has 2 parts:
> > > > > - Having the reset functionality in the AdminClient:
> > > > > The fact we have a command line tool illustrate that this operation is
> > > > > relatively common. I seems valuable to be able to perform this
> > > > > operation directly via a proper API in addition of the CLI tool.
> > > > >
> > > > > - Sending an OffsetCommit directly instead of relying on 
> > > > > KafkaConsumer:
> > > > > The KafkaConsumer requires a lot of stuff to commit offsets. Its group
> > > > > cannot change so you need to start a new Consumer every time, that
> > > > > creates new connections and overal sends more requests. Also there are
> > > > > already  a bunch of AdminClient APIs that have logic very close to
> > > > > what needs to be done to send a commit request, keeping the code small
> > > > > and consistent.
> > > > >
> > > > > I've updated the KIP with these details and moved the 2nd part to
> > > > > "Proposed changes" as it's more an implementation detail.
> > > > >
> > > > > I hope this answers your question
> > > > >
> > > > > On Mon, Jan 21, 2019 at 7:41 PM Ismael Juma  wrote:
> > > > > >
> > > > > > The KIP doesn't discuss the option of using KafkaConsumer directly 
> > > > > > as far
> > > > > > as I can tell. We have tried to avoid having the same functionality 
> > > > > > in
> > > > > > multiple clients so it would be good to explain why this is 
> > > > > > necessary here
> > > > > > (not saying it isn't).
> > > > > >
> > > > > > Ismael
> > > > > >
> > > > > > On Mon, Jan 21, 2019, 10:29 AM Mickael Maison 
> > > > > >  > > > > > wrote:
> > > > > >
> > > > > > > Thanks Ryanne for the feedback, all suggestions sounded good, I've
> > > > > > > updated the KIP accordingly.
> > > > > > >
> > > > > > > On Mon, Jan 21, 2019 at 3:43 PM Ryanne Dolan 
> > > > > > > 
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > +1 (non-binding)
> > > > > > > >
> > > > > > > > But I suggest:
> > > > > > > >
> > > > > > > > - drop "get" from getOffset, getTimestamp.
> > > > > > > >
> > > > > > > > - add to the motivation section why this is better than 
> > > > > > > > constructing a
> > > > > > > > KafkaConsumer and using seek(), commit() etc.
> > > > > > > >
> > > > > > > > - add some rejected alternatives.
> > > > > > > >
> > > > > > > > Ryanne
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Jan 21, 2019, 7:57 AM Dongjin Lee  > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > We have +4 non-binding for this vote. Is there any committer 
> > > > > > > > > who is
> > > > > > > > > interested in this issue?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > > Dongjin
> > > > > > > > >
> > > > > > > > > On Mon, Jan 21, 2019 at 10:33 PM Andrew Schofield <
> > > > > > > > > andrew_schofi...@live.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1 (non-binding). Thanks for the KIP.
> > > > > > > > > >
> > > > > > > > > > On 21/01/2019, 12:45, "Eno Thereska" 
> > > > > > > > > > 
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > +1 (non binding). Thanks.
> > > > > > > > > >
> > > > > > > > > > On Mon, Jan 21, 2019 at 12:30 PM Mickael Maison <
> > > > > > > > > > 

Re: [VOTE] KIP-476: Add Java AdminClient interface

2019-08-06 Thread Andy Coates
Hi all,

Just a quick note to let you all know that the KIP ran into a slight hiccup
along the way.  The original change saw the return value of
`KafkaClientSupplier.getAdminClient` changed from `AdminClient` to the new
`Admin`, thereby allowing implementers to return a proxy if they so wanted.
However, changing the return value from the class to the interface was a
binary-incompatible change - doh!  Hence the KIP has been updated to instead
leave `getAdminClient`'s signature as-is and just deprecate it in favour of a
new `Admin getAdmin(final Map<String, Object> config)` method.

PR here: https://github.com/apache/kafka/pull/7162
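
For reference, the resulting shape of the supplier interface would look roughly
like the simplified sketch below (the other factory methods such as
getProducer/getConsumer are omitted); this is a sketch of the change described
above, not the full interface:

import java.util.Map;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClient;

interface KafkaClientSupplier {
    // Existing method, kept for binary compatibility but deprecated.
    @Deprecated
    AdminClient getAdminClient(Map<String, Object> config);

    // New method; the default implementation keeps existing implementers
    // working unchanged, since AdminClient implements the Admin interface.
    default Admin getAdmin(Map<String, Object> config) {
        return getAdminClient(config);
    }
}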

Let me know if anyone has any issues / suggestions, etc.

Andy

On Fri, 12 Jul 2019 at 20:35, Andy Coates  wrote:

> Awesome sauce - so I'd like to close the voting. final vote was:
>
> 4 for binding, none against
> 3 non-binding, none against.
>
> I'll update the KIP to reflect the passing of the vote.
>
> Thanks you all for your time & brain power,
>
> Andy
>
> On Thu, 11 Jul 2019 at 14:51, Rajini Sivaram 
> wrote:
>
>> +1 (binding)
>>
>> Thanks for the KIP, Andy!
>>
>> Regards,
>>
>> Rajini
>>
>>
>> On Thu, Jul 11, 2019 at 1:18 PM Gwen Shapira  wrote:
>>
>> > +1 (binding)
>> >
>> > Thank you for the improvement.
>> >
>> > On Thu, Jul 11, 2019, 3:53 AM Andy Coates  wrote:
>> >
>> > > Hi All,
>> > >
>> > > So voting currently stands on:
>> > >
>> > > Binding:
>> > > +1 Matthias,
>> > > +1 Colin
>> > >
>> > > Non-binding:
>> > > +1  Thomas Becker
>> > > +1 Satish Guggana
>> > > +1 Ryan Dolan
>> > >
>> > > So we're still 1 binding vote short. :(
>> > >
>> > >
>> > > On Wed, 3 Jul 2019 at 23:08, Matthias J. Sax 
>> > > wrote:
>> > >
>> > > > Thanks for the details Colin and Andy.
>> > > >
>> > > > My indent was not to block the KIP, but it seems to be a fair
>> question
>> > > > to ask.
>> > > >
>> > > > I talked to Ismael offline about it and understand his reasoning
>> better
>> > > > now. If we don't deprecate `abstract AdminClient` class, it seems
>> > > > reasonable to not deprecate the corresponding factory methods
>> either.
>> > > >
>> > > >
>> > > > +1 (binding) on the current proposal
>> > > >
>> > > >
>> > > >
>> > > > -Matthias
>> > > >
>> > > > On 7/3/19 5:03 AM, Andy Coates wrote:
>> > > > > Matthias,
>> > > > >
>> > > > > I was referring to platforms such as spark or flink that support
>> > > multiple
>> > > > > versions of the Kafka clients. Ismael mentioned this higher up on
>> the
>> > > > > thread.
>> > > > >
>> > > > > I'd prefer this KIP didn't get held up over somewhat unrelated
>> > change,
>> > > > i.e.
>> > > > > should the factory method be on the interface or utility class.
>> > > Surely,
>> > > > > now would be a great time to change this if we wanted, but we can
>> > also
>> > > > > change this later if we need to.  In the interest of moving
>> forward,
>> > > can
>> > > > I
>> > > > > propose we leave the factory methods as they are in the KIP?
>> > > > >
>> > > > > Thanks,
>> > > > >
>> > > > > Andy
>> > > > >
>> > > > > On Tue, 2 Jul 2019 at 17:14, Colin McCabe 
>> > wrote:
>> > > > >
>> > > > >> On Tue, Jul 2, 2019, at 09:14, Colin McCabe wrote:
>> > > > >>> On Mon, Jul 1, 2019, at 23:30, Matthias J. Sax wrote:
>> > > >  Not sure, if I understand the argument?
>> > > > 
>> > > >  Why would anyone need to support multiple client side versions?
>> > > >  Clients/brokers are forward/backward compatible anyway.
>> > > > >>>
>> > > > >>> When you're using many different libraries, it is helpful if
>> they
>> > > don't
>> > > > >>> impose tight constraints on what versions their dependencies
>> are.
>> > > > >>> Otherwise you can easily get in a situation where the
>> constraints
>> > > can't
>> > > > >>> be satisfied.
>> > > > >>>
>> > > > 
>> > > >  Also, if one really supports multiple client side versions,
>> won't
>> > > they
>> > > >  use multiple shaded dependencies for different versions?
>> > > > >>>
>> > > > >>> Shading the Kafka client doesn't really work, because of how we
>> use
>> > > > >> reflection.
>> > > > >>>
>> > > > 
>> > > >  Last, it's possible to suppress warnings (at least in Java).
>> > > > >>>
>> > > > >>> But not in Scala.  So that does not help (for example), Scala
>> > users.
>> > > > >>
>> > > > >> I meant to write "Spark users" here.
>> > > > >>
>> > > > >> C.
>> > > > >>
>> > > > >>>
>> > > > >>> I agree that in general we should be using deprecation when
>> > > > >>> appropriate, regardless of the potential annoyances to users.
>> But
>> > > I'm
>> > > > >>> not sure deprecating this method is really worth it.
>> > > > >>>
>> > > > >>> best,
>> > > > >>> Colin
>> > > > >>>
>> > > > >>>
>> > > > 
>> > > >  Can you elaborate?
>> > > > 
>> > > >  IMHO, just adding a statement to JavaDocs is a little weak,
>> and at
>> > > > some
>> > > >  point, we need to deprecate those methods anyway if we ever
>> want
>> > to
>> > > >  remove them. The earlier we deprecate them, the earlier 

Re: [DISCUSS] KIP-317: Transparent Data Encryption

2019-08-06 Thread Sönke Liebau
Hi,

I have so far received pretty much no comments on the technical details
outlined in the KIP. While I am happy to continue with my own ideas of how to
implement this, I would much prefer to at least get a very broad "looks good in
principle, but still lots to flesh out" from a few people before I put more
work into this.

Best regards,
Sönke
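
To make the structure in the quoted mail below a bit more tangible, the three
pluggable pieces might look roughly like the sketch that follows. All names and
signatures here are assumptions for illustration only, not definitions from the
KIP:

// Purely illustrative sketch of the pluggable structure described below.
interface KeyManager {            // broker side: decides which key to use and when to roll it
    String currentKeyIdFor(String topic);
    void rotateKey(String topic);
}

interface KeyProvider {           // client side: retrieves key material the KeyManager points at
    byte[] keyFor(String keyId);
}

interface EncryptionEngine {      // client side: performs the actual encryption/decryption
    byte[] encrypt(byte[] plaintext, String keyId);
    byte[] decrypt(byte[] ciphertext, String keyId);
}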




On Tue, 21 May 2019 at 14:15, Sönke Liebau 
wrote:

> Hi everybody,
>
> I'd like to rekindle the discussion around KIP-317.
> I have reworked the KIP a little bit in order to design everything as a
> pluggable implementation. During the course of that work I've also decided
> to rename the KIP, as encryption will only be transparent in some cases. It
> is now called "Add end to end data encryption functionality to Apache
> Kafka" [1].
>
> I'd very much appreciate it if you could give the KIP a quick read. This
> is not at this point a fully fleshed out design, as I would like to agree
> on the underlying structure that I came up with first, before spending time
> on details.
>
> TL/DR is:
> Create three pluggable classes:
> KeyManager runs on the broker and manages which keys to use, key rollover
> etc
> KeyProvider runs on the client and retrieves keys based on what the
> KeyManager tells it
> EncryptionEngine runs on the client andhandles the actual encryption
> First idea of control flow between these components can be seen at [2]
>
> Please let me know any thoughts or concerns that you may have!
>
> Best regards,
> Sönke
>
> [1]
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-317%3A+Add+end-to-end+data+encryption+functionality+to+Apache+Kafka
> [2]
> https://cwiki.apache.org/confluence/download/attachments/85479936/kafka_e2e-encryption_control-flow.png?version=1=1558439227551=v2
>
>
>
> On Fri, 10 Aug 2018 at 14:05, Sönke Liebau 
> wrote:
>
>> Hi Viktor,
>>
>> thanks for your input! We could accommodate magic headers by removing any
>> known fixed bytes pre-encryption, sticking them in a header field and
>> prepending them after decryption. However, I am not sure whether this is
>> actually necessary, as most modern (AES for sure) algorithms are considered
>> to be resistant to known-plaintext types of attack. Even if the entire
>> plaintext is known to the attacker he still needs to brute-force the key -
>> which may take a while.
>>
>> Something different to consider in this context are compression
>> sidechannel attacks like CRIME or BREACH, which may be relevant depending
>> on what type of data is being sent through Kafka. Both these attacks depend
>> on the encrypted record containing a combination of secret and user
>> controlled data.
>> For example if Kafka was used to forward data that the user entered on a
>> website along with a secret API key that the website adds to a back-end
>> server and the user can obtain the Kafka messages, these attacks would
>> become relevant. Not much we can do about that except disallow encryption
>> when compression is enabled (TLS chose this approach in version 1.3)
>>
>> I agree with you, that we definitely need to clearly document any risks
>> and how much security can reasonably be expected in any given scenario. We
>> might even consider logging a warning message when sending data that is
>> compressed and encrypted.
>>
>> On a different note, I've started amending the KIP to make key management
>> and distribution pluggable, should hopefully be able to publish sometime
>> Monday.
>>
>> Best regards,
>> Sönke
>>
>>
>> On Thu, Jun 21, 2018 at 12:26 PM, Viktor Somogyi > > wrote:
>>
>>> Hi Sönke,
>>>
>>> Compressing before encrypting has its dangers as well. Suppose you have a
>>> known compression format which adds a magic header and you're using a
>>> block
>>> cipher with a small enough block, then it becomes much easier to figure
>>> out
>>> the encryption key. For instance you can look at Snappy's stream
>>> identifier:
>>> https://github.com/google/snappy/blob/master/framing_format.txt
>>> . Based on this you should only use block ciphers where block sizes are
>>> much larger then 6 bytes. AES for instance should be good with its 128
>>> bits
>>> = 16 bytes, but even this isn't entirely secure as the first 6 bytes have
>>> already leaked some information - and how much depends on the cipher.
>>> Also if we suppose that an adversary accesses a broker and takes all the
>>> data, they'll have a much easier job decrypting it as they'll have many
>>> more examples.
>>> So overall we should make sure to define and document the compatible
>>> encryptions with the supported compression methods and the level of
>>> security they provide to make sure the users are fully aware of the
>>> security implications.
>>>
>>> Cheers,
>>> Viktor
>>>
>>> On Tue, Jun 19, 2018 at 11:55 AM Sönke Liebau
>>>  wrote:
>>>
>>> > Hi Stephane,
>>> >
>>> > thanks for pointing out the broken pictures, I fixed those.
>>> >
>>> > Regarding encrypting before or after batching the messages, you are
>>> > correct, I 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Ismael Juma
Hi Harsha,

I mentioned policies and the authorizer. For example, with
CreateTopicPolicy, you can implement the limits you describe. If you have
ideas of how that should be improved, please submit a KIP. My point is that
this KIP is not introducing any new functionality with regards to what
rogue clients can do. It's using the existing protocol that is already
exposed via the AdminClient. So, I don't think we need to address it in
this KIP. Does that make sense?

Ismael
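
For readers who have not used the hook mentioned above: a broker can be pointed at a
CreateTopicPolicy implementation via the create.topic.policy.class.name setting. A minimal
sketch of a policy enforcing the kind of ceilings discussed in this thread (the class name
and limits are illustrative, not part of any KIP):

{code:java}
import java.util.Map;
import org.apache.kafka.common.errors.PolicyViolationException;
import org.apache.kafka.server.policy.CreateTopicPolicy;

// Broker-side policy capping partitions and replication factor.
// Enabled with: create.topic.policy.class.name=com.example.MaxSizeTopicPolicy
public class MaxSizeTopicPolicy implements CreateTopicPolicy {

    private static final int MAX_PARTITIONS = 64;          // illustrative limit
    private static final short MAX_REPLICATION_FACTOR = 3; // illustrative limit

    @Override
    public void configure(Map<String, ?> configs) { }

    @Override
    public void validate(RequestMetadata request) throws PolicyViolationException {
        Integer partitions = request.numPartitions();
        Short replicationFactor = request.replicationFactor();
        if (partitions != null && partitions > MAX_PARTITIONS) {
            throw new PolicyViolationException("Topic " + request.topic()
                    + " requests " + partitions + " partitions, max is " + MAX_PARTITIONS);
        }
        if (replicationFactor != null && replicationFactor > MAX_REPLICATION_FACTOR) {
            throw new PolicyViolationException("Topic " + request.topic()
                    + " requests replication factor " + replicationFactor
                    + ", max is " + MAX_REPLICATION_FACTOR);
        }
    }

    @Override
    public void close() { }
}
{code}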

On Tue, Aug 6, 2019 at 7:13 AM Harsha Chintalapani  wrote:

> Ismael,
>   Sure AdminClient can do that and we should've shipped a config or
> use the existing one to block that. Not all users are yet to upgrade to
> AdminClient and start using that to cause  issues yet.
>  In shared environment we should allow server to set sane
> defaults and not allow every client to go ahead create random no.of
> topic/partitions and replication factor.   Even if the users want to allow
> topic creation proposed in the KIP , it makes sense to have some guards
> against the no.of partitions and replication factor. Authorizer is not
> always an answer to block requests and having users set server side configs
> to protect a multi-tenant environment is required.  In a non-secure
> environment Authorizer is a blunt instrument either you end up blocking
> everyone or allowing everyone.
> I am asking to have server side that allow clients to create topics or not
> , if they are allowed  set a ceiling on max no.of partitions and
> replication-factor.
>
> -Harsha
>
>
>
> On Mon, Aug 5 2019 at 8:58 PM,  wrote:
>
> > Harsha,
> >
> > Rogue clients can use the admin client to create topics and partitions.
> > ACLs and policies can help in that case as well as this one.
> >
> > Ismael
> >
> > On Mon, Aug 5, 2019, 7:48 PM Harsha Chintalapani 
> wrote:
> >
> > Hi Justine,
> > Thanks for the KIP.
> > "When server-side auto-creation is disabled, client-side auto-creation
> > will try to use client-side configurations"
> > If I understand correctly, this KIP is removing any server-side blocking
> > client auto creation of topic?
> > if so this will present potential issue of rogue client creating ton of
> > topic-partitions and potentially bringing down the service for everyone
> or
> > degrade the service itself.
> > By reading the KIP it's not clear to me that there is a clear way to block
> > auto creation topics of all together from clients by server side config.
> > Server side configs of default topic, partitions should take higher
> > precedence and client shouldn't be able to create a topic with higher
> no.of
> > partitions, replication than what server config specifies.
> >
> > >
> >
> > Thanks,
> > Harsha
> >
> > >
> > >
> > >
> >
> > On Mon, Aug 05, 2019 at 5:24 PM, Justine Olshan 
> > wrote:
> >
> > >
> >
> > > Hi all,
> > > I made some changes to the KIP. Hopefully this configuration change
> will
> > > make things a little clearer.
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/
> > > KIP-487%3A+Client-side+Automatic+Topic+Creation+on+Producer
> > >
> > > Please let me know if you have any feedback or questions!
> > >
> > > Thank you,
> > > Justine
> > >
> > > On Wed, Jul 31, 2019 at 1:44 PM Colin McCabe 
> > wrote:
> > >
> > > Hi Mickael,
> > >
> > > I think you bring up a good point. It would be better if we didn't ever
> > > have to set up client-side configuration for this feature, and KIP-464
> > > would let us skip this entirely.
> > >
> > > On Wed, Jul 31, 2019, at 09:19, Justine Olshan wrote:
> > >
> > > Hi Mickael,
> > > I agree that KIP-464 works on newer brokers, but I was a bit worried
> how
> > > things would play out on older brokers that* do not *have KIP 464
> > >
> > > included.
> > >
> > > Is it enough to throw an error in this case when producer configs are
> > not
> > > specified?
> > >
> > > I think the right thing to do would be to log an error message in the
> > > client. We will need to have that capability in any case, to cover
> > > scenarios like the client trying to auto-create a topic that they don't
> > > have permission to create. Or a client trying to create a topic on a
> > broker
> > > so old that CreateTopicsRequest is not supported.
> > >
> > > The big downside to relying on KIP-464 is that it is a very recent
> > feature
> > > -- so recent that it hasn't even made its way to any official Apache
> > > release. It's scheduled for the upcoming 2.4 release in a few months.
> > >
> > > So if you view this KIP as a step towards removing broker-side
> > > auto-create, you might want to support older brokers just to accelerate
> > > adoption, and hasten the day when we can finally flip broker-side
> > > auto-create to off (or even remove it entirely).
> > >
> > > I have to agree, though, that having client-side configurations for
> > number
> > > of partitions and replication factor is messy. Maybe it would be worth
> > it
> > > to restrict support to post-KIP-464 brokers, if we could avoid 

Re: [DISCUSS] KIP-498: Add client-side configuration for maximum response size to protect against OOM

2019-08-06 Thread Gokul Ramanan Subramanian
Hi Alexandre.

Thanks for this analysis.

IMHO, there are 4 ways to go about this:

1. We don't fix the bug directly but instead update the Kafka documentation
telling clients to configure themselves correctly - Silly but easy to
achieve.
2. We adopt Stanislav's solution that fixes the problem - Easy to achieve,
potentially adds inflexibility in the future if we want to change handshake
protocol. However, changing the handshake protocol is going to be a
backwards incompatible change anyway with or without Stanislav's solution.
3. We adopt Alexandre's solution - Easy to achieve, but has shortcomings
Alexandre has highlighted.
4. We pivot KIP-498 to focus on standardizing the handshake protocol - Not
easy to achieve, since this will be a backwards incompatible change and
will require client and server redeployments in correct order. Further,
this can be a hard problem to solve given that various transport layer
protocols have different headers. In order for the "new handshake" protocol
to work, it would have to work in the same namespace as those headers,
which will require careful tuning of handshake constants.

Any thoughts from the community on how we can proceed?

Thanks.

On Tue, Aug 6, 2019 at 12:41 PM Alexandre Dupriez <
alexandre.dupr...@gmail.com> wrote:

> Hello,
>
> I wrote a small change [1] to make clients validate the size of messages
> received from a broker at the protocol-level.
> However I don't like the change. It does not really solve the problem which
> originally motivated the KIP - that is, protocol mismatch (plaintext to SSL
> endpoint). More specifically, a few problems I can see are listed below. This
> is a non-exhaustive list. They also have been added to KIP-498 [2].
>
> 1) Incorrect failure mode
> As was experimented and as can be seen as part of the integration tests,
> when an invalid size is detected and the exception InvalidReceiveException
> is thrown, consumers and producers keep retrying until the poll timeout
> expires or the maximum number of retries is reached. This is incorrect if
> we consider the use case of protocol mismatch which motivated this change.
> Indeed, producers and consumers would need to fail fast as retries will
> only prolong the time to remediation from users/administrators.
>
> 2) Incomplete remediation
> The proposed change cannot provide a definite guarantee against OOM in all
> scenarios - for instance, it will still manifest if the maximum size is set
> to 100 MB and the JVM is under memory pressure and has less than 100 MB of
> allocatable memory.
>
> 3) Illegitimate message rejection
> Even worse: what if the property is incorrectly configured and prevents
> legitimate messages from reaching the client?
>
> 4) Unclear configuration parameter
> 4.a) The name max.response.size intends to mirror the existing
> max.request.size from the producer's configuration properties. However,
> max.request.size intends to check the size of producer records as provided
> by a client; while max.response.size is to check the size directly decoded
> from the network according to Kafka's binary protocol.
> 4.b) On the broker, the property socket.request.max.bytes is used to
> validate the size of messages received by the server. The new property
> serves the same purpose, which introduces duplicated semantic, even if one
> property is characterised with the keyword "request" and the other with
> "response", in both cases reflecting the perspective adopted from either a
> client or a server.
>
> Please let me know what you think. An alternative mitigation may be worth
> investigated for the detection of protocol mismatch in the client.
>
> [1] https://github.com/apache/kafka/pull/7160
> [2]
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-498%3A+Add+client-side+configuration+for+maximum+response+size+to+protect+against+OOM
>
> Le jeu. 1 août 2019 à 09:42, Alexandre Dupriez <
> alexandre.dupr...@gmail.com>
> a écrit :
>
> > Thanks Gokul and Stanislav for your answers.
> >
> > I went through the PR 5940 [1]. Indeed Gokul, your reasoning echoes the
> > observations of Ismael about a potential inter-protocol layering
> violation.
> >
> > As you said Stanislav, the server-side SSL engine responds with an alert
> > with code 80 (internal_error) from what I saw when reproducing the OOM.
> > Since the Alert is generated below the application layer, I am not sure
> > what could be done on the broker to handle the scenario more gracefully.
> > Interestingly, the SSL engine emits the possibility of receiving a
> > plaintext message in debug logs [2].
> >
> > The idea was indeed to perform a simple check on the message size decoded
> > in NetworkReceive against a configurable value, and throw
> > an InvalidReceiveException in a similar fashion as what happens on the
> > broker, where a non-unlimited maxSize can be provided. Basically, mimic
> the
> > behaviour on the broker. The default value for the maximal request size
> on
> > the broker is 100 MB which you are 

Re: [DISCUSS] KIP-487: Automatic Topic Creation on Producer

2019-08-06 Thread Harsha Chintalapani
Ismael,
  Sure, AdminClient can do that, and we should've shipped a config or
used an existing one to block it. Not all users have upgraded to
AdminClient and started using it to cause issues yet.
 In a shared environment we should allow the server to set sane
defaults and not allow every client to go ahead and create a random no. of
topics/partitions and replication factor.  Even if users want to allow the
topic creation proposed in the KIP, it makes sense to have some guards
against the no. of partitions and replication factor. The Authorizer is not
always an answer to block requests, and having users set server-side configs
to protect a multi-tenant environment is required.  In a non-secure
environment the Authorizer is a blunt instrument: either you end up blocking
everyone or allowing everyone.
I am asking for a server-side config that controls whether clients can create topics
and, if they are allowed, sets a ceiling on the max no. of partitions and
replication-factor.

-Harsha
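
For reference, the server-side settings that already exist in this area are roughly the
following (a non-exhaustive sketch; the values and the policy class name are illustrative,
and the ceiling being asked for above would go beyond them):

# Disable broker-side auto-creation entirely
auto.create.topics.enable=false
# Defaults applied when creation requests omit sizing
num.partitions=3
default.replication.factor=3
# Pluggable validation hook (class name illustrative)
create.topic.policy.class.name=com.example.MaxSizeTopicPolicy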



On Mon, Aug 5 2019 at 8:58 PM,  wrote:

> Harsha,
>
> Rogue clients can use the admin client to create topics and partitions.
> ACLs and policies can help in that case as well as this one.
>
> Ismael
>
> On Mon, Aug 5, 2019, 7:48 PM Harsha Chintalapani  wrote:
>
> Hi Justine,
> Thanks for the KIP.
> "When server-side auto-creation is disabled, client-side auto-creation
> will try to use client-side configurations"
> If I understand correctly, this KIP is removing any server-side blocking
> client auto creation of topic?
> if so this will present potential issue of rogue client creating ton of
> topic-partitions and potentially bringing down the service for everyone or
> degrade the service itself.
> By reading the KIP it's not clear to me that there is a clear way to block
> auto creation topics of all together from clients by server side config.
> Server side configs of default topic, partitions should take higher
> precedence and client shouldn't be able to create a topic with higher no.of
> partitions, replication than what server config specifies.
>
> >
>
> Thanks,
> Harsha
>
> >
> >
> >
>
> On Mon, Aug 05, 2019 at 5:24 PM, Justine Olshan 
> wrote:
>
> >
>
> > Hi all,
> > I made some changes to the KIP. Hopefully this configuration change will
> > make things a little clearer.
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/
> > KIP-487%3A+Client-side+Automatic+Topic+Creation+on+Producer
> >
> > Please let me know if you have any feedback or questions!
> >
> > Thank you,
> > Justine
> >
> > On Wed, Jul 31, 2019 at 1:44 PM Colin McCabe 
> wrote:
> >
> > Hi Mickael,
> >
> > I think you bring up a good point. It would be better if we didn't ever
> > have to set up client-side configuration for this feature, and KIP-464
> > would let us skip this entirely.
> >
> > On Wed, Jul 31, 2019, at 09:19, Justine Olshan wrote:
> >
> > Hi Mickael,
> > I agree that KIP-464 works on newer brokers, but I was a bit worried how
> > things would play out on older brokers that* do not *have KIP 464
> >
> > included.
> >
> > Is it enough to throw an error in this case when producer configs are
> not
> > specified?
> >
> > I think the right thing to do would be to log an error message in the
> > client. We will need to have that capability in any case, to cover
> > scenarios like the client trying to auto-create a topic that they don't
> > have permission to create. Or a client trying to create a topic on a
> broker
> > so old that CreateTopicsRequest is not supported.
> >
> > The big downside to relying on KIP-464 is that it is a very recent
> feature
> > -- so recent that it hasn't even made its way to any official Apache
> > release. It's scheduled for the upcoming 2.4 release in a few months.
> >
> > So if you view this KIP as a step towards removing broker-side
> > auto-create, you might want to support older brokers just to accelerate
> > adoption, and hasten the day when we can finally flip broker-side
> > auto-create to off (or even remove it entirely).
> >
> > I have to agree, though, that having client-side configurations for
> number
> > of partitions and replication factor is messy. Maybe it would be worth
> it
> > to restrict support to post-KIP-464 brokers, if we could avoid creating
> > more configs.
> >
> > best,
> > Colin
> >
> > On Wed, Jul 31, 2019 at 9:10 AM Mickael Maison
> > wrote:
> >
> > Hi Justine,
> >
> > We can rely on KIP-464 which allows to omit the partition count or
> > replication factor when creating a topic. In that case, the broker
> defaults
> > are used.
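
(For readers: the KIP-464 behaviour referenced here surfaces in the AdminClient as a
NewTopic constructor that takes optional sizing. A minimal sketch, assuming an AK 2.4+
broker and client; the topic name and bootstrap address are placeholders:)

{code:java}
import java.util.Collections;
import java.util.Optional;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateWithBrokerDefaults {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // KIP-464: omit partition count and replication factor so the broker-side
            // defaults (num.partitions, default.replication.factor) are applied.
            NewTopic topic = new NewTopic("my-topic", Optional.empty(), Optional.empty());
            admin.createTopics(Collections.singleton(topic)).all().get();
        }
    }
}
{code}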
> >
> > On Wed, Jul 31, 2019 at 4:55 PM Justine Olshan 
> > wrote:
> >
> > Michael,
> > That makes sense to me!
> > To clarify, in the current state of the KIP, the producer does not
> >
> > rely
> >
> > on
> >
> > the broker to autocreate--if the broker's config is disabled, then
> >
> > the
> >
> > producer can autocreate on its own with a create topic request (the
> >
> > same
> >
> > type of request the admin client uses).
> > However, if both configs are 

[jira] [Created] (KAFKA-8759) Message Order is reversed when client run behind a VPN

2019-08-06 Thread M. Manna (JIRA)
M. Manna created KAFKA-8759:
---

 Summary: Message Order is reversed when client run behind a VPN
 Key: KAFKA-8759
 URL: https://issues.apache.org/jira/browse/KAFKA-8759
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 2.2.0
Reporter: M. Manna


We have noticed this behaviour whilst testing the console producer against a Kafka 
service installed on GCP. We have been using a fork of the Confluent Helm chart.

[https://github.com/helm/charts/tree/master/incubator/kafka]

FYI - we've used cp 5.3.0 with Apache Kafka 2.3.0

Our VPN connection throughput was 1 mbps. Upon connecting to VPN, we opened a 
console producer client (2.2.0) with the following command:


{code:java}
kafka-console-producer.bat --topic some_topic --broker-list 
gcp_broker1:19092,gcp_broker2:19092,gcp_broker3:19092{code}
 

Similarly, we ran a consumer with the following command before publishing 
messages
{code:java}
kafka-console-consumer.bat --topic some_topic --bootstrap-server 
gcp_broker1:19092,gcp_broker2:19092,gcp_broker3:19092{code}

On the producer console, we did receive a caret (>) prompt for publishing, so we 
entered messages:
{code:java}
>one
>two
>three
>{code}
After a while, it responded with NETWORK_EXCEPTION


{code:java}
[2019-08-02 11:17:19,690] WARN [Producer clientId=console-producer] Got error 
produce response with correlation id 8 on topic-partition some_topic-0, 
retrying (2 attempts left). Error: NETWORK_EXCEPTION 
(org.apache.kafka.clients.producer.internals.Sender){code}

We then hit "Enter" and received a carat (>) back


{code:java}
[2019-08-02 11:17:19,690] WARN [Producer clientId=console-producer] Got error 
produce response with correlation id 8 on topic-partition some_topic-0, 
retrying (2 attempts left). Error: NETWORK_EXCEPTION 
(org.apache.kafka.clients.producer.internals.Sender)

>{code}
 

Immediately after that, on consumer window, we received the following:
{code:java}
three
two
one{code}
 

We ran the same exercise from a regular network (wifi/lan) and didn't see this 
issue (i.e. works as described on Quickstart). 

This is slightly concerning for us since tunneling into a VPN shouldn't have 
any impact (or should it?) on how the Kafka message protocol works over TCP. It seems 
that Kafka couldn't guarantee the order of messages when network latency is 
involved.

FYI 
1) We tried on VPN with --request_timeout_ms 12 and still same results.
2) Our setup was 3 node (3 br, 3 zk) with every topic having 1 partition only 
(RF - 3).
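
One possible explanation (hedged, not verified against this setup): the produce retries 
visible in the NETWORK_EXCEPTION warnings above, combined with the producer default of 
max.in.flight.requests.per.connection=5 (unless overridden), can let a retried batch land 
after batches sent later, which would reverse the observed order. If that is the cause, 
ordering per partition under retries is typically preserved by limiting in-flight requests 
(or enabling idempotence on newer clients), e.g.:

{code:java}
kafka-console-producer.bat --topic some_topic --broker-list gcp_broker1:19092,gcp_broker2:19092,gcp_broker3:19092 --producer-property max.in.flight.requests.per.connection=1
{code}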



 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] KIP-503: deleted topics metric

2019-08-06 Thread Stanislav Kozlovski
Thanks for the KIP David,

As you mentioned in the KIP - "when a large number of topics (partitions,
really) are deleted at once, it can take significant time for the
Controller to process everything."
In that sense, does it make sense to have the metric expose the number of
partitions that are pending deletion, as opposed to topics? Perhaps even
both?
My reasoning is that this metric alone wouldn't say much if we had one
topic with 1000 partitions versus a topic with 1 partition

On Mon, Aug 5, 2019 at 8:19 PM Harsha Chintalapani  wrote:

> Thanks for the KIP.  Its useful metric to have.  LGTM.
> -Harsha
>
>
> On Mon, Aug 05, 2019 at 11:24 AM, David Arthur 
> wrote:
>
> > Hello all, I'd like to start a discussion for
> > https://cwiki.apache.org/confluence/display/KAFKA/
> > KIP-503%3A+Add+metric+for+number+of+topics+marked+for+deletion
> >
> > Thanks!
> > David
> >
>


-- 
Best,
Stanislav


Re: [DISCUSS] KIP-498: Add client-side configuration for maximum response size to protect against OOM

2019-08-06 Thread Alexandre Dupriez
Hello,

I wrote a small change [1] to make clients validate the size of messages
received from a broker at the protocol-level.
However I don't like the change. It does not really solve the problem which
originally motivated the KIP - that is, protocol mismatch (plaintext to SSL
endpoint). More specifically, a few problems I can see are listed below. This
is a non-exhaustive list. They also have been added to KIP-498 [2].

1) Incorrect failure mode
As was experimented and as can be seen as part of the integration tests,
when an invalid size is detected and the exception InvalidReceiveException
is thrown, consumers and producers keep retrying until the poll timeout
expires or the maximum number of retries is reached. This is incorrect if
we consider the use case of protocol mismatch which motivated this change.
Indeed, producers and consumers would need to fail fast as retries will
only prolong the time to remediation from users/administrators.

2) Incomplete remediation
The proposed change cannot provide a definite guarantee against OOM in all
scenarios - for instance, it will still manifest if the maximum size is set
to 100 MB and the JVM is under memory pressure and has less than 100 MB of
allocatable memory.

3) Illegitimate message rejection
Even worse: what if the property is incorrectly configured and prevents
legitimate messages from reaching the client?

4) Unclear configuration parameter
4.a) The name max.response.size intends to mirror the existing
max.request.size from the producer's configuration properties. However,
max.request.size intends to check the size of producer records as provided
by a client; while max.response.size is to check the size directly decoded
from the network according to Kafka's binary protocol.
4.b) On the broker, the property socket.request.max.bytes is used to
validate the size of messages received by the server. The new property
serves the same purpose, which introduces duplicated semantic, even if one
property is characterised with the keyword "request" and the other with
"response", in both cases reflecting the perspective adopted from either a
client or a server.

Please let me know what you think. An alternative mitigation may be worth
investigated for the detection of protocol mismatch in the client.

[1] https://github.com/apache/kafka/pull/7160
[2]
https://cwiki.apache.org/confluence/display/KAFKA/KIP-498%3A+Add+client-side+configuration+for+maximum+response+size+to+protect+against+OOM
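
For readers following along, the kind of check described above boils down to validating
the 4-byte size prefix of a response before allocating the receive buffer. A simplified,
framework-independent sketch of the idea (this is NOT the actual change in [1]):

{code:java}
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;

// Client-side guard: read the 4-byte size prefix and refuse to allocate a
// buffer larger than maxResponseBytes (which would indicate a protocol mismatch).
public final class BoundedReceive {

    public static ByteBuffer readBounded(ReadableByteChannel channel, int maxResponseBytes)
            throws IOException {
        ByteBuffer sizeBuffer = ByteBuffer.allocate(4);
        while (sizeBuffer.hasRemaining()) {
            if (channel.read(sizeBuffer) < 0) {
                throw new EOFException("Connection closed while reading response size");
            }
        }
        sizeBuffer.flip();
        int size = sizeBuffer.getInt();
        if (size < 0 || size > maxResponseBytes) {
            // In the proposal this would surface as an InvalidReceiveException.
            throw new IOException("Invalid response size " + size
                    + " (max allowed: " + maxResponseBytes + "); possible protocol mismatch");
        }
        ByteBuffer payload = ByteBuffer.allocate(size);
        while (payload.hasRemaining()) {
            if (channel.read(payload) < 0) {
                throw new EOFException("Connection closed mid-response");
            }
        }
        payload.flip();
        return payload;
    }
}
{code}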

Le jeu. 1 août 2019 à 09:42, Alexandre Dupriez 
a écrit :

> Thanks Gokul and Stanislav for your answers.
>
> I went through the PR 5940 [1]. Indeed Gokul, your reasoning echoes the
> observations of Ismael about a potential inter-protocol layering violation.
>
> As you said Stanislav, the server-side SSL engine responds with an alert
> with code 80 (internal_error) from what I saw when reproducing the OOM.
> Since the Alert is generated below the application layer, I am not sure
> what could be done on the broker to handle the scenario more gracefully.
> Interestingly, the SSL engine emits the possibility of receiving a
> plaintext message in debug logs [2].
>
> The idea was indeed to perform a simple check on the message size decoded
> in NetworkReceive against a configurable value, and throw
> an InvalidReceiveException in a similar fashion as what happens on the
> broker, where a non-unlimited maxSize can be provided. Basically, mimic the
> behaviour on the broker. The default value for the maximal request size on
> the broker is 100 MB which you are suggesting to use client-side.
>
> Adding a client configuration property for clients may be an overkill
> here. What I am going to ask is naive but - is it theoretically possible
> for the broker to legitimately send responses over 100 MB in size?
>
> Thanks,
> Alexandre
>
> [1] https://github.com/apache/kafka/pull/5940
> [2]
>
> kafka-network-thread-0-ListenerName(SSL)-SSL-4, fatal error: 80: problem
> unwrapping net record
>
> javax.net.ssl.SSLException: Unrecognized SSL message, plaintext connection?
>
> kafka-network-thread-0-ListenerName(SSL)-SSL-4, SEND TLSv1.2 ALERT:  fatal,
> description = internal_error
>
> kafka-network-thread-0-ListenerName(SSL)-SSL-4, WRITE: TLSv1.2 Alert,
> length = 2
>
> kafka-network-thread-0-ListenerName(SSL)-SSL-4, called closeOutbound()
>
> kafka-network-thread-0-ListenerName(SSL)-SSL-4, closeOutboundInternal()
>
> kafka-network-thread-0-ListenerName(SSL)-SSL-4, called closeInbound()
>
> kafka-network-thread-0-ListenerName(SSL)-SSL-4, fatal: engine already
> closed.  Rethrowing javax.net.ssl.SSLException: Inbound closed before
> receiving peer's close_notify: possible truncation attack?
>
> [Raw write]: length = 7
>
> : 15 03 03 00 02 02 50   ..P
>
>
> Le jeu. 1 août 2019 à 08:50, Stanislav Kozlovski 
> a écrit :
>
>> Hey Alexandre, thanks for the KIP!
>>
>> I had personally hit the same problem and found it very annoying.
>> Have you considered detecting 

Re: [DISCUSS] KIP-373: Allow users to create delegation tokens for other users

2019-08-06 Thread Manikumar
Hi Viktor,

Thanks for taking over this KIP.

The currently proposed ACL changes allow users to create tokens for any user.
Thinking again about this, admins may want to configure a user to
impersonate only a limited number of other users.
This allows us to configure fine-grained permissions.  But this requires a
new resourceType "User".  What do you think?


Thanks,
Manikumar


On Wed, Jul 31, 2019 at 2:26 PM Viktor Somogyi-Vass 
wrote:

> Hi Folks,
>
> I'm starting a vote on this.
>
> Viktor
>
> On Thu, Jun 27, 2019 at 12:02 PM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com> wrote:
>
> > Hi Folks,
> >
> > I took over this issue from Manikumar. Recently another motivation has
> > been raised in Spark for this (SPARK-28173) and I think it'd be great to
> > continue this task.
> > I updated the KIP and will wait for a few days to get some feedback then
> > proceed for the vote.
> >
> > Thanks,
> > Viktor
> >
> > On Tue, Dec 11, 2018 at 8:29 AM Manikumar 
> > wrote:
> >
> >> Hi Harsha,
> >>
> >> Thanks for the review.
> >>
> >> With this KIP a designated superuser can create tokens without requiring
> >> individual user credentials.
> >> Any client can authenticate brokers using the created tokens. We may not
> >> call this as impersonation,
> >> since the clients API calls are executing on their own authenticated
> >> connections.
> >>
> >> Thanks,
> >> Manikumar
> >>
> >> On Fri, Dec 7, 2018 at 11:56 PM Harsha  wrote:
> >>
> >> > Hi Mani,
> >> >  Overall KIP looks good to me. Can we call this
> >> Impersonation
> >> > support, which is what the KIP is doing?
> >> > Also instead of using super.users as the config which essentially
> giving
> >> > cluster-wide support to the users, we can introduce
> impersonation.users
> >> as
> >> > a config and users listed in the config are allowed to impersonate
> other
> >> > users.
> >> >
> >> > Thanks,
> >> > Harsha
> >> >
> >> >
> >> > On Fri, Dec 7, 2018, at 3:58 AM, Manikumar wrote:
> >> > > Bump up! to get some attention.
> >> > >
> >> > > BTW, recently Apache Spark added  support for Kafka delegation
> token.
> >> > > https://issues.apache.org/jira/browse/SPARK-25501
> >> > >
> >> > > On Fri, Dec 7, 2018 at 5:27 PM Manikumar  >
> >> > wrote:
> >> > >
> >> > > > Bump up! to get some attention.
> >> > > >
> >> > > > BTW, recently Apache Spark added for Kafka delegation token
> support.
> >> > > > https://issues.apache.org/jira/browse/SPARK-25501
> >> > > >
> >> > > > On Tue, Sep 25, 2018 at 9:56 PM Manikumar <
> >> manikumar.re...@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > >> Hi all,
> >> > > >>
> >> > > >> I have created a KIP that proposes to allow users to create
> >> delegation
> >> > > >> tokens for other users.
> >> > > >>
> >> > > >>
> >> > > >>
> >> >
> >>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-373%3A+Allow+users+to+create+delegation+tokens+for+other+users
> >> > > >>
> >> > > >> Please take a look when you get a chance.
> >> > > >>
> >> > > >> Thanks,
> >> > > >> Manikumar
> >> > > >>
> >> > > >
> >> >
> >>
> >
>


Re: [VOTE] KIP-373: Allow users to create delegation tokens for other users

2019-08-06 Thread Viktor Somogyi-Vass
Hi All,

Bumping this, I'd be happy to get some additional feedback and/or votes.

Thanks,
Viktor

On Wed, Jul 31, 2019 at 11:04 AM Viktor Somogyi-Vass <
viktorsomo...@gmail.com> wrote:

> Hi All,
>
> I'd like to start a vote on this KIP.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-373%3A+Allow+users+to+create+delegation+tokens+for+other+users
>
> To summarize it: the proposed feature would allow users (usually
> superusers) to create delegation tokens for other users. This is especially
> helpful in Spark where the delegation token created this way can be
> distributed to workers.
>
> I'd be happy to receive any votes or additional feedback.
>
> Viktor
>


[jira] [Created] (KAFKA-8758) Know your KafkaPrincipal

2019-08-06 Thread JIRA
Stéphane Derosiaux created KAFKA-8758:
-

 Summary: Know your KafkaPrincipal
 Key: KAFKA-8758
 URL: https://issues.apache.org/jira/browse/KAFKA-8758
 Project: Kafka
  Issue Type: Wish
  Components: admin
Reporter: Stéphane Derosiaux


Hi,

In order to manage some tools around ACLs, it seems it's not possible to know 
"who you are" through the Admin API (to prevent deleting your own permissions 
for instance, but more globally, to know who you are).

The KafkaPrincipal is determined in the broker according to the channel and the 
principalBuilder, thus a client alone can't determine its own identity.

Is it feasible to expose this info through a new KafkaAdmin API?

 



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)


Re: [DISCUSS] KIP-444: Augment Metrics for Kafka Streams

2019-08-06 Thread Bruno Cadonna
Hi Guozhang,

I left my comments inline.

On Thu, Jul 18, 2019 at 8:28 PM Guozhang Wang  wrote:
>
> Hello Bruno,
>
> Thanks for the feedbacks, replied inline.
>
> On Mon, Jul 1, 2019 at 7:08 AM Bruno Cadonna  wrote:
>
> > Hi Guozhang,
> >
> > Thank you for the KIP.
> >
> > 1) As far as I understand, the StreamsMetrics interface is there for
> > user-defined processors. Would it make sense to also add a method to
> > the interface to specify a sensor that records skipped records?
> >
> > Not sure I follow.. if a user wants to add a specific skipped-records
> sensor, she can still do that as a "throughput" sensor via
> "addThroughputSensor" and then "record", right?
>
> As an after-thought, maybe it's better to rename `throughput` to `rate` in
> the public APIs since it is really meant for the latter semantics. I did
> not change it just to make less API changes / deprecate fewer functions.
> But if we feel it is important we can change it as well.
>

I see now that a user can record the rate of skipped records. However,
I was referring to the total number of skipped records. Maybe my
question should be more general: should we allow the user to also
specify sensors for totals or combinations of rate and totals?

Regarding the naming, I like `rate` more than `throughput`, but I
would not fight for it.
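
For reference, with the current (pre-KIP-444) StreamsMetrics interface, wiring up a
user-defined skipped-records rate sensor in a custom processor looks roughly like the
sketch below (the scope string, node name, and processor are made up, and I may be
misremembering the exact signatures):

{code:java}
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.streams.processor.AbstractProcessor;
import org.apache.kafka.streams.processor.ProcessorContext;

// Sketch only: records a "skipped-records" rate via the pre-KIP-444
// StreamsMetrics methods (addThroughputSensor / recordThroughput).
public class SkippingProcessor extends AbstractProcessor<String, String> {

    private Sensor skippedSensor;

    @Override
    public void init(ProcessorContext context) {
        super.init(context);
        skippedSensor = context.metrics().addThroughputSensor(
                "custom-processor", "my-node", "skipped-records", Sensor.RecordingLevel.INFO);
    }

    @Override
    public void process(String key, String value) {
        if (value == null) {
            // Count the record as skipped and drop it.
            context().metrics().recordThroughput(skippedSensor, 1L);
            return;
        }
        context().forward(key, value);
    }
}
{code}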

>
> > 2) What are the semantics of active-task-process and standby-task-process
> >
> > Ah good catch, I think I made it in the wrong column. Just some
> explanations here: Within a thread's looped iterations, it will first try
> to process some records from the active tasks, and then see if there are
> any standby-tasks that can be processed as well (i.e. just reading from the
> restore consumer and apply to the local stores). The ratio metrics are for
> indicating 1) what tasks (active or standby) does this thread own so far,
> and 2) how much time in percentage does it spend on each of them.
>
> But this metric should really be a task-level one that includes both the
> thread-id and task-id, and upon task migrations they will be dynamically
> deleted / (re)-created. For each task-id it may be owned by multiple
> threads as one active and others standby, and hence the separation of
> active / standby seems still necessary.
>

Makes sense.


>
>
> > 3) How do dropped-late-records and expired-window-record-drop relate
> > to each other? I guess the former is for records that fall outside the
> > grace period and the latter is for records that are processed after
> > the retention period of the window. Is this correct?
> >
> > Yes, that's correct. The names are indeed a bit confusing since they are
> added at different releases historically..
>
> More precisely, the `grace period` is a notion of the operator (hence the
> metric is node-level, though it would only be used for DSL operators) while
> the `retention` is a notion of the store (hence the metric is store-level).
> Usually grace period will be smaller than store retention though.
>
> The processor node is aware of the `grace period` and when it receives a record that
> is older than the grace deadline, it will be dropped immediately; otherwise it
> will still be processed and maybe a new update is "put" into the store. The
> store is aware of its `retention period` and then upon a "put" call if it
> realized it is older than the retention deadline, that put call would be
> ignored and metric is recorded.
>
> We have to separate them here since the window store can be used in both
> DSL and PAPI, and for the former case it would likely already be ignored
> at the processor node level due to the grace period which is usually
> smaller than retention; but for PAPI there's no grace period and hence the
> processor would likely still process and call "put" on the store.
>

Alright! Got it!

>
> > 4) Is there an actual difference between skipped and dropped records?
> > If not, shall we unify the terminology?
> >
> >
> There is. Dropped records are only due to lateness; whereas skipped
> records can be due to serde errors (and user's error handling indicate
> "skip and continue"), timestamp errors, etc.
>
> I've considered maybe a better (more extensible) way would be defining a
> single metric name, say skipped-records, but use different tags to indicate
> the skipping reason (errors, windowing semantics, etc). But there's
> still a tricky difference: for serde caused skipping for example, they will
> be skipped at the very beginning and there's no effects taken at all. For
> some others e.g. null-key / value at the reduce operator, it is only
> skipped at the middle of the processing, i.e. some effects may have already
> been taken in up-stream sub-topologies. And that's why for skipped-records
> I've defined it on both task-level and node-level and the aggregate of the
> latter may still be smaller than the former, whereas for dropped-records it
> is only for node-level.
>
> So how about an even more significant change then: we enlarge the
> `dropped-late-records` to 

[PLC4X] [Kafka Connect] Help with Kafka Connect configuration?

2019-08-06 Thread Christofer Dutz
Hi,

we are currently working on improving the PLC4X Kafka Connect plugin.
While the current version offered a pretty simple configuration, this is not 
quite suitable for production scenarios.

In contrast to normal Kafka Connect sources/sinks, the PLC4X source and sink 
should distribute the load so that a connection to an industrial controller is 
only initiated by one Kafka Connect node at a time
(the reason for this is that industrial controllers can only accept a 
very limited number of connections).
So the Kafka Connect system should distribute these sources across its nodes.

For each source we are also able to run multiple jobs, each of which collects a given set 
of parameters at its own interval and pushes the data to its own Kafka 
topic.

We came up with the following configuration:

name=plc-source-test
connector.class=org.apache.plc4x.kafka.Plc4xSourceConnector

defaults.topic=some/default

sources.machineA.connectionString=s7://1.2.3.4/1/1
sources.machineA.jobReferences.s7-dashboard.enabled=true
sources.machineA.jobReferences.s7-heartbeat.enabled=true
sources.machineA.jobReferences.s7-heartbeat.topic=heartbeat

sources.machineB.connectionString=s7://10.20.30.40/1/1
sources.machineB.topic=heartbeat
sources.machineB.jobReferences.s7-heartbeat.enabled=true

sources.machineC.connectionString=ads://1.2.3.4.5.6
sources.machineC.topic=heartbeat
sources.machineC.jobReferences.ads-heartbeat.enabled=true

jobs.s7-dashboard.interval=500
jobs.s7-dashboard.fields.inputPreasure=%DB.DB1.4:INT
jobs.s7-dashboard.fields.outputPreasure=%Q1:BYT
jobs.s7-dashboard.fields.temperature=%I3:INT

jobs.s7-heartbeat.interval=1000
jobs.s7-heartbeat.fields.active=%I0.2:BOOL

jobs.ads-heartbeat.interval=1000
jobs.ads-heartbeat.fields.active=Main.running

So we define the individual sources which map to PLC connections and each 
connection references a set of jobs.
“Source” and “Job” are PLC4X terms, so if they collide with Kafka Connect 
terms, please don’t be confused.

We are able to successfully process this into a working configuration.
The problem is that we would like to certify the driver and therefore are 
required to use ConfigDef objects to describe the configuration structure.
What would be the best way to do this? Or is our approach completely wrong? As 
far as I can see, I can only configure my Kafka Connect plugins via property 
structures (even if they're transferred via Java in distributed mode).
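
To make the question concrete, here is a minimal sketch of the kind of thing we mean
(only the static keys are declared in the ConfigDef, an explicit "sources" list key is
assumed for illustration, and the prefixed per-source keys are read through
AbstractConfig.originalsWithPrefix()):

{code:java}
import java.util.Collections;
import java.util.Map;
import org.apache.kafka.common.config.AbstractConfig;
import org.apache.kafka.common.config.ConfigDef;

// Sketch: static keys in the ConfigDef; dynamic "sources.<name>.*" keys read as originals.
public class Plc4xSourceConfig extends AbstractConfig {

    public static final ConfigDef CONFIG_DEF = new ConfigDef()
            .define("defaults.topic", ConfigDef.Type.STRING, null,
                    ConfigDef.Importance.MEDIUM,
                    "Default topic used when a job has no explicit topic.")
            .define("sources", ConfigDef.Type.LIST, Collections.emptyList(),
                    ConfigDef.Importance.HIGH,
                    "Names of the PLC sources; details come from sources.<name>.* keys.");

    public Plc4xSourceConfig(Map<String, String> props) {
        super(CONFIG_DEF, props);
    }

    /** All keys for one source, e.g. sourceConfig("machineA") -> {connectionString=..., ...}. */
    public Map<String, Object> sourceConfig(String sourceName) {
        return originalsWithPrefix("sources." + sourceName + ".");
    }
}
{code}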

We would greatly appreciate some help here.

Chris




Re: [DISCUSS] KIP-495: Dynamically Adjust Log Levels in Connect

2019-08-06 Thread Cyrus Vafadari
This looks like a useful feature, the strategy makes sense, and the KIP is
thorough and nicely written. Thanks!

Cyrus
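
For readers who want the one-line version of what the proposal enables: with log4j 1.x
the underlying mechanism is simply adjusting a named logger's level at runtime, roughly
as below (the KIP surfaces this through a JMX-registered controller rather than an
in-process call like this; logger name and level are illustrative):

{code:java}
import org.apache.log4j.Level;
import org.apache.log4j.LogManager;

public class LogLevelExample {
    public static void main(String[] args) {
        // Raise the Connect namespace to DEBUG without restarting the worker.
        LogManager.getLogger("org.apache.kafka.connect").setLevel(Level.DEBUG);
    }
}
{code}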

On Thu, Aug 1, 2019, 12:40 PM Chris Egerton  wrote:

> Thanks Arjun! Looks good to me.
>
> On Thu, Aug 1, 2019 at 12:33 PM Arjun Satish 
> wrote:
>
> > Thanks for the feedback, Chris!
> >
> > Yes, the example is pretty much how Connect will use the new feature.
> > Tweaked the section to make this more clear.
> >
> > Best,
> >
> > On Fri, Jul 26, 2019 at 11:52 AM Chris Egerton 
> > wrote:
> >
> > > Hi Arjun,
> > >
> > > This looks great. The changes to public interface are pretty small and
> > > moving the Log4jController class into the clients package seems like
> the
> > > right way to go. One question I have--it looks like the purpose of this
> > KIP
> > > is to enable dynamic setting of log levels in the Connect framework,
> but
> > > it's not clear how the Connect framework will use that new utility. Is
> > the
> > > "Example Usage" section (which involves invoking the utility with a
> > > namespace of "kafka.connect") actually meant to be part of the proposed
> > > changes to public interface?
> > >
> > > Cheers,
> > >
> > > Chris
> > >
> > > On Mon, Jul 22, 2019 at 11:03 PM Arjun Satish 
> > > wrote:
> > >
> > > > Hi everyone.
> > > >
> > > > I'd like to propose the following KIP to implement changing log
> levels
> > on
> > > > the fly in Connect workers:
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-495%3A+Dynamically+Adjust+Log+Levels+in+Connect
> > > >
> > > > Would like to hear your thoughts on this.
> > > >
> > > > Thanks very much,
> > > > Arjun
> > > >
> > >
> >
>