FIPS Compliant (Apache Kafka)

2020-12-17 Thread brajesh . shishir

Hi,

I was wondering whether Apache Kafka is FIPS compliant. I see that
Confluent Kafka is FIPS
compliant (https://docs.confluent.io/platform/current/security/compliance.html).


What about Apache Kafka — is there any similar documentation for
Apache Kafka as well?


Regards,

Brajesh



Re: [VOTE] 2.7.0 RC6

2020-12-17 Thread John Roesler
Thanks for taking on this release, Bill!

I'm +1 (binding).

I verified the signatures, built from source, and ran all
the tests. I also read over the upgrade documentation.

Thanks,
-John

On Wed, 2020-12-16 at 09:53 -0500, Bill Bejeck wrote:
> Hello Kafka users, developers and client-developers,
> 
> This is the seventh candidate for release of Apache Kafka 2.7.0.
> 
> * Configurable TCP connection timeout and improve the initial metadata fetch
> * Enforce broker-wide and per-listener connection creation rate (KIP-612,
> part 1)
> * Throttle Create Topic, Create Partition and Delete Topic Operations
> * Add TRACE-level end-to-end latency metrics to Streams
> * Add Broker-side SCRAM Config API
> * Support PEM format for SSL certificates and private key
> * Add RocksDB Memory Consumption to RocksDB Metrics
> * Add Sliding-Window support for Aggregations
> 
> This release also includes a few other features, 53 improvements, and 91
> bug fixes.
> 
> *** Please download, test and vote by Monday, December 21, 12 PM ET
> 
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
> 
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/
> 
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
> 
> * Javadoc:
> https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/javadoc/
> 
> * Tag to be voted upon (off 2.7 branch) is the 2.7.0 tag:
> https://github.com/apache/kafka/releases/tag/2.7.0-rc6
> 
> * Documentation:
> https://kafka.apache.org/27/documentation.html
> 
> * Protocol:
> https://kafka.apache.org/27/protocol.html
> 
> * Successful Jenkins builds for the 2.7 branch:
> Unit/integration tests:
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-2.7-jdk8/detail/kafka-2.7-jdk8/81/
> 
> Thanks,
> Bill
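One of the headline items in the release notes above, PEM support for SSL certificates and private keys (KIP-651), lets clients and brokers use PEM files instead of JKS/PKCS12 keystores. A minimal client configuration sketch, assuming the key and certificates live in PEM files (paths are placeholders, not defaults):

```properties
# Illustrative only: file paths are placeholders.
security.protocol=SSL
ssl.keystore.type=PEM
# A single PEM file holding the private key and the certificate chain
ssl.keystore.location=/path/to/client.pem
ssl.truststore.type=PEM
ssl.truststore.location=/path/to/ca-certs.pem
```

The KIP also allows embedding the PEM content directly in the config via `ssl.keystore.key`, `ssl.keystore.certificate.chain`, and `ssl.truststore.certificates` instead of pointing at files.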




[jira] [Created] (KAFKA-10864) Convert End Transaction Marker to use auto-generated protocol

2020-12-17 Thread dengziming (Jira)
dengziming created KAFKA-10864:
--

 Summary: Convert End Transaction Marker to use auto-generated
protocol
 Key: KAFKA-10864
 URL: https://issues.apache.org/jira/browse/KAFKA-10864
 Project: Kafka
  Issue Type: Improvement
Reporter: dengziming
Assignee: dengziming


Similar to other issues such as KAFKA-10497



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] 2.6.1 RC3

2020-12-17 Thread Ismael Juma
Looks like you have your votes Mickael. :)

Ismael

On Fri, Dec 11, 2020 at 7:23 AM Mickael Maison  wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the fourth candidate for release of Apache Kafka 2.6.1.
>
> Since RC2, the following JIRAs have been fixed: KAFKA-10811, KAFKA-10802
>
> Release notes for the 2.6.1 release:
> https://home.apache.org/~mimaison/kafka-2.6.1-rc3/RELEASE_NOTES.html
>
> *** Please download, test and vote by Friday, December 18, 12 PM ET ***
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~mimaison/kafka-2.6.1-rc3/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> * Javadoc:
> https://home.apache.org/~mimaison/kafka-2.6.1-rc3/javadoc/
>
> * Tag to be voted upon (off 2.6 branch) is the 2.6.1 tag:
> https://github.com/apache/kafka/releases/tag/2.6.1-rc3
>
> * Documentation:
> https://kafka.apache.org/26/documentation.html
>
> * Protocol:
> https://kafka.apache.org/26/protocol.html
>
> * Successful Jenkins builds for the 2.6 branch:
> Unit/integration tests:
> https://ci-builds.apache.org/job/Kafka/job/kafka-2.6-jdk8/62/
>
>
> Thanks,
> Mickael
>


Re: [DISCUSS] KIP-691: Transactional Producer Exception Handling

2020-12-17 Thread Boyang Chen
Thanks for everyone's feedback so far. I have polished the KIP after
offline discussion with some folks working on EOS to make the exception
handling more lightweight. The essential change is that we are not
inventing a new intermediate exception type, but instead separating a full
transaction into two phases:

1. The data transmission phase
2. The commit phase

This way, any exception thrown in phase 1 serves as the indicator in
phase 2 of whether we should commit or abort, and from now on
`commitTransaction` should only throw fatal exceptions, similar to
`abortTransaction`, so that any KafkaException caught in the commit phase
is definitely fatal and should crash the app. For more advanced users such as
Streams, we have the ability to further wrap selected types of fatal
exceptions to trigger task migration if necessary.

More details in the KIP, feel free to take another look, thanks!
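To make the proposed contract concrete, here is a minimal model of the two-phase pattern. It is written in Python as shorthand for the Java producer API; the class and method names below are illustrative stand-ins, not the actual KIP-691 interfaces.

```python
class FatalError(Exception):
    """Phase-2 style error: the producer must be closed and the app crashed."""

class AbortableError(Exception):
    """Phase-1 style error: abort the transaction and retry."""

class ProducerModel:
    """Toy stand-in for a transactional producer (not the real client)."""
    def begin_transaction(self):
        self.pending = []

    def send(self, rec):
        # Phase 1: data transmission. Non-fatal failures are thrown here.
        if rec == "poison":
            raise AbortableError(rec)
        self.pending.append(rec)

    def commit_transaction(self):
        # Phase 2: under the proposal, only fatal exceptions escape here.
        return "committed"

    def abort_transaction(self):
        self.pending = []
        return "aborted"

def run_transaction(producer, records):
    producer.begin_transaction()
    try:
        for rec in records:                      # phase 1
            producer.send(rec)
    except FatalError:
        raise                                    # definitely fatal: crash the app
    except Exception:
        return producer.abort_transaction()      # any phase-1 error -> abort
    return producer.commit_transaction()         # phase 2

p = ProducerModel()
print(run_transaction(p, ["a", "b"]))            # committed
print(run_transaction(p, ["a", "poison"]))       # aborted
```

The point of the split is visible in the `try`/`except`: the application never has to classify individual phase-1 exceptions; anything non-fatal simply routes to abort.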

On Thu, Dec 17, 2020 at 8:36 PM Boyang Chen 
wrote:

> Thanks Bruno for the feedback.
>
> On Mon, Dec 7, 2020 at 5:26 AM Bruno Cadonna  wrote:
>
>> Thanks Boyang for the KIP!
>>
>> Like Matthias, I also do not know the producer internals well enough to
>> comment on the categorization. However, I think having a super exception
>> (e.g. RetriableException) that encodes whether an exception is fatal or not
>> is cleaner, more understandable, and less error-prone, because ideally
>> when you add a new non-fatal exception in the future you just need to think
>> about letting it inherit from the super exception, and all the rest of
>> the code will just behave correctly without the need to wrap the new
>> exception into another exception each time it is thrown (maybe it is
>> thrown at different locations in the code).
>>
>> As far as I understand the following statement from your previous e-mail
>> is the reason that currently such a super exception is not possible:
>>
>> "Right now we have RetriableException type, if we are going to add a
>> `ProducerRetriableException` type, we have to put this new interface as
>> the parent of the RetriableException, because not all thrown non-fatal
>> exceptions are `retriable` in general for producer"
>>
>>
>> In the list of exceptions in your KIP, I found non-fatal exceptions that
>> do not inherit from RetriableException. I guess those are the ones you
>> are referring to in your statement:
>>
>> InvalidProducerEpochException
>> InvalidPidMappingException
>> TransactionAbortedException
>>
>> All of those exceptions are non-fatal and do not inherit from
>> RetriableException. Is there a reason for that? If they inherited from
>> RetriableException, we would be a bit closer to the super exception I
>> mentioned above.
>>
> The reason is that the sender may catch those exceptions in the
> ProduceResponse, and it currently retries infinitely on
> RetriableException. To make sure we can trigger
> abortTransaction() by catching non-fatal thrown
> exceptions and reinitialize the txn state, we chose not to let those
> exceptions inherit from RetriableException, so that
> they won't cause infinite retries in the sender.
>
>
>> With OutOfOrderSequenceException and UnknownProducerIdException, I think
>> I understand that their fatality depends on the type (i.e.
>> configuration) of the producer. That makes it difficult to have a super
>> exception that encodes the retriability as mentioned above. Would it be
>> possible to introduce exceptions that inherit from RetriableException
>> and that are thrown when those exceptions are caught from the brokers
>> and the type of the producer is such that the exceptions are retriable?
>>
> Yeah, I think in general mixing exception types is difficult to get
> right. I have already proposed another solution based on my
> offline discussion with some folks working on EOS
> to make the handling more straightforward for end users without the need
> to distinguish exception fatality.
>
>> Best,
>> Bruno
>>
>>
>> On 04.12.20 19:34, Guozhang Wang wrote:
>> > Thanks Boyang for the proposal! I made a pass over the list and here are
>> > some thoughts:
>> >
>> > 0) Although this is not part of the public API, I think we should make sure
>> > that the suggested pattern (i.e. user can always call abortTxn() when
>> > handling non-fatal errors) is indeed supported. E.g. if the txn is already
>> > aborted by the broker side, then users can still call abortTxn, which would
>> > not throw another exception but just be treated as a no-op.
>> >
>> > 1) *ConcurrentTransactionsException*: I think this error can also be
>> > returned but is not documented yet. This should be a non-fatal error.
>> >
>> > 2) *InvalidTxnStateException*: this error is returned from the broker when a
>> > txn state transition failed (e.g. it is trying to transit to complete-commit
>> > while the current state is not prepare-commit). This error could indicate
>> > a bug in the client internal code or the broker code, OR a user error --- a
>> > similar error is ConcurrentTransactionsException, i.e. 

Re: [DISCUSS] KIP-691: Transactional Producer Exception Handling

2020-12-17 Thread Boyang Chen
Thanks Bruno for the feedback.

On Mon, Dec 7, 2020 at 5:26 AM Bruno Cadonna  wrote:

> Thanks Boyang for the KIP!
>
> Like Matthias, I also do not know the producer internals well enough to
> comment on the categorization. However, I think having a super exception
> (e.g. RetriableException) that encodes whether an exception is fatal or not
> is cleaner, more understandable, and less error-prone, because ideally
> when you add a new non-fatal exception in the future you just need to think
> about letting it inherit from the super exception, and all the rest of
> the code will just behave correctly without the need to wrap the new
> exception into another exception each time it is thrown (maybe it is
> thrown at different locations in the code).
>
> As far as I understand the following statement from your previous e-mail
> is the reason that currently such a super exception is not possible:
>
> "Right now we have RetriableException type, if we are going to add a
> `ProducerRetriableException` type, we have to put this new interface as
> the parent of the RetriableException, because not all thrown non-fatal
> exceptions are `retriable` in general for producer"
>
>
> In the list of exceptions in your KIP, I found non-fatal exceptions that
> do not inherit from RetriableException. I guess those are the ones you
> are referring to in your statement:
>
> InvalidProducerEpochException
> InvalidPidMappingException
> TransactionAbortedException
>
> All of those exceptions are non-fatal and do not inherit from
> RetriableException. Is there a reason for that? If they inherited from
> RetriableException, we would be a bit closer to the super exception I
> mentioned above.
>
The reason is that the sender may catch those exceptions in the
ProduceResponse, and it currently retries infinitely on
RetriableException. To make sure we can trigger the
abortTransaction() by catching non-fatal thrown
exceptions and reinitialize the txn state, we chose not to let those
exceptions inherit from RetriableException, so that
they won't cause infinite retries in the sender.
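The dispatch rule described above can be modeled in a few lines. This is a simplified Python sketch of the sender's behavior, not the real client code; the exception names mirror the Java classes for readability.

```python
class KafkaException(Exception): pass
class RetriableException(KafkaException): pass

# Non-fatal, but deliberately NOT subclasses of RetriableException, so the
# sender surfaces them to the application instead of retrying forever.
class InvalidProducerEpochException(KafkaException): pass
class InvalidPidMappingException(KafkaException): pass
class TransactionAbortedException(KafkaException): pass

def sender_dispatch(err):
    """Simplified model: retriable errors are retried internally by the
    sender loop; everything else surfaces so the app can abort the txn."""
    if isinstance(err, RetriableException):
        return "retry-internally"
    return "surface-to-application"

print(sender_dispatch(RetriableException()))             # retry-internally
print(sender_dispatch(InvalidProducerEpochException()))  # surface-to-application
```

This is why making those three exceptions inherit from RetriableException would change behavior: they would silently loop in the sender instead of reaching the application's abort path.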


> With OutOfOrderSequenceException and UnknownProducerIdException, I think
> I understand that their fatality depends on the type (i.e.
> configuration) of the producer. That makes it difficult to have a super
> exception that encodes the retriability as mentioned above. Would it be
> possible to introduce exceptions that inherit from RetriableException
> and that are thrown when those exceptions are caught from the brokers
> and the type of the producer is such that the exceptions are retriable?
>
Yeah, I think in general mixing exception types is difficult to get
right. I have already proposed another solution based on my offline
discussion with some folks working on EOS
to make the handling more straightforward for end users without the need to
distinguish exception fatality.

> Best,
> Bruno
>
>
> On 04.12.20 19:34, Guozhang Wang wrote:
> > Thanks Boyang for the proposal! I made a pass over the list and here are
> > some thoughts:
> >
> > 0) Although this is not part of the public API, I think we should make sure
> > that the suggested pattern (i.e. user can always call abortTxn() when
> > handling non-fatal errors) is indeed supported. E.g. if the txn is already
> > aborted by the broker side, then users can still call abortTxn, which would
> > not throw another exception but just be treated as a no-op.
> >
> > 1) *ConcurrentTransactionsException*: I think this error can also be
> > returned but is not documented yet. This should be a non-fatal error.
> >
> > 2) *InvalidTxnStateException*: this error is returned from the broker when a
> > txn state transition failed (e.g. it is trying to transit to complete-commit
> > while the current state is not prepare-commit). This error could indicate
> > a bug in the client internal code or the broker code, OR a user error --- a
> > similar error is ConcurrentTransactionsException, i.e. if Kafka is bug-free
> > these exceptions would only be returned if users try to do something wrong,
> > e.g. calling abortTxn right after a commitTxn, etc. So I'm thinking it
> > should be a non-fatal error instead of a fatal error, wdyt?
> >
> > 3) *KafkaException*: case i "indicates fatal transactional sequence
> > (Fatal)", this is a bit conflicting with the *OutOfSequenceException* that
> > is treated as non-fatal. I guess your proposal is that
> > OutOfOrderSequenceException would be treated either as fatal with a
> > transactional producer, or non-fatal with an idempotent producer, is that
> > right? If the producer is only configured with idempotency but not
> > transactions, then throwing a TransactionStateCorruptedException for
> > non-fatal errors would be confusing since users are not using transactions
> > at all.. So I suggest we always throw OutOfSequenceException as-is (i.e.
> > not wrapped) no matter how the producer is configured, and let the caller
> > decide how to handle it based on whether it is only idempotent or
> 

[jira] [Resolved] (KAFKA-10846) FileStreamSourceTask buffer can grow without bound

2020-12-17 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-10846.

Fix Version/s: 2.8.0
   Resolution: Fixed

> FileStreamSourceTask buffer can grow without bound
> --
>
> Key: KAFKA-10846
> URL: https://issues.apache.org/jira/browse/KAFKA-10846
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Reporter: Tom Bentley
>Assignee: Tom Bentley
>Priority: Major
> Fix For: 2.8.0
>
>
> When reading a large file the buffer used by {{FileStreamSourceTask}} can 
> grow without bound. Even in the unit test 
> org.apache.kafka.connect.file.FileStreamSourceTaskTest#testBatchSize the 
> buffer grows from 1,024 to 524,288 bytes just reading 10,000 copies of a line 
> of <100 chars.
> The problem is that the condition for growing the buffer is incorrect. The 
> buffer is doubled whenever some bytes were read and the used space in the 
> buffer == the buffer length.
> The requirement to increase the buffer size should be related to whether 
> {{extractLine()}} actually managed to read any lines. It's only when no 
> complete lines were read since the last call to {{read()}} that we need to 
> increase the buffer size (to cope with the large line).
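The two growth conditions can be contrasted with a small simulation. This is Python, not the actual Connect code; it assumes fixed 100-byte lines and models `read()` and `extractLine()` abstractly.

```python
def simulate(total_bytes, line_len=100, fixed=False, initial=1024):
    """Illustrative model of FileStreamSourceTask's buffer growth."""
    buf, used, remaining = initial, 0, total_bytes
    while remaining > 0:
        n = min(buf - used, remaining)  # read() fills the free space
        remaining -= n
        used += n
        filled = (used == buf)
        extracted = used // line_len    # extractLine() pulls complete lines
        used %= line_len                # only a partial tail stays buffered
        if n > 0 and filled:
            if not fixed:
                buf *= 2                # buggy: grow whenever the buffer filled up
            elif extracted == 0:
                buf *= 2                # fixed: grow only if no full line fit
    return buf

# 10,000 lines of 100 bytes, as in the unit test mentioned above
print(simulate(10_000 * 100))              # 524288 with the buggy condition
print(simulate(10_000 * 100, fixed=True))  # 1024 with the corrected condition
```

With the buggy condition the buffer doubles on every pass because a full buffer says nothing about line length; with the corrected condition it only grows when a single line genuinely does not fit, reproducing the 1,024 → 524,288 growth from the unit test versus a constant 1,024.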



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] 2.7.0 RC6

2020-12-17 Thread Sophie Blee-Goldman
Sorry, that should actually say (non-binding) :)

Thanks Bill!

On Thu, Dec 17, 2020 at 7:46 PM Sophie Blee-Goldman 
wrote:

> Thanks for driving this release! I built from the tag and ran the tests,
> and verified the signatures.
>
> +1 (binding)
>
> Sophie
>
> On Thu, Dec 17, 2020 at 3:53 PM Jakub Scholz  wrote:
>
>> +1 (non-binding) ... I used the binaries (Scala 2.12) and the staged Maven
>> artifacts - all seems to work fine. Thanks.
>>
>> Jakub
>>
>> On Wed, Dec 16, 2020 at 3:53 PM Bill Bejeck  wrote:
>>
>> > Hello Kafka users, developers and client-developers,
>> >
>> > This is the seventh candidate for release of Apache Kafka 2.7.0.
>> >
>> > * Configurable TCP connection timeout and improve the initial metadata
>> > fetch
>> > * Enforce broker-wide and per-listener connection creation rate
>> (KIP-612,
>> > part 1)
>> > * Throttle Create Topic, Create Partition and Delete Topic Operations
>> > * Add TRACE-level end-to-end latency metrics to Streams
>> > * Add Broker-side SCRAM Config API
>> > * Support PEM format for SSL certificates and private key
>> > * Add RocksDB Memory Consumption to RocksDB Metrics
>> > * Add Sliding-Window support for Aggregations
>> >
>> > This release also includes a few other features, 53 improvements, and 91
>> > bug fixes.
>> >
>> > *** Please download, test and vote by Monday, December 21, 12 PM ET
>> >
>> > Kafka's KEYS file containing PGP keys we use to sign the release:
>> > https://kafka.apache.org/KEYS
>> >
>> > * Release artifacts to be voted upon (source and binary):
>> > https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/
>> >
>> > * Maven artifacts to be voted upon:
>> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
>> >
>> > * Javadoc:
>> > https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/javadoc/
>> >
>> > * Tag to be voted upon (off 2.7 branch) is the 2.7.0 tag:
>> > https://github.com/apache/kafka/releases/tag/2.7.0-rc6
>> >
>> > * Documentation:
>> > https://kafka.apache.org/27/documentation.html
>> >
>> > * Protocol:
>> > https://kafka.apache.org/27/protocol.html
>> >
>> > * Successful Jenkins builds for the 2.7 branch:
>> > Unit/integration tests:
>> >
>> >
>> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-2.7-jdk8/detail/kafka-2.7-jdk8/81/
>> >
>> > Thanks,
>> > Bill
>> >
>>
>


Re: [VOTE] 2.7.0 RC6

2020-12-17 Thread Gwen Shapira
+1 (binding)

Thank you for the release, Bill!
I validated the signatures, built from the source package, and ran the perf
producer/consumer to validate.

On Wed, Dec 16, 2020 at 6:53 AM Bill Bejeck  wrote:
>
> Hello Kafka users, developers and client-developers,
>
> This is the seventh candidate for release of Apache Kafka 2.7.0.
>
> * Configurable TCP connection timeout and improve the initial metadata fetch
> * Enforce broker-wide and per-listener connection creation rate (KIP-612,
> part 1)
> * Throttle Create Topic, Create Partition and Delete Topic Operations
> * Add TRACE-level end-to-end latency metrics to Streams
> * Add Broker-side SCRAM Config API
> * Support PEM format for SSL certificates and private key
> * Add RocksDB Memory Consumption to RocksDB Metrics
> * Add Sliding-Window support for Aggregations
>
> This release also includes a few other features, 53 improvements, and 91
> bug fixes.
>
> *** Please download, test and vote by Monday, December 21, 12 PM ET
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> * Javadoc:
> https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/javadoc/
>
> * Tag to be voted upon (off 2.7 branch) is the 2.7.0 tag:
> https://github.com/apache/kafka/releases/tag/2.7.0-rc6
>
> * Documentation:
> https://kafka.apache.org/27/documentation.html
>
> * Protocol:
> https://kafka.apache.org/27/protocol.html
>
> * Successful Jenkins builds for the 2.7 branch:
> Unit/integration tests:
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-2.7-jdk8/detail/kafka-2.7-jdk8/81/
>
> Thanks,
> Bill



-- 
Gwen Shapira
Engineering Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog


Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-17 Thread Colin McCabe
On Thu, Dec 17, 2020, at 17:02, Jun Rao wrote:
> Hi, Colin,
> 
> Thanks for the reply. Sounds good. One more comment.
> 
> 231. Currently, we have sasl.mechanism.inter.broker.protocol for inter
> broker communication. It seems that we need a similar config for specifying
> the sasl mechanism for the communication between the broker and the
> controller.
> 

Hi Jun,

Yeah... sounds like we could use a new configuration key for this.  I added 
sasl.mechanism.controller.protocol for this.

regards,
Colin
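For illustration, a broker's SASL settings under this proposal might then pair the two configs like so (the mechanism value is a placeholder, not a recommendation):

```properties
# Existing config: SASL mechanism for broker-to-broker traffic
sasl.mechanism.inter.broker.protocol=SCRAM-SHA-512
# New config from this discussion: SASL mechanism for broker-to-controller traffic
sasl.mechanism.controller.protocol=SCRAM-SHA-512
```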

> Jun
> 
> On Thu, Dec 17, 2020 at 2:29 PM Colin McCabe  wrote:
> 
> > On Thu, Dec 17, 2020, at 10:19, Jun Rao wrote:
> > > Hi, Colin,
> > >
> > > Thanks for the reply.
> > >
> > > 211. Hmm, I still don't see a clear benefit of registering a broker
> > before
> > > recovery. It's possible for the recovery to take time. However, during
> > the
> > > recovery mode, it seems the broker will still be in the fenced mode and
> > > won't be able to do work for the clients. So, registering and
> > heartbeating
> > > early seems to only add unnecessary overhead. For your point on topic
> > > creation, I thought we now allow replicas to be created on
> > > unregistered brokers.
> > >
> >
> > Hi Jun,
> >
> > Thanks again for the reviews.
> >
> > Topics cannot be created on unregistered brokers.  They can be created on
> > registered but fenced brokers.  So for that reason I think it makes sense
> > to register as early as possible.
> >
> > > 230. Currently, we do have a ControllerId field in MetadataResponse. In
> > the
> > > early discussion, I thought that we want to expose the controller for
> > > debugging purposes, but not used by the client library.
> > >
> >
> > The current plan is that we will expose the controller node ID, but the
> > controller will not be included in the list of nodes in the metadata
> > response.
> >
> > It's not really possible to include the controller in that list of nodes
> > because the controller may not share the same set of listeners as the
> > broker.  So, for example, maybe the controller endpoint is using a
> > different type of security than the broker.  So while we could pass back a
> > hostname and port, the client would have no way to connect since it doesn't
> > know what security settings to use.
> >
> > regards,
> > Colin
> >
> > > Jun
> > >
> > > On Wed, Dec 16, 2020 at 9:13 PM Colin McCabe  wrote:
> > >
> > > > On Wed, Dec 16, 2020, at 18:13, Jun Rao wrote:
> > > > > Hi, Colin,
> > > > >
> > > > > Thanks for the reply. Just a couple of more comments.
> > > > >
> > > > > 211. Currently, the broker only registers itself in ZK after log
> > > > recovery.
> > > > > Is there any benefit to change that? As you mentioned, the broker
> > can't
> > > > do
> > > > > much before completing log recovery.
> > > > >
> > > >
> > > > Hi Jun,
> > > >
> > > > Previously, it wasn't possible to register in ZK without immediately
> > > > getting added to the MetadataResponse.  So I think that's the main
> > reason
> > > > why registration was delayed until after log recovery.  Since that
> > > > constraint doesn't exist any more, there seems to be no reason to delay
> > > > registration.
> > > >
> > > > I think delaying registration would have some major downsides.  If log
> > > > recovery takes a while, that's a longer window during which someone
> > else
> > > > could register a broker with the same ID.  Having parts of the cluster
> > > > missing for a while gives up some of the benefit of separating
> > registration
> > > > from fencing.  For example, if a broker somehow gets unregistered and
> > we
> > > > want to re-register it, but we have to wait for a 10 minute log
> > recovery to
> > > > do that, that could be a window during which topics can't be created
> > that
> > > > need to be on that broker, and so forth.
> > > >
> > > > > 230. Regarding MetadataResponse, there is a slight awkwardness. We
> > return
> > > > > rack for each node. However, if that node is for the controller, the
> > rack
> > > > > field is not really relevant. Should we clean it up here or in
> > another
> > > > KIP
> > > > > like KIP-700?
> > > >
> > > > Oh, controllers don't appear in the MetadataResponses returned to
> > clients,
> > > > since clients can't access them.  I should have been more clear about
> > that
> > > > in the KIP-- I added a sentence to "Networking" describing this.
> > > >
> > > > best,
> > > > Colin
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Wed, Dec 16, 2020 at 4:23 PM Colin McCabe 
> > wrote:
> > > > >
> > > > > > On Wed, Dec 16, 2020, at 13:40, Jun Rao wrote:
> > > > > > > Hi, Colin,
> > > > > > >
> > > > > > > Thanks for the reply. A few follow up comments.
> > > > > > >
> > > > > > > 211. When does the broker send the BrokerRegistration request to
> > the
> > > > > > > controller? Is it after the recovery phase? If so, at that
> > point, the
> > > > > > > broker has already caught up on the metadata (in order to clean
> > up
> > > > > > 

Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-17 Thread Jun Rao
Hi, Colin,

Thanks for the reply. Sounds good. One more comment.

231. Currently, we have sasl.mechanism.inter.broker.protocol for inter
broker communication. It seems that we need a similar config for specifying
the sasl mechanism for the communication between the broker and the
controller.

Jun

On Thu, Dec 17, 2020 at 2:29 PM Colin McCabe  wrote:

> On Thu, Dec 17, 2020, at 10:19, Jun Rao wrote:
> > Hi, Colin,
> >
> > Thanks for the reply.
> >
> > 211. Hmm, I still don't see a clear benefit of registering a broker
> before
> > recovery. It's possible for the recovery to take time. However, during
> the
> > recovery mode, it seems the broker will still be in the fenced mode and
> > won't be able to do work for the clients. So, registering and
> heartbeating
> > early seems to only add unnecessary overhead. For your point on topic
> > creation, I thought we now allow replicas to be created on
> > unregistered brokers.
> >
>
> Hi Jun,
>
> Thanks again for the reviews.
>
> Topics cannot be created on unregistered brokers.  They can be created on
> registered but fenced brokers.  So for that reason I think it makes sense
> to register as early as possible.
>
> > 230. Currently, we do have a ControllerId field in MetadataResponse. In
> the
> > early discussion, I thought that we want to expose the controller for
> > debugging purposes, but not used by the client library.
> >
>
> The current plan is that we will expose the controller node ID, but the
> controller will not be included in the list of nodes in the metadata
> response.
>
> It's not really possible to include the controller in that list of nodes
> because the controller may not share the same set of listeners as the
> broker.  So, for example, maybe the controller endpoint is using a
> different type of security than the broker.  So while we could pass back a
> hostname and port, the client would have no way to connect since it doesn't
> know what security settings to use.
>
> regards,
> Colin
>
> > Jun
> >
> > On Wed, Dec 16, 2020 at 9:13 PM Colin McCabe  wrote:
> >
> > > On Wed, Dec 16, 2020, at 18:13, Jun Rao wrote:
> > > > Hi, Colin,
> > > >
> > > > Thanks for the reply. Just a couple of more comments.
> > > >
> > > > 211. Currently, the broker only registers itself in ZK after log
> > > recovery.
> > > > Is there any benefit to change that? As you mentioned, the broker
> can't
> > > do
> > > > much before completing log recovery.
> > > >
> > >
> > > Hi Jun,
> > >
> > > Previously, it wasn't possible to register in ZK without immediately
> > > getting added to the MetadataResponse.  So I think that's the main
> reason
> > > why registration was delayed until after log recovery.  Since that
> > > constraint doesn't exist any more, there seems to be no reason to delay
> > > registration.
> > >
> > > I think delaying registration would have some major downsides.  If log
> > > recovery takes a while, that's a longer window during which someone
> else
> > > could register a broker with the same ID.  Having parts of the cluster
> > > missing for a while gives up some of the benefit of separating
> registration
> > > from fencing.  For example, if a broker somehow gets unregistered and
> we
> > > want to re-register it, but we have to wait for a 10 minute log
> recovery to
> > > do that, that could be a window during which topics can't be created
> that
> > > need to be on that broker, and so forth.
> > >
> > > > 230. Regarding MetadataResponse, there is a slight awkwardness. We
> return
> > > > rack for each node. However, if that node is for the controller, the
> rack
> > > > field is not really relevant. Should we clean it up here or in
> another
> > > KIP
> > > > like KIP-700?
> > >
> > > Oh, controllers don't appear in the MetadataResponses returned to
> clients,
> > > since clients can't access them.  I should have been more clear about
> that
> > > in the KIP-- I added a sentence to "Networking" describing this.
> > >
> > > best,
> > > Colin
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Wed, Dec 16, 2020 at 4:23 PM Colin McCabe 
> wrote:
> > > >
> > > > > On Wed, Dec 16, 2020, at 13:40, Jun Rao wrote:
> > > > > > Hi, Colin,
> > > > > >
> > > > > > Thanks for the reply. A few follow up comments.
> > > > > >
> > > > > > 211. When does the broker send the BrokerRegistration request to
> the
> > > > > > controller? Is it after the recovery phase? If so, at that
> point, the
> > > > > > broker has already caught up on the metadata (in order to clean
> up
> > > > > deleted
> > > > > > partitions). Then, it seems we don't need the ShouldFence field
> > > > > > in BrokerHeartbeatRequest?
> > > > >
> > > > > Hi Jun,
> > > > >
> > > > > Thanks again for the reviews.
> > > > >
> > > > > The broker sends the registration request as soon as it starts
> up.  It
> > > > > cannot wait until the recovery phase is over since sometimes log
> > > recovery
> > > > > can take quite a long time.
> > > > >
> > > > > >
> > > > > > 213. 

Re: [VOTE] 2.7.0 RC6

2020-12-17 Thread Sophie Blee-Goldman
Thanks for driving this release! I built from the tag and ran the tests,
and verified the signatures.

+1 (binding)

Sophie

On Thu, Dec 17, 2020 at 3:53 PM Jakub Scholz  wrote:

> +1 (non-binding) ... I used the binaries (Scala 2.12) and the staged Maven
> artifacts - all seems to work fine. Thanks.
>
> Jakub
>
> On Wed, Dec 16, 2020 at 3:53 PM Bill Bejeck  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the seventh candidate for release of Apache Kafka 2.7.0.
> >
> > * Configurable TCP connection timeout and improve the initial metadata
> > fetch
> > * Enforce broker-wide and per-listener connection creation rate (KIP-612,
> > part 1)
> > * Throttle Create Topic, Create Partition and Delete Topic Operations
> > * Add TRACE-level end-to-end latency metrics to Streams
> > * Add Broker-side SCRAM Config API
> > * Support PEM format for SSL certificates and private key
> > * Add RocksDB Memory Consumption to RocksDB Metrics
> > * Add Sliding-Window support for Aggregations
> >
> > This release also includes a few other features, 53 improvements, and 91
> > bug fixes.
> >
> > *** Please download, test and vote by Monday, December 21, 12 PM ET
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > https://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * Javadoc:
> > https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/javadoc/
> >
> > * Tag to be voted upon (off 2.7 branch) is the 2.7.0 tag:
> > https://github.com/apache/kafka/releases/tag/2.7.0-rc6
> >
> > * Documentation:
> > https://kafka.apache.org/27/documentation.html
> >
> > * Protocol:
> > https://kafka.apache.org/27/protocol.html
> >
> > * Successful Jenkins builds for the 2.7 branch:
> > Unit/integration tests:
> >
> >
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-2.7-jdk8/detail/kafka-2.7-jdk8/81/
> >
> > Thanks,
> > Bill
> >
>


Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-17 Thread Colin McCabe
On Thu, Dec 17, 2020, at 10:19, Jun Rao wrote:
> Hi, Colin,
> 
> Thanks for the reply.
> 
> 211. Hmm, I still don't see a clear benefit of registering a broker before
> recovery. It's possible for the recovery to take time. However, during the
> recovery mode, it seems the broker will still be in the fenced mode and
> won't be able to do work for the clients. So, registering and heartbeating
> early seems to only add unnecessary overhead. For your point on topic
> creation, I thought we now allow replicas to be created on
> unregistered brokers.
> 

Hi Jun,

Thanks again for the reviews.

Topics cannot be created on unregistered brokers.  They can be created on 
registered but fenced brokers.  So for that reason I think it makes sense to 
register as early as possible.
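To make the lifecycle being discussed concrete, here is a minimal sketch of the register-early-but-stay-fenced idea: the broker registers as soon as it starts, remains fenced while log recovery runs, and is only unfenced after recovery completes. The class and method names are illustrative only, not the actual KIP-631 implementation:

```java
// Illustrative sketch of the broker lifecycle under discussion: register
// immediately on startup, stay fenced until log recovery is done and a
// heartbeat reports ShouldFence=false.  Names are for illustration only.
public class BrokerLifecycle {
    public enum State { UNREGISTERED, REGISTERED_FENCED, ACTIVE }

    private State state = State.UNREGISTERED;
    private boolean logRecoveryDone = false;

    // Sent as soon as the broker starts, before log recovery completes.
    // While REGISTERED_FENCED, topics can still be assigned to this broker.
    public void register() {
        if (state == State.UNREGISTERED) state = State.REGISTERED_FENCED;
    }

    public void finishLogRecovery() {
        logRecoveryDone = true;
    }

    // A heartbeat with ShouldFence=false only unfences once recovery is done.
    public void heartbeat(boolean shouldFence) {
        if (state == State.REGISTERED_FENCED && !shouldFence && logRecoveryDone) {
            state = State.ACTIVE;
        }
    }

    public State state() { return state; }
}
```

This also shows why the ShouldFence field is still needed even though registration happens early: the broker itself signals when it is ready to be unfenced.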

> 230. Currently, we do have a ControllerId field in MetadataResponse. In the
> early discussion, I thought that we want to expose the controller for
> debugging purposes, but not used by the client library.
> 

The current plan is that we will expose the controller node ID, but the 
controller will not be included in the list of nodes in the metadata response.

It's not really possible to include the controller in that list of nodes 
because the controller may not share the same set of listeners as the brokers.  
For example, the controller endpoint may use a different security protocol 
than the brokers.  In that case, even if we passed back a hostname and port, 
the client would have no way to connect, since it wouldn't know what security 
settings to use.

regards,
Colin

> Jun
> 
> On Wed, Dec 16, 2020 at 9:13 PM Colin McCabe  wrote:
> 
> > On Wed, Dec 16, 2020, at 18:13, Jun Rao wrote:
> > > Hi, Colin,
> > >
> > > Thanks for the reply. Just a couple of more comments.
> > >
> > > 211. Currently, the broker only registers itself in ZK after log
> > recovery.
> > > Is there any benefit to change that? As you mentioned, the broker can't
> > do
> > > much before completing log recovery.
> > >
> >
> > Hi Jun,
> >
> > Previously, it wasn't possible to register in ZK without immediately
> > getting added to the MetadataResponse.  So I think that's the main reason
> > why registration was delayed until after log recovery.  Since that
> > constraint doesn't exist any more, there seems to be no reason to delay
> > registration.
> >
> > I think delaying registration would have some major downsides.  If log
> > recovery takes a while, that's a longer window during which someone else
> > could register a broker with the same ID.  Having parts of the cluster
> > missing for a while gives up some of the benefit of separating registration
> > from fencing.  For example, if a broker somehow gets unregistered and we
> > want to re-register it, but we have to wait for a 10 minute log recovery to
> > do that, that could be a window during which topics can't be created that
> > need to be on that broker, and so forth.
> >
> > > 230. Regarding MetadataResponse, there is a slight awkwardness. We return
> > > rack for each node. However, if that node is for the controller, the rack
> > > field is not really relevant. Should we clean it up here or in another
> > KIP
> > > like KIP-700?
> >
> > Oh, controllers don't appear in the MetadataResponses returned to clients,
> > since clients can't access them.  I should have been more clear about that
> > in the KIP-- I added a sentence to "Networking" describing this.
> >
> > best,
> > Colin
> >
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Dec 16, 2020 at 4:23 PM Colin McCabe  wrote:
> > >
> > > > On Wed, Dec 16, 2020, at 13:40, Jun Rao wrote:
> > > > > Hi, Colin,
> > > > >
> > > > > Thanks for the reply. A few follow up comments.
> > > > >
> > > > > 211. When does the broker send the BrokerRegistration request to the
> > > > > controller? Is it after the recovery phase? If so, at that point, the
> > > > > broker has already caught up on the metadata (in order to clean up
> > > > deleted
> > > > > partitions). Then, it seems we don't need the ShouldFence field
> > > > > in BrokerHeartbeatRequest?
> > > >
> > > > Hi Jun,
> > > >
> > > > Thanks again for the reviews.
> > > >
> > > > The broker sends the registration request as soon as it starts up.  It
> > > > cannot wait until the recovery phase is over since sometimes log
> > recovery
> > > > can take quite a long time.
> > > >
> > > > >
> > > > > 213. kafka-cluster.sh
> > > > > 213.1 For the decommission example, should the command take a broker
> > id?
> > > >
> > > > Yes, the example should have a broker id.  Fixed.
> > > >
> > > > > 213.2 Existing tools use the "--" command line option (e.g.
> > kafka-topics
> > > > > --list --topic test). Should we follow the same convention
> > > > > for kafka-cluster.sh (and kafka-storage.sh)?
> > > >
> > > > Hmm.  I don't think argparse4j supports using double dashes to identify
> > > > subcommands.  I think it might be confusing as well, since the
> > subcommand
> > > > must come first, unlike a plain old argument which can be anywhere on the command line.

Jenkins build is back to normal : Kafka » kafka-trunk-jdk8 #297

2020-12-17 Thread Apache Jenkins Server
See 




Re: [VOTE] 2.7.0 RC6

2020-12-17 Thread Jakub Scholz
+1 (non-binding) ... I used the binaries (Scala 2.12) and the staged Maven
artifacts - all seems to work fine. Thanks.

Jakub

On Wed, Dec 16, 2020 at 3:53 PM Bill Bejeck  wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the seventh candidate for release of Apache Kafka 2.7.0.
>
> * Configurable TCP connection timeout and improve the initial metadata
> fetch
> * Enforce broker-wide and per-listener connection creation rate (KIP-612,
> part 1)
> * Throttle Create Topic, Create Partition and Delete Topic Operations
> * Add TRACE-level end-to-end latency metrics to Streams
> * Add Broker-side SCRAM Config API
> * Support PEM format for SSL certificates and private key
> * Add RocksDB Memory Consumption to RocksDB Metrics
> * Add Sliding-Window support for Aggregations
>
> This release also includes a few other features, 53 improvements, and 91
> bug fixes.
>
> *** Please download, test and vote by Monday, December 21, 12 PM ET
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> * Javadoc:
> https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/javadoc/
>
> * Tag to be voted upon (off 2.7 branch) is the 2.7.0 tag:
> https://github.com/apache/kafka/releases/tag/2.7.0-rc6
>
> * Documentation:
> https://kafka.apache.org/27/documentation.html
>
> * Protocol:
> https://kafka.apache.org/27/protocol.html
>
> * Successful Jenkins builds for the 2.7 branch:
> Unit/integration tests:
>
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-2.7-jdk8/detail/kafka-2.7-jdk8/81/
>
> Thanks,
> Bill
>


Re: [VOTE] KIP-695: Improve Streams Time Synchronization

2020-12-17 Thread John Roesler
Thanks Jason,

We would only return the metadata for the latest fetches.
So, if someone wanted to use this to lazily maintain a
client-side metadata map for all partitions, they'd have to
store it separately and merge in new updates as they arrive.

This way:
1. We don't need to increase the complexity of the client by
storing that metadata
2. Users will be able to treat all returned metadata as
"fresh" without having to reason about the timestamps.
3. All parts of the returned ConsumerRecords object have the
same lifecycle: all the data and metadata are the results of
the most recent round of fetch responses that had not been
previously polled.

Does that seem sensible to you? I'll update the KIP to
clarify this.
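Concretely, the "store it separately and merge in new updates" pattern might look like the following sketch. Here `FetchMetadata` is a stand-in illustration for the proposed per-partition metadata, not the actual KIP-695 type:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: maintain a client-side map of per-partition metadata by merging
// in whatever the latest poll() returned.  FetchMetadata is a stand-in for
// the metadata proposed in KIP-695, not the real API.
public class MetadataCache {
    public static final class FetchMetadata {
        public final long position;
        public final long lag;
        public FetchMetadata(long position, long lag) {
            this.position = position;
            this.lag = lag;
        }
    }

    private final Map<String, FetchMetadata> byPartition = new HashMap<>();

    // The returned metadata only covers partitions in the most recent fetch
    // responses, so previously seen entries are kept until overwritten.
    public void merge(Map<String, FetchMetadata> latest) {
        byPartition.putAll(latest);
    }

    public FetchMetadata get(String partition) {
        return byPartition.get(partition);
    }
}
```

The point is that the merging burden falls on the user who wants a full map, keeping the client itself simple.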

Thanks,
-John

On Wed, 2020-12-16 at 10:29 -0800, Jason Gustafson wrote:
> Hi John,
> 
> Just one question. It wasn't very clear to me exactly when the metadata
> would be returned in `ConsumerRecords`. Would we /always/ include the
> metadata for all partitions that are assigned, or would it be based on the
> latest fetches?
> 
> Thanks,
> Jason
> 
> On Fri, Dec 11, 2020 at 4:07 PM John Roesler  wrote:
> 
> > Thanks, Guozhang!
> > 
> > All of your feedback sounds good to me. I’ll update the KIP when I am able.
> > 
> > 3) I believe it is the position after the fetch, but I will confirm. I
> > think omitting position may render beginning and end offsets useless as
> > well, which leaves only lag. That would be fine with me, but it also seems
> > nice to supply this extra metadata since it is well defined and probably
> > handy for others. Therefore, I’d go the route of specifying the exact
> > semantics and keeping it.
> > 
> > Thanks for the review,
> > John
> > 
> > On Fri, Dec 11, 2020, at 17:36, Guozhang Wang wrote:
> > > Hello John,
> > > 
> > > Thanks for the updates! I've made a pass on the KIP and also the POC PR,
> > > here are some minor comments:
> > > 
> > > 1) nit: "receivedTimestamp" -> it seems the metadata keep getting
> > updated,
> > > and we do not create a new object but just update the values in-place, so
> > > maybe calling it `lastUpdateTimestamp` is better?
> > > 
> > > 2) It will be great to verify in javadocs that the new API
> > > "ConsumerRecords#metadata(): Map" may return a
> > > superset of TopicPartitions than the existing API that returns the data
> > by
> > > partitions, in case users assume their map key-entries would always be
> > the
> > > same.
> > > 
> > > 3) The "position()" API of the call needs better clarification: is it the
> > > current position AFTER the records are returned, or is it BEFORE the
> > > records are returned? Personally I'd suggest we do not include it if it
> > is
> > > not used anywhere yet just to avoid possible misuse, but I'm fine if you
> > > like to keep it still; in that case just clarify its semantics.
> > > 
> > > 
> > > Other than that, I'm +1 on the KIP as well!
> > > 
> > > 
> > > Guozhang
> > > 
> > > 
> > > On Fri, Dec 11, 2020 at 8:15 AM Walker Carlson 
> > > wrote:
> > > 
> > > > Thanks for the KIP!
> > > > 
> > > > +1 (non-binding)
> > > > 
> > > > walker
> > > > 
> > > > On Wed, Dec 9, 2020 at 11:40 AM Bruno Cadonna 
> > wrote:
> > > > 
> > > > > Thanks for the KIP, John!
> > > > > 
> > > > > +1 (non-binding)
> > > > > 
> > > > > Best,
> > > > > Bruno
> > > > > 
> > > > > On 08.12.20 18:03, John Roesler wrote:
> > > > > > Hello all,
> > > > > > 
> > > > > > There hasn't been much discussion on KIP-695 so far, so I'd
> > > > > > like to go ahead and call for a vote.
> > > > > > 
> > > > > > As a reminder, the purpose of KIP-695 to improve on the
> > > > > > "task idling" feature we introduced in KIP-353. This KIP
> > > > > > will allow Streams to offer deterministic time semantics in
> > > > > > join-type topologies. For example, it makes sure that
> > > > > > when you join two topics, that we collate the topics by
> > > > > > timestamp. That was always the intent with task idling (KIP-
> > > > > > 353), but it turns out the previous mechanism couldn't
> > > > > > provide the desired semantics.
> > > > > > 
> > > > > > The details are here:
> > > > > > https://cwiki.apache.org/confluence/x/JSXZCQ
> > > > > > 
> > > > > > Thanks,
> > > > > > -John
> > > > > > 
> > > > > 
> > > > 
> > > 
> > > 
> > > --
> > > -- Guozhang
> > > 
> > 




Re: [VOTE] 2.7.0 RC6

2020-12-17 Thread Matthias J. Sax
Thanks for running the release Bill!

+1 (binding)

- verified signatures
- build from sources and ran tests
- verified quickstarts using Scala 2.12 binaries


-Matthias

On 12/16/20 6:53 AM, Bill Bejeck wrote:
> Hello Kafka users, developers and client-developers,
> 
> This is the seventh candidate for release of Apache Kafka 2.7.0.
> 
> * Configurable TCP connection timeout and improve the initial metadata fetch
> * Enforce broker-wide and per-listener connection creation rate (KIP-612,
> part 1)
> * Throttle Create Topic, Create Partition and Delete Topic Operations
> * Add TRACE-level end-to-end latency metrics to Streams
> * Add Broker-side SCRAM Config API
> * Support PEM format for SSL certificates and private key
> * Add RocksDB Memory Consumption to RocksDB Metrics
> * Add Sliding-Window support for Aggregations
> 
> This release also includes a few other features, 53 improvements, and 91
> bug fixes.
> 
> *** Please download, test and vote by Monday, December 21, 12 PM ET
> 
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
> 
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/
> 
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
> 
> * Javadoc:
> https://home.apache.org/~bbejeck/kafka-2.7.0-rc6/javadoc/
> 
> * Tag to be voted upon (off 2.7 branch) is the 2.7.0 tag:
> https://github.com/apache/kafka/releases/tag/2.7.0-rc6
> 
> * Documentation:
> https://kafka.apache.org/27/documentation.html
> 
> * Protocol:
> https://kafka.apache.org/27/protocol.html
> 
> * Successful Jenkins builds for the 2.7 branch:
> Unit/integration tests:
> https://ci-builds.apache.org/blue/organizations/jenkins/Kafka%2Fkafka-2.7-jdk8/detail/kafka-2.7-jdk8/81/
> 
> Thanks,
> Bill
> 


Re: [VOTE] 2.6.1 RC3

2020-12-17 Thread Matthias J. Sax
Thanks for running the release Mickael!

+1 (binding)

- verified signatures
- build from sources and ran tests
- verified the three quickstarts


-Matthias

On 12/16/20 4:17 AM, Manikumar wrote:
> Hi,
> 
> +1 (binding)
> - verified signatures
> - ran the tests on the source archive with Scala 2.13
> - verified the core/connect/streams quickstart with Scala 2.13 binary
> archive.
> - verified the artifacts, javadoc
> 
> Thanks for running the release!
> 
> 
> Thanks,
> Manikumar
> 
> On Tue, Dec 15, 2020 at 9:01 PM Rajini Sivaram  wrote:
> 
>> +1 (binding)
>>
>> Verified signatures, ran tests from source build (one flaky test failed but
>> passed on rerun), ran Kafka quick start with the binary with both Scala
>> 2.12 and Scala 2.13.
>>
>> Thanks for running the release, Mickael!
>>
>> Regards,
>>
>> Rajini
>>
>> On Fri, Dec 11, 2020 at 3:23 PM Mickael Maison 
>> wrote:
>>
>>> Hello Kafka users, developers and client-developers,
>>>
>>> This is the fourth candidate for release of Apache Kafka 2.6.1.
>>>
>>> Since RC2, the following JIRAs have been fixed: KAFKA-10811, KAFKA-10802
>>>
>>> Release notes for the 2.6.1 release:
>>> https://home.apache.org/~mimaison/kafka-2.6.1-rc3/RELEASE_NOTES.html
>>>
>>> *** Please download, test and vote by Friday, December 18, 12 PM ET ***
>>>
>>> Kafka's KEYS file containing PGP keys we use to sign the release:
>>> https://kafka.apache.org/KEYS
>>>
>>> * Release artifacts to be voted upon (source and binary):
>>> https://home.apache.org/~mimaison/kafka-2.6.1-rc3/
>>>
>>> * Maven artifacts to be voted upon:
>>> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>>>
>>> * Javadoc:
>>> https://home.apache.org/~mimaison/kafka-2.6.1-rc3/javadoc/
>>>
>>> * Tag to be voted upon (off 2.6 branch) is the 2.6.1 tag:
>>> https://github.com/apache/kafka/releases/tag/2.6.1-rc3
>>>
>>> * Documentation:
>>> https://kafka.apache.org/26/documentation.html
>>>
>>> * Protocol:
>>> https://kafka.apache.org/26/protocol.html
>>>
>>> * Successful Jenkins builds for the 2.6 branch:
>>> Unit/integration tests:
>>> https://ci-builds.apache.org/job/Kafka/job/kafka-2.6-jdk8/62/
>>>
>>> /**
>>>
>>> Thanks,
>>> Mickael
>>>
>>
> 


[apache/kafka] KAFKA-10413 Allow even distribution of lost/new tasks when more than one worker

2020-12-17 Thread Ramesh Krishnan
Hi Team,

Can some help re trigger the build for this PR.

https://github.com/apache/kafka/pull/9319

Thanks
Ramesh


Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-12-17 Thread Jun Rao
Hi, Colin,

Thanks for the reply.

211. Hmm, I still don't see a clear benefit of registering a broker before
recovery. It's possible for the recovery to take time. However, during the
recovery mode, it seems the broker will still be in the fenced mode and
won't be able to do work for the clients. So, registering and heartbeating
early seems to only add unnecessary overhead. For your point on topic
creation, I thought we now allow replicas to be created on
unregistered brokers.

230. Currently, we do have a ControllerId field in MetadataResponse. In the
early discussion, I thought that we want to expose the controller for
debugging purposes, but not used by the client library.

Jun

On Wed, Dec 16, 2020 at 9:13 PM Colin McCabe  wrote:

> On Wed, Dec 16, 2020, at 18:13, Jun Rao wrote:
> > Hi, Colin,
> >
> > Thanks for the reply. Just a couple of more comments.
> >
> > 211. Currently, the broker only registers itself in ZK after log
> recovery.
> > Is there any benefit to change that? As you mentioned, the broker can't
> do
> > much before completing log recovery.
> >
>
> Hi Jun,
>
> Previously, it wasn't possible to register in ZK without immediately
> getting added to the MetadataResponse.  So I think that's the main reason
> why registration was delayed until after log recovery.  Since that
> constraint doesn't exist any more, there seems to be no reason to delay
> registration.
>
> I think delaying registration would have some major downsides.  If log
> recovery takes a while, that's a longer window during which someone else
> could register a broker with the same ID.  Having parts of the cluster
> missing for a while gives up some of the benefit of separating registration
> from fencing.  For example, if a broker somehow gets unregistered and we
> want to re-register it, but we have to wait for a 10 minute log recovery to
> do that, that could be a window during which topics can't be created that
> need to be on that broker, and so forth.
>
> > 230. Regarding MetadataResponse, there is a slight awkwardness. We return
> > rack for each node. However, if that node is for the controller, the rack
> > field is not really relevant. Should we clean it up here or in another
> KIP
> > like KIP-700?
>
> Oh, controllers don't appear in the MetadataResponses returned to clients,
> since clients can't access them.  I should have been more clear about that
> in the KIP-- I added a sentence to "Networking" describing this.
>
> best,
> Colin
>
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Dec 16, 2020 at 4:23 PM Colin McCabe  wrote:
> >
> > > On Wed, Dec 16, 2020, at 13:40, Jun Rao wrote:
> > > > Hi, Colin,
> > > >
> > > > Thanks for the reply. A few follow up comments.
> > > >
> > > > 211. When does the broker send the BrokerRegistration request to the
> > > > controller? Is it after the recovery phase? If so, at that point, the
> > > > broker has already caught up on the metadata (in order to clean up
> > > deleted
> > > > partitions). Then, it seems we don't need the ShouldFence field
> > > > in BrokerHeartbeatRequest?
> > >
> > > Hi Jun,
> > >
> > > Thanks again for the reviews.
> > >
> > > The broker sends the registration request as soon as it starts up.  It
> > > cannot wait until the recovery phase is over since sometimes log
> recovery
> > > can take quite a long time.
> > >
> > > >
> > > > 213. kafka-cluster.sh
> > > > 213.1 For the decommission example, should the command take a broker
> id?
> > >
> > > Yes, the example should have a broker id.  Fixed.
> > >
> > > > 213.2 Existing tools use the "--" command line option (e.g.
> kafka-topics
> > > > --list --topic test). Should we follow the same convention
> > > > for kafka-cluster.sh (and kafka-storage.sh)?
> > >
> > > Hmm.  I don't think argparse4j supports using double dashes to identify
> > > subcommands.  I think it might be confusing as well, since the
> subcommand
> > > must come first, unlike a plain old argument which can be anywhere on
> the
> > > command line.
> > >
> > > > 213.3 Should we add an admin api for broker decommission so that
> this can
> > > > be done programmatically?
> > > >
> > >
> > > Yes, there is an admin client API for decommissioning as well.
> > >
> > > > 220. DecommissionBroker: "NOT_CONTROLLER if the node that the
> request was
> > > > sent to is not the controller". I thought the clients never send a
> > > request
> > > > to the controller directly now and the broker will forward it to the
> > > > controller?
> > > >
> > >
> > > If the controller moved recently, it's possible that the broker could
> send
> > > to a controller that has just recently become inactive.  In that case
> > > NOT_CONTROLLER would be returned.  (A standby controller returns
> > > NOT_CONTROLLER for most APIs).
> > >
> > > > 221. Could we add the required ACL for the new requests?
> > > >
> > >
> > > Good point.  I added the required ACL for each new RPC.
> > >
> > > best,
> > > Colin
> > >
> > >
> > > > Jun
> > > >
> > > > On Wed, Dec 16, 2020 at 

[jira] [Resolved] (KAFKA-10862) kafka stream consume from the earliest by default

2020-12-17 Thread Yuexi Liu (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yuexi Liu resolved KAFKA-10862.
---
Resolution: Not A Problem

> kafka stream consume from the earliest by default
> -
>
> Key: KAFKA-10862
> URL: https://issues.apache.org/jira/browse/KAFKA-10862
> Project: Kafka
>  Issue Type: Bug
>  Components: config, consumer
>Affects Versions: 2.3.1
> Environment: MAC
>Reporter: Yuexi Liu
>Priority: Major
>
> on [https://kafka.apache.org/documentation/#auto.offset.reset] it shows 
> auto.offset.reset is by default using latest, but from code, it is not
>  
> [https://github.com/apache/kafka/blob/72918a98161ba71ff4fa8116fdf8ed02b09a0580/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L884]
>  and when I create a kafka stream without specified offset reset policy, it 
> consumed from the beginning
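For anyone hitting the same surprise: Kafka Streams deliberately overrides the embedded consumer's `auto.offset.reset` default of `latest` with `earliest` (that is the StreamsConfig line linked above), so the documented plain-consumer default does not apply. A sketch of setting it back explicitly, using plain `Properties` with the standard config key names (the application id and bootstrap server values are illustrative):

```java
import java.util.Properties;

// Kafka Streams defaults the embedded consumer's auto.offset.reset to
// "earliest" (the plain consumer's default is "latest").  To get "latest"
// behavior in a Streams app, set the key explicitly:
public class StreamsOffsetReset {
    public static Properties config() {
        Properties props = new Properties();
        props.put("application.id", "my-streams-app");     // illustrative value
        props.put("bootstrap.servers", "localhost:9092");  // illustrative value
        props.put("auto.offset.reset", "latest");          // override Streams' default
        return props;
    }
}
```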



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-10850) Use primitive type to replace deprecated 'new Integer' from BrokerToControllerRequestThreadTest

2020-12-17 Thread Chia-Ping Tsai (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chia-Ping Tsai resolved KAFKA-10850.

Fix Version/s: 2.8.0
   Resolution: Fixed

> Use primitive type to replace deprecated 'new Integer' from 
> BrokerToControllerRequestThreadTest
> ---
>
> Key: KAFKA-10850
> URL: https://issues.apache.org/jira/browse/KAFKA-10850
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Chia-Ping Tsai
>Assignee: Govinda Sakhare
>Priority: Minor
>  Labels: newbie
> Fix For: 2.8.0
>
>
> as title



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] KIP-695: Improve Streams Time Synchronization

2020-12-17 Thread Bill Bejeck
Hi John,

I've made a pass over the KIP and I think it will be a good addition.

Modulo Jason's question, I'm a +1 (binding).

Thanks,
Bill

On Wed, Dec 16, 2020 at 1:29 PM Jason Gustafson  wrote:

> Hi John,
>
> Just one question. It wasn't very clear to me exactly when the metadata
> would be returned in `ConsumerRecords`. Would we /always/ include the
> metadata for all partitions that are assigned, or would it be based on the
> latest fetches?
>
> Thanks,
> Jason
>
> On Fri, Dec 11, 2020 at 4:07 PM John Roesler  wrote:
>
> > Thanks, Guozhang!
> >
> > All of your feedback sounds good to me. I’ll update the KIP when I am
> able.
> >
> > 3) I believe it is the position after the fetch, but I will confirm. I
> > think omitting position may render beginning and end offsets useless as
> > well, which leaves only lag. That would be fine with me, but it also
> seems
> > nice to supply this extra metadata since it is well defined and probably
> > handy for others. Therefore, I’d go the route of specifying the exact
> > semantics and keeping it.
> >
> > Thanks for the review,
> > John
> >
> > On Fri, Dec 11, 2020, at 17:36, Guozhang Wang wrote:
> > > Hello John,
> > >
> > > Thanks for the updates! I've made a pass on the KIP and also the POC
> PR,
> > > here are some minor comments:
> > >
> > > 1) nit: "receivedTimestamp" -> it seems the metadata keep getting
> > updated,
> > > and we do not create a new object but just update the values in-place,
> so
> > > maybe calling it `lastUpdateTimestamp` is better?
> > >
> > > 2) It will be great to verify in javadocs that the new API
> > > "ConsumerRecords#metadata(): Map" may return
> a
> > > superset of TopicPartitions than the existing API that returns the data
> > by
> > > partitions, in case users assume their map key-entries would always be
> > the
> > > same.
> > >
> > > 3) The "position()" API of the call needs better clarification: is it
> the
> > > current position AFTER the records are returned, or is it BEFORE the
> > > records are returned? Personally I'd suggest we do not include it if it
> > is
> > > not used anywhere yet just to avoid possible misuse, but I'm fine if
> you
> > > like to keep it still; in that case just clarify its semantics.
> > >
> > >
> > > Other than that, I'm +1 on the KIP as well!
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Fri, Dec 11, 2020 at 8:15 AM Walker Carlson 
> > > wrote:
> > >
> > > > Thanks for the KIP!
> > > >
> > > > +1 (non-binding)
> > > >
> > > > walker
> > > >
> > > > On Wed, Dec 9, 2020 at 11:40 AM Bruno Cadonna 
> > wrote:
> > > >
> > > > > Thanks for the KIP, John!
> > > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Best,
> > > > > Bruno
> > > > >
> > > > > On 08.12.20 18:03, John Roesler wrote:
> > > > > > Hello all,
> > > > > >
> > > > > > There hasn't been much discussion on KIP-695 so far, so I'd
> > > > > > like to go ahead and call for a vote.
> > > > > >
> > > > > > As a reminder, the purpose of KIP-695 to improve on the
> > > > > > "task idling" feature we introduced in KIP-353. This KIP
> > > > > > will allow Streams to offer deterministic time semantics in
> > > > > > join-type topologies. For example, it makes sure that
> > > > > > when you join two topics, that we collate the topics by
> > > > > > timestamp. That was always the intent with task idling (KIP-
> > > > > > 353), but it turns out the previous mechanism couldn't
> > > > > > provide the desired semantics.
> > > > > >
> > > > > > The details are here:
> > > > > > https://cwiki.apache.org/confluence/x/JSXZCQ
> > > > > >
> > > > > > Thanks,
> > > > > > -John
> > > > > >
> > > > >
> > > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>


[jira] [Resolved] (KAFKA-9126) Extend `StreamJoined` to allow more store configs

2020-12-17 Thread Leah Thomas (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leah Thomas resolved KAFKA-9126.

Fix Version/s: 2.8.0
   Resolution: Fixed

> Extend `StreamJoined` to allow more store configs
> -
>
> Key: KAFKA-9126
> URL: https://issues.apache.org/jira/browse/KAFKA-9126
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.4.0
>Reporter: Matthias J. Sax
>Assignee: Leah Thomas
>Priority: Minor
>  Labels: needs-kip, newbie, newbie++
> Fix For: 2.8.0
>
>
> In 2.4.0 release, we introduced `StreamJoined` configuration object via 
> KIP-479 (KAFKA-8558). The idea of `StreamJoined` is to be an equivalent to 
> `Materialized` but customized for stream-stream joins, which have two stores 
> (in contrast to the usage of `Materialized` that is used for single-store 
> operators).
> During the KIP discussion, the idea to allow setting the store retention time 
> and enable/disable changelogging for the store was discussed. However, at 
> some point this idea was dropped for unknown reasons (seems it slipped).
> We should consider to extend `StreamJoined` with `withRetentionPeriod()` and 
> `loggingEnabled()`/`loggingDisabled()` methods to get feature parity to 
> `Materialized`.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10863) Convert CONTROL_RECORD_KEY_SCHEMA_VERSION to use auto-generated protocol

2020-12-17 Thread dengziming (Jira)
dengziming created KAFKA-10863:
--

 Summary: Convert CONTROL_RECORD_KEY_SCHEMA_VERSION to use 
auto-generated protocol
 Key: KAFKA-10863
 URL: https://issues.apache.org/jira/browse/KAFKA-10863
 Project: Kafka
  Issue Type: Improvement
Reporter: dengziming
Assignee: dengziming


Similar to other issues such as KAFKA-10497



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [VOTE] voting on KIP-631: the quorum-based Kafka controller

2020-12-17 Thread David Jacot
Thanks for driving this KIP, Colin. The KIP is really well written. This is
so exciting!

+1 (binding)

Best,
David

On Wed, Dec 16, 2020 at 11:51 PM Colin McCabe  wrote:

> On Wed, Dec 16, 2020, at 13:08, Ismael Juma wrote:
> > Thanks for all the work on the KIP. Given the magnitude of the KIP, I
> > expect that some tweaks will be made as the code is implemented, reviewed
> > and tested. I'm overall +1 (binding).
> >
>
> Thanks, Ismael.
>
> > A few comments below:
> > 1. It's a bit weird for kafka-storage to output a random uuid. Would it
> be
> > better to have a dedicated command for that?
>
> I'm not sure.  The nice thing about putting it in kafka-storage.sh is that
> it's there when you need it.  I also think that having subcommands, like we
> do here, really reduces the "clutter" that we have in some other
> command-line tools.  When you get help about the "info" subcommand, you
> don't see flags for any other subcommand, for example.  I guess we can move
> this later if it seems more intuitive though.
>
> > Also, since we use base64
> > encoded uuids nearly everywhere (including cluster and topic ids), it
> would
> > be good to follow that pattern instead of the less compact
> > "51380268-1036-410d-a8fc-fb3b55f48033".
>
> Good idea.  I have updated this to use base64 encoded UUIDs.
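As an aside, the base64 encoding convention mentioned above can be sketched as follows: pack the 128-bit UUID into 16 bytes and emit URL-safe base64 without padding, which yields a compact 22-character id. This mirrors how topic and cluster ids are rendered, though the helper class itself is illustrative:

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Sketch: encode a 128-bit UUID as URL-safe base64 without padding,
// giving a 22-character id instead of the 36-character dashed form.
public class CompactUuid {
    public static String encode(UUID uuid) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(uuid.getMostSignificantBits());
        buf.putLong(uuid.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buf.array());
    }
}
```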
>
> > 2. This is a nit, but I think it would be better to talk about built-in
> > quorum mode instead of KIP-500 mode. It's more self descriptive than a
> KIP
> > reference.
>
> I do like the sound of "quorum mode."  I guess the main question is, if we
> later implement raft quorums for regular topics, would that nomenclature be
> confusing?  I guess we could talk about "metadata quorum mode" to avoid
> confusion.  Hmm.
>
> > 3. Did we consider using `session` (like the group coordinator) instead
> of
> `registration` in `broker.registration.timeout.ms`?
>
> Hmm, broker.session.timeout.ms does sound better.  I changed it to that.
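The session-timeout semantics implied by that rename can be sketched in a few lines: the controller fences any broker whose last heartbeat is older than the session timeout. This is purely illustrative, not the controller's actual code:

```java
// Illustrative sketch of broker.session.timeout.ms semantics: a broker is
// fenced when its most recent heartbeat is older than the session timeout.
public class BrokerSession {
    public static boolean shouldFence(long lastHeartbeatMs, long nowMs, long sessionTimeoutMs) {
        return nowMs - lastHeartbeatMs > sessionTimeoutMs;
    }
}
```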
>
> > 4. The flat id space for the controller and broker while requiring a
> > different id in embedded mode seems a bit unintuitive. Are there any
> other
> > systems that do this? I know we covered some of the reasons in the
> "Shared
> > IDs between Multiple Nodes" rejected alternatives section, but it didn't
> > seem totally convincing to me.
>
> One of my concerns here is that using separate ID spaces for controllers
> versus brokers would potentially lead to metrics or logging collisions.  We
> can take a look at that again once the implementation is further along, I
> guess, to see how often that is a problem in practice.
>
> > 5. With regards to the controller process listening on a separate port,
> it
> > may be worth adding a sentence about the forwarding KIP as that is a main
> > reason why the controller port doesn't need to be accessible.
>
> Good idea... I added a short reference to KIP-590 in the "Networking"
> section.
>
> > 6. The internal topic seems to be called @metadata. I'm personally not
> > convinced about the usage of @ in this way. I think I would go with the
> > same convention we have used for other existing internal topics.
>
> I knew this one would be controversial :)
>
> I guess the main argument here is that using @ avoids collisions with any
> existing topic.  Leading underscores, even double underscores, can be used
> by users to create new topics, but an "at sign" cannot.  It would be nice to
> have a namespace for system topics that we knew nobody else could break
> into.
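To illustrate the collision argument: user-created topic names are restricted to a small character set, so '@' can never appear in one. A simplified sketch (the regex mirrors Kafka's topic-name validation; the class and method names are illustrative, and the check omits the special cases "." and ".."):

```java
import java.util.regex.Pattern;

public class TopicNames {
    // Kafka restricts user-created topic names to alphanumerics,
    // '.', '_', and '-', up to 249 characters.  A name containing
    // '@' can therefore never collide with a user topic.
    private static final Pattern LEGAL = Pattern.compile("[a-zA-Z0-9._-]+");

    static boolean isLegalUserTopicName(String name) {
        return !name.isEmpty()
                && name.length() <= 249
                && LEGAL.matcher(name).matches();
    }

    public static void main(String[] args) {
        // true: underscores are legal, so users can create this form
        System.out.println(isLegalUserTopicName("__consumer_offsets"));
        // false: '@' is not a legal topic-name character
        System.out.println(isLegalUserTopicName("@metadata"));
    }
}
```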
>
> > 7. We talk about the metadata.format feature flag. Is this intended to
> > allow for single roll upgrades?
> > 8. Could the incarnation id be called registration id? Or is there a
> reason
> > why this would be a bad name?
>
> I liked "incarnation id" because it expresses the idea that each new
> incarnation of the broker gets a different one.  I think "registration id"
> might be confused with "the broker id is the ID we're registering."
>
> > 9. Could `CurMetadataOffset` be called `CurrentMetadataOffset` for
> > `BrokerRegistrationRequest`? The abbreviation here doesn't seem to help
> > much and makes things slightly less readable. It would also make it
> > consistent with `BrokerHeartbeatRequest`.
>
> Yeah, the abbreviated name is inconsistent.  I will change it to
> CurrentMetadataOffset.
>
> > 10. Should `UnregisterBrokerRecord` be `DeregisterBrokerRecord`?
>
> Hmm, "Register/Unregister" is more consistent with "Fence/Unfence" which
> is why I went with Unregister.  It looks like they're both in the
> dictionary, so I'm not sure if "deregister" has an advantage...
>
> > 11. Broker metrics typically have a PerSec suffix, should we stick with
> > that for the `MetadataCommitRate`?
>
> Added.
>
> > 12. For the lag metrics, would it be clearer if we included "Offset" in
> the
> > name? In theory, we could have time based lag metrics too. Having said
> > that, existing offset lag metrics do seem to just have `Lag` in their
> name
> > without further qualification.
> >
>
> Yeah, I 

Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #296

2020-12-17 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10861; Fix race condition in flaky test 
`testFencingOnSendOffsets` (#9762)


--
[...truncated 6.93 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowNoSuchElementExceptionForUnusedOutputTopicWithDynamicRouting[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowNoSuchElementExceptionForUnusedOutputTopicWithDynamicRouting[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest >