Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #105

2020-10-01 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10134 Follow-up: Set the re-join flag in heartbeat failure 
(#9354)


--
[...truncated 3.33 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessConsumerRecordList[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldFlushStoreForFirstInput[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUpdateStoreForNewKey[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldThrowForMissingTime[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureInternalTopicNamesIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnWallClockTimeDeprecated[Eos enabled = false] STARTED


Build failed in Jenkins: Kafka » kafka-2.6-jdk8 #22

2020-10-01 Thread Apache Jenkins Server
See 


Changes:

[wangguoz] KAFKA-10134 Follow-up: Set the re-join flag in heartbeat failure 
(#9354)


--
[...truncated 3.16 MB...]
org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardInit 
PASSED

org.apache.kafka.streams.TestTopicsTest > testNonUsedOutputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonUsedOutputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testEmptyTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testEmptyTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testStartTimestamp STARTED

org.apache.kafka.streams.TestTopicsTest > testStartTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testNegativeAdvance STARTED

org.apache.kafka.streams.TestTopicsTest > testNegativeAdvance PASSED

org.apache.kafka.streams.TestTopicsTest > shouldNotAllowToCreateWithNullDriver 
STARTED

org.apache.kafka.streams.TestTopicsTest > shouldNotAllowToCreateWithNullDriver 
PASSED

org.apache.kafka.streams.TestTopicsTest > testDuration STARTED

org.apache.kafka.streams.TestTopicsTest > testDuration PASSED

org.apache.kafka.streams.TestTopicsTest > testOutputToString STARTED

org.apache.kafka.streams.TestTopicsTest > testOutputToString PASSED

org.apache.kafka.streams.TestTopicsTest > testValue STARTED

org.apache.kafka.streams.TestTopicsTest > testValue PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestampAutoAdvance STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestampAutoAdvance PASSED

org.apache.kafka.streams.TestTopicsTest > testOutputWrongSerde STARTED

org.apache.kafka.streams.TestTopicsTest > testOutputWrongSerde PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputTopicWithNullTopicName PASSED

org.apache.kafka.streams.TestTopicsTest > testWrongSerde STARTED

org.apache.kafka.streams.TestTopicsTest > testWrongSerde PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMapWithNull STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMapWithNull PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingOutputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingOutputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testMultipleTopics STARTED

org.apache.kafka.streams.TestTopicsTest > testMultipleTopics PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueList PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateOutputWithNullDriver PASSED

org.apache.kafka.streams.TestTopicsTest > testValueList STARTED

org.apache.kafka.streams.TestTopicsTest > testValueList PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordList PASSED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic STARTED

org.apache.kafka.streams.TestTopicsTest > testNonExistingInputTopic PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValuesToMap PASSED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList STARTED

org.apache.kafka.streams.TestTopicsTest > testRecordsToList PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValueListDuration PASSED

org.apache.kafka.streams.TestTopicsTest > testInputToString STARTED

org.apache.kafka.streams.TestTopicsTest > testInputToString PASSED

org.apache.kafka.streams.TestTopicsTest > testTimestamp STARTED

org.apache.kafka.streams.TestTopicsTest > testTimestamp PASSED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders STARTED

org.apache.kafka.streams.TestTopicsTest > testWithHeaders PASSED

org.apache.kafka.streams.TestTopicsTest > testKeyValue STARTED

org.apache.kafka.streams.TestTopicsTest > testKeyValue PASSED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName STARTED

org.apache.kafka.streams.TestTopicsTest > 
shouldNotAllowToCreateTopicWithNullTopicName PASSED

> Task :streams:upgrade-system-tests-0100:spotbugsMain NO-SOURCE
> Task :streams:upgrade-system-tests-0100:test
> Task :streams:upgrade-system-tests-0101:compileJava NO-SOURCE
> Task :streams:upgrade-system-tests-0101:processResources NO-SOURCE
> Task :streams:upgrade-system-tests-0101:classes UP-TO-DATE
> Task :streams:upgrade-system-tests-0101:checkstyleMain NO-SOURCE
> Task :streams:upgrade-system-tests-0101:compileTestJava
> Task 

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-10-01 Thread Justine Olshan
Hi Jun,
Thanks for looking at it again. Here's the new spec. (I fixed the typo in
it too.)
{"name": "id", "type": "string", "doc": option[UUID]}

Justine


On Thu, Oct 1, 2020 at 5:03 PM Jun Rao  wrote:

> Hi, Justine,
>
> Thanks for the update. The KIP looks good to me now. Just a minor comment
> below.
>
> 30. Perhaps "option[UUID]" can be put in the doc.
>
> Jun
>
> On Thu, Oct 1, 2020 at 3:28 PM Justine Olshan 
> wrote:
>
> > Hi Jun,
> > Thanks for the response!
> >
> > 30. I think I might have changed this in between. The current version
> > says: {"name": "id", "type": "option[UUID]", "doc": "topic id"}
> > I have switched to the option type to cover the migration case where a
> > TopicZNode does not yet have a topic ID.
> > I understand that due to json, this field is written as a string, so if I
> > should move the "option[uuid]" to the doc field and the type should be
> > "string" please let me know.
> >
> > 40. I've added a definition for UUID.
> > 41,42. Fixed
> >
> > Thank you,
> > Justine
> >
> > On Wed, Sep 30, 2020 at 1:15 PM Jun Rao  wrote:
> >
> > > Hi, Justine,
> > >
> > > Thanks for the summary. Just a few minor comments below.
> > >
> > > 30.  {"name": "id", "type": "string", "doc": "version id"}}: The doc
> > should
> > > say UUID. The issue seems unfixed.
> > >
> > > 40. Since UUID is public facing, could you include its definition?
> > >
> > > 41. StopReplicaResponse still includes the topic field.
> > >
> > > 42. "It is unnecessary to include the name of the topic in the
> following
> > > Request/Response calls" It would be useful to include all modified
> > requests
> > > (e.g. produce) in the list below.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Wed, Sep 30, 2020 at 10:17 AM Justine Olshan 
> > > wrote:
> > >
> > > > Hello again,
> > > >
> > > > I've taken some time to discuss some of the remaining points brought
> up
> > > by
> > > > the previous emails offline. Here are some of the conclusions.
> > > >
> > > > 1. Directory Structure:
> > > > There was some talk about whether the directory structure should be
> > > changed
> > > > to replace all topic names with topic IDs.
> > > > This will be a useful change, but will prevent downgrades. It will be
> > > best
> > > > to wait until a major release, likely alongside KIP-500 changes that
> > will
> > > > prevent downgrades. I've updated the KIP to include this change with
> > the
> > > > note about migration and deprecation.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-log.dirlayout
> > > >
> > > > 2. Partition Metadata file
> > > > There was some disagreement about the inclusion of the partition
> > metadata
> > > > file.
> > > > This file will be needed to persist the topic ID, especially while we
> > > still
> > > > have the old directory structure. Even after the changes, this file
> can
> > > be
> > > > useful for debugging and recovery.
> > > > Creating a single mapping file from topic names to topic IDs was
> > > > considered, but ultimately scrapped as it would not be as easy to
> > > maintain.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-PartitionMetadatafile
> > > >
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-PersistingTopicIDs
> > > >
> > > > 3. Produce Protocols
> > > > After some further discussion, replacing the topic name with the
> topic
> > > ID
> > > > in these protocols has been added to the KIP.
> > > > This will cut down on the size of the protocol. Since changes to
> fetch
> > > are
> > > > included, it does make sense to update these protocols.
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-Produce
> > > >
> > > > 4. KIP-500 Compatibility
> > > > After some discussion with those involved with KIP-500, it seems best
> > to
> > > > use a sentinel topic ID for the metadata topic that is used before
> the
> > > > first controller election.
> > > > We had an issue where this metadata topic may not be assigned an ID
> > > before
> > > > utilizing the Vote and Fetch protocols. It was decided to reserve a
> > > unique
> > > > ID for this topic to be used until the controller could give the
> topic
> > a
> > > > unique ID.
> > > > Having the topic keep the sentinel ID (not be reassigned to a unique
> > ID)
> > > > was considered, but it seemed like a good idea to leave the option
> open
> > > for
> > > > the metadata topic to have a unique ID in cases where it would need
> to
> > be
> > > > differentiated from other clusters' metadata topics. (ex. tiered
> > storage)
> > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-CompatibilitywithKIP-500
> > > >
> > > > I've also split up the KIP 

Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-10-01 Thread Jose Garcia Sancio
Comments below.

Here are the changes to the KIP:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158864763=34=33

> 41. That's a good point. With compacted topic, the cleaning won't be done
> until the active segment rolls. With snapshots, I guess we don't have this
> restriction? So, it would be useful to guard against too frequent
> snapshotting. Does the new proposal address this completely? If the
> snapshot has only 1 record, and the new record keeps updating the same key,
> does that still cause the snapshot to be generated frequently?

That is true. In addition to metadata.snapshot.min.cleanable.ratio we
can add the following configuration:

metadata.snapshot.min.records.size - This is the minimum number of
bytes in the replicated log between the latest snapshot and the
high-watermark needed before generating a new snapshot. The default is
20MB.

Both configurations need to be satisfied before generating a new
snapshot. I have updated the KIP.
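
For illustration, a minimal sketch of how the two guards might compose
(hypothetical names; the thread does not specify the implementation):

    // Both thresholds must be met before a new snapshot is generated.
    // minCleanableRatio ~ metadata.snapshot.min.cleanable.ratio
    // minRecordsSize    ~ metadata.snapshot.min.records.size (default 20 MB)
    class SnapshotPolicy {
        boolean shouldSnapshot(double cleanableRatio, long bytesSinceSnapshot,
                               double minCleanableRatio, long minRecordsSize) {
            return cleanableRatio >= minCleanableRatio
                && bytesSinceSnapshot >= minRecordsSize;
        }
    }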

> 42. One of the reasons that we added the per partition limit is to allow
> each partition to make relatively even progress during the catchup phase.
> This helps kstreams by potentially reducing the gap between stream time
> from different partitions. If we can achieve the same thing without the
> partition limit, it will be fine too. "3. For the remaining partitions send
> at most the average max bytes plus the average remaining bytes." Do we
> guarantee that request level max_bytes can be filled up when it can? Could
> we document how we distribute the request level max_bytes to partitions in
> the KIP?

I want to allow some flexibility in the implementation. How about the
following update to the FetchSnapshot Request Handling section:

3. Send the bytes in the snapshot from Position. If there are multiple
partitions in the FetchSnapshot request, then the leader will evenly
distribute the number of bytes sent across all of the partitions. The
leader will not send more bytes in the response than ResponseMaxBytes,
the minimum of MaxBytes in the request and the value configured in
replica.fetch.response.max.bytes.
  a. Each topic partition is guaranteed to receive at least the
average of ResponseMaxBytes if that snapshot has enough bytes
remaining.
  b. If there are topic partitions with snapshots that have remaining
bytes less than the average ResponseMaxBytes, then those bytes may be
used to send snapshot bytes for other topic partitions.
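
A minimal sketch of one way to implement that distribution in two quick
passes over the partitions (hypothetical code, not the actual
KafkaRaftClient):

    class SnapshotByteAllocator {
        // remaining[i] is the number of snapshot bytes partition i still
        // needs. Returns how many bytes each partition is granted; the
        // total granted never exceeds responseMaxBytes.
        static long[] allocate(long responseMaxBytes, long[] remaining) {
            int n = remaining.length;
            long average = responseMaxBytes / n;
            long[] grant = new long[n];
            long spare = 0;  // budget left over by small partitions
            int large = 0;   // partitions that need more than the average
            for (int i = 0; i < n; i++) {
                if (remaining[i] <= average) {
                    grant[i] = remaining[i];          // guarantee (a): send it all
                    spare += average - remaining[i];  // guarantee (b): reuse the rest
                } else {
                    large++;
                }
            }
            long extra = large > 0 ? spare / large : 0;
            for (int i = 0; i < n; i++) {
                if (remaining[i] > average) {
                    grant[i] = Math.min(remaining[i], average + extra);
                }
            }
            return grant;
        }
    }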

I should also point out that in practice for KIP-630 this request will
only have one topic partition (__cluster_metadata-0).

I should also point out that FetchSnapshot is sending bytes not
records so there is no requirement that the response must contain at
least one record like Fetch.

>Also, should we change Fetch accordingly?

If we want to make this change I think we should do this in another
KIP. What do you think?

> 46. If we don't have IBP, how do we make sure that a broker doesn't
> issue FetchSnapshotRequest when the receiving broker hasn't been upgraded
> yet?

For a broker to send a FetchSnapshotRequest it means that it received
a FetchResponse that contained a SnapshotId field. For the leader to
send a SnapshotId in the FetchResponse it means that the leader is
executing code that knows how to handle FetchSnapshotRequests.

The inverse is also true. For the follower to receive a SnapshotId for
the FetchResponse it means that it sent the FetchRequest to the leader
of the __cluster_metadata-0 topic partitions. Only the KafkaRaftClient
will send that fetch request.
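
In other words, the capability check is implicit in the Fetch exchange.
A sketch of the follower-side logic (hypothetical types and field names,
for illustration only):

    // If the leader set SnapshotId, it is running code that understands
    // FetchSnapshot, so it is safe to ask it for the snapshot. Otherwise
    // keep fetching records as before.
    void onFetchResponse(FetchResponse response) {
        if (response.snapshotId() != null) {
            sendFetchSnapshotRequest(response.snapshotId());
        } else {
            appendFetchedRecords(response.records());
        }
    }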

After writing the above, I see what you are saying. The broker needs
to know if it should enable the KafkaRaftClient and send FetchRequests
to the __cluster_metadata-0 topic partition. I think that there is
also a question of how to perform a rolling migration of a cluster
from ZK to KIP-500. I think we will write a future KIP that documents
this process.

Thanks for your help here. For now, I'll mention that we will bump the
IBP. The new wording for the "Compatibility, Deprecation, and
Migration Plan" section:

This KIP is only implemented for the internal topic
__cluster_metadata. The inter-broker protocol (IBP) will be increased
to indicate that all of the brokers in the cluster support KIP-595 and
KIP-630.

Thanks,
-Jose


Re: [VOTE] KIP-478 Strongly Typed Streams Processor API

2020-10-01 Thread Matthias J. Sax
Thanks John.

SGTM.

On 10/1/20 2:50 PM, John Roesler wrote:
> Hello again, all,
> 
> I'm sorry to make another tweak to this KIP, but during the
> implementation of the design we've just agreed on, I
> realized that Processors would almost never need to
> reference the RecordMetadata. Therefore, I'm proposing to
> streamline the API by moving the Optional to
> the new ProcessorContext as a method, rather than making it
> a method argument to Processor#process.
> 
> The change is visible here:
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=118172121=16=15
> 
> All of the same semantics and considerations we discussed
> still apply, it's just that Processor implementers won't
> have to think about it unless they actually _need_ the
> topic/partition/offset information from the RecordMetadata.
> 
> Also, the PR for this part of the KIP is now available here:
> https://github.com/apache/kafka/pull/9361
> 
> I know it's a bit on the heavy side; I've annotated the PR
> to try and ease the reviewer's job. I'd greatly appreciate
> it if anyone can take the time to review.
> 
> Thanks,
> -John
> 
> On Wed, 2020-09-30 at 10:16 -0500, John Roesler wrote:
>> Thanks, Matthias!
>>
>> I can certainly document it. I didn't bother because the old
>> Processor, Supplier, and Context will themselves be
>> deprecated, so any method that handles them won't be able to
>> avoid the deprecation warning. Nevertheless, it doesn't hurt
>> just to explicitly deprecated those methods.
>>
>> Thanks,
>> -John
>>
>> On Wed, 2020-09-30 at 08:10 -0700, Matthias J. Sax wrote:
>>> Thanks John. I like the proposal.
>>>
>>> Btw: I was just going over the KIP and realized that we add new methods
>>> to `StreamBuilder`, `Topology`, and `KStream` that take the new
>>> `ProcessorSupplier` class -- should we also deprecate the corresponding
>>> existing ones that take the old `ProcessorSupplier`?
>>>
>>>
>>> -Matthias
>>>
>>>
>>> On 9/30/20 7:46 AM, John Roesler wrote:
 Thanks Paul and Sophie,

 Your feedback certainly underscores the need to be explicit
 in the javadoc about why that parameter is Optional. Getting
 this kind of feedback before the release is exactly the kind
 of outcome we hope to get from the KIP process!

 Thanks,
 -John

 On Tue, 2020-09-29 at 22:32 -0500, Paul Whalen wrote:
> John, I totally agree that adding a method to Processor is cumbersome and
> not a good path.  I was imagining maybe a separate interface that could be
> used in the appropriate context, but I don't think that makes too much
> sense - it's just too far away from what Kafka Streams is.  I was
> originally more interested in the "why" Optional than the "how" (I think 
> my
> original reply overplayed the "optional as an argument" concern).  But
> you've convinced me that there is a perfectly legitimate "why".  We should
> make sure that it's clear why it's Optional, but I suppose that goes
> without saying.  It's a nice opportunity to make the API reflect more what
> is actually going on under the hood.
>
> Thanks!
> Paul
>
> On Tue, Sep 29, 2020 at 10:05 PM Sophie Blee-Goldman 
> wrote:
>
>> FWIW, while I'm really not a fan of Optional in general, I agree that its
>> usage
>> here seems appropriate. Even for those rare software developers who
>> carefully
>> read all the docs several times over, I think it wouldn't be too hard to
>> miss a
>> note about the RecordMetadata possibly being null.
>>
>> Especially because it's not that obvious why at first glance, and takes a
>> bit of
>> thinking to realize that records originating from a Punctuator wouldn't
>> have a
>> "current record". This  is something that has definitely confused users
>> today.
>>
>> It's on us to improve the education here -- and an
>> Optional<RecordMetadata>
>> would naturally raise awareness of this subtlety
>>
>> On Tue, Sep 29, 2020 at 7:40 PM Sophie Blee-Goldman 
>> wrote:
>>
>>> Does my reply address your concerns?
>>>
>>>
>>> Yes; also, I definitely misread part of the proposal earlier and thought
>>> you had put
>>> the timestamp field in RecordMetadata. Sorry for not giving things a
>>> closer look
>>> before responding! I'm not sure my original message made much sense 
>>> given
>>> the misunderstanding, but thanks for responding anyway :P
>>>
>>> Having given the proposal a second pass, I agree, it's very elegant. +1
>>>
>>> On Tue, Sep 29, 2020 at 6:50 PM John Roesler 
>> wrote:
 Thanks for the reply, Sophie,

 I think I may have summarized too much in my prior reply.

 In the currently proposed KIP, any caller of forward() must
 supply a Record, which consists of:
 * key
 * value
 * timestamp
 * headers (with a 
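
To make the Record shape above concrete, a minimal sketch assuming the
proposed org.apache.kafka.streams.processor.api types (illustrative, not
the final API):

    import org.apache.kafka.streams.processor.api.Processor;
    import org.apache.kafka.streams.processor.api.ProcessorContext;
    import org.apache.kafka.streams.processor.api.Record;

    public class UppercaseProcessor
            implements Processor<String, String, String, String> {
        private ProcessorContext<String, String> context;

        @Override
        public void init(final ProcessorContext<String, String> context) {
            this.context = context;
        }

        @Override
        public void process(final Record<String, String> record) {
            // Key, value, timestamp, and headers all travel with the Record.
            context.forward(record.withValue(record.value().toUpperCase()));
        }
    }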

Re: [DISCUSS] KIP-516: Topic Identifiers

2020-10-01 Thread Jun Rao
Hi, Justine,

Thanks for the update. The KIP looks good to me now. Just a minor comment
below.

30. Perhaps "option[UUID]" can be put in the doc.

Jun

On Thu, Oct 1, 2020 at 3:28 PM Justine Olshan  wrote:

> Hi Jun,
> Thanks for the response!
>
> 30. I think I might have changed this in between. The current version
> says: {"name": "id", "type": "option[UUID]", "doc": "topic id"}
> I have switched to the option type to cover the migration case where a
> TopicZNode does not yet have a topic ID.
> I understand that due to json, this field is written as a string, so if I
> should move the "option[uuid]" to the doc field and the type should be
> "string" please let me know.
>
> 40. I've added a definition for UUID.
> 41,42. Fixed
>
> Thank you,
> Justine
>
> On Wed, Sep 30, 2020 at 1:15 PM Jun Rao  wrote:
>
> > Hi, Justine,
> >
> > Thanks for the summary. Just a few minor comments below.
> >
> > 30.  {"name": "id", "type": "string", "doc": "version id"}}: The doc
> should
> > say UUID. The issue seems unfixed.
> >
> > 40. Since UUID is public facing, could you include its definition?
> >
> > 41. StopReplicaResponse still includes the topic field.
> >
> > 42. "It is unnecessary to include the name of the topic in the following
> > Request/Response calls" It would be useful to include all modified
> requests
> > (e.g. produce) in the list below.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Wed, Sep 30, 2020 at 10:17 AM Justine Olshan 
> > wrote:
> >
> > > Hello again,
> > >
> > > I've taken some time to discuss some of the remaining points brought up
> > by
> > > the previous emails offline. Here are some of the conclusions.
> > >
> > > 1. Directory Structure:
> > > There was some talk about whether the directory structure should be
> > changed
> > > to replace all topic names with topic IDs.
> > > This will be a useful change, but will prevent downgrades. It will be
> > best
> > > to wait until a major release, likely alongside KIP-500 changes that
> will
> > > prevent downgrades. I've updated the KIP to include this change with
> the
> > > note about migration and deprecation.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-log.dirlayout
> > >
> > > 2. Partition Metadata file
> > > There was some disagreement about the inclusion of the partition
> metadata
> > > file.
> > > This file will be needed to persist the topic ID, especially while we
> > still
> > > have the old directory structure. Even after the changes, this file can
> > be
> > > useful for debugging and recovery.
> > > Creating a single mapping file from topic names to topic IDs was
> > > considered, but ultimately scrapped as it would not be as easy to
> > maintain.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-PartitionMetadatafile
> > >
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-PersistingTopicIDs
> > >
> > > 3. Produce Protocols
> > > After some further discussion, replacing the topic name with the topic
> > ID
> > > in these protocols has been added to the KIP.
> > > This will cut down on the size of the protocol. Since changes to fetch
> > are
> > > included, it does make sense to update these protocols.
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-Produce
> > >
> > > 4. KIP-500 Compatibility
> > > After some discussion with those involved with KIP-500, it seems best
> to
> > > use a sentinel topic ID for the metadata topic that is used before the
> > > first controller election.
> > > We had an issue where this metadata topic may not be assigned an ID
> > before
> > > utilizing the Vote and Fetch protocols. It was decided to reserve a
> > unique
> > > ID for this topic to be used until the controller could give the topic
> a
> > > unique ID.
> > > Having the topic keep the sentinel ID (not be reassigned to a unique
> ID)
> > > was considered, but it seemed like a good idea to leave the option open
> > for
> > > the metadata topic to have a unique ID in cases where it would need to
> be
> > > differentiated from other clusters' metadata topics. (ex. tiered
> storage)
> > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-CompatibilitywithKIP-500
> > >
> > > I've also split up the KIP into sub-tasks on the JIRA. Hopefully this
> > will
> > > provide a better idea about what tasks we have, and eventually provide
> a
> > > place to see what's done and what is left.
> > > If there is a task I am missing, please let me know!
> > > https://issues.apache.org/jira/browse/KAFKA-8872
> > >
> > > Of course, these decisions are not set in stone, and I would love to
> hear
> > > any feedback.
> > >
> > > Thanks,
> > > Justine
> > >
> > > On Mon, Sep 28, 2020 at 11:38 AM Justine Olshan 
> > > 

[jira] [Created] (KAFKA-10565) Console producer displays interactive prompt even when input is not interactive

2020-10-01 Thread Sergei Morozov (Jira)
Sergei Morozov created KAFKA-10565:
--

 Summary: Console producer displays interactive prompt even when 
input is not interactive
 Key: KAFKA-10565
 URL: https://issues.apache.org/jira/browse/KAFKA-10565
 Project: Kafka
  Issue Type: Bug
  Components: tools
Affects Versions: 2.6.0
Reporter: Sergei Morozov


The prompt introduced in KAFKA-2955 may indeed be helpful when a user enters 
messages manually, but when the messages are read from a file it's not 
helpful and can be really annoying.
Steps to reproduce:
 1. Create a file with a decent number of messages (e.g. 80,000 in my case)
 2. Start the console producer and forward the file contents to its STDIN:

$ kafka-console-producer \
  --broker-list b1,b2,b3 \
  --topic test < messages.txt

>> ... >>

For each message the producer reads from the file, one > prompt is displayed, 
polluting the output.

Expected behavior:
 1. If the producer can detect that the input stream is not a TTY, it should 
not display the prompt.
 2. Ideally, there should be a configuration parameter to disable this 
explicitly.
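
For reference, a minimal sketch of expected behavior #1 on the JVM, assuming
System.console() is an acceptable detection mechanism (it returns null when
standard input or output is redirected, e.g. when reading from a file):

    public class PromptCheck {
        public static void main(String[] args) {
            // Non-null only for an interactive console; null when input
            // comes from a redirected file or a pipe.
            boolean interactive = System.console() != null;
            if (interactive) {
                System.out.print("> ");
            }
        }
    }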

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: Kafka » kafka-2.4-jdk8 #8

2020-10-01 Thread Apache Jenkins Server
See 


Changes:

[github] Backport Jenkinsfile to 2.4 branch (#9329)


--
[...truncated 2.90 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnAllStoresNames[Eos enabled = false] PASSED


Re: [DISCUSS] KIP-516: Topic Identifiers

2020-10-01 Thread Justine Olshan
Hi Jun,
Thanks for the response!

30. I think I might have changed this in between. The current version
says: {"name": "id", "type": "option[UUID]", "doc": "topic id"}
I have switched to the option type to cover the migration case where a
TopicZNode does not yet have a topic ID.
I understand that due to json, this field is written as a string, so if I
should move the "option[uuid]" to the doc field and the type should be
"string" please let me know.

40. I've added a definition for UUID.
41,42. Fixed

Thank you,
Justine

On Wed, Sep 30, 2020 at 1:15 PM Jun Rao  wrote:

> Hi, Justine,
>
> Thanks for the summary. Just a few minor comments below.
>
> 30.  {"name": "id", "type": "string", "doc": "version id"}}: The doc should
> say UUID. The issue seems unfixed.
>
> 40. Since UUID is public facing, could you include its definition?
>
> 41. StopReplicaResponse still includes the topic field.
>
> 42. "It is unnecessary to include the name of the topic in the following
> Request/Response calls" It would be useful to include all modified requests
> (e.g. produce) in the list below.
>
> Thanks,
>
> Jun
>
>
> On Wed, Sep 30, 2020 at 10:17 AM Justine Olshan 
> wrote:
>
> > Hello again,
> >
> > I've taken some time to discuss some of the remaining points brought up
> by
> > the previous emails offline. Here are some of the conclusions.
> >
> > 1. Directory Structure:
> > There was some talk about whether the directory structure should be
> changed
> > to replace all topic names with topic IDs.
> > This will be a useful change, but will prevent downgrades. It will be
> best
> > to wait until a major release, likely alongside KIP-500 changes that will
> > prevent downgrades. I've updated the KIP to include this change with the
> > note about migration and deprecation.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-log.dirlayout
> >
> > 2. Partition Metadata file
> > There was some disagreement about the inclusion of the partition metadata
> > file.
> > This file will be needed to persist the topic ID, especially while we
> still
> > have the old directory structure. Even after the changes, this file can
> be
> > useful for debugging and recovery.
> > Creating a single mapping file from topic names to topic IDs was
> > considered, but ultimately scrapped as it would not be as easy to
> maintain.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-PartitionMetadatafile
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-PersistingTopicIDs
> >
> > 3. Produce Protocols
> > After some further discussion, replacing the topic name with the topic
> ID
> > in these protocols has been added to the KIP.
> > This will cut down on the size of the protocol. Since changes to fetch
> are
> > included, it does make sense to update these protocols.
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-Produce
> >
> > 4. KIP-500 Compatibility
> > After some discussion with those involved with KIP-500, it seems best to
> > use a sentinel topic ID for the metadata topic that is used before the
> > first controller election.
> > We had an issue where this metadata topic may not be assigned an ID
> before
> > utilizing the Vote and Fetch protocols. It was decided to reserve a
> unique
> > ID for this topic to be used until the controller could give the topic a
> > unique ID.
> > Having the topic keep the sentinel ID (not be reassigned to a unique ID)
> > was considered, but it seemed like a good idea to leave the option open
> for
> > the metadata topic to have a unique ID in cases where it would need to be
> > differentiated from other clusters' metadata topics. (ex. tiered storage)
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-516%3A+Topic+Identifiers#KIP516:TopicIdentifiers-CompatibilitywithKIP-500
> >
> > I've also split up the KIP into sub-tasks on the JIRA. Hopefully this
> will
> > provide a better idea about what tasks we have, and eventually provide a
> > place to see what's done and what is left.
> > If there is a task I am missing, please let me know!
> > https://issues.apache.org/jira/browse/KAFKA-8872
> >
> > Of course, these decisions are not set in stone, and I would love to hear
> > any feedback.
> >
> > Thanks,
> > Justine
> >
> > On Mon, Sep 28, 2020 at 11:38 AM Justine Olshan 
> > wrote:
> >
> > > Hello all,
> > >
> > > I just wanted to follow up on this discussion. Did we come to an
> > > understanding about the directory structure?
> > >
> > > I think the biggest question here is what is acceptable to leave out
> due
> > > to scope vs. what is considered to be too much tech debt.
> > > This KIP is already pretty large with the number of changes, but it
> also
> > > makes sense to do things correctly, so I'd love to hear everyone's
> > thoughts.
> > >
> > > 

Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-10-01 Thread Guozhang Wang
Thanks for the clarification Jose, that clears up my confusion :)


Guozhang

On Thu, Oct 1, 2020 at 10:51 AM Jose Garcia Sancio 
wrote:

> Thanks for the email Guozhang.
>
> > Thanks for the replies and the KIP updates. Just want to clarify one more
> > thing regarding my previous comment 3): I understand that when a snapshot
> > has completed loading, then we can use it in our handling logic of vote
> > request. And I understand that:
> >
> > 1) Before a snapshot has been completely received (e.g. if we've only
> > received a subset of the "chunks"), then we just handle vote requests "as
> > like" there's no snapshot yet.
> >
> > 2) After a snapshot has been completely received and loaded into main
> > memory, we can handle vote requests "as of" the received snapshot.
> >
> > What I'm wondering if, in between of these two synchronization barriers,
> > after all the snapshot chunks have been received but before it has been
> > completely parsed and loaded into the memory's metadata cache, if we
> > received a request (note they may be handled by different threads, hence
> > concurrently), what should we do? Or are you proposing that the
> > fetchSnapshot request would also be handled in that single-threaded raft
> > client loop so it is in order with all other requests, if that's the case
> > then we do not have any concurrency issues to worry, but then on the
> other
> > hand the reception of the last snapshot chunk and loading them to main
> > memory may also take long time during which a client may not be able to
> > handle any other requests.
>
> Yes. The FetchSnapshot request and response handling will be performed
> by the KafkaRaftClient in a single threaded fashion. The
> KafkaRaftClient doesn't need to load the snapshot to know what state
> it is in. It only needs to scan the "checkpoints" folder, load the
> quorum state file and know the LEO of the replicated log. I would
> modify 2) above to the following:
>
> 3) After the snapshot has been validated by
>   a) Fetching all of the chunks
>   b) Verifying the CRC of the records in the snapshot
>   c) Atomically moving the temporary snapshot to the permanent location
>
> After 3.c), the KafkaRaftClient only needs to scan and parse the
> filenames in the directory called "checkpoints" to find the
> largest/latest permanent snapshot.
>
> As you point out in 1) before 3.c) the KafkaRaftClient, in regards to
> leader election, will behave as if the temporary snapshot didn't
> exists.
>
> The loading of the snapshot will be done by the state machine (Kafka
> Controller or Metadata Cache) and it can perform this on a different
> thread. The KafkaRaftClient will provide an API for finding and
> reading the latest valid snapshot stored locally.
>
> Are you also concerned that the snapshot could have been corrupted after
> 3.c?
>
> I also updated the "Changes to leader Election" section to make this a
> bit clearer.
>
> Thanks,
> Jose
>


-- 
-- Guozhang


Re: [VOTE] KIP-478 Strongly Typed Streams Processor API

2020-10-01 Thread John Roesler
Hello again, all,

I'm sorry to make another tweak to this KIP, but during the
implementation of the design we've just agreed on, I
realized that Processors would almost never need to
reference the RecordMetadata. Therefore, I'm proposing to
streamline the API by moving the Optional to
the new ProcessorContext as a method, rather than making it
a method argument to Processor#process.

The change is visible here:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=118172121=16=15

All of the same semantics and considerations we discussed
still apply, it's just that Processor implementers won't
have to think about it unless they actually _need_ the
topic/partition/offset information from the RecordMetadata.
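
For illustration, a sketch of a process() implementation under this
version of the proposal, with the context saved in init() as usual
(method names per the KIP draft, so treat the details as illustrative):

    @Override
    public void process(final Record<String, String> record) {
        // recordMetadata() is empty for records that originate from a
        // Punctuator rather than an input topic -- hence the Optional.
        context.recordMetadata().ifPresent(m ->
            System.out.printf("from %s-%d at offset %d%n",
                              m.topic(), m.partition(), m.offset()));
        context.forward(record);
    }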

Also, the PR for this part of the KIP is now available here:
https://github.com/apache/kafka/pull/9361

I know it's a bit on the heavy side; I've annotated the PR
to try and ease the reviewer's job. I'd greatly appreciate
it if anyone can take the time to review.

Thanks,
-John

On Wed, 2020-09-30 at 10:16 -0500, John Roesler wrote:
> Thanks, Matthias!
> 
> I can certainly document it. I didn't bother because the old
> Processor, Supplier, and Context will themselves be
> deprecated, so any method that handles them won't be able to
> avoid the deprecation warning. Nevertheless, it doesn't hurt
> just to explicitly deprecated those methods.
> 
> Thanks,
> -John
> 
> On Wed, 2020-09-30 at 08:10 -0700, Matthias J. Sax wrote:
> > Thanks John. I like the proposal.
> > 
> > Btw: I was just going over the KIP and realized that we add new methods
> > to `StreamBuilder`, `Topology`, and `KStream` that take the new
> > `ProcessorSupplier` class -- should we also deprecate the corresponding
> > existing ones that take the old `ProcessorSupplier`?
> > 
> > 
> > -Matthias
> > 
> > 
> > On 9/30/20 7:46 AM, John Roesler wrote:
> > > Thanks Paul and Sophie,
> > > 
> > > Your feedback certainly underscores the need to be explicit
> > > in the javadoc about why that parameter is Optional. Getting
> > > this kind of feedback before the release is exactly the kind
> > > of outcome we hope to get from the KIP process!
> > > 
> > > Thanks,
> > > -John
> > > 
> > > On Tue, 2020-09-29 at 22:32 -0500, Paul Whalen wrote:
> > > > John, I totally agree that adding a method to Processor is cumbersome 
> > > > and
> > > > not a good path.  I was imagining maybe a separate interface that could 
> > > > be
> > > > used in the appropriate context, but I don't think that makes too much
> > > > sense - it's just too far away from what Kafka Streams is.  I was
> > > > originally more interested in the "why" Optional than the "how" (I 
> > > > think my
> > > > original reply overplayed the "optional as an argument" concern).  But
> > > > you've convinced me that there is a perfectly legitimate "why".  We 
> > > > should
> > > > make sure that it's clear why it's Optional, but I suppose that goes
> > > > without saying.  It's a nice opportunity to make the API reflect more 
> > > > what
> > > > is actually going on under the hood.
> > > > 
> > > > Thanks!
> > > > Paul
> > > > 
> > > > On Tue, Sep 29, 2020 at 10:05 PM Sophie Blee-Goldman 
> > > > 
> > > > wrote:
> > > > 
> > > > > FWIW, while I'm really not a fan of Optional in general, I agree that 
> > > > > its
> > > > > usage
> > > > > here seems appropriate. Even for those rare software developers who
> > > > > carefully
> > > > > read all the docs several times over, I think it wouldn't be too hard 
> > > > > to
> > > > > miss a
> > > > > note about the RecordMetadata possibly being null.
> > > > > 
> > > > > Especially because it's not that obvious why at first glance, and 
> > > > > takes a
> > > > > bit of
> > > > > thinking to realize that records originating from a Punctuator 
> > > > > wouldn't
> > > > > have a
> > > > > "current record". This  is something that has definitely confused 
> > > > > users
> > > > > today.
> > > > > 
> > > > > It's on us to improve the education here -- and an
> > > > > Optional<RecordMetadata>
> > > > > would naturally raise awareness of this subtlety
> > > > > 
> > > > > On Tue, Sep 29, 2020 at 7:40 PM Sophie Blee-Goldman 
> > > > > 
> > > > > wrote:
> > > > > 
> > > > > > Does my reply address your concerns?
> > > > > > 
> > > > > > 
> > > > > > Yes; also, I definitely misread part of the proposal earlier and 
> > > > > > thought
> > > > > > you had put
> > > > > > the timestamp field in RecordMetadata. Sorry for not giving things a
> > > > > > closer look
> > > > > > before responding! I'm not sure my original message made much sense 
> > > > > > given
> > > > > > the misunderstanding, but thanks for responding anyway :P
> > > > > > 
> > > > > > Having given the proposal a second pass, I agree, it's very 
> > > > > > elegant. +1
> > > > > > 
> > > > > > On Tue, Sep 29, 2020 at 6:50 PM John Roesler 
> > > > > wrote:
> > > > > > > Thanks for the reply, Sophie,
> > > > > > > 
> > > > > > > I think I may have summarized too much in my prior 

Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-10-01 Thread Jun Rao
Hi, Jose,

Thanks for the reply. A few more comments.

41. That's a good point. With compacted topic, the cleaning won't be done
until the active segment rolls. With snapshots, I guess we don't have this
restriction? So, it would be useful to guard against too frequent
snapshotting. Does the new proposal address this completely? If the
snapshot has only 1 record, and the new record keeps updating the same key,
does that still cause the snapshot to be generated frequently?

42. One of the reasons that we added the per partition limit is to allow
each partition to make relatively even progress during the catchup phase.
This helps kstreams by potentially reducing the gap between stream time
from different partitions. If we can achieve the same thing without the
partition limit, it will be fine too. "3. For the remaining partitions send
at most the average max bytes plus the average remaining bytes." Do we
guarantee that request level max_bytes can be filled up when it can? Could
we document how we distribute the request level max_bytes to partitions in
the KIP? Also, should we change Fetch accordingly?

46. If we don't have IBP, how do we make sure that a broker doesn't
issue FetchSnapshotRequest when the receiving broker hasn't been upgraded
yet?

Thanks,

Jun

On Thu, Oct 1, 2020 at 10:09 AM Jose Garcia Sancio 
wrote:

> Thank you for the quick response Jun. Excuse the delayed response but
> I wanted to confirm some things regarding IBP. See comments below.
>
> Here are my changes to the KIP:
>
> https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158864763=30=28
>
> > 40. LBO: Code wise, logStartOffset is used in so many places like Log,
> > ReplicaManager, LogCleaner, ReplicaFetcher, checkpoint files, etc. I am
> not sure
> > if it's worth renaming in all those places. If the main concern is to
> > avoid confusion, we can just always spell out logStartOffset.
>
> Done. Keeping it as LogStartOffset is better. I was also concerned
> with external tools that may be generating code from the JSON schema.
>
> > 41. metadata.snapshot.min.cleanable.ratio: Since the snapshot could be
> > empty initially, it's probably better to define the ratio as
> new_data_size
> > / (new_data_size + snapshot_size). This avoids the dividing by zero issue
> > and is also consistent with the cleaner ratio definition for compacted
> > topics.
>
> I am assuming that snapshot_size is the size of the largest snapshot
> on disk, is this correct? If we use this formula then we will generate
> snapshots very quickly if the snapshot on disk is zero or very small.
>
> In general what we care about is whether the replicated log has a lot of
> records that delete or update records in the snapshot. I was thinking of
> something along the lines of the following formula:
>
> ratio = (size of deleted records + size of updated records) / (total
> size of snapshot), when the total size of the snapshot is greater
> than zero; ratio = 0, when the total size of the snapshot is zero.
>
> This means that in the extreme case where the replicated log only
> contains "addition" records then we never generate a snapshot. I think
> this is the desired behavior since generating a snapshot will consume
> disk bandwidth without saving disk space. What do you think?
>
> >
> > 42. FetchSnapshotRequest: Since this is designed to fetch more than one
> > partition, it seems it's useful to have a per-partition maxBytes, in
> > addition to the request level maxBytes, just like a Fetch request?
>
> Yeah, we have debated this in another thread from Jason. The argument
> is that MaxBytes at the top level is all that we need if we implement
> the following heuristic:
>
> 1. Compute the average max bytes per partition by dividing the max by
> the number of partitions in the request.
> 2. For all of the partitions with remaining bytes less than this
> average max bytes, then send all of those bytes and sum the remaining
> bytes.
> 3. For the remaining partitions send at most the average max bytes
> plus the average remaining bytes.
>
> Note that this heuristic will only be performed once and not at worst
> N times for N partitions.
>
> What do you think? Besides consistency with Fetch requests, is there
> another reason to have MaxBytes per partition?
>
> > 43. FetchSnapshotResponse:
> > 43.1 I think the confusing part for OFFSET_OUT_OF_RANGE is
> > that FetchSnapshotRequest includes EndOffset. So, OFFSET_OUT_OF_RANGE
> seems
> > to suggest that the provided EndOffset is wrong, which is not the
> intention
> > for the error code.
>
> Yeah. Added a new error called POSITION_OUT_OF_RANGE.
>
> > 43.2 Position field seems to be the same as the one in
> > FetchSnapshotRequest. If we have both, should the requester verify the
> > consistency between two values and what should the requester do if the
> two
> > values don't match?
>
> Yeah the Position in the response will be the same value as the
> Position in the request. I was thinking of only verifying Position
> against the state on the temporary 

[jira] [Created] (KAFKA-10564) Continuous logging about deleting obsolete state directories

2020-10-01 Thread Michael Bingham (Jira)
Michael Bingham created KAFKA-10564:
---

 Summary: Continuous logging about deleting obsolete state 
directories
 Key: KAFKA-10564
 URL: https://issues.apache.org/jira/browse/KAFKA-10564
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.6.0
Reporter: Michael Bingham


The internal process which automatically cleans up obsolete task state 
directories was modified in https://issues.apache.org/jira/browse/KAFKA-6647. 
The current logic in 2.6 is to remove all files from the task directory except 
the .lock file:

[https://github.com/apache/kafka/blob/2.6/streams/src/main/java/org/apache/kafka/streams/processor/internals/StateDirectory.java#L335]

However, the directory is only removed in its entirety during a manual cleanup:

[https://github.com/apache/kafka/blob/2.6/streams/src/main/java/org/apache/kafka/streams/processor/internals/StateDirectory.java#L349-L353]

The result of this is that Streams will continue revisiting this directory and 
trying to clean it up, since it determines what to clean based on the 
last-modification time of the task directory (which is now no longer deleted 
during the automatic cleanup). So a user will see log messages like:
stream-thread [app-c2d773e6-8ac3-4435-9777-378e0ec0ab82-CleanupThread] Deleting 
obsolete state directory 0_8 for task 0_8 as 6061ms has elapsed (cleanup 
delay is 60ms)
repeated again and again.

This issue doesn't seem to break anything - it's more about avoiding 
unnecessary logging and cleaning up empty task directories.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10563) Make sure task directories don't remain locked by dead threads

2020-10-01 Thread Sophie Blee-Goldman (Jira)
Sophie Blee-Goldman created KAFKA-10563:
---

 Summary: Make sure task directories don't remain locked by dead 
threads
 Key: KAFKA-10563
 URL: https://issues.apache.org/jira/browse/KAFKA-10563
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: Sophie Blee-Goldman
 Fix For: 2.7.0


Most common/expected exceptions within Streams are handled gracefully, and the 
thread will make sure to clean up all resources such as task locks during 
shutdown. However, there are some instances where an unexpected exception such 
as an IllegalStateException can leave some resources orphaned.

We have seen this happen to task directories after an IllegalStateException is 
hit during the TaskManager's rebalance handling logic – the Thread shuts down, 
but loses track of some tasks before unlocking them. This blocks any further 
work on that task by any other thread in the same instance.

Previously we decided that this was "ok" because an IllegalStateException means 
all bets are off. But with the upcoming work of KIP-663 and KIP-671, users will 
be able to react smartly on dying threads and replace them with new ones, 
making it more important than ever to ensure that the application can continue 
on with no lasting repercussions of a thread death. If we allow users to 
revive/replace a thread that dies due to IllegalStateException, that thread 
should not be blocked from doing any work by the ghost of its predecessor. 

It might be easiest to just add some logic to the cleanup thread to verify all 
the existing locks against the list of live threads, and remove any zombie 
locks. But we probably want to do this purging more frequently than the cleanup 
thread runs (10min by default) – so maybe we can leverage the work in KIP-671 
and have each thread purge any locks still owned by it after the uncaught 
exception handler runs, but before the thread dies.
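
A hypothetical sketch of the lock-verification idea (the real lock bookkeeping 
lives inside Streams' StateDirectory; "task id -> owning thread" is an 
assumption made here purely for illustration):

    // Hypothetical sketch only.
    import java.util.Map;
    import java.util.Set;

    class ZombieLockPurge {
        // Release any task lock whose owning thread is no longer alive, so a
        // replacement thread is not blocked by its dead predecessor.
        static void purgeZombieLocks(Map<String, Thread> lockOwners,
                                     Set<Thread> liveThreads) {
            lockOwners.entrySet()
                      .removeIf(e -> !liveThreads.contains(e.getValue()));
        }
    }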



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-10-01 Thread Jose Garcia Sancio
Thanks for the email Guozhang.

> Thanks for the replies and the KIP updates. Just want to clarify one more
> thing regarding my previous comment 3): I understand that when a snapshot
> has completed loading, then we can use it in our handling logic of vote
> request. And I understand that:
>
> 1) Before a snapshot has been completely received (e.g. if we've only
> received a subset of the "chunks"), then we just handle vote requests "as
> like" there's no snapshot yet.
>
> 2) After a snapshot has been completely received and loaded into main
> memory, we can handle vote requests "as of" the received snapshot.
>
> What I'm wondering is, in between these two synchronization barriers,
> after all the snapshot chunks have been received but before they have been
> completely parsed and loaded into the memory's metadata cache, if we
> receive a request (note they may be handled by different threads, hence
> concurrently), what should we do? Or are you proposing that the
> FetchSnapshot request would also be handled in that single-threaded raft
> client loop so it is in order with all other requests? If that's the case
> then we do not have any concurrency issues to worry about, but then on the
> other hand the reception of the last snapshot chunk and loading it into
> main memory may also take a long time, during which a client may not be
> able to handle any other requests.

Yes. The FetchSnapshot request and response handling will be performed
by the KafkaRaftClient in a single threaded fashion. The
KafkaRaftClient doesn't need to load the snapshot to know what state
it is in. It only needs to scan the "checkpoints" folder, load the
quorum state file and know the LEO of the replicated log. I would
modify 2) above to the following:

3) After the snapshot has been validated by
  a) Fetching all of the chunks
  b) Verifying the CRC of the records in the snapshot
  c) Atomically moving the temporary snapshot to the permanent location

After 3.c), the KafkaRaftClient only needs to scan and parse the
filenames in the directory called "checkpoints" to find the
largest/latest permanent snapshot.
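
As a rough illustration of 3.c) and the file-name scan (the directory layout 
and the "<endOffset>.checkpoint" naming below are my assumptions, not the 
KIP's final format):

    // Hypothetical sketch: publish a validated snapshot atomically, then find
    // the latest snapshot purely from file names -- no loading required.
    import java.io.IOException;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.StandardCopyOption;
    import java.util.Comparator;
    import java.util.Optional;
    import java.util.stream.Stream;

    class SnapshotFiles {
        static void publish(Path tempSnapshot, Path checkpointsDir, long endOffset)
                throws IOException {
            Path dest = checkpointsDir.resolve(endOffset + ".checkpoint");
            // Atomic rename: readers see either the complete snapshot or none.
            Files.move(tempSnapshot, dest, StandardCopyOption.ATOMIC_MOVE);
        }

        static Optional<Path> latestSnapshot(Path checkpointsDir) throws IOException {
            try (Stream<Path> files = Files.list(checkpointsDir)) {
                return files
                        .filter(p -> p.getFileName().toString().endsWith(".checkpoint"))
                        .max(Comparator.comparingLong(SnapshotFiles::endOffsetOf));
            }
        }

        // Parse the end offset from the assumed name "<endOffset>.checkpoint".
        static long endOffsetOf(Path p) {
            String name = p.getFileName().toString();
            return Long.parseLong(name.substring(0, name.indexOf('.')));
        }
    }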

As you point out in 1), before 3.c) the KafkaRaftClient, with regard to
leader election, will behave as if the temporary snapshot didn't
exist.

The loading of the snapshot will be done by the state machine (Kafka
Controller or Metadata Cache) and it can perform this on a different
thread. The KafkaRaftClient will provide an API for finding and
reading the latest valid snapshot stored locally.

Are you also concerned that the snapshot could have been corrupted after 3.c?

I also updated the "Changes to Leader Election" section to make this a
bit clearer.

Thanks,
Jose


Re: [DISCUSS] Apache Kafka 2.7.0 release

2020-10-01 Thread Bill Bejeck
All,

With the KIP acceptance deadline passing yesterday, I've updated the
planned KIP content section of the 2.7.0 release plan.

Removed proposed KIPs for 2.7.0 that did not get approval:

   1. KIP-653
   2. KIP-608
   3. KIP-508

KIPs added:

   1. KIP-671


Please let me know if I've missed anything.

Thanks,
Bill

On Thu, Sep 24, 2020 at 1:47 PM Bill Bejeck  wrote:

> Hi All,
>
> Just a reminder that the KIP freeze is next Wednesday, September 30th.
> Any KIP aiming to go in the 2.7.0 release needs to be accepted by this date.
>
> Thanks,
> Bill
>
> On Tue, Sep 22, 2020 at 12:11 PM Bill Bejeck  wrote:
>
>> Boyan,
>>
>> Done. Thanks for the heads up.
>>
>> -Bill
>>
>> On Mon, Sep 21, 2020 at 6:36 PM Boyang Chen 
>> wrote:
>>
>>> Hey Bill,
>>>
>>> unfortunately KIP-590 will not be in 2.7 release, could you move it to
>>> postponed KIPs?
>>>
>>> Best,
>>> Boyang
>>>
>>> On Thu, Sep 10, 2020 at 2:41 PM Bill Bejeck  wrote:
>>>
>>> > Hi Gary,
>>> >
>>> > It's been added.
>>> >
>>> > Regards,
>>> > Bill
>>> >
>>> > On Thu, Sep 10, 2020 at 4:14 PM Gary Russell 
>>> wrote:
>>> >
>>> > > Can someone add a link to the release plan page [1] to the Future
>>> > Releases
>>> > > page [2]?
>>> > >
>>> > > I have the latter bookmarked.
>>> > >
>>> > > Thanks.
>>> > >
>>> > > [1]:
>>> > >
>>> >
>>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=158872629
>>> > > [2]:
>>> > https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan
>>> > > 
>>> > > From: Bill Bejeck 
>>> > > Sent: Wednesday, September 9, 2020 4:35 PM
>>> > > To: dev 
>>> > > Subject: Re: [DISCUSS] Apache Kafka 2.7.0 release
>>> > >
>>> > > Hi Dongjin,
>>> > >
>>> > > I've moved both KIPs to the release plan.
>>> > >
>>> > > Keep in mind the cutoff for KIP acceptance is September 30th. If the
>>> KIP
>>> > > discussions are completed, I'd recommend starting a vote for them.
>>> > >
>>> > > Regards,
>>> > > Bill
>>> > >
>>> > > On Wed, Sep 9, 2020 at 8:39 AM Dongjin Lee 
>>> wrote:
>>> > >
>>> > > > Hi Bill,
>>> > > >
>>> > > > Could you add the following KIPs to the plan?
>>> > > >
>>> > > > - KIP-508: Make Suppression State Queriable
>>> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-508%3A+Make+Suppression+State+Queriable>
>>> > > > - KIP-653: Upgrade log4j to log4j2
>>> > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-653%3A+Upgrade+log4j+to+log4j2>
>>> > > >
>>> > > > Both KIPs are completely implemented with passing all tests, but
>>> not
>>> > got
>>> > > > reviewed by the committers. Could anyone have a look?
>>> > > >
>>> > > > Thanks,
>>> > > > Dongjin
>>> > > >
>>> > > > On Wed, Sep 9, 2020 at 8:38 AM Leah Thomas 
>>> > wrote:
>>> > > >
>>> > > > > Hi Bill,
>>> > > > >
>>> > > > > Could you also add KIP-450 to the release plan? It's been merged.
>>> > > > >
>>> > > > > <https://cwiki.apache.org/confluence/display/KAFKA/KIP-450%3A+Sliding+Window+Aggregations+in+the+DSL>
>>> > > > > Cheers,
>>> > > > > Leah
>>> > > > >
>>> > > > > On Tue, Sep 8, 2020 at 9:32 AM Bill Bejeck 
>>> > wrote:
>>> > > > >
>>> > > > > > Hi Bruno,
>>> > > > > >
>>> > > > > > Thanks for letting me know, I've added KIP-662 to the release
>>> plan.
>>> > > > > >
>>> > > > > > -Bill
>>> > > > > >
>>> > > > > > On Mon, Sep 7, 2020 at 11:33 AM Bruno Cadonna <
>>> br...@confluent.io>
>>> > > > > wrote:
>>> > > > > >
>>> > > > > > > Hi Bill,
>>> > > > > > >
>>> > > > > > > Could you add KIP-662 [1] to the release plan. The KIP has
>>> been
>>> > > > already
>>> > > > > > > 

Re: [DISCUSS] Apache Kafka 2.6.1 release

2020-10-01 Thread Bill Bejeck
Thanks, Mickael, it's a +1 for me.

-Bill

On Thu, Oct 1, 2020 at 1:17 PM Matthias J. Sax  wrote:

> +1
>
> On 10/1/20 8:24 AM, Ismael Juma wrote:
> > Thanks Mickael! +1
> >
> > Ismael
> >
> > On Thu, Oct 1, 2020 at 7:40 AM Mickael Maison 
> > wrote:
> >
> >> Hi,
> >>
> >> I'd like to volunteer to be the release manager for the next bugfix
> >> release, 2.6.1.
> >> I created the release plan on the wiki:
> >> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+2.6.1
> >>
> >> Thanks
> >>
> >
>


Re: [DISCUSS] Apache Kafka 2.6.1 release

2020-10-01 Thread Matthias J. Sax
+1

On 10/1/20 8:24 AM, Ismael Juma wrote:
> Thanks Mickael! +1
> 
> Ismael
> 
> On Thu, Oct 1, 2020 at 7:40 AM Mickael Maison 
> wrote:
> 
>> Hi,
>>
>> I'd like to volunteer to be the release manager for the next bugfix
>> release, 2.6.1.
>> I created the release plan on the wiki:
>> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+2.6.1
>>
>> Thanks
>>
> 


Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-10-01 Thread Jose Garcia Sancio
Thank you for the quick response Jun. Excuse the delayed response but
I wanted to confirm some things regarding IBP. See comments below.

Here are my changes to the KIP:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158864763=30=28

> 40. LBO: Code wise, logStartOffset is used in so many places like Log,
> ReplicaManager, LogCleaner, ReplicaFetcher, checkpoint files, etc. I am not
> sure if it's worth renaming in all those places. If the main concern is to
> avoid confusion, we can just always spell out logStartOffset.

Done. Keeping it as LogStartOffset is better. I was also concerned
with external tools that may be generating code from the JSON schema.

> 41. metadata.snapshot.min.cleanable.ratio: Since the snapshot could be
> empty initially, it's probably better to define the ratio as new_data_size
> / (new_data_size + snapshot_size). This avoids the dividing by zero issue
> and is also consistent with the cleaner ratio definition for compacted
> topics.

I am assuming that snapshot_size is the size of the largest snapshot
on disk, is this correct? If we use this formula then we will generate
snapshots very quickly if the snapshot on disk is zero or very small.

In general what we care about is if the replicated log has a lot of
records that delete or update records in the snapshot. I was thinking
something along the following formula:

  ratio = (size of delete snapshot records + size of updated records)
          / (total size of snapshot), where the total size of the snapshot is
          greater than zero
  ratio = 0, where the total size of the snapshot is zero

This means that in the extreme case where the replicated log only
contains "addition" records then we never generate a snapshot. I think
this is the desired behavior since generating a snapshot will consume
disk bandwidth without saving disk space. What do you think?
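
Expressed as a tiny sketch (all names here are made up for illustration):

    // Hypothetical sketch of the proposed trigger condition; generate a new
    // snapshot once the ratio reaches metadata.snapshot.min.cleanable.ratio.
    class SnapshotRatio {
        static double dirtyRatio(long deleteRecordBytes,
                                 long updateRecordBytes,
                                 long snapshotBytes) {
            if (snapshotBytes == 0) {
                return 0.0; // only additions so far: never triggers a snapshot
            }
            return (double) (deleteRecordBytes + updateRecordBytes) / snapshotBytes;
        }
    }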

>
> 42. FetchSnapshotRequest: Since this is designed to fetch more than one
> partition, it seems it's useful to have a per-partition maxBytes, in
> addition to the request level maxBytes, just like a Fetch request?

Yeah, we have debated this in another thread from Jason. The argument
is that MaxBytes at the top level is all that we need if we implement
the following heuristic:

1. Compute the average max bytes per partition by dividing the max by
the number of partitions in the request.
2. For all of the partitions with remaining bytes less than this
average max bytes, send all of those bytes and sum the remaining
bytes.
3. For the remaining partitions send at most the average max bytes
plus the average remaining bytes.

Note that this heuristic will only be performed once and not at worst
N times for N partitions.
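
In code, one reading of this heuristic might look roughly like the following
(the partition-to-bytes map and all names are hypothetical, and step 3 takes
one possible interpretation of "the average remaining bytes"):

    // Hypothetical sketch of the single-pass budget heuristic above.
    import java.util.HashMap;
    import java.util.Map;

    class FetchSnapshotBudget {
        // remaining: partition -> bytes still needed; returns partition -> budget
        static Map<String, Long> assign(Map<String, Long> remaining, long maxBytes) {
            long average = maxBytes / remaining.size();               // step 1
            Map<String, Long> budgets = new HashMap<>();
            long reclaimed = 0;
            int larger = 0;
            for (Map.Entry<String, Long> e : remaining.entrySet()) {
                if (e.getValue() <= average) {                        // step 2
                    budgets.put(e.getKey(), e.getValue());
                    reclaimed += average - e.getValue();
                } else {
                    larger++;
                }
            }
            long extra = larger == 0 ? 0 : reclaimed / larger;        // step 3
            for (Map.Entry<String, Long> e : remaining.entrySet()) {
                if (e.getValue() > average) {
                    budgets.put(e.getKey(), average + extra);
                }
            }
            return budgets;
        }
    }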

What do you think? Besides consistency with Fetch requests, is there
another reason to have MaxBytes per partition?

> 43. FetchSnapshotResponse:
> 43.1 I think the confusing part for OFFSET_OUT_OF_RANGE is
> that FetchSnapshotRequest includes EndOffset. So, OFFSET_OUT_OF_RANGE seems
> to suggest that the provided EndOffset is wrong, which is not the intention
> for the error code.

Yeah. Added a new error called POSITION_OUT_OF_RANGE.

> 43.2 Position field seems to be the same as the one in
> FetchSnapshotRequest. If we have both, should the requester verify the
> consistency between two values and what should the requester do if the two
> values don't match?

Yeah the Position in the response will be the same value as the
Position in the request. I was thinking of only verifying Position
against the state on the temporary snapshot file on disk. If Position
is not equal to the size of the file then reject the response and send
another FetchSnapshot request.

> 44. metric: Would a metric that captures the lag in offset between the last
> snapshot and the logEndOffset be useful?

Yes. How about the difference between the last snapshot offset and the
high-watermark? A snapshot can only be created up to the high-watermark.

Added this metric. Let me know if you still think we need a metric for
the difference between the largest snapshot end offset and the
high-watermark.

> 45. It seems the KIP assumes that every voter (leader and follower) and
> observer has a local replicated log for __cluster_metadata. It would be
> useful to make that clear in the overview section.

Updated the overview section. I think that this decision affects the
section "Changes to Leader Election". That section should not affect
observers since they don't participate in leader elections. It also
affects the section "Validation of Snapshot and Log" but it should be
possible to fix that section if observers don't have the replicated
log on disk.

> 46. Does this KIP cover upgrading from older versions of Kafka? If so, do
> we need IBP to guard the usage of modified FetchRequest and the new
> FetchSnapshotRequest? If not, could we make it clear that upgrading will be
> covered somewhere else?

In short, I don't think we need to increase the IBP. When we implement
snapshots for other topics like __consumer_offset and 

Re: [DISCUSS] KIP-660: Pluggable ReplicaAssignor

2020-10-01 Thread Mickael Maison
Thanks Tom for the feedback!

1. If the data returned by the ReplicaAssignor implementation does not
match what was requested, we'll also throw a ReplicaAssignorException

2. Good point, I'll update the KIP (a rough sketch of this is below)

3. The KIP mentions an error code associated with
ReplicaAssignorException: REPLICA_ASSIGNOR_FAILED

4. (I'm naming your last question 4.) I spent some time looking at it.
Initially I wanted to follow the model from the topic policies. But as
you said, computing assignments for the whole batch may be more
desirable and also avoids incrementally updating the cluster state.
The logic in AdminManager is very much centered around doing 1 topic
at a time but as far as I can tell we should be able to update it to
compute assignments for the whole batch.

I'll play a bit more with 4. and I'll update the KIP in the next few days
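
For 2., the parameter-object shape Tom suggests could look something like
this (purely a sketch of the idea, none of these names are final API):

    // Illustrative sketch only -- not the KIP's actual interface.
    import java.util.List;
    import java.util.Map;
    import org.apache.kafka.common.Cluster;

    interface ReplicaAssignor {
        // Returns partition id -> assigned broker ids for the new topic.
        Map<Integer, List<Integer>> assignReplicasToBrokers(ReplicaAssignment required);

        interface ReplicaAssignment {
            String topic();
            int numPartitions();
            short replicationFactor();
            Cluster cluster();  // up-to-date cluster metadata
            // future fields (e.g. clientId) can be added here without
            // breaking existing implementations
        }
    }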

On Mon, Sep 21, 2020 at 10:29 AM Tom Bentley  wrote:
>
> Hi Mickael,
>
> A few thoughts about the ReplicaAssignor contract:
>
> 1. What happens if a ReplicaAssignor impl returns a Map where some
> assignments don't meet the given replication factor?
> 2. Fixing the signature of assignReplicasToBrokers() as you have would make
> it hard to pass extra information in the future (e.g. maybe someone comes
> up with a use case where passing the clientId would be needed) because it
> would require the interface be changed. If you factored all the parameters
> into some new type then the signature could be
> assignReplicasToBrokers(RequiredReplicaAssignment) and adding any new
> properties to RequiredReplicaAssignment wouldn't break the contract.
> 3. When an assignor throws ReplicaAssignorException, what error code will be
> returned to the client?
>
> Also, this sentence got me thinking:
>
> > If multiple topics are present in the request, AdminManager will update
> the Cluster object so the ReplicaAssignor class has access to the up to
> date cluster metadata.
>
> Previously I've looked at how we can improve Kafka's pluggable policy
> support to pass the more of the cluster state to policy implementations. A
> similar problem exists there, but the more cluster state you pass the
> harder it is to incrementally change it as you iterate through the topics
> to be created/modified. This likely isn't a problem here and now, but it
> could limit any future changes to the pluggable assignors. Did you consider
> the alternative of the assignor just being passed a Set of assignments?
> That means you can just pass the cluster state as it exists at the time. It
> also gives the implementation more information to work with to find more
> optimal assignments. For example, it could perform a bin packing type
> assignment which found a better optimum for the whole collection of topics
> than one which was only told about all the topics in the request
> sequentially.
>
> Otherwise this looks like a valuable feature to me.
>
> Kind regards,
>
> Tom
>
>
>
>
>
> On Fri, Sep 11, 2020 at 6:19 PM Robert Barrett 
> wrote:
>
> > Thanks Mickael, I think adding the new Exception resolves my concerns.
> >
> > On Thu, Sep 3, 2020 at 9:47 AM Mickael Maison 
> > wrote:
> >
> > > Thanks Robert and Ryanne for the feedback.
> > >
> > > ReplicaAssignor implementations can throw an exception to indicate an
> > > assignment can't be computed. This is already what the current round
> > > robin assignor does. Unfortunately at the moment, there are no generic
> > > error codes if it fails, it's either INVALID_PARTITIONS,
> > > INVALID_REPLICATION_FACTOR or worse UNKNOWN_SERVER_ERROR.
> > >
> > > So I think it would be nice to introduce a new Exception/Error code to
> > > cover any failures in the assignor and avoid UNKNOWN_SERVER_ERROR.
> > >
> > > I've updated the KIP accordingly, let me know if you have more questions.
> > >
> > > On Fri, Aug 28, 2020 at 4:49 PM Ryanne Dolan 
> > > wrote:
> > > >
> > > > Thanks Mickael, the KIP makes sense to me, esp for cases where an
> > > external
> > > > system (like cruise control or an operator) knows more about the target
> > > > cluster state than the broker does.
> > > >
> > > > Ryanne
> > > >
> > > > On Thu, Aug 20, 2020, 10:46 AM Mickael Maison <
> > mickael.mai...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > > I've created KIP-660 to make the replica assignment logic pluggable.
> > > > >
> > > > >
> > >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-660%3A+Pluggable+ReplicaAssignor
> > > > >
> > > > > Please take a look and let me know if you have any feedback.
> > > > >
> > > > > Thanks
> > > > >
> > >
> >


Jenkins build is back to normal : Kafka » kafka-trunk-jdk15 #138

2020-10-01 Thread Apache Jenkins Server
See 




Re: [DISCUSS] Apache Kafka 2.6.1 release

2020-10-01 Thread Ismael Juma
Thanks Mickael! +1

Ismael

On Thu, Oct 1, 2020 at 7:40 AM Mickael Maison 
wrote:

> Hi,
>
> I'd like to volunteer to be the release manager for the next bugfix
> release, 2.6.1.
> I created the release plan on the wiki:
> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+2.6.1
>
> Thanks
>


[jira] [Created] (KAFKA-10562) Delegate the store wrappers to the new init method

2020-10-01 Thread John Roesler (Jira)
John Roesler created KAFKA-10562:


 Summary: Delegate the store wrappers to the new init method
 Key: KAFKA-10562
 URL: https://issues.apache.org/jira/browse/KAFKA-10562
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 2.7.0
Reporter: John Roesler
Assignee: John Roesler






--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[DISCUSS] Apache Kafka 2.6.1 release

2020-10-01 Thread Mickael Maison
Hi,

I'd like to volunteer to be the release manager for the next bugfix
release, 2.6.1.
I created the release plan on the wiki:
https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+2.6.1

Thanks


[jira] [Reopened] (KAFKA-10017) Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta

2020-10-01 Thread Bill Bejeck (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bill Bejeck reopened KAFKA-10017:
-

> Flaky Test EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta
> ---
>
> Key: KAFKA-10017
> URL: https://issues.apache.org/jira/browse/KAFKA-10017
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.6.0
>Reporter: Sophie Blee-Goldman
>Assignee: Sophie Blee-Goldman
>Priority: Blocker
>  Labels: flaky-test, unit-test
> Fix For: 2.6.0
>
>
> Creating a new ticket for this since the root cause is different than 
> https://issues.apache.org/jira/browse/KAFKA-9966
> With injectError = true:
> h3. Stacktrace
> java.lang.AssertionError: Did not receive all 20 records from topic 
> multiPartitionOutputTopic within 6 ms Expected: is a value equal to or 
> greater than <20> but: <15> was less than <20> at 
> org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.lambda$waitUntilMinKeyValueRecordsReceived$1(IntegrationTestUtils.java:563)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:429)
>  at 
> org.apache.kafka.test.TestUtils.retryOnExceptionWithTimeout(TestUtils.java:397)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:559)
>  at 
> org.apache.kafka.streams.integration.utils.IntegrationTestUtils.waitUntilMinKeyValueRecordsReceived(IntegrationTestUtils.java:530)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.readResult(EosBetaUpgradeIntegrationTest.java:973)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.verifyCommitted(EosBetaUpgradeIntegrationTest.java:961)
>  at 
> org.apache.kafka.streams.integration.EosBetaUpgradeIntegrationTest.shouldUpgradeFromEosAlphaToEosBeta(EosBetaUpgradeIntegrationTest.java:427)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-567: Kafka Cluster Audit (new discussion)

2020-10-01 Thread Tom Bentley
Hi Viktor,

Like Mickael, I can see that there's value in having an audit trail. For me
the KIP raises a number of questions in its current form:

Why is it necessary to introduce this interface to produce the audit trail
when there is logging that can already record a lot of the same
information, albeit in less structured form? If logging isn't adequate it
would be good to explain why not in the Motivation or Rejected Alternatives
section. It's worth pointing out that even the "less structured" part would
be helped by KIP-673, which proposes to change the RequestChannel's logging
to include a JSON representation of the request.

I'm guessing what you gain from the proposed interface is the fact that
it's called after the authorizer (perhaps after the request has been
handled: I'm unclear about the purpose of AuditInfo.error), so you could
generate a single record in the audit trail. That could still be achieved
using logging, either by correlating existing log messages or by proposing
some new logging just for this auditing purpose (perhaps with a logger per
API key so people could avoid the performance hit on the produce and fetch
paths if they weren't interested in auditing those things). Again, if this
doesn't work it would be great for the KIP to explain why.

To me there were parallels with previous discussions about broker-side
interceptors (
https://www.mail-archive.com/dev@kafka.apache.org/msg103310.html if you've
not seen it before), those seemed to founder on the unwillingness to make
the request internal classes into a supported API. You've tried to address
this by proposing a parallel set of classes implementing AuditEvent for
exposing selective details about the request. It's not clear that you
really _need_ to access all that information about each request, rather
than simply recording it all, and it would also come with a significant
implementation and maintenance cost. If it's simply about recording all the
information in the request, then it would likely be enough to pass a
suitably formatted String rather than an AuditEvent, which basically brings
us back to point 1, but with some justification for not using logging.
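
For concreteness, the sort of shape being discussed might look like this
(names purely illustrative, not a proposal):

    // Purely illustrative -- not the KIP's actual API.
    import java.io.Closeable;
    import org.apache.kafka.common.Configurable;

    interface AuditEvent {
        String requestType(); // e.g. "CreateTopics"
        // request-specific accessors would live on per-API subtypes
    }

    interface Auditor extends Configurable, Closeable {
        // Called after the authorizer (and possibly after the request has
        // been handled), so one audit record can carry the principal, the
        // outcome and selected request details together.
        void audit(AuditEvent event, String principal, boolean authorized);
    }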

Kind regards,

Tom

On Thu, Oct 1, 2020 at 11:30 AM Dániel Urbán  wrote:

> Hi Viktor,
>
> I think the current state of the proposal is flexible enough to support
> use-cases where the response data is of interest to the auditor.
> This part ensures that: "... doing the auditing before sending the response
> back ...". Additionally, event classes could be extended with additional
> data if needed.
>
> Overall, the KIP looks good, thanks!
>
> Daniel
>
> Viktor Somogyi-Vass  wrote (on Wed, Sep 30, 2020, 17:24):
>
> > Hi Daniel,
> >
> > I think in this sense we can use the precedence set with the
> > KafkaAdminClient. It has *Result and *Options classes which in this
> > interpretation are similar in versioning and usage as they transform and
> > convey the responses of the protocol in a minimalistic API.
> > I've modified the KIP a bit and created some examples for these event
> > classes. For now as the implementation I think we can treat this
> similarly
> > to KIP-4 (AdminClient) which didn't push implementation for everything
> but
> > rather pushed implementing everything to subsequent KIPs as the
> > requirements become important. In this first KIP we can create the more
> > important ones (listed in the "Default Implementation") section if that
> is
> > fine.
> >
> > Regarding the response passing: to be honest I feel like that it's not
> that
> > strictly related to auditing but I think it's a good idea and could fit
> > into this API. I think that we should design this current API with this
> in
> > mind. Did you have any specific ideas about the implementation?
> >
> > Viktor
> >
> > On Tue, Sep 22, 2020 at 9:05 AM Dániel Urbán 
> > wrote:
> >
> > > An example I had in mind was the ProduceResponse - the auditor might
> need
> > > access to the new end offset of the partitions.
> > > The event-based approach sounds good - new events and fields can be
> added
> > > on-demand. Do we need the same versioning strategy we use with the
> > > requests/responses?
> > >
> > > Daniel
> > >
> > > Viktor Somogyi-Vass  wrote (on Mon, Sep 21, 2020, 14:08):
> > >
> > > > Hi Daniel,
> > > >
> > > > > If the auditor needs access to the details of the action, one could
> > > argue
> > > > that even the response should be passed down to the auditor.
> > > > At this point I don't think we need to include responses into the
> > > interface
> > > > but if you have a use-case we can consider doing that.
> > > >
> > > > > Is it feasible to convert the Java requests and responses to public
> > > API?
> > > > Well I think that in this case we would need to actually transform a
> > lot
> > > of
> > > > classes and that might be a bit too invasive. Although since the
> > protocol
> > > > itself *is* a public API it might make sense to have some kind of
> 

Build failed in Jenkins: Kafka » kafka-trunk-jdk8 #104

2020-10-01 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10556: NPE if sasl.mechanism is unrecognized (#9356)


--
[...truncated 3.33 MB...]

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCleanUpPersistentStateStoresOnClose[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopicDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowPatternNotValidForTopicNameException[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldEnqueueLaterOutputsAfterEarlierOnes[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializersDeprecated[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateIfEvenTimeAdvances[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowNoSuchElementExceptionForUnusedOutputTopicWithDynamicRouting[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowNoSuchElementExceptionForUnusedOutputTopicWithDynamicRouting[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldInitProcessor[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowForUnknownTopic[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldPunctuateOnStreamsTime[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldCaptureGlobalTopicNameIfWrittenInto[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfInMemoryBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldProcessFromSourcesThatMatchMultiplePattern[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldPopulateGlobalStore[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldThrowIfPersistentBuiltInStoreIsAccessedWithUntypedMethod[Eos enabled = 
false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] 
PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos 
enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED

org.apache.kafka.streams.TopologyTestDriverTest > 
shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED

org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos 

[VOTE] KIP-665 Kafka Connect Hash SMT

2020-10-01 Thread bran...@bbrownsound.com
Hey Kafka Developers,

I’ve created the following KIP and updated it based on feedback from Mickael. I 
was wondering if we could get a vote on my proposal and move forward with the 
proposed PR.

Thanks so much!
-Brandon

Build failed in Jenkins: Kafka » kafka-trunk-jdk15 #137

2020-10-01 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10503: MockProducer doesn't throw ClassCastException when no 
partition for topic exists (#9309)


--
[...truncated 6.71 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfTimestampIsDifferentForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 

Re: [DISCUSS] KIP-567: Kafka Cluster Audit (new discussion)

2020-10-01 Thread Dániel Urbán
Hi Viktor,

I think the current state of the proposal is flexible enough to support
use-cases where the response data is of interest to the auditor.
This part ensures that: "... doing the auditing before sending the response
back ...". Additionally, event classes could be extended with additional
data if needed.

Overall, the KIP looks good, thanks!

Daniel

Viktor Somogyi-Vass  wrote (on Wed, Sep 30, 2020, 17:24):

> Hi Daniel,
>
> I think in this sense we can use the precedence set with the
> KafkaAdminClient. It has *Result and *Options classes which in this
> interpretation are similar in versioning and usage as they transform and
> convey the responses of the protocol in a minimalistic API.
> I've modified the KIP a bit and created some examples for these event
> classes. For now as the implementation I think we can treat this similarly
> to KIP-4 (AdminClient) which didn't push implementation for everything but
> rather pushed implementing everything to subsequent KIPs as the
> requirements become important. In this first KIP we can create the more
> important ones (listed in the "Default Implementation") section if that is
> fine.
>
> Regarding the response passing: to be honest I feel like that it's not that
> strictly related to auditing but I think it's a good idea and could fit
> into this API. I think that we should design this current API with this in
> mind. Did you have any specific ideas about the implementation?
>
> Viktor
>
> On Tue, Sep 22, 2020 at 9:05 AM Dániel Urbán 
> wrote:
>
> > An example I had in mind was the ProduceResponse - the auditor might need
> > access to the new end offset of the partitions.
> > The event-based approach sounds good - new events and fields can be added
> > on-demand. Do we need the same versioning strategy we use with the
> > requests/responses?
> >
> > Daniel
> >
> > Viktor Somogyi-Vass  wrote (on Mon, Sep 21, 2020, 14:08):
> >
> > > Hi Daniel,
> > >
> > > > If the auditor needs access to the details of the action, one could
> > argue
> > > that even the response should be passed down to the auditor.
> > > At this point I don't think we need to include responses into the
> > interface
> > > but if you have a use-case we can consider doing that.
> > >
> > > > Is it feasible to convert the Java requests and responses to public
> > API?
> > > Well I think that in this case we would need to actually transform a
> lot
> > of
> > > classes and that might be a bit too invasive. Although since the
> protocol
> > > itself *is* a public API it might make sense to have some kind of Java
> > > representation as a public API as well.
> > >
> > > > If not, do we have another option to access this info in the auditor?
> > > I think one option would be to do what the original KIP-567 was
> > > implemented. Basically we could have an AuditEvent interface that would
> > > contain request specific data. Its obvious drawback is that it has to
> be
> > > implemented for most of the 40 something protocols but on the upside
> > these
> > > classes shouldn't be complicated. I can try to do a PoC with this to
> see
> > > how it looks like and whether it solves the problem. To be honest I
> think
> > > it would be better than publishing the request classes as an API
> because
> > > here we're restricting access to only what is necessary.
> > >
> > > Thanks,
> > > Viktor
> > >
> > >
> > >
> > > On Fri, Sep 18, 2020 at 8:37 AM Dániel Urbán 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > Thanks for the KIP.
> > > >
> > > > If the auditor needs access to the details of the action, one could
> > argue
> > > > that even the response should be passed down to the auditor.
> > > > Is it feasible to convert the Java requests and responses to public
> > API?
> > > > If not, do we have another option to access this info in the auditor?
> > > > I know that the auditor could just send proper requests through the
> API
> > > to
> > > > the brokers, but that seems like an awful lot of overhead, and could
> > > > introduce timing issues as well.
> > > >
> > > > Daniel
> > > >
> > > >
> > > > Viktor Somogyi-Vass  wrote (on Wed, Sep 16, 2020, 17:17):
> > > >
> > > > > One more after-thought on your second point (AbstractRequest): the
> > > > reason I
> > > > > introduced it in the first place was that this way implementers can
> > > > access
> > > > > request data. A use case can be if they want to audit a change in
> > > > > configuration or client quotas but not just acknowledge the fact
> that
> > > > such
> > > > > an event happened but also capture the change itself by peeking
> into
> > > the
> > > > > request. Sometimes it can be useful especially when people want to
> > > trace
> > > > > back the order of events and what happened when and not just
> > > acknowledge
> > > > > that there was an event of a certain kind. I also recognize that
> this
> > > > might
> > > > > be a very loose interpretation of auditing as it's not strictly
> > 

Jenkins build is back to normal : Kafka » kafka-trunk-jdk15 #136

2020-10-01 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : Kafka » kafka-trunk-jdk8 #102

2020-10-01 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : Kafka » kafka-trunk-jdk11 #104

2020-10-01 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-673: Emit JSONs with new auto-generated schema

2020-10-01 Thread David Jacot
Thanks for the KIP, Anastasia.

+1 (non-binding)

On Thu, Oct 1, 2020 at 8:06 AM Tom Bentley  wrote:

> Thanks Anastasia,
>
> +1 (non-binding)
>
>
> On Thu, Oct 1, 2020 at 6:30 AM Gwen Shapira  wrote:
>
> > Thank you, this will be quite helpful.
> >
> > +1 (binding)
> >
> > On Wed, Sep 30, 2020 at 11:19 AM Anastasia Vela 
> > wrote:
> > >
> > > Hi everyone,
> > >
> > > Thanks again for the discussion. I'd like to start the vote for this
> KIP.
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-673%3A+Emit+JSONs+with+new+auto-generated+schema
> > >
> > > Thanks,
> > > Anastasia
> >
> >
> >
> > --
> > Gwen Shapira
> > Engineering Manager | Confluent
> > 650.450.2760 | @gwenshap
> > Follow us: Twitter | blog
> >
> >
>


Re: [VOTE] KIP-673: Emit JSONs with new auto-generated schema

2020-10-01 Thread Tom Bentley
Thanks Anastasia,

+1 (non-binding)


On Thu, Oct 1, 2020 at 6:30 AM Gwen Shapira  wrote:

> Thank you, this will be quite helpful.
>
> +1 (binding)
>
> On Wed, Sep 30, 2020 at 11:19 AM Anastasia Vela 
> wrote:
> >
> > Hi everyone,
> >
> > Thanks again for the discussion. I'd like to start the vote for this KIP.
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-673%3A+Emit+JSONs+with+new+auto-generated+schema
> >
> > Thanks,
> > Anastasia
>
>
>
> --
> Gwen Shapira
> Engineering Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter | blog
>
>