Re: [VOTE] KIP-349 Priorities for Source Topics

2018-08-13 Thread Jan Filipiak

Sorry for missing the discussion

-1 nonbinding

see

https://samza.apache.org/learn/documentation/0.7.0/api/javadocs/org/apache/samza/system/chooser/MessageChooser.html

Best Jan
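
For context on Jan's pointer: Samza's MessageChooser is a pluggable hook that, given the messages buffered from each input, decides which one to process next — essentially the mechanism a per-topic priority feature needs. A minimal illustrative sketch of that pattern against the Kafka consumer API (the class, buffering strategy, and method shapes below are assumptions for illustration, not the Samza interface and not anything proposed in KIP-349):

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import org.apache.kafka.clients.consumer.ConsumerRecord;

// Illustrative sketch only: buffer fetched records per topic and always hand back
// records from the highest-priority topic that has something buffered. Assumes every
// subscribed topic appears in topicsByPriority.
class PriorityRecordChooser<K, V> {
  private final List<String> topicsByPriority;                   // index 0 = highest priority
  private final List<Deque<ConsumerRecord<K, V>>> buffers = new ArrayList<>();

  PriorityRecordChooser(List<String> topicsByPriority) {
    this.topicsByPriority = topicsByPriority;
    topicsByPriority.forEach(t -> buffers.add(new ArrayDeque<>()));
  }

  // Offer a newly fetched record (analogous to MessageChooser#update).
  void update(ConsumerRecord<K, V> record) {
    buffers.get(topicsByPriority.indexOf(record.topic())).addLast(record);
  }

  // Return the next record to process, or null if nothing is buffered
  // (analogous to MessageChooser#choose).
  ConsumerRecord<K, V> choose() {
    for (Deque<ConsumerRecord<K, V>> buffer : buffers) {
      ConsumerRecord<K, V> next = buffer.pollFirst();
      if (next != null) {
        return next;
      }
    }
    return null;
  }
}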


On 14.08.2018 03:19, n...@afshartous.com wrote:

Hi All,

Calling for a vote on KIP-349

https://cwiki.apache.org/confluence/display/KAFKA/KIP-349%3A+Priorities+for+Source+Topics

--
   Nick







Build failed in Jenkins: kafka-2.0-jdk8 #116

2018-08-13 Thread Apache Jenkins Server
See 


Changes:

[matthias] KAFKA-7284: streams should unwrap fenced exception (#5499)

--
[...truncated 2.48 MB...]

org.apache.kafka.streams.kstream.internals.KStreamWindowAggregateTest > 
testJoin STARTED

org.apache.kafka.streams.kstream.internals.KStreamWindowAggregateTest > 
testJoin PASSED

org.apache.kafka.streams.kstream.internals.KStreamWindowAggregateTest > 
shouldLogAndMeterWhenSkippingNullKey STARTED

org.apache.kafka.streams.kstream.internals.KStreamWindowAggregateTest > 
shouldLogAndMeterWhenSkippingNullKey PASSED

org.apache.kafka.streams.kstream.internals.KStreamSelectKeyTest > testSelectKey 
STARTED

org.apache.kafka.streams.kstream.internals.KStreamSelectKeyTest > testSelectKey 
PASSED

org.apache.kafka.streams.kstream.internals.KStreamSelectKeyTest > 
testTypeVariance STARTED

org.apache.kafka.streams.kstream.internals.KStreamSelectKeyTest > 
testTypeVariance PASSED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldInnerJoinWithStream STARTED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldInnerJoinWithStream PASSED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldLeftJoinWithStream STARTED

org.apache.kafka.streams.kstream.internals.GlobalKTableJoinsTest > 
shouldLeftJoinWithStream PASSED

org.apache.kafka.streams.kstream.internals.KStreamWindowReduceTest > 
shouldLogAndMeterOnNullKey STARTED

org.apache.kafka.streams.kstream.internals.KStreamWindowReduceTest > 
shouldLogAndMeterOnNullKey PASSED

org.apache.kafka.streams.kstream.WindowsTest > retentionTimeMustNotBeNegative 
STARTED

org.apache.kafka.streams.kstream.WindowsTest > retentionTimeMustNotBeNegative 
PASSED

org.apache.kafka.streams.kstream.WindowsTest > numberOfSegmentsMustBeAtLeastTwo 
STARTED

org.apache.kafka.streams.kstream.WindowsTest > numberOfSegmentsMustBeAtLeastTwo 
PASSED

org.apache.kafka.streams.kstream.WindowsTest > shouldSetWindowRetentionTime 
STARTED

org.apache.kafka.streams.kstream.WindowsTest > shouldSetWindowRetentionTime 
PASSED

org.apache.kafka.streams.kstream.WindowsTest > shouldSetNumberOfSegments STARTED

org.apache.kafka.streams.kstream.WindowsTest > shouldSetNumberOfSegments PASSED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
startTimeMustNotBeNegative STARTED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
startTimeMustNotBeNegative PASSED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > shouldThrowOnUntil 
STARTED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > shouldThrowOnUntil 
PASSED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
shouldIncludeRecordsThatHappenedOnWindowStart STARTED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
shouldIncludeRecordsThatHappenedOnWindowStart PASSED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
shouldSetWindowStartTime STARTED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
shouldSetWindowStartTime PASSED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
shouldIncludeRecordsThatHappenedAfterWindowStart STARTED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
shouldIncludeRecordsThatHappenedAfterWindowStart PASSED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
shouldExcludeRecordsThatHappenedBeforeWindowStart STARTED

org.apache.kafka.streams.kstream.UnlimitedWindowsTest > 
shouldExcludeRecordsThatHappenedBeforeWindowStart PASSED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
endTimeShouldNotBeBeforeStart STARTED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
endTimeShouldNotBeBeforeStart PASSED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
shouldUseWindowSizeForMaintainDurationWhenSizeLargerThanDefaultMaintainMs 
STARTED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
shouldUseWindowSizeForMaintainDurationWhenSizeLargerThanDefaultMaintainMs PASSED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
shouldHaveSaneEqualsAndHashCode STARTED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
shouldHaveSaneEqualsAndHashCode PASSED

org.apache.kafka.streams.kstream.JoinWindowsTest > validWindows STARTED

org.apache.kafka.streams.kstream.JoinWindowsTest > validWindows PASSED

org.apache.kafka.streams.kstream.JoinWindowsTest > startTimeShouldNotBeAfterEnd 
STARTED

org.apache.kafka.streams.kstream.JoinWindowsTest > startTimeShouldNotBeAfterEnd 
PASSED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
untilShouldSetMaintainDuration STARTED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
untilShouldSetMaintainDuration PASSED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
retentionTimeMustNoBeSmallerThanWindowSize STARTED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
retentionTimeMustNoBeSmallerThanWindowSize PASSED

org.apache.kafka.streams.kstream.JoinWindowsTest > 
timeDifferenceMustNotBeNegative STARTED


Build failed in Jenkins: kafka-trunk-jdk8 #2891

2018-08-13 Thread Apache Jenkins Server
See 


Changes:

[mjsax] KAFKA-7284: streams should unwrap fenced exception (#5499)

[jason] MINOR: Use statically compiled regular expressions for efficiency

--
[...truncated 425.10 KB...]
kafka.zookeeper.ZooKeeperClientTest > testGetDataNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetDataNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testConnectionTimeout STARTED

kafka.zookeeper.ZooKeeperClientTest > testConnectionTimeout PASSED

kafka.zookeeper.ZooKeeperClientTest > 
testBlockOnRequestCompletionFromStateChangeHandler STARTED

kafka.zookeeper.ZooKeeperClientTest > 
testBlockOnRequestCompletionFromStateChangeHandler PASSED

kafka.zookeeper.ZooKeeperClientTest > testUnresolvableConnectString STARTED

kafka.zookeeper.ZooKeeperClientTest > testUnresolvableConnectString PASSED

kafka.zookeeper.ZooKeeperClientTest > testGetChildrenNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetChildrenNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testPipelinedGetData STARTED

kafka.zookeeper.ZooKeeperClientTest > testPipelinedGetData PASSED

kafka.zookeeper.ZooKeeperClientTest > testZNodeChildChangeHandlerForChildChange 
STARTED

kafka.zookeeper.ZooKeeperClientTest > testZNodeChildChangeHandlerForChildChange 
PASSED

kafka.zookeeper.ZooKeeperClientTest > testGetChildrenExistingZNodeWithChildren 
STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetChildrenExistingZNodeWithChildren 
PASSED

kafka.zookeeper.ZooKeeperClientTest > testSetDataExistingZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testSetDataExistingZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testMixedPipeline STARTED

kafka.zookeeper.ZooKeeperClientTest > testMixedPipeline PASSED

kafka.zookeeper.ZooKeeperClientTest > testGetDataExistingZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetDataExistingZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testDeleteExistingZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testDeleteExistingZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testSessionExpiry STARTED

kafka.zookeeper.ZooKeeperClientTest > testSessionExpiry PASSED

kafka.zookeeper.ZooKeeperClientTest > testSetDataNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testSetDataNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testDeleteNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testDeleteNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testExistsExistingZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testExistsExistingZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testZooKeeperStateChangeRateMetrics 
STARTED

kafka.zookeeper.ZooKeeperClientTest > testZooKeeperStateChangeRateMetrics PASSED

kafka.zookeeper.ZooKeeperClientTest > testZNodeChangeHandlerForDeletion STARTED

kafka.zookeeper.ZooKeeperClientTest > testZNodeChangeHandlerForDeletion PASSED

kafka.zookeeper.ZooKeeperClientTest > testGetAclNonExistentZNode STARTED

kafka.zookeeper.ZooKeeperClientTest > testGetAclNonExistentZNode PASSED

kafka.zookeeper.ZooKeeperClientTest > testStateChangeHandlerForAuthFailure 
STARTED

kafka.zookeeper.ZooKeeperClientTest > testStateChangeHandlerForAuthFailure 
PASSED

kafka.network.SocketServerTest > testGracefulClose STARTED

kafka.network.SocketServerTest > testGracefulClose PASSED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone STARTED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone PASSED

kafka.network.SocketServerTest > controlThrowable STARTED

kafka.network.SocketServerTest > controlThrowable PASSED

kafka.network.SocketServerTest > testRequestMetricsAfterStop STARTED

kafka.network.SocketServerTest > testRequestMetricsAfterStop PASSED

kafka.network.SocketServerTest > testConnectionIdReuse STARTED

kafka.network.SocketServerTest > testConnectionIdReuse PASSED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
STARTED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
PASSED

kafka.network.SocketServerTest > testProcessorMetricsTags STARTED

kafka.network.SocketServerTest > testProcessorMetricsTags PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp STARTED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > testConnectionId STARTED

kafka.network.SocketServerTest > testConnectionId PASSED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics STARTED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics PASSED

kafka.network.SocketServerTest > testNoOpAction STARTED

kafka.network.SocketServerTest > testNoOpAction PASSED

kafka.network.SocketServerTest > simpleRequest 

Build failed in Jenkins: kafka-trunk-jdk8 #2890

2018-08-13 Thread Apache Jenkins Server
See 


Changes:

[jason] MINOR: Remove AbstractFetcherThread.PartitionData (#5233)

[rajinisivaram] KAFKA-7266: Fix MetricsTest.testMetrics flakiness using 
compression

--
[...truncated 2.47 MB...]
org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldWriteCheckpointForPersistentLogEnabledStore PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testRegisterPersistentStore STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
testRegisterPersistentStore PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldThrowIllegalArgumentExceptionOnRegisterWhenStoreHasAlreadyBeenRegistered 
STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldThrowIllegalArgumentExceptionOnRegisterWhenStoreHasAlreadyBeenRegistered 
PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldRestoreStoreWithSinglePutRestoreSpecification STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldRestoreStoreWithSinglePutRestoreSpecification PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldNotChangeOffsetsIfAckedOffsetsIsNull STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldNotChangeOffsetsIfAckedOffsetsIsNull PASSED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldThrowIllegalArgumentExceptionIfStoreNameIsSameAsCheckpointFileName STARTED

org.apache.kafka.streams.processor.internals.ProcessorStateManagerTest > 
shouldThrowIllegalArgumentExceptionIfStoreNameIsSameAsCheckpointFileName PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldCloseStateManagerWithOffsets STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldCloseStateManagerWithOffsets PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldProcessRecordsForTopic STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldProcessRecordsForTopic PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeProcessorTopology STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeProcessorTopology PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeContext STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeContext PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldCheckpointOffsetsWhenStateIsFlushed STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldCheckpointOffsetsWhenStateIsFlushed PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldThrowStreamsExceptionWhenKeyDeserializationFails STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldThrowStreamsExceptionWhenKeyDeserializationFails PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeStateManager STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldInitializeStateManager PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldProcessRecordsForOtherTopic STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldProcessRecordsForOtherTopic PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldNotThrowStreamsExceptionWhenValueDeserializationFails STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldNotThrowStreamsExceptionWhenValueDeserializationFails PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldThrowStreamsExceptionWhenValueDeserializationFails STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldThrowStreamsExceptionWhenValueDeserializationFails PASSED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldNotThrowStreamsExceptionWhenKeyDeserializationFailsWithSkipHandler STARTED

org.apache.kafka.streams.processor.internals.GlobalStateTaskTest > 
shouldNotThrowStreamsExceptionWhenKeyDeserializationFailsWithSkipHandler PASSED

org.apache.kafka.streams.processor.internals.StreamsMetricsImplTest > 
testThroughputMetrics STARTED

org.apache.kafka.streams.processor.internals.StreamsMetricsImplTest > 
testThroughputMetrics PASSED

org.apache.kafka.streams.processor.internals.StreamsMetricsImplTest > 
testLatencyMetrics STARTED

org.apache.kafka.streams.processor.internals.StreamsMetricsImplTest > 
testLatencyMetrics PASSED

org.apache.kafka.streams.processor.internals.StreamsMetricsImplTest > 
testRemoveSensor STARTED


Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-08-13 Thread Dong Lin
Hey Xiongqi,

Thanks for the KIP. I have two questions regarding the use-case for meeting
the GDPR requirement.

1) If I recall correctly, one of the GDPR requirements is that we cannot
keep messages longer than, e.g., 30 days in storage (e.g., Kafka). Say there
exists a partition p0 which contains message1 with key1 and message2 with
key2, and then the user keeps producing messages with key=key2 to this
partition. Since message1 with key1 is never overridden, sooner or later we
will want to delete message1 and keep the latest message with key=key2. But
currently it looks like the log compaction logic in Kafka will always put these
messages in the same segment. Will this be an issue?

2) The current KIP intends to provide the capability to delete a given
message in a log compacted topic. Does such a use-case also require Kafka to
keep the messages produced before the given message? If yes, then we can
probably just use AdminClient.deleteRecords() or time-based log retention
to meet the use-case requirement. If no, do you know what GDPR's
requirement is on time-to-deletion after a user explicitly requests the deletion
(e.g. 1 hour, 1 day, 7 days)?

Thanks,
Dong
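
For reference, the AdminClient.deleteRecords() alternative mentioned in (2) looks roughly like this from the client side; the topic, partition, offset, and bootstrap address below are placeholders for illustration:

import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.RecordsToDelete;
import org.apache.kafka.common.TopicPartition;

public class DeleteRecordsExample {
  public static void main(String[] args) throws Exception {
    Properties props = new Properties();
    props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");

    try (AdminClient admin = AdminClient.create(props)) {
      // Delete everything before offset 12345 in partition 0 of "user-events".
      Map<TopicPartition, RecordsToDelete> toDelete = Collections.singletonMap(
          new TopicPartition("user-events", 0), RecordsToDelete.beforeOffset(12345L));
      admin.deleteRecords(toDelete).all().get();
    }
  }
}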


On Mon, Aug 13, 2018 at 3:44 PM, xiongqi wu  wrote:

> Hi Eno,
>
> The GDPR request we are getting here at linkedin is if we get a request to
> delete a record through a null key on a log compacted topic,
> we want to delete the record via compaction in a given time period like 2
> days (whatever is required by the policy).
>
> There might be other issues (such as orphan log segments under certain
> conditions)  that lead to GDPR problem but they are more like something we
> need to fix anyway regardless of GDPR.
>
>
> -- Xiongqi (Wesley) Wu
>
> On Mon, Aug 13, 2018 at 2:56 PM, Eno Thereska 
> wrote:
>
> > Hello,
> >
> > Thanks for the KIP. I'd like to see a more precise definition of what
> part
> > of GDPR you are targeting as well as some sort of verification that this
> > KIP actually addresses the problem. Right now I find this a bit vague:
> >
> > "Ability to delete a log message through compaction in a timely manner
> has
> > become an important requirement in some use cases (e.g., GDPR)"
> >
> >
> > Is there any guarantee that after this KIP the GDPR problem is solved or
> do
> > we need to do something else as well, e.g., more KIPs?
> >
> >
> > Thanks
> >
> > Eno
> >
> >
> >
> > On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu  wrote:
> >
> > > Hi Kafka,
> > >
> > > This KIP tries to address GDPR concern to fulfill deletion request on
> > time
> > > through time-based log compaction on a compaction enabled topic:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 354%3A+Time-based+log+compaction+policy
> > >
> > > Any feedback will be appreciated.
> > >
> > >
> > > Xiongqi (Wesley) Wu
> > >
> >
>


[VOTE] KIP-349 Priorities for Source Topics

2018-08-13 Thread nick


Hi All,

Calling for a vote on KIP-349

https://cwiki.apache.org/confluence/display/KAFKA/KIP-349%3A+Priorities+for+Source+Topics

--
  Nick





[jira] [Resolved] (KAFKA-7284) Producer getting fenced may cause Streams to shut down

2018-08-13 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax resolved KAFKA-7284.

Resolution: Fixed

> Producer getting fenced may cause Streams to shut down
> --
>
> Key: KAFKA-7284
> URL: https://issues.apache.org/jira/browse/KAFKA-7284
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 0.11.0.3, 1.0.2, 1.1.1, 2.0.0
>Reporter: John Roesler
>Assignee: John Roesler
>Priority: Critical
> Fix For: 2.0.1, 2.1.0
>
>
> As part of the investigation, I will determine what other versions are 
> affected.
>  
> In StreamTask, we catch a `ProducerFencedException` and throw a 
> `TaskMigratedException`. However, in this case, the `RecordCollectorImpl` is 
> throwing a `StreamsException`, caused by `KafkaException` caused by 
> `ProducerFencedException`.
> In response to a TaskMigratedException, we would rebalance, but when we get a 
> StreamsException, streams shuts itself down.
> In other words, we intended to do a rebalance in response to a producer 
> fence, but actually, we shut down (when the fence happens inside the record 
> collector).
> Coincidentally, Guozhang noticed and fixed this in a recent PR: 
> [https://github.com/apache/kafka/pull/5428/files#diff-4e5612eeba09dabf30d0b8430f269ff6]
>  
> The scope of this ticket is to extract that fix and associated tests, and 
> send a separate PR to trunk and 2.0, and also to determine what other 
> versions, if any, are affected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-320: Allow fetchers to detect and handle log truncation

2018-08-13 Thread Jason Gustafson
I'm going to go ahead and call the vote. Here is the final tally:

Binding: Dong Lin, Jun Rao, Jason Gustafson
Non-binding: Anna Povzner

Thanks everyone!

-Jason

On Mon, Aug 13, 2018 at 2:35 PM, Anna Povzner  wrote:

> +1
>
> Thanks for the KIP!
>
> On Thu, Aug 9, 2018 at 5:16 PM Jun Rao  wrote:
>
> > Hi, Jason,
> >
> > Thanks for the KIP. +1 from me.
> >
> > Jun
> >
> > On Wed, Aug 8, 2018 at 1:04 PM, Jason Gustafson 
> > wrote:
> >
> > > Hi All,
> > >
> > > I'd like to start a vote for KIP-320:
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 320%3A+Allow+fetchers+to+detect+and+handle+log+truncation.
> > > Thanks to everyone who reviewed the proposal. Please feel free to send
> > > additional questions to the discussion thread if you have any.
> > >
> > > +1 from me (duh)
> > >
> > > Thanks,
> > > Jason
> > >
> >
>


Re: [VOTE] KIP-280: Enhanced log compaction

2018-08-13 Thread Jason Gustafson
Hey Luis,

Thanks for the explanation. I'd suggest adding the use case to the
motivation section.

I think my only hesitation about the header-based compaction is that it is
the first time we are putting a schema requirement on header values. I
wonder if it's better to leave Kafka agnostic. For example, maybe the
compaction strategy could be a plugin which allows custom derivation of the
compaction key. Say something like this:

class CompactionKey {
  byte[] key;
  long version;
}

interface CompactionStrategy {
  CompactionKey deriveCompactionKey(Record record);
}

The idea is to leave schemas in the hands of users. Have you considered
something like that?

Thanks,
Jason
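
To make the sketch concrete, here is what a user-supplied header-based strategy might look like on top of it. The header name, encoding, and fallback are assumptions for illustration only, not anything defined by the KIP:

import java.nio.ByteBuffer;
import org.apache.kafka.common.header.Header;
import org.apache.kafka.common.record.Record;

// Hypothetical user plugin: derive the compaction "version" from a record header named
// "version", assumed here to be an 8-byte big-endian long; falls back to the record
// timestamp when the header is absent. All of these choices are illustrative.
class HeaderVersionStrategy implements CompactionStrategy {

  @Override
  public CompactionKey deriveCompactionKey(Record record) {
    CompactionKey ck = new CompactionKey();
    ck.key = record.hasKey() ? toBytes(record.key()) : new byte[0];
    ck.version = record.timestamp();                    // fallback when no header is present
    for (Header header : record.headers()) {
      if ("version".equals(header.key()) && header.value() != null) {
        ck.version = ByteBuffer.wrap(header.value()).getLong();
      }
    }
    return ck;
  }

  private static byte[] toBytes(ByteBuffer buffer) {
    byte[] bytes = new byte[buffer.remaining()];
    buffer.duplicate().get(bytes);
    return bytes;
  }
}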

On Sat, Aug 11, 2018 at 2:04 AM, Luís Cabral 
wrote:

> Hi Jason,
>
> The initial (and still only) requirement I wanted out of this KIP was to
> have the header strategy.
> This is because I want to be able to version both by date/time or by
> (externally provided) sequence, this is specially important if you are
> running in multiple environments, which may cause the commonly known issue
> of the clocks being slightly de-synchronized.
> Having the timestamp strategy was added to the KIP as the result of the
> discussions, where it was seen as a potential benefit for other clients who
> may prefer that.
>
> Cheers,
> Luís
>
> From: Jason Gustafson
> Sent: 10 August 2018 22:38
> To: dev
> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>
> Hi Luis,
>
> It's still not very clear to me why we need the header-based strategy. Can
> you elaborate why having the timestamp-based approach alone is not
> sufficient? The use case in the motivation just describes a "most recent
> snapshot" use case.
>
> Thanks,
> Jason
>
> On Thu, Aug 9, 2018 at 4:36 AM, Luís Cabral  >
> wrote:
>
> > Hi,
> >
> >
> > So, after a "short" break, I've finally managed to find time to resume
> > this KIP. Sorry to all for the delay.
> >
> > Continuing the conversation of the configurations being global vs  topic,
> > I've checked this and it seems that they are only available globally.
> >
> > This configuration is passed to the log cleaner via
> "CleanerConfig.scala",
> > which only accepts global configurations. This seems intentional, as the
> > log cleaner is not mutable and doesn't get instantiated that often. I
> think
> > that changing this to accept per-topic configuration would be very nice,
> > but perhaps as a part of a different KIP.
> >
> >
> > Following the Kafka documentation, these are the settings I'm referring
> to:
> >
> > -- --
> >
> > Updating Log Cleaner Configs
> >
> > Log cleaner configs may be updated dynamically at cluster-default
> > level used by all brokers. The changes take effect on the next iteration
> of
> > log cleaning. One or more of these configs may be updated:
> >
> > * log.cleaner.threads
> >
> > * log.cleaner.io.max.bytes.per.second
> >
> > * log.cleaner.dedupe.buffer.size
> >
> > * log.cleaner.io.buffer.size
> >
> > * log.cleaner.io.buffer.load.factor
> >
> > * log.cleaner.backoff.ms
> >
> > -- --
> >
> >
> >
> > Please feel free to confirm, otherwise I will update the KIP to reflect
> > these configuration nuances in the next few days.
> >
> >
> > Best Regards,
> >
> > Luis
> >
> >
> >
> > On Monday, July 9, 2018, 1:57:38 PM GMT+2, Andras Beni <
> > andrasb...@cloudera.com.INVALID> wrote:
> >
> >
> >
> >
> >
> > Hi Luís,
> >
> > Can you please clarify how the header value has to be encoded in case log
> > compaction strategy is 'header'. As I see current PR reads varLong in
> > CleanerCache.extractVersion and read String and uses toLong in
> > Cleaner.extractVersion while the KIP says no more than 'the header value
> > (which must be of type "long")'.
> >
> > Otherwise +1 for the KIP
> >
> > As for current implementation: it seems in Cleaner class header key
> > "version" is hardwired.
> >
> > Andras
> >
> >
> >
> > On Fri, Jul 6, 2018 at 10:36 PM Jun Rao  wrote:
> >
> > > Hi, Guozhang,
> > >
> > > For #4, what you suggested could make sense for timestamp based de-dup,
> > but
> > > not sure how general it is since the KIP also supports de-dup based on
> > > header.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Fri, Jul 6, 2018 at 1:12 PM, Guozhang Wang 
> > wrote:
> > >
> > > > Hello Jun,
> > > > Thanks for your feedbacks. I'd agree on #3 that it's worth adding a
> > > special
> > > > check to not delete the last message, since although unlikely, it is
> > > still
> > > > possible that a new active segment gets rolled out but contains no
> data
> > > > yet, and hence the actual last message in this case would be in a
> > > > "compact-able" segment.
> > > >
> > > > For the second part of #4 you raised, maybe we could educate users to
> > > set "
> > > > message.timestamp.difference.max.ms" to be no larger than "
> > > > log.cleaner.delete.retention.ms" (its default value is
> > Long.MAX_VALUE)?
> > > A
> > > > more aggressive approach would be changing the 

Re: [VOTE] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-08-13 Thread Satish Duggana
+1 (non binding)
Thanks Vahid,

On Tue, Aug 14, 2018 at 2:22 AM, Gwen Shapira  wrote:

> +1 (binding)
>
> On Tue, Aug 7, 2018 at 11:14 AM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > Hi all,
> >
> > I'd like to start a vote on KIP-289 to modify the default group id of
> > KafkaConsumer.
> > The KIP:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer
> > The discussion thread:
> > https://www.mail-archive.com/dev@kafka.apache.org/msg87379.html
> >
> > Thanks!
> > --Vahid
> >
> >
>
>
> --
> *Gwen Shapira*
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter  | blog
> 
>


[jira] [Created] (KAFKA-7286) Loading offsets and group metadata hangs with large group metadata records

2018-08-13 Thread Flavien Raynaud (JIRA)
Flavien Raynaud created KAFKA-7286:
--

 Summary: Loading offsets and group metadata hangs with large group 
metadata records
 Key: KAFKA-7286
 URL: https://issues.apache.org/jira/browse/KAFKA-7286
 Project: Kafka
  Issue Type: Bug
Reporter: Flavien Raynaud


When a (Kafka-based) consumer group contains many members, group metadata 
records (in the {{__consumer_offsets}} topic) may happen to be quite large.

Increasing the {{message.max.bytes}} makes storing these records possible.
 Loading them when a broker restarts is done via 
[doLoadGroupsAndOffsets|https://github.com/apache/kafka/blob/418a91b5d4e3a0579b91d286f61c2b63c5b4a9b6/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L504].
 However, this method relies on the {{offsets.load.buffer.size}} configuration 
to create a 
[buffer|https://github.com/apache/kafka/blob/418a91b5d4e3a0579b91d286f61c2b63c5b4a9b6/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L513]
 that will contain the records being loaded.

If a group metadata record is too large for this buffer, the loading method 
will get stuck trying to load records (in a tight loop) into a buffer that 
cannot accommodate a single record.

For example, if the {{__consumer_offsets-9}} partition contains a record 
smaller than {{message.max.bytes}} but larger than 
{{offsets.load.buffer.size}}, logs would indicate the following:
{noformat}
...
[2018-08-13 21:00:21,073] INFO [GroupMetadataManager brokerId=0] Scheduling 
loading of offsets and group metadata from __consumer_offsets-9 
(kafka.coordinator.group.GroupMetadataManager)
...
{noformat}
But logs will never contain the expected {{Finished loading offsets and group 
metadata from ...}} line.

Consumers whose group is assigned to this partition will see {{Marking the 
coordinator dead}} and will never be able to stabilize and make progress.

From what I could gather in the code, it seems that:
 - 
[fetchDataInfo|https://github.com/apache/kafka/blob/418a91b5d4e3a0579b91d286f61c2b63c5b4a9b6/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L522]
 returns at least one record (even if larger than {{offsets.load.buffer.size}}, 
thanks to {{minOneMessage = true}})
 - No fully-readable record is stored in the buffer with 
[fileRecords.readInto(buffer, 
0)|https://github.com/apache/kafka/blob/418a91b5d4e3a0579b91d286f61c2b63c5b4a9b6/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L528]
 (too large to fit in the buffer)
 - 
[memRecords.batches|https://github.com/apache/kafka/blob/418a91b5d4e3a0579b91d286f61c2b63c5b4a9b6/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L532]
 returns an empty iterator
 - 
[currOffset|https://github.com/apache/kafka/blob/418a91b5d4e3a0579b91d286f61c2b63c5b4a9b6/core/src/main/scala/kafka/coordinator/group/GroupMetadataManager.scala#L590]
 never advances, hence loading the partition hangs forever.


It would be great to let the partition load even if a record is larger than the 
configured {{offsets.load.buffer.size}} limit. The fact that {{minOneMessage = 
true}} when reading records seems to indicate it might be a good idea for the 
buffer to accommodate at least one record.

If you think the limit should stay a hard limit, then it would at least help to
add a log line indicating that {{offsets.load.buffer.size}} is not large enough
and should be increased. Otherwise, one can only guess and dig through the code
to figure out what is happening :)

I will try to open a PR with the first idea (allowing large records to be read 
when needed) soon, but any feedback from anyone who also had the same issue in 
the past would be appreciated :)
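
A rough sketch of that first idea (the helper below is a simplified Java stand-in for the Scala loading loop in GroupMetadataManager, not the actual implementation):

import java.nio.ByteBuffer;

// Illustrative stand-in for the buffer handling in the loading loop: if the next batch is
// larger than the configured offsets.load.buffer.size, grow a temporary buffer just enough
// to read it (and warn), instead of spinning forever on a read that can never fit.
final class LoadBufferHelper {
  static ByteBuffer ensureCapacity(ByteBuffer configuredBuffer, int nextBatchSizeBytes) {
    if (nextBatchSizeBytes <= configuredBuffer.capacity()) {
      return configuredBuffer;
    }
    // A real implementation would also log that offsets.load.buffer.size is too small
    // and should be increased.
    return ByteBuffer.allocate(nextBatchSizeBytes);
  }
}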



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-08-13 Thread xiongqi wu
Hi Eno,

The GDPR request we are getting here at LinkedIn is: if we get a request to
delete a record through a null key on a log compacted topic,
we want to delete the record via compaction within a given time period, like 2
days (whatever is required by the policy).

There might be other issues (such as orphan log segments under certain
conditions) that lead to GDPR problems, but they are more like something we
need to fix anyway regardless of GDPR.


-- Xiongqi (Wesley) Wu
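
For readers unfamiliar with the mechanism under discussion: a compaction-based delete in Kafka is requested by producing a tombstone, i.e. a record carrying the key to be forgotten and a null value, which compaction eventually removes along with the older values for that key. A minimal sketch (topic, key, and bootstrap address are made up):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TombstoneExample {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      // Null value = tombstone: ask compaction to remove all data for this key.
      producer.send(new ProducerRecord<>("user-profiles", "member-42", null));
    }
  }
}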

On Mon, Aug 13, 2018 at 2:56 PM, Eno Thereska 
wrote:

> Hello,
>
> Thanks for the KIP. I'd like to see a more precise definition of what part
> of GDPR you are targeting as well as some sort of verification that this
> KIP actually addresses the problem. Right now I find this a bit vague:
>
> "Ability to delete a log message through compaction in a timely manner has
> become an important requirement in some use cases (e.g., GDPR)"
>
>
> Is there any guarantee that after this KIP the GDPR problem is solved or do
> we need to do something else as well, e.g., more KIPs?
>
>
> Thanks
>
> Eno
>
>
>
> On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu  wrote:
>
> > Hi Kafka,
> >
> > This KIP tries to address GDPR concern to fulfill deletion request on
> time
> > through time-based log compaction on a compaction enabled topic:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 354%3A+Time-based+log+compaction+policy
> >
> > Any feedback will be appreciated.
> >
> >
> > Xiongqi (Wesley) Wu
> >
>


Re: [VOTE] KIP-328: Ability to suppress updates for KTables

2018-08-13 Thread John Roesler
Hey all,

I just wanted to let you know that a few small issues surfaced during
implementation and review. I've updated the KIP. Here's the diff:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=87295409=9=8

Basically:
* the metrics named "*-event-*" are inconsistent with existing
nomenclature, and will be "*-record-*" instead (late records instead of
late events, for example)
* the apis taking and returning Duration will use long millis instead. We
do want to transition to Duration in the future, but we shouldn't do it
piecemeal.

Thanks,
-John

On Tue, Aug 7, 2018 at 12:07 PM John Roesler  wrote:

> Thanks everyone, KIP-328 has passed with 3 binding votes (Guozhang,
> Damian, and Matthias) and 3 non-binding (Ted, Bill, and me).
>
> Thanks for your time,
> -John
>
> On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax 
> wrote:
>
>> +1 (binding)
>>
>> Thanks for the KIP.
>>
>>
>> -Matthias
>>
>> On 8/3/18 12:52 AM, Damian Guy wrote:
>> > Thanks John! +1
>> >
>> > On Mon, 30 Jul 2018 at 23:58 Guozhang Wang  wrote:
>> >
>> >> Yes, the addendum lgtm as well. Thanks!
>> >>
>> >> On Mon, Jul 30, 2018 at 3:34 PM, John Roesler 
>> wrote:
>> >>
>> >>> Another thing that came up after I started working on an
>> implementation
>> >> is
>> >>> that in addition to deprecating "retention" from the Windows
>> interface,
>> >> we
>> >>> also need to deprecate "segmentInterval", for the same reasons. I
>> simply
>> >>> overlooked it previously. I've updated the KIP accordingly.
>> >>>
>> >>> Hopefully, this doesn't change anyone's vote.
>> >>>
>> >>> Thanks,
>> >>> -John
>> >>>
>> >>> On Mon, Jul 30, 2018 at 5:31 PM John Roesler 
>> wrote:
>> >>>
>>  Thanks Guozhang,
>> 
>>  Thanks for that catch. to clarify, currently, events are "late" only
>> >> when
>>  they are older than the retention period. Currently, we detect this
>> in
>> >>> the
>>  processor and record it as a "skipped-record". We then do not attempt
>> >> to
>>  store the event in the window store. If a user provided a
>> >> pre-configured
>>  window store with a retention period smaller than the one they
>> specify
>> >>> via
>>  Windows#until, the segmented store will drop the update with no
>> metric
>> >>> and
>>  record a debug-level log.
>> 
>>  With KIP-328, with the introduction of "grace period" and moving
>> >>> retention
>>  fully into the state store, we need to have metrics for both "late
>> >>> events"
>>  (new records older than the grace period) and "expired window events"
>> >>> (new
>>  records for windows that are no longer retained in the state store).
>> I
>>  already proposed metrics for the late events, and I've just updated
>> the
>> >>> KIP
>>  with metrics for the expired window events. I also updated the KIP to
>> >>> make
>>  it clear that neither late nor expired events will count as
>>  "skipped-records" any more.
>> 
>>  -John
>> 
>>  On Mon, Jul 30, 2018 at 4:22 PM Guozhang Wang 
>> >>> wrote:
>> 
>> > Hi John,
>> >
>> > Thanks for the updated KIP, +1 from me, and one minor suggestion:
>> >
>> > Following your suggestion of the differentiation of
>> `skipped-records`
>> >>> v.s.
>> > `late-event-drop`, we should probably consider moving the scenarios
>> >>> where
>> > records got ignored due the window not being available any more in
>> > windowed
>> > aggregation operators from the `skipped-records` metrics recording
>> to
>> >>> the
>> > `late-event-drop` metrics recording.
>> >
>> >
>> >
>> > Guozhang
>> >
>> >
>> > On Mon, Jul 30, 2018 at 1:36 PM, Bill Bejeck 
>> >> wrote:
>> >
>> >> Thanks for the KIP!
>> >>
>> >> +1
>> >>
>> >> -Bill
>> >>
>> >> On Mon, Jul 30, 2018 at 3:42 PM Ted Yu 
>> wrote:
>> >>
>> >>> +1
>> >>>
>> >>> On Mon, Jul 30, 2018 at 11:46 AM John Roesler 
>> > wrote:
>> >>>
>>  Hello devs,
>> 
>>  The discussion of KIP-328 has gone some time with no new
>> >> comments,
>> > so I
>> >>> am
>>  calling for a vote!
>> 
>>  Here's the KIP: https://cwiki.apache.org/confluence/x/sQU0BQ
>> 
>>  The basic idea is to provide:
>>  * more usable control over update rate (vs the current state
>> >> store
>> >>> caches)
>>  * the final-result-for-windowed-computations feature which
>> >>> several
>> >> people
>>  have requested
>> 
>>  Thanks,
>>  -John
>> 
>> >>>
>> >>
>> >
>> >
>> >
>> > --
>> > -- Guozhang
>> >
>> 
>> >>>
>> >>
>> >>
>> >>
>> >> --
>> >> -- Guozhang
>> >>
>> >
>>
>>


Re: Permission to create KIP

2018-08-13 Thread Jun Rao
Hi, Nikolay,

Thanks for your interest. Just gave you the wiki permission.

Jun

On Mon, Aug 13, 2018 at 10:53 AM, Nikolay Izhikov 
wrote:

> Hello, Guys.
>
> I want to create KIP for ticket [1].
> Please, give me sufficient permissions.
>
> My JIRA ID - nizhikov.
>
> [1] https://issues.apache.org/jira/browse/KAFKA-7277


Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-08-13 Thread xiongqi wu
Yes. We want to enforce a max time interval from a message's arrival time to
the time the corresponding log segment needs to be compacted.

Today, if the message arrival rate is low for a log compacted topic, the
dirty ratio increases very slowly. As a result, a log segment might remain
uncompacted for a long time.

Xiongqi (Wesley) Wu
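
The rule being described amounts to adding a lower bound alongside the usual dirty-ratio trigger; a hedged sketch (the names below are placeholders for illustration, not the KIP's proposed configuration):

// Illustrative only: a partition's dirty log becomes compactable either when the dirty
// ratio crosses the usual threshold, or when its earliest un-compacted record has been
// waiting longer than a maximum allowed lag.
final class CompactionEligibility {
  static boolean shouldCompact(double dirtyRatio, double minCleanableDirtyRatio,
                               long earliestDirtyTimestampMs, long maxCompactionLagMs, long nowMs) {
    boolean ratioTriggered = dirtyRatio >= minCleanableDirtyRatio;
    boolean lagTriggered = (nowMs - earliestDirtyTimestampMs) >= maxCompactionLagMs;
    return ratioTriggered || lagTriggered;
  }
}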

On Mon, Aug 13, 2018 at 2:46 PM, Guozhang Wang  wrote:

> Guess I need to carefully read the wiki page before asking :) Thanks!
>
> Another qq after reading the proposal: is it a complimentary to KIP-58 (
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 58+-+Make+Log+Compaction+Point+Configurable),
> just that KIP-58 is a "upper-bound" on what messages can be compacted, and
> this is for a "lower-bound" on what messages NEED to be compacted?
>
>
> Guozhang
>
> On Mon, Aug 13, 2018 at 2:31 PM, xiongqi wu  wrote:
>
> > HI Guozhang,
> >
> > As I mentioned in the motivation section, KIP-280 focuses on how to
> compact
> > the log segment to resolve the out of order messages compaction issue.
> > The issue we try to address in this KIP is different:  we want to
> introduce
> > a compaction policy so that a log segment can be pickup for compaction
> > after a specified time interval.  One use case is for GDPR to ensure
> timely
> > deletion of user record.
> >
> > There is no conflict and overlapping between this KIP and KIP-280.
> >
> > Thank you!
> >
> >
> > On Mon, Aug 13, 2018 at 1:33 PM, Guozhang Wang 
> wrote:
> >
> > > Hello Xiongqi,
> > >
> > > I think this KIP is already been covered in KIP-280? Could you check
> out
> > > that one and see if it is the case.
> > >
> > >
> > > Guozhang
> > >
> > >
> > > On Mon, Aug 13, 2018 at 1:23 PM, xiongqi wu 
> wrote:
> > >
> > > > Hi Kafka,
> > > >
> > > > Just updated the confluence page to include the link to this KIP.
> > > >
> > > > Any comment will be appreciated:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A
> > > > +Time-based+log+compaction+policy
> > > >
> > > > Thank you.
> > > >
> > > > Xiongqi (Wesley) Wu
> > > >
> > > > On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu 
> > wrote:
> > > >
> > > > > Hi Kafka,
> > > > >
> > > > > This KIP tries to address GDPR concern to fulfill deletion request
> on
> > > > time
> > > > > through time-based log compaction on a compaction enabled topic:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > 354%3A+Time-based+log+compaction+policy
> > > > >
> > > > > Any feedback will be appreciated.
> > > > >
> > > > >
> > > > > Xiongqi (Wesley) Wu
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>
>
>
> --
> -- Guozhang
>


[jira] [Created] (KAFKA-7285) Streams should be more fencing-sensitive during task suspension under EOS

2018-08-13 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-7285:


 Summary: Streams should be more fencing-sensitive during task 
suspension under EOS
 Key: KAFKA-7285
 URL: https://issues.apache.org/jira/browse/KAFKA-7285
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Reporter: Guozhang Wang


When EOS is turned on, Streams does the following steps:

1. InitTxn in task creation.
2. BeginTxn in topology initialization.
3. AbortTxn in clean shutdown.
4. CommitTxn in commit(), which is called in suspend() as well.

Now consider this situation, with two threads (Ta and Tb) and one task:

1. Originally Ta owns the task; the consumer generation is 1.
2. Ta is unresponsive and fails to send heartbeats, so it gets kicked out, and a new 
generation 2 is formed with Tb in it. The task is migrated to Tb while Ta does 
not know.
3. Ta finally calls `consumer.poll`, becomes aware of the rebalance, and re-joins 
the group, forming a new generation 3. During the rebalance the leader 
decides to assign the task back to Ta.
4.a) Ta calls onPartitionRevoked on the task, suspending it and calling commit. 
However, if no data has ever been sent since `BeginTxn`, this commit call will 
become a no-op.
4.b) Ta then calls onPartitionAssigned on the task, resuming it, and then calls 
BeginTxn. It then encounters a ProducerFencedException, incorrectly.

The root cause is that Ta does not trigger InitTxn to claim "I'm the newest 
for this txnId, and am going to fence everyone else with the same txnId", so it 
is mistakenly treated as an older client than Tb.

Note that this issue is not common, since we need to encounter a txn that did 
not send any data at all to make its commitTxn call a no-op, and hence not 
be fenced earlier on.

One proposal for this issue is to close the producer and recreate a new one in 
`suspend` after the commitTxn call succeeds and `startNewTxn` is false, so 
that the new producer will always `initTxn` to fence others.
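
A rough sketch of that proposal (the class, fields, and helper below are simplified stand-ins for the StreamTask logic, not the actual code):

import org.apache.kafka.clients.producer.Producer;

// Illustrative only: on suspend, after committing, throw away the transactional producer
// and create a fresh one so that initTransactions() re-claims the txn id and fences any
// other instance still holding it.
final class SuspendSketch {
  private final boolean eosEnabled;
  private final String transactionalId;
  private Producer<byte[], byte[]> producer;

  SuspendSketch(boolean eosEnabled, String transactionalId, Producer<byte[], byte[]> producer) {
    this.eosEnabled = eosEnabled;
    this.transactionalId = transactionalId;
    this.producer = producer;
  }

  void suspend(boolean startNewTransaction) {
    if (eosEnabled) {
      producer.commitTransaction();          // may effectively be a no-op if nothing was sent
      if (!startNewTransaction) {
        producer.close();
        producer = createTransactionalProducer(transactionalId);
        producer.initTransactions();         // fences stale producers with the same txn id
      }
    }
  }

  private Producer<byte[], byte[]> createTransactionalProducer(String txnId) {
    // Placeholder: a real implementation would rebuild the producer from its configs.
    throw new UnsupportedOperationException("illustration only");
  }
}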



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-08-13 Thread Eno Thereska
Hello,

Thanks for the KIP. I'd like to see a more precise definition of what part
of GDPR you are targeting as well as some sort of verification that this
KIP actually addresses the problem. Right now I find this a bit vague:

"Ability to delete a log message through compaction in a timely manner has
become an important requirement in some use cases (e.g., GDPR)"


Is there any guarantee that after this KIP the GDPR problem is solved or do
we need to do something else as well, e.g., more KIPs?


Thanks

Eno



On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu  wrote:

> Hi Kafka,
>
> This KIP tries to address GDPR concern to fulfill deletion request on time
> through time-based log compaction on a compaction enabled topic:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 354%3A+Time-based+log+compaction+policy
>
> Any feedback will be appreciated.
>
>
> Xiongqi (Wesley) Wu
>


Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-08-13 Thread Guozhang Wang
Guess I need to carefully read the wiki page before asking :) Thanks!

Another qq after reading the proposal: is it complementary to KIP-58 (
https://cwiki.apache.org/confluence/display/KAFKA/KIP-58+-+Make+Log+Compaction+Point+Configurable),
just that KIP-58 is an "upper bound" on what messages can be compacted, and
this is a "lower bound" on what messages NEED to be compacted?


Guozhang

On Mon, Aug 13, 2018 at 2:31 PM, xiongqi wu  wrote:

> HI Guozhang,
>
> As I mentioned in the motivation section, KIP-280 focuses on how to compact
> the log segment to resolve the out of order messages compaction issue.
> The issue we try to address in this KIP is different:  we want to introduce
> a compaction policy so that a log segment can be pickup for compaction
> after a specified time interval.  One use case is for GDPR to ensure timely
> deletion of user record.
>
> There is no conflict and overlapping between this KIP and KIP-280.
>
> Thank you!
>
>
> On Mon, Aug 13, 2018 at 1:33 PM, Guozhang Wang  wrote:
>
> > Hello Xiongqi,
> >
> > I think this KIP is already been covered in KIP-280? Could you check out
> > that one and see if it is the case.
> >
> >
> > Guozhang
> >
> >
> > On Mon, Aug 13, 2018 at 1:23 PM, xiongqi wu  wrote:
> >
> > > Hi Kafka,
> > >
> > > Just updated the confluence page to include the link to this KIP.
> > >
> > > Any comment will be appreciated:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A
> > > +Time-based+log+compaction+policy
> > >
> > > Thank you.
> > >
> > > Xiongqi (Wesley) Wu
> > >
> > > On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu 
> wrote:
> > >
> > > > Hi Kafka,
> > > >
> > > > This KIP tries to address GDPR concern to fulfill deletion request on
> > > time
> > > > through time-based log compaction on a compaction enabled topic:
> > > >
> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > 354%3A+Time-based+log+compaction+policy
> > > >
> > > > Any feedback will be appreciated.
> > > >
> > > >
> > > > Xiongqi (Wesley) Wu
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang


Jenkins build is back to normal : kafka-trunk-jdk8 #2889

2018-08-13 Thread Apache Jenkins Server
See 




Re: [VOTE] KIP-320: Allow fetchers to detect and handle log truncation

2018-08-13 Thread Anna Povzner
+1

Thanks for the KIP!

On Thu, Aug 9, 2018 at 5:16 PM Jun Rao  wrote:

> Hi, Jason,
>
> Thanks for the KIP. +1 from me.
>
> Jun
>
> On Wed, Aug 8, 2018 at 1:04 PM, Jason Gustafson 
> wrote:
>
> > Hi All,
> >
> > I'd like to start a vote for KIP-320:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 320%3A+Allow+fetchers+to+detect+and+handle+log+truncation.
> > Thanks to everyone who reviewed the proposal. Please feel free to send
> > additional questions to the discussion thread if you have any.
> >
> > +1 from me (duh)
> >
> > Thanks,
> > Jason
> >
>


Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-08-13 Thread xiongqi wu
HI Guozhang,

As I mentioned in the motivation section, KIP-280 focuses on how to compact
the log segment to resolve the out-of-order message compaction issue.
The issue we try to address in this KIP is different: we want to introduce
a compaction policy so that a log segment can be picked up for compaction
after a specified time interval. One use case is for GDPR, to ensure timely
deletion of user records.

There is no conflict and overlapping between this KIP and KIP-280.

Thank you!


On Mon, Aug 13, 2018 at 1:33 PM, Guozhang Wang  wrote:

> Hello Xiongqi,
>
> I think this KIP is already been covered in KIP-280? Could you check out
> that one and see if it is the case.
>
>
> Guozhang
>
>
> On Mon, Aug 13, 2018 at 1:23 PM, xiongqi wu  wrote:
>
> > Hi Kafka,
> >
> > Just updated the confluence page to include the link to this KIP.
> >
> > Any comment will be appreciated:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A
> > +Time-based+log+compaction+policy
> >
> > Thank you.
> >
> > Xiongqi (Wesley) Wu
> >
> > On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu  wrote:
> >
> > > Hi Kafka,
> > >
> > > This KIP tries to address GDPR concern to fulfill deletion request on
> > time
> > > through time-based log compaction on a compaction enabled topic:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 354%3A+Time-based+log+compaction+policy
> > >
> > > Any feedback will be appreciated.
> > >
> > >
> > > Xiongqi (Wesley) Wu
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-08-13 Thread Lucas Wang
@Becket

Makes sense. I've updated the KIP by adding the following paragraph to the
motivation section:

> Today there is no separation between controller requests and regular data
> plane requests. Specifically (1) a controller in a cluster uses the same
> advertised endpoints to connect to brokers as what clients and regular
> brokers use for exchanging data (2) on the broker side, the same network
> (processor) thread could be multiplexed by handling a controller connection
> and many other data plane connections (3) after a controller request is
> read from the socket, it is enqueued into the single FIFO requestQueue,
> which is used for all types of requests (4) request handler threads poll
> requests from the requestQueue and handle the controller requests with the
> same priority as regular data requests.
>
> Because of the multiplexing at every stage of request handling, controller
> requests could be significantly delayed under the following scenarios:
>
>1. The requestQueue is full, and therefore blocks a network
>(processor) thread that has read a controller request from the socket.
>2. A controller request is enqueued into the requestQueue after a
>backlog of data requests, and experiences a long queuing time in the
>requestQueue.
>
>
Please let me know if that looks OK or if there is any other change you'd like to make.
Thanks!

Lucas
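
As a reading aid for the motivation above, the queuing half of the problem can be pictured with a toy two-queue sketch like the one below. This is illustrative only — the KIP's actual design separates the control plane all the way down to dedicated endpoints, processors, and handler threads — and every name in it is made up:

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

// Toy illustration: controller requests are drained before data requests, so they are
// never stuck behind a backlog in a single shared FIFO queue.
final class TwoPlaneRequestChannel<R> {
  private final BlockingQueue<R> controllerQueue = new LinkedBlockingQueue<>();
  private final BlockingQueue<R> dataQueue;

  TwoPlaneRequestChannel(int dataQueueCapacity) {
    this.dataQueue = new LinkedBlockingQueue<>(dataQueueCapacity);
  }

  void sendControllerRequest(R request) {
    controllerQueue.add(request);            // unbounded: never blocks a processor thread
  }

  void sendDataRequest(R request) throws InterruptedException {
    dataQueue.put(request);                  // may block when the data plane is backed up
  }

  R pollNext(long timeoutMs) throws InterruptedException {
    R controllerRequest = controllerQueue.poll();
    if (controllerRequest != null) {
      return controllerRequest;
    }
    return dataQueue.poll(timeoutMs, TimeUnit.MILLISECONDS);
  }
}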

On Mon, Aug 13, 2018 at 6:33 AM, Becket Qin  wrote:

> Hi Lucas,
>
> Thanks for the explanation. It might be a nitpick, but it seems better to
> mention in the motivation part that today the client requests and
> controller requests are not only sharing the same queue, but also a bunch
> of things else, so that we can avoid asking people to read the rejected
> alternatives.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
>
>
>
>
> On Fri, Aug 10, 2018 at 6:23 AM, Lucas Wang  wrote:
>
> > @Becket,
> >
> > I've asked for review by Jun and Joel in the vote thread.
> > Regarding the separate thread and port, I did talk about it in the
> rejected
> > alternative design 1.
> > Please let me know if you'd like more elaboration or moving it to the
> > motivation, etc.
> >
> > Thanks,
> > Lucas
> >
> > On Wed, Aug 8, 2018 at 3:59 PM, Becket Qin  wrote:
> >
> > > Hi Lucas,
> > >
> > > Yes, a separate Jira is OK.
> > >
> > > Since the proposal has significantly changed since the initial vote
> > > started. We probably should let the others who have already voted know
> > and
> > > ensure they are happy with the updated proposal.
> > > Also, it seems the motivation part of the KIP wiki is still just
> talking
> > > about the separate queue and not fully cover the changes we make now,
> > e.g.
> > > separate thread, port, etc. We might want to explain a bit more so for
> > > people who did not follow the discussion mail thread also understand
> the
> > > whole proposal.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Wed, Aug 8, 2018 at 12:44 PM, Lucas Wang 
> > wrote:
> > >
> > > > Hi Becket,
> > > >
> > > > Thanks for the review. The current write up in the KIP won’t change
> the
> > > > ordering behavior. Are you ok with addressing that as a separate
> > > > independent issue (I’ll create a separate ticket for it)?
> > > > If so, can you please give me a +1 on the vote thread?
> > > >
> > > > Thanks,
> > > > Lucas
> > > >
> > > > On Tue, Aug 7, 2018 at 7:34 PM Becket Qin 
> > wrote:
> > > >
> > > > > Thanks for the updated KIP wiki, Lucas. Looks good to me overall.
> > > > >
> > > > > It might be an implementation detail, but do we still plan to use
> the
> > > > > correlation id to ensure the request processing order?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Tue, Jul 31, 2018 at 3:39 AM, Lucas Wang  >
> > > > wrote:
> > > > >
> > > > > > Thanks for your review, Dong.
> > > > > > Ack that these configs will have a bigger impact for users.
> > > > > >
> > > > > > On the other hand, I would argue that the request queue becoming
> > full
> > > > > > may or may not be a rare scenario.
> > > > > > How often the request queue gets full depends on the request
> > incoming
> > > > > rate,
> > > > > > the request processing rate, and the size of the request queue.
> > > > > > When that happens, the dedicated endpoints design can better
> handle
> > > > > > it than any of the previously discussed options.
> > > > > >
> > > > > > Another reason I made the change was that I have the same taste
> > > > > > as Becket that it's a better separation of the control plane from
> > the
> > > > > data
> > > > > > plane.
> > > > > >
> > > > > > Finally, I want to clarify that this change is NOT motivated by
> the
> > > > > > out-of-order
> > > > > > processing discussion. The latter problem is orthogonal to this
> > KIP,
> > > > and
> > > > > it
> > > > > > can happen in any of the design options we discussed for this KIP
> > so
> > > > far.
> > > > > > So I'd like to address out-of-order processing separately in
> > 

Re: [VOTE] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-08-13 Thread Gwen Shapira
+1 (binding)

On Tue, Aug 7, 2018 at 11:14 AM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi all,
>
> I'd like to start a vote on KIP-289 to modify the default group id of
> KafkaConsumer.
> The KIP:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer
> The discussion thread:
> https://www.mail-archive.com/dev@kafka.apache.org/msg87379.html
>
> Thanks!
> --Vahid
>
>


-- 
*Gwen Shapira*
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter  | blog



Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-08-13 Thread Guozhang Wang
Hello Xiongqi,

I think this KIP has already been covered by KIP-280? Could you check out
that one and see if that is the case.


Guozhang


On Mon, Aug 13, 2018 at 1:23 PM, xiongqi wu  wrote:

> Hi Kafka,
>
> Just updated the confluence page to include the link to this KIP.
>
> Any comment will be appreciated:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A
> +Time-based+log+compaction+policy
>
> Thank you.
>
> Xiongqi (Wesley) Wu
>
> On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu  wrote:
>
> > Hi Kafka,
> >
> > This KIP tries to address GDPR concern to fulfill deletion request on
> time
> > through time-based log compaction on a compaction enabled topic:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 354%3A+Time-based+log+compaction+policy
> >
> > Any feedback will be appreciated.
> >
> >
> > Xiongqi (Wesley) Wu
> >
>



-- 
-- Guozhang


Re: [DISCUSS] KIP-347: Enable batching in FindCoordinatorRequest

2018-08-13 Thread Guozhang Wang
Regarding 3): Today we do not have this logic in the existing client,
because we defer the decision about the version to use (we always assume that
a new-version request needs to be down-converted to a single old-version
request, i.e. a one-to-one mapping), but in principle, we should
be able to modify the client to make it work.

Again, this does not necessarily need to be included in this KIP, but I'd
recommend you look into the AdminClient implementation around
ApiVersionsRequest / Response and think about how that logic can be modified
in the follow-up PR of this KIP.



Guozhang
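
A sketch of the one-to-many down-conversion Guozhang describes, purely for illustration — the request type below is a simplified stand-in, not the real FindCoordinatorRequest class, and the real logic would live next to the AdminClient's ApiVersionsRequest handling:

import java.util.Collections;
import java.util.List;
import java.util.stream.Collectors;

// Illustrative only: if the broker supports the batched request version, send one request
// carrying all group ids; otherwise fan out into one legacy single-group request each.
final class FindCoordinatorFanOut {
  static final class FindCoordinatorBatch {              // stand-in for a v3-style request
    final List<String> groupIds;
    FindCoordinatorBatch(List<String> groupIds) { this.groupIds = groupIds; }
  }

  static List<FindCoordinatorBatch> buildRequests(List<String> groupIds, boolean brokerSupportsBatching) {
    if (brokerSupportsBatching) {
      return Collections.singletonList(new FindCoordinatorBatch(groupIds));
    }
    return groupIds.stream()
        .map(id -> new FindCoordinatorBatch(Collections.singletonList(id)))
        .collect(Collectors.toList());
  }
}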

On Mon, Aug 13, 2018 at 12:55 PM, Yishun Guan  wrote:

> @Guozhang, thank you so much!
> 1. I agree, fixed.
> 2. Added.
> 3. I see, that is something that I hadn't thought about. How does Kafka
> handle version differences for other APIs now? Do we have a specific
> converter that converts a new-version request to an old-version one for each
> API (is this what ApiVersionsRequest is supposed to do, and does it only
> handle the detection or should it handle the conversion too)? What would be
> the consequence of not having such a converter when the versions are
> incompatible?
>
> Best,
> Yishun
>
> On Sat, Aug 11, 2018 at 11:27 AM Guozhang Wang  wrote:
>
> > Hello Yishun,
> >
> > Thanks for the proposed KIP. I made a pass over the wiki and here are
> some
> > comments:
> >
> > 1. "DESCRIBE_GROUPS_RESPONSE_MEMBER_V0", why we need to encode the full
> > schema for the "COORDINATOR_GROUPIDS_KEY_NAME" field? Note it includes a
> > lot of fields such as member id that is not needed for this case. I
> think a
> > "new ArrayOf(String)" for the group ids should be sufficient.
> >
> > 2. "schemaVersions" of the "FindCoordinatorRequest" needs to include
> > FIND_COORDINATOR_REQUEST_V3 as well.
> >
> > 3. One thing you may need to consider is that, in the adminClient to
> handle
> > broker compatibility, how to transform a new (v3) request to a bunch of
> > (v2) requests if it detects the broker is still in old version and hence
> > cannot support v3 request (this logic is already implemented via
> > ApiVersionsRequest in AdminClient, but may need to be extended to handle
> > one-to-many mapping of different versions).
> >
> > This is not sth. that you need to implement under this KIP, but I'd
> > recommend you think about this earlier than later and see if it may
> affect
> > this proposal.
> >
> >
> > Guozhang
> >
> >
> > On Sat, Aug 11, 2018 at 10:54 AM, Yishun Guan  wrote:
> >
> > > Hi, thank you Ted! I have addressed your comments:
> > >
> > > 1. Added more descriptions about later optimization.
> > > 2. Yes, I will implement the V3 later when this KIP gets accepted.
> > > 3. Fixed.
> > >
> > > Thanks,
> > > Yishun
> > >
> > > On Fri, Aug 10, 2018 at 3:32 PM Ted Yu  wrote:
> > >
> > > > bq. this is the foundation of some later possible optimizations (enable
> > > > batching in describeConsumerGroups ...)
> > > >
> > > > Can you say more about why this change lays the foundation for the
> > > > future optimizations?
> > > >
> > > > You mentioned FIND_COORDINATOR_REQUEST_V3 in the wiki but I don't see
> > > > it in the PR. I assume you would add that later.
> > > >
> > > > Please read your wiki and fix grammatical errors such as the
> > > > following:
> > > >
> > > > bq. that need to be make
> > > >
> > > > Thanks
> > > >
> > > > On Wed, Aug 8, 2018 at 3:55 PM Yishun Guan 
> wrote:
> > > >
> > > > > Hi all,
> > > > >
> > > > > I would like to start a discussion on:
> > > > >
> > > > > KIP-347: Enable batching in FindCoordinatorRequest
> > > > > https://cwiki.apache.org/confluence/x/CgZPBQ
> > > > >
> > > > > Thanks @Guozhang Wang  for his help and
> > patience!
> > > > >
> > > > > Thanks,
> > > > > Yishun
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang


Re: [DISCUSS] KIP-354 Time-based log compaction policy

2018-08-13 Thread xiongqi wu
Hi Kafka,

Just updated the confluence page to include the link to this KIP.

Any comment will be appreciated:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-354%3A+Time-based+log+compaction+policy

Thank you.

Xiongqi (Wesley) Wu

On Thu, Aug 9, 2018 at 4:18 PM, xiongqi wu  wrote:

> Hi Kafka,
>
> This KIP tries to address the GDPR concern of fulfilling deletion requests on
> time through time-based log compaction on a compaction-enabled topic:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 354%3A+Time-based+log+compaction+policy
>
> Any feedback will be appreciated.
>
>
> Xiongqi (Wesley) Wu
>
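
For context on the mechanism this KIP builds on: a deletion request for a key
on a compacted topic is expressed by producing a tombstone (a null value), and
the KIP is about bounding how long the log cleaner may take to remove the older
values for that key. A minimal tombstone example with the Java producer (topic,
key, and bootstrap address are placeholders):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class TombstoneExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // A null value is a tombstone: once the cleaner runs, older values
            // for this key become eligible for removal from the compacted topic.
            producer.send(new ProducerRecord<>("user-data", "user-42", null));
        }
    }
}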


[jira] [Created] (KAFKA-7284) Producer getting fenced may cause Streams to shut down

2018-08-13 Thread John Roesler (JIRA)
John Roesler created KAFKA-7284:
---

 Summary: Producer getting fenced may cause Streams to shut down
 Key: KAFKA-7284
 URL: https://issues.apache.org/jira/browse/KAFKA-7284
 Project: Kafka
  Issue Type: Improvement
  Components: streams
Affects Versions: 2.0.0
Reporter: John Roesler
Assignee: John Roesler


As part of the investigation, I will determine what other versions are affected.

 

In StreamTask, we catch a `ProducerFencedException` and throw a 
`TaskMigratedException`. However, in this case, the `RecordCollectorImpl` is 
throwing a `StreamsException`, caused by `KafkaException` caused by 
`ProducerFencedException`.

In response to a TaskMigratedException, we would rebalance, but when we get a 
StreamsException, streams shuts itself down.

In other words, we intended to do a rebalance in response to a producer fence, 
but actually, we shut down (when the fence happens inside the record collector).
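
A sketch of the kind of cause-chain inspection the fix needs (illustrative only,
not the actual patch): once the fence is detected, the caller can rethrow a
TaskMigratedException instead of letting the StreamsException shut Streams down.

{code:java}
import org.apache.kafka.common.errors.ProducerFencedException;

// Illustrative sketch: recognize a fenced producer even when it is wrapped in
// KafkaException / StreamsException, so the caller can trigger a rebalance
// (TaskMigratedException) instead of a fatal shutdown.
public class FenceClassifier {
    public static boolean isProducerFenced(final Throwable e) {
        Throwable cause = e;
        while (cause != null) {
            if (cause instanceof ProducerFencedException) {
                return true;
            }
            cause = cause.getCause();
        }
        return false;
    }
}
{code}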


Coincidentally, Guozhang noticed and fixed this in a recent PR: 
[https://github.com/apache/kafka/pull/5428/files#diff-4e5612eeba09dabf30d0b8430f269ff6]

 

The scope of this ticket is to extract that fix and associated tests, and send 
a separate PR to trunk and 2.0, and also to determine what other versions, if 
any, are affected.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-7283) mmap indexes lazily and skip sanity check for segments below recovery point

2018-08-13 Thread Zhanxiang (Patrick) Huang (JIRA)
Zhanxiang (Patrick) Huang created KAFKA-7283:


 Summary: mmap indexes lazily and skip sanity check for segments 
below recovery point
 Key: KAFKA-7283
 URL: https://issues.apache.org/jira/browse/KAFKA-7283
 Project: Kafka
  Issue Type: New Feature
Reporter: Zhanxiang (Patrick) Huang
Assignee: Zhanxiang (Patrick) Huang


This is a follow-up ticket for KIP-263.

Currently broker will mmap the index files, read the length as well as the last 
entry of the file, and sanity check index files of all log segments in the log 
directory after the broker is started. These operations can be slow because 
broker needs to open index file and read data into page cache. In this case, 
the time to restart a broker will increase proportional to the number of 
segments in the log directory.

Per the KIP discussion, we think we can skip sanity check for segments below 
the recovery point since Kafka does not provide guarantee for segments already 
flushed to disk and sanity checking only index file benefits little when the 
segment is also corrupted because of disk failure. Therefore, we can make the 
following changes to improve broker startup time:
 # Mmap the index file and populate fields of the index file on-demand rather 
than performing costly disk operations when creating the index object on broker 
startup.
 # Skip sanity checks on indexes of segments below the recovery point.

With these changes, the broker startup time will grow only in proportion to 
the number of partitions in the log directory after a clean shutdown, because 
only active segments are mmapped and sanity checked.
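
A minimal sketch of the lazy-loading idea (illustrative only, not the actual 
implementation):

{code:java}
import java.util.function.Supplier;

// Illustrative only: defer the expensive mmap/sanity work until the index is
// actually accessed, so segments below the recovery point never pay the cost
// at broker startup.
public class LazyIndex<T> {
    private final Supplier<T> loader;   // e.g. () -> mmap the file and sanity check it
    private volatile T index;

    public LazyIndex(Supplier<T> loader) {
        this.loader = loader;
    }

    public T get() {
        T result = index;
        if (result == null) {
            synchronized (this) {
                if (index == null) {
                    index = loader.get();   // first access triggers the real work
                }
                result = index;
            }
        }
        return result;
    }
}
{code}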

 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSS] KIP-347: Enable batching in FindCoordinatorRequest

2018-08-13 Thread Yishun Guan
@Guozhang, thank you so much!
1. I agree, fixed.
2. Added.
3. I see, that is something that I haven't thought about. How does Kafka
handle version differences for other APIs now? Do we have a specific
converter that converts a new-version request into an old-version one for each
API (is this what ApiVersionsRequest is supposed to do, and does it only
handle the detection or should it handle the conversion too)? What will be
the consequence of not having such a converter when the versions are
incompatible?

Best,
Yishun

On Sat, Aug 11, 2018 at 11:27 AM Guozhang Wang  wrote:

> Hello Yishun,
>
> Thanks for the proposed KIP. I made a pass over the wiki and here are some
> comments:
>
> 1. "DESCRIBE_GROUPS_RESPONSE_MEMBER_V0", why we need to encode the full
> schema for the "COORDINATOR_GROUPIDS_KEY_NAME" field? Note it includes a
> lot of fields such as member id that is not needed for this case. I think a
> "new ArrayOf(String)" for the group ids should be sufficient.
>
> 2. "schemaVersions" of the "FindCoordinatorRequest" needs to include
> FIND_COORDINATOR_REQUEST_V3 as well.
>
> 3. One thing you may need to consider is how, in the AdminClient, to handle
> broker compatibility: how to transform a new (v3) request into a bunch of
> (v2) requests if it detects that the broker is still on an old version and
> hence cannot support the v3 request (this logic is already implemented via
> ApiVersionsRequest in AdminClient, but may need to be extended to handle
> a one-to-many mapping of different versions).
>
> This is not sth. that you need to implement under this KIP, but I'd
> recommend you think about this earlier than later and see if it may affect
> this proposal.
>
>
> Guozhang
>
>
> On Sat, Aug 11, 2018 at 10:54 AM, Yishun Guan  wrote:
>
> > Hi, thank you Ted! I have addressed your comments:
> >
> > 1. Added more descriptions about later optimization.
> > 2. Yes, I will implement the V3 later when this KIP gets accepted.
> > 3. Fixed.
> >
> > Thanks,
> > Yishun
> >
> > On Fri, Aug 10, 2018 at 3:32 PM Ted Yu  wrote:
> >
> > > bq. this is the foundation of some later possible optimizations (enable
> > > batching in describeConsumerGroups ...)
> > >
> > > Can you say more about why this change lays the foundation for the
> > > future optimizations?
> > >
> > > You mentioned FIND_COORDINATOR_REQUEST_V3 in the wiki but I don't see
> > > it in the PR. I assume you would add that later.
> > >
> > > Please read your wiki and fix grammatical errors such as the
> > > following:
> > >
> > > bq. that need to be make
> > >
> > > Thanks
> > >
> > > On Wed, Aug 8, 2018 at 3:55 PM Yishun Guan  wrote:
> > >
> > > > Hi all,
> > > >
> > > > I would like to start a discussion on:
> > > >
> > > > KIP-347: Enable batching in FindCoordinatorRequest
> > > > https://cwiki.apache.org/confluence/x/CgZPBQ
> > > >
> > > > Thanks @Guozhang Wang  for his help and
> patience!
> > > >
> > > > Thanks,
> > > > Yishun
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>


Re: [VOTE] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-08-13 Thread Vahid S Hashemian
Thanks Colin, just updated the table.
I had skipped them because they were listed in the earlier table in the 
KIP.

--Vahid




From:   Colin McCabe 
To: dev@kafka.apache.org
Date:   08/13/2018 09:45 AM
Subject:Re: [VOTE] KIP-289: Improve the default group id behavior 
in KafkaConsumer



+1 (non-binding).  Thanks, Vahid!

One last question: can you fill in the behavior for the group.id="" case, 
in the table following "This is how these two group ids will work".  It 
would be helpful to have a description of this behavior in the KIP for 
documentation purposes.

cheers,
Colin

On Tue, Aug 7, 2018, at 11:14, Vahid S Hashemian wrote:
> Hi all,
> 
> I'd like to start a vote on KIP-289 to modify the default group id of 
> KafkaConsumer.
> The KIP: 
> 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer

> The discussion thread: 
> 
https://www.mail-archive.com/dev@kafka.apache.org/msg87379.html

> 
> Thanks!
> --Vahid
> 







Permission to create KIP

2018-08-13 Thread Nikolay Izhikov
Hello, Guys.

I want to create KIP for ticket [1].
Please, give me sufficient permissions.

My JIRA ID - nizhikov.

[1] https://issues.apache.org/jira/browse/KAFKA-7277

signature.asc
Description: This is a digitally signed message part


Re: [DISCUSS] KIP-110: Add Codec for ZStandard Compression (Updated)

2018-08-13 Thread Jason Gustafson
>
> But in my opinion, since the client will fail on the API version check, we
> don't need to down-convert the messages anyway, do we? So I think we
> don't need to care about this case. (I'm sorry, I am not familiar with the
> down-conversion logic.)


Currently the broker down-converts automatically when it receives an old
version of the fetch request (a version which is known to predate the
message format in use). Typically when down-converting the message format,
we use the same compression type, but there is not much point in doing so
when we know the client doesn't support it. So if zstandard is in use, and
we have to down-convert anyway, then we can choose to use a different
compression type or no compression type.

From my perspective, there is no significant downside to bumping the
protocol version and it has several potential benefits. Version bumps are
cheap. The main question mark in my mind is about down-conversion. Figuring
out whether down-conversion is needed is hard generally without inspecting
the fetched data, which is expensive. I think we agree in principle that we
do not want to have to pay this cost generally and prefer the clients to
fail when they see an unhandled compression type. The point I was making is
that there are some cases where we are either inspecting the data anyway
(because we have to down-convert the message format), or we have an easy
way to tell whether zstandard is in use (the topic has it configured
explicitly). In the latter case, we don't have to handle it specially. But
we do have to decide how we will handle down-conversion to older formats.
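
To make those two cases concrete, here is a rough sketch of the decision being
discussed, written as Java-style pseudocode (none of these names exist in the
broker, and the fallback codec choice is only an example):

// Rough sketch of the decision under discussion, not actual broker code.
enum CompressionCodec { NONE, GZIP, SNAPPY, LZ4, ZSTD }

final class FetchDownConversionSketch {
    static CompressionCodec codecForResponse(
            final boolean clientSupportsZstd,     // known from the fetch request version
            final boolean mustDownConvertFormat,  // client needs an older message format
            final CompressionCodec topicCodec) {  // explicit topic config, if any

        if (clientSupportsZstd) {
            return topicCodec;                    // nothing special to do
        }
        if (mustDownConvertFormat) {
            // Already rewriting the batches anyway: pick a codec the client understands.
            return topicCodec == CompressionCodec.ZSTD ? CompressionCodec.GZIP : topicCodec;
        }
        if (topicCodec == CompressionCodec.ZSTD) {
            // Topic explicitly configured with zstd: down-convert or return an error.
            throw new UnsupportedOperationException("client too old for zstd");
        }
        // Otherwise data is returned as-is; an old client that still hits a zstd
        // batch (e.g. producer-chosen compression) fails when parsing it.
        return topicCodec;
    }
}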

-Jason

On Sun, Aug 12, 2018 at 5:15 PM, Dongjin Lee  wrote:

> Colin and Jason,
>
> Thanks for your opinions. In summary, the pros and cons of bumping the
> fetch API version are:
>
> Cons:
>
> - The Broker can't know whether a given message batch is compressed with
> zstd or not.
> - Need some additional logic for the topic explicitly configured to use
> zstd.
>
> Pros:
>
> - The broker doesn't need to conduct expensive down-conversion.
> - Can message the users to update their client.
>
> So, opinions for the backward-compatibility policy by far:
>
> - A: bump the API version - +2 (Colin, Jason)
> - B: leave unchanged - +1 (Viktor)
>
> Here are my additional comments:
>
> @Colin
>
> I greatly appreciate your response. In the case of the dictionary support,
> of course, this issue should be addressed later, so we don't need it in the
> first version. You are right - it is not too late to try it after some
> benchmarks. What I mean is, we should keep that potential feature in mind.
>
> @Jason
>
> You wrote,
>
> > Similarly, if we have to down-convert anyway because the client does not
> understand the message format, then we could also use a different
> compression type.
>
> But in my opinion, since the client will fail on the API version check, we
> don't need to down-convert the messages anyway, do we? So I think we
> don't need to care about this case. (I'm sorry, I am not familiar with the
> down-conversion logic.)
>
> Please give more opinions. Thanks!
>
> - Dongjin
>
>
> On Wed, Aug 8, 2018 at 6:41 AM Jason Gustafson  wrote:
>
> > Hey Colin,
> >
> > The problem for the fetch API is that the broker does not generally know
> if
> > a batch was compressed with zstd unless it parses it. I think the goal
> here
> > is to avoid the expensive down-conversion that is needed to ensure
> > compatibility because it is only necessary if zstd is actually in use.
> But
> > as long as old clients can parse the message format, they should get a
> > reasonable error if they see an unsupported compression type in the
> > attributes. Basically the onus is on users to ensure that their consumers
> > have been updated prior to using zstd. It seems like a reasonable
> tradeoff
> > to me. There are a couple cases that might be worth thinking through:
> >
> > 1. If a topic is explicitly configured to use zstd, then we don't need to
> > check the fetched data for the compression type to know if we need
> > down-conversion. If we did bump the Fetch API version, then we could
> handle
> > this case by either down-converting using a different compression type or
> > returning an error.
> > 2. Similarly, if we have to down-convert anyway because the client does
> not
> > understand the message format, then we could also use a different
> > compression type.
> >
> > For the produce API, I think it's reasonable to bump the api version.
> This
> > can be used by clients to check whether a broker supports zstd. For
> > example, we might support a list of preferred compression types in the
> > producer and we could use the broker to detect which version to use.
> >
> > -Jason
> >
> > On Tue, Aug 7, 2018 at 1:32 PM, Colin McCabe  wrote:
> >
> > > Thanks for bumping this, Dongjin.  ZStd is a good compression codec
> and I
> > > hope we can get this support in soon!
> > >
> > > I would say we can just bump the API version to indicate that ZStd
> > support
> > > is 

Re: [DISCUSS] Applying scalafmt to core code

2018-08-13 Thread Colin McCabe
On Wed, Aug 8, 2018, at 10:19, Ray Chiang wrote:
> By doing piecemeal formatting, I don't think we can do a "hard" 
> enforcement on using scalafmt with every PR, but by allowing the tool to 
> run on already modified files in a patch, we can slowly migrate towards 
> getting the entire code base clean.  The trade offs are pretty standard 
> (giant patch polluting "git blame" vs. slower cleanup).  This came out 
> of the discussion in KAFKA-2423, where most seemed against one giant patch.

Right, there is definitely a tradeoff there.

> 
> The benefits of pretty-printing tend to be limited, but it does open 
> the door for other linting/static analysis tools without the need to 
> turn off their particular pretty-printing features (which exist in some, 
> but not all, tools).

It seems like either way, we will have to customize the tools to whatever style 
we decide on.  In the case of findbugs, at least, turning off or modifying the 
style settings is extremely easy to do.  I have observed the same with the 
other static analysis tools I have used.  So I don't think this particular 
question should make a big difference to the final decision.

best,
Colin


> 
> -Ray
> 
> 
> On 20180807 11:41 AM, Colin McCabe wrote:
> > Hmm.  It would be unfortunate to make contributors include unrelated style 
> > changes in their PRs.  This would be especially hard on new contributors 
> > who might not want to make a large change.
> >
> > If we really want to do something like this, I would vote for A1.  Just do 
> > the change all at once and get it over with.
> >
> > I'm also curious what benefit we get out of making these changes.  If the 
> > code style was acceptable to the reviewers who committed the code, maybe we 
> > should leave it alone?
> >
> > best,
> > Colin
> >
> >
> > On Tue, Aug 7, 2018, at 09:41, Guozhang Wang wrote:
> >> Hello Ray,
> >>
> >> I saw on the original PR Jason (cc'ed) expressed a concern comparing
> >> scalafmt with scalastyle: the latter will throw exceptions in the build
> >> process to notify developers while the former will automatically reformat
> >> the code that developers may not be aware of. So I think maybe Jason can
> >> elaborate a bit more of your thoughts on this regard.
> >>
> >> Personally I like this idea (scalafmt). As for cherry-picking burdens,
> >> there may be always some unavoidable, and I think B4 seems less invasive
> >> and hence preferable.
> >>
> >>
> >> Guozhang
> >>
> >>
> >>
> >>
> >> On Mon, Jul 30, 2018 at 1:20 PM, Ray Chiang  wrote:
> >>
> >>> I had started on KAFKA-2423 (was Scalastyle, now Expand scalafmt to
> >>> core).  As part of the cleanup, applying the "gradlew spotlessApply"
> >>> command ended up affecting too many (435 out of 439) files.  Since this
> >>> will affect every file, this sort of change does risk polluting the git
> >>> logs.
> >>>
> >>> So, I'd like to get a discussion going to find some agreement on an
> >>> approach.  Right now, I see two categories of options:
> >>>
> >>> A) Getting scalafmt working on the existing code
> >>> B) Getting all the code conforming to scalafmt requirements
> >>>
> >>> For the first, I see a couple of approaches:
> >>>
> >>> A1) Do the minimum change that allows scalafmt to run on all the .scala
> >>> files
> >>> A2) Make the change so that scalafmt runs as-is (only on the streams code)
> >>> and add a different task/options that allow running scalafmt on a subset 
> >>> of
> >>> code.  (Reasons explained below)
> >>>
> >>> For the second, I can think of the following options:
> >>>
> >>> B1) Do one giant git commit of all cleaned code (no one seemed to like
> >>> this)
> >>> B2) Do git commits one file at a time (trunk or as a branch)
> >>> B3) Do git commits one leaf subdirectory at a time (trunk or as a branch)
> >>> B4) With each pull request on all patches, run option A2) on the affected
> >>> files
> >>>
> >>>  From what I can envision, options B2 and B3 require quite a bit of manual
> >>> work if we want to cover multiple releases.  The "cleanest" option I can
> >>> think of looks something like:
> >>>
> >>> C1) Contributor makes code modifications for their JIRA
> >>> C2) Contributor runs option A2 to also apply scalafmt to their existing
> >>> code
> >>> C3) Committer does the regular review process
> >>>
> >>> At some point in the future, enough cleanup could be done that the final
> >>> cleanup can be done as a much smaller set of MINOR commits.
> >>>
> >>> -Ray
> >>>
> >>>
> >>
> >> -- 
> >> -- Guozhang
> 


Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-13 Thread Dhruvil Shah
Thanks for the KIP, Stanislav! +1 (non-binding)

- Dhruvil

On Mon, Aug 13, 2018 at 9:39 AM Colin McCabe  wrote:

> +1 (non-binding)
>
> best,
> Colin
>
> On Tue, Aug 7, 2018, at 04:19, Stanislav Kozlovski wrote:
> > Hey everybody,
> > I'm starting a vote on KIP-346
> > <
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-346+-+Improve+LogCleaner+behavior+on+error
> >
> >
> > --
> > Best,
> > Stanislav
>


Re: [VOTE] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-08-13 Thread Colin McCabe
+1 (non-binding).  Thanks, Vahid!

One last question: can you fill in the behavior for the group.id="" case, in 
the table following "This is how these two group ids will work".  It would be 
helpful to have a description of this behavior in the KIP for documentation 
purposes.
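
For context, the cases in that table show up in client code roughly like the
sketch below; it assumes the KIP's proposed direction (a consumer without a
real group id can still use manual assignment, while group-backed operations
are the ones whose behavior the table needs to spell out), and the exact error
types are part of what is being documented:

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class NoGroupIdConsumerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
        // Note: no group.id is set, so this consumer does not (and should not
        // silently) participate in any consumer group.

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Manual assignment works without a group id...
            consumer.assign(Collections.singletonList(new TopicPartition("my-topic", 0)));
            consumer.poll(Duration.ofMillis(100));
            // ...whereas subscribe() and offset commits go through the group
            // coordinator and are exactly the operations covered by the KIP table.
        }
    }
}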

cheers,
Colin

On Tue, Aug 7, 2018, at 11:14, Vahid S Hashemian wrote:
> Hi all,
> 
> I'd like to start a vote on KIP-289 to modify the default group id of 
> KafkaConsumer.
> The KIP: 
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer
> The discussion thread: 
> https://www.mail-archive.com/dev@kafka.apache.org/msg87379.html
> 
> Thanks!
> --Vahid
> 


Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-13 Thread Colin McCabe
+1 (non-binding)

best,
Colin

On Tue, Aug 7, 2018, at 04:19, Stanislav Kozlovski wrote:
> Hey everybody,
> I'm starting a vote on KIP-346
> 
> 
> -- 
> Best,
> Stanislav


Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-08-13 Thread Adam Bellemare
CC Jan

On Mon, Aug 13, 2018 at 12:16 PM, Adam Bellemare 
wrote:

> Hi Jan
>
> If you do not use headers or other metadata, how do you ensure that
> changes to the foreign-key value are not resolved out-of-order?
> ie: If an event has FK = A, but you change it to FK = B, you need to
> propagate both a delete (FK=A -> null) and an addition (FK=B). In my
> solution, without maintaining any metadata, it is possible for the final
> output to be in either order - the correctly updated joined value, or the
> null for the delete.
>
> (key, null)
> (key, )
>
> or
>
> (key, )
> (key, null)
>
> I looked back through your code and through the discussion threads, and
> didn't see any information on how you resolved this. I have a version of my
> code working for 2.0, I am just adding more integration tests and will
> update the KIP accordingly. Any insight you could provide on resolving
> out-of-order semantics without metadata would be helpful.
>
> Thanks
> Adam
>
>
> On Mon, Aug 13, 2018 at 3:34 AM, Jan Filipiak 
> wrote:
>
>> Hi,
>>
>> Happy to see that you want to make an effort here.
>>
>> Regarding the ProcessSuppliers I couldn't find a way to not rewrite the
>> joiners + the merger.
>> The re-partitioners can be reused in theory. I don't know if repartition
>> is optimized in 2.0 now.
>>
>> I made this
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-241+KT
>> able+repartition+with+compacted+Topics
>> back then and we are running KIP-213 with KIP-241 in combination.
>>
>> For us it is vital as it minimized the size we had in our repartition
>> topics plus it removed the factor of 2 in events on every message.
>> I know about this new "delete once consumer has read it". I don't think
>> 241 is vital for all use cases, but for ours it is. I wanted
>> to use 213 to sneak in the foundations for 241 as well.
>>
>> I don't quite understand what a PropagationWrapper is, but I am certain
>> that you do not need RecordHeaders
>> for 213 and I would try to leave them out. They either belong to the DSL
>> or to the user, having a mixed use is
>> to be avoided. We run the join with 0.8 logformat and I don't think one
>> needs more.
>>
>> This KIP will be very valuable for the streams project! I could never
>> convince myself to invest in the 1.0+ DSL
>> as I used almost all my energy to fight against it. Maybe this can also
>> help me see the good sides a little bit more.
>>
>> If there is anything unclear with all the text that has been written,
>> feel free to just directly cc me so I don't miss it on
>> the mailing list.
>>
>> Best Jan
>>
>>
>>
>>
>>
>> On 08.08.2018 15:26, Adam Bellemare wrote:
>>
>>> More followup, and +dev as Guozhang replied to me directly previously.
>>>
>>> I am currently porting the code over to trunk. One of the major changes
>>> since 1.0 is the usage of GraphNodes. I have a question about this:
>>>
>>> For a foreignKey joiner, should it have its own dedicated node type? Or
>>> would it be advisable to construct it from existing GraphNode components?
>>> For instance, I believe I could construct it from several
>>> OptimizableRepartitionNode, some SinkNode, some SourceNode, and several
>>> StatefulProcessorNode. That being said, there is some underlying
>>> complexity
>>> to each approach.
>>>
>>> I will be switching the KIP-213 to use the RecordHeaders in Kafka Streams
>>> instead of the PropagationWrapper, but conceptually it should be the
>>> same.
>>>
>>> Again, any feedback is welcomed...
>>>
>>>
>>> On Mon, Jul 30, 2018 at 9:38 AM, Adam Bellemare <
>>> adam.bellem...@gmail.com>
>>> wrote:
>>>
>>> Hi Guozhang et al

 I was just reading the 2.0 release notes and noticed a section on Record
 Headers.
 https://cwiki.apache.org/confluence/display/KAFKA/KIP-
 244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API

 I am not yet sure if the contents of a RecordHeader is propagated all
 the
 way through the Sinks and Sources, but if it is, and if it remains
 attached
 to the record (including null records) I may be able to ditch the
 propagationWrapper for an implementation using RecordHeader. I am not
 yet
 sure if this is doable, so if anyone understands RecordHeader impl
 better
 than I, I would be happy to hear from you.

 In the meantime, let me know of any questions. I believe this PR has a
 lot
 of potential to solve problems for other people, as I have encountered a
 number of other companies in the wild all home-brewing their own
 solutions
 to come up with a method of handling relational data in streams.

 Adam


 On Fri, Jul 27, 2018 at 1:45 AM, Guozhang Wang 
 wrote:

 Hello Adam,
>
> Thanks for rebooting the discussion of this KIP ! Let me finish my pass
> on the wiki and get back to you soon. Sorry for the delays..
>
> Guozhang
>
> On Tue, Jul 24, 2018 at 6:08 AM, Adam Bellemare <
> adam.bellem...@gmail.com

Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-08-13 Thread Adam Bellemare
Hi Jan

If you do not use headers or other metadata, how do you ensure that changes
to the foreign-key value are not resolved out-of-order?
ie: If an event has FK = A, but you change it to FK = B, you need to
propagate both a delete (FK=A -> null) and an addition (FK=B). In my
solution, without maintaining any metadata, it is possible for the final
output to be in either order - the correctly updated joined value, or the
null for the delete.

(key, null)
(key, )

or

(key, )
(key, null)

I looked back through your code and through the discussion threads, and
didn't see any information on how you resolved this. I have a version of my
code working for 2.0, I am just adding more integration tests and will
update the KIP accordingly. Any insight you could provide on resolving
out-of-order semantics without metadata would be helpful.
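
To make the ordering problem concrete, here is one illustration (explicitly not
the KIP's design) of why some per-record metadata, such as the source record's
offset, makes the resolution deterministic regardless of arrival order:

// Illustration only: attach the source offset (or a version) to both propagated
// events and let the final stage keep the highest one seen per key. This is NOT
// the KIP's mechanism, just a sketch of why some metadata is needed.
final class VersionedJoinResult<V> {
    final V joinedValueOrNull;   // null represents the FK=A -> null delete
    final long sourceOffset;     // monotonically increasing per source partition

    VersionedJoinResult(final V joinedValueOrNull, final long sourceOffset) {
        this.joinedValueOrNull = joinedValueOrNull;
        this.sourceOffset = sourceOffset;
    }

    // Keep the result from the newer source update, regardless of arrival order.
    static <V> VersionedJoinResult<V> resolve(final VersionedJoinResult<V> a,
                                              final VersionedJoinResult<V> b) {
        return a.sourceOffset >= b.sourceOffset ? a : b;
    }
}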

Thanks
Adam


On Mon, Aug 13, 2018 at 3:34 AM, Jan Filipiak 
wrote:

> Hi,
>
> Happy to see that you want to make an effort here.
>
> Regarding the ProcessSuppliers I couldn't find a way to not rewrite the
> joiners + the merger.
> The re-partitioners can be reused in theory. I don't know if repartition
> is optimized in 2.0 now.
>
> I made this
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-241+
> KTable+repartition+with+compacted+Topics
> back then and we are running KIP-213 with KIP-241 in combination.
>
> For us it is vital as it minimized the size we had in our repartition
> topics plus it removed the factor of 2 in events on every message.
> I know about this new "delete once consumer has read it". I don't think
> 241 is vital for all use cases, but for ours it is. I wanted
> to use 213 to sneak in the foundations for 241 as well.
>
> I don't quite understand what a PropagationWrapper is, but I am certain
> that you do not need RecordHeaders
> for 213 and I would try to leave them out. They either belong to the DSL
> or to the user, having a mixed use is
> to be avoided. We run the join with 0.8 logformat and I don't think one
> needs more.
>
> This KIP will be very valuable for the streams project! I could never
> convince myself to invest in the 1.0+ DSL
> as I used almost all my energy to fight against it. Maybe this can also
> help me see the good sides a little bit more.
>
> If there is anything unclear with all the text that has been written, feel
> free to just directly cc me so I don't miss it on
> the mailing list.
>
> Best Jan
>
>
>
>
>
> On 08.08.2018 15:26, Adam Bellemare wrote:
>
>> More followup, and +dev as Guozhang replied to me directly previously.
>>
>> I am currently porting the code over to trunk. One of the major changes
>> since 1.0 is the usage of GraphNodes. I have a question about this:
>>
>> For a foreignKey joiner, should it have its own dedicated node type? Or
>> would it be advisable to construct it from existing GraphNode components?
>> For instance, I believe I could construct it from several
>> OptimizableRepartitionNode, some SinkNode, some SourceNode, and several
>> StatefulProcessorNode. That being said, there is some underlying
>> complexity
>> to each approach.
>>
>> I will be switching the KIP-213 to use the RecordHeaders in Kafka Streams
>> instead of the PropagationWrapper, but conceptually it should be the same.
>>
>> Again, any feedback is welcomed...
>>
>>
>> On Mon, Jul 30, 2018 at 9:38 AM, Adam Bellemare > >
>> wrote:
>>
>> Hi Guozhang et al
>>>
>>> I was just reading the 2.0 release notes and noticed a section on Record
>>> Headers.
>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>> 244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API
>>>
>>> I am not yet sure if the contents of a RecordHeader is propagated all the
>>> way through the Sinks and Sources, but if it is, and if it remains
>>> attached
>>> to the record (including null records) I may be able to ditch the
>>> propagationWrapper for an implementation using RecordHeader. I am not yet
>>> sure if this is doable, so if anyone understands RecordHeader impl better
>>> than I, I would be happy to hear from you.
>>>
>>> In the meantime, let me know of any questions. I believe this PR has a
>>> lot
>>> of potential to solve problems for other people, as I have encountered a
>>> number of other companies in the wild all home-brewing their own
>>> solutions
>>> to come up with a method of handling relational data in streams.
>>>
>>> Adam
>>>
>>>
>>> On Fri, Jul 27, 2018 at 1:45 AM, Guozhang Wang 
>>> wrote:
>>>
>>> Hello Adam,

 Thanks for rebooting the discussion of this KIP ! Let me finish my pass
 on the wiki and get back to you soon. Sorry for the delays..

 Guozhang

 On Tue, Jul 24, 2018 at 6:08 AM, Adam Bellemare <
 adam.bellem...@gmail.com

> wrote:
> Let me kick this off with a few starting points that I would like to
> generate some discussion on.
>
> 1) It seems to me that I will need to repartition the data twice - once
> on
> the foreign key, and once back to the primary 

[jira] [Resolved] (KAFKA-7257) --property (silently) fails when configuring SSL parameters

2018-08-13 Thread Manikumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-7257.
--
Resolution: Not A Problem

> --property (silently) fails when configuring SSL parameters
> ---
>
> Key: KAFKA-7257
> URL: https://issues.apache.org/jira/browse/KAFKA-7257
> Project: Kafka
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 0.11.0.1
>Reporter: Andrew Pennebaker
>Priority: Major
>
> The security.protocol, ssl.truststore.location, ssl.truststore.password, 
> ssl.protocol, and ssl.enabled.protocols keys appear to fail to propagate to 
> the `kafka_console_\{producer,consumer}.sh` scripts when configured using the 
> `--property` CLI flag, resulting in the producer/consumer silently failing to 
> pass messages through Kafka.
> As a workaround, I am explicitly configuring these keys in a ssl.properties 
> file, and supplying the path to the `--producer.config` 
> (`kafka_console_producer.sh`) and `--consumer.config` 
> (`kafka_console_consumer.sh`) flags.
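
For reference, the same keys are ordinary client configs when using the Java 
clients directly, which parallels the ssl.properties workaround above (paths, 
passwords, and the topic name below are placeholders):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class SslProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker.example.com:9093");
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // The SSL settings from the description, as client configs:
        props.put("security.protocol", "SSL");
        props.put("ssl.truststore.location", "/path/to/truststore.jks");
        props.put("ssl.truststore.password", "changeit");
        props.put("ssl.enabled.protocols", "TLSv1.2");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "hello over SSL"));
        }
    }
}
{code}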



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: I have issue in Kafka 2.0

2018-08-13 Thread Steve Tian
Have you checked the javadoc of KafkaConsumer?
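
For anyone hitting the same exception: the javadoc's recommendation is one
consumer per thread, with wakeup() (the only thread-safe method) used to stop
the consumer from another thread. A minimal sketch of that pattern (the topic
name is a placeholder, and the Properties are assumed to already contain
bootstrap servers, deserializers, etc.):

import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import java.util.concurrent.atomic.AtomicBoolean;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.errors.WakeupException;

public class SingleThreadedConsumer implements Runnable {
    private final KafkaConsumer<String, String> consumer;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    public SingleThreadedConsumer(Properties props) {
        this.consumer = new KafkaConsumer<>(props);
    }

    @Override
    public void run() {
        try {
            // All consumer calls happen on this one thread only.
            consumer.subscribe(Collections.singletonList("my-topic"));
            while (!closed.get()) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(100));
                records.forEach(r -> System.out.println(r.key() + " -> " + r.value()));
            }
        } catch (WakeupException e) {
            if (!closed.get()) throw e;   // only expected during shutdown
        } finally {
            consumer.close();
        }
    }

    // Safe to call from any other thread: wakeup() is the one thread-safe method.
    public void shutdown() {
        closed.set(true);
        consumer.wakeup();
    }
}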

On Mon, Aug 13, 2018, 11:10 PM Kailas Biradar  wrote:

> I keep hitting this ConcurrentModificationException because
> KafkaConsumer is not safe for multi-threaded access
>
> --
> kailas
>


I have issue in Kafka 2.0

2018-08-13 Thread Kailas Biradar
I keep hitting this ConcurrentModificationException because
KafkaConsumer is not safe for multi-threaded access

-- 
kailas


Re: [DISCUSS] KIP-258: Allow to Store Record Timestamps in RocksDB

2018-08-13 Thread Eno Thereska
Hi Matthias,

Good stuff. Could you comment a bit on how future-proof this change is? For
example, if we want to store both the event timestamp "and" the processing time
in RocksDB, will we then need another interface (e.g. called
KeyValueWithTwoTimestampsStore)?
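
For readers following the thread, the shape under discussion is roughly a
key-value store whose values carry one timestamp. A minimal sketch, assuming
the names stay close to the KIP's current proposal (they may not):

// Sketch of the store interface shape under discussion; names are placeholders.
public interface KeyValueWithTimestampStore<K, V> {

    final class ValueAndTimestamp<T> {
        public final T value;
        public final long timestamp;   // e.g. event time in epoch millis

        public ValueAndTimestamp(final T value, final long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    void put(K key, V value, long timestamp);

    ValueAndTimestamp<V> get(K key);
}

Storing a second timestamp later would mean either another wrapper type or a
more general metadata container, which is exactly the future-proofing question
raised above.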

Thanks
Eno

On Thu, Aug 9, 2018 at 2:30 PM, Matthias J. Sax 
wrote:

> Thanks for your input Guozhang and John.
>
> I see your point that the upgrade API is not simple. If you don't
> think it's valuable to make generic store upgrades possible (atm), we
> can make the API internal, too. The impact is that we only support a
> predefined set of upgrades (i.e., KV to KVwithTs, Windowed to
> WindowedWithTs, etc.) for which we implement the internal interfaces.
>
> We can keep the design generic, so if we decide to make it public, we
> don't need to re-invent it. This will also have the advantage, that we
> can add upgrade pattern for other stores later, too.
>
> I also agree that the `StoreUpgradeBuilder` is a little ugly, but it
> was the only way I could find to design a generic upgrade interface. If
> we decide to hide all the upgrade stuff, `StoreUpgradeBuilder` would
> become an internal interface, I guess (I don't think we can remove it).
>
> I will wait for more feedback about this and if nobody wants to keep it
> as public API I will update the KIP accordingly. Will add some more
> clarifications for different upgrade patterns in the mean time and fix
> the typos/minor issues.
>
> About adding a new state UPGRADING: maybe we could do that. However, I
> find it particularly difficult to make the estimation when we should
> switch to RUNNING, thus, I am a little hesitant. Using store callbacks
> or just logging the progress including some indication about the "lag"
> might actually be sufficient. Not sure what others think?
>
> About "value before timestamp": no real reason and I think it does not
> make any difference. Do you want to change it?
>
> About upgrade robustness: yes, we cannot control if an instance fails.
> That is what I meant by "we need to write tests". The upgrade should be
> able to continue even if an instance goes down (and we must make sure
> that we don't end up in an invalid state that forces us to wipe out the
> whole store). Thus, we need to write system tests that fail instances
> during upgrade.
>
> For `in_place_offline` upgrade: I don't think we need this mode, because
> people can do this via a single rolling bounce.
>
>  - prepare code and switch KV-Store to KVwithTs-Store
>  - do a single rolling bounce (don't set any upgrade config)
>
> For this case, the `StoreUpgradeBuilder` (or `KVwithTs-Store` if we
> remove the `StoreUpgradeBuilder`) will detect that there is only an old
> local KV store w/o TS, will start to restore the new KVwithTs store,
> wipe out the old store and replace with the new store after restore is
> finished, and start processing only afterwards. (I guess we need to
> document this case -- will also add it to the KIP.)
>
>
>
> -Matthias
>
>
>
> On 8/9/18 1:10 PM, John Roesler wrote:
> > Hi Matthias,
> >
> > I think this KIP is looking really good.
> >
> > I have a few thoughts to add to the others:
> >
> > 1. You mentioned at one point users needing to configure
> > `upgrade.mode="null"`. I think this was a typo and you meant to say they
> > should remove the config. If they really have to set it to a string
> "null"
> > or even set it to a null value but not remove it, it would be
> unfortunate.
> >
> > 2. In response to Bill's comment #1 , you said that "The idea is that the
> > upgrade should be robust and not fail. We need to write according tests".
> > I may have misunderstood the conversation, but I don't think it's within
> > our power to say that an instance won't fail. What if one of my computers
> > catches on fire? What if I'm deployed in the cloud and one instance
> > disappears and is replaced by a new one? Or what if one instance goes
> AWOL
> > for a long time and then suddenly returns? How will the upgrade process
> > behave in light of such failures?
> >
> > 3. your thought about making in-place an offline mode is interesting, but
> > it might be a bummer for on-prem users who wish to upgrade online, but
> > cannot just add new machines to the pool. It could be a new upgrade mode
> > "offline-in-place", though...
> >
> > 4. I was surprised to see that a user would need to modify the topology
> to
> > do an upgrade (using StoreUpgradeBuilder). Maybe some of Guozhang's
> > suggestions would remove this necessity.
> >
> > Thanks for taking on this very complex but necessary work.
> >
> > -John
> >
> > On Thu, Aug 9, 2018 at 12:22 PM Guozhang Wang 
> wrote:
> >
> >> Hello Matthias,
> >>
> >> Thanks for the updated KIP. Some more comments:
> >>
> >> 1. The current set of proposed API is a bit too complicated, which makes
> >> the upgrade flow from user's perspective also a bit complex. I'd like to
> >> check different APIs and discuss about their needs separately:
> >>
> >> 1.a. 

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-08-13 Thread Becket Qin
Hi Lucas,

Thanks for the explanation. It might be a nitpick, but it seems better to
mention in the motivation part that today the client requests and
controller requests share not only the same queue but also a bunch of
other things, so that we can avoid asking people to read the rejected
alternatives.
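
For readers new to the thread, the core of the proposal is that controller
requests no longer share the data-plane request queue (and, in the latest
revision, get their own listener and handler threads). A toy illustration of
that separation, not the broker's actual classes:

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Toy model: controller requests get their own bounded queue and handler, so a
// full data-plane queue cannot delay controller-to-broker traffic.
public class RequestPlanesSketch {
    private final BlockingQueue<Runnable> dataPlane = new ArrayBlockingQueue<>(500);
    private final BlockingQueue<Runnable> controlPlane = new ArrayBlockingQueue<>(20);

    public void startHandlers() {
        new Thread(() -> drain(controlPlane), "control-plane-handler").start();
        new Thread(() -> drain(dataPlane), "data-plane-handler").start();
    }

    private void drain(BlockingQueue<Runnable> queue) {
        try {
            while (true) {
                queue.take().run();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}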

Thanks,

Jiangjie (Becket) Qin







On Fri, Aug 10, 2018 at 6:23 AM, Lucas Wang  wrote:

> @Becket,
>
> I've asked for review by Jun and Joel in the vote thread.
> Regarding the separate thread and port, I did talk about it in the rejected
> alternative design 1.
> Please let me know if you'd like more elaboration or moving it to the
> motivation, etc.
>
> Thanks,
> Lucas
>
> On Wed, Aug 8, 2018 at 3:59 PM, Becket Qin  wrote:
>
> > Hi Lucas,
> >
> > Yes, a separate Jira is OK.
> >
> > Since the proposal has significantly changed since the initial vote
> > started. We probably should let the others who have already voted know
> and
> > ensure they are happy with the updated proposal.
> > Also, it seems the motivation part of the KIP wiki is still just talking
> > about the separate queue and not fully cover the changes we make now,
> e.g.
> > separate thread, port, etc. We might want to explain a bit more so for
> > people who did not follow the discussion mail thread also understand the
> > whole proposal.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Wed, Aug 8, 2018 at 12:44 PM, Lucas Wang 
> wrote:
> >
> > > Hi Becket,
> > >
> > > Thanks for the review. The current write up in the KIP won’t change the
> > > ordering behavior. Are you ok with addressing that as a separate
> > > independent issue (I’ll create a separate ticket for it)?
> > > If so, can you please give me a +1 on the vote thread?
> > >
> > > Thanks,
> > > Lucas
> > >
> > > On Tue, Aug 7, 2018 at 7:34 PM Becket Qin 
> wrote:
> > >
> > > > Thanks for the updated KIP wiki, Lucas. Looks good to me overall.
> > > >
> > > > It might be an implementation detail, but do we still plan to use the
> > > > correlation id to ensure the request processing order?
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Jul 31, 2018 at 3:39 AM, Lucas Wang 
> > > wrote:
> > > >
> > > > > Thanks for your review, Dong.
> > > > > Ack that these configs will have a bigger impact for users.
> > > > >
> > > > > On the other hand, I would argue that the request queue becoming
> full
> > > > > may or may not be a rare scenario.
> > > > > How often the request queue gets full depends on the request
> incoming
> > > > rate,
> > > > > the request processing rate, and the size of the request queue.
> > > > > When that happens, the dedicated endpoints design can better handle
> > > > > it than any of the previously discussed options.
> > > > >
> > > > > Another reason I made the change was that I have the same taste
> > > > > as Becket that it's a better separation of the control plane from
> the
> > > > data
> > > > > plane.
> > > > >
> > > > > Finally, I want to clarify that this change is NOT motivated by the
> > > > > out-of-order
> > > > > processing discussion. The latter problem is orthogonal to this
> KIP,
> > > and
> > > > it
> > > > > can happen in any of the design options we discussed for this KIP
> so
> > > far.
> > > > > So I'd like to address out-of-order processing separately in
> another
> > > > > thread,
> > > > > and avoid mentioning it in this KIP.
> > > > >
> > > > > Thanks,
> > > > > Lucas
> > > > >
> > > > > On Fri, Jul 27, 2018 at 7:51 PM, Dong Lin 
> > wrote:
> > > > >
> > > > > > Hey Lucas,
> > > > > >
> > > > > > Thanks for the update.
> > > > > >
> > > > > > The current KIP propose new broker configs
> > "listeners.for.controller"
> > > > and
> > > > > > "advertised.listeners.for.controller". This is going to be a big
> > > change
> > > > > > since listeners are among the most important configs that every
> > user
> > > > > needs
> > > > > > to change. According to the rejected alternative section, it
> seems
> > > that
> > > > > the
> > > > > > reason to add these two configs is to improve performance when
> the
> > > data
> > > > > > request queue is full rather than for correctness. It should be a
> > > very
> > > > > rare
> > > > > > scenario and I am not sure we should add configs for all users
> just
> > > to
> > > > > > improve the performance in such rare scenario.
> > > > > >
> > > > > > Also, if the new design is based on the issues which are
> discovered
> > > in
> > > > > the
> > > > > > recent discussion, e.g. out of order processing if we don't use a
> > > > > dedicated
> > > > > > thread for controller request, it may be useful to explain the
> > > problem
> > > > in
> > > > > > the motivation section.
> > > > > >
> > > > > > Thanks,
> > > > > > Dong
> > > > > >
> > > > > > On Fri, Jul 27, 2018 at 1:28 PM, Lucas Wang <
> lucasatu...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > A kind reminder for review of this KIP.
> > > > > > >
> > > > > > > Thank you very much!

Re: [DISCUSS] KIP-342 Add Customizable SASL extensions to OAuthBearer authentication

2018-08-13 Thread Stanislav Kozlovski
Hi,

I've written up an initial implementation of what has been discussed. Take
a look at it here: https://github.com/apache/kafka/pull/5497/
I will make sure to update the KIP once a review of the PR passes


On Mon, Aug 13, 2018 at 10:19 AM Rajini Sivaram 
wrote:

> Hi Stanislav,
>
> I think `token` and `extensions` on
> `OAuthBearerExtensionsValidatorCallback`
> should be immutable. The getters should return whatever was provided in the
> constructor and these should be stored as `final` objects. The whole point
> of separating out `OAuthBearerExtensionsValidatorCallback` from
> `OAuthBearerValidatorCallback`
> was to ensure that tokens are securely validated and created without any
> reference to insecure extensions. So it is critical that
> `OAuthBearerSaslServer` never uses the token returned by the extensions
> callback. As for the extensions, we should have two methods -
> {`extensions()`, `validatedExtensions()`} as I suggested in the last note
> OR {`defaultExtensions()`, `extensions()`} as used in NameCallback. I don't
> think we should make extensions mutable to use the same value for input and
> output. Btw, on error, we shouldn't care about the values in the returned
> extensions at all, we should simply fail authentication.
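
To illustrate the split being discussed (the token is validated first, and the
handler decides which client-supplied extensions become negotiated properties),
here is a minimal sketch; the method names follow this thread's proposal and may
still change, and the "traceId" rule is only an example:

import java.util.HashMap;
import java.util.Map;

// Sketch only: the callback hands the handler an already-validated token plus
// the client-supplied extensions; the handler returns the subset it validated.
public class ExtensionsValidationSketch {

    interface TokenView { String principalName(); }   // stand-in for OAuthBearerToken

    static Map<String, String> validate(TokenView token, Map<String, String> extensions) {
        Map<String, String> validated = new HashMap<>();
        for (Map.Entry<String, String> e : extensions.entrySet()) {
            // Example rule: accept only a non-empty "traceId" extension; unknown
            // extensions are ignored rather than failing authentication (RFC 7628).
            if ("traceId".equals(e.getKey()) && !e.getValue().isEmpty()
                    && token.principalName() != null) {
                validated.put(e.getKey(), e.getValue());
            }
        }
        return validated;   // would be exposed via validatedExtensions()
    }
}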
>
>
> On Sat, Aug 11, 2018 at 1:21 PM, Stanislav Kozlovski <
> stanis...@confluent.io
> > wrote:
>
> > Hi,
> >
> > @Ron
> > Agreed, tracking multiple errors would be better and would help diagnose
> > bad extensions faster
> > I've updated the KIP to address your two comments.
> > Regarding the Javadoc, please read below:
> >
> > @Rajini
> > The idea of the potentially-null token and extensions is not that they
> can
> > be passed to the constructor - it is that they can be nullified on a
> > validation error occurring (like it is done in the #error() method on
> > `OAuthBearerValidatorCallback`). Maybe it doesn't make too much sense to
> > nullify the token, but I believe it is worth it to do the same with the
> > extensions.
> >
> > I agree that the callback handler should himself populate the callback
> with
> > the *validated* extensions only. Will change implementation and KIP in
> due
> > time.
> >
> > Please share what you think about nullifying token/extensions on
> validation
> > error.
> >
> > Best,
> > Stanislav
> >
> >
> > On Fri, Aug 10, 2018 at 7:24 PM Rajini Sivaram 
> > wrote:
> >
> > > Hi Stanislav,
> > >
> > > For the point that Ron made above for:
> > >
> > > public OAuthBearerExtensionsValidatorCallback(OAuthBearerToken token,
> > > SaslExtensions
> > > extensions)
> > >
> > >
> > > I don't think we should ever invoke extensions callback without the
> > token.
> > > We can first validate the token and invoke extensions callback only if
> > > token is non-null. Can we clarify that in the javadoc?
> > >
> > >- public SaslExtensions extensions() : Extensions should be non-null
> > >- public OAuthBearerToken token() : Token should be non-null
> > >
> > >
> > > Also agree with Ron that we should have the ability to return errors
> for
> > > all invalid extensions, even if a callback handler may choose to stop
> on
> > > first failure.
> > >
> > > I think we also need another method to return the extensions that were
> > > validated and will be made available as negotiated properties. As per
> the
> > > RFC, server should ignore unknown extensions. So callback handlers need
> > to
> > > be able to validate the ones they know of and return those. Other
> > > extensions should not be added to the SaslServer's negotiated
> properties.
> > >
> > >- public SaslExtensions validatedExtensions()
> > >
> > >
> > >
> > > On Fri, Aug 10, 2018 at 3:26 PM, Ron Dagostino 
> > wrote:
> > >
> > > > Hi Stanislav.  Here are a few KIP comments.
> > > >
> > > > << > > > values to ensure they conform to the OAuth standard
> > > > It is the SASL/OAUTHBEARER standard that defines the regular
> > expressions
> > > > (specifically, https://tools.ietf.org/html/rfc7628#section-3.1)
> rather
> > > > than
> > > > any of the OAuth specifications.  It would be good to make this
> > > > clarification.
> > > >
> > > > << > token,
> > > > SaslExtensions extensions)
> > > > This constructor lacks Javadoc in the KIP.  Could you add it, and
> also
> > > > indicate which of the two parameters are required vs. optional?  The
> > > > Javadoc for the token() method indicates that the return value could
> be
> > > > null, but that would only be true if the constructor accepted a null
> > > value
> > > > for the token.  I'm okay with the constructor accepting a null token
> > > > (Rajini, you may differ in opinion, in which case I defer to your
> > > > preference).  But please do clarify this issue.
> > > >
> > > > I also am not sure if exposing just one invalid extension name and
> > error
> > > > message in the OAuthBearerExtensionsValidatorCallback class is good
> > > > enough.  An alternative to invalidExtensionName() and errorMessage()
> > > > methods would be to return an always 

[jira] [Created] (KAFKA-7282) Failed to read `log header` from file channel

2018-08-13 Thread Alastair Munro (JIRA)
Alastair Munro created KAFKA-7282:
-

 Summary: Failed to read `log header` from file channel
 Key: KAFKA-7282
 URL: https://issues.apache.org/jira/browse/KAFKA-7282
 Project: Kafka
  Issue Type: Bug
Affects Versions: 2.0.0, 1.1.1, 0.11.0.2
 Environment: Linux
Reporter: Alastair Munro


Full stack trace:
{code:java}
[2018-08-13 11:22:01,635] ERROR [ReplicaManager broker=2] Error processing 
fetch operation on partition segmenter-evt-v1-14, offset 96745 
(kafka.server.ReplicaManager)
org.apache.kafka.common.KafkaException: java.io.EOFException: Failed to read 
`log header` from file channel `sun.nio.ch.FileChannelImpl@6e6d8ddd`. Expected 
to read 17 bytes, but reached end of file after reading 0 bytes. Started read 
from position 25935.
at 
org.apache.kafka.common.record.RecordBatchIterator.makeNext(RecordBatchIterator.java:40)
at 
org.apache.kafka.common.record.RecordBatchIterator.makeNext(RecordBatchIterator.java:24)
at 
org.apache.kafka.common.utils.AbstractIterator.maybeComputeNext(AbstractIterator.java:79)
at 
org.apache.kafka.common.utils.AbstractIterator.hasNext(AbstractIterator.java:45)
at 
org.apache.kafka.common.record.FileRecords.searchForOffsetWithSize(FileRecords.java:286)
at kafka.log.LogSegment.translateOffset(LogSegment.scala:254)
at kafka.log.LogSegment.read(LogSegment.scala:277)
at kafka.log.Log$$anonfun$read$2.apply(Log.scala:1159)
at kafka.log.Log$$anonfun$read$2.apply(Log.scala:1114)
at kafka.log.Log.maybeHandleIOException(Log.scala:1837)
at kafka.log.Log.read(Log.scala:1114)
at 
kafka.server.ReplicaManager.kafka$server$ReplicaManager$$read$1(ReplicaManager.scala:912)
at 
kafka.server.ReplicaManager$$anonfun$readFromLocalLog$1.apply(ReplicaManager.scala:974)
at 
kafka.server.ReplicaManager$$anonfun$readFromLocalLog$1.apply(ReplicaManager.scala:973)
at 
scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at kafka.server.ReplicaManager.readFromLocalLog(ReplicaManager.scala:973)
at kafka.server.ReplicaManager.readFromLog$1(ReplicaManager.scala:802)
at kafka.server.ReplicaManager.fetchMessages(ReplicaManager.scala:815)
at kafka.server.KafkaApis.handleFetchRequest(KafkaApis.scala:678)
at kafka.server.KafkaApis.handle(KafkaApis.scala:107)
at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:69)
at java.lang.Thread.run(Thread.java:748)
{code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-2.0-jdk8 #115

2018-08-13 Thread Apache Jenkins Server
See 


Changes:

[jason] KAFKA-7164; Follower should truncate after every missed leader epoch

--
[...truncated 882.89 KB...]
kafka.controller.PartitionStateMachineTest > 
testInvalidNonexistentPartitionToOfflinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidNonexistentPartitionToOfflinePartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransitionZkUtilsExceptionFromCreateStates 
STARTED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransitionZkUtilsExceptionFromCreateStates 
PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidNewPartitionToNonexistentPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidNewPartitionToNonexistentPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testNewPartitionToOnlinePartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNewPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNewPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionErrorCodeFromStateLookup STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionErrorCodeFromStateLookup PASSED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransitionForControlledShutdown STARTED

kafka.controller.PartitionStateMachineTest > 
testOnlinePartitionToOnlineTransitionForControlledShutdown PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionZkUtilsExceptionFromStateLookup 
STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransitionZkUtilsExceptionFromStateLookup 
PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNonexistentPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidOnlinePartitionToNonexistentPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testInvalidOfflinePartitionToNewPartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testInvalidOfflinePartitionToNewPartitionTransition PASSED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransition STARTED

kafka.controller.PartitionStateMachineTest > 
testOfflinePartitionToOnlinePartitionTransition PASSED

kafka.controller.ControllerEventManagerTest > testEventThatThrowsException 
STARTED

kafka.controller.ControllerEventManagerTest > testEventThatThrowsException 
PASSED

kafka.controller.ControllerEventManagerTest > testSuccessfulEvent STARTED

kafka.controller.ControllerEventManagerTest > testSuccessfulEvent PASSED

kafka.controller.ControllerFailoverTest > testHandleIllegalStateException 
STARTED

kafka.controller.ControllerFailoverTest > testHandleIllegalStateException PASSED

kafka.network.SocketServerTest > testGracefulClose STARTED

kafka.network.SocketServerTest > testGracefulClose PASSED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone STARTED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone PASSED

kafka.network.SocketServerTest > controlThrowable STARTED

kafka.network.SocketServerTest > controlThrowable PASSED

kafka.network.SocketServerTest > testRequestMetricsAfterStop STARTED

kafka.network.SocketServerTest > testRequestMetricsAfterStop PASSED

kafka.network.SocketServerTest > testConnectionIdReuse STARTED

kafka.network.SocketServerTest > testConnectionIdReuse PASSED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
STARTED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
PASSED

kafka.network.SocketServerTest > testProcessorMetricsTags STARTED

kafka.network.SocketServerTest > testProcessorMetricsTags PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp STARTED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > testConnectionId STARTED

kafka.network.SocketServerTest > testConnectionId PASSED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics STARTED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics PASSED

kafka.network.SocketServerTest > testNoOpAction STARTED

kafka.network.SocketServerTest > 

Re: [DISCUSS] KIP-342 Add Customizable SASL extensions to OAuthBearer authentication

2018-08-13 Thread Rajini Sivaram
Hi Stanislav,

I think `token` and `extensions` on `OAuthBearerExtensionsValidatorCallback`
should be immutable. The getters should return whatever was provided in the
constructor and these should be stored as `final` objects. The whole point
of separating out `OAuthBearerExtensionsValidatorCallback` from
`OAuthBearerValidatorCallback`
was to ensure that tokens are securely validated and created without any
reference to insecure extensions. So it is critical that
`OAuthBearerSaslServer` never uses the token returned by the extensions
callback. As for the extensions, we should have two methods -
{`extensions()`, `validatedExtensions()`} as I suggested in the last note
OR {`defaultExtensions()`, `extensions()`} as used in NameCallback. I don't
think we should make extensions mutable to use the same value for input and
output. Btw, on error, we shouldn't care about the values in the returned
extensions at all, we should simply fail authentication.


On Sat, Aug 11, 2018 at 1:21 PM, Stanislav Kozlovski  wrote:

> Hi,
>
> @Ron
> Agreed, tracking multiple errors would be better and would help diagnose
> bad extensions faster.
> I've updated the KIP to address your two comments.
> Regarding the Javadoc, please read below:
>
> @Rajini
> The idea of the potentially-null token and extensions is not that they can
> be passed to the constructor as null - it is that they can be nullified when a
> validation error occurs (as is done in the #error() method on
> `OAuthBearerValidatorCallback`). Maybe it doesn't make much sense to
> nullify the token, but I believe it is worth doing the same with the
> extensions.
>
> I agree that the callback handler should itself populate the callback with
> the *validated* extensions only. I will change the implementation and KIP in due
> time.
>
> Please share what you think about nullifying token/extensions on validation
> error.
>
> Best,
> Stanislav
>
>
> On Fri, Aug 10, 2018 at 7:24 PM Rajini Sivaram 
> wrote:
>
> > Hi Stanislav,
> >
> > For the point that Ron made above for:
> >
> > public OAuthBearerExtensionsValidatorCallback(OAuthBearerToken token,
> > SaslExtensions
> > extensions)
> >
> >
> > I don't think we should ever invoke extensions callback without the
> > token.
> > We can first validate the token and invoke extensions callback only if
> > token is non-null. Can we clarify that in the javadoc?
> >
> >- public SaslExtensions extensions() : Extensions should be non-null
> >- public OAuthBearerToken token() : Token should be non-null
> >
> >
> > Also agree with Ron that we should have the ability to return errors for
> > all invalid extensions, even if a callback handler may choose to stop on
> > first failure.
> >
> > I think we also need another method to return the extensions that were
> > validated and will be made available as negotiated properties. As per the
> > RFC, server should ignore unknown extensions. So callback handlers need
> to
> > be able to validate the ones they know of and return those. Other
> > extensions should not be added to the SaslServer's negotiated properties.
> >
> >- public SaslExtensions validatedExtensions()
> >
> >
> >
> > On Fri, Aug 10, 2018 at 3:26 PM, Ron Dagostino 
> wrote:
> >
> > > Hi Stanislav.  Here are a few KIP comments.
> > >
> > > << > > values to ensure they conform to the OAuth standard
> > > It is the SASL/OAUTHBEARER standard that defines the regular
> expressions
> > > (specifically, https://tools.ietf.org/html/rfc7628#section-3.1) rather
> > > than
> > > any of the OAuth specifications.  It would be good to make this
> > > clarification.
> > >
> > > <<public OAuthBearerExtensionsValidatorCallback(OAuthBearerToken token,
> > > SaslExtensions extensions)
> > > This constructor lacks Javadoc in the KIP.  Could you add it, and also
> > > indicate which of the two parameters are required vs. optional?  The
> > > Javadoc for the token() method indicates that the return value could be
> > > null, but that would only be true if the constructor accepted a null
> > value
> > > for the token.  I'm okay with the constructor accepting a null token
> > > (Rajini, you may differ in opinion, in which case I defer to your
> > > preference).  But please do clarify this issue.
> > >
> > > I also am not sure if exposing just one invalid extension name and
> error
> > > message in the OAuthBearerExtensionsValidatorCallback class is good
> > > enough.  An alternative to invalidExtensionName() and errorMessage()
> > > methods would be to return an always non-null but potentially empty
> > > Map so that potentially all of the provided extensions
> > > could be validated and the list of invalid extension names could be
> > > returned (along with the error message for each of them).  If we
> adopted
> > > this alternative then the error(String invalidExtensionName, String
> > > errorMessage) method might need to be renamed addError(String
> > > invalidExtensionName, String errorMessage).  I suspect it would be
> better
> > > to go with the map approach to support returning multiple error
> 
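A rough sketch of the map-based error reporting suggested above, assuming a plain per-callback
error map; the class and method names here are illustrative, not the KIP's final API.

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch only: collect an error message per invalid extension instead of stopping at the
// first failure, so all problems can be reported at once.
public class ExtensionValidationErrors {
    private final Map<String, String> errors = new HashMap<>();  // invalid extension name -> message

    public void addError(String invalidExtensionName, String errorMessage) {
        errors.put(invalidExtensionName, errorMessage);
    }

    // Always non-null; an empty map means every extension passed validation.
    public Map<String, String> invalidExtensions() {
        return Collections.unmodifiableMap(errors);
    }

    public boolean allValid() {
        return errors.isEmpty();
    }
}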

Build failed in Jenkins: kafka-trunk-jdk8 #2888

2018-08-13 Thread Apache Jenkins Server
See 


Changes:

[jason] KAFKA-7164; Follower should truncate after every missed leader epoch

[jason] KAFKA-5638; Improve the Required ACL of ListGroups API (KIP-231) (#5352)

--
[...truncated 426.54 KB...]

kafka.security.auth.SimpleAclAuthorizerTest > testAclInheritance STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAclInheritance PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDistributedConcurrentModificationOfResourceAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDistributedConcurrentModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnWildcardResource 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnWildcardResource 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesExtendedAclChangeEventWhenInterBrokerProtocolAtLeastKafkaV2 STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesExtendedAclChangeEventWhenInterBrokerProtocolAtLeastKafkaV2 PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesLiteralAclChangeEventWhenInterBrokerProtocolIsKafkaV2 STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesLiteralAclChangeEventWhenInterBrokerProtocolIsKafkaV2 PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDeleteAclOnPrefixedResource 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testDeleteAclOnPrefixedResource 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testEmptyAclThrowsException 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testEmptyAclThrowsException PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testSuperUserWithCustomPrincipalHasAccess STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testSuperUserWithCustomPrincipalHasAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAllowAccessWithCustomPrincipal STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAllowAccessWithCustomPrincipal PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDeleteAclOnWildcardResource 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testDeleteAclOnWildcardResource 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testChangeListenerTiming STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testChangeListenerTiming PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesLiteralWritesLiteralAclChangeEventWhenInterBrokerProtocolLessThanKafkaV2eralAclChangesForOlderProtocolVersions
 STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesLiteralWritesLiteralAclChangeEventWhenInterBrokerProtocolLessThanKafkaV2eralAclChangesForOlderProtocolVersions
 PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testThrowsOnAddPrefixedAclIfInterBrokerProtocolVersionTooLow STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testThrowsOnAddPrefixedAclIfInterBrokerProtocolVersionTooLow PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAccessAllowedIfAllowAclExistsOnPrefixedResource STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAccessAllowedIfAllowAclExistsOnPrefixedResource PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAuthorizeWithEmptyResourceName STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAuthorizeWithEmptyResourceName PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDeleteAllAclOnPrefixedResource STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDeleteAllAclOnPrefixedResource PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnLiteralResource 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnLiteralResource 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAuthorizeThrowsOnNoneLiteralResource STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 

[jira] [Created] (KAFKA-7281) Fix documentation and error message regarding cleanup.policy=[compact,delete]

2018-08-13 Thread Patrik Kleindl (JIRA)
Patrik Kleindl created KAFKA-7281:
-

 Summary: Fix documentation and error message regarding 
cleanup.policy=[compact,delete]
 Key: KAFKA-7281
 URL: https://issues.apache.org/jira/browse/KAFKA-7281
 Project: Kafka
  Issue Type: Task
  Components: config
Affects Versions: 1.1.0
Reporter: Patrik Kleindl


Issue as requested in: 
https://lists.apache.org/thread.html/621821e321b9ae5a8af623f5918edc4ceee564e0561009317fc705af@%3Cusers.kafka.apache.org%3E

1) The documentation at [https://kafka.apache.org/documentation/] is missing 
the updated information regarding the "compact,delete" cleanup policy at the topic 
level.
log.cleanup.policy (broker level): The default cleanup policy for segments beyond the 
retention window. A comma separated list of valid policies. Valid policies are: 
"delete" and "compact"
cleanup.policy (topic level): A string that is either "delete" or "compact".
 
2) Also the special notation for the command-line client should be noted:
./kafka-configs  --zookeeper broker0:2181 --alter --entity-type topics 
--entity-name test --add-config cleanup.policy=[compact,delete]

Completed Updating config for entity: topic 'test'.

3) The config command does not show this new notation in the error message:

./kafka-configs  --zookeeper broker0:2181 --alter --entity-type topics 
--entity-name test --add-config cleanup.policy=test

Error while executing config command with args '--zookeeper broker0:2181 
--alter --entity-type topics --entity-name test --add-config 
cleanup.policy=test'

org.apache.kafka.common.config.ConfigException: Invalid value test for 
configuration cleanup.policy: String must be one of: compact, delete

 at 
org.apache.kafka.common.config.ConfigDef$ValidString.ensureValid(ConfigDef.java:930)

 at 
org.apache.kafka.common.config.ConfigDef$ValidList.ensureValid(ConfigDef.java:906)

 at org.apache.kafka.common.config.ConfigDef.parseValue(ConfigDef.java:478)

 at org.apache.kafka.common.config.ConfigDef.parse(ConfigDef.java:462)

 at kafka.log.LogConfig$.validate(LogConfig.scala:299)

 at kafka.zk.AdminZkClient.validateTopicConfig(AdminZkClient.scala:336)

 at kafka.zk.AdminZkClient.changeTopicConfig(AdminZkClient.scala:348)

 at kafka.zk.AdminZkClient.changeConfigs(AdminZkClient.scala:285)

 at kafka.admin.ConfigCommand$.alterConfig(ConfigCommand.scala:133)

 at kafka.admin.ConfigCommand$.processCommandWithZk(ConfigCommand.scala:100)

 at kafka.admin.ConfigCommand$.main(ConfigCommand.scala:77)

 at kafka.admin.ConfigCommand.main(ConfigCommand.scala)
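For context, the broker-side definition is roughly a LIST config whose elements are validated
individually, which is why the comma-separated form is accepted while the error message only
lists the individual values. A simplified sketch using the public ConfigDef API (the real
definition lives in LogConfig and is only approximated here):

{code}
import org.apache.kafka.common.config.ConfigDef;

public class CleanupPolicyConfigSketch {
    public static void main(String[] args) {
        // Simplified stand-in for LogConfig's cleanup.policy definition: a list whose
        // individual elements must each be "compact" or "delete".
        ConfigDef def = new ConfigDef()
                .define("cleanup.policy",
                        ConfigDef.Type.LIST,
                        "delete",
                        ConfigDef.ValidList.in("compact", "delete"),
                        ConfigDef.Importance.MEDIUM,
                        "The cleanup policy, e.g. \"compact,delete\".");

        // "compact,delete" parses into a two-element list and passes validation...
        System.out.println(def.parse(java.util.Collections.singletonMap("cleanup.policy", "compact,delete")));

        // ...whereas an unknown element such as "test" fails with the ConfigException
        // shown in the stack trace above.
    }
}
{code}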



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [DISCUSSION] KIP-336: Consolidate ExtendedSerializer/Serializer and ExtendedDeserializer/Deserializer

2018-08-13 Thread Viktor Somogyi
Bumping.

If there are no more thoughts on this for a few more days, I'll start a
vote.
(Linking the KIP to avoid people having to scroll back in the conversation:
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=87298242 )

Viktor

On Thu, Aug 2, 2018 at 11:30 AM Viktor Somogyi 
wrote:

> Hi Chia-Ping,
>
> Sorry for the delay on this. One thought though: looking at current
> implementations on github they seemed a bit elaborate, which makes me think
> that people don't usually want to use it as a lambda. But since in your KIP
> you added it, what was your use case there?
>
> "Q: Which implementation is suitable for serialize()/deserialize()? maybe
> just throw exception?"
>
> For now I think it'll be useful to not throw exceptions, but I have to
> check if there is a use case for this. Do you have any use cases, btw? :)
>
> Viktor
>
> On Tue, Jul 31, 2018 at 12:24 PM Chia-Ping Tsai 
> wrote:
>
>> Bumping up!
>>
>> On 2018/07/09 08:50:07, Viktor Somogyi  wrote:
>> > Hi folks,
>> >
>> > I've published KIP-336 which is about consolidating the
>> > Serializer/Deserializer interfaces.
>> >
>> > Basically, the story here is that when ExtendedSerializer and
>> > ExtendedDeserializer were added we still supported Java 7 and therefore had
>> > to use compatible constructs which now seem unnecessary since we dropped
>> > support for Java 7. Now in this KIP I propose a way to deprecate those
>> > patterns:
>> >
>> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=87298242
>> >
>> > I'd be happy to receive some feedback about the KIP I published.
>> >
>> > Cheers,
>> > Viktor
>> >
>>
>
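For anyone skimming the thread, a minimal sketch of the consolidation idea: with Java 8
default methods, the headers-aware overload can live on the base interface, so the
Extended* variants become unnecessary. The interface name and exact signatures below are
illustrative; the KIP itself defines the real ones.

import java.io.Closeable;
import java.util.Map;
import org.apache.kafka.common.header.Headers;

// Sketch only: the new overload defaults to the old method, so existing
// implementations keep working unchanged.
public interface ConsolidatedSerializer<T> extends Closeable {

    byte[] serialize(String topic, T data);

    // Headers-aware overload, previously the reason ExtendedSerializer existed.
    default byte[] serialize(String topic, Headers headers, T data) {
        return serialize(topic, data);
    }

    default void configure(Map<String, ?> configs, boolean isKey) {
        // no-op by default
    }

    @Override
    default void close() {
        // no-op by default
    }
}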


[jira] [Created] (KAFKA-7280) ConcurrentModificationException in FetchSessionHandler in heartbeat thread

2018-08-13 Thread Rajini Sivaram (JIRA)
Rajini Sivaram created KAFKA-7280:
-

 Summary: ConcurrentModificationException in FetchSessionHandler in 
heartbeat thread
 Key: KAFKA-7280
 URL: https://issues.apache.org/jira/browse/KAFKA-7280
 Project: Kafka
  Issue Type: Bug
  Components: consumer
Affects Versions: 2.0.0, 1.1.1
Reporter: Rajini Sivaram
Assignee: Rajini Sivaram
 Fix For: 1.1.2, 2.0.1, 2.1.0


Request/response handling in FetchSessionHandler is not thread-safe, but we are 
using it in the Kafka consumer without any synchronization even though poll() from 
the heartbeat thread can process responses. The heartbeat thread holds the coordinator 
lock while processing its poll and responses, making other operations involving 
the group coordinator safe. We also need to lock FetchSessionHandler for the 
operations that update or read FetchSessionHandler#sessionPartitions.

This exception is from a system test run on trunk of 
TestSecurityRollingUpgrade.test_rolling_upgrade_sasl_mechanism_phase_two:
{quote}
[2018-08-12 06:13:22,316] ERROR [Consumer clientId=console-consumer, 
groupId=group] Heartbeat thread failed due to unexpected error 
(org.apache.kafka.clients.consumer.internals.AbstractCoordinator)
java.util.ConcurrentModificationException
at 
java.util.LinkedHashMap$LinkedHashIterator.nextNode(LinkedHashMap.java:719)
at 
java.util.LinkedHashMap$LinkedKeyIterator.next(LinkedHashMap.java:742)
at 
org.apache.kafka.clients.FetchSessionHandler.responseDataToLogString(FetchSessionHandler.java:362)
at 
org.apache.kafka.clients.FetchSessionHandler.handleResponse(FetchSessionHandler.java:424)
at 
org.apache.kafka.clients.consumer.internals.Fetcher$1.onSuccess(Fetcher.java:216)
at 
org.apache.kafka.clients.consumer.internals.Fetcher$1.onSuccess(Fetcher.java:206)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:167)
at 
org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:127)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.fireCompletion(ConsumerNetworkClient.java:575)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.firePendingCompletedRequests(ConsumerNetworkClient.java:389)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:297)
at 
org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.pollNoWakeup(ConsumerNetworkClient.java:304)
at 
org.apache.kafka.clients.consumer.internals.AbstractCoordinator$HeartbeatThread.run(AbstractCoordinator.java:996)

{quote}
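A minimal, self-contained reproduction of this failure mode (plain JDK code, not the consumer
itself): one thread iterates a LinkedHashMap while another mutates it; synchronizing both
sides on a shared lock, as proposed above for FetchSessionHandler#sessionPartitions, avoids
the ConcurrentModificationException.

{code}
import java.util.LinkedHashMap;
import java.util.Map;

public class ConcurrentIterationDemo {
    private static final Map<String, Integer> partitions = new LinkedHashMap<>();
    private static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        for (int i = 0; i < 10_000; i++)
            partitions.put("topic-" + i, i);

        // Stand-in for the heartbeat thread mutating session state.
        Thread mutator = new Thread(() -> {
            for (int i = 0; i < 10_000; i++) {
                synchronized (lock) {          // drop this lock and a CME becomes likely
                    partitions.remove("topic-" + i);
                }
            }
        });
        mutator.start();

        // Stand-in for responseDataToLogString iterating the same map.
        synchronized (lock) {
            StringBuilder logLine = new StringBuilder();
            for (String key : partitions.keySet()) {
                logLine.append(key).append(',');   // builds a log string from the keys
            }
            System.out.println("log string length: " + logLine.length());
        }
        mutator.join();
    }
}
{code}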
 

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-08-13 Thread Jan Filipiak

Hi,

Happy to see that you want to make an effort here.

Regarding the ProcessorSuppliers, I couldn't find a way to avoid rewriting the 
joiners + the merger.
The re-partitioners can be reused in theory. I don't know whether repartitioning 
is optimized in 2.0 now.


I made this
https://cwiki.apache.org/confluence/display/KAFKA/KIP-241+KTable+repartition+with+compacted+Topics
back then and we are running KIP-213 with KIP-241 in combination.

For us it is vital, as it minimized the size of our repartition 
topics and removed the factor of 2 in events on every message.
I know about the new "delete once the consumer has read it" approach. I don't 
think 241 is vital for all use cases, but for ours it is. I wanted 
to use 213 to sneak in the foundations for 241 as well.

I don't quite understand what a PropagationWrapper is, but I am certain 
that you do not need RecordHeaders 
for 213, and I would try to leave them out. They belong either to the DSL 
or to the user; a mixed use is 
to be avoided. We run the join with the 0.8 log format and I don't think one 
needs more.


This KIP will be very valuable for the Streams project! I could never 
convince myself to invest in the 1.0+ DSL, 
as I used almost all my energy fighting against it. Maybe this can also 
help me see the good sides a little bit more.


If there is anything unclear with all the text that has been written, 
feel free to just directly cc me so I don't miss it on

the mailing list.

Best Jan




On 08.08.2018 15:26, Adam Bellemare wrote:

More followup, and +dev as Guozhang replied to me directly previously.

I am currently porting the code over to trunk. One of the major changes
since 1.0 is the usage of GraphNodes. I have a question about this:

For a foreignKey joiner, should it have its own dedicated node type? Or
would it be advisable to construct it from existing GraphNode components?
For instance, I believe I could construct it from several
OptimizableRepartitionNode, some SinkNode, some SourceNode, and several
StatefulProcessorNode. That being said, there is some underlying complexity
to each approach.

I will be switching the KIP-213 to use the RecordHeaders in Kafka Streams
instead of the PropagationWrapper, but conceptually it should be the same.

Again, any feedback is welcomed...


On Mon, Jul 30, 2018 at 9:38 AM, Adam Bellemare 
wrote:


Hi Guozhang et al

I was just reading the 2.0 release notes and noticed a section on Record
Headers.
https://cwiki.apache.org/confluence/display/KAFKA/KIP-
244%3A+Add+Record+Header+support+to+Kafka+Streams+Processor+API

I am not yet sure whether the contents of a RecordHeader are propagated all the
way through the Sinks and Sources, but if they are, and if they remain attached
to the record (including null records), I may be able to ditch the
propagationWrapper for an implementation using RecordHeader. I am not yet
sure if this is doable, so if anyone understands the RecordHeader impl better
than I do, I would be happy to hear from you.
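For reference, this is roughly how the header access added by KIP-244 looks from the
Processor API; the header name and payload below are made up for illustration, and this
says nothing about whether headers survive the intermediate sink/source topics.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.streams.processor.AbstractProcessor;

// Illustrative sketch, not the KIP-213 design: propagation metadata travels as a record
// header instead of being wrapped into the value.
public class HeaderTaggingProcessor extends AbstractProcessor<String, String> {

    @Override
    public void process(String key, String value) {
        // Attach the source offset as an example piece of propagation metadata.
        context().headers().add("fk-join-offset",
                Long.toString(context().offset()).getBytes(StandardCharsets.UTF_8));
        context().forward(key, value);
    }
}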

In the meantime, let me know of any questions. I believe this PR has a lot
of potential to solve problems for other people, as I have encountered a
number of other companies in the wild all home-brewing their own solutions
to come up with a method of handling relational data in streams.

Adam


On Fri, Jul 27, 2018 at 1:45 AM, Guozhang Wang  wrote:


Hello Adam,

Thanks for rebooting the discussion of this KIP! Let me finish my pass
on the wiki and get back to you soon. Sorry for the delays...

Guozhang

On Tue, Jul 24, 2018 at 6:08 AM, Adam Bellemare 
wrote:
Let me kick this off with a few starting points that I would like to
generate some discussion on.

1) It seems to me that I will need to repartition the data twice - once
on
the foreign key, and once back to the primary key. Is there anything I am
missing here?

2) I believe I will also need to materialize 3 state stores: the
prefixScan
SS, the highwater mark SS (for out-of-order resolution) and the final
state
store, due to the workflow I have laid out. I have not thought of a
better
way yet, but would appreciate any input on this matter. I have gone back
through the mailing list for the previous discussions on this KIP, and I
did not see anything relating to resolving out-of-order compute. I cannot
see a way around the current three-SS structure that I have.

3) Caching is disabled on the prefixScan SS, as I do not know how to
reconcile the iterator obtained from RocksDB with that of the cache. In
addition, I must ensure everything is flushed before scanning. Since the
materialized prefixScan SS is under "control" of the function, I do not
anticipate this being a problem. Performance throughput will need to be
tested, but as Jan observed in his initial overview of this issue, it is
generally a surge of output events which affects performance more so than the
flush or prefixScan itself. (A rough sketch of one possible prefix-scan
approximation follows at the end of this message.)

Thoughts on any of these are greatly appreciated, since these elements
are
really the cornerstone of the whole design. I can put up the code I have
written 
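A rough sketch, under assumptions about the key layout ("<foreignKey>|<primaryKey>") and
about the store's serialized-byte ordering, of how the prefix scan in point 3 might be
approximated with the existing range() API; this is not the KIP-213 implementation.

import org.apache.kafka.streams.KeyValue;
import org.apache.kafka.streams.state.KeyValueIterator;
import org.apache.kafka.streams.state.ReadOnlyKeyValueStore;

public class PrefixScanSketch {

    // Scans all entries whose key starts with "<foreignKey>|", assuming the store's
    // byte ordering keeps such keys contiguous.
    public static void scanByForeignKey(ReadOnlyKeyValueStore<String, String> store, String foreignKey) {
        String from = foreignKey + "|";
        String to = foreignKey + "|" + Character.MAX_VALUE;   // crude upper bound for the prefix range
        try (KeyValueIterator<String, String> it = store.range(from, to)) {
            while (it.hasNext()) {
                KeyValue<String, String> entry = it.next();
                // join entry.value with the right-hand side record here
            }
        }
    }
}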

[jira] [Resolved] (KAFKA-7164) Follower should truncate after every leader epoch change

2018-08-13 Thread Jason Gustafson (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-7164.

   Resolution: Fixed
Fix Version/s: 2.1.0
   2.0.1
   1.1.2

> Follower should truncate after every leader epoch change
> 
>
> Key: KAFKA-7164
> URL: https://issues.apache.org/jira/browse/KAFKA-7164
> Project: Kafka
>  Issue Type: Bug
>Reporter: Jason Gustafson
>Assignee: Bob Barrett
>Priority: Major
> Fix For: 1.1.2, 2.0.1, 2.1.0
>
>
> Currently we skip log truncation for followers if a LeaderAndIsr request is 
> received, but the leader does not change. This can lead to log divergence if 
> the follower missed a leader change before the current known leader was 
> reelected. Basically the problem is that the leader may truncate its own log 
> prior to becoming leader again, so the follower would need to reconcile its 
> log again.
> For example, suppose that we have three replicas: r1, r2, and r3. Initially, 
> r1 is the leader in epoch 0 and writes one record at offset 0. r3 replicates 
> this successfully.
> {code}
> r1: 
>   status: leader
>   epoch: 0
>   log: [{id: 0, offset: 0, epoch:0}]
> r2: 
>   status: follower
>   epoch: 0
>   log: []
> r3: 
>   status: follower
>   epoch: 0
>   log: [{id: 0, offset: 0, epoch:0}]
> {code}
> Suppose then that r2 becomes leader in epoch 1. r1 notices the leader change 
> and truncates, but r3 for whatever reason, does not.
> {code}
> r1: 
>   status: follower
>   epoch: 1
>   log: []
> r2: 
>   status: leader
>   epoch: 1
>   log: []
> r3: 
>   status: follower
>   epoch: 0
>   log: [{offset: 0, epoch:0}]
> {code}
> Now suppose that r2 fails and r1 becomes the leader in epoch 2. Immediately 
> it writes a new record:
> {code}
> r1: 
>   status: leader
>   epoch: 2
>   log: [{id: 1, offset: 0, epoch:2}]
> r2: 
>   status: follower
>   epoch: 2
>   log: []
> r3: 
>   status: follower
>   epoch: 0
>   log: [{id: 0, offset: 0, epoch:0}]
> {code}
> If the replica continues fetching with the old epoch, we can have log 
> divergence as noted in KAFKA-6880. However, if r3 successfully receives the 
> new LeaderAndIsr request which updates the epoch to 2, but skips the 
> truncation, then the logs will stay inconsistent.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)