Build failed in Jenkins: kafka-trunk-jdk8 #4097

2019-12-10 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Simplify the timeout logic to handle  protocol in Connect


--
[...truncated 7.13 MB...]
kafka.log.LogCleanerParameterizedIntegrationTest > cleanerConfigUpdateTest[2] 
STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerConfigUpdateTest[2] 
PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[2] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[2] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[2] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[2] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerTest[2] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleanerWithMessageFormatV0[2] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleanerWithMessageFormatV0[2] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerConfigUpdateTest[3] 
STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerConfigUpdateTest[3] 
PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[3] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[3] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[3] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[3] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerTest[3] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerTest[3] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleanerWithMessageFormatV0[3] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleanerWithMessageFormatV0[3] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerConfigUpdateTest[4] 
STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerConfigUpdateTest[4] 
PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[4] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[4] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[4] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[4] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerTest[4] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > cleanerTest[4] PASSED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleanerWithMessageFormatV0[4] STARTED

kafka.log.LogCleanerParameterizedIntegrationTest > 
testCleanerWithMessageFormatV0[4] PASSED

kafka.log.LogCleanerIntegrationTest > 
testMarksPartitionsAsOfflineAndPopulatesUncleanableMetrics STARTED

kafka.log.LogCleanerIntegrationTest > 
testMarksPartitionsAsOfflineAndPopulatesUncleanableMetrics PASSED

kafka.log.LogCleanerIntegrationTest > testIsThreadFailed STARTED

kafka.log.LogCleanerIntegrationTest > testIsThreadFailed PASSED

kafka.log.LogCleanerIntegrationTest > testMaxLogCompactionLag STARTED
ERROR: Could not install GRADLE_4_10_3_HOME
java.lang.NullPointerException
at 
hudson.plugins.toolenv.ToolEnvBuildWrapper$1.buildEnvVars(ToolEnvBuildWrapper.java:46)
at hudson.model.AbstractBuild.getEnvironment(AbstractBuild.java:873)
at hudson.plugins.git.GitSCM.getParamExpandedRepos(GitSCM.java:484)
at 
hudson.plugins.git.GitSCM.compareRemoteRevisionWithImpl(GitSCM.java:693)
at hudson.plugins.git.GitSCM.compareRemoteRevisionWith(GitSCM.java:658)
at hudson.scm.SCM.compareRemoteRevisionWith(SCM.java:400)
at hudson.scm.SCM.poll(SCM.java:417)
at hudson.model.AbstractProject._poll(AbstractProject.java:1390)
at hudson.model.AbstractProject.poll(AbstractProject.java:1293)
at hudson.triggers.SCMTrigger$Runner.runPolling(SCMTrigger.java:603)
at hudson.triggers.SCMTrigger$Runner.run(SCMTrigger.java:649)
at 
hudson.util.SequentialExecutionQueue$QueueEntry.run(SequentialExecutionQueue.java:119)
at 
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
ERROR: Could not install GRADLE_4_10_3_HOME
java.lang.NullPointerException
at 

Jenkins build is back to normal : kafka-2.4-jdk8 #106

2019-12-10 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : kafka-trunk-jdk11 #1015

2019-12-10 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-2.3-jdk8 #154

2019-12-10 Thread Apache Jenkins Server
See 


Changes:

[rhauch] MINOR: Simplify the timeout logic to handle  protocol in Connect


--
[...truncated 2.98 MB...]
kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testPreferredReplicaPartitionLeaderElectionPreferredReplicaInIsrNotLive PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testReassignPartitionLeaderElectionWithNoLiveIsr STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testReassignPartitionLeaderElectionWithNoLiveIsr PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testReassignPartitionLeaderElection STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testReassignPartitionLeaderElection PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testOfflinePartitionLeaderElection STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testOfflinePartitionLeaderElection PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testPreferredReplicaPartitionLeaderElection STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testPreferredReplicaPartitionLeaderElection PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testReassignPartitionLeaderElectionWithEmptyIsr STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testReassignPartitionLeaderElectionWithEmptyIsr PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testControlledShutdownPartitionLeaderElectionAllIsrSimultaneouslyShutdown 
STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testControlledShutdownPartitionLeaderElectionAllIsrSimultaneouslyShutdown PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testOfflinePartitionLeaderElectionLastIsrOfflineUncleanLeaderElectionEnabled 
STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testOfflinePartitionLeaderElectionLastIsrOfflineUncleanLeaderElectionEnabled 
PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testPreferredReplicaPartitionLeaderElectionPreferredReplicaNotInIsrNotLive 
STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testPreferredReplicaPartitionLeaderElectionPreferredReplicaNotInIsrNotLive 
PASSED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testOfflinePartitionLeaderElectionLastIsrOfflineUncleanLeaderElectionDisabled 
STARTED

kafka.controller.PartitionLeaderElectionAlgorithmsTest > 
testOfflinePartitionLeaderElectionLastIsrOfflineUncleanLeaderElectionDisabled 
PASSED

kafka.controller.ControllerChannelManagerTest > 
testUpdateMetadataInterBrokerProtocolVersion STARTED

kafka.controller.ControllerChannelManagerTest > 
testUpdateMetadataInterBrokerProtocolVersion PASSED

kafka.controller.ControllerChannelManagerTest > testLeaderAndIsrRequestIsNew 
STARTED

kafka.controller.ControllerChannelManagerTest > testLeaderAndIsrRequestIsNew 
PASSED

kafka.controller.ControllerChannelManagerTest > 
testStopReplicaRequestsWhileTopicQueuedForDeletion STARTED

kafka.controller.ControllerChannelManagerTest > 
testStopReplicaRequestsWhileTopicQueuedForDeletion PASSED

kafka.controller.ControllerChannelManagerTest > 
testLeaderAndIsrRequestSentToLiveOrShuttingDownBrokers STARTED

kafka.controller.ControllerChannelManagerTest > 
testLeaderAndIsrRequestSentToLiveOrShuttingDownBrokers PASSED

kafka.controller.ControllerChannelManagerTest > 
testStopReplicaInterBrokerProtocolVersion STARTED

kafka.controller.ControllerChannelManagerTest > 
testStopReplicaInterBrokerProtocolVersion PASSED

kafka.controller.ControllerChannelManagerTest > 
testStopReplicaSentOnlyToLiveAndShuttingDownBrokers STARTED

kafka.controller.ControllerChannelManagerTest > 
testStopReplicaSentOnlyToLiveAndShuttingDownBrokers PASSED

kafka.controller.ControllerChannelManagerTest > testStopReplicaGroupsByBroker 
STARTED

kafka.controller.ControllerChannelManagerTest > testStopReplicaGroupsByBroker 
PASSED

kafka.controller.ControllerChannelManagerTest > 
testUpdateMetadataDoesNotIncludePartitionsWithoutLeaderAndIsr STARTED

kafka.controller.ControllerChannelManagerTest > 
testUpdateMetadataDoesNotIncludePartitionsWithoutLeaderAndIsr PASSED

kafka.controller.ControllerChannelManagerTest > 
testMixedDeleteAndNotDeleteStopReplicaRequests STARTED

kafka.controller.ControllerChannelManagerTest > 
testMixedDeleteAndNotDeleteStopReplicaRequests PASSED

kafka.controller.ControllerChannelManagerTest > 
testLeaderAndIsrInterBrokerProtocolVersion STARTED

kafka.controller.ControllerChannelManagerTest > 
testLeaderAndIsrInterBrokerProtocolVersion PASSED

kafka.controller.ControllerChannelManagerTest > testUpdateMetadataRequestSent 
STARTED

kafka.controller.ControllerChannelManagerTest > testUpdateMetadataRequestSent 
PASSED

kafka.controller.ControllerChannelManagerTest > 
testUpdateMetadataRequestDuringTopicDeletion STARTED


[jira] [Resolved] (KAFKA-9002) Flaky Test org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenCreated

2019-12-10 Thread Guozhang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-9002.
--
  Assignee: Guozhang Wang
Resolution: Fixed

> Flaky Test 
> org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenCreated
> -
>
> Key: KAFKA-9002
> URL: https://issues.apache.org/jira/browse/KAFKA-9002
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Bill Bejeck
>Assignee: Guozhang Wang
>Priority: Major
>  Labels: flaky-test, streams
>
> [https://builds.apache.org/job/kafka-pr-jdk8-scala2.11/25603/testReport/junit/org.apache.kafka.streams.integration/RegexSourceIntegrationTest/testRegexMatchesTopicsAWhenCreated/]
> {noformat}
> Error Message: java.lang.AssertionError: Condition not met within timeout 
> 15000. Stream tasks not updated
> Stacktrace: java.lang.AssertionError: Condition not met within timeout 
> 15000. Stream tasks not updated
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:376)
>   at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:336)
>   at 
> org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenCreated(RegexSourceIntegrationTest.java:175)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at 
> org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at 
> org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at 
> org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at 
> org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
>   at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:365)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
>   at 
> org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
>   at org.junit.runners.ParentRunner$4.run(ParentRunner.java:330)
>   at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:78)
>   at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:328)
>   at org.junit.runners.ParentRunner.access$100(ParentRunner.java:65)
>   at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:292)
>   at 
> org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
>   at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
>   at org.junit.rules.RunRules.evaluate(RunRules.java:20)
>   at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:305)
>   at org.junit.runners.ParentRunner.run(ParentRunner.java:412)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.runTestClass(JUnitTestClassExecutor.java:110)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:58)
>   at 
> org.gradle.api.internal.tasks.testing.junit.JUnitTestClassExecutor.execute(JUnitTestClassExecutor.java:38)
>   at 
> org.gradle.api.internal.tasks.testing.junit.AbstractJUnitTestClassProcessor.processTestClass(AbstractJUnitTestClassProcessor.java:62)
>   at 
> org.gradle.api.internal.tasks.testing.SuiteTestClassProcessor.processTestClass(SuiteTestClassProcessor.java:51)
>   at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:36)
>   at 
> org.gradle.internal.dispatch.ReflectionDispatch.dispatch(ReflectionDispatch.java:24)
>   at 
> org.gradle.internal.dispatch.ContextClassLoaderDispatch.dispatch(ContextClassLoaderDispatch.java:33)
>   at 
> org.gradle.internal.dispatch.ProxyDispatchAdapter$DispatchingInvocationHandler.invoke(ProxyDispatchAdapter.java:94)
>   at 
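
For reference, the assertion in this failure comes from the polling helper in
org.apache.kafka.test.TestUtils. A minimal sketch of the pattern follows; the
condition and the 15000 ms timeout are illustrative, not the test's actual code:

import org.apache.kafka.test.TestUtils;

public class WaitForConditionSketch {
    // Polls the condition until it returns true or maxWaitMs elapses, then
    // fails with the supplied message -- the AssertionError seen above.
    static void waitForStreamTasks() throws InterruptedException {
        TestUtils.waitForCondition(
            () -> streamTasksUpdated(),  // hypothetical predicate standing in for the test's real check
            15000,                       // maxWaitMs, matching the timeout in the error message
            "Stream tasks not updated");
    }

    static boolean streamTasksUpdated() { return true; }  // placeholder for illustration
}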

[jira] [Resolved] (KAFKA-4898) Add timeouts to streams integration tests

2019-12-10 Thread Guozhang Wang (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-4898?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-4898.
--
Resolution: Won't Fix

> Add timeouts to streams integration tests
> -
>
> Key: KAFKA-4898
> URL: https://issues.apache.org/jira/browse/KAFKA-4898
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Reporter: Colin McCabe
>Assignee: Colin McCabe
>Priority: Major
>
> Add timeouts to streams integration tests.  A few recent Jenkins jobs seem to 
> have hung in these tests.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-10 Thread Stanislav Kozlovski
Hey Viktor,

I like your latest idea regarding the replication/reassignment configs
interplay - I think it makes sense for replication to always be higher. A
small matrix of possibilities in the KIP may be useful to future readers
(users).
To be extra clear:
1. if reassignment.throttle is -1, reassignment traffic is counted with
replication traffic against replication.throttle
2. if replication.throttle is 20 and reassignment.throttle is 10, we have a
30 total throttle
Is my understanding correct?
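
(For context, a minimal sketch of how a replication throttle is applied today
through the dynamic-config API. The names reassignment.throttle and
replication.throttle above are shorthand used in this thread; the config keys
below are the existing ones, and the broker id, topic, and rate are purely
illustrative.)

import java.util.*;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class ThrottleSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            // Cap leader-side replication traffic on broker 1 at ~10 MB/s (illustrative).
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "1");
            AlterConfigOp rate = new AlterConfigOp(
                new ConfigEntry("leader.replication.throttled.rate", "10485760"),
                AlterConfigOp.OpType.SET);

            // Restrict the throttle to partition 0 / broker 1 of an example topic.
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "topic-0");
            AlterConfigOp replicas = new AlterConfigOp(
                new ConfigEntry("leader.replication.throttled.replicas", "0:1"),
                AlterConfigOp.OpType.SET);

            Map<ConfigResource, Collection<AlterConfigOp>> updates = new HashMap<>();
            updates.put(broker, Collections.singleton(rate));
            updates.put(topic, Collections.singleton(replicas));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}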

Regarding the KIP - the motivation states

> So a user is able to specify the partition and the throttle rate but it
will be applied to all non-ISR replication traffic. This is undesirable
because if a node that is being throttled falls out of ISR it would further
prevent it from catching up.

This KIP does not solve this problem, right?
Or did you mean to mention the problem where reassignment replicas would
eat up the throttle and further limit the non-ISR "original" replicas from
catching up?

Best,
Stanislav

On Tue, Dec 10, 2019 at 9:09 AM Viktor Somogyi-Vass 
wrote:

> This config will only be applied to those replicas which are reassigning
> and not yet in ISR. When they become ISR then reassignment throttling stops
> altogether and won't apply when they fall out of ISR. Specifically
> the validity of the config spans from the point when a reassignment is
> propagated by the adding_replicas field in the LeaderAndIsr request until
> the broker gets another LeaderAndIsr request saying that the new replica is
> added and in ISR. Furthermore the config will be applied only to the actual
> leader and follower so if the leader changes in the meanwhile the
> throttling changes with it (again based on the LeaderAndIsr requests).
>
> For instance when a new broker is added to offload some partitions there,
> it will be safer to use this config instead of general fetch throttling for
> this very reason: when an existing partition that is being reassigned falls
> out of ISR then it will be propagated via the LeaderAndIsr request so
> throttling also changes. This removes the need for changing the configs
> manually and would give an easy way for people to configure throttling yet
> would make better efforts to not throttle what's not needed to be throttled
> (the replica which is falling out of ISR).
>
> Viktor
>
> On Fri, Dec 6, 2019 at 5:12 PM Ismael Juma  wrote:
>
> > My concern is that we're very focused on reassignment where I think users
> > enable throttling to avoid overwhelming brokers with replica catch up
> > traffic (typically disk and/or bandwidth). The current approach achieves
> > that by not throttling ISR replication.
> >
> > The downside is that when a broker falls out of the ISR, it may suddenly
> > get throttled and never catch up. However, if the throttle can cause this
> > kind of issue, then it's broken for replicas being reassigned too, so one
> > could say that it's a configuration error.
> >
> > Do we have specific scenarios that would be solved by the proposed
> change?
> >
> > Ismael
> >
> > On Fri, Dec 6, 2019 at 2:26 AM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > wrote:
> >
> > > Thanks for the question. I think it depends on how the user will try to
> > fix
> > > it.
> > > - If they just replace the disk then I think it shouldn't count as a
> > > reassignment and should be allocated under the normal replication
> quotas.
> > > In this case there is no reassignment going on as far as I can tell,
> the
> > > broker shuts down serving replicas from that dir/disk, notifies the
> > > controller which changes the leadership. When the disk is fixed the
> > broker
> > > will be restarted to pick up the changes and it starts the replication
> > from
> > > the current leader.
> > > - If the user reassigns the partitions to other brokers then it will
> fall
> > > under the reassignment traffic.
> > > Also if the user moves a partition to a different disk it would also
> > count
> > > as normal replication as it is technically not a reassignment but an
> > > alter-replica-dir event but it's still done with the reassignment tool,
> > so
> > > I'd keep the current functionality of the
> > > --replica-alter-log-dirs-throttle.
> > > Is this aligned with your thinking?
> > >
> > > Viktor
> > >
> > > On Wed, Dec 4, 2019 at 2:47 PM Ismael Juma  wrote:
> > >
> > > > Thanks Viktor. How do we intend to handle the case where a broker
> loses
> > > its
> > > > disk and has to catch up from the beginning?
> > > >
> > > > Ismael
> > > >
> > > > On Wed, Dec 4, 2019, 4:31 AM Viktor Somogyi-Vass <
> > > viktorsomo...@gmail.com>
> > > > wrote:
> > > >
> > > > > Thanks for the notice Ismael, KAFKA-4313 fixed this issue indeed.
> > I've
> > > > > updated the KIP.
> > > > >
> > > > > Viktor
> > > > >
> > > > > On Tue, Dec 3, 2019 at 3:28 PM Ismael Juma 
> > wrote:
> > > > >
> > > > > > Hi Viktor,
> > > > > >
> > > > > > The KIP states:
> > > > > >
> > > > > > "KIP-73
> > > > > > <
> > > > > >
> > > > >
> > > >
> 

Re: [DISCUSS] KIP-551: Expose disk read and write metrics

2019-12-10 Thread Magnus Edenhill
Hi Colin,

aren't those counters (ever increasing), rather than gauges (fluctuating)?

You also mention CPU usage as a side note, you could use getrusage(2)'s
ru_utime (user) and ru_stime (sys)
to allow the broker to monitor its own CPU usage.

/Magnus

On Tue, 10 Dec 2019 at 19:33, Colin McCabe wrote:

> Hi all,
>
> I wrote a KIP about adding support for exposing disk read and write
> metrics.  Check it out here:
>
> https://cwiki.apache.org/confluence/x/sotSC
>
> best,
> Colin
>
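
(For concreteness, a minimal sketch of where such ever-increasing counters could
come from on Linux. It assumes the values are read from /proc/self/io; whether
the KIP uses exactly this source is an assumption here, and the parsing is
deliberately simplistic.)

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class DiskIoCountersSketch {
    // Returns the cumulative byte counter reported by the kernel for this
    // process, e.g. "read_bytes" or "write_bytes" from /proc/self/io.
    static long readCounter(String field) throws IOException {
        for (String line : Files.readAllLines(Paths.get("/proc/self/io"))) {
            if (line.startsWith(field + ":")) {
                return Long.parseLong(line.substring(field.length() + 1).trim());
            }
        }
        throw new IOException("field not found: " + field);
    }

    public static void main(String[] args) throws IOException {
        System.out.println("disk read bytes:  " + readCounter("read_bytes"));
        System.out.println("disk write bytes: " + readCounter("write_bytes"));
    }
}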


[DISCUSS] KIP-551: Expose disk read and write metrics

2019-12-10 Thread Colin McCabe
Hi all,

I wrote a KIP about adding support for exposing disk read and write metrics.  
Check it out here:

https://cwiki.apache.org/confluence/x/sotSC

best,
Colin


Re: [VOTE] 2.4.0 RC4

2019-12-10 Thread Adam Bellemare
- All PGP signatures are good
- All md5, sha1, and sha512 checksums pass

Initial test results:
1310 tests completed, 2 failed, 17 skipped

> Task :core:integrationTest FAILED

The failed tests:
SaslSslAdminClientIntegrationTest.testElectPreferredLeaders
SslAdminClientIntegrationTest.testSynchronousAuthorizerAclUpdatesBlockRequestThreads

Both failed due to timeout:

java.util.concurrent.ExecutionException:
org.apache.kafka.common.errors.TimeoutException: Aborted due to
timeout.

Reran the tests and both passed.

+1 from me.





On Mon, Dec 9, 2019 at 12:32 PM Manikumar  wrote:

> Hello Kafka users, developers and client-developers,
>
> This is the fifth candidate for release of Apache Kafka 2.4.0.
>
> This release includes many new features, including:
> - Allow consumers to fetch from closest replica
> - Support for incremental cooperative rebalancing to the consumer rebalance
> protocol
> - MirrorMaker 2.0 (MM2), a new multi-cluster, cross-datacenter replication
> engine
> - New Java authorizer Interface
> - Support for non-key joining in KTable
> - Administrative API for replica reassignment
> - Sticky partitioner
> - Return topic metadata and configs in CreateTopics response
> - Securing Internal connect REST endpoints
> - API to delete consumer offsets and expose it via the AdminClient.
>
> Release notes for the 2.4.0 release:
> https://home.apache.org/~manikumar/kafka-2.4.0-rc4/RELEASE_NOTES.html
>
> *** Please download, test and vote by Thursday, December 12, 9am PT
>
> Kafka's KEYS file containing PGP keys we use to sign the release:
> https://kafka.apache.org/KEYS
>
> * Release artifacts to be voted upon (source and binary):
> https://home.apache.org/~manikumar/kafka-2.4.0-rc4/
>
> * Maven artifacts to be voted upon:
> https://repository.apache.org/content/groups/staging/org/apache/kafka/
>
> * Javadoc:
> https://home.apache.org/~manikumar/kafka-2.4.0-rc4/javadoc/
>
> * Tag to be voted upon (off 2.4 branch) is the 2.4.0 tag:
> https://github.com/apache/kafka/releases/tag/2.4.0-rc4
>
> * Documentation:
> https://kafka.apache.org/24/documentation.html
>
> * Protocol:
> https://kafka.apache.org/24/protocol.html
>
> Thanks,
> Manikumar
>


[jira] [Created] (KAFKA-9289) consolidate all the streams smoke system tests

2019-12-10 Thread John Roesler (Jira)
John Roesler created KAFKA-9289:
---

 Summary: consolidate all the streams smoke system tests
 Key: KAFKA-9289
 URL: https://issues.apache.org/jira/browse/KAFKA-9289
 Project: Kafka
  Issue Type: Bug
  Components: streams
Reporter: John Roesler


As of the addition of a relational smoke test 
(https://issues.apache.org/jira/browse/KAFKA-9138) we now have three "smoke 
test" applications in Streams, which are used across a variety of system tests.

This might be fine, since each test's existence doesn't hurt the others. 
However, there are a few drawbacks to the proliferation of smoke test 
applications to consider:
* if each application is different, and they are used in different scenarios in 
the system tests, then our coverage of Streams is lower than it could be.
* the older smoke tests are written in ways that tend to make them more prone 
to false positives. In KAFKA-9138, I have adopted a different paradigm to 
attempt avoiding this fate.

For these reasons, it would be a good exercise to attempt the following plan:
1. Expand the new RelationalSmokeTest introduced in KAFKA-9138 to include all 
the stream processing operations from the other smoke tests, but written and 
verified in the same style as the RelationalSmokeTest.
2. Add a new verification component for the RelationalSmokeTest to make some 
assertions about it when EOS is disabled. But be careful to account for the 
fact that without EOS, some operators can "overcount".
3. Expand the streams_relational_smoke_test system test to also run the same 
scenarios (with and without EOS) as the other smoke system tests.
4. Remove the other system tests and associated smoke test classes.

Note that this work should certainly be broken up into a series of pull 
requests so that each one is as small as possible while also being a sensible 
contribution on its own. Perhaps following the above plan.

Also note that this would resolve 
https://issues.apache.org/jira/browse/KAFKA-8080 as well.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Build failed in Jenkins: kafka-trunk-jdk11 #1014

2019-12-10 Thread Apache Jenkins Server
See 


Changes:

[manikumar] KAFKA-9025: Add a option for path existence check in 
ZkSecurityMigrator


--
[...truncated 5.58 MB...]

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis STARTED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis PASSED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime STARTED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime PASSED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep PASSED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnIsOpen 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldReturnName 
PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldPutWindowStartTimestampWithUnknownTimestamp PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent STARTED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > 
shouldReturnIsPersistent PASSED

org.apache.kafka.streams.internals.WindowStoreFacadeTest > shouldForwardClose 
STARTED


Re: Proposal for EmbeddedKafkaCluster: Support JUnit 5 Extension in addition to JUnit 4 Rule

2019-12-10 Thread Matthias Merdes
Hi John and Thomas,

Thanks a lot for your detailed and well-informed replies.
I am adding some comments below.


On 6. Dec 2019, at 17:42, John Roesler wrote:

Hi Matthias!

Thanks for the note, and the kind sentiment.

We're always open to improvements like this, so your contribution would 
certainly be welcome.

Just FYI, "public interface" changes need to go through the KIP process (see 
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals 
). You could of course, open an exploratory PR and ask here for comments before 
deciding if you want to make a KIP, if you prefer. Personally, I often find it 
clarifying to put together a PR concurrently with the KIP, because it helps to 
flush out any "devils in the details", and also can help the KIP discussion.


So far I was under the impression that KIPs were only for major changes.
From GitHub projects I am used to discussions directly in the issue.

Just wanted to make you aware of the process. I'm happy to help you navigate 
this process, by the way.

Regarding the specific proposal, the number one question in my mind is 
compatibility... I.e., do we add new dependencies, or change dependencies, that 
may conflict with what's already on users' classpaths? Would the change result 
in any source-code or binary incompatibility with previous versions? Etc... you 
get the idea.


In this case the additional dependency would be on the JUnit Jupiter API which 
was intended to work alongside JUnit 4 in IDEs and build tools.
So there should not be compatibility problems. The existing Rule could be 
slightly refactored such that JUnit 4 users would not need the JUnit 5 
dependency (and vice versa) at test time.

We can make such changes, but it's a _lot_ easier to support doing it in a 
major release. Right now, that would be 3.0, which is not currently planned. 
We're currently working toward 2.5, and then 2.6, and so on, essentially 
waiting for a good reason to bump up to 3.0.

All that said, EmbeddedKafkaCluster is an interesting case, because it's not 
actually a public API!
When Bill wrote the book, he included it (I assume) because there was no other 
practical approach to testing available.
However, since the publication, we have added an official integration testing 
framework called TopologyTestDriver. When people ask me for testing advice, I 
tell them:
1. Use TopologyTestDriver (for Streams testing)
2. If you need a "real" broker for your test, then set up a pre/post 
integration test hook to run Kafka independently (e.g., with Maven).
3. If that's not practical, then _copy_ EmbeddedKafkaCluster into your own 
project, don't depend on an internal test artifact.


TopologyTestDriver is very useful for doing unit tests (in the narrow sense of 
the word) and should of course be used wherever possible.
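
(A minimal sketch of that style of test, for readers who have not seen it; the
topology, topic names, and serdes below are made up for illustration:)

import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;
import org.apache.kafka.streams.TestInputTopic;
import org.apache.kafka.streams.TestOutputTopic;
import org.apache.kafka.streams.TopologyTestDriver;
import org.apache.kafka.streams.kstream.KStream;

public class UppercaseTopologySketch {
    public static void main(String[] args) {
        // A tiny topology: read from "input", upper-case the values, write to "output".
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> stream = builder.stream("input");
        stream.mapValues(v -> v.toUpperCase()).to("output");

        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "sketch-app");
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "dummy:1234"); // never contacted
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        // No broker involved: the driver feeds records through the topology in-process.
        try (TopologyTestDriver driver = new TopologyTestDriver(builder.build(), props)) {
            TestInputTopic<String, String> in = driver.createInputTopic(
                "input", Serdes.String().serializer(), Serdes.String().serializer());
            TestOutputTopic<String, String> out = driver.createOutputTopic(
                "output", Serdes.String().deserializer(), Serdes.String().deserializer());
            in.pipeInput("key", "hello");
            System.out.println(out.readKeyValue()); // prints KeyValue(key, HELLO)
        }
    }
}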

The EmbeddedKafkaCluster would be aimed at integration testing - with or 
without streams.
Personally, I have had mixed experiences with running ‘middleware’ externally 
to tests (a Docker container, or some other kind of external process).
Having control and configuration right within your tests can be very helpful.
When I first suggested the JUnit 5 variant I was not aware that it is not 
‘public’, because I had only checked the book and Kafka’s own sources and missed 
this fact.
From your answers I understood that you are not convinced that publishing 
EmbeddedKafkaCluster is a good idea at all.
Maybe this could be reevaluated when (if at all) the Kafka test code base is 
migrated to JUnit 5.

I guess in the meantime the similar support from spring-kafka-test 
(EmbeddedKafkaBroker) can be used -
at least when working with the Spring stack.




To me, this means that we essentially have free rein to make changes to 
EmbeddedKafkaCluster, since it _should_ only
be used for internal testing. In that case, I would just evaluate the merits of 
bumping all the tests in Apache Kafka up to JUnit 5. Although that's not a
public API, it might be a big enough change to the process to justify a design 
document and project-wide discussion.


Upgrading the Kafka test code base to JUnit 5 should not be too hard, with 
@(Class)Rule usage needing some work.
The rules used here (Timeout, TestName, TempFolder, Mockito, and EasyMock) all 
have JUnit 5 replacements.
So at least the Java part should mostly be busy work only (I cannot judge the 
Scala part).
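
As a rough illustration, a minimal sketch of what a JUnit 5 extension wrapping
the cluster could look like (the EmbeddedKafkaCluster constructor and the
start()/stop() calls are assumptions standing in for whatever the real internal
class exposes):

import org.junit.jupiter.api.extension.AfterAllCallback;
import org.junit.jupiter.api.extension.BeforeAllCallback;
import org.junit.jupiter.api.extension.ExtensionContext;

// Hypothetical sketch: one embedded cluster per test class, started before all
// tests and stopped after all tests. Used via @ExtendWith(EmbeddedKafkaClusterExtension.class).
public class EmbeddedKafkaClusterExtension implements BeforeAllCallback, AfterAllCallback {

    private EmbeddedKafkaCluster cluster;  // assumed type, standing in for the internal test class

    @Override
    public void beforeAll(ExtensionContext context) throws Exception {
        cluster = new EmbeddedKafkaCluster(1);  // assumed constructor: number of brokers
        cluster.start();                        // assumed lifecycle method
    }

    @Override
    public void afterAll(ExtensionContext context) throws Exception {
        cluster.stop();                         // assumed lifecycle method
    }

    public EmbeddedKafkaCluster cluster() {
        return cluster;
    }
}
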
If you ever decide to upgrade to JUnit 5 let me know and I would be happy to 
help. :)

Thanks again,
Cheers,
Matthias




Well, I guess it turns out I had more to say than I initially thought... Sorry 
for rambling a bit.

What are your thoughts?
-John

On Fri, Dec 6, 2019, at 07:05, Matthias Merdes wrote:
Hi all,

when reading ‘Kafka Streams in Action’ I especially enjoyed the
thoughtful treatment of
mocking, unit, and integration tests.
Integration testing in the book (and in the Kafka codebase) is done
using the @ClassRule-annotated EmbeddedKafkaCluster.
JUnit 5 Jupiter 

Build failed in Jenkins: kafka-trunk-jdk8 #4095

2019-12-10 Thread Apache Jenkins Server
See 


Changes:

[manikumar] KAFKA-9025: Add a option for path existence check in 
ZkSecurityMigrator


--
[...truncated 2.76 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis STARTED

org.apache.kafka.streams.MockTimeTest > shouldGetNanosAsMillis PASSED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime STARTED

org.apache.kafka.streams.MockTimeTest > shouldSetStartTime PASSED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldNotAllowNegativeSleep PASSED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep STARTED

org.apache.kafka.streams.MockTimeTest > shouldAdvanceTimeOnSleep PASSED


Jenkins build is back to normal : kafka-trunk-jdk8 #4094

2019-12-10 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-435: Internal Partition Reassignment Batching

2019-12-10 Thread Viktor Somogyi-Vass
Hey Colin,

Sure, that's understandable, we should keep the project stable first and
foremost. I'll work out the details of the client side reassignment changes
so perhaps we'll have a better, clearer picture about this and we'll decide
when we feel the reassignment API can take more changes in :).

Viktor
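
(For readers who have not tried it yet, a minimal sketch of the officially
supported reassignment API that the thread below refers to; the topic,
partition, and broker ids are made up for illustration:)

import java.util.*;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewPartitionReassignment;
import org.apache.kafka.clients.admin.PartitionReassignment;
import org.apache.kafka.common.TopicPartition;

public class ReassignmentSketch {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (Admin admin = Admin.create(props)) {
            TopicPartition tp = new TopicPartition("topic-0", 0);

            // Move the partition to brokers 1, 2, 3 (illustrative ids).
            admin.alterPartitionReassignments(Collections.singletonMap(
                tp, Optional.of(new NewPartitionReassignment(Arrays.asList(1, 2, 3))))).all().get();

            // Observe in-flight reassignments.
            Map<TopicPartition, PartitionReassignment> ongoing =
                admin.listPartitionReassignments().reassignments().get();
            ongoing.forEach((partition, r) -> System.out.println(
                partition + " adding=" + r.addingReplicas() + " removing=" + r.removingReplicas()));

            // Cancel by passing an empty Optional for the partition.
            admin.alterPartitionReassignments(
                Collections.singletonMap(tp, Optional.<NewPartitionReassignment>empty())).all().get();
        }
    }
}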

On Mon, Dec 9, 2019 at 8:16 PM Colin McCabe  wrote:

> Hi Viktor,
>
> Thanks for the thoughtful reply.
>
> I'm actually not a PMC on the Apache Kafka project, although I hope to
> become one in the future :)  In any case, the tasks that the PMC tackles
> tend to be more related to things like project policies, committers, legal
> stuff, brand stuff, and so on.  I don't think this discussion about
> reassignment needs to be a PMC discussion.
>
> I think it helps to stay focused on the pain points that users are
> experiencing.  Prior to KIP-455, we had a lot of them.  For example, there
> was no officially supported way to cancel reassignments or change them
> after they were created.  There was no official API so behavior could
> change between versions without much warning.  The tooling needed to depend
> on ZooKeeper, and many tools pulled in internal classes from Kafka.
>
> We've fixed these pain points now with the new API, but it will take time
> for these changes to propagate to all the external tools.  I think it would
> be good to see how the new API works out before we think about another
> major revision.  There will probably be pain points with the new API, but
> it's not completely clear what they will be yet.
>
> best,
> Colin
>
>
> On Fri, Dec 6, 2019, at 08:55, Viktor Somogyi-Vass wrote:
> > Hi Colin,
> >
> > Thanks for the honest answer. As a bottom line I think a better
> > reassignment logic should be included with Kafka somehow instead of
> relying
> > on external tools. It makes the system more mature on its own and
> > standardizes what others implemented many times.
> > Actually I also agree that decoupling this complexity would come with
> > benefits but I haven't looked into this direction yet. And frankly with
> > KIP-500 there will be a lot of changes in the controller itself.
> >
> > My overall vision with this is that there could be an API of some sort at
> > some point where you'd be able to specify how would you want to do your
> > rebalance, what plugins you'd use, what would be the default, etc.. I
> > really don't mind if this functionality is on the client or on the broker
> > as long as it gives a consistent experience for the users. For a better
> > overall user experience though I think this is a basic building block and
> > it made sense to me to put both functionality onto the server side as
> there
> > is nothing much to be configured on batching except partition, replica
> > batch sizes and maybe leader movements. Also users could very well limit
> > the blast radius of a reassignment by applying throttling too where they
> > don't care about the order of reassignments or any special requirements
> > just about the fact that it's sustainable in the cluster and finishes
> > without causing troubles. With small batches and throttling individual
> > partition reassignments finish earlier because you divide the given
> > bandwidth among fewer participants at a time, so it's more effective. By
> > just tweaking these knobs (batch size and throttling) we could achieve a
> > significantly better and more effective rebalance.
> >
> > Policies I think can be somewhat parallel to this as they may require
> some
> > information from metrics and you're right that the controller could
> gather
> > these easier but it's harder to make plugins, although I don't think it's
> > impossible to give a good framework for this. Also reassignment can be
> > extracted from the controller (Stan has already looked into separating
> the
> > reassignment logic from the KafkaController class in #7339) so I think we
> > could come up with a construct that is decoupled from the controller but
> > still resides on the broker side. Furthermore policies could also be
> > dynamically configured if needed.
> >
> > Any change that we make should be backward compatible and I'd default
> them
> > to the current behavior which means external tools wouldn't have to take
> > extra steps but those who don't use any external tools would benefit
> these
> > changes.
> >
> > I think it would be useful to discuss the overall vision on reassignments
> > as it would help us to decide whether this thing goes on the client or
> the
> > broker side so allow me to pose some questions to you and other PMCs as
> I'd
> > like to get some clarity and the overall stance on this so I can align my
> > work better.
> >
> > What would be the overall goal that we want to achieve? Do we want to
> give
> > a solution for automatic partition load balancing or just a framework
> that
> > standardizes what many others implemented? Or do we want to adopt a
> > solution or recommend a specific one?
> > - For the first it might be 

Re: [DISCUSS] KIP-542: Partition Reassignment Throttling

2019-12-10 Thread Viktor Somogyi-Vass
This config will only be applied to those replicas which are reassigning
and not yet in ISR. When they become ISR then reassignment throttling stops
altogether and won't apply when they fall out of ISR. Specifically
the validity of the config spans from the point when a reassignment is
propagated by the adding_replicas field in the LeaderAndIsr request until
the broker gets another LeaderAndIsr request saying that the new replica is
added and in ISR. Furthermore the config will be applied only to the actual
leader and follower so if the leader changes in the meanwhile the
throttling changes with it (again based on the LeaderAndIsr requests).

For instance when a new broker is added to offload some partitions there,
it will be safer to use this config instead of general fetch throttling for
this very reason: when an existing partition that is being reassigned falls
out of ISR then it will be propagated via the LeaderAndIsr request so
throttling also changes. This removes the need for changing the configs
manually and would give an easy way for people to configure throttling yet
would make better efforts to not throttle what's not needed to be throttled
(the replica which is falling out of ISR).

Viktor

On Fri, Dec 6, 2019 at 5:12 PM Ismael Juma  wrote:

> My concern is that we're very focused on reassignment where I think users
> enable throttling to avoid overwhelming brokers with replica catch up
> traffic (typically disk and/or bandwidth). The current approach achieves
> that by not throttling ISR replication.
>
> The downside is that when a broker falls out of the ISR, it may suddenly
> get throttled and never catch up. However, if the throttle can cause this
> kind of issue, then it's broken for replicas being reassigned too, so one
> could say that it's a configuration error.
>
> Do we have specific scenarios that would be solved by the proposed change?
>
> Ismael
>
> On Fri, Dec 6, 2019 at 2:26 AM Viktor Somogyi-Vass <
> viktorsomo...@gmail.com>
> wrote:
>
> > Thanks for the question. I think it depends on how the user will try to
> fix
> > it.
> > - If they just replace the disk then I think it shouldn't count as a
> > reassignment and should be allocated under the normal replication quotas.
> > In this case there is no reassignment going on as far as I can tell, the
> > broker shuts down serving replicas from that dir/disk, notifies the
> > controller which changes the leadership. When the disk is fixed the
> broker
> > will be restarted to pick up the changes and it starts the replication
> from
> > the current leader.
> > - If the user reassigns the partitions to other brokers then it will fall
> > under the reassignment traffic.
> > Also if the user moves a partition to a different disk it would also
> count
> > as normal replication as it is technically not a reassignment but an
> > alter-replica-dir event but it's still done with the reassignment tool,
> so
> > I'd keep the current functionality of the
> > --replica-alter-log-dirs-throttle.
> > Is this aligned with your thinking?
> >
> > Viktor
> >
> > On Wed, Dec 4, 2019 at 2:47 PM Ismael Juma  wrote:
> >
> > > Thanks Viktor. How do we intend to handle the case where a broker loses
> > its
> > > disk and has to catch up from the beginning?
> > >
> > > Ismael
> > >
> > > On Wed, Dec 4, 2019, 4:31 AM Viktor Somogyi-Vass <
> > viktorsomo...@gmail.com>
> > > wrote:
> > >
> > > > Thanks for the notice Ismael, KAFKA-4313 fixed this issue indeed.
> I've
> > > > updated the KIP.
> > > >
> > > > Viktor
> > > >
> > > > On Tue, Dec 3, 2019 at 3:28 PM Ismael Juma 
> wrote:
> > > >
> > > > > Hi Viktor,
> > > > >
> > > > > The KIP states:
> > > > >
> > > > > "KIP-73
> > > > > <
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-73+Replication+Quotas
> > > > > >
> > > > > added
> > > > > quotas for replication but it doesn't separate normal replication
> > > traffic
> > > > > from reassignment. So a user is able to specify the partition and
> the
> > > > > throttle rate but it will be applied to both ISR and non-ISR
> traffic"
> > > > >
> > > > > This is not true, ISR traffic is not throttled.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Thu, Oct 24, 2019 at 5:38 AM Viktor Somogyi-Vass <
> > > > > viktorsomo...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi People,
> > > > > >
> > > > > > I've created a KIP to improve replication quotas by handling
> > > > reassignment
> > > > > > related throttling as a separate case with its own configurable
> > > limits
> > > > > and
> > > > > > change the kafka-reassign-partitions tool to use these new
> configs
> > > > going
> > > > > > forward.
> > > > > > Please have a look, I'd be happy to receive any feedback and
> answer
> > > > > > all your questions.
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-542%3A+Partition+Reassignment+Throttling
> > > > > >
> > > > > > Thanks,
> > > > > > Viktor
> > > > > >
> > > > >
> > 

[jira] [Resolved] (KAFKA-9025) ZkSecurityMigrator not working with zookeeper chroot

2019-12-10 Thread Manikumar (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-9025?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar resolved KAFKA-9025.
--
Fix Version/s: 2.5.0
   Resolution: Fixed

Issue resolved by pull request 7618
[https://github.com/apache/kafka/pull/7618]

> ZkSecurityMigrator not working with zookeeper chroot
> 
>
> Key: KAFKA-9025
> URL: https://issues.apache.org/jira/browse/KAFKA-9025
> Project: Kafka
>  Issue Type: Bug
>  Components: security
>Affects Versions: 2.3.0
> Environment: Reproduced at least on rhel and macos
>Reporter: Laurent Millet
>Assignee: huxihx
>Priority: Major
> Fix For: 2.5.0
>
>
> The ZkSecurityMigrator tool fails to handle installations where kafka is 
> configured with a zookeeper chroot (as opposed to using /, the default):
>  * ACLs on existing nodes are not modified (they are left world-modifiable)
>  * New nodes created by the tool are created directly under the zookeeper 
> root instead of under the chroot
> The tool does not emit any message, thus the unsuspecting user can only 
> assume everything went well, when in fact it did not and znodes are still not 
> secure:
> kafka_2.12-2.3.0 $ bin/zookeeper-security-migration.sh --zookeeper.acl=secure 
> --zookeeper.connect=localhost:2181
> kafka_2.12-2.3.0 $
> For example, with kafka configured to use /kafka as chroot 
> (zookeeper.connect=localhost:2181/kafka), the following is observed:
>  * Before running the tool
>  ** Zookeeper top-level nodes (all kafka nodes are under /kafka):
> [zk: localhost:2181(CONNECTED) 1] ls /
> [kafka, zookeeper]
>  ** Example node ACL:
> [zk: localhost:2181(CONNECTED) 2] getAcl /kafka/brokers
> 'world,'anyone
> : cdrwa
>  * After running the tool:
>  ** Zookeeper top-level nodes (kafka nodes created by the tool appeared here):
> [zk: localhost:2181(CONNECTED) 3] ls /
> [admin, brokers, cluster, config, controller, controller_epoch, 
> delegation_token, isr_change_notification, kafka, kafka-acl, 
> kafka-acl-changes, kafka-acl-extended, kafka-acl-extended-changes, 
> latest_producer_id_block, log_dir_event_notification, zookeeper]
>  ** Example node ACL:
> [zk: localhost:2181(CONNECTED) 4] getAcl /kafka/brokers
> 'world,'anyone
> : cdrwa
>  ** New node ACL:
> [zk: localhost:2181(CONNECTED) 5] getAcl /brokers
> 'sasl,'kafka
> : cdrwa
> 'world,'anyone
> : r
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)