Jenkins build is back to normal : kafka-trunk-jdk11 #1695

2020-08-07 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk8 #4770

2020-08-07 Thread Apache Jenkins Server
See 


Changes:

[github] KAFKA-10371; Partition reassignments can result in crashed


--
[...truncated 3.19 MB...]

org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentForCompareValueTimestamp PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordForCompareValue STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordForCompareValue PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullForCompareValueTimestamp PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfKeyIsDifferentForCompareKeyValueTimestamp PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfKeyIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfValueIsEqualWithNullForCompareValueWithProducerRecord PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullReverseForCompareValueTimestamp PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullReverseForCompareValue STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullReverseForCompareValue PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfTimestampIsDifferentForCompareValueTimestamp STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfTimestampIsDifferentForCompareValueTimestamp PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfKeyAndValueAndTimestampIsEqualWithNullForCompareKeyValueTimestampWithProducerRecord PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentForCompareKeyValueTimestampWithProducerRecord PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordWithExpectedRecordForCompareValueTimestamp PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullExpectedRecordForCompareValue STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullExpectedRecordForCompareValue PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordForCompareKeyValue STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldNotAllowNullProducerRecordForCompareKeyValue PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullReversForCompareKeyValueWithProducerRecord PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestamp PASSED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord STARTED
org.apache.kafka.streams.test.OutputVerifierTest > shouldFailIfValueIsDifferentWithNullForCompareKeyValueTimestampWithProducerRecord PASSED


Build failed in Jenkins: kafka-trunk-jdk11 #1694

2020-08-07 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Ensure a reason is logged for all segment deletion operations


--
[...truncated 3.21 MB...]
org.apache.kafka.streams.TopologyTestDriverTest > shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldAllowPrePopulatingStatesStoresWithCachingEnabled[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnCorrectPersistentStoreTypeOnly[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldRespectTaskIdling[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldUseSourceSpecificDeserializers[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldUseSourceSpecificDeserializers[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStores[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldNotCreateStateDirectoryForStatelessTopology[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldApplyGlobalUpdatesCorrectlyInRecursiveTopologies[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStoresNames[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnAllStoresNames[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldPassRecordHeadersIntoSerializersAndDeserializers[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldProcessConsumerRecordList[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldProcessConsumerRecordList[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldUseSinkSpecificSerializers[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldUseSinkSpecificSerializers[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldFlushStoreForFirstInput[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldFlushStoreForFirstInput[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldProcessFromSourceThatMatchPattern[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldProcessFromSourceThatMatchPattern[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldCaptureSinkTopicNamesIfWrittenInto[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldUpdateStoreForNewKey[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldUpdateStoreForNewKey[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldSendRecordViaCorrectSourceTopicDeprecated[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldPunctuateOnWallClockTime[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldPunctuateOnWallClockTime[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldSetRecordMetadata[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldNotUpdateStoreForLargerValue[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldNotUpdateStoreForLargerValue[Eos enabled = false] PASSED
org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] STARTED
org.apache.kafka.streams.TopologyTestDriverTest > shouldReturnCorrectInMemoryStoreTypeOnly[Eos enabled = false] PASSED


[jira] [Resolved] (KAFKA-10371) Partition reassignments can result in crashed ReplicaFetcherThreads.

2020-08-07 Thread Jason Gustafson (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jason Gustafson resolved KAFKA-10371.
-
Resolution: Fixed

> Partition reassignments can result in crashed ReplicaFetcherThreads.
> 
>
> Key: KAFKA-10371
> URL: https://issues.apache.org/jira/browse/KAFKA-10371
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Reporter: Steve Rodrigues
>Assignee: David Jacot
>Priority: Critical
>
> A Kafka system doing partition reassignments got stuck with the reassignment 
> partially done and the system with a non-zero number of URPs and increasing 
> max lag.
> Looking in the logs, we see: 
> {noformat}
> [ERROR] 2020-07-31 21:22:23,984 [ReplicaFetcherThread-0-3] 
> kafka.server.ReplicaFetcherThread - [ReplicaFetcher replicaId=4, leaderId=3, 
> fetcherId=0] Error due to
> org.apache.kafka.common.errors.NotLeaderOrFollowerException: Error while 
> fetching partition state for foo
> [INFO] 2020-07-31 21:22:23,986 [ReplicaFetcherThread-0-3] 
> kafka.server.ReplicaFetcherThread - [ReplicaFetcher replicaId=4, leaderId=3, 
> fetcherId=0] Stopped
> {noformat}
> Investigating further and with some helpful changes to the exception (which 
> was not generating a stack trace because it was a client-side exception), we 
> see on a test run:
> {noformat}
> [2020-08-06 19:58:21,592] ERROR [ReplicaFetcher replicaId=2, leaderId=1, 
> fetcherId=0] Error due to (kafka.server.ReplicaFetcherThread)
> org.apache.kafka.common.errors.NotLeaderOrFollowerException: Error while 
> fetching partition state for topic-test-topic-85
> at org.apache.kafka.common.protocol.Errors.exception(Errors.java:415)
> at 
> kafka.server.ReplicaManager.getPartitionOrException(ReplicaManager.scala:645)
> at 
> kafka.server.ReplicaManager.localLogOrException(ReplicaManager.scala:672)
> at 
> kafka.server.ReplicaFetcherThread.logStartOffset(ReplicaFetcherThread.scala:133)
> at 
> kafka.server.ReplicaFetcherThread.$anonfun$buildFetch$1(ReplicaFetcherThread.scala:316)
> at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:553)
> at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:551)
> at scala.collection.AbstractIterable.foreach(Iterable.scala:920)
> at 
> kafka.server.ReplicaFetcherThread.buildFetch(ReplicaFetcherThread.scala:309)
> {noformat}
> It appears that the fetcher is attempting to fetch for a partition that has 
> been getting reassigned away. From further investigation, it seems that in 
> KAFKA-10002 the StopReplica code was changed from:
> 1. Remove partition from fetcher
> 2. Remove partition from partition map
> to the other way around, but now the fetcher may race and attempt to build a 
> fetch for a partition that's no longer mapped.  In particular, since the 
> logOrException code is being called from logStartOffset which isn't protected 
> against NotLeaderOrFollowerException, just against KafkaStorageException, the 
> exception isn't caught and throws all the way out, killing the replica 
> fetcher thread.
> We need to switch this back.
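
A minimal sketch of the ordering described above, with hypothetical helper
names rather than the actual ReplicaManager/ReplicaFetcherManager API:

{code:java}
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: hypothetical types and names, not the real broker code.
public class StopReplicaOrderingSketch {
    // Stand-ins for the broker's partition map and the set of fetched partitions.
    private final Map<String, Object> allPartitions = new ConcurrentHashMap<>();
    private final Set<String> fetchedPartitions = ConcurrentHashMap.newKeySet();

    void stopReplica(String topicPartition) {
        // 1. Remove the partition from the fetcher first, so no fetch is
        //    built for it after this point.
        fetchedPartitions.remove(topicPartition);
        // 2. Only then remove it from the partition map. Doing these steps in
        //    the opposite order (as after KAFKA-10002) lets the fetcher race,
        //    look up a partition that is no longer mapped, and throw
        //    NotLeaderOrFollowerException out of logStartOffset(), killing
        //    the fetcher thread.
        allPartitions.remove(topicPartition);
    }
}
{code}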



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


Jenkins build is back to normal : kafka-trunk-jdk14 #345

2020-08-07 Thread Apache Jenkins Server
See 




Jenkins build is back to normal : kafka-trunk-jdk11 #1693

2020-08-07 Thread Apache Jenkins Server
See 




Re: [DISCUSS] KIP-631: The Quorum-based Kafka Controller

2020-08-07 Thread Jose Garcia Sancio
Hi Unmesh,

Very cool prototype!

Hi Colin,

The KIP proposes a record called IsrChange which includes the
partition, topic, isr, leader and leader epoch. During normal
operation ISR changes do not result in leader changes. Similarly,
leader changes do not necessarily involve ISR changes. The controller
implementation that uses ZK modeled them together because:
1. All of this information is stored in one znode.
2. ZK's optimistic lock requires that you specify the new value completely.
3. The change to that znode was being performed by both the controller
and the leader.

None of these reasons are true in KIP-500. Have we considered having
two different records? For example:

1. IsrChange record which includes topic, partition, and isr.
2. LeaderChange record which includes topic, partition, leader, and leader epoch.

I suspect that making this change will also require changing the
message AlterIsrRequest introduced in KIP-497: Add inter-broker API to
alter ISR.
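
For illustration only, a hedged sketch of what the two proposed records could
carry; the field layout is hypothetical and not the KIP's actual record schemas:

{code:java}
// Sketch of the proposed split; field names are illustrative only.
public class PartitionRecordsSketch {
    // Written when only the ISR changes.
    static final class IsrChangeRecord {
        final String topic;
        final int partition;
        final int[] isr;          // in-sync replica ids
        IsrChangeRecord(String topic, int partition, int[] isr) {
            this.topic = topic;
            this.partition = partition;
            this.isr = isr;
        }
    }

    // Written when the leader (and therefore the leader epoch) changes.
    static final class LeaderChangeRecord {
        final String topic;
        final int partition;
        final int leader;         // broker id of the new leader
        final int leaderEpoch;
        LeaderChangeRecord(String topic, int partition, int leader, int leaderEpoch) {
            this.topic = topic;
            this.partition = partition;
            this.leader = leader;
            this.leaderEpoch = leaderEpoch;
        }
    }
}
{code}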

Thanks
-Jose


Re: [DISCUSS] KIP-630: Kafka Raft Snapshot

2020-08-07 Thread Jose Garcia Sancio
Thanks for your feedback Jun.

Here are my changes to the KIP:
https://cwiki.apache.org/confluence/pages/diffpagesbyversion.action?pageId=158864763=21=20

My comments are below...

On Wed, Aug 5, 2020 at 1:59 PM Jun Rao  wrote:
>
> Hi, Jose,
>
> Thanks for the KIP. A few comments below.
>
> 10. I agree with Jason that it's useful to document the motivation a bit
> clearer. Regarding semantic/performance, one benefit of snapshotting is
> that it allows changes to be encoded incrementally instead of using the
> full post image. For example, in KIP-631, each partition has multiple
> fields like assigned replicas, leader, epoch, isr, etc. If only isr is
> changed, the snapshotting approach allows the change to be represented with
> just the new value in isr. Compaction will require all existing fields to
> be included in order to represent just an isr change. This is just because
> we can customize the combining logic with snapshotting.
>

Yes. Right now the IsrChange record from KIP-631 has both the ISR and
the leader and epoch. I think we can split this record into two
records:
1. ISR change that includes the topic, partition, and isr.
2. Leader change that includes the topic, partition, leader and leader epoch.
I'll bring this up in the discussion thread for that KIP.

> As for the
> performance benefit, I guess in theory snapshotting allows the snapshot to
> be updated in-place incrementally without having to read the full state in
> the snapshot. BTW, during compaction, we only read the cleaned data once
> instead of 3 times.
>

Doesn't compaction need to read the cleaned records to check whether each
key is in the map of keys to offsets? I made the following changes to
the KIP:

2. With log compaction the broker needs to
  a. read 1MB/s from the head of the log to update the in-memory state
  b. read 1MB/s to update the map of keys to offsets
  c. read 3MB/s (100MB from the already compacted log, 50MB from the
new key-value records) from the older segments. The log will
accumulate 50MB of changes (about 50 seconds' worth) before compacting
because the default configuration has a minimum clean ratio of 50%.

The "100MB in the already compacted log" are these cleaned records.
Let me know what you think and if I am missing something.
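
To make the arithmetic above concrete, here is a small sketch that just
restates the numbers from the example (1MB/s of new records, 100MB already
compacted, 50MB of accumulated changes per cleaning pass); the values are
assumptions from this discussion, not measurements:

{code:java}
// Back-of-the-envelope sketch of the compaction read cost described above.
public class CompactionReadCostSketch {
    public static void main(String[] args) {
        double writeRateMBperSec = 1.0;  // new key-value records
        double compactedLogMB = 100.0;   // already-compacted portion of the log
        double dirtyMB = 50.0;           // accumulates before cleaning (50% minimum clean ratio)

        // 50 MB of changes at 1 MB/s is about 50 seconds between cleaning passes.
        double secondsBetweenCleanings = dirtyMB / writeRateMBperSec;

        // Each pass re-reads the 100 MB compacted log plus the 50 MB of new
        // records: 150 MB every ~50 seconds, or roughly 3 MB/s on average.
        double cleaningReadMBperSec = (compactedLogMB + dirtyMB) / secondsBetweenCleanings;

        System.out.printf("changes per pass: %.0f MB every %.0f s%n", dirtyMB, secondsBetweenCleanings);
        System.out.printf("cleaning read rate: %.1f MB/s%n", cleaningReadMBperSec);
        // Plus ~1 MB/s to read the head of the log for the in-memory state and
        // ~1 MB/s to build the map of keys to offsets, per items (a) and (b) above.
    }
}
{code}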

> 11. The KIP mentions topic id. Currently there is no topic id. Does this
> KIP depend on KIP-516?
>

For the purpose of measuring the impact, I was using the records
proposed by KIP-631. This KIP doesn't depend on KIP-516 or KIP-631 for
its design and implementation; I was just referencing that KIP in the
motivation and analysis. The KIP only assumes the changes in KIP-595,
which has been approved but is not yet part of trunk.

In the overview section the KIP mentions: "This KIP assumes that
KIP-595 has been approved and implemented. "

> 12. Is there a need to keep more than 1 snapshot? It seems we always expose
> the latest snapshot to clients.
>

The KIP proposes keeping more than one snapshot so as not to invalidate any
pending/concurrent `FetchSnapshot` requests that are attempting to fetch a
snapshot that could otherwise be deleted. I'll remove this wording, as the first
version of this implementation probably won't have this feature
since it requires extra coordination. The implementation will still allow
for multiple snapshots because generating a snapshot is not atomic
with respect to increasing the LBO.


> 13. "During leader election, followers with incomplete or missing snapshot
> will send a vote request and response as if they had an empty log." Hmm, a
> follower may not have a snapshot created, but that doesn't imply its log is
> empty.
>

Yes. I fixed the "Validation of Snapshot and Log" section and that sentence.
I basically added an additional condition where a snapshot is not
required if the LBO is 0.

> 14. "LBO is max.replication.lag.ms old." Not sure that I follow. How do we
> compare an offset to a time?
>

Yeah. This may be hard to implement. I am trying to avoid invalidating
followers and observers by aggressively deleting an offset/record
which they are trying to fetch. It is possible that
`controller.snapshot.minimum.records` is good enough to throttle
increasing LBO.
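
As a hedged illustration of that idea (hypothetical names, not the KIP's
actual design): advance the LBO only up to the newest snapshot, and only once
that snapshot is far enough ahead, rather than using a time-based rule:

{code:java}
// Sketch only: hypothetical fields and names, not the KIP-630 implementation.
public class LogBeginOffsetThrottleSketch {
    long logBeginOffset;          // current log begin offset (LBO)
    long latestSnapshotEndOffset; // last offset covered by the newest snapshot
    long snapshotMinimumRecords;  // e.g. controller.snapshot.minimum.records

    /**
     * Advance the LBO only up to the newest snapshot, and only once that
     * snapshot is at least snapshotMinimumRecords ahead of the current LBO.
     */
    boolean maybeAdvanceLogBeginOffset() {
        if (latestSnapshotEndOffset - logBeginOffset >= snapshotMinimumRecords) {
            logBeginOffset = latestSnapshotEndOffset + 1;
            return true;
        }
        return false;
    }
}
{code}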

> 15. "Followers and observers will increase their log begin offset to the
> value sent on the fetch response as long as the local state machine has
> generated a snapshot that includes to the log begin offset minus one." Does
> the observer store the log? I thought it only needed to maintain a
> snapshot. If so, the observer doesn't need to maintain LBO.
>

In KIP-595 observers are similar to voters/followers in that they
fetch the log and the snapshot. Two of the distinctions are that they
don't participate in the leader election and they are not included
when computing the high-watermark. Regarding storing: it is possible
that observers never need to store the log since they don't vote or
become leaders. I think in the future we would like to implement
"KIP-642: Dynamic quorum reassignment" which would add the capability
to 

Re: [DISCUSS] KIP-649: Dynamic Client Configuration

2020-08-07 Thread Ryan Dielhenn
Hi Jason,

Thank you for this feedback; it was very insightful and helpful.

> 1. I wonder if we need to bother with `enable.dynamic.config`, especially
> if the default is going to be true anyway. I think users who don't want to
> use this capability can just not set dynamic configs. The only case I can
> see an explicit opt-out being useful is when users are trying to avoid
> getting affected by dynamic defaults. And on that note, is there a strong
> case for supporting default overrides? Many client configs are tied closely
> to application behavior, so it feels a bit dangerous to give users the
> ability to override the configuration for all applications.

I agree that dynamic defaults hitting all applications seems very dangerous and 
probably not a good idea. I also agree that there are better alternatives to 
`enable.dynamic.config`.

> 2. Tying dynamic configurations to clientId has some downsides. It is
> common for users to use a different clientId for every application in a
> consumer group so that it is easier to tie group members back to where
> the client is running. This makes setting configurations at an application
> level cumbersome. The alternative is to use the default, but that means
> hitting /all/ applications which I think is probably not a good idea. A
> convenient alternative for consumers would be to use group.id, but we don't
> have anything similar for the producer. I am wondering if we need to give
> the clients a separate config label of some kind so that there is a
> convenient way to group configurations. For example `config.group`. Note
> that this would be another way to opt into dynamic config support.

A label `config.group` might be the best way to allow users to arbitrarily 
group configs and opt into configs set up for specific workloads. This also 
makes the contract for dynamic config support explicit. The user must set this 
value in the client’s config file, so they should know that the client supports 
dynamic configs. Furthermore, it seems best to scope this `config.group` by the 
user principal. This will prevent users from altering and describing other 
user’s important configs. I’m wondering if new APIs would need to be created to 
scope by principal or if user/config.group could be encoded together in the 
`ResourceName` of `{Describe,IncrementalAlter}Configs`. The configs could be 
stored in:

/config/users//groups/
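
For illustration, a hedged sketch of how a client might opt in; the
`config.group` property and the path layout in the comment are hypothetical,
taken from this discussion rather than any existing configuration:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

// Sketch only: "config.group" is a hypothetical property from this proposal,
// not an existing consumer configuration.
public class ConfigGroupOptInSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-consumer-group");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Hypothetical opt-in: the broker would look up dynamic overrides under
        // something like /config/users/<principal>/groups/<config.group>.
        props.put("config.group", "payments-workload");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // The consumer starts with these static values; per this proposal it
            // would then apply any dynamic overrides pushed for its config group.
        }
    }
}
{code}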

I’m wondering if introducing dynamic defaults that are not scoped by principal 
for client configs is worth it e.g. /config/groups/. This would create 
a complicated hierarchy, make it very confusing for the user to know where 
dynamic configs are coming from, and would also have the potential for all 
applications to match against the default, which was already determined to not 
be a good idea.

A default for a specific user also has issues e.g. /config/users/ since 
the user would no longer need to set `config.group` on the client. In this case 
the contract for dynamic config support is not explicit and this would not 
allow the user to opt out of dynamic config support. An alternative is that 
this default is only used if a config group was set on the client but not found 
by the broker. 

> 3. I'm trying to understand the contract between brokers and clients to
> support dynamic configurations. I imagine that once this is available,
> users will have a hard time telling which applications support the
> capability and which do not. Also, we would likely add new dynamic config
> support over time which would make this even harder since we cannot
> retroactively change clients to add support for new dynamic configs. I'm
> wondering if there is anything we can do to make it easier for users to
> tell which dynamic configs are available for each application.

Listing dynamic configs that are supported for each running application could 
be helpful, although this would require storage and APIs for each client to 
register the configs that are supported. I’m wondering if it is most practical 
to make the contract for dynamic config support more explicit to the user by 
introducing `config.group` and then just logging rejected dynamic configs on 
the client. This would include dynamic configs that the client does not support.

> I'm wondering if we need to change the JoinGroup behavior so that it can be
> used to update the session timeout without triggering a rebalance.

I agree. Could a flag be added to the JoinGroup API specifying that just the 
session timeout needs updating? Without this it seems like it would be hard to 
differentiate between a regular JoinGroupRequest and a JoinGroupRequest that 
was sent to update the timeout.

On 2020/08/06 00:23:28, Jason Gustafson  wrote: 
> Hi Ryan,
> 
> Thanks for the proposal. Just a few quick questions:
> 
> 1. I wonder if we need to bother with `enable.dynamic.config`, especially
> if the default is going to be true anyway. I think users who don't want to
> use this capability can just not set dynamic 

[jira] [Created] (KAFKA-10374) Add concurrency tests for the ReplicaManager

2020-08-07 Thread David Jacot (Jira)
David Jacot created KAFKA-10374:
---

 Summary: Add concurrency tests for the ReplicaManager
 Key: KAFKA-10374
 URL: https://issues.apache.org/jira/browse/KAFKA-10374
 Project: Kafka
  Issue Type: Improvement
Reporter: David Jacot


We recently discovered a regression in the ReplicaManager that was due to some 
concurrency issue: KAFKA-10371.

We should add concurrency tests for this area of the broker to ensure that we 
catch similar issues in the future.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10373) Kafka Reassign Partition is stuck with Java OutOfMemory error

2020-08-07 Thread azher khan (Jira)
azher khan created KAFKA-10373:
--

 Summary: Kafka Reassign Partition is stuck with Java OutOfMemory 
error
 Key: KAFKA-10373
 URL: https://issues.apache.org/jira/browse/KAFKA-10373
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 2.2.1
Reporter: azher khan


Hi Team,

While trying to run the Kafka script to reassign partitions of an existing 
topic, we are seeing a Java OutOfMemory issue.

 

The heap for Kafka is set to "-Xmx1G -Xms1G" on the kafka broker.

 
{code:java}
/opt/kafka/bin/kafka-reassign-partitions.sh --zookeeper zookeeper1:2181 
--reassignment-json-file topic_kafka_topic1_reassignment.json 
--bootstrap-server kafkabroker1:9092 --verify
Status of partition reassignment:
[2020-08-07 XX:XX:XX,] ERROR Uncaught exception in thread 
'kafka-admin-client-thread | reassign-partitions-tool': 
(org.apache.kafka.common.utils.KafkaThread)
java.lang.OutOfMemoryError: Java heap space
 at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57)
 at java.nio.ByteBuffer.allocate(ByteBuffer.java:335)
 at org.apache.kafka.common.memory.MemoryPool$1.tryAllocate(MemoryPool.java:30)
 at 
org.apache.kafka.common.network.NetworkReceive.readFrom(NetworkReceive.java:112)
 at org.apache.kafka.common.network.KafkaChannel.receive(KafkaChannel.java:335)
 at org.apache.kafka.common.network.KafkaChannel.read(KafkaChannel.java:296)
 at org.apache.kafka.common.network.Selector.attemptRead(Selector.java:560)
 at 
org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:496)
 at org.apache.kafka.common.network.Selector.poll(Selector.java:425)
 at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
 at 
org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1116)
 at java.lang.Thread.run(Thread.java:748)
Reassignment of partition kafka_topic1-0 is still in progress
Reassignment of partition kafka_topic1-1 is still in progress
Reassignment of partition kafka_topic1-2 is still in progress{code}
 

Retried the above command after removing the "reassign_partitions" znode from 
ZooKeeper as suggested, but we are seeing the same error.

 

 
{code:java}
[zk: localhost:2181(CONNECTED) 5] delete /admin/reassign_partitions
[zk: localhost:2181(CONNECTED) 7] ls /admin
[delete_topics] 
{code}
 

Would highly appreciate your advice,

Thank you in advance,

 

Regards,

Azher Khan



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (KAFKA-10372) [JAVA 11] Unrecognized VM option PrintGCDateStamps

2020-08-07 Thread Kevin Tibi (Jira)


 [ 
https://issues.apache.org/jira/browse/KAFKA-10372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kevin Tibi resolved KAFKA-10372.

Resolution: Abandoned

The error came from an Ansible role that overrides the startup script.

> [JAVA 11] Unrecognized VM option PrintGCDateStamps
> --
>
> Key: KAFKA-10372
> URL: https://issues.apache.org/jira/browse/KAFKA-10372
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 2.6.0
>Reporter: Kevin Tibi
>Priority: Blocker
>  Labels: bug
> Fix For: 2.6.1
>
>
> Hello,
> I can't start kafka with JAVA 11. 
>  
> {code:java}
> kafka-server-start.sh[2721]: Unrecognized VM option 'PrintGCDateStamps'
> kafka-server-start.sh[2721]: Error: Could not create the Java Virtual Machine.
> kafka-server-start.sh[2721]: Error: A fatal exception has occurred. Program 
> will exit.{code}
> This flag (in kafka-run-class.sh L302) is deprecated in JAVA 11.
> Solution :
> {code:java}
> -Xlog:::time,uptime,level,tags" # Equivalent to PrintGCTimeStamps, Uptime and 
> PrintGCDateStamps{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Created] (KAFKA-10372) [JAVA 11] Unrecognized VM option PrintGCDateStamps

2020-08-07 Thread Kevin Tibi (Jira)
Kevin Tibi created KAFKA-10372:
--

 Summary: [JAVA 11] Unrecognized VM option PrintGCDateStamps
 Key: KAFKA-10372
 URL: https://issues.apache.org/jira/browse/KAFKA-10372
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 2.6.0
Reporter: Kevin Tibi


Hello,

I can't start kafka with JAVA 11. 

 
{code:java}
kafka-server-start.sh[2721]: Unrecognized VM option 'PrintGCDateStamps'
kafka-server-start.sh[2721]: Error: Could not create the Java Virtual Machine.
kafka-server-start.sh[2721]: Error: A fatal exception has occurred. Program 
will exit.{code}
This flag (in kafka-run-class.sh L302) is deprecated in JAVA 11.

Solution :
{code:java}
-Xlog:::time,uptime,level,tags" # Equivalent to PrintGCTimeStamps, Uptime and 
PrintGCDateStamps{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[GitHub] [kafka-site] showuon commented on pull request #285: MINOR: Fix the wrong referenced document in quickstart page

2020-08-07 Thread GitBox


showuon commented on pull request #285:
URL: https://github.com/apache/kafka-site/pull/285#issuecomment-67045


   @scott-confluent @rhauch, could you review this PR? Thanks.



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org




[GitHub] [kafka-site] showuon opened a new pull request #285: MINOR: Fix the wrong referenced document name in quickstart page

2020-08-07 Thread GitBox


showuon opened a new pull request #285:
URL: https://github.com/apache/kafka-site/pull/285


   On the quickstart page, the following error message is displayed. After 
investigation, I found it's because we point to the wrong document name 
`26/quickstart-zookeeper.html`. Changing it to the correct name `26/quickstart.html` 
fixes it.
   
   Screenshot: https://user-images.githubusercontent.com/43372967/89632898-905eff80-d8d5-11ea-9849-e70095ad5d4b.png
   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org