[jira] [Resolved] (KAFKA-3067) Fix producer API documentation for case when RequiredAcks > 1

2016-04-15 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3067?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy resolved KAFKA-3067.

   Resolution: Fixed
 Assignee: Manikumar Reddy  (was: Jun Rao)
Fix Version/s: 0.10.0.0

Updated the Protocol wiki documentation.

> Fix producer API documentation for case when RequiredAcks > 1
> -
>
> Key: KAFKA-3067
> URL: https://issues.apache.org/jira/browse/KAFKA-3067
> Project: Kafka
>  Issue Type: Improvement
>  Components: producer 
>Affects Versions: 0.9.0.0
>Reporter: Sergiy Zuban
>Assignee: Manikumar Reddy
>Priority: Trivial
> Fix For: 0.10.0.0
>
>
> RequiredAcks section of 
> https://cwiki.apache.org/confluence/display/KAFKA/A+Guide+To+The+Kafka+Protocol#AGuideToTheKafkaProtocol-ProduceAPI
> probably needs to be updated/removed.
> With Kafka 0.9, a number > 1 doesn't seem to be supported anymore, since we 
> are getting an InvalidRequiredAcksCode error.
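As a quick, editor-added illustration of the setting being discussed (a minimal sketch, not part of the ticket; broker address and topic are placeholders): the Java producer exposes acks as a string whose meaningful values are "0", "1" and "all"/"-1", and, as the report above notes, brokers from 0.9 onward reject a produce request asking for more than one acknowledgement with an InvalidRequiredAcks error.

{code}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class AcksExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // "all" (equivalently "-1") waits for the full ISR; "0" and "1" are the other
        // supported values. Numeric settings greater than 1 are no longer accepted.
        props.put("acks", "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("test-topic", "key", "value"));
        }
    }
}
{code}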



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk7 #1199

2016-04-15 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-3526: Return string instead of object in ConfigKeyInfo and

--
[...truncated 2450 lines...]
kafka.api.ProducerFailureHandlingTest > testNotEnoughReplicas PASSED

kafka.api.ProducerFailureHandlingTest > testNonExistentTopic PASSED

kafka.api.ProducerFailureHandlingTest > testInvalidPartition PASSED

kafka.api.ProducerFailureHandlingTest > testSendAfterClosed PASSED

kafka.api.ProducerFailureHandlingTest > testTooLargeRecordWithAckZero PASSED

kafka.api.ProducerFailureHandlingTest > 
testNotEnoughReplicasAfterBrokerShutdown PASSED

kafka.api.ProducerBounceTest > testBrokerFailure PASSED

kafka.api.SaslPlaintextConsumerTest > testPauseStateNotPreservedByRebalance 
PASSED

kafka.api.SaslPlaintextConsumerTest > testUnsubscribeTopic PASSED

kafka.api.SaslPlaintextConsumerTest > testListTopics PASSED

kafka.api.SaslPlaintextConsumerTest > testAutoCommitOnRebalance PASSED

kafka.api.SaslPlaintextConsumerTest > testSimpleConsumption PASSED

kafka.api.SaslPlaintextConsumerTest > testPartitionReassignmentCallback PASSED

kafka.api.SaslPlaintextConsumerTest > testCommitSpecifiedOffsets PASSED

kafka.api.SslProducerSendTest > testSendNonCompressedMessageWithCreateTime 
PASSED

kafka.api.SslProducerSendTest > testSendCompressedMessageWithLogAppendTime 
PASSED

kafka.api.SslProducerSendTest > testClose PASSED

kafka.api.SslProducerSendTest > testFlush PASSED

kafka.api.SslProducerSendTest > testSendToPartition PASSED

kafka.api.SslProducerSendTest > testSendOffset PASSED

kafka.api.SslProducerSendTest > testAutoCreateTopic PASSED

kafka.api.SslProducerSendTest > testSendWithInvalidCreateTime PASSED

kafka.api.SslProducerSendTest > testSendCompressedMessageWithCreateTime PASSED

kafka.api.SslProducerSendTest > testCloseWithZeroTimeoutFromCallerThread PASSED

kafka.api.SslProducerSendTest > testCloseWithZeroTimeoutFromSenderThread PASSED

kafka.api.SslProducerSendTest > testSendNonCompressedMessageWithLogApendTime 
PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededToReadFromNonExistentTopic PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testListOfsetsWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicRead PASSED

kafka.api.AuthorizerIntegrationTest > testListOffsetsWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > 
testCreatePermissionNeededForWritingToNonExistentTopic PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithNoTopicAccess PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > testAuthorization PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicAndGroupRead PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithTopicWrite PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithNoGroupAccess PASSED

kafka.api.AuthorizerIntegrationTest > testConsumeWithNoAccess PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchWithTopicAndGroupRead 
PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testProduceWithTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testOffsetFetchTopicDescribe PASSED

kafka.api.AuthorizerIntegrationTest > testCommitWithTopicAndGroupRead PASSED

kafka.api.SaslSslEndToEndAuthorizationTest > testNoConsumeAcl PASSED

kafka.api.SaslSslEndToEndAuthorizationTest > testProduceConsume PASSED

kafka.api.SaslSslEndToEndAuthorizationTest > testNoProduceAcl PASSED

kafka.api.SaslSslEndToEndAuthorizationTest > testNoGroupAcl PASSED

kafka.api.PlaintextProducerSendTest > testSerializerConstructors PASSED

kafka.api.PlaintextProducerSendTest > testWrongSerializer PASSED

kafka.api.PlaintextProducerSendTest > 
testSendNonCompressedMessageWithCreateTime PASSED

kafka.api.PlaintextProducerSendTest > 
testSendCompressedMessageWithLogAppendTime PASSED

kafka.api.PlaintextProducerSendTest > testClose PASSED

kafka.api.PlaintextProducerSendTest > testFlush PASSED

kafka.api.PlaintextProducerSendTest > testSendToPartition PASSED

kafka.api.PlaintextProducerSendTest > testSendOffset PASSED

kafka.api.PlaintextProducerSendTest > testAutoCreateTopic PASSED


[jira] [Commented] (KAFKA-3499) byte[] should not be used as Map key nor Set member

2016-04-15 Thread josh gruenberg (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243866#comment-15243866
 ] 

josh gruenberg commented on KAFKA-3499:
---

Sure, I'll take a look later tonight.

> byte[] should not be used as Map key nor Set member
> ---
>
> Key: KAFKA-3499
> URL: https://issues.apache.org/jira/browse/KAFKA-3499
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: josh gruenberg
>Assignee: Guozhang Wang
>  Labels: user-experience
> Fix For: 0.10.0.0
>
>
> On the JVM, Array.equals and Array.hashCode do not incorporate array 
> contents; they inherit Object.equals/hashCode. This implies that Collections 
> that rely upon equals/hashCode (eg, HashMap/HashSet and variants) treat two 
> arrays with equal contents as distinct elements.
> Many of the Kafka Streams internal classes currently use generic HashMaps and 
> Sets to manage caches and invalidation status. For example, 
> RocksDBStore.cacheDirtyKeys is a HashSet. Then, in RocksDBWindowStore, the 
> Elements are constructed as RocksDBStore.
> Similarly, the MemoryLRUCache internally holds a 
> LinkedHashMap map, and a HashSet keys, and these end up holding 
> byte[] keys. Finally, user-code may attempt to use any of these provided 
> types with byte[], with undesirable results.
> Keys that are byte-arrays should be wrapped in a type that incorporates the 
> content in their computation of equals/hashCode. java.nio.ByteBuffer is one 
> such type that could be used, but a purpose-built immutable class would 
> likely be a better solution.
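A small, editor-added illustration of the pitfall described above (a minimal sketch, not taken from the Kafka code base): two byte arrays with identical contents are treated as distinct set members, while wrapping them in java.nio.ByteBuffer restores value semantics.

{code}
import java.nio.ByteBuffer;
import java.util.HashSet;
import java.util.Set;

public class ByteArrayKeyPitfall {
    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3};

        // byte[] inherits Object.equals/hashCode, so equal contents are still distinct members.
        Set<byte[]> rawKeys = new HashSet<>();
        rawKeys.add(a);
        System.out.println(rawKeys.contains(b)); // false

        // ByteBuffer has content-based equals/hashCode, so the lookup succeeds.
        Set<ByteBuffer> wrappedKeys = new HashSet<>();
        wrappedKeys.add(ByteBuffer.wrap(a));
        System.out.println(wrappedKeys.contains(ByteBuffer.wrap(b))); // true
    }
}
{code}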



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Build failed in Jenkins: kafka-trunk-jdk8 #529

2016-04-15 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-3526: Return string instead of object in ConfigKeyInfo and

--
[...truncated 1595 lines...]

kafka.consumer.ZookeeperConsumerConnectorTest > testConsumerDecoder PASSED

kafka.consumer.ZookeeperConsumerConnectorTest > testConsumerRebalanceListener 
PASSED

kafka.consumer.ZookeeperConsumerConnectorTest > testCompression PASSED

kafka.controller.ControllerFailoverTest > testMetadataUpdate PASSED

kafka.producer.ProducerTest > testSendToNewTopic PASSED

kafka.producer.ProducerTest > testAsyncSendCanCorrectlyFailWithTimeout PASSED

kafka.producer.ProducerTest > testSendNullMessage PASSED

kafka.producer.ProducerTest > testUpdateBrokerPartitionInfo PASSED

kafka.producer.ProducerTest > testSendWithDeadBroker PASSED

kafka.producer.AsyncProducerTest > testFailedSendRetryLogic PASSED

kafka.producer.AsyncProducerTest > testQueueTimeExpired PASSED

kafka.producer.AsyncProducerTest > testPartitionAndCollateEvents PASSED

kafka.producer.AsyncProducerTest > testBatchSize PASSED

kafka.producer.AsyncProducerTest > testSerializeEvents PASSED

kafka.producer.AsyncProducerTest > testProducerQueueSize PASSED

kafka.producer.AsyncProducerTest > testRandomPartitioner PASSED

kafka.producer.AsyncProducerTest > testInvalidConfiguration PASSED

kafka.producer.AsyncProducerTest > testInvalidPartition PASSED

kafka.producer.AsyncProducerTest > testNoBroker PASSED

kafka.producer.AsyncProducerTest > testProduceAfterClosed PASSED

kafka.producer.AsyncProducerTest > testJavaProducer PASSED

kafka.producer.AsyncProducerTest > testIncompatibleEncoder PASSED

kafka.producer.SyncProducerTest > testReachableServer PASSED

kafka.producer.SyncProducerTest > testMessageSizeTooLarge PASSED

kafka.producer.SyncProducerTest > testNotEnoughReplicas PASSED

kafka.producer.SyncProducerTest > testMessageSizeTooLargeWithAckZero PASSED

kafka.producer.SyncProducerTest > testProducerCanTimeout PASSED

kafka.producer.SyncProducerTest > testProduceRequestWithNoResponse PASSED

kafka.producer.SyncProducerTest > testEmptyProduceRequest PASSED

kafka.producer.SyncProducerTest > testProduceCorrectlyReceivesResponse PASSED

kafka.tools.ConsoleProducerTest > testParseKeyProp PASSED

kafka.tools.ConsoleProducerTest > testValidConfigsOldProducer PASSED

kafka.tools.ConsoleProducerTest > testInvalidConfigs PASSED

kafka.tools.ConsoleProducerTest > testValidConfigsNewProducer PASSED

kafka.tools.ConsoleConsumerTest > shouldLimitReadsToMaxMessageLimit PASSED

kafka.tools.ConsoleConsumerTest > shouldParseValidNewConsumerValidConfig PASSED

kafka.tools.ConsoleConsumerTest > shouldParseConfigsFromFile PASSED

kafka.tools.ConsoleConsumerTest > shouldParseValidOldConsumerValidConfig PASSED

kafka.security.auth.PermissionTypeTest > testFromString PASSED

kafka.security.auth.ResourceTypeTest > testFromString PASSED

kafka.security.auth.OperationTest > testFromString PASSED

kafka.security.auth.AclTest > testAclJsonConversion PASSED

kafka.security.auth.ZkAuthorizationTest > testIsZkSecurityEnabled PASSED

kafka.security.auth.ZkAuthorizationTest > testZkUtils PASSED

kafka.security.auth.ZkAuthorizationTest > testZkAntiMigration PASSED

kafka.security.auth.ZkAuthorizationTest > testZkMigration PASSED

kafka.security.auth.ZkAuthorizationTest > testChroot PASSED

kafka.security.auth.ZkAuthorizationTest > testDelete PASSED

kafka.security.auth.ZkAuthorizationTest > testDeleteRecursive PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAllowAllAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testLocalConcurrentModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFound PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDistributedConcurrentModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testLoadCache PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testIsrAfterBrokerShutDownAndJoinsBack PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAutoCreateTopicWithCollision PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 
testAliveBrokerListWithNoTopics PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > testGetAllTopicMetadata 
PASSED

kafka.integration.SaslPlaintextTopicMetadataTest > 

[jira] [Commented] (KAFKA-3565) Producer's throughput lower with compressed data after KIP-31/32

2016-04-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243862#comment-15243862
 ] 

Ismael Juma commented on KAFKA-3565:


Interesting. Good to know that there are cases where the new code does better. 
:)

I ran `benchmark_test.py` on d2.2xlarge AWS instances (8 vCPUs, Xeon E5-2676 v3 
2.4 GHz, 61 GB of RAM, 6 x 2000 GB HDD, high networking); the file includes 
all the settings used:

https://github.com/apache/kafka/blob/trunk/tests/kafkatest/benchmarks/core/benchmark_test.py#L72

> Producer's throughput lower with compressed data after KIP-31/32
> 
>
> Key: KAFKA-3565
> URL: https://issues.apache.org/jira/browse/KAFKA-3565
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> Relative offsets were introduced by KIP-31 so that the broker does not have 
> to recompress data (this was previously required after offsets were 
> assigned). The implicit assumption is that reducing CPU usage required by 
> recompression would mean that producer throughput for compressed data would 
> increase.
> However, this doesn't seem to be the case:
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:
> 2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status: PASS
> run time:   59.030 seconds
> {"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
> {code}
> Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:
> 2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status: PASS
> run time:   1 minute 0.243 seconds
> {"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
> {code}
> Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d
> The difference for the uncompressed case is smaller (and within what one 
> would expect given the additional size overhead caused by the timestamp 
> field):
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:
> 2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status: PASS
> run time:   1 minute 4.176 seconds
> {"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
> {code}
> Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:
> 2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status: PASS
> run time:   1 minute 5.079 seconds
> {"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
> {code}
> Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3499) byte[] should not be used as Map key nor Set member

2016-04-15 Thread Guozhang Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243856#comment-15243856
 ] 

Guozhang Wang commented on KAFKA-3499:
--

[~joshng] Would you like to take a look at the PR?

> byte[] should not be used as Map key nor Set member
> ---
>
> Key: KAFKA-3499
> URL: https://issues.apache.org/jira/browse/KAFKA-3499
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: josh gruenberg
>Assignee: Guozhang Wang
>  Labels: user-experience
> Fix For: 0.10.0.0
>
>
> On the JVM, Array.equals and Array.hashCode do not incorporate array 
> contents; they inherit Object.equals/hashCode. This implies that Collections 
> that rely upon equals/hashCode (eg, HashMap/HashSet and variants) treat two 
> arrays with equal contents as distinct elements.
> Many of the Kafka Streams internal classes currently use generic HashMaps and 
> Sets to manage caches and invalidation status. For example, 
> RocksDBStore.cacheDirtyKeys is a HashSet. Then, in RocksDBWindowStore, the 
> Elements are constructed as RocksDBStore.
> Similarly, the MemoryLRUCache internally holds a 
> LinkedHashMap map, and a HashSet keys, and these end up holding 
> byte[] keys. Finally, user-code may attempt to use any of these provided 
> types with byte[], with undesirable results.
> Keys that are byte-arrays should be wrapped in a type that incorporates the 
> content in their computation of equals/hashCode. java.nio.ByteBuffer is one 
> such type that could be used, but a purpose-built immutable class would 
> likely be a better solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3499) byte[] should not be used as Map key nor Set member

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243852#comment-15243852
 ] 

ASF GitHub Bot commented on KAFKA-3499:
---

GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/1229

KAFKA-3499: prevent array typed keys in KeyValueStore



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka K3499

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1229.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1229


commit 718b8f1423d885c5bee32bacbcacf4fb78598372
Author: Guozhang Wang 
Date:   2016-04-15T00:26:57Z

wrap segments keys with ByteBuffer

commit 9302bc18c8fa400b1fb7d4b36f29c8f0c812785c
Author: Guozhang Wang 
Date:   2016-04-15T23:35:36Z

prevent array keys in key value store




> byte[] should not be used as Map key nor Set member
> ---
>
> Key: KAFKA-3499
> URL: https://issues.apache.org/jira/browse/KAFKA-3499
> Project: Kafka
>  Issue Type: Sub-task
>  Components: streams
>Reporter: josh gruenberg
>Assignee: Guozhang Wang
>  Labels: user-experience
> Fix For: 0.10.0.0
>
>
> On the JVM, Array.equals and Array.hashCode do not incorporate array 
> contents; they inherit Object.equals/hashCode. This implies that Collections 
> that rely upon equals/hashCode (eg, HashMap/HashSet and variants) treat two 
> arrays with equal contents as distinct elements.
> Many of the Kafka Streams internal classes currently use generic HashMaps and 
> Sets to manage caches and invalidation status. For example, 
> RocksDBStore.cacheDirtyKeys is a HashSet. Then, in RocksDBWindowStore, the 
> Elements are constructed as RocksDBStore.
> Similarly, the MemoryLRUCache internally holds a 
> LinkedHashMap map, and a HashSet keys, and these end up holding 
> byte[] keys. Finally, user-code may attempt to use any of these provided 
> types with byte[], with undesirable results.
> Keys that are byte-arrays should be wrapped in a type that incorporates the 
> content in their computation of equals/hashCode. java.nio.ByteBuffer is one 
> such type that could be used, but a purpose-built immutable class would 
> likely be a better solution.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3499: prevent array typed keys in KeyVal...

2016-04-15 Thread guozhangwang
GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/1229

KAFKA-3499: prevent array typed keys in KeyValueStore



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka K3499

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1229.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1229


commit 718b8f1423d885c5bee32bacbcacf4fb78598372
Author: Guozhang Wang 
Date:   2016-04-15T00:26:57Z

wrap segments keys with ByteBuffer

commit 9302bc18c8fa400b1fb7d4b36f29c8f0c812785c
Author: Guozhang Wang 
Date:   2016-04-15T23:35:36Z

prevent array keys in key value store




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3565) Producer's throughput lower with compressed data after KIP-31/32

2016-04-15 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243850#comment-15243850
 ] 

Jiangjie Qin commented on KAFKA-3565:
-

The producer performance test I ran used tweaks similar to yours, but with an 
adjustable range of integers (0 to valueBound) to get different compression 
ratios.

> Producer's throughput lower with compressed data after KIP-31/32
> 
>
> Key: KAFKA-3565
> URL: https://issues.apache.org/jira/browse/KAFKA-3565
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> Relative offsets were introduced by KIP-31 so that the broker does not have 
> to recompress data (this was previously required after offsets were 
> assigned). The implicit assumption is that reducing CPU usage required by 
> recompression would mean that producer throughput for compressed data would 
> increase.
> However, this doesn't seem to be the case:
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:
> 2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status: PASS
> run time:   59.030 seconds
> {"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
> {code}
> Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:
> 2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status: PASS
> run time:   1 minute 0.243 seconds
> {"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
> {code}
> Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d
> The difference for the uncompressed case is smaller (and within what one 
> would expect given the additional size overhead caused by the timestamp 
> field):
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:
> 2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status: PASS
> run time:   1 minute 4.176 seconds
> {"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
> {code}
> Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:
> 2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status: PASS
> run time:   1 minute 5.079 seconds
> {"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
> {code}
> Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3565) Producer's throughput lower with compressed data after KIP-31/32

2016-04-15 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243840#comment-15243840
 ] 

Jiangjie Qin commented on KAFKA-3565:
-

[~ijuma] What are your other settings? e.g. max.in.flight.requests, number of 
partitions, batch size, etc.

I happen to be conducting some performance tuning for the producers. The 
results I noticed are different from what you saw:

{noformat}
Producer before KIP31/32

./kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 
test_3_replica_1_partition --num-records 10 --record-size 1 
--throughput 1 --valueBound 5 --producer-props 
bootstrap.servers=CLUSTER_WIHT_KIP31/32 acks=1 
max.in.flight.requests.per.connection=1 batch.size=5 compression.type=gzip 
client.id=becket_gzip
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/core/build/dependant-libs-2.10.6/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/tools/build/dependant-libs-2.10.6/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/connect/api/build/dependant-libs/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/connect/runtime/build/dependant-libs/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/connect/file/build/dependant-libs/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/connect/json/build/dependant-libs/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
5370 records sent, 1074.0 records/sec (10.24 MB/sec), 14.3 ms avg latency, 
165.0 max latency.
5875 records sent, 1175.0 records/sec (11.21 MB/sec), 13.1 ms avg latency, 34.0 
max latency.
5977 records sent, 1195.2 records/sec (11.40 MB/sec), 12.7 ms avg latency, 96.0 
max latency.
6000 records sent, 1199.8 records/sec (11.44 MB/sec), 12.7 ms avg latency, 79.0 
max latency.
6115 records sent, 1223.0 records/sec (11.66 MB/sec), 12.0 ms avg latency, 34.0 
max latency.
5967 records sent, 1193.4 records/sec (11.38 MB/sec), 12.5 ms avg latency, 32.0 
max latency.
5948 records sent, 1189.6 records/sec (11.34 MB/sec), 12.7 ms avg latency, 35.0 
max latency.
5939 records sent, 1187.8 records/sec (11.33 MB/sec), 12.9 ms avg latency, 78.0 
max latency.
5941 records sent, 1188.2 records/sec (11.33 MB/sec), 12.7 ms avg latency, 32.0 
max latency.
5947 records sent, 1188.9 records/sec (11.34 MB/sec), 12.7 ms avg latency, 31.0 
max latency.
6016 records sent, 1203.2 records/sec (11.47 MB/sec), 12.5 ms avg latency, 44.0 
max latency.
6049 records sent, 1209.6 records/sec (11.54 MB/sec), 12.3 ms avg latency, 32.0 
max latency.
6042 records sent, 1208.4 records/sec (11.52 MB/sec), 12.3 ms avg latency, 33.0 
max latency.
5961 records sent, 1191.5 records/sec (11.36 MB/sec), 12.6 ms avg latency, 32.0 
max latency.
5875 records sent, 1174.5 records/sec (11.20 MB/sec), 12.9 ms avg latency, 94.0 
max latency.
5730 records sent, 1145.5 records/sec (10.92 MB/sec), 13.0 ms avg latency, 
111.0 max latency.
10 records sent, 1181.851489 records/sec (11.27 MB/sec), 12.76 ms avg 
latency, 165.00 ms max latency, 12 ms 50th, 21 ms 95th, 27 ms 99th, 47 ms 
99.9th.

./kafka-run-class.sh org.apache.kafka.tools.ProducerPerformance --topic 
test_3_replica_1_partition --num-records 10 --record-size 1 
--throughput 1 --valueBound 5 --producer-props 
bootstrap.servers=CLUSTER_BEFORE_KIP31/32 acks=1 
max.in.flight.requests.per.connection=1 batch.size=5 compression.type=gzip 
client.id=becket_gzip
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/core/build/dependant-libs-2.10.6/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/tools/build/dependant-libs-2.10.6/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/connect/api/build/dependant-libs/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 
[jar:file:/home/jqin/workspace/temp/gitlikafka/connect/runtime/build/dependant-libs/slf4j-log4j12-1.7.15.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in 

[jira] [Created] (KAFKA-3566) Enable VerifiableProducer and ConsoleConsumer to run with interceptors

2016-04-15 Thread Anna Povzner (JIRA)
Anna Povzner created KAFKA-3566:
---

 Summary: Enable VerifiableProducer and ConsoleConsumer to run with 
interceptors
 Key: KAFKA-3566
 URL: https://issues.apache.org/jira/browse/KAFKA-3566
 Project: Kafka
  Issue Type: Test
Affects Versions: 0.10.0.0
Reporter: Anna Povzner
Assignee: Anna Povzner


Add interceptor class list and export path list params to VerifiableProducer 
and ConsoleConsumer constructors. This is to allow running VerifiableProducer 
and ConsoleConsumer with interceptors.
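For context (editor-added sketch): interceptors are attached purely through client configuration, so the change amounts to letting these tools build and forward that property. The interceptor class name below is hypothetical; "interceptor.classes" is the KIP-42 configuration key.

{code}
import java.util.Properties;

public class InterceptorConfigSketch {
    // A sketch of the producer-side properties the test tools would need to forward.
    public static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // placeholder
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        // Hypothetical interceptor implementation supplied by the test.
        props.put("interceptor.classes", "com.example.monitoring.MyProducerInterceptor");
        return props;
    }
}
{code}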



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3526) REST APIs return object representation instead of string for config values, default values and recommended values

2016-04-15 Thread Ewen Cheslack-Postava (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ewen Cheslack-Postava updated KAFKA-3526:
-
   Resolution: Fixed
Fix Version/s: (was: 0.11.0.0)
   0.10.1.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 1200
[https://github.com/apache/kafka/pull/1200]

> REST APIs return object representation instead of string for config values, 
> default values and recommended values 
> --
>
> Key: KAFKA-3526
> URL: https://issues.apache.org/jira/browse/KAFKA-3526
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.0.0
>Reporter: Liquan Pei
>Assignee: Liquan Pei
> Fix For: 0.10.1.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In the response of 
> {code}
> PUT /connector-plugins/{name}/config/validate
> {code}
> The value.value, value.recommended_values and definition.default_value (If 
> the ConfigKey is not String type) are serialized by Jackson to something 
> other than strings.
> We actually expect the values to be strings, and this should mirror the 
> current requirements of submitting configs which is that the config is a 
> Map.
> This can be fixed by adding ConfigDef support for converting values to 
> strings, corresponding to all the parse methods, and then making sure all 
> these outputs of values have type String instead of Object.
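As an editor-added sketch of the conversion described in the last paragraph (illustrative only, not the code from pull request 1200): each parsed value is rendered back into the string form the REST API is expected to return.

{code}
import java.util.List;

public class ConfigValueStrings {
    public static String convertToString(Object parsedValue) {
        if (parsedValue == null)
            return null;
        if (parsedValue instanceof String)
            return (String) parsedValue;
        if (parsedValue instanceof Class)
            return ((Class<?>) parsedValue).getName();       // CLASS-typed configs
        if (parsedValue instanceof List) {                    // LIST-typed configs
            StringBuilder sb = new StringBuilder();
            for (Object item : (List<?>) parsedValue) {
                if (sb.length() > 0)
                    sb.append(",");
                sb.append(convertToString(item));
            }
            return sb.toString();
        }
        return String.valueOf(parsedValue);                   // booleans, numbers, etc.
    }
}
{code}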



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3526: Return string instead of object in...

2016-04-15 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1200


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3526) REST APIs return object representation instead of string for config values, default values and recommended values

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243777#comment-15243777
 ] 

ASF GitHub Bot commented on KAFKA-3526:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/1200


> REST APIs return object representation instead of string for config values, 
> default values and recommended values 
> --
>
> Key: KAFKA-3526
> URL: https://issues.apache.org/jira/browse/KAFKA-3526
> Project: Kafka
>  Issue Type: Bug
>  Components: KafkaConnect
>Affects Versions: 0.10.0.0
>Reporter: Liquan Pei
>Assignee: Liquan Pei
> Fix For: 0.10.1.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> In the response of 
> {code}
> PUT /connector-plugins/{name}/config/validate
> {code}
> The value.value, value.recommended_values and definition.default_value (If 
> the ConfigKey is not String type) are serialized by Jackson to something 
> other than strings.
> We actually expect the values to be strings, and this should mirror the 
> current requirements of submitting configs which is that the config is a 
> Map.
> This can be fixed by adding ConfigDef support for converting values to 
> strings, corresponding to all the parse methods, and then making sure all 
> these outputs of values have type String instead of Object.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [DISCUSS] KIP-48 Support for delegation tokens as an authentication mechanism

2016-04-15 Thread Ashish Singh
Jitendra,

Could you post your views on the existing discussion thread for KIP-48,
http://mail-archives.apache.org/mod_mbox/kafka-dev/201602.mbox/%3cd2f60a7c.61f2c%25pbrahmbh...@hortonworks.com%3E
?

On Fri, Apr 15, 2016 at 3:11 PM, Jitendra Pandey 
wrote:

>
>  The need for a large number of clients that are running all over the
> cluster that authenticate with Kafka brokers, is very similar to the Hadoop
> use case of large number of tasks running across the cluster that need
> authentication to Hdfs Namenode. Therefore, the delegation token approach
> does seem like a good fit for this use case as we have seen it working at
> large scale in HDFS and YARN.
>
>   The proposed design is very much in line with the Hadoop approach. A few
> comments:
>
> 1) Why do you guys want to allow infinite renewable lifetime for a token?
> HDFS restricts a token to a max life time (default 7 days).  A token's
> vulnerability is believed to increase with time.
>
> 2) As I understand the tokens are stored in zookeeper as well, and can be
> updated there. This is clever as it can allow replacing the tokens once
> they run out of max life time, and clients can download new tokens from
> zookeeper. It shouldn't be a big load on zookeeper as a client will need to
> get a new token once in several days. In this approach you don't need
> infinite lifetime on the token even for long running clients.
>
> 3) The token passwords are generated using a master key. The master key
> should also be periodically changed. In Hadoop, the default renewal period
> is 1 day.
>
> Thanks for a thorough proposal, great work!
>
>
>
>
>


-- 

Regards,
Ashish


[jira] [Commented] (KAFKA-3565) Producer's throughput lower with compressed data after KIP-31/32

2016-04-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243699#comment-15243699
 ] 

Ismael Juma commented on KAFKA-3565:


By the way, the results with compression include the following change to 
`ProducerPerformance`:

https://github.com/apache/kafka/pull/1225/files#diff-34eb560f812ed25c56bd774d3360e4b8R66

> Producer's throughput lower with compressed data after KIP-31/32
> 
>
> Key: KAFKA-3565
> URL: https://issues.apache.org/jira/browse/KAFKA-3565
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> Relative offsets were introduced by KIP-31 so that the broker does not have 
> to recompress data (this was previously required after offsets were 
> assigned). The implicit assumption is that reducing CPU usage required by 
> recompression would mean that producer throughput for compressed data would 
> increase.
> However, this doesn't seem to be the case:
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:
> 2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status: PASS
> run time:   59.030 seconds
> {"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
> {code}
> Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:
> 2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status: PASS
> run time:   1 minute 0.243 seconds
> {"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
> {code}
> Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d
> The difference for the uncompressed case is smaller (and within what one 
> would expect given the additional size overhead caused by the timestamp 
> field):
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:
> 2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status: PASS
> run time:   1 minute 4.176 seconds
> {"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
> {code}
> Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:
> 2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status: PASS
> run time:   1 minute 5.079 seconds
> {"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
> {code}
> Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3565) Producer's throughput lower with compressed data after KIP-31/32

2016-04-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243695#comment-15243695
 ] 

Ismael Juma commented on KAFKA-3565:


[~becket_qin], do you think you can look into this? We need to understand 
what's happening here before we can release 0.10.0.0. cc [~junrao] [~gwenshap]

> Producer's throughput lower with compressed data after KIP-31/32
> 
>
> Key: KAFKA-3565
> URL: https://issues.apache.org/jira/browse/KAFKA-3565
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ismael Juma
>Priority: Critical
> Fix For: 0.10.0.0
>
>
> Relative offsets were introduced by KIP-31 so that the broker does not have 
> to recompress data (this was previously required after offsets were 
> assigned). The implicit assumption is that reducing CPU usage required by 
> recompression would mean that producer throughput for compressed data would 
> increase.
> However, this doesn't seem to be the case:
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:
> 2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status: PASS
> run time:   59.030 seconds
> {"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
> {code}
> Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:
> 2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
> status: PASS
> run time:   1 minute 0.243 seconds
> {"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
> {code}
> Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d
> The difference for the uncompressed case is smaller (and within what one 
> would expect given the additional size overhead caused by the timestamp 
> field):
> {code}
> Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
> test_id:
> 2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status: PASS
> run time:   1 minute 4.176 seconds
> {"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
> {code}
> Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f
> {code}
> Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
> test_id:
> 2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
> status: PASS
> run time:   1 minute 5.079 seconds
> {"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
> {code}
> Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (KAFKA-3565) Producer's throughput lower with compressed data after KIP-31/32

2016-04-15 Thread Ismael Juma (JIRA)
Ismael Juma created KAFKA-3565:
--

 Summary: Producer's throughput lower with compressed data after 
KIP-31/32
 Key: KAFKA-3565
 URL: https://issues.apache.org/jira/browse/KAFKA-3565
 Project: Kafka
  Issue Type: Bug
Reporter: Ismael Juma
Priority: Critical
 Fix For: 0.10.0.0


Relative offsets were introduced by KIP-31 so that the broker does not have to 
recompress data (this was previously required after offsets were assigned). The 
implicit assumption is that reducing CPU usage required by recompression would 
mean that producer throughput for compressed data would increase.

However, this doesn't seem to be the case:

{code}
Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
test_id:
2016-04-15--012.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
status: PASS
run time:   59.030 seconds
{"records_per_sec": 519418.343653, "mb_per_sec": 49.54}
{code}
Full results: https://gist.github.com/ijuma/0afada4ff51ad6a5ac2125714d748292

{code}
Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
test_id:
2016-04-15--013.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100.compression_type=snappy
status: PASS
run time:   1 minute 0.243 seconds
{"records_per_sec": 427308.818848, "mb_per_sec": 40.75}
{code}
Full results: https://gist.github.com/ijuma/e49430f0548c4de5691ad47696f5c87d

The difference for the uncompressed case is smaller (and within what one would 
expect given the additional size overhead caused by the timestamp field):

{code}
Commit: eee95228fabe1643baa016a2d49fb0a9fe2c66bd (one before KIP-31/32)
test_id:
2016-04-15--010.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
status: PASS
run time:   1 minute 4.176 seconds
{"records_per_sec": 321018.17747, "mb_per_sec": 30.61}
{code}
Full results: https://gist.github.com/ijuma/5fec369d686751a2d84debae8f324d4f

{code}
Commit: fa594c811e4e329b6e7b897bce910c6772c46c0f (KIP-31/32)
test_id:
2016-04-15--014.kafkatest.tests.benchmark_test.Benchmark.test_producer_throughput.topic=topic-replication-factor-three.security_protocol=PLAINTEXT.acks=1.message_size=100
status: PASS
run time:   1 minute 5.079 seconds
{"records_per_sec": 291777.608696, "mb_per_sec": 27.83}
{code}
Full results: https://gist.github.com/ijuma/1d35bd831ff9931448b0294bd9b787ed
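As a quick, editor-added sanity check on the figures above (assuming mb_per_sec is simply records_per_sec times the 100-byte message size, with 1 MB = 2^20 bytes):

{code}
public class ThroughputCheck {
    public static void main(String[] args) {
        // records_per_sec for the snappy runs before and after KIP-31/32.
        double[] recordsPerSec = {519418.343653, 427308.818848};
        for (double rps : recordsPerSec) {
            double mbPerSec = rps * 100 / (1024.0 * 1024.0);
            System.out.printf("%.0f records/s -> %.2f MB/s%n", rps, mbPerSec);
        }
        // Prints ~49.54 and ~40.75 MB/s, matching the reported mb_per_sec values;
        // the compressed-throughput drop is roughly (49.54 - 40.75) / 49.54, about 18%.
    }
}
{code}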



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3561) Auto create through topic for KStream aggregation and join

2016-04-15 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-3561:
-
Description: 
For KStream.join / aggregateByKey operations that require the streams to be 
partitioned on the record key, today users have to repartition the streams 
themselves through the "through" call:

{code}
stream1 = builder.stream("topic1");
stream2 = builder.stream("topic2");

stream3 = stream1.map(/* set the right key for join*/).through("topic3");
stream4 = stream2.map(/* set the right key for join*/).through("topic4");

stream3.join(stream4, ..)
{code}

This pattern can actually be handled by the Streams DSL itself instead of 
requiring users to specify it themselves, i.e. users can just set the right key 
(see KAFKA-3430) and then call join, which will be translated by adding the 
"internal topic for repartition".

Another thing is that today, if users do not call "through" after setting a new 
key, the aggregation result will not be correct: the aggregation is based on 
key B while the source partitions are partitioned by key A, and hence each task 
will only get a partial aggregation for all keys. But this is not validated in 
the DSL today. We should do both the auto-translation and the validation.

  was:
For KStream.join / aggregateByKey operations that requires the streams to be 
partitioned on the record key, today users should repartition themselves 
through the "through" call:

{code}
stream1 = builder.stream("topic1");
stream2 = builder.stream("topic2");

stream3 = stream1.map(/* set the right key for join*/).through("topic3");
stream4 = stream2.map(/* set the right key for join*/).through("topic4");

stream3.join(stream4, ..)
{code}

This pattern can actually be done by the Streams DSL itself instead of 
requiring users to specify themselves, i.e. users can just set the right key 
like (see KAFKA-3430) and then call join, which will be translated by adding 
the "internal topic for repartition".


> Auto create through topic for KStream aggregation and join
> --
>
> Key: KAFKA-3561
> URL: https://issues.apache.org/jira/browse/KAFKA-3561
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>  Labels: api
> Fix For: 0.10.1.0
>
>
> For KStream.join / aggregateByKey operations that require the streams to be 
> partitioned on the record key, today users have to repartition the streams 
> themselves through the "through" call:
> {code}
> stream1 = builder.stream("topic1");
> stream2 = builder.stream("topic2");
> stream3 = stream1.map(/* set the right key for join*/).through("topic3");
> stream4 = stream2.map(/* set the right key for join*/).through("topic4");
> stream3.join(stream4, ..)
> {code}
> This pattern can actually be handled by the Streams DSL itself instead of 
> requiring users to specify it themselves, i.e. users can just set the right 
> key (see KAFKA-3430) and then call join, which will be translated by adding 
> the "internal topic for repartition".
> Another thing is that today, if users do not call "through" after setting a 
> new key, the aggregation result will not be correct: the aggregation is based 
> on key B while the source partitions are partitioned by key A, and hence each 
> task will only get a partial aggregation for all keys. But this is not 
> validated in the DSL today. We should do both the auto-translation and the 
> validation.
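Editor's sketch of the proposal, mirroring the pseudocode in the description above (hypothetical, not an implemented API): once the DSL adds the internal repartition topic itself, the explicit through() calls disappear from user code.

{code}
// Pseudocode in the same style as the snippet above.
stream1 = builder.stream("topic1");
stream2 = builder.stream("topic2");

stream3 = stream1.map(/* set the right key for join */);  // DSL records the key change
stream4 = stream2.map(/* set the right key for join */);

stream3.join(stream4, ..)  // DSL inserts the "internal topic for repartition" before the join
{code}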



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[DISCUSS] KIP-48 Support for delegation tokens as an authentication mechanism

2016-04-15 Thread Jitendra Pandey

 The need for a large number of clients that are running all over the cluster 
that authenticate with Kafka brokers, is very similar to the Hadoop use case of 
large number of tasks running across the cluster that need authentication to 
Hdfs Namenode. Therefore, the delegation token approach does seem like a good 
fit for this use case as we have seen it working at large scale in HDFS and 
YARN.

  The proposed design is very much in line with the Hadoop approach. A few comments:

1) Why do you guys want to allow infinite renewable lifetime for a token? HDFS 
restricts a token to a max life time (default 7 days).  A token's vulnerability 
is believed to increase with time.

2) As I understand the tokens are stored in zookeeper as well, and can be 
updated there. This is clever as it can allow replacing the tokens once they 
run out of max life time, and clients can download new tokens from zookeeper. 
It shouldn't be a big load on zookeeper as a client will need to get a new 
token once in several days. In this approach you don't need infinite lifetime 
on the token even for long running clients.

3) The token passwords are generated using a master key. The master key should 
also be periodically changed. In Hadoop, the default renewal period is 1 day.

Thanks for a thorough proposal, great work!
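To make point 1 concrete, an editor-added sketch of the kind of cap being suggested (constants and names are illustrative, not part of the KIP):

{code}
import java.util.concurrent.TimeUnit;

public class TokenLifetimeCheck {
    // HDFS-style maximum lifetime; the renewable lifetime would be shorter.
    static final long MAX_LIFETIME_MS = TimeUnit.DAYS.toMillis(7);

    // A renewal may extend the expiry time, but never past issueTime + MAX_LIFETIME_MS.
    static boolean mayRenew(long issueTimeMs, long nowMs) {
        return nowMs - issueTimeMs < MAX_LIFETIME_MS;
    }
}
{code}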





[GitHub] kafka pull request: MINOR: Typo fixes in ReplicaFetchMaxBytesDoc

2016-04-15 Thread Erethon
GitHub user Erethon opened a pull request:

https://github.com/apache/kafka/pull/1228

MINOR: Typo fixes in ReplicaFetchMaxBytesDoc



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Erethon/kafka trunk

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1228.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1228


commit b4f8c7ca3022db4a1eb796c1870cea9c0c476217
Author: Dionysis Grigoropoulos 
Date:   2016-04-15T20:24:55Z

MINOR: Typo fixes in ReplicaFetchMaxBytesDoc




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Created] (KAFKA-3564) Count metric always increments by 1.0

2016-04-15 Thread Michael Coon (JIRA)
Michael Coon created KAFKA-3564:
---

 Summary: Count metric always increments by 1.0
 Key: KAFKA-3564
 URL: https://issues.apache.org/jira/browse/KAFKA-3564
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.10.1.0
Reporter: Michael Coon


The Count metric's update method always increments its value by 1.0 instead of 
the value passed to it. If this is by design, it's misleading as I want to be 
able to count based on values I send to the record method.
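Paraphrasing the report as code (an editor-added sketch of the behavior described, not the actual Kafka source): the recorded value is ignored and the count only ever grows by one.

{code}
public class CountSketch {
    private double total = 0.0;

    public void update(double value) {
        total += 1.0;       // what the reporter observes on each record() call
        // total += value;  // what the reporter expected for value-based counting
    }

    public double measure() {
        return total;
    }
}
{code}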



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [RELEASE] Anyone minds if we push the next RC another week away?

2016-04-15 Thread Harsha
+1

On Fri, Apr 15, 2016, at 08:06 AM, Grant Henke wrote:
> +1
> 
> On Fri, Apr 15, 2016 at 10:05 AM, Ashish Singh 
> wrote:
> 
> > Good idea. Thanks!
> >
> > On Friday, April 15, 2016, Ismael Juma  wrote:
> >
> > > +1
> > >
> > > On Fri, Apr 15, 2016 at 4:28 AM, Gwen Shapira  > > > wrote:
> > >
> > > > Hi Team,
> > > >
> > > > As we got more time, we merrily expended the scope of all our
> > > > in-progress KIPs :)
> > > >
> > > > I want to be able to at least get some of them in.
> > > >
> > > > Do you feel that pushing the next RC from Apr 22 (next week) to Apr 29
> > > > (week after!) will be helpful?
> > > > Any objections to this plan?
> > > >
> > > > Gwen
> > > >
> > >
> >
> >
> > --
> > Ashish h
> >
> 
> 
> 
> -- 
> Grant Henke
> Software Engineer | Cloudera
> gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


[GitHub] kafka pull request: KAFKA 3421: Update docs with new connector fea...

2016-04-15 Thread Ishiihara
GitHub user Ishiihara opened a pull request:

https://github.com/apache/kafka/pull/1227

KAFKA 3421: Update docs with new connector features

@ewencp @gwenshap Docs. I also tried to clean up some typos. However, it 
seems that even though the source does not have two words without a space in 
between, they show up with no space in between in the generated doc. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Ishiihara/kafka config-doc

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1227.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1227


commit c3c1daa92084e1cb26b2677f3cd6bff474ec27bd
Author: Liquan Pei 
Date:   2016-04-15T02:23:32Z

Clean up existing doc

commit 7708908b0a4852739e67fd169bd3199e23063f3d
Author: Liquan Pei 
Date:   2016-04-15T17:00:20Z

Add docs for new APIs




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [VOTE] KIP-33 - Add a time based log index

2016-04-15 Thread Guozhang Wang
+1 from me. Thanks.

On Fri, Apr 15, 2016 at 9:16 AM, Jun Rao  wrote:

> Hi, Jiangjie,
>
> Thanks for the latest update. +1 on the KIP.
>
> Jun
>
> On Thu, Apr 14, 2016 at 2:36 PM, Becket Qin  wrote:
>
> > Hi Jun,
> >
> > 11. Yes, that sounds a reasonable solution. In the latest patch I am
> doing
> > the following in order:
> > a. Create an empty time index for a log segment if there isn't one.
> > b. For all non-active log segments, append an entry of
> > (last_modification_time -> next_base_offset) into the index. The time
> index
> > of the active segment is left empty. All the in-memory maxTimestamp is
> set
> > to -1.
> > c. If there is a hard failure, the time index and offset index will both
> be
> > rebuilt.
> >
> > So we do not rebuild the time index during upgrade unless there was a
> hard
> > failure. I have updated the wiki to reflect this.
> >
> > BTW, it seems that the current code will never hit the case where an
> index
> > is missing. I commented on PR.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> >
> > On Thu, Apr 14, 2016 at 10:00 AM, Jun Rao  wrote:
> >
> > > Hi, Jiangjie,
> > >
> > > 11. Rebuilding all missing time indexes will make the upgrade process
> > > longer since the log segments pre 0.10.0 don't have the time index.
> Could
> > > we only rebuild the missing indexes after the last flush offset? For
> > other
> > > segments missing the time index, we just assume lastTimestamp to be -1?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Apr 13, 2016 at 9:55 AM, Becket Qin 
> > wrote:
> > >
> > > > Hi Jun and Guozhang,
> > > >
> > > > I have updated the KIP wiki to incorporate your comments. Please let
> me
> > > > know if you prefer starting another discussion thread for further
> > > > discussion.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Mon, Apr 11, 2016 at 12:21 AM, Becket Qin 
> > > wrote:
> > > >
> > > > > Hi Guozhang and Jun,
> > > > >
> > > > > Thanks for the comments. Please see the responses below.
> > > > >
> > > > > Regarding to Guozhang's question #1 and Jun's question #12. I was
> > > > > inserting the time index and offset index entry together mostly for
> > > > > simplicity as Guozhang mentioned. The purpose of using index
> interval
> > > > bytes
> > > > > for time index was to control the density of the time index, which
> is
> > > the
> > > > > same purpose as offset index. It seems reasonable to make them
> > aligned.
> > > > We
> > > > > can track separately the physical position when we insert the last
> > time
> > > > > index entry(my original code did that), but when I wrote the code I
> > > feel
> > > > it
> > > > > seems unnecessary. Another minor benefit is that searching by
> > timestamp
> > > > > could be potentially faster if we align the time index and offset
> > > index.
> > > > > It is possible that we only have either a corrupted time index or
> an
> > > > > offset index, but not both. Although we can choose to only rebuild
> > the
> > > > one
> > > > > which is corrupted, given that we have to scan the entire log
> segment
> > > > > anyway, rebuilding both of them seems not much overhead. So the
> > current
> > > > > patch I have is rebuilding both of them together.
> > > > >
> > > > > 10. Yes, it should only happen after a hard failure. The last time
> > > index
> > > > > entry of a normally closed segment has already points to the LEO,
> so
> > > > there
> > > > > is no scan during start up.
> > > > >
> > > > > 11. On broker startup, if a time index does not exist, an empty one
> > > will
> > > > > be created first. If message format version is 0.9.0, we will
> append
> > a
> > > > time
> > > > > index entry of (last modification time -> base offset of next
> > segment)
> > > to
> > > > > the time index of each inactive segment. So no actual rebuild will
> > > happen
> > > > > during upgrade. However, if message format version is 0.10.0, we
> will
> > > > > rebuild the time index if it does not exist. (I actually had a
> > question
> > > > > about the how we are loading the log segments, we can discuss it in
> > the
> > > > PR)
> > > > >
> > > > > I will update the wiki to clarify the question raised in the
> comments
> > > and
> > > > > submit a PR by tomorrow. I am currently cleaning up the
> > documentation.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > >
> > > > >
> > > > > On Sun, Apr 10, 2016 at 9:25 PM, Jun Rao  wrote:
> > > > >
> > > > >> Hi, Jiangjie,
> > > > >>
> > > > >> Thanks for the update. Looks good to me overall. Just a few minor
> > > > comments
> > > > >> below.
> > > > >>
> > > > >> 10. On broker startup, it's not clear to me why we need to scan
> the
> > > log
> > > > >> segment to retrieve the largest timestamp since the time index
> > always
> > > > has
> > > > >> an entry for the largest timestamp. Is 

Re: [VOTE] KIP-33 - Add a time based log index

2016-04-15 Thread Jun Rao
Hi, Jiangjie,

Thanks for the latest update. +1 on the KIP.

Jun

On Thu, Apr 14, 2016 at 2:36 PM, Becket Qin  wrote:

> Hi Jun,
>
> 11. Yes, that sounds a reasonable solution. In the latest patch I am doing
> the following in order:
> a. Create an empty time index for a log segment if there isn't one.
> b. For all non-active log segments, append an entry of
> (last_modification_time -> next_base_offset) into the index. The time index
> of the active segment is left empty. All the in-memory maxTimestamp is set
> to -1.
> c. If there is a hard failure, the time index and offset index will both be
> rebuilt.
>
> So we do not rebuild the time index during upgrade unless there was a hard
> failure. I have updated the wiki to reflect this.
>
> BTW, it seems that the current code will never hit the case where an index
> is missing. I commented on PR.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
>
> On Thu, Apr 14, 2016 at 10:00 AM, Jun Rao  wrote:
>
> > Hi, Jiangjie,
> >
> > 11. Rebuilding all missing time indexes will make the upgrade process
> > longer since the log segments pre 0.10.0 don't have the time index. Could
> > we only rebuild the missing indexes after the last flush offset? For
> other
> > segments missing the time index, we just assume lastTimestamp to be -1?
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Apr 13, 2016 at 9:55 AM, Becket Qin 
> wrote:
> >
> > > Hi Jun and Guozhang,
> > >
> > > I have updated the KIP wiki to incorporate your comments. Please let me
> > > know if you prefer starting another discussion thread for further
> > > discussion.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Mon, Apr 11, 2016 at 12:21 AM, Becket Qin 
> > wrote:
> > >
> > > > Hi Guozhang and Jun,
> > > >
> > > > Thanks for the comments. Please see the responses below.
> > > >
> > > > Regarding Guozhang's question #1 and Jun's question #12, I was
> > > > inserting the time index and offset index entry together mostly for
> > > > simplicity as Guozhang mentioned. The purpose of using index interval
> > > bytes
> > > > for time index was to control the density of the time index, which is
> > the
> > > > same purpose as offset index. It seems reasonable to make them
> aligned.
> > > We
> > > > can track separately the physical position when we insert the last
> time
> > > > index entry(my original code did that), but when I wrote the code I
> > feel
> > > it
> > > > seems unnecessary. Another minor benefit is that searching by
> timestamp
> > > > could be potentially faster if we align the time index and offset
> > index.
> > > > It is possible that we only have either a corrupted time index or an
> > > > offset index, but not both. Although we can choose to only rebuild
> the
> > > one
> > > > which is corrupted, given that we have to scan the entire log segment
> > > > anyway, rebuilding both of them seems not much overhead. So the
> current
> > > > patch I have is rebuilding both of them together.
> > > >
> > > > 10. Yes, it should only happen after a hard failure. The last time
> > index
> > > > entry of a normally closed segment has already points to the LEO, so
> > > there
> > > > is no scan during start up.
> > > >
> > > > 11. On broker startup, if a time index does not exist, an empty one
> > will
> > > > be created first. If message format version is 0.9.0, we will append
> a
> > > time
> > > > index entry of (last modification time -> base offset of next
> segment)
> > to
> > > > the time index of each inactive segment. So no actual rebuild will
> > happen
> > > > during upgrade. However, if message format version is 0.10.0, we will
> > > > rebuild the time index if it does not exist. (I actually had a
> question
> > > > about the how we are loading the log segments, we can discuss it in
> the
> > > PR)
> > > >
> > > > I will update the wiki to clarify the question raised in the comments
> > and
> > > > submit a PR by tomorrow. I am currently cleaning up the
> documentation.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > >
> > > >
> > > > On Sun, Apr 10, 2016 at 9:25 PM, Jun Rao  wrote:
> > > >
> > > >> Hi, Jiangjie,
> > > >>
> > > >> Thanks for the update. Looks good to me overall. Just a few minor
> > > comments
> > > >> below.
> > > >>
> > > >> 10. On broker startup, it's not clear to me why we need to scan the
> > log
> > > >> segment to retrieve the largest timestamp since the time index
> always
> > > has
> > > >> an entry for the largest timestamp. Is that only for restarting
> after
> > a
> > > >> hard failure?
> > > >>
> > > >> 11. On broker startup, if a log segment misses the time index, do we
> > > >> always
> > > >> rebuild it? This can happen when the broker is upgraded.
> > > >>
> > > >> 12. Related to Guozhang's question #1. It seems it's simpler to add
> > time
> > > >> index entries independent of the offset index since at index entry
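
For readers skimming the thread, the upgrade sequence quoted above (create an empty time index if one is missing, seed each inactive segment with a single last-modification-time entry, and only rebuild after a hard failure) can be summarised with a small sketch. Everything below is a hypothetical illustration with made-up types, not Kafka's actual log/segment code:

{code}
// Minimal sketch of the startup seeding step described above; hypothetical types only.
import scala.collection.mutable.ArrayBuffer

case class TimeIndexEntry(timestampMs: Long, offset: Long)

class TimeIndex {
  private val entries = ArrayBuffer.empty[TimeIndexEntry]
  def isEmpty: Boolean = entries.isEmpty
  def append(entry: TimeIndexEntry): Unit = entries += entry
}

// Segments are assumed to be ordered by base offset; the last one is the active segment.
case class LogSegment(baseOffset: Long, lastModifiedMs: Long, timeIndex: TimeIndex = new TimeIndex)

def seedTimeIndexes(segments: Seq[LogSegment]): Unit =
  segments.sliding(2).foreach {
    case Seq(inactive, next) if inactive.timeIndex.isEmpty =>
      // step (b) above: (last_modification_time -> next segment's base offset)
      inactive.timeIndex.append(TimeIndexEntry(inactive.lastModifiedMs, next.baseOffset))
    case _ => // the active (last) segment keeps an empty time index
  }
{code}

If there was a hard failure, step (c) instead rebuilds both indexes by scanning the segment, which is the part that can lengthen startup.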

Re: IncompatibleClassChangeError

2016-04-15 Thread Michael D. Coon
Thanks.
I can't copy/paste stack trace since it's on an internal system.
I'll do my best to hand-type the thing in here:
java.lang.IncompatibleClassChangeError
    at scala.collection.immutable.StringLike$class.format(StringLike.scala:318)
    at scala.collection.immutable.StringOps.format(StringOps.scala:30)
    at kafka.cluster.Partition$$anonfun$9.apply(Partition.scala:450)
    at kafka.cluster.Partition$$anonfun$9.apply(Partition.scala:428)
    at kafka.utils.CoreUtils$.inLock(CoreUtils.scala:262)
    at kafka.utils.CoreUtils$.inReadLock(CoreUtils.scala:268)
    at kafka.cluster.Partition.appendMessagesToLeader(Partition.scala:428)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:401)
    at kafka.server.ReplicaManager$$anonfun$appendToLocalLog$2.apply(ReplicaManager.scala:386)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
    at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:245)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:99)
    at scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:230)
    at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:40)
    at scala.collection.mutable.HashMap.foreach(HashMap.scala:99)
    at scala.collection.TraversableLike$class.map(TraversableLike.scala:245)
    at scala.collection.AbstractTraversable.map(Traversable.scala:104)
    at kafka.server.ReplicaManager.appendToLocalLog(ReplicaManager.scala:386)
    at kafka.server.ReplicaManager.appendMessages(ReplicaManager.scala:322)
    at kafka.server.KafkaApis.handleProducerRequest(KafkaApis.scala:366)
    at kafka.server.KafkaApis.handle(KafkaApis.scala:68)
    at kafka.server.KafkaRequestHandler.run(KafkaRequestHandler.scala:60)
    at java.lang.Thread.run(Thread.java:853)

 

On Friday, April 15, 2016 10:03 AM, Ismael Juma  wrote:
 

 Hi Michael,

We would need more information to be able to help. We don't serialize scala
objects and clients should be able to use different Scala versions than the
broker. Do you have a stacktrace of when the exception is thrown?

Ismael

On Fri, Apr 15, 2016 at 2:39 PM, Michael D. Coon 
wrote:

> We are seeing odd behavior that we need to understand. We are getting
> IncompatibleClassChangeErrors and I know that's related to a Scala version
> mismatch. What's not clear, however, is where or why the mismatch is
> occurring. We know up front that there were occasions where we ran apps
> that had Scala 2.10 dependencies with the 0.9 consumer. We also plugged in
> a metrics reporter at the broker with 2.10 scala dependencies. Both of
> these situations produced the class change errors and they both appear to
> be around the consumer offsets topic. Is there something under the hood
> that is serializing a scala object that would cause this issue? The only
> fix appears to be blowing away all data in the __consumer_offsets topic and
> starting over with 2.11 clients. Is this expected behavior? Seems like a
> weakness if so because we have no control, in some cases, what version of
> Scala a client might use.
>
>


  

[jira] [Commented] (KAFKA-3358) Only request metadata updates once we have topics or a pattern subscription

2016-04-15 Thread Grant Henke (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243135#comment-15243135
 ] 

Grant Henke commented on KAFKA-3358:


[~hachikuji] I think it would be a valuable patch separate from the KIP-4 
changes too. 

> Only request metadata updates once we have topics or a pattern subscription
> ---
>
> Key: KAFKA-3358
> URL: https://issues.apache.org/jira/browse/KAFKA-3358
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> The current code requests a metadata update for _all_ topics which can cause 
> major load issues in large clusters.
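
One way to read the intended behavior (a hedged sketch with made-up names, not the actual consumer code): only fall back to "all topics" metadata when a pattern subscription requires it, and request nothing while there is no subscription at all.

{code}
// Hypothetical sketch only; not the consumer's internals.
import scala.util.matching.Regex

// Returns None to mean "request metadata for ALL topics", otherwise the exact set to request.
def topicsForMetadataRequest(subscribedTopics: Set[String],
                             pattern: Option[Regex]): Option[Set[String]] =
  pattern match {
    case Some(_)                          => None                   // pattern subscription needs the full topic list
    case None if subscribedTopics.isEmpty => Some(Set.empty)        // nothing subscribed yet: skip the update
    case None                             => Some(subscribedTopics) // only the topics we actually care about
  }
{code}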



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3358) Only request metadata updates once we have topics or a pattern subscription

2016-04-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243134#comment-15243134
 ] 

Ismael Juma commented on KAFKA-3358:


Also, are there other places where we are using `dequeFor` unnecessarily? It 
seems to me that we should only use it when we want to rely on the automatic 
creation of the deque, but we tend to use it everywhere.
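
For readers less familiar with the code, the distinction being drawn is roughly the one below: an accessor that creates the deque as a side effect versus a plain lookup. The class and method names here are hypothetical stand-ins, not the actual client code.

{code}
// Hypothetical illustration only.
import java.util.ArrayDeque
import scala.collection.mutable

final class Requests {
  private val queues = mutable.Map.empty[String, ArrayDeque[Int]]

  // Auto-creating accessor: only appropriate when we actually want a deque to come into existence.
  def dequeFor(node: String): ArrayDeque[Int] =
    queues.getOrElseUpdate(node, new ArrayDeque[Int]())

  // Read-only lookup: no allocation side effect, suitable for the other call sites.
  def existingDequeFor(node: String): Option[ArrayDeque[Int]] =
    queues.get(node)
}
{code}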

> Only request metadata updates once we have topics or a pattern subscription
> ---
>
> Key: KAFKA-3358
> URL: https://issues.apache.org/jira/browse/KAFKA-3358
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> The current code requests a metadata update for _all_ topics which can cause 
> major load issues in large clusters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3358) Only request metadata updates once we have topics or a pattern subscription

2016-04-15 Thread Ismael Juma (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243129#comment-15243129
 ] 

Ismael Juma commented on KAFKA-3358:


[~hachikuji], I think that's a good change so I suggest you file a PR (it would 
be great to have a test too) when you have a chance.

> Only request metadata updates once we have topics or a pattern subscription
> ---
>
> Key: KAFKA-3358
> URL: https://issues.apache.org/jira/browse/KAFKA-3358
> Project: Kafka
>  Issue Type: Improvement
>  Components: clients
>Affects Versions: 0.9.0.0, 0.9.0.1
>Reporter: Ismael Juma
>Assignee: Jason Gustafson
>Priority: Critical
> Fix For: 0.10.1.0
>
>
> The current code requests a metadata update for _all_ topics which can cause 
> major load issues in large clusters.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Work started] (KAFKA-3421) Update docs with new connector features

2016-04-15 Thread Liquan Pei (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on KAFKA-3421 started by Liquan Pei.
-
> Update docs with new connector features
> ---
>
> Key: KAFKA-3421
> URL: https://issues.apache.org/jira/browse/KAFKA-3421
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.10.0.0
>Reporter: Gwen Shapira
>Assignee: Liquan Pei
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> Documentation for connector features added in 0.10.0. This should include:
> * Configuration REST API
> * Connector Plugins REST API
> * Pause / Resume (if it is in 0.10)
> * Connector aliases
> * New CommitRecord API



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3563) Maintain MessageAndMetadata constructor compatibility

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15243088#comment-15243088
 ] 

ASF GitHub Bot commented on KAFKA-3563:
---

GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/1226

KAFKA-3563: Maintain MessageAndMetadata constructor compatibility



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka message_constructor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1226


commit cf6b651dac2dc87ccb85b9c6c63e8010d9067818
Author: Grant Henke 
Date:   2016-04-15T14:59:13Z

KAFKA-3563: Maintain MessageAndMetadata constructor compatibility




> Maintain MessageAndMetadata constructor compatibility 
> --
>
> Key: KAFKA-3563
> URL: https://issues.apache.org/jira/browse/KAFKA-3563
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.0.0
>Reporter: Grant Henke
>Assignee: Grant Henke
> Fix For: 0.10.0.0
>
>
> The MessageAndMetadata constructor was changed to include timestamp 
> information as a part of KIP-32. Though the constructor may not be used in 
> general client usage, it may be used in unit tests or some advanced usage. We 
> should maintain compatibility if possible. 
> One example where the constructor is used is Apache Spark: 
> https://github.com/apache/spark/blob/master/external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala#L223-L225
> The old constructor was:
> {code}
> MessageAndMetadata[K, V](topic: String,
>partition: Int,
>private val rawMessage: Message,
>offset: Long,
>keyDecoder: Decoder[K], valueDecoder: Decoder[V])
> {code}
> And after KIP-32 it is now:
> {code}
> MessageAndMetadata[K, V](topic: String,
>partition: Int,
>private val rawMessage: Message,
>offset: Long,
>timestamp: Long = Message.NoTimestamp,
>timestampType: TimestampType = TimestampType.CREATE_TIME,
>keyDecoder: Decoder[K], valueDecoder: Decoder[V])
> {code}
> Even though _timestamp_ and _timestampType_ have defaults, if _keyDecoder_ 
> and _valueDecoder_ were not accessed by name, then the new constructor is not 
> backwards compatible. 
> We can fix compatibility by moving the _timestamp_ and _timestampType_ 
> parameters to the end of the constructor, or by providing a new constructor 
> without _timestamp_ and _timestampType_ that matches the old constructor. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3563) Maintain MessageAndMetadata constructor compatibility

2016-04-15 Thread Grant Henke (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Grant Henke updated KAFKA-3563:
---
Status: Patch Available  (was: Open)

> Maintain MessageAndMetadata constructor compatibility 
> --
>
> Key: KAFKA-3563
> URL: https://issues.apache.org/jira/browse/KAFKA-3563
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 0.10.0.0
>Reporter: Grant Henke
>Assignee: Grant Henke
> Fix For: 0.10.0.0
>
>
> The MessageAndMetadata constructor was changed to include timestamp 
> information as a part of KIP-32. Though the constructor may not be used in 
> general client usage, it may be used in unit tests or some advanced usage. We 
> should maintain compatibility if possible. 
> One example where the constructor is used is Apache Spark: 
> https://github.com/apache/spark/blob/master/external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala#L223-L225
> The old constructor was:
> {code}
> MessageAndMetadata[K, V](topic: String,
>partition: Int,
>private val rawMessage: Message,
>offset: Long,
>keyDecoder: Decoder[K], valueDecoder: Decoder[V])
> {code}
> And after KIP-32 it is now:
> {code}
> MessageAndMetadata[K, V](topic: String,
>partition: Int,
>private val rawMessage: Message,
>offset: Long,
>timestamp: Long = Message.NoTimestamp,
>timestampType: TimestampType = TimestampType.CREATE_TIME,
>keyDecoder: Decoder[K], valueDecoder: Decoder[V])
> {code}
> Even though _timestamp_ and _timestampType_ have defaults, if _keyDecoder_ 
> and _valueDecoder_ were not accessed by name, then the new constructor is not 
> backwards compatible. 
> We can fix compatibility by moving the _timestamp_ and _timestampType_ 
> parameters to the end of the constructor, or by providing a new constructor 
> without _timestamp_ and _timestampType_ that matches the old constructor. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3563: Maintain MessageAndMetadata constr...

2016-04-15 Thread granthenke
GitHub user granthenke opened a pull request:

https://github.com/apache/kafka/pull/1226

KAFKA-3563: Maintain MessageAndMetadata constructor compatibility



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/granthenke/kafka message_constructor

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1226.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1226


commit cf6b651dac2dc87ccb85b9c6c63e8010d9067818
Author: Grant Henke 
Date:   2016-04-15T14:59:13Z

KAFKA-3563: Maintain MessageAndMetadata constructor compatibility




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [RELEASE] Anyone minds if we push the next RC another week away?

2016-04-15 Thread Grant Henke
+1

On Fri, Apr 15, 2016 at 10:05 AM, Ashish Singh  wrote:

> Good idea. Thanks!
>
> On Friday, April 15, 2016, Ismael Juma  wrote:
>
> > +1
> >
> > On Fri, Apr 15, 2016 at 4:28 AM, Gwen Shapira  > > wrote:
> >
> > > Hi Team,
> > >
> > > As we got more time, we merrily expanded the scope of all our
> > > in-progress KIPs :)
> > >
> > > I want to be able to at least get some of them in.
> > >
> > > Do you feel that pushing the next RC from Apr 22 (next week) to Apr 29
> > > (week after!) will be helpful?
> > > Any objections to this plan?
> > >
> > > Gwen
> > >
> >
>
>
> --
> Ashish h
>



-- 
Grant Henke
Software Engineer | Cloudera
gr...@cloudera.com | twitter.com/gchenke | linkedin.com/in/granthenke


Re: [RELEASE] Anyone minds if we push the next RC another week away?

2016-04-15 Thread Ashish Singh
Good idea. Thanks!

On Friday, April 15, 2016, Ismael Juma  wrote:

> +1
>
> On Fri, Apr 15, 2016 at 4:28 AM, Gwen Shapira  > wrote:
>
> > Hi Team,
> >
> > As we got more time, we merrily expanded the scope of all our
> > in-progress KIPs :)
> >
> > I want to be able to at least get some of them in.
> >
> > Do you feel that pushing the next RC from Apr 22 (next week) to Apr 29
> > (week after!) will be helpful?
> > Any objections to this plan?
> >
> > Gwen
> >
>


-- 
Ashish h


[jira] [Created] (KAFKA-3563) Maintain MessageAndMetadata constructor compatibility

2016-04-15 Thread Grant Henke (JIRA)
Grant Henke created KAFKA-3563:
--

 Summary: Maintain MessageAndMetadata constructor compatibility 
 Key: KAFKA-3563
 URL: https://issues.apache.org/jira/browse/KAFKA-3563
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.10.0.0
Reporter: Grant Henke
Assignee: Grant Henke
 Fix For: 0.10.0.0


The MessageAndMetadata constructor was changed to include timestamp information 
as a part of KIP-32. Though the constructor may not be used in general client 
usage, it may be used in unit tests or some advanced usage. We should maintain 
compatibility if possible. 

One example where the constructor is used is Apache Spark: 
https://github.com/apache/spark/blob/master/external/kafka/src/main/scala/org/apache/spark/streaming/kafka/KafkaRDD.scala#L223-L225

The old constructor was:
{code}
MessageAndMetadata[K, V](topic: String,
   partition: Int,
   private val rawMessage: Message,
   offset: Long,
   keyDecoder: Decoder[K], valueDecoder: Decoder[V])
{code}

And after KIP-32 it is now:
{code}
MessageAndMetadata[K, V](topic: String,
   partition: Int,
   private val rawMessage: Message,
   offset: Long,
   timestamp: Long = Message.NoTimestamp,
   timestampType: TimestampType = TimestampType.CREATE_TIME,
   keyDecoder: Decoder[K], valueDecoder: Decoder[V])
{code}

Even though _timestamp_ and _timestampType_ have defaults, if _keyDecoder_ and 
_valueDecoder_ were not accessed by name, then the new constructor is not 
backwards compatible. 

We can fix compatibility by moving the _timestamp_ and _timestampType_ 
parameters to the end of the constructor, or by providing a new constructor 
without _timestamp_ and _timestampType_ that matches the old constructor. 
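
As a rough illustration of the second option (an extra constructor matching the old signature), a sketch along the following lines would keep positional callers compiling. The stand-in types at the top exist only to make the snippet self-contained; this is not the actual patch in the pull request.

{code}
// Minimal stand-ins so the sketch compiles on its own; the real classes live elsewhere in the Kafka code base.
trait Decoder[T]
sealed trait TimestampType
object TimestampType { case object CREATE_TIME extends TimestampType }
class Message
object Message { val NoTimestamp: Long = -1L }

case class MessageAndMetadata[K, V](topic: String,
                                    partition: Int,
                                    rawMessage: Message,
                                    offset: Long,
                                    timestamp: Long = Message.NoTimestamp,
                                    timestampType: TimestampType = TimestampType.CREATE_TIME,
                                    keyDecoder: Decoder[K],
                                    valueDecoder: Decoder[V]) {

  // Secondary constructor with the pre-KIP-32 parameter list, so existing positional callers still compile.
  def this(topic: String, partition: Int, rawMessage: Message, offset: Long,
           keyDecoder: Decoder[K], valueDecoder: Decoder[V]) =
    this(topic, partition, rawMessage, offset,
         Message.NoTimestamp, TimestampType.CREATE_TIME, keyDecoder, valueDecoder)
}
{code}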



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-4 Metadata Schema (Round 2)

2016-04-15 Thread Ismael Juma
Thanks!

Ismael

On Fri, Apr 15, 2016 at 3:36 PM, Grant Henke  wrote:

> I added a row on the main KIP wiki page to indicate the metadata changes
> will be in 0.10 and I also updated the KIP-4 wiki too.
>
> Thanks,
> Grant
>
> On Fri, Apr 15, 2016 at 9:27 AM, Ismael Juma  wrote:
>
> > Third time is a charm. :) Can you please update the wiki page below to
> > include this KIP in the adopted table and 0.10.0.0 as the release?
> >
> >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
> >
> > I know we already have KIP-4 in that table, but I think it's useful for
> > people to know that KIP-4 Metadata is being included in 0.10.0.0 (whereas
> > the rest is not).
> >
> > Ismael
> >
> > On Fri, Apr 15, 2016 at 3:19 PM, Grant Henke 
> wrote:
> >
> > > Thanks to all who voted. The KIP-4 Metadata changes passed with +3
> > > (binding), and +4 (non-binding).
> > >
> > > There is a patch available for review here:
> > > https://github.com/apache/kafka/pull/1095
> > >
> > > On Tue, Apr 12, 2016 at 11:23 AM, Jason Gustafson 
> > > wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > On Mon, Apr 11, 2016 at 6:21 PM, Ismael Juma 
> > wrote:
> > > >
> > > > > +1 (non-binding)
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Tue, Apr 12, 2016 at 2:19 AM, Jun Rao  wrote:
> > > > >
> > > > > > Grant,
> > > > > >
> > > > > > Thanks for the updated version. +1 from me.
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Mon, Apr 11, 2016 at 10:42 AM, Grant Henke <
> ghe...@cloudera.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Based on the discussion in the previous vote thread
> > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> http://search-hadoop.com/m/uyzND1xlaiU10QlYX=+VOTE+KIP+4+Metadata+Schema
> > > > > > > >
> > > > > > > I also would like to include a behavior change to the
> > > > > MetadataResponse. I
> > > > > > > have updated the wiki
> > > > > > > <
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-MetadataSchema
> > > > > > > >
> > > > > > > and pull request 
> to
> > > > > include
> > > > > > > this change.
> > > > > > >
> > > > > > > The change as described on the wiki is:
> > > > > > >
> > > > > > > > The behavior of the replicas and isr arrays will be changed
> in
> > > > order
> > > > > to
> > > > > > > > support the admin tools, and better represent the state of
> the
> > > > > cluster:
> > > > > > > >
> > > > > > > >- In version 0, if a broker is down the replicas and isr
> > array
> > > > > will
> > > > > > > >omit the brokers entry and add a REPLICA_NOT_AVAILABLE
> error
> > > > code.
> > > > > > > >- In version 1, no error code will be set and the broker
> > id
> > > > will
> > > > > > be
> > > > > > > >included in the replicas and isr array.
> > > > > > > >   - Note: A user can still detect if the replica is not
> > > > > available,
> > > > > > by
> > > > > > > >   checking if the broker is in the returned broker list.
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > > Being optimistic that this doesn't require too much discussion,
> I
> > > > would
> > > > > > like
> > > > > > > to re-start the voting process on this thread. If more
> discussion
> > > is
> > > > > > > needed, please don't hesitate to bring it up here.
> > > > > > >
> > > > > > > Ismael, Gwen, Guozhang could you please review and revote based
> > on
> > > > the
> > > > > > > changes.
> > > > > > >
> > > > > > > Thank you,
> > > > > > > Grant
> > > > > > >
> > > > > > > On Sat, Apr 9, 2016 at 1:03 PM, Guozhang Wang <
> > wangg...@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > On Fri, Apr 8, 2016 at 4:36 PM, Gwen Shapira <
> > g...@confluent.io>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > On Fri, Apr 8, 2016 at 2:41 PM, Grant Henke <
> > > ghe...@cloudera.com
> > > > >
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I would like to re-initiate the voting process for the
> > "KIP-4
> > > > > > > Metadata
> > > > > > > > > > Schema changes". This is not a vote for all of KIP-4, but
> > > > > > > specifically
> > > > > > > > > for
> > > > > > > > > > the metadata changes. I have included the exact changes
> > below
> > > > for
> > > > > > > > > clarity:
> > > > > > > > > > >
> > > > > > > > > > > Metadata Request (version 1)
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > MetadataRequest => [topics]
> > > > > > > > > > >
> > > > > > > > > > > Stays the same as version 0 however behavior changes.
> > > > > > > > > > > In version 0 there was no way to request no topics, 

Re: [VOTE] KIP-4 Metadata Schema (Round 2)

2016-04-15 Thread Grant Henke
I added a row on the main KIP wiki page to indicate the metadata changes
will be in 0.10 and I also updated the KIP-4 wiki too.

Thanks,
Grant

On Fri, Apr 15, 2016 at 9:27 AM, Ismael Juma  wrote:

> Third time is a charm. :) Can you please update the wiki page below to
> include this KIP in the adopted table and 0.10.0.0 as the release?
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
>
> I know we already have KIP-4 in that table, but I think it's useful for
> people to know that KIP-4 Metadata is being included in 0.10.0.0 (whereas
> the rest is not).
>
> Ismael
>
> On Fri, Apr 15, 2016 at 3:19 PM, Grant Henke  wrote:
>
> > Thanks to all who voted. The KIP-4 Metadata changes passed with +3
> > (binding), and +4 (non-binding).
> >
> > There is a patch available for review here:
> > https://github.com/apache/kafka/pull/1095
> >
> > On Tue, Apr 12, 2016 at 11:23 AM, Jason Gustafson 
> > wrote:
> >
> > > +1 (non-binding)
> > >
> > > On Mon, Apr 11, 2016 at 6:21 PM, Ismael Juma 
> wrote:
> > >
> > > > +1 (non-binding)
> > > >
> > > > Ismael
> > > >
> > > > On Tue, Apr 12, 2016 at 2:19 AM, Jun Rao  wrote:
> > > >
> > > > > Grant,
> > > > >
> > > > > Thanks for the updated version. +1 from me.
> > > > >
> > > > > Jun
> > > > >
> > > > > On Mon, Apr 11, 2016 at 10:42 AM, Grant Henke  >
> > > > wrote:
> > > > >
> > > > > > Based on the discussion in the previous vote thread
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> http://search-hadoop.com/m/uyzND1xlaiU10QlYX=+VOTE+KIP+4+Metadata+Schema
> > > > > > >
> > > > > > I also would like to include a behavior change to the
> > > > MetadataResponse. I
> > > > > > have updated the wiki
> > > > > > <
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-MetadataSchema
> > > > > > >
> > > > > > and pull request  to
> > > > include
> > > > > > this change.
> > > > > >
> > > > > > The change as described on the wiki is:
> > > > > >
> > > > > > > The behavior of the replicas and isr arrays will be changed in
> > > order
> > > > to
> > > > > > > support the admin tools, and better represent the state of the
> > > > cluster:
> > > > > > >
> > > > > > >- In version 0, if a broker is down the replicas and isr
> array
> > > > will
> > > > > > >omit the brokers entry and add a REPLICA_NOT_AVAILABLE error
> > > code.
> > > > > > >- In version 1, no error code will be set and the broker
> id
> > > will
> > > > > be
> > > > > > >included in the replicas and isr array.
> > > > > > >   - Note: A user can still detect if the replica is not
> > > > available,
> > > > > by
> > > > > > >   checking if the broker is in the returned broker list.
> > > > > > >
> > > > > > >
> > > > > >
> > > > > > Being optimistic that this doesn't require too much discussion, I
> > > would
> > > > > like
> > > > > > to re-start the voting process on this thread. If more discussion
> > is
> > > > > > needed, please don't hesitate to bring it up here.
> > > > > >
> > > > > > Ismael, Gwen, Guozhang could you please review and revote based
> on
> > > the
> > > > > > changes.
> > > > > >
> > > > > > Thank you,
> > > > > > Grant
> > > > > >
> > > > > > On Sat, Apr 9, 2016 at 1:03 PM, Guozhang Wang <
> wangg...@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > On Fri, Apr 8, 2016 at 4:36 PM, Gwen Shapira <
> g...@confluent.io>
> > > > > wrote:
> > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > On Fri, Apr 8, 2016 at 2:41 PM, Grant Henke <
> > ghe...@cloudera.com
> > > >
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > I would like to re-initiate the voting process for the
> "KIP-4
> > > > > > Metadata
> > > > > > > > > Schema changes". This is not a vote for all of KIP-4, but
> > > > > > specifically
> > > > > > > > for
> > > > > > > > > the metadata changes. I have included the exact changes
> below
> > > for
> > > > > > > > clarity:
> > > > > > > > > >
> > > > > > > > > > Metadata Request (version 1)
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > MetadataRequest => [topics]
> > > > > > > > > >
> > > > > > > > > > Stays the same as version 0 however behavior changes.
> > > > > > > > > > In version 0 there was no way to request no topics, and
> an
> > > > empty
> > > > > > > list
> > > > > > > > > > signified all topics.
> > > > > > > > > > In version 1 a null topics list (size -1 on the wire)
> will
> > > > > indicate
> > > > > > > > that
> > > > > > > > > a
> > > > > > > > > > user wants *ALL* topic metadata. Compared to an empty
> list
> > > > (size
> > > > > 0)
> > > > > > > > which
> > > > > > > > > > indicates metadata for *NO* topics 

Re: [VOTE] KIP-4 Metadata Schema (Round 2)

2016-04-15 Thread Ismael Juma
Third time is a charm. :) Can you please update the wiki page below to
include this KIP in the adopted table and 0.10.0.0 as the release?

https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals

I know we already have KIP-4 in that table, but I think it's useful for
people to know that KIP-4 Metadata is being included in 0.10.0.0 (whereas
the rest is not).

Ismael

On Fri, Apr 15, 2016 at 3:19 PM, Grant Henke  wrote:

> Thanks to all who voted. The KIP-4 Metadata changes passed with +3
> (binding), and +4 (non-binding).
>
> There is a patch available for review here:
> https://github.com/apache/kafka/pull/1095
>
> On Tue, Apr 12, 2016 at 11:23 AM, Jason Gustafson 
> wrote:
>
> > +1 (non-binding)
> >
> > On Mon, Apr 11, 2016 at 6:21 PM, Ismael Juma  wrote:
> >
> > > +1 (non-binding)
> > >
> > > Ismael
> > >
> > > On Tue, Apr 12, 2016 at 2:19 AM, Jun Rao  wrote:
> > >
> > > > Grant,
> > > >
> > > > Thanks for the updated version. +1 from me.
> > > >
> > > > Jun
> > > >
> > > > On Mon, Apr 11, 2016 at 10:42 AM, Grant Henke 
> > > wrote:
> > > >
> > > > > Based on the discussion in the previous vote thread
> > > > > <
> > > > >
> > > >
> > >
> >
> http://search-hadoop.com/m/uyzND1xlaiU10QlYX=+VOTE+KIP+4+Metadata+Schema
> > > > > >
> > > > > I also would like to include a behavior change to the
> > > MetadataResponse. I
> > > > > have updated the wiki
> > > > > <
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-MetadataSchema
> > > > > >
> > > > > and pull request  to
> > > include
> > > > > this change.
> > > > >
> > > > > The change as described on the wiki is:
> > > > >
> > > > > > The behavior of the replicas and isr arrays will be changed in
> > order
> > > to
> > > > > > support the admin tools, and better represent the state of the
> > > cluster:
> > > > > >
> > > > > >- In version 0, if a broker is down the replicas and isr array
> > > will
> > > > > >omit the brokers entry and add a REPLICA_NOT_AVAILABLE error
> > code.
> > > > > >- In version 1, no error code will be set and the broker id
> > will
> > > > be
> > > > > >included in the replicas and isr array.
> > > > > >   - Note: A user can still detect if the replica is not
> > > available,
> > > > by
> > > > > >   checking if the broker is in the returned broker list.
> > > > > >
> > > > > >
> > > > >
> > > > > Being optimistic that this doesn't require too much discussion, I
> > would
> > > > like
> > > > > to re-start the voting process on this thread. If more discussion
> is
> > > > > needed, please don't hesitate to bring it up here.
> > > > >
> > > > > Ismael, Gwen, Guozhang could you please review and revote based on
> > the
> > > > > changes.
> > > > >
> > > > > Thank you,
> > > > > Grant
> > > > >
> > > > > On Sat, Apr 9, 2016 at 1:03 PM, Guozhang Wang 
> > > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > On Fri, Apr 8, 2016 at 4:36 PM, Gwen Shapira 
> > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > On Fri, Apr 8, 2016 at 2:41 PM, Grant Henke <
> ghe...@cloudera.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > I would like to re-initiate the voting process for the "KIP-4
> > > > > Metadata
> > > > > > > > Schema changes". This is not a vote for all of KIP-4, but
> > > > > specifically
> > > > > > > for
> > > > > > > > the metadata changes. I have included the exact changes below
> > for
> > > > > > > clarity:
> > > > > > > > >
> > > > > > > > > Metadata Request (version 1)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > MetadataRequest => [topics]
> > > > > > > > >
> > > > > > > > > Stays the same as version 0 however behavior changes.
> > > > > > > > > In version 0 there was no way to request no topics, and an
> > > empty
> > > > > > list
> > > > > > > > > signified all topics.
> > > > > > > > > In version 1 a null topics list (size -1 on the wire) will
> > > > indicate
> > > > > > > that
> > > > > > > > a
> > > > > > > > > user wants *ALL* topic metadata. Compared to an empty list
> > > (size
> > > > 0)
> > > > > > > which
> > > > > > > > > indicates metadata for *NO* topics should be returned.
> > > > > > > > > Metadata Response (version 1)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > MetadataResponse => [brokers] controllerId [topic_metadata]
> > > > > > > > >   brokers => node_id host port rack
> > > > > > > > > node_id => INT32
> > > > > > > > > host => STRING
> > > > > > > > > port => INT32
> > > > > > > > > rack => NULLABLE_STRING
> > > > > > > > >   controllerId => INT32
> > > > > > > > >   topic_metadata => topic_error_code topic is_internal
> > > > > 

Re: [VOTE] KIP-4 Metadata Schema (Round 2)

2016-04-15 Thread Grant Henke
Thanks to all who voted. The KIP-4 Metadata changes passed with +3
(binding), and +4 (non-binding).

There is a patch available for review here:
https://github.com/apache/kafka/pull/1095

On Tue, Apr 12, 2016 at 11:23 AM, Jason Gustafson 
wrote:

> +1 (non-binding)
>
> On Mon, Apr 11, 2016 at 6:21 PM, Ismael Juma  wrote:
>
> > +1 (non-binding)
> >
> > Ismael
> >
> > On Tue, Apr 12, 2016 at 2:19 AM, Jun Rao  wrote:
> >
> > > Grant,
> > >
> > > Thanks for the updated version. +1 from me.
> > >
> > > Jun
> > >
> > > On Mon, Apr 11, 2016 at 10:42 AM, Grant Henke 
> > wrote:
> > >
> > > > Based on the discussion in the previous vote thread
> > > > <
> > > >
> > >
> >
> http://search-hadoop.com/m/uyzND1xlaiU10QlYX=+VOTE+KIP+4+Metadata+Schema
> > > > >
> > > > I also would like to include a behavior change to the
> > MetadataResponse. I
> > > > have updated the wiki
> > > > <
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-4+-+Command+line+and+centralized+administrative+operations#KIP-4-Commandlineandcentralizedadministrativeoperations-MetadataSchema
> > > > >
> > > > and pull request  to
> > include
> > > > this change.
> > > >
> > > > The change as described on the wiki is:
> > > >
> > > > > The behavior of the replicas and isr arrays will be changed in
> order
> > to
> > > > > support the admin tools, and better represent the state of the
> > cluster:
> > > > >
> > > > >- In version 0, if a broker is down the replicas and isr array
> > will
> > > > >omit the brokers entry and add a REPLICA_NOT_AVAILABLE error
> code.
> > > > >- In version 1, no error code will be set and the broker id
> will
> > > be
> > > > >included in the replicas and isr array.
> > > > >   - Note: A user can still detect if the replica is not
> > available,
> > > by
> > > > >   checking if the broker is in the returned broker list.
> > > > >
> > > > >
> > > >
> > > > Being optimistic that this doesn't require too much discussion, I
> would
> > > like
> > > > to re-start the voting process on this thread. If more discussion is
> > > > needed, please don't hesitate to bring it up here.
> > > >
> > > > Ismael, Gwen, Guozhang could you please review and revote based on
> the
> > > > changes.
> > > >
> > > > Thank you,
> > > > Grant
> > > >
> > > > On Sat, Apr 9, 2016 at 1:03 PM, Guozhang Wang 
> > > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > On Fri, Apr 8, 2016 at 4:36 PM, Gwen Shapira 
> > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > On Fri, Apr 8, 2016 at 2:41 PM, Grant Henke  >
> > > > wrote:
> > > > > >
> > > > > > > I would like to re-initiate the voting process for the "KIP-4
> > > > Metadata
> > > > > > > Schema changes". This is not a vote for all of KIP-4, but
> > > > specifically
> > > > > > for
> > > > > > > the metadata changes. I have included the exact changes below
> for
> > > > > > clarity:
> > > > > > > >
> > > > > > > > Metadata Request (version 1)
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > MetadataRequest => [topics]
> > > > > > > >
> > > > > > > > Stays the same as version 0 however behavior changes.
> > > > > > > > In version 0 there was no way to request no topics, and an
> > empty
> > > > > list
> > > > > > > > signified all topics.
> > > > > > > > In version 1 a null topics list (size -1 on the wire) will
> > > indicate
> > > > > > that
> > > > > > > a
> > > > > > > > user wants *ALL* topic metadata. Compared to an empty list
> > (size
> > > 0)
> > > > > > which
> > > > > > > > indicates metadata for *NO* topics should be returned.
> > > > > > > > Metadata Response (version 1)
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > MetadataResponse => [brokers] controllerId [topic_metadata]
> > > > > > > >   brokers => node_id host port rack
> > > > > > > > node_id => INT32
> > > > > > > > host => STRING
> > > > > > > > port => INT32
> > > > > > > > rack => NULLABLE_STRING
> > > > > > > >   controllerId => INT32
> > > > > > > >   topic_metadata => topic_error_code topic is_internal
> > > > > > > [partition_metadata]
> > > > > > > > topic_error_code => INT16
> > > > > > > > topic => STRING
> > > > > > > > is_internal => BOOLEAN
> > > > > > > > partition_metadata => partition_error_code partition_id
> > > leader
> > > > > > > [replicas] [isr]
> > > > > > > >   partition_error_code => INT16
> > > > > > > >   partition_id => INT32
> > > > > > > >   leader => INT32
> > > > > > > >   replicas => INT32
> > > > > > > >   isr => INT32
> > > > > > > >
> > > > > > > > Adds rack, controller_id, and is_internal to the version 0
> > > > response.
> > > > > > > >
> > > > > > >
> > > > > > > The KIP is available here for reference (linked to the Metadata
> > > > schema
> > > > > > > section):
> > > > > > > *
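
Pulling together the two behavior changes voted on in this thread, here is a short sketch of how a version 1 client might interpret them; the types and helpers are hypothetical, not the actual request/response classes.

{code}
// Hypothetical sketch of the version 1 semantics described above.
import scala.collection.JavaConverters._

// Request side: a null topics list (size -1 on the wire) means ALL topics,
// an empty list (size 0) means NO topics.
def topicsToDescribe(requested: java.util.List[String], allTopics: Set[String]): Set[String] =
  if (requested == null) allTopics
  else if (requested.isEmpty) Set.empty
  else requested.asScala.toSet intersect allTopics

// Response side: version 1 sets no REPLICA_NOT_AVAILABLE error code; a client infers
// availability by checking replica ids against the broker list returned in the response.
case class PartitionMetadata(partition: Int, leader: Int, replicas: Seq[Int], isr: Seq[Int])

def unavailableReplicas(pm: PartitionMetadata, returnedBrokerIds: Set[Int]): Seq[Int] =
  pm.replicas.filterNot(returnedBrokerIds.contains)
{code}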

Re: IncompatibleClassChangeError

2016-04-15 Thread Ismael Juma
Hi Michael,

We would need more information to be able to help. We don't serialize scala
objects and clients should be able to use different Scala versions than the
broker. Do you have a stacktrace of when the exception is thrown?

Ismael

On Fri, Apr 15, 2016 at 2:39 PM, Michael D. Coon 
wrote:

> We are seeing odd behavior that we need to understand. We are getting
> IncompatibleClassChangeErrors and I know that's related to a Scala version
> mismatch. What's not clear, however, is where or why the mismatch is
> occurring. We know up front that there were occasions where we ran apps
> that had Scala 2.10 dependencies with the 0.9 consumer. We also plugged in
> a metrics reporter at the broker with 2.10 scala dependencies. Both of
> these situations produced the class change errors and they both appear to
> be around the consumer offsets topic. Is there something under the hood
> that is serializing a scala object that would cause this issue? The only
> fix appears to be blowing away all data in the __consumer_offsets topic and
> starting over with 2.11 clients. Is this expected behavior? Seems like a
> weakness if so because we have no control, in some cases, what version of
> Scala a client might use.
>
>


IncompatibleClassChangeError

2016-04-15 Thread Michael D. Coon
We are seeing odd behavior that we need to understand. We are getting 
IncompatibleClassChangeErrors and I know that's related to a Scala version 
mismatch. What's not clear, however, is where or why the mismatch is occurring. 
We know up front that there were occasions where we ran apps that had Scala 
2.10 dependencies with the 0.9 consumer. We also plugged in a metrics reporter 
at the broker with 2.10 scala dependencies. Both of these situations produced 
the class change errors and they both appear to be around the consumer offsets 
topic. Is there something under the hood that is serializing a scala object 
that would cause this issue? The only fix appears to be blowing away all data 
in the __consumer_offsets topic and starting over with 2.11 clients. Is this 
expected behavior? Seems like a weakness if so because we have no control, in 
some cases, what version of Scala a client might use.



[jira] [Commented] (KAFKA-3558) Add compression_type parameter to benchmarks in benchmark_test.py

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242942#comment-15242942
 ] 

ASF GitHub Bot commented on KAFKA-3558:
---

GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1225

KAFKA-3558; Add compression_type parameter to benchmarks in 
benchmark_test.py

* Use a fixed `Random` seed in `EndToEndLatency.scala` for determinism
* Add `compression_type` to and remove `consumer_fetch_max_wait` from 
`end_to_end_latency.py`. The latter was never used.
* Tweak logging of `end_to_end_latency.py` to be similar to 
`consumer_performance.py`.
* Add `compression_type` to `benchmark_test.py` methods and add `snappy` to 
`matrix` annotation
* Use randomly generated bytes from a restricted range for 
`ProducerPerformance` payload. This is a simple fix for now. It can be improved 
in the PR for KAFKA-3554.
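
As a rough illustration of the first and last bullets above (a fixed seed for determinism, and a restricted byte range so compression has something to work on); this is not the actual EndToEndLatency/ProducerPerformance code:

{code}
// Illustrative only: deterministic, compressible payload generation.
import scala.util.Random

val random = new Random(0)  // fixed seed => identical payloads from run to run

def randomPayload(size: Int): Array[Byte] =
  Array.fill(size) {
    // restrict bytes to a small printable range so the compressed benchmarks are meaningful
    (random.nextInt('z' - 'a' + 1) + 'a').toByte
  }
{code}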

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3558-add-compression_type-benchmark_test.py

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1225.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1225


commit 861208470c511881063876e7834f9526795cafb2
Author: Ismael Juma 
Date:   2016-04-14T12:52:33Z

Minor clean-ups in `ConsumerPerformance`

commit da504f09d141adc9e0330e682d5d7a67f4992522
Author: Ismael Juma 
Date:   2016-04-14T12:53:49Z

Use a fixed `Random` seed in `EndToEndLatency.scala` and minor clean-up

commit 629cd7400e3b31b87119704052e685fead153eab
Author: Ismael Juma 
Date:   2016-04-14T13:06:48Z

Add `compression_type` to and remove `consumer_fetch_max_wait` from 
`end_to_end_latency.py`

Also tweak logging in `end_to_end_latency.py` to match what we do in 
`consumer_performance.py`

commit b3de8b0083ee5b81c9c50fcd326374500558f710
Author: Ismael Juma 
Date:   2016-04-14T13:08:51Z

Add `compression_type` to `benchmark_test.py`

commit 06d58913e54025a1cb2ca97dc6fe2b3ca91ea281
Author: Ismael Juma 
Date:   2016-04-14T13:31:52Z

Use randomly generated bytes from a restricted range for 
`ProducerPerformance` payload

Used the same range as `EndToEndLatency`




> Add compression_type parameter to benchmarks in benchmark_test.py
> -
>
> Key: KAFKA-3558
> URL: https://issues.apache.org/jira/browse/KAFKA-3558
> Project: Kafka
>  Issue Type: Improvement
>  Components: system tests
>Reporter: Ismael Juma
>Assignee: Ismael Juma
>
> Our nightly performance tests should include scenarios involving compression.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3558; Add compression_type parameter to ...

2016-04-15 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/1225

KAFKA-3558; Add compression_type parameter to benchmarks in 
benchmark_test.py

* Use a fixed `Random` seed in `EndToEndLatency.scala` for determinism
* Add `compression_type` to and remove `consumer_fetch_max_wait` from 
`end_to_end_latency.py`. The latter was never used.
* Tweak logging of `end_to_end_latency.py` to be similar to 
`consumer_performance.py`.
* Add `compression_type` to `benchmark_test.py` methods and add `snappy` to 
`matrix` annotation
* Use randomly generated bytes from a restricted range for 
`ProducerPerformance` payload. This is a simple fix for now. It can be improved 
in the PR for KAFKA-3554.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka 
kafka-3558-add-compression_type-benchmark_test.py

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1225.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1225


commit 861208470c511881063876e7834f9526795cafb2
Author: Ismael Juma 
Date:   2016-04-14T12:52:33Z

Minor clean-ups in `ConsumerPerformance`

commit da504f09d141adc9e0330e682d5d7a67f4992522
Author: Ismael Juma 
Date:   2016-04-14T12:53:49Z

Use a fixed `Random` seed in `EndToEndLatency.scala` and minor clean-up

commit 629cd7400e3b31b87119704052e685fead153eab
Author: Ismael Juma 
Date:   2016-04-14T13:06:48Z

Add `compression_type` to and remove `consumer_fetch_max_wait` from 
`end_to_end_latency.py`

Also tweak logging in `end_to_end_latency.py` to match what we do in 
`consumer_performance.py`

commit b3de8b0083ee5b81c9c50fcd326374500558f710
Author: Ismael Juma 
Date:   2016-04-14T13:08:51Z

Add `compression_type` to `benchmark_test.py`

commit 06d58913e54025a1cb2ca97dc6fe2b3ca91ea281
Author: Ismael Juma 
Date:   2016-04-14T13:31:52Z

Use randomly generated bytes from a restricted range for 
`ProducerPerformance` payload

Used the same range as `EndToEndLatency`




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3525) max.reserved.broker.id off-by-one error

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242882#comment-15242882
 ] 

ASF GitHub Bot commented on KAFKA-3525:
---

GitHub user omkreddy opened a pull request:

https://github.com/apache/kafka/pull/1224

KAFKA-3525; getSequenceId should return 1  for first path creation



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/omkreddy/kafka KAFKA-3525

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1224.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1224


commit 83143d07bc2bafdd4699b8144f55b9f44d528e0f
Author: Manikumar reddy O 
Date:   2016-04-15T12:19:53Z

KAFKA-3525; getSequenceId should return 1  for first path creation




> max.reserved.broker.id off-by-one error
> ---
>
> Key: KAFKA-3525
> URL: https://issues.apache.org/jira/browse/KAFKA-3525
> Project: Kafka
>  Issue Type: Bug
>  Components: config
>Reporter: Alan Braithwaite
>Assignee: Manikumar Reddy
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> There's an off-by-one error in the config check / id generation for 
> max.reserved.broker.id setting.  The auto-generation will generate 
> max.reserved.broker.id as the initial broker id as it's currently written.
> Not sure what the consequences of this are if there's already a broker with 
> that id as I didn't test that behavior.
> This can return 0 + max.reserved.broker.id:
> https://github.com/apache/kafka/blob/8dbd688b1617968329087317fa6bde8b8df0392e/core/src/main/scala/kafka/utils/ZkUtils.scala#L213-L215
> However, this does a <= check, which is inclusive of max.reserved.broker.id:
> https://github.com/apache/kafka/blob/8dbd688b1617968329087317fa6bde8b8df0392e/core/src/main/scala/kafka/server/KafkaConfig.scala#L984-L986
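
A tiny sketch of the off-by-one being described, with illustrative values (not the actual ZkUtils/KafkaConfig code): if the ZooKeeper sequence starts at 0, the first auto-generated id equals max.reserved.broker.id, which the inclusive <= check still treats as reserved; starting the sequence at 1, as the pull request above proposes, avoids the overlap.

{code}
// Illustrative values and names only.
val maxReservedBrokerId = 1000

def autoGeneratedBrokerId(sequence: Int): Int = maxReservedBrokerId + sequence

def isReservedId(brokerId: Int): Boolean = brokerId <= maxReservedBrokerId  // inclusive check

assert(isReservedId(autoGeneratedBrokerId(0)))   // sequence 0 -> id 1000: still inside the reserved range
assert(!isReservedId(autoGeneratedBrokerId(1)))  // sequence 1 -> id 1001: first id outside it
{code}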



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-3525) max.reserved.broker.id off-by-one error

2016-04-15 Thread Manikumar Reddy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manikumar Reddy updated KAFKA-3525:
---
Status: Patch Available  (was: Open)

> max.reserved.broker.id off-by-one error
> ---
>
> Key: KAFKA-3525
> URL: https://issues.apache.org/jira/browse/KAFKA-3525
> Project: Kafka
>  Issue Type: Bug
>  Components: config
>Reporter: Alan Braithwaite
>Assignee: Manikumar Reddy
>Priority: Blocker
> Fix For: 0.10.0.0
>
>
> There's an off-by-one error in the config check / id generation for 
> max.reserved.broker.id setting.  The auto-generation will generate 
> max.reserved.broker.id as the initial broker id as it's currently written.
> Not sure what the consequences of this are if there's already a broker with 
> that id as I didn't test that behavior.
> This can return 0 + max.reserved.broker.id:
> https://github.com/apache/kafka/blob/8dbd688b1617968329087317fa6bde8b8df0392e/core/src/main/scala/kafka/utils/ZkUtils.scala#L213-L215
> However, this does a <= check, which is inclusive of max.reserved.broker.id:
> https://github.com/apache/kafka/blob/8dbd688b1617968329087317fa6bde8b8df0392e/core/src/main/scala/kafka/server/KafkaConfig.scala#L984-L986



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3525; getSequenceId should return 1 for ...

2016-04-15 Thread omkreddy
GitHub user omkreddy opened a pull request:

https://github.com/apache/kafka/pull/1224

KAFKA-3525; getSequenceId should return 1  for first path creation



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/omkreddy/kafka KAFKA-3525

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1224.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1224


commit 83143d07bc2bafdd4699b8144f55b9f44d528e0f
Author: Manikumar reddy O 
Date:   2016-04-15T12:19:53Z

KAFKA-3525; getSequenceId should return 1  for first path creation




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [RELEASE] Anyone minds if we push the next RC another week away?

2016-04-15 Thread Ismael Juma
+1

On Fri, Apr 15, 2016 at 4:28 AM, Gwen Shapira  wrote:

> Hi Team,
>
> As we got more time, we merrily expanded the scope of all our
> in-progress KIPs :)
>
> I want to be able to at least get some of them in.
>
> Do you feel that pushing the next RC from Apr 22 (next week) to Apr 29
> (week after!) will be helpful?
> Any objections to this plan?
>
> Gwen
>


[jira] [Created] (KAFKA-3562) Null Pointer Exception Found when delete topic and Using New Producer

2016-04-15 Thread Pengwei (JIRA)
Pengwei created KAFKA-3562:
--

 Summary: Null Pointer Exception Found when delete topic and Using 
New Producer
 Key: KAFKA-3562
 URL: https://issues.apache.org/jira/browse/KAFKA-3562
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.9.0.1, 0.9.0.0
Reporter: Pengwei
Priority: Minor
 Fix For: 0.10.0.1


Exception in thread "Thread-2" java.lang.NullPointerException
at 
org.apache.kafka.clients.producer.internals.DefaultPartitioner.partition(DefaultPartitioner.java:70)
at 
org.apache.kafka.clients.producer.KafkaProducer.partition(KafkaProducer.java:687)
at org.apache.kafka.clients.producer.KafkaProducer.send(KafkaProducer.java:432)
at 
com.huawei.kafka.internal.remove.ProducerMsgThread.run(ProducerMsgThread.java:36)
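The trace suggests the partitioner dereferences the topic's partition metadata,
which can be missing while the topic is being deleted. A hedged sketch of a
defensive custom partitioner (hypothetical SafePartitioner, not the project's
fix) that fails with a clearer error instead:

{code}
import java.util.List;
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.PartitionInfo;

// Hypothetical defensive partitioner: guards against absent partition metadata
// (e.g. while the topic is being deleted) instead of hitting a NullPointerException.
public class SafePartitioner implements Partitioner {

    @Override
    public void configure(Map<String, ?> configs) { }

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        List<PartitionInfo> partitions = cluster.partitionsForTopic(topic);
        if (partitions == null || partitions.isEmpty()) {
            // Fail with a descriptive error rather than an NPE deep inside the producer.
            throw new IllegalStateException("No partition metadata available for topic " + topic);
        }
        int hash = (keyBytes != null) ? java.util.Arrays.hashCode(keyBytes) : 0;
        return (hash & Integer.MAX_VALUE) % partitions.size();   // simple hash placement
    }

    @Override
    public void close() { }
}
{code}

A partitioner like this would be registered via the producer's partitioner.class
setting; the flat hash is only for illustration and does not reproduce the
default partitioner's behaviour.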



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3559) Task creation time taking too long in rebalance callback

2016-04-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15242668#comment-15242668
 ] 

ASF GitHub Bot commented on KAFKA-3559:
---

GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/1223

KAFKA-3559: lazy initialisation of state stores

Instead of initialising state stores on init(), they are initialised on 
first access. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-3559-rebalance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1223


commit fa970d2126fc10bcba1748d2448ae3c3489e05e7
Author: Eno Thereska 
Date:   2016-04-15T08:52:34Z

Lazy initialization of state stores (on access, rather than all at once)

commit aebb365d04504cc0a5100715ccfd37a82b8b0298
Author: Eno Thereska 
Date:   2016-04-15T09:02:50Z

Check arguments




> Task creation time taking too long in rebalance callback
> 
>
> Key: KAFKA-3559
> URL: https://issues.apache.org/jira/browse/KAFKA-3559
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Eno Thereska
>  Labels: architecture
> Fix For: 0.10.0.0
>
>
> Currently in Kafka Streams, we create stream tasks upon receiving newly 
> assigned partitions in the rebalance callback {code}onPartitionsAssigned{code}, 
> which also initializes the processor state stores (including opening RocksDB, 
> restoring the stores from the changelog, etc.), which takes time.
> With a large number of state stores, the initialization alone can take tens 
> of seconds, which is usually longer than the consumer session timeout. As a 
> result, by the time the callback completes, the coordinator has already 
> treated the consumer as failed and triggers another rebalance.
> We need to consider whether we can optimize the initialization process, or 
> move it out of the callback, and, while initializing the stores one by one, 
> use poll calls to send heartbeats so the consumer is not kicked out by the 
> coordinator.
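A minimal sketch of the lazy-initialisation idea from the pull request, using a
hypothetical LazyStore wrapper rather than the actual Kafka Streams classes:

{code}
import java.util.function.Supplier;

// Defers the expensive open/restore work (e.g. opening RocksDB and replaying the
// changelog) until the store is first accessed, so the rebalance callback itself
// returns quickly and the consumer keeps heartbeating via poll().
public final class LazyStore<T> {

    private final Supplier<T> initializer; // the expensive initialisation step
    private T store;                       // null until first access

    public LazyStore(Supplier<T> initializer) {
        this.initializer = initializer;
    }

    public synchronized T get() {
        if (store == null) {
            store = initializer.get();     // cost paid here, not in onPartitionsAssigned()
        }
        return store;
    }
}
{code}

The actual pull request wires this idea into the state store initialisation
path; the wrapper above only illustrates the pattern.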



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request: KAFKA-3559: lazy initialisation of state store...

2016-04-15 Thread enothereska
GitHub user enothereska opened a pull request:

https://github.com/apache/kafka/pull/1223

KAFKA-3559: lazy initialisation of state stores

Instead of initialising state stores on init(), they are initialised on 
first access. 

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka KAFKA-3559-rebalance

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/1223.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1223


commit fa970d2126fc10bcba1748d2448ae3c3489e05e7
Author: Eno Thereska 
Date:   2016-04-15T08:52:34Z

Lazy initialization of state stores (on access, rather than all at once)

commit aebb365d04504cc0a5100715ccfd37a82b8b0298
Author: Eno Thereska 
Date:   2016-04-15T09:02:50Z

Check arguments




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Jenkins build is back to normal : kafka-trunk-jdk7 #1198

2016-04-15 Thread Apache Jenkins Server
See 



Build failed in Jenkins: kafka-trunk-jdk8 #528

2016-04-15 Thread Apache Jenkins Server
See 

Changes:

[me] KAFKA-3549: Close consumers instantiated in consumer tests

--
[...truncated 5423 lines...]

kafka.log.BrokerCompressionTest > testBrokerSideCompression[2] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[3] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[4] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[5] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[6] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[7] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[8] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[9] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[10] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[11] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[12] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[13] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[14] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[15] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[16] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[17] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[18] PASSED

kafka.log.BrokerCompressionTest > testBrokerSideCompression[19] PASSED

kafka.log.OffsetMapTest > testClear PASSED

kafka.log.OffsetMapTest > testBasicValidation PASSED

kafka.log.LogManagerTest > testCleanupSegmentsToMaintainSize PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithRelativeDirectory 
PASSED

kafka.log.LogManagerTest > testGetNonExistentLog PASSED

kafka.log.LogManagerTest > testTwoLogManagersUsingSameDirFails PASSED

kafka.log.LogManagerTest > testLeastLoadedAssignment PASSED

kafka.log.LogManagerTest > testCleanupExpiredSegments PASSED

kafka.log.LogManagerTest > testCheckpointRecoveryPoints PASSED

kafka.log.LogManagerTest > testTimeBasedFlush PASSED

kafka.log.LogManagerTest > testCreateLog PASSED

kafka.log.LogManagerTest > testRecoveryDirectoryMappingWithTrailingSlash PASSED

kafka.log.LogSegmentTest > testRecoveryWithCorruptMessage PASSED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptIndex PASSED

kafka.log.LogSegmentTest > testReadFromGap PASSED

kafka.log.LogSegmentTest > testTruncate PASSED

kafka.log.LogSegmentTest > testReadBeforeFirstOffset PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeAppendMessage PASSED

kafka.log.LogSegmentTest > testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest > testMaxOffset PASSED

kafka.log.LogSegmentTest > testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest > testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest > testReadAfterLast PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.log.FileMessageSetTest > testTruncate PASSED

kafka.log.FileMessageSetTest > testIterationOverPartialAndTruncation PASSED

kafka.log.FileMessageSetTest > testRead PASSED

kafka.log.FileMessageSetTest > testFileSize PASSED

kafka.log.FileMessageSetTest > testIteratorWithLimits PASSED

kafka.log.FileMessageSetTest > testPreallocateTrue PASSED

kafka.log.FileMessageSetTest > testIteratorIsConsistent PASSED

kafka.log.FileMessageSetTest > testFormatConversionWithPartialMessage PASSED

kafka.log.FileMessageSetTest > testIterationDoesntChangePosition PASSED

kafka.log.FileMessageSetTest > testWrittenEqualsRead PASSED

kafka.log.FileMessageSetTest > testWriteTo PASSED

kafka.log.FileMessageSetTest > testPreallocateFalse PASSED

kafka.log.FileMessageSetTest > testPreallocateClearShutdown PASSED

kafka.log.FileMessageSetTest > testMessageFormatConversion PASSED

kafka.log.FileMessageSetTest > testSearch PASSED

kafka.log.FileMessageSetTest > testSizeInBytes PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[1] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[2] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[3] PASSED

kafka.log.LogTest > testParseTopicPartitionNameForMissingTopic PASSED

kafka.log.LogTest > testIndexRebuild PASSED

kafka.log.LogTest > testLogRolls PASSED

kafka.log.LogTest > testMessageSizeCheck PASSED

kafka.log.LogTest > testAsyncDelete PASSED

kafka.log.LogTest > testReadOutOfRange PASSED

kafka.log.LogTest > testAppendWithOutOfOrderOffsetsThrowsException PASSED

kafka.log.LogTest > testReadAtLogGap PASSED

kafka.log.LogTest > testTimeBasedLogRoll PASSED

kafka.log.LogTest > testLoadEmptyLog PASSED

kafka.log.LogTest > testMessageSetSizeCheck PASSED

kafka.log.LogTest > testIndexResizingAtTruncation PASSED

kafka.log.LogTest > testCompactedTopicConstraints PASSED

kafka.log.LogTest > testThatGarbageCollectingSegmentsDoesntChangeOffset