Re: Doubts regarding kafka-reassign-partitions.sh

2018-12-17 Thread Abhimanyu Nagrath
I have not changed anything in producer and consumer. I am using the
default Kafka config.

Regards,
Abhimanyu

On Tue, Dec 18, 2018 at 12:38 PM Suman B N  wrote:

> Can you paste the producer and consumer configs as well?
>
> Thanks,
> Suman
>
> On Tue, Dec 18, 2018 at 11:49 AM Abhimanyu Nagrath <
> abhimanyunagr...@gmail.com> wrote:
>
> > So we're currently trying out Kafka 2.1.0 and are pretty much in a
> > proof-of-concept phase. We've only just started to look into it and are
> > trying to figure out if it's what we need.
> >
> > The current setup is as follows:
> >
> >- 3 Kafka brokers on different hosts: kafka1,kafka2,kafka3
> >- 3 ZooKeeper nodes: zkhost1,zkhost2,zkhost3
> >- One topic: "myTopic"
> >- The topic had 4 partitions
> >- The replication factor was 1
> >- We have one producer and three consumers, all in the same consumer
> >group "myGroup"
> >
> > Now I was trying to reassign the partitions with the
> > kafka-reassign-partitions.sh script. For this I created the following JSON
> > file:
> >
> > {"version":1,
> > "partitions":[
> > {"topic":"myTopic","partition":0,"replicas":[0]},
> > {"topic":"myTopic","partition":1,"replicas":[0]},
> > {"topic":"myTopic","partition":2,"replicas":[1]},
> > {"topic":"myTopic","partition":3,"replicas":[2]}
> > ]
> > }
> >
> > ...and then executed the script:
> >
> > kafka/bin/kafka-reassign-partitions.sh --zookeeper
> > zkhost1:2181,zkhost2:2181,zkhost3:2181 --reassignment-json-file
> > increase-replication-factor.json --execute
> >
> > This ran smoothly and after that I got my expected replication:
> >
> > Topic:myTopic   PartitionCount:4    ReplicationFactor:1     Configs:
> > Topic: myTopic  Partition: 0    Leader: 0   Replicas: 0 Isr: 0
> > Topic: myTopic  Partition: 1    Leader: 0   Replicas: 0 Isr: 0
> > Topic: myTopic  Partition: 2    Leader: 0   Replicas: 1 Isr: 1
> > Topic: myTopic  Partition: 3    Leader: 0   Replicas: 2 Isr: 2
> >
> > What I don't understand is what happened to the partitions during that
> > reassignment. When I looked at the ConsumerOffsetChecker, this is what I
> > saw *before* the reassignment:
> >
> > kafka/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> > myGroup --zookeeper zkhost1:2181 --topic myTopic
> >
> > Group    Topic    Pid  Offset  logSize  Lag  Owner
> > myGroup  myTopic  0    925230  925230   0    none
> > myGroup  myTopic  1    925230  925230   0    none
> > myGroup  myTopic  2    925230  925230   0    none
> > myGroup  myTopic  3    925230  925230   0    none
> >
> > ...and this is what I saw *after* the reassignment:
> >
> > Group    Topic    Pid  Offset  logSize  Lag  Owner
> > myGroup  myTopic  0    23251   23252    1    none
> > myGroup  myTopic  1    41281   41281    0    none
> > myGroup  myTopic  2    23260   23260    0    none
> > myGroup  myTopic  3    41270   41270    0    none
> >
> > For me this raised a few questions:
> >
> >    - Why is the logSize now heavily reduced? Does the reassignment trigger
> >    some cleanup? (We have not set a byte limit.)
> >    - Why did all 4 partitions have roughly the same size before the
> >    reassignment, whereas after the reassignment there's this big difference
> >    between partitions 0,2 and 1,3? Shouldn't all partitions of one topic have
> >    the same logSize, or am I misunderstanding the concept here?
> >    - Can something like this (i.e. reassigning partitions) lead to data
> >    loss? (I couldn't see any on our consumers in this case.) And if so, is
> >    there a way to do this without that risk?
> >    - At the time of executing kafka-reassign-partitions.sh for
> >    a topic, should we stop the producers and consumers?
> >    - How can we tell, after running the kafka-reassign-partitions.sh
> >    command, that the redistribution is complete?
> >
>
>
> --
> *Suman*
> *OlaCabs*
>


Re: Doubts regarding kafka-reassign-partitions.sh

2018-12-17 Thread Suman B N
Can you paste the producer and consumer configs as well?

Thanks,
Suman

On Tue, Dec 18, 2018 at 11:49 AM Abhimanyu Nagrath <
abhimanyunagr...@gmail.com> wrote:

> So we're currently trying out Kafka 2.1.0 and are pretty much in a
> proof-of-concept phase. We've only just started to look into it and are
> trying to figure out if it's what we need.
>
> The current setup is as follows:
>
>- 3 Kafka brokers on different hosts: kafka1,kafka2,kafka3
>- 3 ZooKeeper nodes: zkhost1,zkhost2,zkhost3
>- One topic: "myTopic"
>- The topic had 4 partitions
>- The replication factor was 1
>- We have one producer and three consumers, all in the same consumer
>group "myGroup"
>
> Now I was trying to reassign the partitions with the
> kafka-reassign-partitions.sh script. For this I created the following JSON
> file:
>
> {"version":1,
> "partitions":[
> {"topic":"myTopic","partition":0,"replicas":[0]},
> {"topic":"myTopic","partition":1,"replicas":[0]},
> {"topic":"myTopic","partition":2,"replicas":[1]},
> {"topic":"myTopic","partition":3,"replicas":[2]}
> ]
> }
>
> ...and then executed the script:
>
> kafka/bin/kafka-reassign-partitions.sh --zookeeper
> zkhost1:2181,zkhost2:2181,zkhost3:2181 --reassignment-json-file
> increase-replication-factor.json --execute
>
> This ran smoothly and after that I got my expected replication:
>
> Topic:myTopic   PartitionCount:4    ReplicationFactor:1     Configs:
> Topic: myTopic  Partition: 0    Leader: 0   Replicas: 0 Isr: 0
> Topic: myTopic  Partition: 1    Leader: 0   Replicas: 0 Isr: 0
> Topic: myTopic  Partition: 2    Leader: 0   Replicas: 1 Isr: 1
> Topic: myTopic  Partition: 3    Leader: 0   Replicas: 2 Isr: 2
>
> What I don't understand is what happened to the partitions during that
> reassignment. When I looked at the ConsumerOffsetChecker, this is what I
> saw *before* the reassignment:
>
> kafka/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
> myGroup --zookeeper zkhost1:2181 --topic myTopic
>
> Group    Topic    Pid  Offset  logSize  Lag  Owner
> myGroup  myTopic  0    925230  925230   0    none
> myGroup  myTopic  1    925230  925230   0    none
> myGroup  myTopic  2    925230  925230   0    none
> myGroup  myTopic  3    925230  925230   0    none
>
> ...and this is what I saw *after* the reassignment:
>
> Group    Topic    Pid  Offset  logSize  Lag  Owner
> myGroup  myTopic  0    23251   23252    1    none
> myGroup  myTopic  1    41281   41281    0    none
> myGroup  myTopic  2    23260   23260    0    none
> myGroup  myTopic  3    41270   41270    0    none
>
> For me this raised a few questions:
>
>    - Why is the logSize now heavily reduced? Does the reassignment trigger
>    some cleanup? (We have not set a byte limit.)
>    - Why did all 4 partitions have roughly the same size before the
>    reassignment, whereas after the reassignment there's this big difference
>    between partitions 0,2 and 1,3? Shouldn't all partitions of one topic have
>    the same logSize, or am I misunderstanding the concept here?
>    - Can something like this (i.e. reassigning partitions) lead to data
>    loss? (I couldn't see any on our consumers in this case.) And if so, is
>    there a way to do this without that risk?
>    - At the time of executing kafka-reassign-partitions.sh for
>    a topic, should we stop the producers and consumers?
>    - How can we tell, after running the kafka-reassign-partitions.sh
>    command, that the redistribution is complete?
>


-- 
*Suman*
*OlaCabs*


Jenkins build is back to normal : kafka-trunk-jdk8 #3266

2018-12-17 Thread Apache Jenkins Server
See 




Doubts regarding kafka-reassign-partitions.sh

2018-12-17 Thread Abhimanyu Nagrath
So we're currently trying out Kafka 2.1.0 and are pretty much in a
proof-of-concept phase. We've only just started to look into it and are
trying to figure out if it's what we need.

The current setup is as follows:

   - 3 Kafka brokers on different hosts: kafka1,kafka2,kafka3
   - 3 ZooKeeper nodes: zkhost1,zkhost2,zkhost3
   - One topic: "myTopic"
   - The topic had 4 partitions
   - The replication factor was 1
   - We have one producer and three consumers, all in the same consumer
   group "myGroup"

Now I was trying to reassign the partitions with the
kafka-reassign-partitions.sh script. For this I created the following JSON
file:

{"version":1,
"partitions":[
{"topic":"myTopic","partition":0,"replicas":[0]},
{"topic":"myTopic","partition":1,"replicas":[0]},
{"topic":"myTopic","partition":2,"replicas":[1]},
{"topic":"myTopic","partition":3,"replicas":[2]}
]
}

...and then executed the script:

kafka/bin/kafka-reassign-partitions.sh --zookeeper
zkhost1:2181,zkhost2:2181,zkhost3:2181 --reassignment-json-file
increase-replication-factor.json --execute

This ran smoothly and after that I got my expected replication:

Topic:myTopic   PartitionCount:4    ReplicationFactor:1     Configs:
Topic: myTopic  Partition: 0    Leader: 0   Replicas: 0 Isr: 0
Topic: myTopic  Partition: 1    Leader: 0   Replicas: 0 Isr: 0
Topic: myTopic  Partition: 2    Leader: 0   Replicas: 1 Isr: 1
Topic: myTopic  Partition: 3    Leader: 0   Replicas: 2 Isr: 2

What I don't understand is what happened to the partitions during that
reassignment. When I looked at the ConsumerOffsetChecker, this is what I
saw *before* the reassignment:

kafka/bin/kafka-run-class.sh kafka.tools.ConsumerOffsetChecker --group
myGroup --zookeeper zkhost1:2181 --topic myTopic

Group    Topic    Pid  Offset  logSize  Lag  Owner
myGroup  myTopic  0    925230  925230   0    none
myGroup  myTopic  1    925230  925230   0    none
myGroup  myTopic  2    925230  925230   0    none
myGroup  myTopic  3    925230  925230   0    none

...and this is what I saw *after* the reassignment:

Group    Topic    Pid  Offset  logSize  Lag  Owner
myGroup  myTopic  0    23251   23252    1    none
myGroup  myTopic  1    41281   41281    0    none
myGroup  myTopic  2    23260   23260    0    none
myGroup  myTopic  3    41270   41270    0    none

For me this raised a few questions:

   - Why is the logSize now heavily reduced? Does the reassignment trigger
   some cleanup? (We have not set a byte limit.)
   - Why did all 4 partitions have roughly the same size before the
   reassignment, whereas after the reassignment there's this big difference
   between partitions 0,2 and 1,3? Shouldn't all partitions of one topic have
   the same logSize, or am I misunderstanding the concept here?
   - Can something like this (i.e. reassigning partitions) lead to data
   loss? (I couldn't see any on our consumers in this case.) And if so, is
   there a way to do this without that risk?
   - At the time of executing kafka-reassign-partitions.sh for
   a topic, should we stop the producers and consumers?
   - How can we tell, after running the kafka-reassign-partitions.sh
   command, that the redistribution is complete? (A sketch of the --verify
   mode is included below.)
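
A minimal sketch of checking the status, assuming the same JSON file and
ZooKeeper connect string as above (the tool's --verify mode reports, per
partition, whether the reassignment is still in progress or has completed):

# hedged sketch: --verify checks the reassignment described in the same JSON file
kafka/bin/kafka-reassign-partitions.sh --zookeeper zkhost1:2181,zkhost2:2181,zkhost3:2181 --reassignment-json-file increase-replication-factor.json --verify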


Re: [DISCUSS] KIP-382: MirrorMaker 2.0

2018-12-17 Thread Ryanne Dolan
> So, if we want to add it, it seems it would be useful to do it in a
backward compatible way in the connect framework, rather than sth specific
to MM

Jun, that sgtm. The MirrorMaker driver I have right now creates multiple
Herders (for multiple Kafka clusters) and exposes them through a high-level
API. I think if we narrow the scope of this API a bit, we might have a
better chance of eventually ending up with a single API for both MM and
Connect.

Specifically, I think MM2 can be stripped down to include only the
endpoints in the existing Connect API, except grouped by Herder:

GET //connectors
PUT //connectors//config
etc

(i.e. pretty much exactly what Sönke proposed earlier for Connect in general)
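
For concreteness, a rough sketch of how the Herder-grouped endpoints might
look, assuming a per-cluster path segment (the placeholder names below are
illustrative only, not wording taken from the proposal):

GET /{cluster}/connectors
GET /{cluster}/connectors/{name}/status
PUT /{cluster}/connectors/{name}/config
DELETE /{cluster}/connectors/{name}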

This would accomplish what we need for MM2 without diverging much from the
existing Connect API. Moreover, there would be a clear path to combining
both APIs as Sönke suggested.

What do you think?

Ryanne


On Mon, Dec 17, 2018 at 3:40 PM Jun Rao  wrote:

> Hi, Sonke, Ryanne,
>
> Thanks for the explanation. To me, the single connect cluster model could
> be useful for any connector, not just MM. So, if we want to add it, it
> seems it would be useful to do it in a backward compatible way in the
> connect framework, rather than sth specific to MM. I am not sure what the
> best approach is. For example, one other option is KIP-296. If we feel
> that's
> adding too much work in this KIP, it might be ok to leave this part out in
> this KIP.
>
> Jun
>
> On Fri, Dec 14, 2018 at 1:25 PM Ryanne Dolan 
> wrote:
>
> > Thanks Sönke, you're spot-on. I don't want MM2 to wait for Connect
> features
> > that don't exist yet, especially if MM2 is the primary use case for them.
> > Moreover, I think MM2 can drive and inform some of these features, which
> > only makes sense if we adopt MM2 first.
> >
> > Ryanne
> >
> > On Fri, Dec 14, 2018, 9:03 AM Sönke Liebau
> >  >
> > > Hi Jun,
> > >
> > > I believe Ryanne's idea is to run multiple workers per MM cluster-node,
> > one
> > > per target cluster. So in essence you'd specify three clusters in the
> MM
> > > config and MM would then instantiate one worker per cluster. Every MM
> > > connector would then be deployed to the appropriate (internal) worker
> > that
> > > is configured for the cluster in question. Thus there would be no
> changes
> > > necessary to the Connect framework itself, everything would be handled
> > by a
> > > new layer around existing Connect code (probably a sibling
> implementation
> > > to the DistributedHerder if I understood him correctly). Ryanne, please
> > > correct/expand if I misunderstood your intentions.
> > >
> > > To briefly summarize the discussion that Ryanne and I had around this
> > > earlier, my opinion was that the extra layer could potentially be
> avoided
> > > by extending Connect instead, which would benefit all connectors.
> > >
> > > My proposal was to add a configuration option to the worker config that
> > > allows defining "external clusters" which can then be referenced from
> the
> > > connector config.
> > >
> > > For example:
> > >
> > > # Core cluster config stays the same and is used for status, config and
> > > offsets as usual
> > > bootstrap.servers=localkafka1:9092,localkafka2:9092
> > >
> > > # Allow defining extra remote clusters
> > >
> > >
> >
> externalcluster.kafka_europe.bootstrap.servers=europekafka1:9092,europekafka2:9092
> > > externalcluster.kafka_europe.security.protocol=SSL
> > >
> > >
> >
> externalcluster.kafka_europe.ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
> > > ...
> > >
> > >
> >
> externalcluster.kafka_asia.bootstrap.servers=asiakafka1:9092,asiakafka2:9092
> > >
> > >
> > > When starting a connector you could now reference these pre-configured
> > > clusters in the config:
> > > {
> > >   "name": "file-source",
> > >   "config": {
> > > "connector.class": "FileStreamSource",
> > > "file": "/tmp/test.txt",
> > > "topic": "connect-test",
> > > "name": "file-source",
> > > "cluster": "kafka_asia"
> > >   }
> > > }
> > >
> > > When omitting the "cluster" parameter current behavior of Connect
> remains
> > > unchanged. This way we could address multiple remote clusters from
> > within a
> > > single worker without adding the extra layer for MirrorMaker. I believe
> > > that this could be done without major structural changes to the Connect
> > > codebase, but I freely admit that this opinion is based on 10 minutes
> > > poking through the code not any real expertise.
> > >
> > > Ryanne's main concern with this approach was that there are additional
> > > worker settings that apply to all connectors and that no truly universal
> > > approach would be feasible while running a single worker per Connect
> > node.
> > > Also he feels that from a development perspective it would be
> preferable
> > to
> > > have independent MM code and contribute applicable features back to
> > > Connect.
> > > While I agree that this would make development of MM easier it 

[jira] [Created] (KAFKA-7750) Hjson support in kafka connect

2018-12-17 Thread Manjeet Duhan (JIRA)
Manjeet Duhan created KAFKA-7750:


 Summary: Hjson support in kafka connect
 Key: KAFKA-7750
 URL: https://issues.apache.org/jira/browse/KAFKA-7750
 Project: Kafka
  Issue Type: Improvement
Reporter: Manjeet Duhan
 Attachments: image-2018-12-18-10-07-22-944.png

I agree that JSON is the most accepted format for applications to
communicate, but JSON is programmer-friendly; we needed something
user-friendly where we can pass comments as part of the connector configuration.

Features of Hjson:
 # We are allowed to use comments.
 # We are allowed to pass JSON as part of a connector configuration key without
escaping it, which is very user friendly. (We have a modified version of
kafka-connect-elasticsearch where the user can pass the index mapping as part of
the connector properties.)

Please find attached the connector configuration in JSON and Hjson. We are already
running this in production. I have introduced an Hjson filter on the POST and PUT APIs
of Kafka Connect.
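
Since the attached screenshot does not come through here, a rough Hjson-style
sketch of the idea (the "mapping" property and all values below are
hypothetical, not the reporter's actual configuration):

{
  # comments are allowed in Hjson
  name: es-sink
  connector.class: io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
  topics: my-topic
  # nested JSON (e.g. an index mapping) can be pasted without escaping
  mapping:
    '''
    {"properties": {"id": {"type": "keyword"}}}
    '''
}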

!image-2018-12-18-10-07-22-944.png!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


RE: what happens when a vote ends with no votes?

2018-12-17 Thread Pellerin, Clement
Then let's nag. I got one vote for KIP-383.
I need more votes folks.

-Original Message-
From: John Roesler [mailto:j...@confluent.io] 
Sent: Friday, December 14, 2018 4:49 PM
To: dev@kafka.apache.org
Subject: Re: what happens when a vote ends with no votes?

I guess it would be different if you had one vote instead of none. The norm
seems to be to just continue nagging people in the [VOTE] thread until you
accrue the needed votes.



Build failed in Jenkins: kafka-2.1-jdk8 #86

2018-12-17 Thread Apache Jenkins Server
See 


Changes:

[jason] MINOR: Include additional detail in fetch error message (#6036)

--
[...truncated 910.27 KB...]

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnWildcardResource 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnWildcardResource 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesExtendedAclChangeEventWhenInterBrokerProtocolAtLeastKafkaV2 STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesExtendedAclChangeEventWhenInterBrokerProtocolAtLeastKafkaV2 PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesLiteralAclChangeEventWhenInterBrokerProtocolIsKafkaV2 STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesLiteralAclChangeEventWhenInterBrokerProtocolIsKafkaV2 PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDeleteAclOnPrefixedResource 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testDeleteAclOnPrefixedResource 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testSingleCharacterResourceAcls 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testSingleCharacterResourceAcls 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testEmptyAclThrowsException 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testEmptyAclThrowsException PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testSuperUserWithCustomPrincipalHasAccess STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testSuperUserWithCustomPrincipalHasAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAllowAccessWithCustomPrincipal STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAllowAccessWithCustomPrincipal PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDeleteAclOnWildcardResource 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testDeleteAclOnWildcardResource 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testChangeListenerTiming STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testChangeListenerTiming PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesLiteralWritesLiteralAclChangeEventWhenInterBrokerProtocolLessThanKafkaV2eralAclChangesForOlderProtocolVersions
 STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testWritesLiteralWritesLiteralAclChangeEventWhenInterBrokerProtocolLessThanKafkaV2eralAclChangesForOlderProtocolVersions
 PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testThrowsOnAddPrefixedAclIfInterBrokerProtocolVersionTooLow STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testThrowsOnAddPrefixedAclIfInterBrokerProtocolVersionTooLow PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAccessAllowedIfAllowAclExistsOnPrefixedResource STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAccessAllowedIfAllowAclExistsOnPrefixedResource PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAuthorizeWithEmptyResourceName STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAuthorizeWithEmptyResourceName PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDeleteAllAclOnPrefixedResource STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDeleteAllAclOnPrefixedResource PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnLiteralResource 
STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnLiteralResource 
PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAuthorizeThrowsOnNoneLiteralResource STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testAuthorizeThrowsOnNoneLiteralResource PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testGetAclsPrincipal STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testGetAclsPrincipal PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAddAclsOnPrefiexedResource 
STARTED


Build failed in Jenkins: kafka-trunk-jdk8 #3265

2018-12-17 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: Include additional detail in fetch error message (#6036)

--
[...truncated 4.49 MB...]
org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueIsEqualWithNullForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueTimestampWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullReversForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullExpectedRecordForCompareKeyValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValueTimestamp 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldNotAllowNullProducerRecordWithExpectedRecordForCompareKeyValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullForCompareKeyValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfKeyAndValueAndTimestampIsEqualForCompareKeyValueTimestampWithProducerRecord
 PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueIsEqualWithNullForCompareValue PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldPassIfValueAndTimestampIsEqualWithNullForCompareValueTimestamp PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueTimestampWithProducerRecord 
PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfValueIsDifferentWithNullForCompareValueWithProducerRecord PASSED

org.apache.kafka.streams.test.OutputVerifierTest > 
shouldFailIfKeyIsDifferentWithNullReversForCompareKeyValueTimestampWithProducerRecord
 STARTED

org.apache.kafka.streams.test.OutputVerifierTest > 

[jira] [Created] (KAFKA-7749) confluent does not provide option to set consumer properties at connector level

2018-12-17 Thread Manjeet Duhan (JIRA)
Manjeet Duhan created KAFKA-7749:


 Summary: confluent does not provide option to set consumer 
properties at connector level
 Key: KAFKA-7749
 URL: https://issues.apache.org/jira/browse/KAFKA-7749
 Project: Kafka
  Issue Type: Improvement
  Components: KafkaConnect
Reporter: Manjeet Duhan


_We want to increase consumer.max.poll.records to improve performance, but this
value can only be set in the worker properties, which apply to all
connectors in a given cluster._


_Operative situation: we have one project which is communicating with
Elasticsearch, and we set consumer.max.poll.records=500 after multiple
performance tests; this worked fine for a year._

 _Then one more project was onboarded in the same cluster, which required
consumer.max.poll.records=5000 based on their performance tests. This
configuration was moved to production._

  _Admetric started failing, as it was taking more than 5 minutes to process
5000 polled records and started throwing a CommitFailedException, which is a
vicious cycle as it will process the same data over and over again._


_We can control the above when starting a consumer with plain Java, but this control
was not available at the per-consumer level in the Confluent connectors._

_I have overridden the Kafka code to accept connector-level properties which are
applied to a single connector, while the others keep using the default properties.
These changes have already been running in production for more than 5 months._
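
A sketch of what such a per-connector override could look like in a connector
properties file (this pass-through is the reporter's custom modification, not
stock Kafka Connect; the names and values are illustrative):

# hypothetical per-connector consumer overrides (custom build, not stock Connect)
name=es-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
topics=my-topic
consumer.max.poll.records=500
consumer.max.poll.interval.ms=300000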

_Some of the properties which were useful for us:_

max.poll.records
max.poll.interval.ms
request.timeout.ms
key.deserializer
value.deserializer
heartbeat.interval.ms
session.timeout.ms
auto.offset.reset
connections.max.idle.ms
enable.auto.commit
auto.commit.interval.ms



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-228 Negative record timestamp support

2018-12-17 Thread Gwen Shapira
Guozhang,

Can you speak to the following note under "impact on existing users":
"No impact on current users, they should update their infrastructure
in that order: Broker, Consumers, Producers."

I think this isn't true, but if it is - we need to fix the proposal
and make sure we allow upgrade at any order (as is usual for Kafka
since version 0.10.0.0)

On Sun, Dec 9, 2018 at 9:21 PM Guozhang Wang  wrote:
>
> Hi folks,
>
> Thanks for your replies! Just to clarify, the proposal itself does not 
> introduce any additional fields in the message format (some new attributes 
> are mentioned in the Rejected Alternatives though), so my understanding is that 
> we do not need to increment the magic byte version of the message itself.
>
> Also, ConsumerRecord.NO_TIMESTAMP (-1) is still used as "no timestamp", and 
> as we've discussed before it is a good trade-off to say "we do not have a way 
> to express Wednesday, December 31, 1969 11:59:59.999". I.e. we are extending the 
> timestamp to have other negative values than -1, but we did not change the 
> semantics of value -1 itself. So I think although we do bump up the request 
> protocol version for semantics changes, it is not necessary for this case 
> (please correct me if there are cases that would not work for it).
>
> I do agree with Magnus's point that for ListOffsetsRequest, we should 
> consider using different values than -1 / -2 to indicate `EARLEST / LATEST 
> TIMESTAMP` now since for example timestamp -2 does have a meaningful 
> semantics now, and hence its protocol version would need bump as we change 
> its field. And I think a single byte indicating the type as (EARLEST, LATEST, 
> ACTUAL_TIMESTAMP_VALUE) should be sufficient, but I'll leave it to Konstantin 
> to decide if he wants to do this KIP or do it in another follow-up KIP.
>
>
> Guozhang
>
>
> On Fri, Dec 7, 2018 at 5:06 PM Jun Rao  wrote:
>>
>> Hi, Konstantin,
>>
>> Thanks for the KIP. I agree with Magnus on the protocol version changes. As
>> for the sentinel value, currently ConsumerRecord.NO_TIMESTAMP (-1) is used
>> for V0 message format. For compatibility, it seems that we still need to
>> preserve that.
>>
>> Jun
>>
>> On Thu, Dec 6, 2018 at 2:32 AM Magnus Edenhill  wrote:
>>
>> > Sorry for getting in the game this late, and on the wrong thread!
>> >
>> > I think negative timestamps makes sense and is a good addition,
>> > but I have a couple of concerns with the proposal:
>> >
>> >  1. I believe any change to the protocol format or semantics require a
>> > protocol bump, in this case for ProduceRequest, FetchRequest,
>> > ListOffsetsRequest.
>> >  2. ListOffsetsRequest should be changed to allow logical (END, BEGINNING)
>> > and absolute lookups without special treatment of two absolute values as
>> > logical (-1, -2), this seems like a hack and will require application logic
>> > to avoid these timestamps, that's leaky abstraction.
>> >  Perhaps add a new field `int8 LookupType = { BEGINNING=-2, END=-1,
>> > TIMESTAMP=0 }`: the broker will either look up using the absolute
>> > Timestamp, or logical offset value, depending on the value of LookupType.
>> >  3. With the added Attribute for extended timestamp, do we really need to
>> > have a sentinel value for an unset timestamp (-1 or Long.MIN_VALUE)?
>> >  To make the logic simpler I suggest the attribute is renamed to just
>> > Timestamp, and if the Timestamp attribute is set, the Timestamp field is
>> > always a proper timestamp. If the bit is not set, no timestamp was
>> > provided.
>> >
>> >  /Magnus
>> >
>> >
>> >
>> > Den tors 6 dec. 2018 kl 08:06 skrev Gwen Shapira :
>> >
>> > > I may be missing something, but why are we using an attribute for
>> > > this? IIRC, we normally bump protocol version to indicate semantic
>> > > changes. If I understand correctly, not using an attribute will allow
>> > > us to not change the message format (just the protocol), which makes
>> > > the upgrade significantly easier (since we don't up/down convert).
>> > >
>> > > Another thing I don't understand: The compatibility map indicates that
>> > > NO_TIMESTAMP is now Long.MIN_VALUE, but a bit above that you say that
>> > > -1 semantics does not change.
>> > >
>> > > Last: At around version 1.0 we decide to completely avoid changes that
>> > > require a particular order of upgrades (this is why we added the
>> > > versions API, up/down conversion, etc). So I'd like to see if we can
>> > > avoid this here (should be relatively easy?). I'm CCing Magnus,
>> > > because I think he remembers *why* we made this decision in first
>> > > place.
>> > >
>> > > Gwen
>> > > On Thu, Dec 6, 2018 at 4:44 PM Guozhang Wang  wrote:
>> > > >
>> > > > Bump up on this thread again: we have two binding votes already and
>> > need
>> > > > another committer to take a look at it and vote.
>> > > >
>> > > >
>> > > > Guozhang
>> > > >
>> > > > On Fri, Oct 19, 2018 at 11:34 AM Konstantin Chukhlomin <
>> > > chuhlo...@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > 

Re: [VOTE] KIP-228 Negative record timestamp support

2018-12-17 Thread Jun Rao
Hi. Guozhang,

For fetch/produce requests, even though there is no protocol change, it is
a semantic change if the timestamp can be negative. Some client
implementations may depend on that. So, it's probably better to bump up the
version of those requests.

Thanks,

Jun

On Sun, Dec 9, 2018 at 9:21 PM Guozhang Wang  wrote:

> Hi folks,
>
> Thanks for your replies! Just to clarify, the proposal itself does not
> introduce any additional fields in the message format (some new attributes
> are mentioned in the Rejected Alternatives though), so my understanding is
> that we do not need to increment the magic byte version of the message
> itself.
>
> Also, ConsumerRecord.NO_TIMESTAMP (-1) is still used as "no timestamp", and
> as we've discussed before it is a good trade-off to say "we do not have a way
> to express Wednesday, December 31, 1969 11:59:59.999". I.e. we are extending
> the timestamp to have other negative values than -1, but we did not change
> the semantics of value -1 itself. So I think although we do bump up the
> request protocol version for semantics changes, it is not necessary for
> this case (please correct me if there are cases that would not work for
> it).
>
> I do agree with Magnus's point that for ListOffsetsRequest, we should
> consider using different values than -1 / -2 to indicate `EARLEST / LATEST
> TIMESTAMP` now since for example timestamp -2 does have a meaningful
> semantics now, and hence its protocol version would need bump as we change
> its field. And I think a single byte indicating the type as (EARLEST,
> LATEST, ACTUAL_TIMESTAMP_VALUE) should be sufficient, but I'll leave it to
> Konstantin to decide if he wants to do this KIP or do it in another
> follow-up KIP.
>
>
> Guozhang
>
>
> On Fri, Dec 7, 2018 at 5:06 PM Jun Rao  wrote:
>
> > Hi, Konstantin,
> >
> > Thanks for the KIP. I agree with Magnus on the protocol version changes.
> As
> > for the sentinel value, currently ConsumerRecord.NO_TIMESTAMP (-1) is
> used
> > for V0 message format. For compatibility, it seems that we still need to
> > preserve that.
> >
> > Jun
> >
> > On Thu, Dec 6, 2018 at 2:32 AM Magnus Edenhill 
> wrote:
> >
> > > Sorry for getting in the game this late, and on the wrong thread!
> > >
> > > I think negative timestamps makes sense and is a good addition,
> > > but I have a couple of concerns with the proposal:
> > >
> > >  1. I believe any change to the protocol format or semantics require a
> > > protocol bump, in this case for ProduceRequest, FetchRequest,
> > > ListOffsetsRequest.
> > >  2. ListOffsetsRequest should be changed to allow logical (END,
> > BEGINNING)
> > > and absolute lookups without special treatment of two absolute values
> as
> > > logical (-1, -2), this seems like a hack and will require application
> > logic
> > > to avoid these timestamps, that's leaky abstraction.
> > >  Perhaps add a new field `int8 LookupType = { BEGINNING=-2, END=-1,
> > > TIMESTAMP=0 }`: the broker will either look up using the absolute
> > > Timestamp, or logical offset value, depending on the value of
> LookupType.
> > >  3. With the added Attribute for extended timestamp, do we really need
> to
> > > have a sentinel value for an unset timestamp (-1 or Long.MIN_VALUE)?
> > >  To make the logic simpler I suggest the attribute is renamed to
> just
> > > Timestamp, and if the Timestamp attribute is set, the Timestamp field
> is
> > > always a proper timestamp. If the bit is not set, no timestamp was
> > > provided.
> > >
> > >  /Magnus
> > >
> > >
> > >
> > > Den tors 6 dec. 2018 kl 08:06 skrev Gwen Shapira :
> > >
> > > > I may be missing something, but why are we using an attribute for
> > > > this? IIRC, we normally bump protocol version to indicate semantic
> > > > changes. If I understand correctly, not using an attribute will allow
> > > > us to not change the message format (just the protocol), which makes
> > > > the upgrade significantly easier (since we don't up/down convert).
> > > >
> > > > Another thing I don't understand: The compatibility map indicates
> that
> > > > NO_TIMESTAMP is now Long.MIN_VALUE, but a bit above that you say that
> > > > -1 semantics does not change.
> > > >
> > > > Last: At around version 1.0 we decide to completely avoid changes
> that
> > > > require a particular order of upgrades (this is why we added the
> > > > versions API, up/down conversion, etc). So I'd like to see if we can
> > > > avoid this here (should be relatively easy?). I'm CCing Magnus,
> > > > because I think he remembers *why* we made this decision in first
> > > > place.
> > > >
> > > > Gwen
> > > > On Thu, Dec 6, 2018 at 4:44 PM Guozhang Wang 
> > wrote:
> > > > >
> > > > > Bump up on this thread again: we have two binding votes already and
> > > need
> > > > > another committer to take a look at it and vote.
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Fri, Oct 19, 2018 at 11:34 AM Konstantin Chukhlomin <
> > > > chuhlo...@gmail.com>
> > > > > wrote:
> > > > >
> > > 

Build failed in Jenkins: kafka-2.0-jdk8 #203

2018-12-17 Thread Apache Jenkins Server
See 


Changes:

[jason] KAFKA-7616; Make MockConsumer only add entries to the partition map

--
[...truncated 888.51 KB...]
kafka.zk.ReassignPartitionsZNodeTest > testDecodeValidJson STARTED

kafka.zk.ReassignPartitionsZNodeTest > testDecodeValidJson PASSED

kafka.zk.KafkaZkClientTest > testZNodeChangeHandlerForDataChange STARTED

kafka.zk.KafkaZkClientTest > testZNodeChangeHandlerForDataChange PASSED

kafka.zk.KafkaZkClientTest > testCreateAndGetTopicPartitionStatesRaw STARTED

kafka.zk.KafkaZkClientTest > testCreateAndGetTopicPartitionStatesRaw PASSED

kafka.zk.KafkaZkClientTest > testLogDirGetters STARTED

kafka.zk.KafkaZkClientTest > testLogDirGetters PASSED

kafka.zk.KafkaZkClientTest > testSetGetAndDeletePartitionReassignment STARTED

kafka.zk.KafkaZkClientTest > testSetGetAndDeletePartitionReassignment PASSED

kafka.zk.KafkaZkClientTest > testIsrChangeNotificationsDeletion STARTED

kafka.zk.KafkaZkClientTest > testIsrChangeNotificationsDeletion PASSED

kafka.zk.KafkaZkClientTest > testGetDataAndVersion STARTED

kafka.zk.KafkaZkClientTest > testGetDataAndVersion PASSED

kafka.zk.KafkaZkClientTest > testGetChildren STARTED

kafka.zk.KafkaZkClientTest > testGetChildren PASSED

kafka.zk.KafkaZkClientTest > testSetAndGetConsumerOffset STARTED

kafka.zk.KafkaZkClientTest > testSetAndGetConsumerOffset PASSED

kafka.zk.KafkaZkClientTest > testClusterIdMethods STARTED

kafka.zk.KafkaZkClientTest > testClusterIdMethods PASSED

kafka.zk.KafkaZkClientTest > testEntityConfigManagementMethods STARTED

kafka.zk.KafkaZkClientTest > testEntityConfigManagementMethods PASSED

kafka.zk.KafkaZkClientTest > testUpdateLeaderAndIsr STARTED

kafka.zk.KafkaZkClientTest > testUpdateLeaderAndIsr PASSED

kafka.zk.KafkaZkClientTest > testUpdateBrokerInfo STARTED

kafka.zk.KafkaZkClientTest > testUpdateBrokerInfo PASSED

kafka.zk.KafkaZkClientTest > testCreateRecursive STARTED

kafka.zk.KafkaZkClientTest > testCreateRecursive PASSED

kafka.zk.KafkaZkClientTest > testGetConsumerOffsetNoData STARTED

kafka.zk.KafkaZkClientTest > testGetConsumerOffsetNoData PASSED

kafka.zk.KafkaZkClientTest > testDeleteTopicPathMethods STARTED

kafka.zk.KafkaZkClientTest > testDeleteTopicPathMethods PASSED

kafka.zk.KafkaZkClientTest > testSetTopicPartitionStatesRaw STARTED

kafka.zk.KafkaZkClientTest > testSetTopicPartitionStatesRaw PASSED

kafka.zk.KafkaZkClientTest > testAclManagementMethods STARTED

kafka.zk.KafkaZkClientTest > testAclManagementMethods PASSED

kafka.zk.KafkaZkClientTest > testPreferredReplicaElectionMethods STARTED

kafka.zk.KafkaZkClientTest > testPreferredReplicaElectionMethods PASSED

kafka.zk.KafkaZkClientTest > testPropagateLogDir STARTED

kafka.zk.KafkaZkClientTest > testPropagateLogDir PASSED

kafka.zk.KafkaZkClientTest > testGetDataAndStat STARTED

kafka.zk.KafkaZkClientTest > testGetDataAndStat PASSED

kafka.zk.KafkaZkClientTest > testReassignPartitionsInProgress STARTED

kafka.zk.KafkaZkClientTest > testReassignPartitionsInProgress PASSED

kafka.zk.KafkaZkClientTest > testCreateTopLevelPaths STARTED

kafka.zk.KafkaZkClientTest > testCreateTopLevelPaths PASSED

kafka.zk.KafkaZkClientTest > testIsrChangeNotificationGetters STARTED

kafka.zk.KafkaZkClientTest > testIsrChangeNotificationGetters PASSED

kafka.zk.KafkaZkClientTest > testLogDirEventNotificationsDeletion STARTED

kafka.zk.KafkaZkClientTest > testLogDirEventNotificationsDeletion PASSED

kafka.zk.KafkaZkClientTest > testGetLogConfigs STARTED

kafka.zk.KafkaZkClientTest > testGetLogConfigs PASSED

kafka.zk.KafkaZkClientTest > testBrokerSequenceIdMethods STARTED

kafka.zk.KafkaZkClientTest > testBrokerSequenceIdMethods PASSED

kafka.zk.KafkaZkClientTest > testCreateSequentialPersistentPath STARTED

kafka.zk.KafkaZkClientTest > testCreateSequentialPersistentPath PASSED

kafka.zk.KafkaZkClientTest > testConditionalUpdatePath STARTED

kafka.zk.KafkaZkClientTest > testConditionalUpdatePath PASSED

kafka.zk.KafkaZkClientTest > testDeleteTopicZNode STARTED

kafka.zk.KafkaZkClientTest > testDeleteTopicZNode PASSED

kafka.zk.KafkaZkClientTest > testDeletePath STARTED

kafka.zk.KafkaZkClientTest > testDeletePath PASSED

kafka.zk.KafkaZkClientTest > testGetBrokerMethods STARTED

kafka.zk.KafkaZkClientTest > testGetBrokerMethods PASSED

kafka.zk.KafkaZkClientTest > testCreateTokenChangeNotification STARTED

kafka.zk.KafkaZkClientTest > testCreateTokenChangeNotification PASSED

kafka.zk.KafkaZkClientTest > testGetTopicsAndPartitions STARTED

kafka.zk.KafkaZkClientTest > testGetTopicsAndPartitions PASSED

kafka.zk.KafkaZkClientTest > testRegisterBrokerInfo STARTED

kafka.zk.KafkaZkClientTest > testRegisterBrokerInfo PASSED

kafka.zk.KafkaZkClientTest > testConsumerOffsetPath STARTED

kafka.zk.KafkaZkClientTest > testConsumerOffsetPath PASSED

kafka.zk.KafkaZkClientTest > testControllerManagementMethods STARTED

kafka.zk.KafkaZkClientTest > 

Re: Kafka Summit NYC and London in 2019

2018-12-17 Thread Jun Rao
Hi, Everyone,

This is a reminder about the deadline for proposal this Thursday.

Thanks,

Jun

On Tue, Dec 4, 2018 at 1:49 PM Jun Rao  wrote:

> Hi, Everyone,
>
> We have two upcoming Kafka Summits, one in NYC and another in London. The
> deadline for submitting proposals is Dec 20 for both events. Please consider
> submitting a proposal if you are interested. The Links to submit abstracts
> are
>
> Kafka Summit NYC - https://myeventi.events/kafka19/ny/cfp/
> Kafka Summit London - https://myeventi.events/kafka19/gb/cfp/
>
> Thanks,
>
> Jun
>


Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-12-17 Thread Adam Bellemare
Hi John and Guozhang

Ah yes, I lost that in the mix! Thanks for the convergent solutions - I do
think that the attachment that John included makes for a better design. It
should also help with overall performance as very high-cardinality foreign
keyed data (say millions of events with the same entity) will be able to
leverage the multiple nodes for join functionality instead of having it all
performed in one node. There is still a bottleneck in the right table
having to propagate all those events, but with slimmer structures, less IO
and no need to perform the join I think the throughput will be much higher
in those scenarios.

Okay, I am convinced. I will update the KIP accordingly to a Gliffy version
of John's diagram and ensure that the example flow matches correctly. Then
I can go back to working on the PR to match the diagram.

Thanks both of you for all the help - very much appreciated.

Adam







On Mon, Dec 17, 2018 at 6:39 PM Guozhang Wang  wrote:

> Hi John,
>
> Just made a pass on your diagram (nice hand-drawing btw!), and obviously we
> are thinking about the same thing :) A neat difference that I like, is that
> in the pre-join repartition topic we can still send message in the format
> of `K=k, V=(i=2)` while using "i" as the partition key in StreamsPartition,
> this way we do not need to even augment the key for the repartition topic,
> but just do a projection on the foreign key part but trim all other fields:
> as long as we still materialize the store as `A-2` co-located with the
> right KTable, that is fine.
>
> As I mentioned in my previous email, I also think this has a few advantages
> on saving over-the-wire bytes as well as disk bytes.
>
> Guozhang
>
>
> On Mon, Dec 17, 2018 at 3:17 PM John Roesler  wrote:
>
> > Hi Guozhang,
> >
> > Thanks for taking a look! I think Adam's already addressed your questions
> > as well as I could have.
> >
> > Hi Adam,
> >
> > Thanks for updating the KIP. It looks great, especially how all the
> > need-to-know information is right at the top, followed by the details.
> >
> > Also, thanks for that high-level diagram. Actually, now that I'm looking
> > at it, I think part of my proposal got lost in translation, although I do
> > think that what you have there is also correct.
> >
> > I sketched up a crude diagram based on yours and attached it to the KIP
> > (I'm not sure if attached or inline images work on the mailing list):
> >
> https://cwiki.apache.org/confluence/download/attachments/74684836/suggestion.png
> > . It's also attached to this email for convenience.
> >
> > Hopefully, you can see how it's intended to line up, and which parts are
> > modified.
> > At a high level, instead of performing the join on the right-hand side,
> > we're essentially just registering interest, like "LHS key A wishes to
> > receive updates for RHS key 2". Then, when there is a new "interest" or
> any
> > updates to the RHS records, it "broadcasts" its state back to the LHS
> > records who are interested in it.
> >
> > Thus, instead of sending the LHS values to the RHS joiner workers and
> then
> > sending the join results back to the LHS worker to be co-partitioned and
> > validated, we instead only send the LHS *keys* to the RHS workers and
> then
> > only the RHS k/v back to be joined by the LHS worker.
> >
> > I've been considering both your diagram and mine, and I *think* what I'm
> > proposing has a few advantages.
> >
> > Here are some points of interest as you look at the diagram:
> > * When we extract the foreign key and send it to the Pre-Join Repartition
> > Topic, we can send only the FK/PK pair. There's no need to worry about
> > custom partitioner logic, since we can just use the foreign key plainly
> as
> > the repartition record key. Also, we save on transmitting the LHS value,
> > since we only send its key in this step.
> > * We also only need to store the RHSKey:LHSKey mapping in the
> > MaterializedSubscriptionStore, saving on disk. We can use the same rocks
> > key format you proposed and the same algorithm involving range scans when
> > the RHS records get updated.
> > * Instead of joining on the right side, all we do is compose a
> > re-repartition record so we can broadcast the RHS k/v pair back to the
> > original LHS partition. (this is what the "rekey" node is doing)
> > * Then, there is a special kind of Joiner that's co-resident in the same
> > StreamTask as the LHS table, subscribed to the Post-Join Repartition
> Topic.
> > ** This Joiner is *not* triggered directly by any changes in the LHS
> > KTable. Instead, LHS events indirectly trigger the join via the whole
> > lifecycle.
> > ** For each event arriving from the Post-Join Repartition Topic, the
> > Joiner looks up the corresponding record in the LHS KTable. It validates
> > the FK as you noted, discarding any inconsistent events. Otherwise, it
> > unpacks the RHS K/V pair and invokes the ValueJoiner to obtain the join
> > result
> > ** Note that the Joiner itself is stateless, so 

Re: [VOTE] KIP-345: Introduce static membership protocol to reduce consumer rebalances

2018-12-17 Thread Jason Gustafson
Hi Boyang,

Thanks, the KIP looks good. Just one comment.

The new schema for the LeaveGroup request is slightly odd since it is
handling both the single consumer use case and the administrative use case.
I wonder if we could make it consistent from a batching perspective.

In other words, instead of this:
LeaveGroupRequest => GroupId MemberId [GroupInstanceId]

Maybe we could do this:
LeaveGroupRequest => GroupId [GroupInstanceId MemberId]

For dynamic members, GroupInstanceId could be empty, which is consistent
with JoinGroup. What do you think?
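
As a purely hypothetical illustration of the batched form (the group and
instance ids below are made up):

LeaveGroupRequest =>
  GroupId = "my-group"
  Members = [(GroupInstanceId = "instance-1", MemberId = ""),
             (GroupInstanceId = "instance-2", MemberId = "")]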

Also, just for clarification, what is the expected behavior if the current
memberId of a static member is passed to LeaveGroup? Will the static member
be removed? I know the consumer will not do this, but we'll still have to
handle the case on the broker.

Best,
Jason


On Mon, Dec 10, 2018 at 11:54 PM Boyang Chen  wrote:

> Thanks Stanislav!
>
> Get Outlook for iOS
>
> 
> From: Stanislav Kozlovski 
> Sent: Monday, December 10, 2018 11:28 PM
> To: dev@kafka.apache.org
> Subject: Re: [VOTE] KIP-345: Introduce static membership protocol to
> reduce consumer rebalances
>
> This is great work, Boyang. Thank you very much.
>
> +1 (non-binding)
>
> On Mon, Dec 10, 2018 at 6:09 PM Boyang Chen  wrote:
>
> > Hey there, could I get more votes on this thread?
> >
> > Thanks for the vote from Mayuresh and Mike :)
> >
> > Best,
> > Boyang
> > 
> > From: Mayuresh Gharat 
> > Sent: Thursday, December 6, 2018 10:53 AM
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-345: Introduce static membership protocol to
> > reduce consumer rebalances
> >
> > +1 (non-binding)
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Tue, Dec 4, 2018 at 6:58 AM Mike Freyberger <
> mike.freyber...@xandr.com>
> > wrote:
> >
> > > +1 (non binding)
> > >
> > > On 12/4/18, 9:43 AM, "Patrick Williams" <
> patrick.willi...@storageos.com
> > >
> > > wrote:
> > >
> > > Pls take me off this VOTE list
> > >
> > > Best,
> > >
> > > Patrick Williams
> > >
> > > Sales Manager, UK & Ireland, Nordics & Israel
> > > StorageOS
> > > +44 (0)7549 676279
> > > patrick.willi...@storageos.com
> > >
> > > 20 Midtown
> > > 20 Proctor Street
> > > Holborn
> > > London WC1V 6NX
> > >
> > > Twitter: @patch37
> > > LinkedIn: linkedin.com/in/patrickwilliams4 <
> > >
> >
> https://nam02.safelinks.protection.outlook.com/?url=http%3A%2F%2Flinkedin.com%2Fin%2Fpatrickwilliams4data=02%7C01%7C%7C9b12ec4ce9ae4454db8a08d65f3a4862%7C84df9e7fe9f640afb435%7C1%7C0%7C636801101252994092sdata=ipDTX%2FGARrFkwZfRuOY0M5m3iJ%2Bnkxovv6u9bBDaTyc%3Dreserved=0
> > >
> > >
> > >
> >
> https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fslack.storageos.com%2Fdata=02%7C01%7C%7C9b12ec4ce9ae4454db8a08d65f3a4862%7C84df9e7fe9f640afb435%7C1%7C0%7C636801101252994092sdata=hxuKU6aZdQU%2FpxpqaaThR6IjpEmwIP5%2F3NhYzMYijkw%3Dreserved=0
> > >
> > >
> > >
> > > On 03/12/2018, 17:34, "Guozhang Wang"  wrote:
> > >
> > > Hello Boyang,
> > >
> > > I've browsed through the new wiki and there are still a couple of
> > > minor
> > > things to notice:
> > >
> > > 1. RemoveMemberFromGroupOptions seems not defined anywhere.
> > >
> > > 2. LeaveGroupRequest added a list of group instance id, but still
> > > keep the
> > > member id as a singleton; is that intentional? I think to make
> > the
> > > protocol
> > > consistent both member id and instance ids could be plural.
> > >
> > > 3. About the *kafka-remove-member-from-group.sh *tool, I'm
> > > wondering if we
> > > can defer adding this while just add the corresponding calls of
> > the
> > > LeaveGroupRequest inside Streams until we have used it in
> > > production and
> > > hence have a better understanding on how flexible or extensible
> > if
> > > we want
> > > to add any cmd tools. The rationale is that if we do not
> > > necessarily need
> > > it now, we can always add it later with a more think-through API
> > > design,
> > > but if we add the tool in a rush, we may need to extend or modify
> > > it soon
> > > after we realize its limits in operations.
> > >
> > > Otherwise, I'm +1 on the proposal.
> > >
> > > Guozhang
> > >
> > >
> > > On Mon, Dec 3, 2018 at 9:14 AM Boyang Chen 
> > > wrote:
> > >
> > > > Hey community friends,
> > > >
> > > > after another month of polishing, KIP-345<
> > > >
> > >
> >
> https://nam02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-345%253A%2BIntroduce%2Bstatic%2Bmembership%2Bprotocol%2Bto%2Breduce%2Bconsumer%2Brebalancesdata=02%7C01%7C%7C9b12ec4ce9ae4454db8a08d65f3a4862%7C84df9e7fe9f640afb435%7C1%7C0%7C636801101252994092sdata=T4i7L1i0nIeHrrjTeLOOgYKsfzfNEMGDhTazvBEZbXw%3Dreserved=0
> > > >
> > > > design is ready for vote. Feel free to add your comment on the
> > > discussion
> > > > thread or here.
> > > >
> > > > Thanks for your time!
> > > >
> > > > Boyang
> > > > 

Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-12-17 Thread Guozhang Wang
Hi John,

Just made a pass on your diagram (nice hand-drawing btw!), and obviously we
are thinking about the same thing :) A neat difference that I like, is that
in the pre-join repartition topic we can still send message in the format
of `K=k, V=(i=2)` while using "i" as the partition key in StreamsPartition,
this way we do not need to even augment the key for the repartition topic,
but just do a projection on the foreign key part but trim all other fields:
as long as we still materialize the store as `A-2` co-located with the
right KTable, that is fine.
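
(A toy illustration only, not Streams code: the idea of keying the co-located
subscription store by "foreign key + primary key" so that a prefix range scan
finds every LHS subscriber of an updated RHS record; the key format and
separator below are assumptions:)

import java.util.NavigableMap;
import java.util.TreeMap;

public class FkPrefixScanSketch {
    public static void main(String[] args) {
        // Subscription store keyed by "foreignKey|primaryKey" so that all LHS keys
        // interested in a given RHS key are stored contiguously.
        NavigableMap<String, String> store = new TreeMap<>();
        store.put("2|A", "");   // LHS key A subscribes to RHS key 2
        store.put("2|B", "");   // LHS key B subscribes to RHS key 2
        store.put("3|C", "");   // LHS key C subscribes to RHS key 3

        // When RHS key "2" is updated, a range scan over the "2|" prefix finds A and B.
        String from = "2|";
        String to = "2|" + Character.MAX_VALUE;
        store.subMap(from, true, to, true)
             .keySet()
             .forEach(k -> System.out.println("notify LHS key " + k.split("\\|")[1]));
    }
}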

As I mentioned in my previous email, I also think this has a few advantages
on saving over-the-wire bytes as well as disk bytes.

Guozhang


On Mon, Dec 17, 2018 at 3:17 PM John Roesler  wrote:

> Hi Guozhang,
>
> Thanks for taking a look! I think Adam's already addressed your questions
> as well as I could have.
>
> Hi Adam,
>
> Thanks for updating the KIP. It looks great, especially how all the
> need-to-know information is right at the top, followed by the details.
>
> Also, thanks for that high-level diagram. Actually, now that I'm looking
> at it, I think part of my proposal got lost in translation, although I do
> think that what you have there is also correct.
>
> I sketched up a crude diagram based on yours and attached it to the KIP
> (I'm not sure if attached or inline images work on the mailing list):
> https://cwiki.apache.org/confluence/download/attachments/74684836/suggestion.png
> . It's also attached to this email for convenience.
>
> Hopefully, you can see how it's intended to line up, and which parts are
> modified.
> At a high level, instead of performing the join on the right-hand side,
> we're essentially just registering interest, like "LHS key A wishes to
> receive updates for RHS key 2". Then, when there is a new "interest" or any
> updates to the RHS records, it "broadcasts" its state back to the LHS
> records who are interested in it.
>
> Thus, instead of sending the LHS values to the RHS joiner workers and then
> sending the join results back to the LHS worker to be co-partitioned and
> validated, we instead only send the LHS *keys* to the RHS workers and then
> only the RHS k/v back to be joined by the LHS worker.
>
> I've been considering both your diagram and mine, and I *think* what I'm
> proposing has a few advantages.
>
> Here are some points of interest as you look at the diagram:
> * When we extract the foreign key and send it to the Pre-Join Repartition
> Topic, we can send only the FK/PK pair. There's no need to worry about
> custom partitioner logic, since we can just use the foreign key plainly as
> the repartition record key. Also, we save on transmitting the LHS value,
> since we only send its key in this step.
> * We also only need to store the RHSKey:LHSKey mapping in the
> MaterializedSubscriptionStore, saving on disk. We can use the same rocks
> key format you proposed and the same algorithm involving range scans when
> the RHS records get updated.
> * Instead of joining on the right side, all we do is compose a
> re-repartition record so we can broadcast the RHS k/v pair back to the
> original LHS partition. (this is what the "rekey" node is doing)
> * Then, there is a special kind of Joiner that's co-resident in the same
> StreamTask as the LHS table, subscribed to the Post-Join Repartition Topic.
> ** This Joiner is *not* triggered directly by any changes in the LHS
> KTable. Instead, LHS events indirectly trigger the join via the whole
> lifecycle.
> ** For each event arriving from the Post-Join Repartition Topic, the
> Joiner looks up the corresponding record in the LHS KTable. It validates
> the FK as you noted, discarding any inconsistent events. Otherwise, it
> unpacks the RHS K/V pair and invokes the ValueJoiner to obtain the join
> result
> ** Note that the Joiner itself is stateless, so materializing the join
> result is optional, just as with the 1:1 joins.
>
> So in summary:
> * instead of transmitting the LHS keys and values to the right and the
> JoinResult back to the left, we only transmit the LHS keys to the right and
> the RHS values to the left. Assuming the average RHS value is smaller
> than or equal to the average join result size, it's a clear win on broker
> traffic. I think this is actually a reasonable assumption, which we can
> discuss more if you're suspicious.
> * we only need one copy of the data (the left and right tables need to be
> materialized) and one extra copy of the PK:FK pairs in the Materialized
> Subscription Store. Materializing the join result is optional, just as with
> the existing 1:1 joins.
> * we still need the fancy range-scan algorithm on the right to locate all
> interested LHS keys when a RHS value is updated, but we don't need a custom
> partitioner for either repartition topic (this is of course a modification
> we could make to your version as well)
>
> How does this sound to you? (And did I miss anything?)
> -John
>
> On Mon, Dec 17, 2018 at 9:00 AM 

Re: [DISCUSS] KIP-307: Allow to define custom processor names with KStreams DSL

2018-12-17 Thread Guozhang Wang
Hi Florian / John,

Just wanted to throw a couple minor thoughts on the current proposal:

1) Regarding the interface / function name, I'd propose we call the
interface `NamedOperation`, which would be implemented by Produced /
Consumed / Printed / Joined / Grouped / Suppressed (note I intentionally
exclude Materialized here since its semantics is quite different), and have
the default class that implements `NamedOperation` be `Named`, which would be
used in the overload functions we are adding. The main reason is to have
consistency in naming.

2) As a minor tweak, I think it's better to use Joined.name() both for the
repartition topic it may generate, as well as for the map processor used for
group-by (currently this name is only used for the repartition topic).


Florian: if you think this proposal makes sense, please feel free to go
ahead and update the PR; after we have made a first pass on it and feel
confident about it, we can go ahead with the VOTING process. About the
implementation of 2) above, this may be out of your implementation scope,
so feel free to leave it outside your PR, and Bill, who originally worked
on the Grouped KIP, can make a follow-up PR for it.
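
To make the naming discussion concrete, here is a rough sketch of how option
2) might read for users; `Named.as(...)` and the extra `filter(...)` overload
below are part of this proposal and do not exist yet, and all names are
illustrative:

final StreamsBuilder builder = new StreamsBuilder();
builder.stream("orders", Consumed.with(Serdes.String(), Serdes.String()))
       .filter((key, value) -> value != null, Named.as("skip-null-orders"))
       .to("valid-orders", Produced.with(Serdes.String(), Serdes.String()));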

Guozhang

On Fri, Dec 14, 2018 at 9:43 PM Guozhang Wang  wrote:

> Hello Florian,
>
> Really appreciate you for your patience.
>
> I know that we've discussed the approach of adding overloaded
> functions and rejected it early on. But looking deeper into the current PR,
> I realized that this approach risks great API confusion for users
> (I tried to explain my thoughts in the PR, but it was not very clear) ---
> the basic idea is that, today, we already have a few existing control
> classes including Grouped, Joined, and Suppressed that allow users to specify
> serdes etc., as well as a "name" which can then be used to define the
> processor name / internal topic names in the topology (the static function
> names are not consistent, which I think we should fix as well). And the Named
> interface, by extending the lambda function interfaces like ValueJoiner /
> Predicate etc., opens the door for another way to specify the names again.
>
> So in order to achieve consistency, we are left with generally two options:
>
> 1) only allow users to specify names via the lambda interfaces that
> extend the Named interface. This means we'd better remove the naming mechanism
> from the existing control objects to keep consistency.
>
> 2) only allow users to specify names via control classes, and we introduce
> a new class (Named) for those which do not have one yet --- this leads to
> the overloaded functions.
>
> I did a quick count of the overloaded functions, and summing from
> KTable (8) / KStream (15) / KGroupedStream (6) / KGroupedTable (6) /
> TimeWindowedKStream (6) / SessionWindowedKStream (6) we got about 47
> overloaded functions (our guess was pretty close!) -- note this is based on
> John's proposal that we can let the existing Grouped / Joined extend Named,
> and hence we only need overloaded functions with a default NamedOperation
> for those operators that do not have a control class already.
>
> Thinking about this approach, I feel it is not too bad compared with either
> 1) above, which would require us to deprecate a lot of public functions
> around name(), or having a mixed mechanism for naming, which could lead to
> very confusing behavior for users. Additionally, for most users, who would
> only want to specify the names for those stateful operations which have
> internal topics / state stores and hence are more keen on upgrade
> compatibility, those added overloads would be rarely used functions
> anyway. And by letting the existing control classes extend Named, we
> can have a unified method name for the static constructor as well.
>
>
>
> Guozhang
>
>
> On Fri, Dec 14, 2018 at 10:24 AM John Roesler  wrote:
>
>> Hi Florian,
>>
>> Sorry about the run-around of rejecting the original proposal,
>> only to return to it later on. Hopefully, it's more encouraging
>> than frustrating that we're coming around to your initial way of
>> thinking.
>>
>> Thanks!
>> -John
>>
>> On Thu, Dec 13, 2018 at 4:28 PM Florian Hussonnois > >
>> wrote:
>>
>> > Hi all,
>> >
>> > Thanks again. I agree with your propositions.
>> > Also IMHO, overloading all methods (filter, map) to accept a new control
>> > object seems to provide a more natural development experience for users.
>> >
>> > Actually, this was the first proposition for this KIP, but we
>> > rejected it because this solution led to adding a lot of new methods.
>> > As you mentioned, the API has evolved since the creation of this KIP -
>> > some existing control objects already allow customizing internal
>> > names. We should therefore keep to that strategy.
>> >
>> > If everyone is OK with that, I will update the KIP and the PR
>> > accordingly.
>> >
>> > Thanks.
>> >
>> > On Thu, Dec 13, 2018 at 18:08, John Roesler wrote:
>> >
>> > > Hi again, all,
>> > >
>> > > Matthias, I agree with you.

[jira] [Created] (KAFKA-7748) Add wall clock TimeDefinition for suppression of intermediate events

2018-12-17 Thread Jonathan Gordon (JIRA)
Jonathan Gordon created KAFKA-7748:
--

 Summary: Add wall clock TimeDefinition for suppression of 
intermediate events
 Key: KAFKA-7748
 URL: https://issues.apache.org/jira/browse/KAFKA-7748
 Project: Kafka
  Issue Type: New Feature
  Components: streams
Affects Versions: 2.1.0
Reporter: Jonathan Gordon


Currently, Kafka Streams offers the ability to suppress intermediate events 
based on either RecordTime or WindowEndTime, which are in turn defined by 
stream time:

{{Suppressed.untilTimeLimit(final Duration timeToWaitForMoreEvents, final 
BufferConfig bufferConfig)}}

It would be helpful to have another option that would allow suppression of 
intermediate events based on wall clock time. This would allow us to only 
produce a limited number of aggregates independent of their stream time (which 
in our case is event time).
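
A rough sketch of the desired usage, next to what the current API already offers 
(given an aggregated KTable {{table}}; the wall-clock method name is purely 
illustrative):

{code:java}
// Possible today (bounded by stream time):
table.suppress(Suppressed.untilTimeLimit(Duration.ofMinutes(5), BufferConfig.maxRecords(1000L)));

// What this ticket asks for, roughly (method name is illustrative only):
table.suppress(Suppressed.untilWallClockTimeLimit(Duration.ofMinutes(5), BufferConfig.maxRecords(1000L)));
{code}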

For reference, here's the relevant KIP:

[https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables#KIP-328:AbilitytosuppressupdatesforKTables-Best-effortratelimitperkey]

And here's the relevant Confluent Slack thread:

https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1544468349201700

 





Re: KIP-406: GlobalStreamThread should honor custom reset policy

2018-12-17 Thread Richard Yu
Hi Matthias,

It would be great if we got your input on this.


On Sun, Dec 16, 2018 at 3:06 PM Richard Yu 
wrote:

> Hi everybody,
>
> There is a new KIP regarding the resilience of GlobalStreamThread which
> could be seen below:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-406%3A+GlobalStreamThread+should+honor+custom+reset+policy
>
> We are considering the addition of some new reset policies. It would be
> great if you could pitch in!
>
> Thanks,
> Richard Yu
>


Re: [DISCUSS] KIP-382: MirrorMaker 2.0

2018-12-17 Thread Jun Rao
Hi, Sonke, Ryanne,

Thanks for the explanation. To me, the single connect cluster model could
be useful for any connector, not just MM. So, if we want to add it, it
seems it would be useful to do it in a backward-compatible way in the
connect framework, rather than something specific to MM. I am not sure what the
best approach is. For example, one other option is KIP-296. If we feel that's
adding too much work in this KIP, it might be ok to leave this part out of
this KIP.

Jun

On Fri, Dec 14, 2018 at 1:25 PM Ryanne Dolan  wrote:

> Thanks Sönke, you're spot-on. I don't want MM2 to wait for Connect features
> that don't exist yet, especially if MM2 is the primary use case for them.
> Moreover, I think MM2 can drive and inform some of these features, which
> only makes sense if we adopt MM2 first.
>
> Ryanne
>
> On Fri, Dec 14, 2018, 9:03 AM Sönke Liebau
> 
> > Hi Jun,
> >
> > I believe Ryanne's idea is to run multiple workers per MM cluster-node,
> one
> > per target cluster. So in essence you'd specify three clusters in the MM
> > config and MM would then instantiate one worker per cluster. Every MM
> > connector would then be deployed to the appropriate (internal) worker
> that
> > is configured for the cluster in question. Thus there would be no changes
> > necessary to the Connect framework itself, everything would be handled
> by a
> > new layer around existing Connect code (probably a sibling implementation
> > to the DistributedHerder if I understood him correctly). Ryanne, please
> > correct/expand if I misunderstood your intentions.
> >
> > To briefly summarize the discussion that Ryanne and I had around this
> > earlier, my opinion was that the extra layer could potentially be avoided
> > by extending Connect instead, which would benefit all connectors.
> >
> > My proposal was to add a configuration option to the worker config that
> > allows defining "external clusters" which can then be referenced from the
> > connector config.
> >
> > For example:
> >
> > # Core cluster config stays the same and is used for status, config and
> > offsets as usual
> > bootstrap.servers=localkafka1:9092,localkafka2:9092
> >
> > # Allow defining extra remote clusters
> >
> >
> externalcluster.kafka_europe.bootstrap.servers=europekafka1:9092,europekafka2:9092
> > externalcluster.kafka_europe.security.protocol=SSL
> >
> >
> externalcluster.kafka_europe.ssl.truststore.location=/var/private/ssl/kafka.client.truststore.jks
> > ...
> >
> >
> externalcluster.kafka_asia.bootstrap.servers=asiakafka1:9092,asiakafka2:9092
> >
> >
> > When starting a connector you could now reference these pre-configured
> > clusters in the config:
> > {
> >   "name": "file-source",
> >   "config": {
> > "connector.class": "FileStreamSource",
> > "file": "/tmp/test.txt",
> > "topic": "connect-test",
> > "name": "file-source",
> > "cluster": "kafka_asia"
> >   }
> > }
> >
> > When the "cluster" parameter is omitted, the current behavior of Connect
> > remains unchanged. This way we could address multiple remote clusters from
> > within a single worker without adding the extra layer for MirrorMaker. I
> > believe that this could be done without major structural changes to the
> > Connect codebase, but I freely admit that this opinion is based on 10
> > minutes of poking through the code, not any real expertise.
> >
> > Ryanne's main concern with this approach was that there are additional
> > worker settings that apply to all connectors and that no truly universal
> > approach would be feasible while running a single worker per Connect
> > node.
> > Also he feels that from a development perspective it would be preferable
> to
> > have independent MM code and contribute applicable features back to
> > Connect.
> > While I agree that this would make development of MM easier, it will also
> > create a certain amount of extra code (can probably be kept at a minimum,
> > but still) that could be avoided by using "vanilla" Connect for MM.
> >
> > I hope I summarized your views accurately Ryanne, if not please feel free
> > to correct me!
> >
> > Best regards,
> > Sönke
> >
> >
> > On Fri, Dec 14, 2018 at 1:55 AM Jun Rao  wrote:
> >
> > > Hi, Ryanne,
> > >
> > > Regarding the single connect cluster model, yes, the co-existence of a
> > MM2
> > > REST API and the nearly identical Connect API is one of my concerns.
> > > Implementation-wise, my understanding is that the producer URL in a
> > > SourceTask is always obtained from the connect worker's configuration.
> > > So, I am not sure how you would customize the producer URL for an
> > > individual SourceTask without additional support from the Connect
> > > framework.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Mon, Dec 10, 2018 at 1:17 PM Ryanne Dolan 
> > > wrote:
> > >
> > > > Jun, thanks for your time reviewing the KIP.
> > > >
> > > > In a MirrorSourceConnector, it seems that the offsets of the source
> > > > will be stored in a different cluster from the target cluster?
> > > >
> 

[jira] [Created] (KAFKA-7747) Consumer should check for truncation after leader changes

2018-12-17 Thread Jason Gustafson (JIRA)
Jason Gustafson created KAFKA-7747:
--

 Summary: Consumer should check for truncation after leader changes
 Key: KAFKA-7747
 URL: https://issues.apache.org/jira/browse/KAFKA-7747
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Reporter: Jason Gustafson
Assignee: Jason Gustafson
 Fix For: 2.2.0


The change is documented in KIP-320: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-320%3A+Allow+fetchers+to+detect+and+handle+log+truncation.
 After a leader change, the consumer needs to verify its current fetch offset 
against the new leader's log in order to detect possible truncation.





[jira] [Created] (KAFKA-7746) sasl.jaas.config dynamic broker configuration does not accept "=" in value

2018-12-17 Thread Tom Scott (JIRA)
Tom Scott created KAFKA-7746:


 Summary: sasl.jaas.config dynamic broker configuration does not 
accept "=" in value
 Key: KAFKA-7746
 URL: https://issues.apache.org/jira/browse/KAFKA-7746
 Project: Kafka
  Issue Type: Bug
  Components: config, security
Reporter: Tom Scott


In KIP-226 an example is given of setting sasl.jaas.config using dynamic broker 
configuration:

[https://cwiki.apache.org/confluence/display/KAFKA/KIP-226+-+Dynamic+Broker+Configuration#KIP-226-DynamicBrokerConfiguration-NewBrokerConfigurationOption]

 

However, as most SASL module configurations contain the "=" symbol, this ends up 
with the error:

 
{code:java}
requirement failed: Invalid entity config: all configs to be added must be in 
the format “key=val”.{code}
 

I have tried various escape sequences but have not so far been successful.
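
A possible workaround sketch is to apply the dynamic config through the 
AdminClient instead of kafka-configs.sh, which sidesteps the command-line 
key=val parsing entirely; the config key and JAAS value below are illustrative 
and untested:

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.Config;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class SetBrokerJaasConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        try (AdminClient admin = AdminClient.create(props)) {
            // dynamic config for broker 0; key and JAAS value are examples only
            ConfigResource broker = new ConfigResource(ConfigResource.Type.BROKER, "0");
            ConfigEntry jaas = new ConfigEntry(
                "listener.name.sasl_plaintext.plain.sasl.jaas.config",
                "org.apache.kafka.common.security.plain.PlainLoginModule required "
                    + "username=\"admin\" password=\"admin-secret\";");
            admin.alterConfigs(Collections.singletonMap(broker, new Config(Collections.singleton(jaas))))
                 .all()
                 .get();
        }
    }
}
{code}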

 

 

 





Build failed in Jenkins: kafka-trunk-jdk8 #3264

2018-12-17 Thread Apache Jenkins Server
See 


Changes:

[github] MINOR: improve Streams error message (#5975)

--
[...truncated 4.49 MB...]

Re: [DISCUSS] KIP-404: Add Kafka Connect configuration parameter for disabling WADL output on OPTIONS request

2018-12-17 Thread Konstantine Karantasis
Thanks for considering removal, Alex.
I totally agree with your assessment. Still, I'd be in favor of making
KIP-404 a small KIP that describes that this option is now being disabled.
(If I'm not mistaken, one place I've noticed this feature being used is in
Connect's unit tests for the REST interface.)

Konstantine

On Fri, Dec 14, 2018 at 5:00 PM Oleksandr Diachenko <
alex.diache...@confluent.io> wrote:

> Konstantine and Jason,
>
> I do agree that this functionality was not documented, and most likely not
> intended to be present.
> Therefore we can consider it as not a part of the public interface, and
> current behavior as not expected.
> Hence, addressing the issue by just disabling the WADL output seems like a
> viable solution to me.
>
> In order to proceed, do we need this KIP at all, or is creating a new JIRA
> and fixing it as a bug without changes in public interfaces sufficient?
>
> Regards, Alex.
>
> On Fri, Dec 14, 2018 at 4:18 PM Jason Gustafson 
> wrote:
>
> > Hi Alex,
> >
> > I think WADL support was likely unintentional, so this could be treated
> as
> > more of a bug. Unless we think it's a good idea to support it going
> > forward, I'd suggest going with the rejected alternative of just turning
> it
> > off. What do you think?
> >
> > Thanks,
> > Jason
> >
> > On Fri, Dec 14, 2018 at 3:06 PM Oleksandr Diachenko <
> > alex.diache...@confluent.io> wrote:
> >
> > > Thanks, everyone for taking the time to review the KIP.
> > >
> > > It looks like there are no major objections on it, so I will start
> voting
> > > thread.
> > >
> > > Regards, Alex.
> > >
> > >
> > >
> > > On Thu, Dec 13, 2018 at 3:50 PM Randall Hauch 
> wrote:
> > >
> > > > Thanks, Alex. The KIP looks good to me.
> > > >
> > > > Randall
> > > >
> > > > On Wed, Dec 12, 2018 at 10:08 PM Guozhang Wang 
> > > wrote:
> > > >
> > > > > Alex,
> > > > >
> > > > > Thanks for putting up this KIP. The proposal lgtm.
> > > > >
> > > > > Guozhang
> > > > >
> > > > > On Wed, Dec 12, 2018 at 7:41 PM Oleksandr Diachenko <
> > > > odiache...@apache.org
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I would like to start a discussing for the following KIP:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-404%3A+Add+Kafka+Connect+configuration+parameter+for+disabling+WADL+output+on+OPTIONS+request
> > > > > > .
> > > > > >
> > > > > > The KIP proposes to add a configuration parameter for the Connect
> > > > > > Worker, which would make it possible to not expose WADL information
> > > > > > in Connect REST API responses.
> > > > > >
> > > > > > Feedback is appreciated, thanks in advance.
> > > > > >
> > > > > > Regards, Alex.
> > > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> >
>


Re: [DISCUSS] KIP-258: Allow to Store Record Timestamps in RocksDB

2018-12-17 Thread Matthias J. Sax
Dear all,

I finally managed to update the KIP.

To address the concerns about the complex upgrade path, I simplified the
design. We don't need any configs and the upgrade can be done with the
simple single rolling bounce pattern.

The suggestion is to exploit RocksDB column families to isolate the old and
new on-disk formats. Furthermore, the upgrade from the old to the new format
happens "on the side" after an instance has been upgraded.

I also pushed a WIP PR in case you want to look into some details
(potential reviewers, don't panic: I plan to break this down into
multiple PRs for actual review if the KIP is accepted).

https://github.com/apache/kafka/pull/6044

@Eno: I think I never answered your question about being future proof:

The latest design is not generic, because it does not support changes
that need to be reflected in the changelog topic. I aimed for a
non-generic design for now to keep it as simple as possible. Thus, other
format changes might need a different design / upgrade path -- however,
because this KIP is quite encapsulated in the current design, I don't
see any issue with building this later, and a generic upgrade path seems to
be an orthogonal concern atm.
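
To make the column-family idea a bit more concrete, here is a rough sketch of
the underlying RocksDB mechanics only (this is not the code in the PR; the
second family's name and the path are illustrative):

import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.rocksdb.ColumnFamilyDescriptor;
import org.rocksdb.ColumnFamilyHandle;
import org.rocksdb.DBOptions;
import org.rocksdb.RocksDB;

public class ColumnFamilyUpgradeSketch {
    public static void main(final String[] args) throws Exception {
        RocksDB.loadLibrary();
        try (DBOptions options = new DBOptions()
                .setCreateIfMissing(true)
                .setCreateMissingColumnFamilies(true)) {
            List<ColumnFamilyDescriptor> descriptors = Arrays.asList(
                // old format: plain values
                new ColumnFamilyDescriptor(RocksDB.DEFAULT_COLUMN_FAMILY),
                // new format: values with an embedded record timestamp
                new ColumnFamilyDescriptor("keyValueWithTimestamp".getBytes(StandardCharsets.UTF_8)));
            List<ColumnFamilyHandle> handles = new ArrayList<>();
            try (RocksDB db = RocksDB.open(options, "/tmp/upgrade-sketch", descriptors, handles)) {
                // reads would check the new family first and fall back to the old one;
                // hits on the old family get re-written into the new family "on the side",
                // so both formats can coexist during the single rolling bounce
            } finally {
                handles.forEach(ColumnFamilyHandle::close);
            }
        }
    }
}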


-Matthias


On 11/22/18 2:50 PM, Adam Bellemare wrote:
> Thanks for the information Matthias.
> 
> I will await your completion of this ticket then since it underpins the
> essential parts of a RocksDB TTL aligned with the changelog topic. I am
> eager to work on that ticket myself, so if I can help on this one in any
> way please let me know.
> 
> Thanks
> Adam
> 
> 
> 
> On Tue, Nov 20, 2018 at 5:26 PM Matthias J. Sax 
> wrote:
> 
>> It's an interesting idea to use a second store to maintain the
>> timestamps. However, each RocksDB instance implies some overhead. In
>> fact, we are looking into ColumnFamilies atm to see if we can use those
>> and merge multiple RocksDBs into a single one to reduce this overhead.
>>
>> -Matthias
>>
>> On 11/20/18 5:15 AM, Patrik Kleindl wrote:
>>> Hi Adam
>>>
>>> Sounds great, I was already planning to ask around if anyone had tackled
>>> this.
>>> We have a use case very similar to what you described in KAFKA-4212, only
>>> with Global State Stores.
>>> I have tried a few things with the normal DSL but was not really
>> successful.
>>> Schedule/Punctuate is not possible, supplying a windowed store is also
>> not
>>> allowed and the process method has no knowledge of the timestamp of the
>>> record.
>>> And anything loaded on startup is not filtered anyway.
>>>
>>> Regarding 4212, wouldn't it be easier (although a little less
>>> space-efficient) to track the timestamps in a separate store with
>>> <key, Long>?
>>> This would leave the original store intact and allow a migration of the
>>> timestamps without touching the other data.
>>>
>>> So I am very interested in your PR :-)
>>>
>>> best regards
>>>
>>> Patrik
>>>
>>> On Tue, 20 Nov 2018 at 04:46, Adam Bellemare 
>>> wrote:
>>>
 Hi Matthias

 Thanks - I figured that it was probably a case of just too much to do and
 not enough time. I know how that can go. I am asking about this one in
 relation to https://issues.apache.org/jira/browse/KAFKA-4212, adding a TTL
 to RocksDB. I have outlined a bit about my use-case within 4212, but for
 brevity here it is:

 My case:
 1) I have a RocksDB with TTL implementation working where records are aged
 out using the TTL that comes with RocksDB (very simple).
 2) We prevent records from loading from the changelog if recordTime + TTL <
 referenceTimeStamp (default = System.currentTimeMillis() ).

 This assumes that the records are stored with the same time reference (say
 UTC) as the consumer materializing the RocksDB store.

 My questions about KIP-258 are as follows:
 1) How does "we want to be able to store record timestamps in KTables"
 differ from inserting records into RocksDB with TTL at consumption time? I
 understand that it could be a difference of some seconds, minutes, hours,
 days, etc. between when the record was published and now, but given the
 nature of how RocksDB TTL works (eventual - based on compaction) I don't
 see how a precise TTL can be achieved, such as that which one can get with
 windowed stores.

 2) Are you looking to change how records are inserted into a TTL RocksDB,
 such that the TTL would take effect from the record's published time? If
 not, what would be the ideal workflow here for a single record with TTL
 RocksDB?
 ie: Record Timestamp: 100
 TTL: 50
 Record inserted into rocksDB: 110
 Record to expire at 150?

 3) I'm not sure I fully understand the importance of the upgrade path. I
 have read the link to (https://issues.apache.org/jira/browse/KAFKA-3522)
 in the KIP, and I can understand that a state-store on disk may not
 represent what the application is expecting. I don't think I have the full 

Re: KIP-213 - Scalable/Usable Foreign-Key KTable joins - Rebooted.

2018-12-17 Thread Adam Bellemare
Hi John & Guozhang

@John & @Guozhang Wang  - I have cleaned up the KIP,
pruned much of what I wrote and put a simplified diagram near the top to
illustrate the workflow. I encapsulated Jan's content at the bottom of the
document. I believe it is simpler to read by far now.

@Guozhang Wang :
> #1: rekey left table
>   -> source from the left upstream, send to rekey-processor to generate
combined key, and then sink to copartition topic.
Correct.

> #2: first-join with right table
>   -> source from the right table upstream, materialize the right table.
>   -> source from the co-partition topic, materialize the rekeyed left
table, join with the right table, rekey back, and then sink to the
rekeyed-back topic.
Almost - I cleared up the KIP. We do not rekey back yet, as I need the
Foreign-Key value generated in #1 above to compare in the resolution stage.

> #3: second join
>-> source from the rekeyed-back topic, materialize the rekeyed back
table.
>   -> source from the left upstream, materialize the left table, join with
the rekeyed back table.
Almost - As each event comes in, we just run it through a stateful
processor that checks the original ("This") KTable for the key. The value
payload then has the foreignKeyExtractor applied again as in Part #1 above,
and gets the current foreign key. Then we compare it to the joined event
that we are currently resolving. If they have the same foreign-key,
propagate the result out. If they don't, throw the event away.

The end result is that we do need to materialize 2 additional tables
(left/this-combinedkey table, and the final Joined table) as I've
illustrated in the updated KIP. I hope the diagram clears it up a lot
better. Please let me know.
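
To make that resolution step concrete, here is a rough, simplified sketch (all
class and method names are illustrative stand-ins, not proposed public API):

import java.util.Map;
import java.util.function.BiConsumer;
import java.util.function.Function;

// Simplified sketch of the resolver that runs co-located with the original ("This") KTable.
public class ForeignKeyResolver<K, V, FK, VR> {
    private final Map<K, V> thisTable;                 // stands in for the materialized "This" KTable
    private final Function<V, FK> foreignKeyExtractor; // same extractor as in step #1
    private final BiConsumer<K, VR> downstream;        // stands in for context.forward(...)

    public ForeignKeyResolver(final Map<K, V> thisTable,
                              final Function<V, FK> foreignKeyExtractor,
                              final BiConsumer<K, VR> downstream) {
        this.thisTable = thisTable;
        this.foreignKeyExtractor = foreignKeyExtractor;
        this.downstream = downstream;
    }

    // Called for every joined event coming back from the final repartition topic.
    public void resolve(final K leftKey, final FK joinedForeignKey, final VR joinResult) {
        final V currentLeftValue = thisTable.get(leftKey);
        if (currentLeftValue == null) {
            return; // the left record no longer exists: drop the stale result
        }
        final FK currentForeignKey = foreignKeyExtractor.apply(currentLeftValue);
        if (currentForeignKey.equals(joinedForeignKey)) {
            downstream.accept(leftKey, joinResult); // still consistent: propagate
        }
        // otherwise the foreign key changed in the meantime: throw the event away
    }
}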

Thanks again
Adam




On Sun, Dec 16, 2018 at 3:18 PM Guozhang Wang  wrote:

> John,
>
> Thanks a lot for the suggestions on refactoring the wiki. I agree with you
> that we should write the KIP proposal so that it can be easily understood by
> anyone reading it in the future, and hence should provide a good summary of
> the user-facing interfaces, as well as rejected alternatives that briefly
> capture "how we came a long way to this conclusion, and what we have
> argued, disagreed, and agreed about, etc." so that readers do not need to
> dig into the DISCUSS thread to get all the details. We can, of course, keep
> the implementation details like "workflows" on the wiki page as an addendum
> section since it is still related.
>
> Regarding your proposal on comment 6): that's a very interesting idea! Just
> to clarify that I understand it correctly: the proposal's resulting
> topology is still the same as the current proposal, where we will have 3
> sub-topologies for this operator:
>
> #1: rekey left table
>-> source from the left upstream, send to rekey-processor to generate
> combined key, and then sink to copartition topic.
>
> #2: first-join with right table
>-> source from the right table upstream, materialize the right table.
>-> source from the co-partition topic, materialize the rekeyed left
> table, join with the right table, rekey back, and then sink to the
> rekeyed-back topic.
>
> #3: second join
>-> source from the rekeyed-back topic, materialize the rekeyed back
> table.
>-> source from the left upstream, materialize the left table, join with
> the rekeyed back table.
>
> Sub-topologies #1 and #3 may be merged into a single sub-topology since both
> of them read from the left table source stream. In this workflow, we need to
> materialize 4 tables (left table in #3, right table in #2, rekeyed left
> table in #2, rekeyed-back table in #3), and create 2 repartition topics
> (copartition topic, rekeyed-back topic).
>
> Compared with Adam's current proposal in the workflow overview, it has the
> same number of materialized tables (left table, rekeyed left table, right
> table, out-of-ordering resolver table), and the same number of internal
> topics (two). The advantage is that on the copartition topic we can save
> bandwidth by not sending the value, and in #2 the rekeyed left table is
> smaller since we do not have any values to materialize. Is that right?
>
>
> Guozhang
>
>
>
> On Wed, Dec 12, 2018 at 1:22 PM John Roesler  wrote:
>
> > Hi Adam,
> >
> > Given that the committers are all pretty busy right now, I think that it
> > would help if you were to refactor the KIP a little to reduce the
> workload
> > for reviewers.
> >
> > I'd recommend the following changes:
> > * relocate all internal details to a section at the end called something
> > like "Implementation Notes".
> > * rewrite the rest of the KIP to be as succinct as possible and mention
> > only publicly-facing API changes.
> > ** for example, the interface that you've already listed there, as well
> as
> > a textual description of the guarantees we'll be providing (join result
> is
> > copartitioned with the LHS, and the join result is guaranteed correct)
> >
> > A good target would be that the whole main body of the KIP, including
> 

[jira] [Created] (KAFKA-7745) Kafka Connect doesn't create tombstone record for tasks of deleted connector

2018-12-17 Thread Andrey Pustovetov (JIRA)
Andrey Pustovetov created KAFKA-7745:


 Summary: Kafka Connect doesn't create tombstone record for tasks 
of deleted connector
 Key: KAFKA-7745
 URL: https://issues.apache.org/jira/browse/KAFKA-7745
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 1.1.1
Reporter: Andrey Pustovetov


*Test case*

# Create a connector:
{noformat}
curl -i -XPOST -H "Accept:application/json" -H  "Content-Type:application/json" 
http://localhost:8083/connectors/ -d'
{
  "name": "postgres-connector",
  "config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"database.hostname": "postgresql",
"database.port": "5432",
"database.user": "postgres",
"database.password": "postgres",
"database.dbname": "postgres",
"database.server.name": "test",
"slot.drop_on_stop": "true"
  }
}'
{noformat}
# Delete the connector:
{noformat}
curl -XDELETE http://localhost:8083/connectors/postgres-connector
{noformat}
# Check your config topic:
{noformat}
./bin/kafka-console-consumer.sh --bootstrap-server kafka:9092 --topic 
my-connect-config --from-beginning --property print.key=true
{noformat}

AR:
The following records are read from the config topic after the connector is 
created:
||Key||Value||
|connector-postgres-connector|{{\{"properties":\{"connector.class":"io.debezium.connector.postgresql.PostgresConnector","database.hostname":"postgresql","database.port":"5432","database.user":"postgres","database.password":"postgres","database.dbname":"postgres","database.server.name":"test","slot.drop_on_stop":"true","name":"postgres-connector"\}\}}}|
|task-postgres-connector-0|{{\{"properties":\{"connector.class":"io.debezium.connector.postgresql.PostgresConnector","slot.drop_on_stop":"true","database.user":"postgres","database.dbname":"postgres","task.class":"io.debezium.connector.postgresql.PostgresConnectorTask","database.hostname":"postgresql","database.password":"postgres","name":"postgres-connector","database.server.name":"test","database.port":"5432"\}\}}}|
|commit-postgres-connector|{{\{"tasks":1\}}}|

The following records are read from the config topic after the connector is 
deleted:
||Key||Value||
|connector-postgres-connector|{{null}}|
|target-state-postgres-connector|{{null}}|

Note that the tombstone record is missing for the {{task-postgres-connector-0}} 
key.

Additionally, the following fields of the 
{{org.apache.kafka.connect.storage.KafkaConfigBackingStore}} class are never 
cleared during connector removal:
* {{connectorTaskCounts}}
* {{taskConfigs}}
* {{deferredTaskUpdates}}

which can lead to memory leaks.
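
As a stop-gap, the orphaned task key can be tombstoned manually. A rough sketch 
(topic and key names match the example above, producer settings are 
illustrative, and writing to the config topic by hand should be done with care):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class TombstoneOrphanedTaskConfig {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "kafka:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // a null value is a tombstone for the task config left behind by the deleted connector
            producer.send(new ProducerRecord<>("my-connect-config", "task-postgres-connector-0", null)).get();
        }
    }
}
{code}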





RE: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory

2018-12-17 Thread Pellerin, Clement
I'm new here. Is this vote binding or not?

-Original Message-
From: Harsha [mailto:ka...@harsha.io] 
Sent: Saturday, December 15, 2018 1:59 PM
To: dev@kafka.apache.org
Subject: Re: [VOTE] [REMINDER] KIP-383 Pluggable interface for SSL Factory

Overall LGTM. +1.

Thanks,
Harsha


[jira] [Created] (KAFKA-7744) Type in java

2018-12-17 Thread Andrii Abramov (JIRA)
Andrii Abramov created KAFKA-7744:
-

 Summary: Type in java
 Key: KAFKA-7744
 URL: https://issues.apache.org/jira/browse/KAFKA-7744
 Project: Kafka
  Issue Type: Bug
  Components: documentation
Affects Versions: 0.10.0.0
Reporter: Andrii Abramov


Typo in JavaDocs 
([https://kafka.apache.org/10/javadoc/?org/apache/kafka/clients/consumer/KafkaConsumer.html])

 
{noformat}
This can be achieved by by setting the isolation.level=read_committed in the 
consumer's configuration.{noformat}
 

There is an extra 'by'





[jira] [Created] (KAFKA-7743) "pure virtual method called" error message thrown for unit tests which PASS

2018-12-17 Thread Sarvesh Tamba (JIRA)
Sarvesh Tamba created KAFKA-7743:


 Summary: "pure virtual method called" error message thrown for 
unit tests which PASS
 Key: KAFKA-7743
 URL: https://issues.apache.org/jira/browse/KAFKA-7743
 Project: Kafka
  Issue Type: Bug
  Components: unit tests
Affects Versions: 2.1.0, 2.0.1, 2.0.0, 2.2.0, 2.1.1, 2.0.2
Reporter: Sarvesh Tamba


Observing the following messages intermittently for a few random unit tests, 
though the status for each of them is PASSED:-
*"pure virtual method called*
*terminate called without an active exception"*

Some of the unit tests throwing above messages are, besides others:-
org.apache.kafka.streams.kstream.internals.KTableImplTest > 
shouldThrowNullPointerOnTransformValuesWithKeyWhenMaterializedIsNull PASSED
org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest > 
shouldThrowStreamsExceptionOnStartupIfThereIsAStreamsException PASSED
org.apache.kafka.streams.state.internals.CachingSessionStoreTest > 
shouldNotForwardChangedValuesDuringFlushWhenSendOldValuesDisabled PASSED
org.apache.kafka.streams.processor.internals.StoreChangelogReaderTest > 
shouldCompleteImmediatelyWhenEndOffsetIs0 PASSED
org.apache.kafka.streams.kstream.internals.KStreamKStreamJoinTest > testJoin 
PASSED
org.apache.kafka.streams.kstream.internals.KTableFilterTest > testTypeVariance 
PASSED
org.apache.kafka.streams.kstream.internals.KTableKTableInnerJoinTest > 
testNotSendingOldValues PASSED
org.apache.kafka.streams.processor.internals.GlobalStreamThreadTest > 
shouldThrowStreamsExceptionOnStartupIfThereIsAStreamsException PASSED
org.apache.kafka.streams.processor.DefaultPartitionGrouperTest > 
shouldComputeGroupingForTwoGroups PASSED
org.apache.kafka.streams.state.internals.RocksDBWindowStoreTest > 
shouldFetchAndIterateOverExactKeys PASSED
org.apache.kafka.streams.state.internals.FilteredCacheIteratorTest > 
shouldFilterEntriesNotMatchingHasNextCondition PASSED
org.apache.kafka.streams.state.internals.GlobalStateStoreProviderTest > 
shouldThrowExceptionIfStoreIsntOpen PASSED

This probably causes the 'gradle unitTest' command to fail during cleanup time 
with final status as FAILED and the following message:-
"Process 'Gradle Test Executor 16' finished with non-zero exit value 134"

This intermittent/random error is not seen when the final unit test suite status is 
"BUILD SUCCESSFUL".
Reproducing the "pure virtual method" issue is extremely hard, since it happens 
intermittently and for any random unit test (the same unit test will not necessarily 
fail next time). The ones noted above were some of the failing unit tests observed. 
Note that the status next to the test shows PASSED (is this correct or 
misleading?).


