[jira] [Assigned] (KAFKA-8134) ProducerConfig.LINGER_MS_CONFIG undocumented breaking change in kafka-clients 2.1

2019-03-25 Thread Dhruvil Shah (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dhruvil Shah reassigned KAFKA-8134:
---

Assignee: Dhruvil Shah

> ProducerConfig.LINGER_MS_CONFIG undocumented breaking change in kafka-clients 
> 2.1
> -
>
> Key: KAFKA-8134
> URL: https://issues.apache.org/jira/browse/KAFKA-8134
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.1.0
>Reporter: Sam Lendle
>Assignee: Dhruvil Shah
>Priority: Major
>
> Prior to 2.1, the type of the "linger.ms" config was Long, but it was changed to 
> Integer in 2.1.0 ([https://github.com/apache/kafka/pull/5270]). A config that uses 
> a Long value for that parameter, which works with kafka-clients < 2.1, will 
> cause a ConfigException to be thrown when constructing a KafkaProducer if 
> kafka-clients is upgraded to >= 2.1.
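
For illustration, a minimal sketch of the reported incompatibility; the broker address is a placeholder and this is only an assumed repro, not taken from the ticket:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.common.serialization.StringSerializer;

public class LingerMsRepro {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);

    // Accepted by kafka-clients < 2.1 (linger.ms was typed as Long);
    // with kafka-clients 2.1.x this throws ConfigException because the
    // config type is now Integer and a java.lang.Long value is rejected.
    props.put(ProducerConfig.LINGER_MS_CONFIG, 5L);

    // Passing the value as a String (or an int) works on both old and new clients:
    // props.put(ProducerConfig.LINGER_MS_CONFIG, "5");

    new KafkaProducer<String, String>(props).close();
  }
}
{code}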



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8134) ProducerConfig.LINGER_MS_CONFIG undocumented breaking change in kafka-clients 2.1

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801386#comment-16801386
 ] 

ASF GitHub Bot commented on KAFKA-8134:
---

dhruvilshah3 commented on pull request #6502: KAFKA-8134: `linger.ms` must be a 
long
URL: https://github.com/apache/kafka/pull/6502
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> ProducerConfig.LINGER_MS_CONFIG undocumented breaking change in kafka-clients 
> 2.1
> -
>
> Key: KAFKA-8134
> URL: https://issues.apache.org/jira/browse/KAFKA-8134
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 2.1.0
>Reporter: Sam Lendle
>Priority: Major
>
> Prior to 2.1, the type of the "linger.ms" config was Long, but it was changed to 
> Integer in 2.1.0 ([https://github.com/apache/kafka/pull/5270]). A config that uses 
> a Long value for that parameter, which works with kafka-clients < 2.1, will 
> cause a ConfigException to be thrown when constructing a KafkaProducer if 
> kafka-clients is upgraded to >= 2.1.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7986) distinguish the logging from different ZooKeeperClient instances

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801307#comment-16801307
 ] 

ASF GitHub Bot commented on KAFKA-7986:
---

junrao commented on pull request #6493: KAFKA-7986: Distinguish logging from 
different ZooKeeperClient instances
URL: https://github.com/apache/kafka/pull/6493
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> distinguish the logging from different ZooKeeperClient instances
> 
>
> Key: KAFKA-7986
> URL: https://issues.apache.org/jira/browse/KAFKA-7986
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Jun Rao
>Assignee: Ivan Yurchenko
>Priority: Major
>  Labels: newbie
>
> It's possible for each broker to have more than 1 ZooKeeperClient instance. 
> For example, SimpleAclAuthorizer creates a separate ZooKeeperClient instance 
> when configured. It would be useful to distinguish the logging from different 
> ZooKeeperClient instances.
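
For illustration only (the actual change is in the PR above): one pattern already used elsewhere in the Kafka code base is a per-instance log prefix via LogContext. A hypothetical sketch, with the class and instance names made up:

{code:java}
import org.apache.kafka.common.utils.LogContext;
import org.slf4j.Logger;

// Hypothetical sketch, not the actual patch: tag every log line of a client
// instance with a name so that e.g. the broker's own client and the
// SimpleAclAuthorizer's client can be told apart in the broker log.
public class NamedZkClient {
  private final Logger log;

  public NamedZkClient(String name) {
    this.log = new LogContext("[ZooKeeperClient " + name + "] ").logger(NamedZkClient.class);
  }

  public void connect() {
    log.info("Initializing a new session"); // logged as "[ZooKeeperClient ACL authorizer] Initializing a new session"
  }
}
{code}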



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Assigned] (KAFKA-7991) Add StreamsUpgradeTest for 2.2 release

2019-03-25 Thread Matthias J. Sax (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-7991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matthias J. Sax reassigned KAFKA-7991:
--

Assignee: John Roesler

> Add StreamsUpgradeTest for 2.2 release
> --
>
> Key: KAFKA-7991
> URL: https://issues.apache.org/jira/browse/KAFKA-7991
> Project: Kafka
>  Issue Type: Task
>  Components: streams, system tests
>Affects Versions: 2.3.0
>Reporter: Matthias J. Sax
>Assignee: John Roesler
>Priority: Blocker
> Fix For: 2.3.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8156) Client id when provided is not suffixed with an index

2019-03-25 Thread Nagaraj Gopal (JIRA)
Nagaraj Gopal created KAFKA-8156:


 Summary: Client id when provided is not suffixed with an index
 Key: KAFKA-8156
 URL: https://issues.apache.org/jira/browse/KAFKA-8156
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Affects Versions: 2.1.1
Reporter: Nagaraj Gopal


We use the Camel Kafka component, and one of its configuration options is 
consumersCount, which is the number of concurrent consumers that read data from 
the topic. Usually we don't care about the client id, but once we start emitting 
metrics it becomes an important piece of the puzzle. The client id would help 
differentiate metrics between different consumers, each with `n` concurrent 
consumers and each deployed in a different JVM.

Currently, when a client id is provided it is not suffixed with an index, whereas 
when it is not provided the library creates its own client id with an index 
appended (format: consumer-0, consumer-1). This is limiting when we have 
multiple consumers as described above.
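
As a possible workaround sketch (not Camel-specific; the group id and bootstrap servers below are placeholders), the client.id can be suffixed manually when the consumers are created:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class SuffixedClientIds {
  // Create consumer number 'index' with a distinct client.id such as "orders-consumer-0",
  // so that per-client metrics from several concurrent consumers stay distinguishable.
  public static KafkaConsumer<String, String> newConsumer(String baseClientId, int index) {
    Properties props = new Properties();
    props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
    props.put(ConsumerConfig.GROUP_ID_CONFIG, "my-group");                // placeholder
    props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
    props.put(ConsumerConfig.CLIENT_ID_CONFIG, baseClientId + "-" + index);
    return new KafkaConsumer<>(props);
  }
}
{code}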



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8151) Broker hangs and lockups after Zookeeper outages

2019-03-25 Thread Jeff Nadler (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801228#comment-16801228
 ] 

Jeff Nadler commented on KAFKA-8151:


We've seen similar problems with 2.1.0 and 2.1.1, and we don't use SSL. The 
issues are always preceded by a ZK disconnect like yours:
WARN Client session timed out, have not heard from server in 5334ms for 
sessionid 0x400216eb36a0275
... but if we have any 'real' ZK unavailability it's very brief - I even have 
GC logging enabled and the GC pauses are very short.

I rolled all of our clusters back to 2.0.1 and we're stable again.

> Broker hangs and lockups after Zookeeper outages
> 
>
> Key: KAFKA-8151
> URL: https://issues.apache.org/jira/browse/KAFKA-8151
> Project: Kafka
>  Issue Type: Bug
>  Components: controller, core, zkclient
>Affects Versions: 2.1.1
>Reporter: Joe Ammann
>Priority: Major
> Attachments: symptom3_lxgurten_kafka_dump1.txt, 
> symptom3_lxgurten_kafka_dump2.txt, symptom3_lxgurten_kafka_dump3.txt
>
>
> We're running several clusters (mostly with 3 brokers) on 2.1.1, where we 
> see at least 3 different symptoms, all resulting in broker/controller lockups.
> We are pretty sure that the triggering cause for all these symptoms is a 
> temporary outage (normally 3-5 minutes) of the Zookeeper cluster. The Linux VMs 
> where the ZK nodes run regularly get stalled for a couple of minutes. The 
> ZK nodes always very quickly reunite and rebuild a quorum after the situation 
> clears, but the Kafka brokers (which run on the same Linux VMs) quite often 
> show problems after this procedure.
> I've seen 3 different kinds of problems (this is why I put "reproduce" in 
> quotes; I can never predict what will happen)
>  # the brokers get their ZK sessions expired (obviously) and sometimes only 2 
> of 3 re-register under /brokers/ids. The 3rd broker doesn't re-register for 
> some reason (that's the problem I originally described)
>  # the brokers all re-register and re-elect a new controller. But that new 
> controller does not fully work. For example, it doesn't process partition 
> reassignment requests and/or does not transfer partition leadership after I 
> kill a broker
>  # the previous controller gets "dead-locked" (it has 3-4 of the important 
> controller threads in a lock) and hence does not perform any of its 
> controller duties. But it still regards itself as the valid controller and 
> is accepted by the other brokers
> I'll try to describe each one of the problems in more detail below, and hope 
> to be able to clearly separate them.
> I'm able to provoke these problems in our DEV environment quite regularly 
> using the following procedure
> * make sure all ZK nodes and Kafka brokers are stable and reacting normally
> * freeze 2 out of 3 ZK nodes with {{kill -STOP}} for some minutes
> * leave the Kafka brokers running; of course they will start complaining about 
> being unable to reach ZK
> * thaw the processes with {{kill -CONT}}
> * now all Kafka brokers get notified that their ZK session has expired, and 
> they start to reorganize the cluster
> In about 20% of the tests, I'm able to produce one of the symptoms above. I 
> cannot predict which one though. I sometimes vary this procedure by 
> also freezing one Kafka broker (most often the controller), but until now I 
> haven't been able to create a clear pattern or really force one specific 
> symptom
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8151) Broker hangs and lockups after Zookeeper outages

2019-03-25 Thread Joe Ammann (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801214#comment-16801214
 ] 

Joe Ammann commented on KAFKA-8151:
---

I still can't reproduce any symptoms in DEV when using PLAINTEXT for 
interbroker comms.

But last night we had 2 occurrences of symptom 2 (all brokers and the controller 
registered in ZK, but controller actions - e.g. partition leader reassignment - 
do not happen) in TEST, where I had also enabled PLAINTEXT.

So it definitely also happens with PLAINTEXT.

> Broker hangs and lockups after Zookeeper outages
> 
>
> Key: KAFKA-8151
> URL: https://issues.apache.org/jira/browse/KAFKA-8151
> Project: Kafka
>  Issue Type: Bug
>  Components: controller, core, zkclient
>Affects Versions: 2.1.1
>Reporter: Joe Ammann
>Priority: Major
> Attachments: symptom3_lxgurten_kafka_dump1.txt, 
> symptom3_lxgurten_kafka_dump2.txt, symptom3_lxgurten_kafka_dump3.txt
>
>
> We're running several clusters (mostly with 3 brokers) on 2.1.1, where we 
> see at least 3 different symptoms, all resulting in broker/controller lockups.
> We are pretty sure that the triggering cause for all these symptoms is a 
> temporary outage (normally 3-5 minutes) of the Zookeeper cluster. The Linux VMs 
> where the ZK nodes run regularly get stalled for a couple of minutes. The 
> ZK nodes always very quickly reunite and rebuild a quorum after the situation 
> clears, but the Kafka brokers (which run on the same Linux VMs) quite often 
> show problems after this procedure.
> I've seen 3 different kinds of problems (this is why I put "reproduce" in 
> quotes; I can never predict what will happen)
>  # the brokers get their ZK sessions expired (obviously) and sometimes only 2 
> of 3 re-register under /brokers/ids. The 3rd broker doesn't re-register for 
> some reason (that's the problem I originally described)
>  # the brokers all re-register and re-elect a new controller. But that new 
> controller does not fully work. For example, it doesn't process partition 
> reassignment requests and/or does not transfer partition leadership after I 
> kill a broker
>  # the previous controller gets "dead-locked" (it has 3-4 of the important 
> controller threads in a lock) and hence does not perform any of its 
> controller duties. But it still regards itself as the valid controller and 
> is accepted by the other brokers
> I'll try to describe each one of the problems in more detail below, and hope 
> to be able to clearly separate them.
> I'm able to provoke these problems in our DEV environment quite regularly 
> using the following procedure
> * make sure all ZK nodes and Kafka brokers are stable and reacting normally
> * freeze 2 out of 3 ZK nodes with {{kill -STOP}} for some minutes
> * leave the Kafka brokers running; of course they will start complaining about 
> being unable to reach ZK
> * thaw the processes with {{kill -CONT}}
> * now all Kafka brokers get notified that their ZK session has expired, and 
> they start to reorganize the cluster
> In about 20% of the tests, I'm able to produce one of the symptoms above. I 
> cannot predict which one though. I sometimes vary this procedure by 
> also freezing one Kafka broker (most often the controller), but until now I 
> haven't been able to create a clear pattern or really force one specific 
> symptom
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8026) Flaky Test RegexSourceIntegrationTest#testRegexMatchesTopicsAWhenDeleted

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801077#comment-16801077
 ] 

ASF GitHub Bot commented on KAFKA-8026:
---

bbejeck commented on pull request #6463: KAFKA-8026: Fix flaky regex source 
integration test 1.0
URL: https://github.com/apache/kafka/pull/6463
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flaky Test RegexSourceIntegrationTest#testRegexMatchesTopicsAWhenDeleted
> 
>
> Key: KAFKA-8026
> URL: https://issues.apache.org/jira/browse/KAFKA-8026
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 1.0.2, 1.1.1
>Reporter: Matthias J. Sax
>Assignee: Bill Bejeck
>Priority: Critical
>  Labels: flaky-test
> Fix For: 1.0.3, 1.1.2
>
>
> {quote}java.lang.AssertionError: Condition not met within timeout 15000. 
> Stream tasks not updated
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:276)
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:254)
> at 
> org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted(RegexSourceIntegrationTest.java:215){quote}
> Happened in 1.0 and 1.1 builds:
> [https://builds.apache.org/blue/organizations/jenkins/kafka-1.0-jdk7/detail/kafka-1.0-jdk7/263/tests/]
> and
> [https://builds.apache.org/blue/organizations/jenkins/kafka-1.1-jdk7/detail/kafka-1.1-jdk7/249/tests/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8155) Update Streams system tests for 2.2.0 and 2.1.1 releases

2019-03-25 Thread John Roesler (JIRA)
John Roesler created KAFKA-8155:
---

 Summary: Update Streams system tests for 2.2.0 and 2.1.1 releases
 Key: KAFKA-8155
 URL: https://issues.apache.org/jira/browse/KAFKA-8155
 Project: Kafka
  Issue Type: Task
  Components: streams, system tests
Reporter: John Roesler
Assignee: John Roesler






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8147) Add changelog topic configuration to KTable suppress

2019-03-25 Thread John Roesler (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16801023#comment-16801023
 ] 

John Roesler commented on KAFKA-8147:
-

No worries! I think it's pretty normal to see (and ignore) a lot of updates 
from the wiki as everyone is editing it.

> Add changelog topic configuration to KTable suppress
> 
>
> Key: KAFKA-8147
> URL: https://issues.apache.org/jira/browse/KAFKA-8147
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.1.1
>Reporter: Maarten
>Assignee: Maarten
>Priority: Minor
>  Labels: needs-kip
>
> The streams DSL does not provide a way to configure the changelog topic 
> created by KTable.suppress.
> From the perspective of an external user this could be implemented similarly to 
> the configuration of aggregate + materialized, i.e., 
> {code:java}
> changelogTopicConfigs = // Configs
> materialized = Materialized.as(..).withLoggingEnabled(changelogTopicConfigs)
> ..
> KGroupedStream.aggregate(..,materialized)
> {code}
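
For reference, a runnable sketch of the existing aggregate-side pattern the reporter refers to; the store name and config value are made up, and the suppress() equivalent is exactly what this ticket asks for:

{code:java}
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.common.utils.Bytes;
import org.apache.kafka.streams.kstream.KGroupedStream;
import org.apache.kafka.streams.kstream.Materialized;
import org.apache.kafka.streams.state.KeyValueStore;

public class ChangelogConfigExample {
  // Count a grouped stream into a store whose changelog topic gets extra configs.
  public static void countWithChangelogConfig(KGroupedStream<String, String> groupedStream) {
    Map<String, String> changelogTopicConfigs = new HashMap<>();
    changelogTopicConfigs.put("retention.ms", "86400000"); // example value

    Materialized<String, Long, KeyValueStore<Bytes, byte[]>> materialized =
        Materialized.<String, Long, KeyValueStore<Bytes, byte[]>>as("example-store") // hypothetical store name
            .withKeySerde(Serdes.String())
            .withValueSerde(Serdes.Long())
            .withLoggingEnabled(changelogTopicConfigs);

    groupedStream.count(materialized);
  }
}
{code}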



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (KAFKA-8154) Buffer Overflow exceptions between brokers and with clients

2019-03-25 Thread Rajesh Nataraja (JIRA)
Rajesh Nataraja created KAFKA-8154:
--

 Summary: Buffer Overflow exceptions between brokers and with 
clients
 Key: KAFKA-8154
 URL: https://issues.apache.org/jira/browse/KAFKA-8154
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 2.1.0
Reporter: Rajesh Nataraja
 Attachments: server.properties.txt

https://github.com/apache/kafka/pull/6495

https://github.com/apache/kafka/pull/5785



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-7895) Ktable supress operator emitting more than one record for the same key per window

2019-03-25 Thread John Roesler (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-7895?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800950#comment-16800950
 ] 

John Roesler commented on KAFKA-7895:
-

Hi [~AndrewRK],

Sorry to hear that!

Looking at your topology, I'd expect it to work as advertised (obviously), and 
it also looks very similar to what I've tested heavily, so I might need to get 
some more help from you to investigate the cause.

I have a couple of follow-up questions...
 # Is this restarting after a clean shutdown or a crash? (i.e., how do you stop 
and restart the app?)
 # Do you have EOS enabled? (If you don't, can you try to repro with EOS on?)
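
In case it helps with question 2, a minimal config sketch for turning EOS on; the application id and bootstrap servers are placeholders:

{code:java}
import java.util.Properties;
import org.apache.kafka.streams.StreamsConfig;

public class EosConfigSketch {
  public static Properties eosProps() {
    Properties props = new Properties();
    props.put(StreamsConfig.APPLICATION_ID_CONFIG, "suppress-repro-app");   // placeholder
    props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");    // placeholder
    // Enable exactly-once processing for the Streams application.
    props.put(StreamsConfig.PROCESSING_GUARANTEE_CONFIG, StreamsConfig.EXACTLY_ONCE);
    return props;
  }
}
{code}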

Thanks,

-John

> Ktable supress operator emitting more than one record for the same key per 
> window
> -
>
> Key: KAFKA-7895
> URL: https://issues.apache.org/jira/browse/KAFKA-7895
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.1.0, 2.1.1
>Reporter: prasanthi
>Assignee: John Roesler
>Priority: Major
> Fix For: 2.2.0, 2.1.2
>
>
> Hi, We are using kstreams to get the aggregated counts per vendor(key) within 
> a specified window.
> Here's how we configured the suppress operator to emit one final record per 
> key/window.
> {code:java}
> KTable<Windowed<Integer>, Long> windowedCount = groupedStream
>  .windowedBy(TimeWindows.of(Duration.ofMinutes(1)).grace(ofMillis(5L)))
>  .count(Materialized.with(Serdes.Integer(),Serdes.Long()))
>  .suppress(Suppressed.untilWindowCloses(unbounded()));
> {code}
> But we are getting more than one record for the same key/window as shown 
> below.
> {code:java}
> [KTABLE-TOSTREAM-10]: [131@154906704/154906710], 1039
> [KTABLE-TOSTREAM-10]: [131@154906704/154906710], 1162
> [KTABLE-TOSTREAM-10]: [9@154906704/154906710], 6584
> [KTABLE-TOSTREAM-10]: [88@154906704/154906710], 107
> [KTABLE-TOSTREAM-10]: [108@154906704/154906710], 315
> [KTABLE-TOSTREAM-10]: [119@154906704/154906710], 119
> [KTABLE-TOSTREAM-10]: [154@154906704/154906710], 746
> [KTABLE-TOSTREAM-10]: [154@154906704/154906710], 809{code}
> Could you please take a look?
> Thanks
>  
>  
> Added by John:
> Acceptance Criteria:
>  * add suppress to system tests, such that it's exercised with crash/shutdown 
> recovery, rebalance, etc.
>  ** [https://github.com/apache/kafka/pull/6278]
>  * make sure that there's some system test coverage with caching disabled.
>  ** Follow-on ticket: https://issues.apache.org/jira/browse/KAFKA-7943
>  * test with tighter time bounds with windows of say 30 seconds and use 
> system time without adding any extra time for verification
>  ** Follow-on ticket: https://issues.apache.org/jira/browse/KAFKA-7944



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-6988) Kafka windows classpath too long

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6988?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800922#comment-16800922
 ] 

ASF GitHub Bot commented on KAFKA-6988:
---

ward-eric commented on pull request #6499: KAFKA-6988: Reduce classpath via 
classpath jar
URL: https://github.com/apache/kafka/pull/6499
 
 
   We hit the issue outlined in #5960; however, we came up with a slightly 
different solution that attempts to not pollute the classpath with unnecessary 
items.
   
   Instead we build a classpath jar via Gradle and append it to the classpath.  
The Gradle function handles the regex originally done by `should_include_file`.
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Kafka windows classpath too long
> 
>
> Key: KAFKA-6988
> URL: https://issues.apache.org/jira/browse/KAFKA-6988
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.1.0, 2.0.0
>Reporter: lkgen
>Priority: Major
>
> On Kafka for Windows, the kafka-run-class.bat script builds a CLASSPATH with 
> the full path to each jar.
> If the installation is in a directory with a long path, the CLASSPATH becomes 
> too long and running zookeeper-server-start.bat and other commands fails with 
> the error {{The input line is too long.}}
> A possible solution may be to not expand all jars but instead add dir\* to the 
> classpath.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8150) Fix bugs in handling null arrays in generated RPC code

2019-03-25 Thread Colin P. McCabe (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe resolved KAFKA-8150.

   Resolution: Fixed
Fix Version/s: 2.2.1

The code path that this fixes isn't used in 2.2, I think.  But I backported the 
patch to that branch just for the purpose of future-proofing.

> Fix bugs in handling null arrays in generated RPC code
> --
>
> Key: KAFKA-8150
> URL: https://issues.apache.org/jira/browse/KAFKA-8150
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>Priority: Major
> Fix For: 2.2.1
>
>
> Fix bugs in handling null arrays in generated RPC code.
> toString should not get a NullPointerException.
> Also, read() must properly translate a negative array length to a null field.
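
For illustration, a hedged sketch of the convention involved (not the actual generated code): the Kafka wire protocol encodes a null array as a negative length, which read() has to map back to null rather than an empty array:

{code:java}
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;

public class NullableArrayReadSketch {
  // Illustrative only: read a nullable array of int32 values.
  public static List<Integer> readNullableInt32Array(ByteBuffer buf) {
    int length = buf.getInt();
    if (length < 0) {
      return null; // a negative length on the wire means "null array", not an empty one
    }
    List<Integer> result = new ArrayList<>(length);
    for (int i = 0; i < length; i++) {
      result.add(buf.getInt());
    }
    return result;
  }
}
{code}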



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8102) Trogdor - Add Produce workload transaction generator by interval

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800881#comment-16800881
 ] 

ASF GitHub Bot commented on KAFKA-8102:
---

cmccabe commented on pull request #6444: KAFKA-8102: Add an interval-based 
Trogdor transaction generator
URL: https://github.com/apache/kafka/pull/6444
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Trogdor - Add Produce workload transaction generator by interval
> 
>
> Key: KAFKA-8102
> URL: https://issues.apache.org/jira/browse/KAFKA-8102
> Project: Kafka
>  Issue Type: Improvement
>Reporter: Stanislav Kozlovski
>Assignee: Stanislav Kozlovski
>Priority: Major
>
> Trogdor's specification for produce worker workloads 
> ([ProduceBenchSpec|https://github.com/apache/kafka/blob/trunk/tools/src/main/java/org/apache/kafka/trogdor/workload/ProduceBenchSpec.java])
>  supports configuring a transactional producer using a class that implements 
> the `TransactionGenerator` interface.
>  
> One existing implementation is 
> [UniformTransactionsGenerator|https://github.com/apache/kafka/blob/trunk/tools/src/main/java/org/apache/kafka/trogdor/workload/UniformTransactionsGenerator.java],
>  which triggers a transaction every N records.
> It would be useful to have a generator which supports triggering a 
> transaction at an interval - e.g. every 100 milliseconds. This is how Kafka 
> Streams configures its own [EOS semantics by 
> default|https://github.com/apache/kafka/blob/8e975400711b0ea64bf4a00c8c551e448ab48416/streams/src/main/java/org/apache/kafka/streams/StreamsConfig.java#L140].



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8150) Fix bugs in handling null arrays in generated RPC code

2019-03-25 Thread Colin P. McCabe (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin P. McCabe updated KAFKA-8150:
---
Affects Version/s: 2.2.1

> Fix bugs in handling null arrays in generated RPC code
> --
>
> Key: KAFKA-8150
> URL: https://issues.apache.org/jira/browse/KAFKA-8150
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 2.2.1
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>Priority: Major
> Fix For: 2.2.1
>
>
> Fix bugs in handling null arrays in generated RPC code.
> toString should not get a NullPointerException.
> Also, read() must properly translate a negative array length to a null field.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8150) Fix bugs in handling null arrays in generated RPC code

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800872#comment-16800872
 ] 

ASF GitHub Bot commented on KAFKA-8150:
---

cmccabe commented on pull request #6489: KAFKA-8150: Fix bugs in handling null 
arrays in generated RPC code
URL: https://github.com/apache/kafka/pull/6489
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Fix bugs in handling null arrays in generated RPC code
> --
>
> Key: KAFKA-8150
> URL: https://issues.apache.org/jira/browse/KAFKA-8150
> Project: Kafka
>  Issue Type: Bug
>Reporter: Colin P. McCabe
>Assignee: Colin P. McCabe
>Priority: Major
>
> Fix bugs in handling null arrays in generated RPC code.
> toString should not get a NullPointerException.
> Also, read() must properly translate a negative array length to a null field.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8147) Add changelog topic configuration to KTable suppress

2019-03-25 Thread Maarten (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800867#comment-16800867
 ] 

Maarten commented on KAFKA-8147:


I would like to apologize for the unnecessary KIP updates; I'm working on it 
but am struggling with Confluence at the moment.

> Add changelog topic configuration to KTable suppress
> 
>
> Key: KAFKA-8147
> URL: https://issues.apache.org/jira/browse/KAFKA-8147
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Affects Versions: 2.1.1
>Reporter: Maarten
>Assignee: Maarten
>Priority: Minor
>  Labels: needs-kip
>
> The streams DSL does not provide a way to configure the changelog topic 
> created by KTable.suppress.
> From the perspective of an external user this could be implemented similarly to 
> the configuration of aggregate + materialized, i.e., 
> {code:java}
> changelogTopicConfigs = // Configs
> materialized = Materialized.as(..).withLoggingEnabled(changelogTopicConfigs)
> ..
> KGroupedStream.aggregate(..,materialized)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-8026) Flaky Test RegexSourceIntegrationTest#testRegexMatchesTopicsAWhenDeleted

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-8026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800858#comment-16800858
 ] 

ASF GitHub Bot commented on KAFKA-8026:
---

bbejeck commented on pull request #6459: KAFKA-8026: Fix for flaky 
RegexSourceIntegrationTest
URL: https://github.com/apache/kafka/pull/6459
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Flaky Test RegexSourceIntegrationTest#testRegexMatchesTopicsAWhenDeleted
> 
>
> Key: KAFKA-8026
> URL: https://issues.apache.org/jira/browse/KAFKA-8026
> Project: Kafka
>  Issue Type: Bug
>  Components: streams, unit tests
>Affects Versions: 1.0.2, 1.1.1
>Reporter: Matthias J. Sax
>Assignee: Bill Bejeck
>Priority: Critical
>  Labels: flaky-test
> Fix For: 1.0.3, 1.1.2
>
>
> {quote}java.lang.AssertionError: Condition not met within timeout 15000. 
> Stream tasks not updated
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:276)
> at org.apache.kafka.test.TestUtils.waitForCondition(TestUtils.java:254)
> at 
> org.apache.kafka.streams.integration.RegexSourceIntegrationTest.testRegexMatchesTopicsAWhenDeleted(RegexSourceIntegrationTest.java:215){quote}
> Happened in 1.0 and 1.1 builds:
> [https://builds.apache.org/blue/organizations/jenkins/kafka-1.0-jdk7/detail/kafka-1.0-jdk7/263/tests/]
> and
> [https://builds.apache.org/blue/organizations/jenkins/kafka-1.1-jdk7/detail/kafka-1.1-jdk7/249/tests/]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Resolved] (KAFKA-8014) Extend Connect integration tests to add and remove workers dynamically

2019-03-25 Thread Randall Hauch (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch resolved KAFKA-8014.
--
Resolution: Fixed

> Extend Connect integration tests to add and remove workers dynamically
> --
>
> Key: KAFKA-8014
> URL: https://issues.apache.org/jira/browse/KAFKA-8014
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.3.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 2.3.0, 2.1.2, 2.2.1
>
>
>  To allow for even more integration tests that can focus on testing Connect 
> framework itself, it seems necessary to add the ability to add and remove 
> workers from within a test case. 
> The suggestion is to extend Connect's integration test harness 
> {{EmbeddedConnectCluster}} to include methods to add and remove workers as 
> well as return the workers that are online at any given point.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8014) Extend Connect integration tests to add and remove workers dynamically

2019-03-25 Thread Randall Hauch (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Randall Hauch updated KAFKA-8014:
-
Fix Version/s: 2.2.1
   2.1.2
   2.3.0
  Description: 
 To allow for even more integration tests that can focus on testing Connect 
framework itself, it seems necessary to add the ability to add and remove 
workers from within a test case. 

The suggestion is to extend Connect's integration test harness 
{{EmbeddedConnectCluster}} to include methods to add and remove workers as well 
as return the workers that are online at any given point.

  was:
 

To allow for even more integration tests that can focus on testing Connect 
framework itself, it seems necessary to add the ability to add and remove 
workers from within a test case. 

The suggestion is to extend Connect's integration test harness 
{{EmbeddedConnectCluster}} to include methods to add and remove workers as well 
as return the workers that are online at any given point.


> Extend Connect integration tests to add and remove workers dynamically
> --
>
> Key: KAFKA-8014
> URL: https://issues.apache.org/jira/browse/KAFKA-8014
> Project: Kafka
>  Issue Type: Improvement
>Affects Versions: 2.3.0
>Reporter: Konstantine Karantasis
>Assignee: Konstantine Karantasis
>Priority: Major
> Fix For: 2.3.0, 2.1.2, 2.2.1
>
>
>  To allow for even more integration tests that can focus on testing Connect 
> framework itself, it seems necessary to add the ability to add and remove 
> workers from within a test case. 
> The suggestion is to extend Connect's integration test harness 
> {{EmbeddedConnectCluster}} to include methods to add and remove workers as 
> well as return the workers that are online at any given point.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3539) KafkaProducer.send() may block even though it returns the Future

2019-03-25 Thread radai rosenblatt (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800714#comment-16800714
 ] 

radai rosenblatt commented on KAFKA-3539:
-

IIUC, the root of the problem is that the Kafka producer stores compressed 
batches of messages in a map keyed by the partition those messages are intended 
for. Since without metadata there is no knowing the layout of a topic, the 
producer can't tell where to "place" a message, which is why it blocks when no 
metadata is available.
One possible solution would be to have an "unknown" message bucket (with some 
finite capacity) where messages of unknown destination go. The biggest issue 
with this is that those messages cannot be compressed (as Kafka compresses 
batches, not individual messages, and there's no guarantee that everything in 
the unknown bucket will go into the same batch).
Once metadata is obtained, the "unknown bucket" would need to be iterated over 
and the messages deposited (and compressed) into the correct queues. This would 
need to happen when metadata arrives and before any new messages are allowed 
into the producer (to avoid violating ordering).
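
As a side note, the time send() may spend blocked on metadata (or on a full buffer) can at least be bounded with the existing max.block.ms setting; a minimal sketch, with the broker address and topic as placeholders:

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class BoundedBlockingSend {
  public static void main(String[] args) {
    Properties props = new Properties();
    props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092"); // placeholder
    props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
    // Cap how long send() may block waiting for metadata or buffer space;
    // after this timeout send() fails with a TimeoutException instead of
    // blocking indefinitely.
    props.put(ProducerConfig.MAX_BLOCK_MS_CONFIG, 1000);

    try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
      producer.send(new ProducerRecord<>("example-topic", "key", "value")); // placeholder topic
    }
  }
}
{code}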

> KafkaProducer.send() may block even though it returns the Future
> 
>
> Key: KAFKA-3539
> URL: https://issues.apache.org/jira/browse/KAFKA-3539
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Reporter: Oleg Zhurakousky
>Priority: Critical
>
> You can get more details from the us...@kafka.apache.org list by searching for 
> the thread with the subject "KafkaProducer block on send".
> The bottom line is that a method that returns a Future must never block, since 
> that essentially violates the Future contract: it was specifically designed to 
> return immediately, passing control back to the user to check for completion, 
> cancel, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (KAFKA-3539) KafkaProducer.send() may block even though it returns the Future

2019-03-25 Thread Spyridon Ninos (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-3539?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800656#comment-16800656
 ] 

Spyridon Ninos commented on KAFKA-3539:
---

Hi guys,

 any solutions proposed? I've hit a similar issue with [~tu...@avast.com] too, 
but by studying the code I am not confident that any solution will be that much 
better than the current one, either semantically or technically.

 

Having said that, some weeks ago I took a look at how to solve the blocking 
nature of the producer - I'd like to know what others have thought as probable 
solutions. Any suggestions?

 

Thanks

> KafkaProducer.send() may block even though it returns the Future
> 
>
> Key: KAFKA-3539
> URL: https://issues.apache.org/jira/browse/KAFKA-3539
> Project: Kafka
>  Issue Type: Bug
>  Components: producer 
>Reporter: Oleg Zhurakousky
>Priority: Critical
>
> You can get more details from the us...@kafka.apache.org list by searching for 
> the thread with the subject "KafkaProducer block on send".
> The bottom line is that a method that returns a Future must never block, since 
> that essentially violates the Future contract: it was specifically designed to 
> return immediately, passing control back to the user to check for completion, 
> cancel, etc.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Updated] (KAFKA-8153) Streaming application with state stores takes up to 1 hour to restart

2019-03-25 Thread Michael Melsen (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-8153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Melsen updated KAFKA-8153:
--
Affects Version/s: (was: 2.1.1)
   2.1.0

> Streaming application with state stores takes up to 1 hour to restart
> -
>
> Key: KAFKA-8153
> URL: https://issues.apache.org/jira/browse/KAFKA-8153
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Affects Versions: 2.1.0
>Reporter: Michael Melsen
>Priority: Major
>
> We are using spring cloud stream with Kafka streams 2.0.1 and utilizing the 
> InteractiveQueryService to fetch data from the stores. There are 4 stores 
> that persist data on disk after aggregating data. The code for the topology 
> looks like this:
> {code:java}
> @Slf4j
> @EnableBinding(SensorMeasurementBinding.class)
> public class Consumer {
>   public static final String RETENTION_MS = "retention.ms";
>   public static final String CLEANUP_POLICY = "cleanup.policy";
>   @Value("${windowstore.retention.ms}")
>   private String retention;
> /**
>  * Process the data flowing in from a Kafka topic. Aggregate the data to:
>  * - 2 minute
>  * - 15 minutes
>  * - one hour
>  * - 12 hours
>  *
>  * @param stream
>  */
> @StreamListener(SensorMeasurementBinding.ERROR_SCORE_IN)
> public void process(KStream<String, SensorMeasurement> stream) {
> Map<String, String> topicConfig = new HashMap<>();
> topicConfig.put(RETENTION_MS, retention);
> topicConfig.put(CLEANUP_POLICY, "delete");
> log.info("Changelog and local window store retention.ms: {} and 
> cleanup.policy: {}",
> topicConfig.get(RETENTION_MS),
> topicConfig.get(CLEANUP_POLICY));
> createWindowStore(LocalStore.TWO_MINUTES_STORE, topicConfig, stream);
> createWindowStore(LocalStore.FIFTEEN_MINUTES_STORE, topicConfig, stream);
> createWindowStore(LocalStore.ONE_HOUR_STORE, topicConfig, stream);
> createWindowStore(LocalStore.TWELVE_HOURS_STORE, topicConfig, stream);
>   }
>   private void createWindowStore(
> LocalStore localStore,
> Map<String, String> topicConfig,
> KStream<String, SensorMeasurement> stream) {
> // Configure how the statestore should be materialized using the provided 
> storeName
> Materialized<String, ErrorScore, WindowStore<Bytes, byte[]>> materialized 
> = Materialized
> .as(localStore.getStoreName());
> // Set retention of changelog topic
> materialized.withLoggingEnabled(topicConfig);
> // Configure how windows looks like and how long data will be retained in 
> local stores
> TimeWindows configuredTimeWindows = getConfiguredTimeWindows(
> localStore.getTimeUnit(), 
> Long.parseLong(topicConfig.get(RETENTION_MS)));
> // Processing description:
> // The input data are 'samples' with key 
> :::
> // 1. With the map we add the Tag to the key and we extract the error 
> score from the data
> // 2. With the groupByKey we group  the data on the new key
> // 3. With windowedBy we split up the data in time intervals depending on 
> the provided LocalStore enum
> // 4. With reduce we determine the maximum value in the time window
> // 5. Materialized will make it stored in a table
> stream
> .map(getInstallationAssetModelAlgorithmTagKeyMapper())
> .groupByKey()
> .windowedBy(configuredTimeWindows)
> .reduce((aggValue, newValue) -> getMaxErrorScore(aggValue, 
> newValue), materialized);
>   }
>   private TimeWindows getConfiguredTimeWindows(long windowSizeMs, long 
> retentionMs) {
> TimeWindows timeWindows = TimeWindows.of(windowSizeMs);
> timeWindows.until(retentionMs);
> return timeWindows;
>   }
>   /**
>* Determine the max error score to keep by looking at the aggregated error 
> signal and
>* freshly consumed error signal
>*
>* @param aggValue
>* @param newValue
>* @return
>*/
>   private ErrorScore getMaxErrorScore(ErrorScore aggValue, ErrorScore 
> newValue) {
> if(aggValue.getErrorSignal() > newValue.getErrorSignal()) {
> return aggValue;
> }
> return newValue;
>   }
>   private KeyValueMapper<String, SensorMeasurement, KeyValue<String, ErrorScore>> 
> getInstallationAssetModelAlgorithmTagKeyMapper() {
> return (s, sensorMeasurement) -> new KeyValue<>(s + "::" + 
> sensorMeasurement.getT(),
> new ErrorScore(sensorMeasurement.getTs(), 
> sensorMeasurement.getE(), sensorMeasurement.getO()));
>   }
> }
> {code}
> So we are materializing aggregated data to four different stores after 
> determining the max value within a specific window for a specific key. Please 
> note the retention, which is set to two months of data, and the cleanup 
> policy, which is delete. We don't compact data.
> The size of the individual state stores on disk is between 14 to 20 gb of 
> data.
> We are making use of Interactive Queries: 
> 

[jira] [Created] (KAFKA-8153) Streaming application with state stores takes up to 1 hour to restart

2019-03-25 Thread Michael Melsen (JIRA)
Michael Melsen created KAFKA-8153:
-

 Summary: Streaming application with state stores takes up to 1 
hour to restart
 Key: KAFKA-8153
 URL: https://issues.apache.org/jira/browse/KAFKA-8153
 Project: Kafka
  Issue Type: Bug
  Components: streams
Affects Versions: 2.1.1
Reporter: Michael Melsen


We are using spring cloud stream with Kafka streams 2.0.1 and utilizing the 
InteractiveQueryService to fetch data from the stores. There are 4 stores that 
persist data on disk after aggregating data. The code for the topology looks 
like this:
{code:java}
@Slf4j
@EnableBinding(SensorMeasurementBinding.class)
public class Consumer {

  public static final String RETENTION_MS = "retention.ms";
  public static final String CLEANUP_POLICY = "cleanup.policy";

  @Value("${windowstore.retention.ms}")
  private String retention;

/**
 * Process the data flowing in from a Kafka topic. Aggregate the data to:
 * - 2 minute
 * - 15 minutes
 * - one hour
 * - 12 hours
 *
 * @param stream
 */
@StreamListener(SensorMeasurementBinding.ERROR_SCORE_IN)
public void process(KStream<String, SensorMeasurement> stream) {

Map<String, String> topicConfig = new HashMap<>();
topicConfig.put(RETENTION_MS, retention);
topicConfig.put(CLEANUP_POLICY, "delete");

log.info("Changelog and local window store retention.ms: {} and 
cleanup.policy: {}",
topicConfig.get(RETENTION_MS),
topicConfig.get(CLEANUP_POLICY));

createWindowStore(LocalStore.TWO_MINUTES_STORE, topicConfig, stream);
createWindowStore(LocalStore.FIFTEEN_MINUTES_STORE, topicConfig, stream);
createWindowStore(LocalStore.ONE_HOUR_STORE, topicConfig, stream);
createWindowStore(LocalStore.TWELVE_HOURS_STORE, topicConfig, stream);
  }

  private void createWindowStore(
LocalStore localStore,
Map<String, String> topicConfig,
KStream<String, SensorMeasurement> stream) {

// Configure how the statestore should be materialized using the provided 
storeName
Materialized<String, ErrorScore, WindowStore<Bytes, byte[]>> materialized = 
Materialized
.as(localStore.getStoreName());

// Set retention of changelog topic
materialized.withLoggingEnabled(topicConfig);

// Configure how windows looks like and how long data will be retained in 
local stores
TimeWindows configuredTimeWindows = getConfiguredTimeWindows(
localStore.getTimeUnit(), 
Long.parseLong(topicConfig.get(RETENTION_MS)));

// Processing description:
// The input data are 'samples' with key 
:::
// 1. With the map we add the Tag to the key and we extract the error score 
from the data
// 2. With the groupByKey we group  the data on the new key
// 3. With windowedBy we split up the data in time intervals depending on 
the provided LocalStore enum
// 4. With reduce we determine the maximum value in the time window
// 5. Materialized will make it stored in a table
stream
.map(getInstallationAssetModelAlgorithmTagKeyMapper())
.groupByKey()
.windowedBy(configuredTimeWindows)
.reduce((aggValue, newValue) -> getMaxErrorScore(aggValue, 
newValue), materialized);
  }

  private TimeWindows getConfiguredTimeWindows(long windowSizeMs, long 
retentionMs) {
TimeWindows timeWindows = TimeWindows.of(windowSizeMs);
timeWindows.until(retentionMs);
return timeWindows;
  }

  /**
   * Determine the max error score to keep by looking at the aggregated error 
signal and
   * freshly consumed error signal
   *
   * @param aggValue
   * @param newValue
   * @return
   */
  private ErrorScore getMaxErrorScore(ErrorScore aggValue, ErrorScore newValue) 
{
if(aggValue.getErrorSignal() > newValue.getErrorSignal()) {
return aggValue;
}
return newValue;
  }

  private KeyValueMapper<String, SensorMeasurement, KeyValue<String, ErrorScore>> 
getInstallationAssetModelAlgorithmTagKeyMapper() {
return (s, sensorMeasurement) -> new KeyValue<>(s + "::" + 
sensorMeasurement.getT(),
new ErrorScore(sensorMeasurement.getTs(), sensorMeasurement.getE(), 
sensorMeasurement.getO()));
  }
}
{code}
So we are materializing aggregated data to four different stores after 
determining the max value within a specific window for a specific key. Please 
note the retention, which is set to two months of data, and the cleanup policy, 
which is delete. We don't compact data.

The size of the individual state stores on disk is between 14 to 20 gb of data.

We are making use of Interactive Queries: 
[https://docs.confluent.io/current/streams/developer-guide/interactive-queries.html#interactive-queries]

On our setup we have 4 instances of our streaming app to be used as one 
consumer group. So every instance will store a specific part of all data in its 
store.

This all seems to work nicely, until we restart one or more instances and wait 
for them to become available again. (The restart itself only takes about 3 
minutes max.) I would expect that the restart of the app would not take that 
long, but unfortunately it 

[jira] [Commented] (KAFKA-6820) Improve on StreamsMetrics Public APIs

2019-03-25 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/KAFKA-6820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16800458#comment-16800458
 ] 

ASF GitHub Bot commented on KAFKA-6820:
---

guozhangwang commented on pull request #6498: [DO NOT MERGE] KAFKA-6820: 
Refactor Stream Metrics
URL: https://github.com/apache/kafka/pull/6498
 
 
   
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Improve on StreamsMetrics Public APIs
> -
>
> Key: KAFKA-6820
> URL: https://issues.apache.org/jira/browse/KAFKA-6820
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Priority: Major
>  Labels: needs-kip
>
> Our current `addLatencyAndThroughputSensor` and `addThroughputSensor` are not 
> very well designed and hence not very user friendly for people who want to add 
> their customized sensors. We could consider improving this feature. Some 
> related things to consider:
> 1. Our internal built-in metrics should be independent of these public APIs, 
> which are for user-customized sensors only. See KAFKA-6819 for a related 
> description.
> 2. We could restrict the possible scopeName values and clearly document the 
> sensor hierarchies that result from the function calls. In this way the library 
> can help close users' sensors automatically when the corresponding scope 
> (store, task, thread, etc.) is deconstructed.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)