[GitHub] kafka pull request #2860: kafka-5068: Optionally print out metrics after run...
GitHub user amethystic opened a pull request:

    https://github.com/apache/kafka/pull/2860

    kafka-5068: Optionally print out metrics after running the perf tests

    @junrao Added a config `--print.metrics` to control whether ProducerPerformance prints out metrics at the end of the test. If it's okay, I will add the code counterpart for the consumer.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/amethystic/kafka kafka-5068_print_metrics_in_perf_tests

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2860.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #2860

commit 05d55590203d715c7c306338f005646e1493b2fb
Author: huxi
Date:   2017-04-17T02:50:50Z

    kafka-5068: Optionally print out metrics after running the perf tests

    Added a config `--print.metrics` to control whether ProducerPerformance prints out metrics at the end of the test.
[jira] [Assigned] (KAFKA-5068) Optionally print out metrics after running the perf tests
[ https://issues.apache.org/jira/browse/KAFKA-5068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

huxi reassigned KAFKA-5068:
---------------------------

    Assignee: huxi

> Optionally print out metrics after running the perf tests
> ---------------------------------------------------------
>
>                 Key: KAFKA-5068
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5068
>             Project: Kafka
>          Issue Type: Improvement
>          Components: tools
>    Affects Versions: 0.10.2.0
>            Reporter: Jun Rao
>            Assignee: huxi
>              Labels: newbie
>
> Often, we run ProducerPerformance/ConsumerPerformance tests to investigate
> performance issues. It's useful for the tool to print out the metrics in the
> producer/consumer at the end of the tests. We can make this optional to
> preserve the current behavior by default.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
[jira] [Updated] (KAFKA-5040) Increase number of Streams producer retries from the default of 0
[ https://issues.apache.org/jira/browse/KAFKA-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ismael Juma updated KAFKA-5040:
-------------------------------

    Summary: Increase number of Streams producer retries from the default of 0  (was: Increase number of producer retries from the default of 0)

> Increase number of Streams producer retries from the default of 0
> -----------------------------------------------------------------
>
>                 Key: KAFKA-5040
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5040
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.0
>            Reporter: Eno Thereska
>            Assignee: Eno Thereska
>            Priority: Blocker
>             Fix For: 0.11.0.0, 0.10.2.1
>
> In Kafka Streams, the default value for the producer retries is not changed
> from the default of 0. That leads to situations where a streams instance
> fails when a broker is temporarily unavailable. Increasing this number to
> > 0 would help.
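The failure mode described in the ticket can be sketched in a few lines. This is plain Python for illustration, not Kafka code; the function and its arguments are hypothetical stand-ins for the producer's retry loop: with retries=0 a single transient send failure is fatal, while any retries > 0 absorbs a short broker outage.

```python
# Illustrative sketch (not Kafka code) of why retries=0 turns a transient
# broker outage into a fatal error, while retries > 0 rides it out.
def send_with_retries(attempt_outcomes, retries):
    """attempt_outcomes: per-attempt results, True = broker reachable.
    Returns True if the send succeeds within 1 + retries attempts."""
    for outcome in attempt_outcomes[:1 + retries]:
        if outcome:
            return True
    return False

# Broker is briefly unavailable: first two attempts fail, third succeeds.
outcomes = [False, False, True]
assert send_with_retries(outcomes, retries=0) is False   # instance would fail
assert send_with_retries(outcomes, retries=10) is True   # outage is absorbed
```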
[jira] [Commented] (KAFKA-5007) Kafka Replica Fetcher Thread- Resource Leak
[ https://issues.apache.org/jira/browse/KAFKA-5007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970549#comment-15970549 ]

Joseph Aliase commented on KAFKA-5007:
--------------------------------------

It seems ticket https://issues.apache.org/jira/browse/KAFKA-4477 is similar to the current ticket. That ticket was closed since nobody reported it in 0.10.1.1, but the issue exists in 0.10.1.1. It's a major issue and I would like to work on it. I need guidance on where to start. [~junrao]

> Kafka Replica Fetcher Thread - Resource Leak
> --------------------------------------------
>
>                 Key: KAFKA-5007
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5007
>             Project: Kafka
>          Issue Type: Bug
>          Components: core, network
>    Affects Versions: 0.10.0.0, 0.10.1.1, 0.10.2.0
>         Environment: Centos 7
>                      Java 8
>            Reporter: Joseph Aliase
>            Priority: Critical
>              Labels: reliability
>
> Kafka is running out of open file descriptors when the system network interface is
> down.
> Issue description:
> We have a Kafka cluster of 5 nodes running on version 0.10.1.1. The open file
> descriptor limit for the account running Kafka is set to 10.
> During an upgrade, the network interface went down. The outage continued for 12 hours
> and eventually all the brokers crashed with a java.io.IOException: Too many open
> files error.
> We repeated the test in a lower environment and observed that the open socket
> count keeps increasing while the NIC is down.
> We have around 13 topics with a max partition count of 120, and the number of replica
> fetcher threads is set to 8.
> Using an internal monitoring tool, we observed that the open socket descriptor
> count for the broker pid continued to increase although the NIC was down, leading to
> the open file descriptor error.
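The leak pattern the report describes can be sketched as follows. This is an illustration of the general reconnect-without-close pattern in plain Python, not the broker's actual fetcher code; the function name and structure are hypothetical. If each failed reconnect opens a new socket without closing the old one, file descriptors accumulate for as long as the NIC stays down.

```python
import socket

# Illustrative sketch (not Kafka code) of the reconnect-without-close
# pattern that exhausts file descriptors while a broker is unreachable.
def reconnect_attempts(n, close_on_failure):
    leaked = []
    for _ in range(n):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # ... a connect to the unreachable broker would fail here ...
        if close_on_failure:
            s.close()         # well-behaved: the FD is released immediately
        else:
            leaked.append(s)  # leak: the FD stays open, counting toward ulimit
    return leaked

leaked = reconnect_attempts(100, close_on_failure=False)
assert len(leaked) == 100     # 100 FDs still open after 100 failed attempts
for s in leaked:
    s.close()
assert reconnect_attempts(100, close_on_failure=True) == []
```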
Re: [DISCUSS] KIP-138: Change punctuate semantics
Hi Arun,

Thanks for putting the use cases on the wiki. I copied over your Terminology section to the main KIP page, as I think it's super important to be clear on the terms. I've made some changes while doing that, which I highlight below, as I'd like to encourage comments on these.

1) I removed the mention of logical time, since the API strictly mandates "milliseconds since midnight, January 1, 1970 UTC" as opposed to any arbitrary logical time (even if that's not enforceable).

2) I broke up the definition of Stream Time into 2 separate terms: Stream Partition Time and Stream Time proper. This is for 2 reasons:

a) It follows the definition of Stream Time as it is stated on the ProcessorContext:
https://github.com/apache/kafka/blob/0.10.2.0/streams/src/main/java/org/apache/kafka/streams/processor/ProcessorContext.java#L159

b) The timestamp extractors are stealing all the thunder ;-) There's been a lot of discussion about timestamp extractors and the merits of event/processing time, however I haven't encountered much in terms of justification for why the stream time is fixed to be the _smallest_ among all its input stream partition timestamps. I found a comment in the PartitionGroup:
https://github.com/apache/kafka/blob/0.10.2.0/streams/src/main/java/org/apache/kafka/streams/processor/internals/PartitionGroup.java#L138

    public long timestamp() {
        // we should always return the smallest timestamp of all partitions
        // to avoid group partition time goes backward

but I can't believe this to be the only reason behind this choice, as minimum is not the only function that guarantees the group partition time never goes backward. Using the largest or the average among the partitions' timestamps would also guarantee the group timestamp never goes back, since the timestamp never goes back for any individual partition. So why was minimum chosen? Do window semantics or anything else depend on it?
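The claim above, that min, max, or average would each keep the group time from going backward as long as every partition's own time is non-decreasing, can be checked with a small sketch. This is plain Python for illustration, not Streams code; the function and data are hypothetical:

```python
# Illustrative sketch: any of min, max, average over per-partition timestamps
# is non-decreasing over time, provided each partition's own timestamp
# sequence is non-decreasing. Minimum is therefore not the only monotonic
# choice for group partition time.
def group_times(per_partition_ts, combine):
    """per_partition_ts: one non-decreasing timestamp list per partition,
    sampled at the same instants; returns the combined time at each instant."""
    return [combine(ts) for ts in zip(*per_partition_ts)]

partitions = [
    [1, 3, 5, 9],   # partition 0 timestamps (non-decreasing)
    [2, 2, 7, 8],   # partition 1 timestamps (non-decreasing)
]

for combine in (min, max, lambda ts: sum(ts) / len(ts)):
    series = group_times(partitions, combine)
    # every combined series is monotonically non-decreasing
    assert all(a <= b for a, b in zip(series, series[1:]))

assert group_times(partitions, min) == [1, 2, 5, 8]
```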
3) I used the term Punctuate's timestamp argument instead of Punctuation Timestamp, since I found the latter sounds too similar to Punctuate Time.

4) Rephrased Output Record Time. This is something I hadn't given any thought to before whatsoever. Is it still true to what you meant?

Comments appreciated; I especially need input on 2b above.

Cheers,
Michal

On 10/04/17 12:58, Arun Mathew wrote:

Thanks Ewen.

@Michal, @all, I have created a child page to start the Use Cases discussion [https://cwiki.apache.org/confluence/display/KAFKA/Punctuate+Use+Cases]. Please go through it and give your comments.

@Tianji, Sorry for the delay. I am trying to make the patch public.

--
Arun Mathew

On 4/8/17, 02:00, "Ewen Cheslack-Postava" wrote:

    Arun,

    I've given you permission to edit the wiki. Let me know if you run into
    any issues.

    -Ewen

On Fri, Apr 7, 2017 at 1:21 AM, Arun Mathew wrote:

> Thanks Michal. I don’t have the access yet [arunmathew88]. Should I be
> sending a separate mail for this?
>
> I thought one of the persons following this thread would be able to give me
> access.
>
> *From: *Michal Borowiecki
> *Reply-To: *"dev@kafka.apache.org"
> *Date: *Friday, April 7, 2017 at 17:16
> *To: *"dev@kafka.apache.org"
> *Subject: *Re: [DISCUSS] KIP-138: Change punctuate semantics
>
> Hi Arun,
>
> I was thinking along the same lines as you, listing the use cases on the
> wiki, but didn't find time to get around to doing that yet.
> Don't mind if you do it if you have access now.
> I was thinking it would be nice if, once we have the use cases listed,
> people could use likes to up-vote the use cases similar to what they're
> working on.
>
> I should have a bit more time to action this in the next few days, but
> happy for you to do it if you can beat me to it ;-)
>
> Cheers,
> Michal
>
> On 07/04/17 04:39, Arun Mathew wrote:
>
> > Sure, thanks Matthias. My id is [arunmathew88].
> >
> > Of course. I was thinking of a subpage where people can collaborate.
> > Will do as per Michael’s suggestion.
> >
> > Regards,
> > Arun Mathew
> >
> > On 4/7/17, 12:30, "Matthias J. Sax" wrote:
> >
> >     Please share your Wiki-ID and a committer can give you write access.
> >
> >     Btw: as you did not initiate the KIP, you should not change the KIP
> >     without the permission of the original author -- in this case Michael.
> >
> >     So you might also just share your thoughts over the mailing list and
> >     Michael can update the KIP page. Or, as an alternative, just