Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-08-07 Thread Lucas Wang
Hi Becket, Thanks for the review. The current write up in the KIP won’t change the ordering behavior. Are you ok with addressing that as a separate independent issue (I’ll create a separate ticket for it)? If so, can you please give me a +1 on the vote thread? Thanks, Lucas On Tue, Aug 7, 2018

Re: [DISCUSS] KIP-333 Consider a faster form of rebalancing

2018-08-07 Thread Becket Qin
Hi Richard, Sorry for the late response. As discussed in the other offline thread, I am still not sure if this use case is common enough to have a built-in rebalance policy. I think usually the time to detect the consumer failure and rebalance would be the longer than the catching up time as the

Build failed in Jenkins: kafka-2.0-jdk8 #108

2018-08-07 Thread Apache Jenkins Server
See Changes: [me] MINOR: System test for error handling and writes to DeadLetterQueue [wangguoz] MINOR: Add Scalafmt to Streams Scala API (#4965) -- [...truncated 431.74

Jenkins build is back to normal : kafka-trunk-jdk10 #385

2018-08-07 Thread Apache Jenkins Server
See

Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-08-07 Thread Becket Qin
Thanks for the updated KIP wiki, Lucas. Looks good to me overall. It might be an implementation detail, but do we still plan to use the correlation id to ensure the request processing order? Thanks, Jiangjie (Becket) Qin On Tue, Jul 31, 2018 at 3:39 AM, Lucas Wang wrote: > Thanks for your

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-08-07 Thread Guozhang Wang
@James What you described is true: the transition from dynamic to static memberships are not thought through yet. But I do not think it is an impossible problem: note that we indeed moved the offset commit from ZK to kafka coordinator in 0.8.2 :) The migration plan is to first to double-commits

[jira] [Resolved] (KAFKA-6327) IllegalArgumentException in RocksDB when RocksDBException being generated

2018-08-07 Thread Guozhang Wang (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang resolved KAFKA-6327. -- Resolution: Fixed Fix Version/s: 2.1.0 > IllegalArgumentException in RocksDB when

Re: [DISCUSS] add connect-related packages to "What is considered a "major change" that needs a KIP?"

2018-08-07 Thread Ewen Cheslack-Postava
First, I agree, updating that list would be a good idea. It's likely it will always be a little divergent from any new additions -- the last update was probably when the KIP page was originally created, before either Connect or Streams existed. However, note that we also document the exact set of

Build failed in Jenkins: kafka-2.0-jdk8 #107

2018-08-07 Thread Apache Jenkins Server
See Changes: [wangguoz] MINOR: Fix Streams scala format violations (#5472) [me] KAFKA-7225: Pretransform validated props [jason] MINOR: Fix minikdc cleanup in system tests (#5471)

Re: [DISCUSS] KIP-258: Allow to Store Record Timestamps in RocksDB

2018-08-07 Thread Matthias J. Sax
Thanks for the feedback Bill. I just update the KIP with some of your points. >> Regarding step 3C of the in-place upgrade (users needing to watch the >> restore process), I'm wondering if we want to provide a type of >> StateRestoreListener that could signal when the new stores have reached >>

[jira] [Created] (KAFKA-7254) in kafka streams no way to specify a classpath resource in ssl.truststore.location

2018-08-07 Thread Rahul Battacharya (JIRA)
Rahul Battacharya created KAFKA-7254: Summary: in kafka streams no way to specify a classpath resource in ssl.truststore.location Key: KAFKA-7254 URL: https://issues.apache.org/jira/browse/KAFKA-7254

Build failed in Jenkins: kafka-trunk-jdk10 #384

2018-08-07 Thread Apache Jenkins Server
See -- Started by an SCM change Started by an SCM change [EnvInject] - Loading node environment variables. Building remotely on H24 (ubuntu xenial) in workspace

Build failed in Jenkins: kafka-trunk-jdk10 #383

2018-08-07 Thread Apache Jenkins Server
See Changes: [wangguoz] MINOR: Fix Streams scala format violations (#5472) [me] KAFKA-7225: Pretransform validated props -- [...truncated 2.13 MB...]

Re: [DISCUSS] KIP-346 - Limit blast radius of log compaction failure

2018-08-07 Thread Jason Gustafson
Hi Stanislav, Just a couple quick questions: 1. I may have missed it, but what will be the default value for `max.uncleanable.partitions`? 2. It seems there will be some impact for users that monitoring "time-since-last-run-ms" in order to detect cleaner failures. Not sure it's a major concern,

Enabled spotlessScalaCheck in Jenkins Job for Streams

2018-08-07 Thread Guozhang Wang
Hello folks, Since we have introduced scalafmt to the Streams module, we did not yet enable the check on the Jenkins job yet. It would cause Jenkins job to still pass with ill-formatted code, while the `./gradlew build` process will fail as it automatically execute the check. I have added

Re: [VOTE] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-08-07 Thread Jason Gustafson
+1 Thanks Vahid. On Tue, Aug 7, 2018 at 11:14 AM, Vahid S Hashemian < vahidhashem...@us.ibm.com> wrote: > Hi all, > > I'd like to start a vote on KIP-289 to modify the default group id of > KafkaConsumer. > The KIP: > https://cwiki.apache.org/confluence/display/KAFKA/KIP- >

Re: [VOTE] KIP-334 Include partitions in exceptions raised during consumer record deserialization/validation

2018-08-07 Thread Jason Gustafson
One difference between the two cases is that we can't generally trust the offset of a corrupt message. Which offset were we planning to use in the exception? Maybe it should be either the fetch offset or one plus the last consumed offset? I think I'm with Colin in preferring

Build failed in Jenkins: kafka-1.1-jdk7 #174

2018-08-07 Thread Apache Jenkins Server
See Changes: [jason] MINOR: Fix minikdc cleanup in system tests (#5471) -- [...truncated 426.50 KB...] kafka.log.LogCleanerIntegrationTest >

Re: [DISCUSS] KIP-320: Allow fetchers to detect and handle log truncation

2018-08-07 Thread Jason Gustafson
Hey Dong, I see what you mean. This would have been clearer if the committed offset was the offset of the last consumed record. I don't feel too strongly about it either. Perhaps we can use the more concise name and just rely on the documentation to explain its usage. It should be rare that users

Re: [DISCUSS] KIP-110: Add Codec for ZStandard Compression (Updated)

2018-08-07 Thread Jason Gustafson
Hey Colin, The problem for the fetch API is that the broker does not generally know if a batch was compressed with zstd unless it parses it. I think the goal here is to avoid the expensive down-conversion that is needed to ensure compatibility because it is only necessary if zstd is actually in

Re: [DISCUSS] KIP-325: Extend Consumer Group Command to Show Beginning Offsets

2018-08-07 Thread Vahid S Hashemian
Any additional feedback on whether we should also include a partition size column or not? Options: 1. The current KIP (with a partition size column): https://cwiki.apache.org/confluence/display/KAFKA/KIP-325%3A+Extend+Consumer+Group+Command+to+Show+Beginning+Offsets+and+Partition+Size **

Jenkins build is back to normal : kafka-trunk-jdk10 #382

2018-08-07 Thread Apache Jenkins Server
See

Jenkins build is back to normal : kafka-trunk-jdk8 #2875

2018-08-07 Thread Apache Jenkins Server
See

Re: [DISCUSS] KIP-320: Allow fetchers to detect and handle log truncation

2018-08-07 Thread Dong Lin
Hey Jason, Thanks for the reply. Regarding 3), I am thinking that both "Offset" and "LastLeaderEpoch" in the OffsetCommitRequest are associated with the last consumed messages. Value of "Offset" is not necessarily the offset of the next message due to log compaction. Since we are naming "Offset"

Re: [DISCUSS] KIP-110: Add Codec for ZStandard Compression (Updated)

2018-08-07 Thread Colin McCabe
Thanks for bumping this, Dongjin. ZStd is a good compression codec and I hope we can get this support in soon! I would say we can just bump the API version to indicate that ZStd support is expected in new clients. We probably need some way of indicating to the older clients that they can't

Re: [DISCUSS] KIP-320: Allow fetchers to detect and handle log truncation

2018-08-07 Thread Jason Gustafson
Hi Dong, Thanks for the comments. 1) Yes, makes sense. 2) This is an interesting point. The suggestion made more sense in the initial version of the KIP, but I think you are right that we should use the same fencing semantics we use for the Fetch and OffsetForLeaderEpoch APIs. Just like a

[jira] [Resolved] (KAFKA-7225) Kafka Connect ConfigProvider not invoked before validation

2018-08-07 Thread Ewen Cheslack-Postava (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-7225?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ewen Cheslack-Postava resolved KAFKA-7225. -- Resolution: Fixed > Kafka Connect ConfigProvider not invoked before validation

Jenkins build is back to normal : kafka-2.0-jdk8 #106

2018-08-07 Thread Apache Jenkins Server
See

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Matthias J. Sax
Correct. It's not about reordering. Records will still be processed in offset-order per partition. For multi-partition task (like joins), we use the timestamp of the "head" record of each partition to determine which record to process first (to process records across partitions in timestamp order

Re: [DISCUSS] KIP-320: Allow fetchers to detect and handle log truncation

2018-08-07 Thread Jason Gustafson
Hey Jun, 57. It's a fair point. I could go either way, but I'm slightly inclined to just document the new API for now. We'll still support seeking to an offset with corresponding epoch information, so deprecating the old seek() seems like overkill. 60. The phrasing was a little confusing. Does

Re: [DISCUSS] KIP-320: Allow fetchers to detect and handle log truncation

2018-08-07 Thread Dong Lin
Hey Jason, Thanks for the update. I have some comments below: 1) Since FencedLeaderEpochException indicates that the metadata in the client is outdated, should it extend InvalidMetadataException? 2) It is mentioned that "To fix the problem with KIP-232, we will add the leader epoch the

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Thomas Becker
In typing up a scenario to illustrate my question, I think I found the answer ;) We are not assuming timestamps will be strictly increasing within a topic and trying to make processing order deterministic even in the face of that. Thanks for making me think about it (or please correct me if I'm

Re: [DISCUSS] Applying scalafmt to core code

2018-08-07 Thread Colin McCabe
Hmm. It would be unfortunate to make contributors include unrelated style changes in their PRs. This would be especially hard on new contributors who might not want to make a large change. If we really want to do something like this, I would vote for A1. Just do the change all at once and

Re: [DISCUSS] KIP-348 Eliminate null from SourceTask#poll()

2018-08-07 Thread Colin McCabe
Thanks, Chia-Ping. Looks good to me. regards, Colin On Sat, Aug 4, 2018, at 01:17, Chia-Ping Tsai wrote: > hi Colin > > Thanks for the reviews! You are totally right. The description of > KIP-348 is not accurate. The purpose of KIP-348 is to encourage > connector user to substitute empty

Re: [DISCUSS] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-08-07 Thread Colin McCabe
Thanks, Vahid. best, Colin On Fri, Aug 3, 2018, at 12:15, Vahid S Hashemian wrote: > The KIP has been updated. > > If it looks good and there are no further comments I'll start a vote early > next week. > > Thanks. > --Vahid > > > > > From: "Vahid S Hashemian" > To:

[VOTE] KIP-289: Improve the default group id behavior in KafkaConsumer

2018-08-07 Thread Vahid S Hashemian
Hi all, I'd like to start a vote on KIP-289 to modify the default group id of KafkaConsumer. The KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-289%3A+Improve+the+default+group+id+behavior+in+KafkaConsumer The discussion thread:

Re: [VOTE] KIP-334 Include partitions in exceptions raised during consumer record deserialization/validation

2018-08-07 Thread Colin McCabe
Hi Stanislav, On Sat, Aug 4, 2018, at 10:44, Stanislav Kozlovski wrote: > Hey Colin, > > It may be a bit vague but keep in mind we also raise the exception in the > case where the record is corrupted. > We discussed with Jason offline that message corruption most often prevents > deserialization

Build failed in Jenkins: kafka-trunk-jdk10 #381

2018-08-07 Thread Apache Jenkins Server
See -- Started by an SCM change [EnvInject] - Loading node environment variables. Building remotely on H30 (ubuntu xenial) in workspace

Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Gwen Shapira
+1 (binding) On Tue, Aug 7, 2018 at 4:19 AM, Stanislav Kozlovski wrote: > Hey everybody, > I'm starting a vote on KIP-346 > 346+-+Improve+LogCleaner+behavior+on+error> > > -- > Best, > Stanislav > -- *Gwen Shapira* Product Manager |

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Matthias J. Sax
@Thomas, just to rephrase (from my understanding): > So in the scenario you describe, where one topic has >>> vastly lower throughput, you're saying that when the lower throughput topic >>> is fully caught up (no messages in the buffer), the task will idle rather >>> than using the timestamp of

Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Harsha
+1 (binding) Thanks, Harsha On Tue, Aug 7, 2018, at 10:22 AM, Manikumar wrote: > +1 (non-binding) > > Thanks for the KIP. > > On Tue, Aug 7, 2018 at 10:42 PM Ray Chiang wrote: > > > +1 (non-binding) > > > > -Ray > > > > On 8/7/18 9:26 AM, Ted Yu wrote: > > > +1 > > > > > > On Tue, Aug 7,

Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Manikumar
+1 (non-binding) Thanks for the KIP. On Tue, Aug 7, 2018 at 10:42 PM Ray Chiang wrote: > +1 (non-binding) > > -Ray > > On 8/7/18 9:26 AM, Ted Yu wrote: > > +1 > > > > On Tue, Aug 7, 2018 at 5:25 AM Thomas Becker > wrote: > > > >> +1 (non-binding) > >> > >> We've hit issues with the log

Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Ray Chiang
+1 (non-binding) -Ray On 8/7/18 9:26 AM, Ted Yu wrote: +1 On Tue, Aug 7, 2018 at 5:25 AM Thomas Becker wrote: +1 (non-binding) We've hit issues with the log cleaner in the past, and this would be a great improvement. On Tue, 2018-08-07 at 12:19 +0100, Stanislav Kozlovski wrote: Hey

Re: [VOTE] KIP-328: Ability to suppress updates for KTables

2018-08-07 Thread John Roesler
Thanks everyone, KIP-328 has passed with 3 binding votes (Guozhang, Damian, and Matthias) and 3 non-binding (Ted, Bill, and me). Thanks for your time, -John On Mon, Aug 6, 2018 at 6:35 PM Matthias J. Sax wrote: > +1 (binding) > > Thanks for the KIP. > > > -Matthias > > On 8/3/18 12:52 AM,

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Guozhang Wang
@Tommy Yes that's the intent. Again note that the current behavior is indeed "just using the timestamp of the last message I saw", and continue processing what's in the buffer from other streams, but this may introduce out-of-ordering. Guozhang On Tue, Aug 7, 2018 at 9:59 AM, Thomas Becker

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Thomas Becker
Thanks Guozhang. So in the scenario you describe, where one topic has vastly lower throughput, you're saying that when the lower throughput topic is fully caught up (no messages in the buffer), the task will idle rather than using the timestamp of the last message it saw from that topic?

Re: Add to JIRA

2018-08-07 Thread Matthias J. Sax
Added you to the list on contributors in JIRA. You can now assign Jiras to yourself. Most components are written in Java (only brokers and tools are Scala). So you can look for tickets in clients, connect, streams. We try to mark tickets with labels "newbie" or "beginner". This should be a good

Re: [DISCUSS] Applying scalafmt to core code

2018-08-07 Thread Guozhang Wang
Hello Ray, I saw on the original PR Jason (cc'ed) expressed a concern comparing scalafmt with scalastyle: the latter will throw exceptions in the build process to notify developers while the former will automatically reformat the code that developers may not be aware of. So I think maybe Jason

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Guozhang Wang
@Ted Yes, I will update the KIP mentioning this as a separate consideration. @Thomas The idle period may be happening during the processing as well. Think: if you are joining two streams with very different throughput traffic, say for an extreme case, one stream comes in as 100K messages /

Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Ted Yu
+1 On Tue, Aug 7, 2018 at 5:25 AM Thomas Becker wrote: > +1 (non-binding) > > We've hit issues with the log cleaner in the past, and this would be a > great improvement. > On Tue, 2018-08-07 at 12:19 +0100, Stanislav Kozlovski wrote: > > Hey everybody, > > I'm starting a vote on KIP-346 > > < >

Build failed in Jenkins: kafka-trunk-jdk10 #380

2018-08-07 Thread Apache Jenkins Server
See Changes: [wangguoz] KAFKA-7250: fix transform function in scala DSL to accept -- [...truncated 1.12 MB...] org.apache.kafka.clients.MetadataTest > testTopicExpiry

Build failed in Jenkins: kafka-trunk-jdk8 #2874

2018-08-07 Thread Apache Jenkins Server
See -- Started by an SCM change [EnvInject] - Loading node environment variables. Building remotely on H32 (ubuntu xenial) in workspace

[jira] [Resolved] (KAFKA-7250) Kafka-Streams-Scala DSL transform shares transformer instance

2018-08-07 Thread Guozhang Wang (JIRA)
[ https://issues.apache.org/jira/browse/KAFKA-7250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guozhang Wang resolved KAFKA-7250. -- Resolution: Fixed Fix Version/s: 2.1.0 2.0.1 > Kafka-Streams-Scala

Re: [DISCUSS] KIP-110: Add Codec for ZStandard Compression (Updated)

2018-08-07 Thread Dongjin Lee
As Kafka 2.0.0 was released, let's reboot this issue, KIP-110 . For newcomers, Here is some summary of the history: KIP-110 was originally worked for the issue KAFKA-4514 but, it lacked benchmark

Re: [DISCUSS] KIP-340: Allow kafka-reassign-partitions.sh and kafka-log-dirs.sh to take admin client property file

2018-08-07 Thread Attila Sasvári
Hi Dong, Thanks for the KIP. +1 on using "command-config". Regarding the testing strategy: - Is it planned to add new system test(s) to execute kafka-log-dirs.sh (kafka-reassign-partitions.sh) against an SSL-enabled broker? I know that the change is quite small, but it might be useful. Best,

Re: [VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Thomas Becker
+1 (non-binding) We've hit issues with the log cleaner in the past, and this would be a great improvement. On Tue, 2018-08-07 at 12:19 +0100, Stanislav Kozlovski wrote: Hey everybody, I'm starting a vote on KIP-346

Re: [DISCUSS] KIP-353: Allow Users to Configure Kafka Streams Timestamp Synchronization

2018-08-07 Thread Thomas Becker
This looks like a big step in the right direction IMO. So am I correct in assuming this idle period would only come into play after startup when waiting for initial records to be fetched? In other words, once we have seen records from all topics and have established the stream time processing

[VOTE] KIP-346 - Improve LogCleaner behavior on error

2018-08-07 Thread Stanislav Kozlovski
Hey everybody, I'm starting a vote on KIP-346 -- Best, Stanislav

Jenkins build is back to normal : kafka-trunk-jdk10 #379

2018-08-07 Thread Apache Jenkins Server
See

Jenkins build is back to normal : kafka-trunk-jdk8 #2872

2018-08-07 Thread Apache Jenkins Server
See

Re: [DISCUSS] KIP-345: Reduce multiple consumer rebalances by specifying member id

2018-08-07 Thread James Cheng
Guozhang, in a previous message, you proposed said this: > On Jul 30, 2018, at 3:56 PM, Guozhang Wang wrote: > > 1. We bump up the JoinGroupRequest with additional fields: > > 1.a) a flag indicating "static" or "dynamic" membership protocols. > 1.b) with "static" membership, we also add the