Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-06-25 Thread Ted Yu
My previous response was talking about the new method in
InternalTopologyBuilder.
The exception just means there is no uniform extractor for all the sinks.
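To make the non-uniform case concrete, here is a minimal, self-contained sketch of that check. The `TopicNameExtractor` and `SinkNode` types below are simplified stand-ins for the real Kafka Streams classes, not the actual API: a topology-level `topicNameExtractor()` can only return a single class, so it throws when the sinks disagree.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Simplified stand-ins for the Kafka Streams types, for illustration only.
interface TopicNameExtractor {
    String extract(String key, String value);
}

class SinkNode {
    final String name;
    final TopicNameExtractor extractor;

    SinkNode(String name, TopicNameExtractor extractor) {
        this.name = name;
        this.extractor = extractor;
    }
}

public class ExtractorCheck {

    // Returns the single extractor class shared by all sinks, or throws
    // when the sinks do not agree on one extractor class.
    static Class<?> topicNameExtractor(List<SinkNode> sinks) {
        Set<Class<?>> classes = new LinkedHashSet<>();
        for (SinkNode sink : sinks) {
            classes.add(sink.extractor.getClass());
        }
        if (classes.size() != 1) {
            throw new IllegalStateException(
                "No uniform TopicNameExtractor across sinks: " + classes);
        }
        return classes.iterator().next();
    }

    public static void main(String[] args) {
        TopicNameExtractor byKey = (key, value) -> "out-" + key;
        List<SinkNode> sinks = List.of(
            new SinkNode("sink-1", byKey),
            new SinkNode("sink-2", byKey));
        System.out.println(topicNameExtractor(sinks).getSimpleName());
    }
}
```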

On Mon, Jun 25, 2018 at 8:02 PM, Matthias J. Sax 
wrote:

> Ted,
>
> Why? Each sink can have a different TopicNameExtractor.
>
>
> -Matthias
>
> On 6/25/18 5:19 PM, Ted Yu wrote:
> > If there are different TopicNameExtractor classes from multiple sink
> nodes,
> > the new method should throw exception alerting user of such scenario.
> >
> >
> > On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck  wrote:
> >
> >> Thanks for the KIP!
> >>
> >> Overall I'm +1 on the KIP.   I have one question.
> >>
> >> The KIP states that the method "topicNameExtractor()" is added to the
> >> InternalTopologyBuilder.java.
> >>
> >> It could be that I'm missing something, but how does this work if a user
> >> has provided different TopicNameExtractor instances to multiple sink
> nodes?
> >>
> >> Thanks,
> >> Bill
> >>
> >>
> >>
> >> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang 
> wrote:
> >>
> >>> Yup I agree, generally speaking the `toString()` output is not
> >> recommended
> >>> to be relied on programmatically in user's code, but we've observed
> >>> convenience-beats-any-other-reasons again and again in development
> >>> unfortunately. I think we should still not claim it is part of the
> >>> public APIs that would not be changed anyhow in the future, but just
> >>> mentioning it in the wiki for people to be aware is fine.
> >>>
> >>>
> >>> Guozhang
> >>>
> >>> On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax <
> matth...@confluent.io>
> >>> wrote:
> >>>
>  Thanks for the KIP!
> 
>  I don't have any further comments.
> 
>  For Guozhang's comment: if we mention anything about `toString()`, we
>  should make explicit that `toString()` output is still not public
>  contract and users should not rely on the output.
> 
>  Furthermore, for the actual user-facing output, I would replace "topic:" by
>  "extractor class:" to make the difference obvious.
> 
>  I am just hoping that people actually do not rely on `toString()`, which
>  defeats the purpose of the `TopologyDescription` class that was
>  introduced to avoid the dependency... (Just a side comment, not really
>  related to this KIP proposal itself).
> 
> 
>  If there are no further comments in the next days, feel free to start
>  the VOTE and open a PR.
> 
> 
> 
> 
>  -Matthias
> 
>  On 6/22/18 6:04 PM, Guozhang Wang wrote:
> > Thanks for writing the KIP!
> >
> > I'm +1 on the proposed changes over all. One minor suggestion: we
> >>> should
> > also mention that the `Sink#toString` will also be updated, in a way
> >>> that
> > if `topic()` returns null, use the other call, etc. This is because
> > although we do not explicitly state the following logic as public
>  protocols:
> >
> > ```
> >
> > "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
> > nodeNames(predecessors);
> >
> >
> > ```
> >
> > There are already some users that rely on
> >>> `topology.describe().toString(
>  )`
> > to have runtime checks on the returned string values, so changing
> >> this
> > means that their app will break and hence need to make code changes.
> >
> > Guozhang
> >
> > On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
> >>> nishanth...@gmail.com
> >
> > wrote:
> >
> >> Hello Everyone,
> >>
> >> I have created a new KIP to discuss extending TopologyDescription.
> >> You
>  can
> >> find it here:
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
> >>
> >> Please provide any feedback that you might have.
> >>
> >> Best,
> >> Nishanth Pradeep
> >>
> >
> >
> >
> 
> 
> >>>
> >>>
> >>> --
> >>> -- Guozhang
> >>>
> >>
> >
>
>


Re: [VOTE] KIP-319: Replace numSegments to segmentInterval in Streams window configurations

2018-06-25 Thread Matthias J. Sax
+1 (binding)

On 6/25/18 3:00 PM, Guozhang Wang wrote:
> +1
> 
> On Mon, Jun 25, 2018 at 2:58 PM, Ted Yu  wrote:
> 
>> +1
>>
>> On Mon, Jun 25, 2018 at 2:56 PM, John Roesler  wrote:
>>
>>> Hello All,
>>>
>>> Thanks for the discussion on KIP-319. I'd now like to start the voting.
>>>
>>> As a reminder, KIP-319 proposes a fix to an issue I identified in
>>> KAFKA-7080. Specifically, the issue is that we're creating
>>> CachingWindowStore with the *number of segments* instead of the *segment
>>> size*.
>>>
>>> Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
>>> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
>>>
>>> Additionally, here's a draft PR for clarity:
>>> https://github.com/apache/kafka/pull/5257
>>>
>>> Thanks,
>>> -John
>>>
>>
> 
> 
> 





Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

2018-06-25 Thread Matthias J. Sax
KAFKA-7080 is for this KIP.

I meant to create a JIRA to add `segmentInterval` to `Materialized` and
a JIRA to add `Materialized` to `KStream#join(KStream)`.

Thx.


-Matthias
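For readers following the rename, the relationship between the deprecated `segments` count and the new `segmentInterval` can be sketched as below. The `retentionMs / (numSegments - 1)` formula is an assumption for illustration only; the exact conversion (including any enforced minimum interval) is specified in the KIP itself.

```java
public class SegmentIntervalSketch {

    // Hypothetical conversion from the deprecated segments count to a
    // segment interval, assuming interval = retentionMs / (numSegments - 1).
    static long segmentInterval(long retentionMs, int numSegments) {
        if (numSegments < 2) {
            throw new IllegalArgumentException("numSegments must be >= 2");
        }
        return retentionMs / (numSegments - 1);
    }

    public static void main(String[] args) {
        // e.g. one day of retention spread over the old default of 3 segments
        System.out.println(segmentInterval(24 * 60 * 60 * 1000L, 3)); // 43200000
    }
}
```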

On 6/25/18 2:46 PM, John Roesler wrote:
> Ah, it turns out I did create a ticket: it's KAFKA-7080:
> https://issues.apache.org/jira/browse/KAFKA-7080
> 
> -John
> 
> On Mon, Jun 25, 2018 at 4:44 PM John Roesler  wrote:
> 
>> Matthias,
>>
>> That's a good idea. I'm not sure why I didn't...
>>
>> Thanks,
>> -John
>>
>> On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax 
>> wrote:
>>
>>> Ok.
>>>
>>> @John: can you create a JIRA to track this? I think KAFKA-4730 is
>>> related, but actually an own ticket (that is blocked by not having
>>> Materialized for stream-stream joins).
>>>
>>>
>>> -Matthias
>>>
>>> On 6/25/18 2:10 PM, Bill Bejeck wrote:
 I agree that it makes sense to have segmentInterval as a parameter to a
 store, but I also agree with Guozhang's point about not moving as part
>>> of
 this KIP.

 Thanks,
 Bill

 On Mon, Jun 25, 2018 at 4:17 PM John Roesler  wrote:

> Thanks Matthias and Guozhang,
>
> About deprecating the "segments" field instead of making it private.
>>> Yes, I
> just took another look at the code, and that is correct. I'll update
>>> the
> KIP.
>
> I do agree that in the long run, it makes more sense as a parameter to
>>> the
> store somehow than as a parameter to the window. I think this isn't a
>>> super
> high priority, though, because it's not exposed in the DSL (or it
>>> wasn't
> intended to be).
>
> I felt Guozhang's point is valid, and that we should probably revisit
>>> it
> later, possibly in the scope of
> https://issues.apache.org/jira/browse/KAFKA-4730
>
> I'll wait an hour or so for more feedback before moving on to a vote.
>
> Thanks again,
> -John
>
> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang 
>>> wrote:
>
>> Re `segmentInterval` parameter in Windows: currently it is used in two
>> places, the windowed stream aggregation, and the stream-stream joins.
>>> For
>> the former, we can potentially move the parameter from windowedBy() to
>> Materialized, but for the latter we currently do not expose a
> Materialized
>> object yet, only the Windows spec. So I think in this KIP we probably
>> cannot move it immediately.
>>
>> But in future KIPs if we decide to expose the stream-stream join's
>>> store
> /
>> changelog / repartition topic names, we may well add the
>>> Materialized
>> object into the operator, and we can then move the parameter to
>> Materialized by then.
>>
>>
>> Guozhang
>>
>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
>>> matth...@confluent.io>
>> wrote:
>>
>>> Thanks for the KIP. Overall, I think it makes sense to clean up the
> API.
>>>
>>> Couple of comments:
>>>
 Sadly there's no way to "deprecate" this
 exposure
>>>
>>> I disagree. We can just mark the variable as deprecated and I
>>> advocate
>>> to do this. When the deprecation period passed, we can make it
>>> private
>>> (or actually remove it; cf. my next comment).
>>>
>>>
>>> The parameter `segmentInterval` is semantically not a "window"
>>> specification parameter but an implementation detail and thus a store
>>> parameter. Would it be better to add it to `Materialized`?
>>>
>>>
>>> -Matthias
>>>
>>> On 6/22/18 5:13 PM, Guozhang Wang wrote:
 Thanks John.

 On Fri, Jun 22, 2018 at 5:05 PM, John Roesler 
>> wrote:

> Thanks for the feedback, Bill and Guozhang,
>
> I've updated the KIP accordingly.
>
> Thanks,
> -John
>
> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang 
>>> wrote:
>
>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
> the
> wiki:
>> the `In Windows, we will:` section code snippet is empty.
>>
>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck 
>>> wrote:
>>
>>> Hi John,
>>>
>>> Thanks for the KIP, and overall it's a +1 for me.
>>>
>>> In the JavaDoc for the segmentInterval method, there's no mention
> of
> the
>>> number of segments there can be at any one time.  While it's
> implied
> that
>>> the number of segments is potentially unbounded, would it be better
> to
>>> explicitly state that the previous limit on the number of
>>> segments
>> is
>> going
>>> to be removed as well?
>>>
>>> I have a couple of nit comments.   The method name is still
>>> segmentSize
>> in
>>> the code block vs segmentInterval and the order of the parameters
>> for
> the
>>> third 

Re: [VOTE] 1.0.2 RC0

2018-06-25 Thread Manikumar
+1 (non-binding) Verified tests, quick start, producer/consumer perf tests.

On Sat, Jun 23, 2018 at 2:25 AM Ted Yu  wrote:

> +1
>
> Ran test suite.
>
> Checked signatures.
>
> On Fri, Jun 22, 2018 at 11:42 AM, Vahid S Hashemian <
> vahidhashem...@us.ibm.com> wrote:
>
> > +1 (non-binding)
> >
> > Built from source and ran quickstart successfully on Ubuntu (with Java
> 8).
> >
> > Thanks for running the release Matthias!
> > --Vahid
> >
> >
> >
> >
> > From:   "Matthias J. Sax" 
> > To: dev@kafka.apache.org, us...@kafka.apache.org,
> > kafka-clie...@googlegroups.com
> > Date:   06/22/2018 10:42 AM
> > Subject:[VOTE] 1.0.2 RC0
> >
> >
> >
> > Hello Kafka users, developers and client-developers,
> >
> > This is the first candidate for release of Apache Kafka 1.0.2.
> >
> > This is a bug fix release closing 26 tickets:
> > https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.0.2
> >
> > Release notes for the 1.0.2 release:
> > http://home.apache.org/~mjsax/kafka-1.0.2-rc0/RELEASE_NOTES.html
> >
> > *** Please download, test and vote by Tuesday, 6/26/18 end-of-day, so we
> > can close the vote on Wednesday.
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > http://home.apache.org/~mjsax/kafka-1.0.2-rc0/
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/
> >
> > * Javadoc:
> > http://home.apache.org/~mjsax/kafka-1.0.2-rc0/javadoc/
> >
> > * Tag to be voted upon (off 1.0 branch) is the 1.0.2 tag:
> > https://github.com/apache/kafka/releases/tag/1.0.2-rc0
> >
> > * Documentation:
> > http://kafka.apache.org/10/documentation.html
> >
> > * Protocol:
> > http://kafka.apache.org/10/protocol.html
> >
> > * Successful Jenkins builds for the 1.0 branch:
> > Unit/integration tests:
> https://builds.apache.org/job/kafka-1.0-jdk7/211/
> > System tests:
> > https://jenkins.confluent.io/job/system-test-kafka/job/1.0/217/
> >
> > /**
> >
> > Thanks,
> >   -Matthias
> >
> > [attachment "signature.asc" deleted by Vahid S Hashemian/Silicon
> > Valley/IBM]
> >
> >
> >
> >
>


Build failed in Jenkins: kafka-trunk-jdk8 #2767

2018-06-25 Thread Apache Jenkins Server
See 


Changes:

[jason] KAFKA-6591; Move super user check before ACL matching  (#4618)

--
[...truncated 1.94 MB...]

org.apache.kafka.common.acl.AclOperationTest > testName STARTED

org.apache.kafka.common.acl.AclOperationTest > testName PASSED

org.apache.kafka.common.acl.AclOperationTest > testExhaustive STARTED

org.apache.kafka.common.acl.AclOperationTest > testExhaustive PASSED

org.apache.kafka.common.acl.AclOperationTest > testIsUnknown STARTED

org.apache.kafka.common.acl.AclOperationTest > testIsUnknown PASSED

org.apache.kafka.common.acl.ResourcePatternTest > 
shouldThrowIfResourceNameIsNull STARTED

org.apache.kafka.common.acl.ResourcePatternTest > 
shouldThrowIfResourceNameIsNull PASSED

org.apache.kafka.common.acl.ResourcePatternTest > shouldThrowIfPatternTypeIsAny 
STARTED

org.apache.kafka.common.acl.ResourcePatternTest > shouldThrowIfPatternTypeIsAny 
PASSED

org.apache.kafka.common.acl.ResourcePatternTest > 
shouldThrowIfResourceTypeIsAny STARTED

org.apache.kafka.common.acl.ResourcePatternTest > 
shouldThrowIfResourceTypeIsAny PASSED

org.apache.kafka.common.acl.ResourcePatternTest > 
shouldThrowIfPatternTypeIsMatch STARTED

org.apache.kafka.common.acl.ResourcePatternTest > 
shouldThrowIfPatternTypeIsMatch PASSED

org.apache.kafka.common.acl.AclBindingTest > shouldThrowOnMatchPatternType 
STARTED

org.apache.kafka.common.acl.AclBindingTest > shouldThrowOnMatchPatternType 
PASSED

org.apache.kafka.common.acl.AclBindingTest > shouldThrowOnAnyPatternType STARTED

org.apache.kafka.common.acl.AclBindingTest > shouldThrowOnAnyPatternType PASSED

org.apache.kafka.common.acl.AclBindingTest > testUnknowns STARTED

org.apache.kafka.common.acl.AclBindingTest > testUnknowns PASSED

org.apache.kafka.common.acl.AclBindingTest > testMatching STARTED

org.apache.kafka.common.acl.AclBindingTest > testMatching PASSED

org.apache.kafka.common.acl.AclBindingTest > shouldThrowOnAnyResourceType 
STARTED

org.apache.kafka.common.acl.AclBindingTest > shouldThrowOnAnyResourceType PASSED

org.apache.kafka.common.acl.AclBindingTest > 
shouldNotThrowOnUnknownResourceType STARTED

org.apache.kafka.common.acl.AclBindingTest > 
shouldNotThrowOnUnknownResourceType PASSED

org.apache.kafka.common.acl.AclBindingTest > testMatchesAtMostOne STARTED

org.apache.kafka.common.acl.AclBindingTest > testMatchesAtMostOne PASSED

org.apache.kafka.common.acl.AclBindingTest > shouldNotThrowOnUnknownPatternType 
STARTED

org.apache.kafka.common.acl.AclBindingTest > shouldNotThrowOnUnknownPatternType 
PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldNotMatchIfBothPrefixedAndResourceIsPrefixOfFilter STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldNotMatchIfBothPrefixedAndResourceIsPrefixOfFilter PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldNotMatchPrefixedIfNamePrefixedAnyFilterTypeIsAny STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldNotMatchPrefixedIfNamePrefixedAnyFilterTypeIsAny PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldBeUnknownIfResourceTypeUnknown STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldBeUnknownIfResourceTypeUnknown PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchPrefixedIfExactMatch STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchPrefixedIfExactMatch PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldNotMatchIfDifferentName STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldNotMatchIfDifferentName PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchLiteralIfNameMatchesAndFilterIsOnPatternTypeAny STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchLiteralIfNameMatchesAndFilterIsOnPatternTypeAny PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchWhereResourceTypeIsAny STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchWhereResourceTypeIsAny PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldNotMatchLiteralWildcardIfFilterHasPatternTypeOfAny STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldNotMatchLiteralWildcardIfFilterHasPatternTypeOfAny PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchLiteralWildcardIfFilterHasPatternTypeOfMatch STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchLiteralWildcardIfFilterHasPatternTypeOfMatch PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchWherePatternTypeIsAny STARTED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchWherePatternTypeIsAny PASSED

org.apache.kafka.common.acl.ResourcePatternFilterTest > 
shouldMatchLiteralIfNameMatchesAndFilterIsOnPatternTypeMatch STARTED


Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-06-25 Thread Matthias J. Sax
Ted,

Why? Each sink can have a different TopicNameExtractor.


-Matthias

On 6/25/18 5:19 PM, Ted Yu wrote:
> If there are different TopicNameExtractor classes from multiple sink nodes,
> the new method should throw exception alerting user of such scenario.
> 
> 
> On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck  wrote:
> 
>> Thanks for the KIP!
>>
>> Overall I'm +1 on the KIP.   I have one question.
>>
>> The KIP states that the method "topicNameExtractor()" is added to the
>> InternalTopologyBuilder.java.
>>
>> It could be that I'm missing something, but how does this work if a user
>> has provided different TopicNameExtractor instances to multiple sink nodes?
>>
>> Thanks,
>> Bill
>>
>>
>>
>> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang  wrote:
>>
>>> Yup I agree, generally speaking the `toString()` output is not
>> recommended
>>> to be relied on programmatically in user's code, but we've observed
>>> convenience-beats-any-other-reasons again and again in development
>>> unfortunately. I think we should still not claim it is part of the
>>> public APIs that would not be changed anyhow in the future, but just
>>> mentioning it in the wiki for people to be aware is fine.
>>>
>>>
>>> Guozhang
>>>
>>> On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax 
>>> wrote:
>>>
 Thanks for the KIP!

 I don't have any further comments.

 For Guozhang's comment: if we mention anything about `toString()`, we
 should make explicit that `toString()` output is still not public
 contract and users should not rely on the output.

 Furthermore, for the actual user-facing output, I would replace "topic:" by
 "extractor class:" to make the difference obvious.

 I am just hoping that people actually do not rely on `toString()`, which
 defeats the purpose of the `TopologyDescription` class that was
 introduced to avoid the dependency... (Just a side comment, not really
 related to this KIP proposal itself).


 If there are no further comments in the next days, feel free to start
 the VOTE and open a PR.




 -Matthias

 On 6/22/18 6:04 PM, Guozhang Wang wrote:
> Thanks for writing the KIP!
>
> I'm +1 on the proposed changes over all. One minor suggestion: we
>>> should
> also mention that the `Sink#toString` will also be updated, in a way
>>> that
> if `topic()` returns null, use the other call, etc. This is because
> although we do not explicitly state the following logic as public
 protocols:
>
> ```
>
> "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
> nodeNames(predecessors);
>
>
> ```
>
> There are already some users that rely on
>>> `topology.describe().toString(
 )`
> to have runtime checks on the returned string values, so changing
>> this
> means that their app will break and hence need to make code changes.
>
> Guozhang
>
> On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
>>> nishanth...@gmail.com
>
> wrote:
>
>> Hello Everyone,
>>
>> I have created a new KIP to discuss extending TopologyDescription.
>> You
 can
>> find it here:
>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
>>
>> Please provide any feedback that you might have.
>>
>> Best,
>> Nishanth Pradeep
>>
>
>
>


>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>
> 





[jira] [Created] (KAFKA-7098) Improve accuracy of the log cleaner throttle rate

2018-06-25 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-7098:
---

 Summary: Improve accuracy of the log cleaner throttle rate
 Key: KAFKA-7098
 URL: https://issues.apache.org/jira/browse/KAFKA-7098
 Project: Kafka
  Issue Type: Improvement
Reporter: Dong Lin
Assignee: Dong Lin


LogCleaner uses the Throttler class to throttle the log cleaning rate to the 
user-specified limit, i.e. log.cleaner.io.max.bytes.per.second. However, in 
Throttler.maybeThrottle(), after sleep() is called, periodStartNs is set to 
the time taken before the sleep, which artificially increases the actual 
window size and under-estimates the actual log cleaning rate. This causes the 
log cleaning IO to be higher than the user-specified limit.
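A simplified numeric illustration of the accounting problem (the method and variable names below are illustrative, not the actual Throttler code): when the window start is taken before the sleep, the sleep time is counted in the measurement window, so the computed rate under-estimates the true cleaning rate and the throttler allows more IO than intended.

```java
public class ThrottleWindowSketch {

    // Observed rate in bytes/second over a measurement window of windowNs.
    static double rate(long bytes, long windowNs) {
        return bytes / (windowNs / 1e9);
    }

    public static void main(String[] args) {
        long bytes = 100_000_000L;     // bytes cleaned in the window
        long workNs = 1_000_000_000L;  // 1s spent doing cleaning IO
        long sleepNs = 1_000_000_000L; // 1s spent sleeping to throttle

        // Buggy accounting: the window start is taken before the sleep, so
        // the sleep time is counted in the window and the rate looks halved.
        double buggy = rate(bytes, workNs + sleepNs); // 50 MB/s
        // Fixed accounting: reset the window start after the sleep, so only
        // the time actually spent cleaning is measured.
        double fixed = rate(bytes, workNs);           // 100 MB/s
        System.out.println(buggy + " vs " + fixed);
    }
}
```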



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: SASL Unit test failing

2018-06-25 Thread Ted Yu
I ran the test on Linux as well.

cat /etc/redhat-release
CentOS Linux release 7.2.1511 (Core)

Java version: 1.8.0_161, vendor: Oracle Corporation
Java home: /jdk1.8.0_161/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "linux", version: "3.10.0-327.28.3.el7.x86_64", arch: "amd64",
family: "unix"

On Mon, Jun 25, 2018 at 5:42 PM, Ted Yu  wrote:

> Here was the command I used:
>
> ./gradlew -Dtest.single=SaslAuthenticatorTest clients:test
>
> On Mon, Jun 25, 2018 at 5:39 PM, Ahmed A  wrote:
>
>> I ran test with -i option as follows - "./gradlew  -i test".  The same set
>> of three tests failed.
>>
>> My environment:
>> $ java -version
>> java version "1.8.0_121"
>> Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
>> Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>>
>> $ cat /etc/redhat-release
>> Red Hat Enterprise Linux Workstation release 7.3 (Maipo)
>> $ uname -a
>> Linux  ahmed  3.10.0-514.36.5.el7.x86_64 #1 SMP Thu Dec 28 21:42:18 EST
>> 2017 x86_64 x86_64 x86_64 GNU/Linux
>>
>>
>> Can you please let me know how I can run an individual unit test, what
>> options do I provide?
>>
>>
>> Thank you,
>> Ahmed.
>>
>>
>>
>> On Mon, Jun 25, 2018 at 2:47 PM, Ted Yu  wrote:
>>
>> > I ran the test alone which passed.
>> >
>> > Can you include -i on the command line to see if there is some clue from
>> > the output ?
>> >
>> > Here is my environment:
>> >
>> > Java version: 1.8.0_151, vendor: Oracle Corporation
>> > Java home:
>> > /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/jre
>> > Default locale: en_US, platform encoding: UTF-8
>> > OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"
>> >
>> > FYI
>> >
>> > On Mon, Jun 25, 2018 at 12:59 PM, Ahmed A  wrote:
>> >
>> > > Hello,
>> > >
>> > > I did a fresh clone of the kafka src code, and the following SASL unit
>> > > tests have been failing consistently:
>> > > - testMechanismPluggability
>> > > - testMechanismPluggability
>> > > - testMultipleServerMechanisms
>> > >
>> > > All three tests have similar stack trace:
>> > > at org.junit.Assert.assertTrue(Assert.java:52)
>> > > at
>> > > org.apache.kafka.common.network.NetworkTestUtils.waitForChannelReady(
>> > > NetworkTestUtils.java:79)
>> > > at
>> > > org.apache.kafka.common.network.NetworkTestUtils.checkClient
>> Connection(
>> > > NetworkTestUtils.java:52)
>> > >
>> > > I also noticed, the three tests are using digest-md5.
>> > >
>> > > Has anyone else run into a similar issue or have any ideas for the
>> > failure?
>> > >
>> > > Thank you,
>> > > Ahmed.
>> > >
>> >
>>
>
>


Re: SASL Unit test failing

2018-06-25 Thread Ted Yu
Here was the command I used:

./gradlew -Dtest.single=SaslAuthenticatorTest clients:test

On Mon, Jun 25, 2018 at 5:39 PM, Ahmed A  wrote:

> I ran test with -i option as follows - "./gradlew  -i test".  The same set
> of three tests failed.
>
> My environment:
> $ java -version
> java version "1.8.0_121"
> Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
> Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)
>
> $ cat /etc/redhat-release
> Red Hat Enterprise Linux Workstation release 7.3 (Maipo)
> $ uname -a
> Linux  ahmed  3.10.0-514.36.5.el7.x86_64 #1 SMP Thu Dec 28 21:42:18 EST
> 2017 x86_64 x86_64 x86_64 GNU/Linux
>
>
> Can you please let me know how I can run an individual unit test, what
> options do I provide?
>
>
> Thank you,
> Ahmed.
>
>
>
> On Mon, Jun 25, 2018 at 2:47 PM, Ted Yu  wrote:
>
> > I ran the test alone which passed.
> >
> > Can you include -i on the command line to see if there is some clue from
> > the output ?
> >
> > Here is my environment:
> >
> > Java version: 1.8.0_151, vendor: Oracle Corporation
> > Java home:
> > /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/jre
> > Default locale: en_US, platform encoding: UTF-8
> > OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"
> >
> > FYI
> >
> > On Mon, Jun 25, 2018 at 12:59 PM, Ahmed A  wrote:
> >
> > > Hello,
> > >
> > > I did a fresh clone of the kafka src code, and the following SASL unit
> > > tests have been failing consistently:
> > > - testMechanismPluggability
> > > - testMechanismPluggability
> > > - testMultipleServerMechanisms
> > >
> > > All three tests have similar stack trace:
> > > at org.junit.Assert.assertTrue(Assert.java:52)
> > > at
> > > org.apache.kafka.common.network.NetworkTestUtils.waitForChannelReady(
> > > NetworkTestUtils.java:79)
> > > at
> > > org.apache.kafka.common.network.NetworkTestUtils.
> checkClientConnection(
> > > NetworkTestUtils.java:52)
> > >
> > > I also noticed, the three tests are using digest-md5.
> > >
> > > Has anyone else run into a similar issue or have any ideas for the
> > failure?
> > >
> > > Thank you,
> > > Ahmed.
> > >
> >
>


Re: SASL Unit test failing

2018-06-25 Thread Ahmed A
I ran the tests with the -i option as follows: "./gradlew -i test". The same
set of three tests failed.

My environment:
$ java -version
java version "1.8.0_121"
Java(TM) SE Runtime Environment (build 1.8.0_121-b13)
Java HotSpot(TM) 64-Bit Server VM (build 25.121-b13, mixed mode)

$ cat /etc/redhat-release
Red Hat Enterprise Linux Workstation release 7.3 (Maipo)
$ uname -a
Linux  ahmed  3.10.0-514.36.5.el7.x86_64 #1 SMP Thu Dec 28 21:42:18 EST
2017 x86_64 x86_64 x86_64 GNU/Linux


Can you please let me know how I can run an individual unit test, what
options do I provide?


Thank you,
Ahmed.



On Mon, Jun 25, 2018 at 2:47 PM, Ted Yu  wrote:

> I ran the test alone which passed.
>
> Can you include -i on the command line to see if there is some clue from
> the output ?
>
> Here is my environment:
>
> Java version: 1.8.0_151, vendor: Oracle Corporation
> Java home:
> /Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/jre
> Default locale: en_US, platform encoding: UTF-8
> OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"
>
> FYI
>
> On Mon, Jun 25, 2018 at 12:59 PM, Ahmed A  wrote:
>
> > Hello,
> >
> > I did a fresh clone of the kafka src code, and the following SASL unit
> > tests have been failing consistently:
> > - testMechanismPluggability
> > - testMechanismPluggability
> > - testMultipleServerMechanisms
> >
> > All three tests have similar stack trace:
> > at org.junit.Assert.assertTrue(Assert.java:52)
> > at
> > org.apache.kafka.common.network.NetworkTestUtils.waitForChannelReady(
> > NetworkTestUtils.java:79)
> > at
> > org.apache.kafka.common.network.NetworkTestUtils.checkClientConnection(
> > NetworkTestUtils.java:52)
> >
> > I also noticed, the three tests are using digest-md5.
> >
> > Has anyone else run into a similar issue or have any ideas for the
> failure?
> >
> > Thank you,
> > Ahmed.
> >
>


Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-06-25 Thread Ted Yu
If there are different TopicNameExtractor classes from multiple sink nodes,
the new method should throw exception alerting user of such scenario.


On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck  wrote:

> Thanks for the KIP!
>
> Overall I'm +1 on the KIP.   I have one question.
>
> The KIP states that the method "topicNameExtractor()" is added to the
> InternalTopologyBuilder.java.
>
> It could be that I'm missing something, but how does this work if a user
> has provided different TopicNameExtractor instances to multiple sink nodes?
>
> Thanks,
> Bill
>
>
>
> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang  wrote:
>
> > Yup I agree, generally speaking the `toString()` output is not
> recommended
> > to be relied on programmatically in user's code, but we've observed
> > convenience-beats-any-other-reasons again and again in development
> > unfortunately. I think we should still not claim it is part of the
> > public APIs that would not be changed anyhow in the future, but just
> > mentioning it in the wiki for people to be aware is fine.
> >
> >
> > Guozhang
> >
> > On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax 
> > wrote:
> >
> > > Thanks for the KIP!
> > >
> > > I don't have any further comments.
> > >
> > > For Guozhang's comment: if we mention anything about `toString()`, we
> > > should make explicit that `toString()` output is still not public
> > > contract and users should not rely on the output.
> > >
> > > Furthermore, for the actual user-facing output, I would replace "topic:" by
> > > "extractor class:" to make the difference obvious.
> > >
> > > I am just hoping that people actually do not rely on `toString()`, which
> > > defeats the purpose of the `TopologyDescription` class that was
> > > introduced to avoid the dependency... (Just a side comment, not really
> > > related to this KIP proposal itself).
> > >
> > >
> > > If there are no further comments in the next days, feel free to start
> > > the VOTE and open a PR.
> > >
> > >
> > >
> > >
> > > -Matthias
> > >
> > > On 6/22/18 6:04 PM, Guozhang Wang wrote:
> > > > Thanks for writing the KIP!
> > > >
> > > > I'm +1 on the proposed changes over all. One minor suggestion: we
> > should
> > > > also mention that the `Sink#toString` will also be updated, in a way
> > that
> > > > if `topic()` returns null, use the other call, etc. This is because
> > > > although we do not explicitly state the following logic as public
> > > protocols:
> > > >
> > > > ```
> > > >
> > > > "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
> > > > nodeNames(predecessors);
> > > >
> > > >
> > > > ```
> > > >
> > > > There are already some users that rely on
> > `topology.describe().toString(
> > > )`
> > > > to have runtime checks on the returned string values, so changing
> this
> > > > means that their app will break and hence need to make code changes.
> > > >
> > > > Guozhang
> > > >
> > > > On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
> > nishanth...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > >> Hello Everyone,
> > > >>
> > > >> I have created a new KIP to discuss extending TopologyDescription.
> You
> > > can
> > > >> find it here:
> > > >>
> > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >> 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
> > > >>
> > > >> Please provide any feedback that you might have.
> > > >>
> > > >> Best,
> > > >> Nishanth Pradeep
> > > >>
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-06-25 Thread Guozhang Wang
Good catch. I think the proposed change is to add that function in the
InternalTopologyBuilder#Sink class.



Guozhang
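A rough sketch of what that could look like (hypothetical field and method names; the real Sink description also appends its predecessors to `toString()`, omitted here): `toString()` keeps the existing "topic:" form when a static topic is set and falls back to the extractor class when the topic is chosen at runtime.

```java
// Hypothetical sketch of the proposed Sink change, for illustration only.
class Sink {
    private final String name;
    private final String topic;              // null when a dynamic extractor is used
    private final Object topicNameExtractor; // stand-in for TopicNameExtractor

    Sink(String name, String topic, Object topicNameExtractor) {
        this.name = name;
        this.topic = topic;
        this.topicNameExtractor = topicNameExtractor;
    }

    String topic() { return topic; }

    Object topicNameExtractor() { return topicNameExtractor; }

    @Override
    public String toString() {
        // Keep the existing "topic:" form for static topics; print the
        // extractor class instead when the topic is chosen at runtime.
        if (topic != null) {
            return "Sink: " + name + " (topic: " + topic + ")";
        }
        return "Sink: " + name + " (extractor class: "
            + topicNameExtractor.getClass().getName() + ")";
    }

    public static void main(String[] args) {
        System.out.println(new Sink("output", "events", null));
        System.out.println(new Sink("output", null, new Object()));
    }
}
```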

On Mon, Jun 25, 2018 at 2:23 PM, Bill Bejeck  wrote:

> Thanks for the KIP!
>
> Overall I'm +1 on the KIP.   I have one question.
>
> The KIP states that the method "topicNameExtractor()" is added to the
> InternalTopologyBuilder.java.
>
> It could be that I'm missing something, but how does this work if a user
> has provided different TopicNameExtractor instances to multiple sink nodes?
>
> Thanks,
> Bill
>
>
>
> On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang  wrote:
>
> > Yup I agree, generally speaking the `toString()` output is not
> recommended
> > to be relied on programmatically in user's code, but we've observed
> > convenience-beats-any-other-reasons again and again in development
> > unfortunately. I think we should still not claim it is part of the
> > public APIs that will not be changed in the future, but just
> > mentioning it in the wiki for people to be aware is fine.
> >
> >
> > Guozhang
> >
> > On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax 
> > wrote:
> >
> > > Thanks for the KIP!
> > >
> > > I don't have any further comments.
> > >
> > > For Guozhang's comment: if we mention anything about `toString()`, we
> > > should make explicit that `toString()` output is still not public
> > > contract and users should not rely on the output.
> > >
> > > Furthermore, for the actual user-facing output, I would replace "topic:" with
> > > "extractor class:" to make the difference obvious.
> > >
> > > I am just hoping that people actually do not rely on `toString()`, which
> > > would defeat the purpose of the `TopologyDescription` class that was
> > > introduced to avoid that dependency... (Just a side comment, not really
> > > related to this KIP proposal itself.)
> > >
> > >
> > > If there are no further comments in the next days, feel free to start
> > > the VOTE and open a PR.
> > >
> > >
> > >
> > >
> > > -Matthias
> > >
> > > On 6/22/18 6:04 PM, Guozhang Wang wrote:
> > > > Thanks for writing the KIP!
> > > >
> > > > I'm +1 on the proposed changes overall. One minor suggestion: we should
> > > > also mention that `Sink#toString` will also be updated, in such a way
> > > > that if `topic()` returns null, the extractor is printed instead, etc.
> > > > This is because
> > > > although we do not explicitly state the following logic as public
> > > protocols:
> > > >
> > > > ```
> > > >
> > > > "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
> > > > nodeNames(predecessors);
> > > >
> > > >
> > > > ```
> > > >
> > > > There are already some users that rely on
> > > > `topology.describe().toString()` to have runtime checks on the returned
> > > > string values, so changing this means that their apps will break and
> > > > they will need to make code changes.
> > > >
> > > > Guozhang
> > > >
> > > > On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
> > nishanth...@gmail.com
> > > >
> > > > wrote:
> > > >
> > > >> Hello Everyone,
> > > >>
> > > >> I have created a new KIP to discuss extending TopologyDescription.
> You
> > > can
> > > >> find it here:
> > > >>
> > > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > >> 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
> > > >>
> > > >> Please provide any feedback that you might have.
> > > >>
> > > >> Best,
> > > >> Nishanth Pradeep
> > > >>
> > > >
> > > >
> > > >
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>



-- 
-- Guozhang
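A hedged sketch of the `Sink#toString` fallback discussed in this thread, in plain Java. The `Sink` and `TopicNameExtractor` types here are simplified stand-ins, and the exact output format (in particular the "extractor class:" wording) is an assumption based on Matthias's suggestion, not the final implementation.

```java
public class SinkToStringSketch {
    public interface TopicNameExtractor { }

    public static class Sink {
        final String name;
        final String topic;                  // null when an extractor is used
        final TopicNameExtractor extractor;  // null when a fixed topic is used

        public Sink(final String name, final String topic, final TopicNameExtractor extractor) {
            this.name = name;
            this.topic = topic;
            this.extractor = extractor;
        }

        @Override
        public String toString() {
            // Keep the existing shape for fixed topics...
            if (topic != null) {
                return "Sink: " + name + " (topic: " + topic + ")";
            }
            // ...and fall back to the extractor's class otherwise.
            return "Sink: " + name + " (extractor class: " + extractor.getClass().getName() + ")";
        }
    }

    public static void main(final String[] args) {
        System.out.println(new Sink("KSTREAM-SINK-01", "output-topic", null));
        System.out.println(new Sink("KSTREAM-SINK-02", null, new TopicNameExtractor() { }));
    }
}
```

As noted above, any app that parses the old "topic:" form from `toString()` would need a code change under this scheme.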


Re: [DISCUSS] KIP-323: Schedulable KTable as Graph source

2018-06-25 Thread Guozhang Wang
Flávio, thanks for creating this KIP.

I think this "single-aggregation" use case is common enough that we should
consider how to support it efficiently: for example, in KSQL, which is built
on top of Streams, we've seen lots of query statements whose result is
expected to be a single row indicating the "total aggregate", etc. See
https://github.com/confluentinc/ksql/issues/430 for details.

I've not read through https://issues.apache.org/jira/browse/KAFKA-6953, but
I'm wondering if we have discussed the option of supporting it in a
"pre-aggregate" manner: that is, we do partial aggregates on parallel tasks
and then send the partially aggregated values via a single topic partition
for the final aggregate, reducing the traffic on that single partition and
hence the final-aggregate workload.
Of course, for non-commutative aggregates we'd probably need to provide
another API in addition to aggregate, like the `merge` function for
session-based aggregates, to let users customize the operations of merging
two partial aggregates into a single partial aggregate. What are its pros and
cons compared with the current proposal?
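The pre-aggregate idea can be sketched in plain Java, outside the Streams DSL. Method names like `partialSum` and `mergePartials` are illustrative, not proposed API: each parallel task reduces its partition to one small partial value, and only those partials cross the single final partition.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.BinaryOperator;

public class PreAggregateSketch {
    // Phase 1: each parallel task folds its own partition into one partial value.
    public static long partialSum(final List<Long> partition) {
        long sum = 0L;
        for (final long v : partition) {
            sum += v;
        }
        return sum;
    }

    // Phase 2: one final task merges the partials; for non-commutative
    // aggregates this merger would be the user-supplied `merge` function.
    public static long mergePartials(final List<Long> partials, final BinaryOperator<Long> merger) {
        long result = 0L;
        for (final long p : partials) {
            result = merger.apply(result, p);
        }
        return result;
    }

    public static void main(final String[] args) {
        final List<List<Long>> partitions =
                Arrays.asList(Arrays.asList(1L, 2L), Arrays.asList(3L), Arrays.asList(4L, 5L));
        long total = 0L;
        for (final List<Long> partition : partitions) {
            // Only the small partial value crosses the single final partition.
            total = mergePartials(Arrays.asList(total, partialSum(partition)), Long::sum);
        }
        System.out.println(total); // 15
    }
}
```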


Guozhang

On Mon, Jun 25, 2018 at 3:12 PM, Ted Yu  wrote:

> This would be a useful feature.
>
> In the Public Interfaces section, the new method lacks a closing
> parenthesis.
>
> In the Proposed Changes section, if the order of the 3 bullets can match
> the order of the parameters of the new method, it would be easier to read.
>
> For Rejected Alternatives #2, can you add a sentence saying why it was
> rejected?
>
> Cheers
>
> On Mon, Jun 25, 2018 at 10:13 AM, Flávio Stutz 
> wrote:
>
> > Hey, guys, I've just started a KIP discussion here:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 323%3A+Schedulable+KTable+as+Graph+source
> >
>



-- 
-- Guozhang


Re: [DISCUSS] KIP-323: Schedulable KTable as Graph source

2018-06-25 Thread Ted Yu
This would be a useful feature.

In the Public Interfaces section, the new method lacks a closing
parenthesis.

In the Proposed Changes section, if the order of the 3 bullets can match
the order of the parameters of the new method, it would be easier to read.

For Rejected Alternatives #2, can you add a sentence saying why it was
rejected?

Cheers

On Mon, Jun 25, 2018 at 10:13 AM, Flávio Stutz 
wrote:

> Hey, guys, I've just started a KIP discussion here:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 323%3A+Schedulable+KTable+as+Graph+source
>


Re: [VOTE] KIP-319: Replace numSegments to segmentInterval in Streams window configurations

2018-06-25 Thread Guozhang Wang
+1

On Mon, Jun 25, 2018 at 2:58 PM, Ted Yu  wrote:

> +1
>
> On Mon, Jun 25, 2018 at 2:56 PM, John Roesler  wrote:
>
> > Hello All,
> >
> > Thanks for the discussion on KIP-319. I'd now like to start the voting.
> >
> > As a reminder, KIP-319 proposes a fix to an issue I identified in
> > KAFKA-7080. Specifically, the issue is that we're creating
> > CachingWindowStore with the *number of segments* instead of the *segment
> > size*.
> >
> > Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> > Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
> >
> > Additionally, here's a draft PR for clarity:
> > https://github.com/apache/kafka/pull/5257
> >
> > Thanks,
> > -John
> >
>



-- 
-- Guozhang
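To illustrate the unit confusion KIP-319 fixes, here is a plain-Java sketch. The derivation of an interval from a retention period and a segment count is an assumption about how segmented stores size their segments, not a quote of the actual Streams code:

```java
public class SegmentUnitsSketch {
    // One plausible way to derive a segment interval (in ms) from a retention
    // period and a segment count; the exact formula is an assumption here.
    public static long segmentInterval(final long retentionMs, final int numSegments) {
        return Math.max(retentionMs / (numSegments - 1), 1L);
    }

    public static void main(final String[] args) {
        final long retentionMs = 60_000L;
        final int numSegments = 3;
        // Intended: pass the interval (30000 ms). The bug passed the count (3)
        // instead, as if a segment count were a number of milliseconds.
        System.out.println("intended segment interval ms: " + segmentInterval(retentionMs, numSegments));
        System.out.println("value passed by mistake:      " + numSegments);
    }
}
```

Because both values are plain numbers, the compiler cannot catch the mix-up, which is why the KIP renames the parameter to make the unit explicit.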


Re: [VOTE] KIP-319: Replace numSegments to segmentInterval in Streams window configurations

2018-06-25 Thread Ted Yu
+1

On Mon, Jun 25, 2018 at 2:56 PM, John Roesler  wrote:

> Hello All,
>
> Thanks for the discussion on KIP-319. I'd now like to start the voting.
>
> As a reminder, KIP-319 proposes a fix to an issue I identified in
> KAFKA-7080. Specifically, the issue is that we're creating
> CachingWindowStore with the *number of segments* instead of the *segment
> size*.
>
> Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
> Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ
>
> Additionally, here's a draft PR for clarity:
> https://github.com/apache/kafka/pull/5257
>
> Thanks,
> -John
>


[VOTE] KIP-319: Replace numSegments to segmentInterval in Streams window configurations

2018-06-25 Thread John Roesler
Hello All,

Thanks for the discussion on KIP-319. I'd now like to start the voting.

As a reminder, KIP-319 proposes a fix to an issue I identified in
KAFKA-7080. Specifically, the issue is that we're creating
CachingWindowStore with the *number of segments* instead of the *segment
size*.

Here's the jira: https://issues.apache.org/jira/browse/KAFKA-7080
Here's the KIP: https://cwiki.apache.org/confluence/x/mQU0BQ

Additionally, here's a draft PR for clarity:
https://github.com/apache/kafka/pull/5257

Thanks,
-John


Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests

2018-06-25 Thread Lucas Wang
Hi Harsha,

If I understand correctly, the replication quota mechanism proposed in
KIP-73 can be helpful in that scenario.
Have you tried it out?

Thanks,
Lucas
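The two-queue idea from this thread can be modeled with a minimal plain-Java sketch. Queue names and sizes are illustrative, and the real broker wires this through its request channel and a dedicated config rather than a helper class like this:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

public class TwoQueueSketch {
    // Capacities are illustrative; the KIP adds a separate config for the control queue.
    final BlockingQueue<String> controlQueue = new ArrayBlockingQueue<>(20);
    final BlockingQueue<String> dataQueue = new ArrayBlockingQueue<>(500);

    // Always drain control-plane requests first. Eno's caveat still holds:
    // a data request that is already being processed is not preempted.
    public String nextRequest() throws InterruptedException {
        final String control = controlQueue.poll();
        if (control != null) {
            return control;
        }
        return dataQueue.poll(300, TimeUnit.MILLISECONDS);
    }

    public static void main(final String[] args) throws InterruptedException {
        final TwoQueueSketch handler = new TwoQueueSketch();
        handler.dataQueue.add("ProduceRequest");
        handler.controlQueue.add("LeaderAndIsrRequest");
        System.out.println(handler.nextRequest()); // LeaderAndIsrRequest
        System.out.println(handler.nextRequest()); // ProduceRequest
    }
}
```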

On Sun, Jun 24, 2018 at 8:28 AM, Harsha  wrote:

> Hi Lucas,
>  One more question: any thoughts on making this configurable,
> and also on allowing a subset of data requests to be prioritized? For
> example, we notice in our cluster that when we take out a broker and bring
> a new one in, it will try to become a follower and send a lot of fetch
> requests to other leaders in the cluster. This will negatively affect the
> application/client requests. We are also exploring a similar solution to
> de-prioritize fetch requests when a new replica comes in; we are ok with
> the replica taking time, but the leaders should prioritize the client
> requests.
>
>
> Thanks,
> Harsha
>
> On Fri, Jun 22nd, 2018 at 11:35 AM Lucas Wang wrote:
>
> >
> >
> >
> > Hi Eno,
> >
> > Sorry for the delayed response.
> > - I haven't implemented the feature yet, so no experimental results so
> > far.
> > And I plan to test in out in the following days.
> >
> > - You are absolutely right that the priority queue does not completely
> > prevent
> > data requests being processed ahead of controller requests.
> > That being said, I expect it to greatly mitigate the effect of stable
> > metadata.
> > In any case, I'll try it out and post the results when I have it.
> >
> > Regards,
> > Lucas
> >
> > On Wed, Jun 20, 2018 at 5:44 AM, Eno Thereska < eno.there...@gmail.com >
> > wrote:
> >
> > > Hi Lucas,
> > >
> > > Sorry for the delay, just had a look at this. A couple of questions:
> > > - did you notice any positive change after implementing this KIP? I'm
> > > wondering if you have any experimental results that show the benefit of
> > the
> > > two queues.
> > >
> > > - priority is usually not sufficient in addressing the problem the KIP
> > > identifies. Even with priority queues, you will sometimes (often?) have
> > the
> > > case that data plane requests will be ahead of the control plane
> > requests.
> > > This happens because the system might have already started processing
> > the
> > > data plane requests before the control plane ones arrived. So it would
> > be
> > > good to know what % of the problem this KIP addresses.
> > >
> > > Thanks
> > > Eno
> > >
> >
> >
> > > On Fri, Jun 15, 2018 at 4:44 PM, Ted Yu < yuzhih...@gmail.com > wrote:
> > >
> > > > Change looks good.
> > > >
> > > > Thanks
> > > >
> > > > On Fri, Jun 15, 2018 at 8:42 AM, Lucas Wang < lucasatu...@gmail.com
> >
> > > wrote:
> > > >
> > > > > Hi Ted,
> > > > >
> > > > > Thanks for the suggestion. I've updated the KIP. Please take
> another
> >
> > > > look.
> > > > >
> > > > > Lucas
> > > > >
> > > > > On Thu, Jun 14, 2018 at 6:34 PM, Ted Yu < yuzhih...@gmail.com >
> > wrote:
> > > > >
> > > > > > Currently in KafkaConfig.scala :
> > > > > >
> > > > > > val QueuedMaxRequests = 500
> > > > > >
> > > > > > It would be good if you can include the default value for this
> new
> >
> > > > config
> > > > > > in the KIP.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > > On Thu, Jun 14, 2018 at 4:28 PM, Lucas Wang <
> lucasatu...@gmail.com
> > >
> > > > > wrote:
> > > > > >
> > > > > > > Hi Ted, Dong
> > > > > > >
> > > > > > > I've updated the KIP by adding a new config, instead of reusing
> > the
> > > > > > > existing one.
> > > > > > > Please take another look when you have time. Thanks a lot!
> > > > > > >
> > > > > > > Lucas
> > > > > > >
> > > > > > > On Thu, Jun 14, 2018 at 2:33 PM, Ted Yu < yuzhih...@gmail.com
> >
> > > wrote:
> > > > > > >
> > > > > > > > bq. that's a waste of resource if control request rate is low
> > > > > > > >
> > > > > > > > I don't know if control request rate can get to 100,000,
> > likely
> > > > not.
> > > > > > Then
> > > > > > > > using the same bound as that for data requests seems high.
> > > > > > > >
> > > > > > > > On Wed, Jun 13, 2018 at 10:13 PM, Lucas Wang <
> > > > lucasatu...@gmail.com >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Ted,
> > > > > > > > >
> > > > > > > > > Thanks for taking a look at this KIP.
> > > > > > > > > Let's say today the setting of "queued.max.requests" in
> > > cluster A
> > > > > is
> > > > > > > > 1000,
> > > > > > > > > while the setting in cluster B is 100,000.
> > > > > > > > > The 100 times difference might have indicated that machines
> > in
> > > > > > cluster
> > > > > > > B
> > > > > > > > > have larger memory.
> > > > > > > > >
> > > > > > > > > By reusing the "queued.max.requests", the
> > controlRequestQueue
> > > in
> > > > > > > cluster
> > > > > > > > B
> > > > > > > > > automatically
> > > > > > > > > gets a 100x capacity without explicitly bothering the
> > > operators.
> > > > > > > > > I understand the counter argument can be that maybe that's
> a
> >
> > > > waste
> > > > > of
> > > > > > > > > resource if control request
> > > > > > > > > rate is low and operators may want to fine tune the
> 

[jira] [Resolved] (KAFKA-6978) Make Streams Window retention time strict

2018-06-25 Thread John Roesler (JIRA)


 [ 
https://issues.apache.org/jira/browse/KAFKA-6978?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Roesler resolved KAFKA-6978.
-
   Resolution: Fixed
Fix Version/s: 2.1.0

This feature was merged in 
https://github.com/apache/kafka/commit/954be11bf2d3dc9fa11a69830d2ef5ff580ff533

> Make Streams Window retention time strict
> -
>
> Key: KAFKA-6978
> URL: https://issues.apache.org/jira/browse/KAFKA-6978
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: John Roesler
>Assignee: John Roesler
>Priority: Major
> Fix For: 2.1.0
>
>
> Currently, the configured retention time for windows is a lower bound. We 
> actually keep the window around until it's time to roll a new segment. At 
> that time, we drop all windows in the oldest segment.
> As long as a window is still in a segment, we will continue to add 
> late-arriving records to it and also serve IQ queries from it. This is sort 
> of nice, because it makes optimistic use of the fact that the windows live 
> for some time after their retention expires. However, it is also a source of 
> (apparent) non-determinism, and it's arguably better for programmability if we
> adhere strictly to the configured constraints.
> Therefore, the new behavior will be:
>  * once the retention time for a window passes, Streams will drop any 
> later-arriving records (with a warning log and a metric)
>  * likewise, IQ will first check whether the window is younger than its 
> retention time before answering queries.
> No changes need to be made to the underlying segment management, this is 
> purely to make the behavior more strict wrt the configuration.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
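The strict-retention check described in the ticket can be sketched as follows. The window/stream-time model is simplified and `withinRetention` is an illustrative helper, not the actual Streams code:

```java
public class StrictRetentionSketch {
    static final long RETENTION_MS = 60_000L;

    // A window is still live only while its end is within the retention
    // period of the current stream time, independent of segment rollover.
    public static boolean withinRetention(final long windowEndMs, final long streamTimeMs) {
        return windowEndMs >= streamTimeMs - RETENTION_MS;
    }

    public static void main(final String[] args) {
        final long streamTime = 200_000L;
        // Accept: window still within retention.
        System.out.println(withinRetention(150_000L, streamTime)); // true
        // Drop late record (with a warning log and a metric) and refuse IQ:
        // window expired, even if its segment has not been rolled yet.
        System.out.println(withinRetention(100_000L, streamTime)); // false
    }
}
```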


Re: SASL Unit test failing

2018-06-25 Thread Ted Yu
I ran the test alone which passed.

Can you include -i on the command line to see if there is some clue in
the output?

Here is my environment:

Java version: 1.8.0_151, vendor: Oracle Corporation
Java home:
/Library/Java/JavaVirtualMachines/jdk1.8.0_151.jdk/Contents/Home/jre
Default locale: en_US, platform encoding: UTF-8
OS name: "mac os x", version: "10.11.3", arch: "x86_64", family: "mac"

FYI

On Mon, Jun 25, 2018 at 12:59 PM, Ahmed A  wrote:

> Hello,
>
> I did a fresh clone of the kafka src code, and the following SASL unit
> tests have been failing consistently:
> - testMechanismPluggability
> - testMechanismPluggability
> - testMultipleServerMechanisms
>
> All three tests have similar stack trace:
> at org.junit.Assert.assertTrue(Assert.java:52)
> at
> org.apache.kafka.common.network.NetworkTestUtils.waitForChannelReady(
> NetworkTestUtils.java:79)
> at
> org.apache.kafka.common.network.NetworkTestUtils.checkClientConnection(
> NetworkTestUtils.java:52)
>
> I also noticed, the three tests are using digest-md5.
>
> Has anyone else run into a similar issue or have any ideas for the failure?
>
> Thank you,
> Ahmed.
>


Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

2018-06-25 Thread John Roesler
Ah, it turns out I did create a ticket: it's KAFKA-7080:
https://issues.apache.org/jira/browse/KAFKA-7080

-John

On Mon, Jun 25, 2018 at 4:44 PM John Roesler  wrote:

> Matthias,
>
> That's a good idea. I'm not sure why I didn't...
>
> Thanks,
> -John
>
> On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax 
> wrote:
>
>> Ok.
>>
>> @John: can you create a JIRA to track this? I think KAFKA-4730 is
>> related, but actually an own ticket (that is blocked by not having
>> Materialized for stream-stream joins).
>>
>>
>> -Matthias
>>
>> On 6/25/18 2:10 PM, Bill Bejeck wrote:
>> > I agree that it makes sense to have segmentInterval as a parameter to a
>> > store, but I also agree with Guozhang's point about not moving as part
>> of
>> > this KIP.
>> >
>> > Thanks,
>> > Bill
>> >
>> > On Mon, Jun 25, 2018 at 4:17 PM John Roesler  wrote:
>> >
>> >> Thanks Matthias and Guozhang,
>> >>
>> >> About deprecating the "segments" field instead of making it private.
>> Yes, I
>> >> just took another look at the code, and that is correct. I'll update
>> the
>> >> KIP.
>> >>
>> >> I do agree that in the long run, it makes more sense as a parameter to
>> the
>> >> store somehow than as a parameter to the window. I think this isn't a
>> super
>> >> high priority, though, because it's not exposed in the DSL (or it
>> wasn't
>> >> intended to be).
>> >>
>> >> I felt Guozhang's point is valid, and that we should probably revisit
>> it
>> >> later, possibly in the scope of
>> >> https://issues.apache.org/jira/browse/KAFKA-4730
>> >>
>> >> I'll wait an hour or so for more feedback before moving on to a vote.
>> >>
>> >> Thanks again,
>> >> -John
>> >>
>> >> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang 
>> wrote:
>> >>
>> >>> Re `segmentInterval` parameter in Windows: currently it is used in two
>> >>> places, the windowed stream aggregation, and the stream-stream joins.
>> For
>> >>> the former, we can potentially move the parameter from windowedBy() to
>> >>> Materialized, but for the latter we currently do not expose a
>> >> Materialized
>> >>> object yet, only the Windows spec. So I think in this KIP we probably
>> >>> cannot move it immediately.
>> >>>
>> >>> But in future KIPs if we decide to expose the stream-stream join's
>> store
>> >> /
>> >>> changelog / repartition topic names, we may well add the
>> Materialized
>> >>> object into the operator, and we can then move the parameter to
>> >>> Materialized by then.
>> >>>
>> >>>
>> >>> Guozhang
>> >>>
>> >>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
>> matth...@confluent.io>
>> >>> wrote:
>> >>>
>>  Thanks for the KIP. Overall, I think it makes sense to clean up the
>> >> API.
>> 
>>  Couple of comments:
>> 
>> > Sadly there's no way to "deprecate" this
>> > exposure
>> 
>>  I disagree. We can just mark the variable as deprecated and I
>> advocate
>>  to do this. When the deprecation period passed, we can make it
>> private
>>  (or actually remove it; cf. my next comment).
>> 
>> 
>>  Parameter, `segmentInterval` is semantically not a "window"
>>  specification parameter but an implementation detail and thus a store
>>  parameter. Would it be better to add it to `Materialized`?
>> 
>> 
>>  -Matthias
>> 
>>  On 6/22/18 5:13 PM, Guozhang Wang wrote:
>> > Thanks John.
>> >
>> > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler 
>> >>> wrote:
>> >
>> >> Thanks for the feedback, Bill and Guozhang,
>> >>
>> >> I've updated the KIP accordingly.
>> >>
>> >> Thanks,
>> >> -John
>> >>
>> >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang 
>>  wrote:
>> >>
>> >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
>> >> the
>> >> wiki:
>> >>> the `In Windows, we will:` section code snippet is empty.
>> >>>
>> >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck 
>>  wrote:
>> >>>
>>  Hi John,
>> 
>>  Thanks for the KIP, and overall it's a +1 for me.
>> 
>>  In the JavaDoc for the segmentInterval method, there's no mention
>> >> of
>> >> the
>>  number of segments there can be at any one time.  While it's
>> >> implied
>> >> that
>>  the number of segments is potentially unbounded, it would be better
>> >> to
>>  explicitly state that the previous limit on the number of
>> segments
>> >>> is
>> >>> going
>>  to be removed as well?
>> 
>>  I have a couple of nit comments.   The method name is still
>>  segmentSize
>> >>> in
>>  the code block vs segmentInterval and the order of the parameters
>> >>> for
>> >> the
>>  third persistentWindowStore don't match the order in the JavaDoc.
>> 
>>  Thanks,
>>  Bill
>> 
>> 
>> 
>>  On Thu, Jun 21, 2018 at 3:32 PM John Roesler 
>> >> wrote:
>> 
>> > I've updated the KIP and 

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

2018-06-25 Thread John Roesler
Matthias,

That's a good idea. I'm not sure why I didn't...

Thanks,
-John

On Mon, Jun 25, 2018 at 4:35 PM Matthias J. Sax 
wrote:

> Ok.
>
> @John: can you create a JIRA to track this? I think KAFKA-4730 is
> related, but actually an own ticket (that is blocked by not having
> Materialized for stream-stream joins).
>
>
> -Matthias
>
> On 6/25/18 2:10 PM, Bill Bejeck wrote:
> > I agree that it makes sense to have segmentInterval as a parameter to a
> > store, but I also agree with Guozhang's point about not moving as part of
> > this KIP.
> >
> > Thanks,
> > Bill
> >
> > On Mon, Jun 25, 2018 at 4:17 PM John Roesler  wrote:
> >
> >> Thanks Matthias and Guozhang,
> >>
> >> About deprecating the "segments" field instead of making it private.
> Yes, I
> >> just took another look at the code, and that is correct. I'll update the
> >> KIP.
> >>
> >> I do agree that in the long run, it makes more sense as a parameter to
> the
> >> store somehow than as a parameter to the window. I think this isn't a
> super
> >> high priority, though, because it's not exposed in the DSL (or it wasn't
> >> intended to be).
> >>
> >> I felt Guozhang's point is valid, and that we should probably revisit it
> >> later, possibly in the scope of
> >> https://issues.apache.org/jira/browse/KAFKA-4730
> >>
> >> I'll wait an hour or so for more feedback before moving on to a vote.
> >>
> >> Thanks again,
> >> -John
> >>
> >> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang 
> wrote:
> >>
> >>> Re `segmentInterval` parameter in Windows: currently it is used in two
> >>> places, the windowed stream aggregation, and the stream-stream joins.
> For
> >>> the former, we can potentially move the parameter from windowedBy() to
> >>> Materialized, but for the latter we currently do not expose a
> >> Materialized
> >>> object yet, only the Windows spec. So I think in this KIP we probably
> >>> cannot move it immediately.
> >>>
> >>> But in future KIPs if we decide to expose the stream-stream join's
> store
> >> /
> >>> changelog / repartition topic names, we may well add the
> Materialized
> >>> object into the operator, and we can then move the parameter to
> >>> Materialized by then.
> >>>
> >>>
> >>> Guozhang
> >>>
> >>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax <
> matth...@confluent.io>
> >>> wrote:
> >>>
>  Thanks for the KIP. Overall, I think it makes sense to clean up the
> >> API.
> 
>  Couple of comments:
> 
> > Sadly there's no way to "deprecate" this
> > exposure
> 
>  I disagree. We can just mark the variable as deprecated and I advocate
>  to do this. When the deprecation period passed, we can make it private
>  (or actually remove it; cf. my next comment).
> 
> 
>  Parameter, `segmentInterval` is semantically not a "window"
>  specification parameter but an implementation detail and thus a store
>  parameter. Would it be better to add it to `Materialized`?
> 
> 
>  -Matthias
> 
>  On 6/22/18 5:13 PM, Guozhang Wang wrote:
> > Thanks John.
> >
> > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler 
> >>> wrote:
> >
> >> Thanks for the feedback, Bill and Guozhang,
> >>
> >> I've updated the KIP accordingly.
> >>
> >> Thanks,
> >> -John
> >>
> >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang 
>  wrote:
> >>
> >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
> >> the
> >> wiki:
> >>> the `In Windows, we will:` section code snippet is empty.
> >>>
> >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck 
>  wrote:
> >>>
>  Hi John,
> 
>  Thanks for the KIP, and overall it's a +1 for me.
> 
>  In the JavaDoc for the segmentInterval method, there's no mention
> >> of
> >> the
>  number of segments there can be at any one time.  While it's
> >> implied
> >> that
>  the number of segments is potentially unbounded, it would be better
> >> to
>  explicitly state that the previous limit on the number of segments
> >>> is
> >>> going
>  to be removed as well?
> 
>  I have a couple of nit comments.   The method name is still
>  segmentSize
> >>> in
>  the code block vs segmentInterval and the order of the parameters
> >>> for
> >> the
>  third persistentWindowStore don't match the order in the JavaDoc.
> 
>  Thanks,
>  Bill
> 
> 
> 
>  On Thu, Jun 21, 2018 at 3:32 PM John Roesler 
> >> wrote:
> 
> > I've updated the KIP and draft PR accordingly.
> >
> > On Thu, Jun 21, 2018 at 2:03 PM John Roesler 
> >>> wrote:
> >
> >> Interesting... I did not initially consider it because I didn't
> >> want
> >>> to
> >> have an impact on anyone's Streams apps, but now I see that
> >> unless
> >> developers have subclassed 

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

2018-06-25 Thread Matthias J. Sax
Ok.

@John: can you create a JIRA to track this? I think KAFKA-4730 is
related, but actually an own ticket (that is blocked by not having
Materialized for stream-stream joins).


-Matthias

On 6/25/18 2:10 PM, Bill Bejeck wrote:
> I agree that it makes sense to have segmentInterval as a parameter to a
> store, but I also agree with Guozhang's point about not moving as part of
> this KIP.
> 
> Thanks,
> Bill
> 
> On Mon, Jun 25, 2018 at 4:17 PM John Roesler  wrote:
> 
>> Thanks Matthias and Guozhang,
>>
>> About deprecating the "segments" field instead of making it private. Yes, I
>> just took another look at the code, and that is correct. I'll update the
>> KIP.
>>
>> I do agree that in the long run, it makes more sense as a parameter to the
>> store somehow than as a parameter to the window. I think this isn't a super
>> high priority, though, because it's not exposed in the DSL (or it wasn't
>> intended to be).
>>
>> I felt Guozhang's point is valid, and that we should probably revisit it
>> later, possibly in the scope of
>> https://issues.apache.org/jira/browse/KAFKA-4730
>>
>> I'll wait an hour or so for more feedback before moving on to a vote.
>>
>> Thanks again,
>> -John
>>
>> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang  wrote:
>>
>>> Re `segmentInterval` parameter in Windows: currently it is used in two
>>> places, the windowed stream aggregation, and the stream-stream joins. For
>>> the former, we can potentially move the parameter from windowedBy() to
>>> Materialized, but for the latter we currently do not expose a
>> Materialized
>>> object yet, only the Windows spec. So I think in this KIP we probably
>>> cannot move it immediately.
>>>
>>> But in future KIPs if we decide to expose the stream-stream join's store
>> /
>>> changelog / repartition topic names, we may well add the Materialized
>>> object into the operator, and we can then move the parameter to
>>> Materialized by then.
>>>
>>>
>>> Guozhang
>>>
>>> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax 
>>> wrote:
>>>
 Thanks for the KIP. Overall, I think it makes sense to clean up the
>> API.

 Couple of comments:

> Sadly there's no way to "deprecate" this
> exposure

 I disagree. We can just mark the variable as deprecated and I advocate
 to do this. When the deprecation period passed, we can make it private
 (or actually remove it; cf. my next comment).


 Parameter, `segmentInterval` is semantically not a "window"
 specification parameter but an implementation detail and thus a store
 parameter. Would it be better to add it to `Materialized`?


 -Matthias

 On 6/22/18 5:13 PM, Guozhang Wang wrote:
> Thanks John.
>
> On Fri, Jun 22, 2018 at 5:05 PM, John Roesler 
>>> wrote:
>
>> Thanks for the feedback, Bill and Guozhang,
>>
>> I've updated the KIP accordingly.
>>
>> Thanks,
>> -John
>>
>> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang 
 wrote:
>>
>>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
>> the
>> wiki:
>>> the `In Windows, we will:` section code snippet is empty.
>>>
>>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck 
 wrote:
>>>
 Hi John,

 Thanks for the KIP, and overall it's a +1 for me.

 In the JavaDoc for the segmentInterval method, there's no mention
>> of
>> the
 number of segments there can be at any one time.  While it's
>> implied
>> that
 the number of segments is potentially unbounded, it would be better
>> to
 explicitly state that the previous limit on the number of segments
>>> is
>>> going
 to be removed as well?

 I have a couple of nit comments.   The method name is still
 segmentSize
>>> in
 the code block vs segmentInterval and the order of the parameters
>>> for
>> the
 third persistentWindowStore don't match the order in the JavaDoc.

 Thanks,
 Bill



 On Thu, Jun 21, 2018 at 3:32 PM John Roesler 
>> wrote:

> I've updated the KIP and draft PR accordingly.
>
> On Thu, Jun 21, 2018 at 2:03 PM John Roesler 
>>> wrote:
>
>> Interesting... I did not initially consider it because I didn't
>> want
>>> to
>> have an impact on anyone's Streams apps, but now I see that
>> unless
>> developers have subclassed `Windows`, the number of segments
>> would
 always
>> be 3!
>>
>> There's one caveat to this, which I think was a mistake. The
>> field
>> `segments` in Windows is public, which means that anyone can
>> actually
 set
>> it directly on any Window instance like:
>>
>> TimeWindows tw = TimeWindows.of(100);
>> tw.segments = 12345;
>>
>> Bypassing the 

Re: [DISCUSS]KIP-216: IQ should throw different exceptions for different errors

2018-06-25 Thread Matthias J. Sax
The scenario I had in mind was that KS is started in one thread while a
second thread has a reference to the object to issue queries.

If a query is issued before the "main thread" has started KS, and the "query
thread" knows that it will eventually get started, it can retry. On the
other hand, if KS is in state PENDING_SHUTDOWN or DEAD, it is impossible
to issue any query against it now or in the future and thus the error is
not retryable.


-Matthias
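A client-side retry loop for the proposed split could look like this sketch. The exception names follow the thread's proposal and are not (yet) part of any released API, and `StoreQuery` is an illustrative stand-in for whatever IQ call the user retries:

```java
public class IqRetrySketch {
    public static class StreamThreadRebalancingException extends RuntimeException { }
    public static class StreamThreadNotRunningException extends RuntimeException { }

    public interface StoreQuery<T> { T run(); }

    public static <T> T queryWithRetry(final StoreQuery<T> query, final int maxAttempts)
            throws InterruptedException {
        for (int i = 0; i < maxAttempts; i++) {
            try {
                return query.run();
            } catch (final StreamThreadRebalancingException retryable) {
                Thread.sleep(100); // transient: wait for the state to become RUNNING
            }
            // StreamThreadNotRunningException is NOT caught: PENDING_SHUTDOWN or
            // DEAD can never recover, so retrying would be pointless.
        }
        throw new StreamThreadNotRunningException();
    }

    public static void main(final String[] args) throws InterruptedException {
        final int[] calls = {0};
        final String value = queryWithRetry(() -> {
            if (calls[0]++ < 2) {
                throw new StreamThreadRebalancingException(); // first two attempts fail
            }
            return "value";
        }, 5);
        System.out.println(value); // value
    }
}
```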

On 6/25/18 10:15 AM, Guozhang Wang wrote:
> I'm wondering if StreamThreadNotStarted could be merged into
> StreamThreadNotRunning, because I think users' handling logic for the third
> case would likely be the same as for the second. Do you have some scenarios
> where users may want to handle them differently?
> 
> Guozhang
> 
> On Sun, Jun 24, 2018 at 5:25 PM, Matthias J. Sax 
> wrote:
> 
>> Sorry to hear! Get well soon!
>>
>> It's not a big deal if the KIP stalls a little bit. Feel free to pick it
>> up again when you find time.
>>
> Is `StreamThreadNotRunningException` really a retryable error?

 When KafkaStream state is REBALANCING, I think it is a retryable error.

 StreamThreadStateStoreProvider#stores() will throw
 StreamThreadNotRunningException when StreamThread state is not
>> RUNNING. The
 user can retry until KafkaStream state is RUNNING.
>>
>> I see. If this is the intention, then I would suggest having two (or
>> maybe three) different exceptions:
>>
>>  - StreamThreadRebalancingException (retryable)
>>  - StreamThreadNotRunning (not retryable -- thrown if in state
>> PENDING_SHUTDOWN or DEAD)
>>  - maybe StreamThreadNotStarted (for state CREATED)
>>
>> The last one is tricky and could also be merged into one of the first
>> two, depending if you want to argue that it's retryable or not. (Just
>> food for thought -- not sure what others think.)
>>
>>
>>
>> -Matthias
>>
>> On 6/22/18 8:06 AM, vito jeng wrote:
>>> Matthias,
>>>
>>> Thank you for your assistance.
>>>
 what is the status of this KIP?
>>>
>>> Unfortunately, there has been no further progress.
>>> About seven weeks ago, I was injured playing sports and broke my left
>>> wrist. Much of my work has been affected, including this KIP and its
>>> implementation.
>>>
>>>
 I just re-read it, and have a couple of follow up comments. Why do we
 discuss the internal exceptions you want to add? Also, do we really need
 them? Can't we just throw the correct exception directly instead of
 wrapping it later?
>>>
>>> I think you may be right. As I say in the previous:
>>> "The original idea is that we can distinguish different state store
>>> exception for different handling. But to be honest, I am not quite sure
>>> this is necessary. Maybe have some change during implementation."
>>>
>>> During the implementation, I also felt we may not need to wrap it.
>>> We can just throw the correct exception directly.
>>>
>>>
Is `StreamThreadNotRunningException` really a retryable error?
>>>
>>> When KafkaStreams state is REBALANCING, I think it is a retryable error.
>>>
>>> StreamThreadStateStoreProvider#stores() will throw
>>> StreamThreadNotRunningException when StreamThread state is not RUNNING.
>> The
>>> user can retry until KafkaStreams state is RUNNING.
>>>
>>>
When would we throw a `StateStoreEmptyException`? The semantics are
>>> unclear to me atm.
>>>
 When the state is RUNNING, is `StateStoreClosedException` a retryable
>>> error?
>>>
>>> These two comments will be answered in another mail.
>>>
>>>
>>>
>>> ---
>>> Vito
>>>
>>> On Mon, Jun 11, 2018 at 8:12 AM, Matthias J. Sax 
>>> wrote:
>>>
 Vito,

 what is the status of this KIP?

 I just re-read it, and have a couple of follow up comments. Why do we
 discuss the internal exceptions you want to add? Also, do we really need
 them? Can't we just throw the correct exception directly instead of
 wrapping it later?

When would we throw a `StateStoreEmptyException`? The semantics are
 unclear to me atm.

Is `StreamThreadNotRunningException` really a retryable error?

 When the state is RUNNING, is `StateStoreClosedException` a retryable
 error?

One more nit: ReadOnlyWindowStore got a new method #fetch(K key, long
 time); that should be added


 Overall I like the KIP but some details are still unclear. Maybe it
might help if you open a PR in parallel?


 -Matthias

 On 4/24/18 8:18 AM, vito jeng wrote:
> Hi, Guozhang,
>
> Thanks for the comment!
>
>
>
> Hi, Bill,
>
> I'll try to make some update to make the KIP better.
>
> Thanks for the comment!
>
>
> ---
> Vito
>
> On Sat, Apr 21, 2018 at 5:40 AM, Bill Bejeck 
>> wrote:
>
>> Hi Vito,
>>
>> Thanks for the KIP, overall it's a +1 from me.
>>
>> At this point, the only thing I would change is possibly removing the
>> listing of all methods called by the 

Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-06-25 Thread Bill Bejeck
Thanks for the KIP!

Overall I'm +1 on the KIP.   I have one question.

The KIP states that the method "topicNameExtractor()" is added to the
InternalTopologyBuilder.java.

It could be that I'm missing something, but how does this work if a user
has provided different TopicNameExtractor instances to multiple sink nodes?

Thanks,
Bill



On Mon, Jun 25, 2018 at 1:25 PM Guozhang Wang  wrote:

> Yup I agree, generally speaking the `toString()` output is not recommended
> to be relied on programmatically in user's code, but we've observed
> convenience-beats-any-other-reasons again and again in development
> unfortunately. I think we should still not claim it is part of the
> public API that will not be changed in the future, but just
> mentioning it in the wiki for people to be aware is fine.
>
>
> Guozhang
>
> On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax 
> wrote:
>
> > Thanks for the KIP!
> >
> > I don't have any further comments.
> >
> > For Guozhang's comment: if we mention anything about `toString()`, we
> > should make explicit that `toString()` output is still not public
> > contract and users should not rely on the output.
> >
> > Furthermore, for the actual user-facing output, I would replace "topic:" by
> > "extractor class:" to make the difference obvious.
> >
> > I am just hoping that people actually do not rely on `toString()`, which
> > defeats the purpose of the `TopologyDescription` class that was
> > introduced to avoid the dependency... (Just a side comment, not really
> > related to this KIP proposal itself).
> >
> >
> > If there are no further comments in the next days, feel free to start
> > the VOTE and open a PR.
> >
> >
> >
> >
> > -Matthias
> >
> > On 6/22/18 6:04 PM, Guozhang Wang wrote:
> > > Thanks for writing the KIP!
> > >
> > > I'm +1 on the proposed changes over all. One minor suggestion: we
> should
> > > also mention that the `Sink#toString` will also be updated, in a way
> that
> > > if `topic()` returns null, use the other call, etc. This is because
> > > although we do not explicitly state the following logic as public
> > protocols:
> > >
> > > ```
> > >
> > > "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
> > > nodeNames(predecessors);
> > >
> > >
> > > ```
> > >
> > > There are already some users that rely on `topology.describe().toString()`
> > > to have runtime checks on the returned string values, so changing this
> > > means that their app will break and hence need to make code changes.
> > >
> > > Guozhang
> > >
> > > On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep <
> nishanth...@gmail.com
> > >
> > > wrote:
> > >
> > >> Hello Everyone,
> > >>
> > >> I have created a new KIP to discuss extending TopologyDescription. You
> > can
> > >> find it here:
> > >>
> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
> > >>
> > >> Please provide any feedback that you might have.
> > >>
> > >> Best,
> > >> Nishanth Pradeep
> > >>
> > >
> > >
> > >
> >
> >
>
>
> --
> -- Guozhang
>
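The `Sink#toString` change discussed above could look like the following self-contained sketch. This is a hypothetical stand-in for `TopologyDescription.Sink`'s describe logic (the predecessor-list part of the real format is omitted): when a `TopicNameExtractor` is used, `topic()` is null and the extractor class is printed under an "extractor class:" label, making the difference obvious to readers of the output.

```java
// Hypothetical minimal analogue of TopologyDescription.Sink's toString():
// exactly one of `topic` or `extractorClass` is non-null.
public class SinkDescription {
    private final String name;
    private final String topic;            // null if a TopicNameExtractor is used
    private final Class<?> extractorClass; // null if a fixed topic name is used

    SinkDescription(String name, String topic, Class<?> extractorClass) {
        this.name = name;
        this.topic = topic;
        this.extractorClass = extractorClass;
    }

    @Override
    public String toString() {
        // Fall back to the extractor class when no fixed topic is set.
        String target = topic != null
                ? "topic: " + topic
                : "extractor class: " + extractorClass.getName();
        return "Sink: " + name + " (" + target + ")";
    }

    public static void main(String[] args) {
        System.out.println(new SinkDescription("N1", "out-topic", null));
        System.out.println(new SinkDescription("N2", null, String.class));
    }
}
```

As the thread notes, any code doing runtime checks on the old `(topic: ...)` string would need updating after such a change.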


Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

2018-06-25 Thread Bill Bejeck
I agree that it makes sense to have segmentInterval as a parameter to a
store, but I also agree with Guozhang's point about not moving as part of
this KIP.

Thanks,
Bill

On Mon, Jun 25, 2018 at 4:17 PM John Roesler  wrote:

> Thanks Matthias and Guozhang,
>
> About deprecating the "segments" field instead of making it private. Yes, I
> just took another look at the code, and that is correct. I'll update the
> KIP.
>
> I do agree that in the long run, it makes more sense as a parameter to the
> store somehow than as a parameter to the window. I think this isn't a super
> high priority, though, because it's not exposed in the DSL (or it wasn't
> intended to be).
>
> I felt Guozhang's point is valid, and that we should probably revisit it
> later, possibly in the scope of
> https://issues.apache.org/jira/browse/KAFKA-4730
>
> I'll wait an hour or so for more feedback before moving on to a vote.
>
> Thanks again,
> -John
>
> On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang  wrote:
>
> > Re `segmentInterval` parameter in Windows: currently it is used in two
> > places, the windowed stream aggregation, and the stream-stream joins. For
> > the former, we can potentially move the parameter from windowedBy() to
> > Materialized, but for the latter we currently do not expose a
> Materialized
> > object yet, only the Windows spec. So I think in this KIP we probably
> > cannot move it immediately.
> >
> > But in future KIPs if we decide to expose the stream-stream join's store
> /
> changelog / repartition topic names, we may well add the Materialized
> > object into the operator, and we can then move the parameter to
> > Materialized by then.
> >
> >
> > Guozhang
> >
> > On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax 
> > wrote:
> >
> > > Thanks for the KIP. Overall, I think it makes sense to clean up the
> API.
> > >
> > > Couple of comments:
> > >
> > > > Sadly there's no way to "deprecate" this
> > > > exposure
> > >
> > > I disagree. We can just mark the variable as deprecated and I advocate
> > > to do this. When the deprecation period passed, we can make it private
> > > (or actually remove it; cf. my next comment).
> > >
> > >
> > > The parameter `segmentInterval` is semantically not a "window"
> > > specification parameter but an implementation detail and thus a store
> > > parameter. Would it be better to add it to `Materialized`?
> > >
> > >
> > > -Matthias
> > >
> > > On 6/22/18 5:13 PM, Guozhang Wang wrote:
> > > > Thanks John.
> > > >
> > > > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler 
> > wrote:
> > > >
> > > >> Thanks for the feedback, Bill and Guozhang,
> > > >>
> > > >> I've updated the KIP accordingly.
> > > >>
> > > >> Thanks,
> > > >> -John
> > > >>
> > > >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang 
> > > wrote:
> > > >>
> > > >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on
> the
> > > >> wiki:
> > > >>> the `In Windows, we will:` section code snippet is empty.
> > > >>>
> > > >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck 
> > > wrote:
> > > >>>
> > >  Hi John,
> > > 
> > >  Thanks for the KIP, and overall it's a +1 for me.
> > > 
> > >  In the JavaDoc for the segmentInterval method, there's no mention
> of
> > > >> the
> > >  number of segments there can be at any one time.  While it's
> implied
> > > >> that
> > >  the number of segments is potentially unbounded, would be better
> to
> > >  explicitly state that the previous limit on the number of segments
> > is
> > > >>> going
> > >  to be removed as well?
> > > 
> > >  I have a couple of nit comments.   The method name is still
> > > segmentSize
> > > >>> in
> > >  the code block vs segmentInterval and the order of the parameters
> > for
> > > >> the
> > >  third persistentWindowStore don't match the order in the JavaDoc.
> > > 
> > >  Thanks,
> > >  Bill
> > > 
> > > 
> > > 
> > >  On Thu, Jun 21, 2018 at 3:32 PM John Roesler 
> > > >> wrote:
> > > 
> > > > I've updated the KIP and draft PR accordingly.
> > > >
> > > > On Thu, Jun 21, 2018 at 2:03 PM John Roesler 
> > > >>> wrote:
> > > >
> > > >> Interesting... I did not initially consider it because I didn't
> > > >> want
> > > >>> to
> > > >> have an impact on anyone's Streams apps, but now I see that
> unless
> > > >> developers have subclassed `Windows`, the number of segments
> would
> > >  always
> > > >> be 3!
> > > >>
> > > >> There's one caveat to this, which I think was a mistake. The
> field
> > > >> `segments` in Windows is public, which means that anyone can
> > > >> actually
> > >  set
> > > >> it directly on any Window instance like:
> > > >>
> > > >> TimeWindows tw = TimeWindows.of(100);
> > > >> tw.segments = 12345;
> > > >>
> > > >> Bypassing the bounds check and contradicting the javadoc in
> > Windows
> > >  that
> > > >> says users can't directly set it. 

Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

2018-06-25 Thread John Roesler
Thanks Matthias and Guozhang,

About deprecating the "segments" field instead of making it private. Yes, I
just took another look at the code, and that is correct. I'll update the
KIP.

I do agree that in the long run, it makes more sense as a parameter to the
store somehow than as a parameter to the window. I think this isn't a super
high priority, though, because it's not exposed in the DSL (or it wasn't
intended to be).

I felt Guozhang's point is valid, and that we should probably revisit it
later, possibly in the scope of
https://issues.apache.org/jira/browse/KAFKA-4730

I'll wait an hour or so for more feedback before moving on to a vote.

Thanks again,
-John

On Mon, Jun 25, 2018 at 12:20 PM Guozhang Wang  wrote:

> Re `segmentInterval` parameter in Windows: currently it is used in two
> places, the windowed stream aggregation, and the stream-stream joins. For
> the former, we can potentially move the parameter from windowedBy() to
> Materialized, but for the latter we currently do not expose a Materialized
> object yet, only the Windows spec. So I think in this KIP we probably
> cannot move it immediately.
>
> But in future KIPs if we decide to expose the stream-stream join's store /
> changelog / repartition topic names, we may well add the Materialized
> object into the operator, and we can then move the parameter to
> Materialized by then.
>
>
> Guozhang
>
> On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax 
> wrote:
>
> > Thanks for the KIP. Overall, I think it makes sense to clean up the API.
> >
> > Couple of comments:
> >
> > > Sadly there's no way to "deprecate" this
> > > exposure
> >
> > I disagree. We can just mark the variable as deprecated and I advocate
> > to do this. When the deprecation period passed, we can make it private
> > (or actually remove it; cf. my next comment).
> >
> >
> > The parameter `segmentInterval` is semantically not a "window"
> > specification parameter but an implementation detail and thus a store
> > parameter. Would it be better to add it to `Materialized`?
> >
> >
> > -Matthias
> >
> > On 6/22/18 5:13 PM, Guozhang Wang wrote:
> > > Thanks John.
> > >
> > > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler 
> wrote:
> > >
> > >> Thanks for the feedback, Bill and Guozhang,
> > >>
> > >> I've updated the KIP accordingly.
> > >>
> > >> Thanks,
> > >> -John
> > >>
> > >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang 
> > wrote:
> > >>
> > >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on the
> > >> wiki:
> > >>> the `In Windows, we will:` section code snippet is empty.
> > >>>
> > >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck 
> > wrote:
> > >>>
> >  Hi John,
> > 
> >  Thanks for the KIP, and overall it's a +1 for me.
> > 
> >  In the JavaDoc for the segmentInterval method, there's no mention of
> > >> the
> >  number of segments there can be at any one time.  While it's implied
> > >> that
> >  the number of segments is potentially unbounded, would be better to
> >  explicitly state that the previous limit on the number of segments
> is
> > >>> going
> >  to be removed as well?
> > 
> >  I have a couple of nit comments.   The method name is still
> > segmentSize
> > >>> in
> >  the code block vs segmentInterval and the order of the parameters
> for
> > >> the
> >  third persistentWindowStore don't match the order in the JavaDoc.
> > 
> >  Thanks,
> >  Bill
> > 
> > 
> > 
> >  On Thu, Jun 21, 2018 at 3:32 PM John Roesler 
> > >> wrote:
> > 
> > > I've updated the KIP and draft PR accordingly.
> > >
> > > On Thu, Jun 21, 2018 at 2:03 PM John Roesler 
> > >>> wrote:
> > >
> > >> Interesting... I did not initially consider it because I didn't
> > >> want
> > >>> to
> > >> have an impact on anyone's Streams apps, but now I see that unless
> > >> developers have subclassed `Windows`, the number of segments would
> >  always
> > >> be 3!
> > >>
> > >> There's one caveat to this, which I think was a mistake. The field
> > >> `segments` in Windows is public, which means that anyone can
> > >> actually
> >  set
> > >> it directly on any Window instance like:
> > >>
> > >> TimeWindows tw = TimeWindows.of(100);
> > >> tw.segments = 12345;
> > >>
> > >> Bypassing the bounds check and contradicting the javadoc in
> Windows
> >  that
> > >> says users can't directly set it. Sadly there's no way to
> > >> "deprecate"
> > > this
> > >> exposure, so I propose just to make it private.
> > >>
> > >> With this new knowledge, I agree, I think we can switch to
> > >> "segmentInterval" throughout the interface.
> > >>
> > >> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang  >
> > > wrote:
> > >>
> > >>> Hello John,
> > >>>
> > >>> Thanks for the KIP.
> > >>>
> > >>> Should we consider making the change on
> > >>> 

Re: Request access to create KIP

2018-06-25 Thread Yishun Guan
Thank you!

On Mon, Jun 25, 2018, 1:03 PM Jason Gustafson  wrote:

> Done. Thanks for contributing!
>
> -Jason
>
> On Mon, Jun 25, 2018 at 12:49 PM, Yishun Guan  wrote:
>
> > Hi, could someone give me access to create KIP? Thanks! - Yishun
> >
> > On Mon, Jun 25, 2018, 10:44 AM Yishun Guan  wrote:
> >
> > > Hi, my wiki id is gyishun. Thanks! - Yishun
> > >
> >
>


Re: Request access to create KIP

2018-06-25 Thread Jason Gustafson
Done. Thanks for contributing!

-Jason

On Mon, Jun 25, 2018 at 12:49 PM, Yishun Guan  wrote:

> Hi, could someone give me access to create KIP? Thanks! - Yishun
>
> On Mon, Jun 25, 2018, 10:44 AM Yishun Guan  wrote:
>
> > Hi, my wiki id is gyishun. Thanks! - Yishun
> >
>


SASL Unit test failing

2018-06-25 Thread Ahmed A
Hello,

I did a fresh clone of the kafka src code, and the following SASL unit
tests have been failing consistently:
- testMechanismPluggability
- testMechanismPluggability
- testMultipleServerMechanisms

All three tests have a similar stack trace:
at org.junit.Assert.assertTrue(Assert.java:52)
at
org.apache.kafka.common.network.NetworkTestUtils.waitForChannelReady(NetworkTestUtils.java:79)
at
org.apache.kafka.common.network.NetworkTestUtils.checkClientConnection(NetworkTestUtils.java:52)

I also noticed, the three tests are using digest-md5.

Has anyone else run into a similar issue or have any ideas for the failure?

Thank you,
Ahmed.


Re: Request access to create KIP

2018-06-25 Thread Yishun Guan
Hi, could someone give me access to create KIP? Thanks! - Yishun

On Mon, Jun 25, 2018, 10:44 AM Yishun Guan  wrote:

> Hi, my wiki id is gyishun. Thanks! - Yishun
>


[jira] [Created] (KAFKA-7097) VerifiableProducer does not work properly with --message-create-time argument

2018-06-25 Thread Jasper Knulst (JIRA)
Jasper Knulst created KAFKA-7097:


 Summary: VerifiableProducer does not work properly with 
--message-create-time argument
 Key: KAFKA-7097
 URL: https://issues.apache.org/jira/browse/KAFKA-7097
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 1.0.0
Reporter: Jasper Knulst


If you run:

 

./bin/kafka-verifiable-producer.sh --broker-list  --topic 
test_topic_increasing_p2 --message-create-time  --acks -1 
--max-messages 100

the "" value for --message-create-time doesn't accept a 13-digit long
like 1529656934000. 

The error message:

verifiable-producer: error: argument --message-create-time: could not convert 
'1529656934000' to Integer (For input string: "1529656934000")

 

When you provide a 10-digit epoch (1529656934) for the argument, it does work,
but this leads to your topic's records being cleaned up in a few minutes, since
their retention time has expired.

 

The error seems to be obvious since VerifiableProducer.java has:

        Long createTime = (long) res.getInt("createTime");

when parsing the argument. The value should be parsed as a Long instead.
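A minimal, stand-alone illustration of the reported failure (independent of the VerifiableProducer code itself): a 13-digit epoch-milliseconds value overflows `int`, so parsing it as an Integer throws exactly the kind of error quoted above, while parsing it as a long works.

```java
public class CreateTimeParse {
    public static void main(String[] args) {
        String epochMillis = "1529656934000"; // 13-digit value from the report

        // Integer.parseInt fails: the value exceeds Integer.MAX_VALUE (2147483647).
        try {
            Integer.parseInt(epochMillis);
        } catch (NumberFormatException e) {
            System.out.println("could not convert to Integer: " + e.getMessage());
        }

        // Parsing as a long works, matching the suggested fix of reading the
        // argument as a Long (e.g. getLong instead of getInt).
        long createTime = Long.parseLong(epochMillis);
        System.out.println(createTime); // 1529656934000
    }
}
```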



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Build failed in Jenkins: kafka-trunk-jdk8 #2766

2018-06-25 Thread Apache Jenkins Server
See 


Changes:

[mjsax] Minor: add exception to debug log for

--
[...truncated 877.51 KB...]

kafka.utils.SchedulerTest > testRestart STARTED

kafka.utils.SchedulerTest > testRestart PASSED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler STARTED

kafka.utils.SchedulerTest > testReentrantTaskInMockScheduler PASSED

kafka.utils.SchedulerTest > testPeriodicTask STARTED

kafka.utils.SchedulerTest > testPeriodicTask PASSED

kafka.utils.LoggingTest > testLog4jControllerIsRegistered STARTED

kafka.utils.LoggingTest > testLog4jControllerIsRegistered PASSED

kafka.utils.LoggingTest > testLogName STARTED

kafka.utils.LoggingTest > testLogName PASSED

kafka.utils.LoggingTest > testLogNameOverride STARTED

kafka.utils.LoggingTest > testLogNameOverride PASSED

kafka.utils.ZkUtilsTest > testGetSequenceIdMethod STARTED

kafka.utils.ZkUtilsTest > testGetSequenceIdMethod PASSED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testAbortedConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testGetAllPartitionsTopicWithoutPartitions STARTED

kafka.utils.ZkUtilsTest > testGetAllPartitionsTopicWithoutPartitions PASSED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath STARTED

kafka.utils.ZkUtilsTest > testSuccessfulConditionalDeletePath PASSED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath STARTED

kafka.utils.ZkUtilsTest > testPersistentSequentialPath PASSED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing STARTED

kafka.utils.ZkUtilsTest > testClusterIdentifierJsonParsing PASSED

kafka.utils.ZkUtilsTest > testGetLeaderIsrAndEpochForPartition STARTED

kafka.utils.ZkUtilsTest > testGetLeaderIsrAndEpochForPartition PASSED

kafka.utils.CoreUtilsTest > testGenerateUuidAsBase64 STARTED

kafka.utils.CoreUtilsTest > testGenerateUuidAsBase64 PASSED

kafka.utils.CoreUtilsTest > testAbs STARTED

kafka.utils.CoreUtilsTest > testAbs PASSED

kafka.utils.CoreUtilsTest > testReplaceSuffix STARTED

kafka.utils.CoreUtilsTest > testReplaceSuffix PASSED

kafka.utils.CoreUtilsTest > testCircularIterator STARTED

kafka.utils.CoreUtilsTest > testCircularIterator PASSED

kafka.utils.CoreUtilsTest > testReadBytes STARTED

kafka.utils.CoreUtilsTest > testReadBytes PASSED

kafka.utils.CoreUtilsTest > testCsvList STARTED

kafka.utils.CoreUtilsTest > testCsvList PASSED

kafka.utils.CoreUtilsTest > testReadInt STARTED

kafka.utils.CoreUtilsTest > testReadInt PASSED

kafka.utils.CoreUtilsTest > testAtomicGetOrUpdate STARTED

kafka.utils.CoreUtilsTest > testAtomicGetOrUpdate PASSED

kafka.utils.CoreUtilsTest > testUrlSafeBase64EncodeUUID STARTED

kafka.utils.CoreUtilsTest > testUrlSafeBase64EncodeUUID PASSED

kafka.utils.CoreUtilsTest > testCsvMap STARTED

kafka.utils.CoreUtilsTest > testCsvMap PASSED

kafka.utils.CoreUtilsTest > testInLock STARTED

kafka.utils.CoreUtilsTest > testInLock PASSED

kafka.utils.CoreUtilsTest > testTryAll STARTED

kafka.utils.CoreUtilsTest > testTryAll PASSED

kafka.utils.CoreUtilsTest > testSwallow STARTED

kafka.utils.CoreUtilsTest > testSwallow PASSED

kafka.utils.IteratorTemplateTest > testIterator STARTED

kafka.utils.IteratorTemplateTest > testIterator PASSED

kafka.utils.json.JsonValueTest > testJsonObjectIterator STARTED

kafka.utils.json.JsonValueTest > testJsonObjectIterator PASSED

kafka.utils.json.JsonValueTest > testDecodeLong STARTED

kafka.utils.json.JsonValueTest > testDecodeLong PASSED

kafka.utils.json.JsonValueTest > testAsJsonObject STARTED

kafka.utils.json.JsonValueTest > testAsJsonObject PASSED

kafka.utils.json.JsonValueTest > testDecodeDouble STARTED

kafka.utils.json.JsonValueTest > testDecodeDouble PASSED

kafka.utils.json.JsonValueTest > testDecodeOption STARTED

kafka.utils.json.JsonValueTest > testDecodeOption PASSED

kafka.utils.json.JsonValueTest > testDecodeString STARTED

kafka.utils.json.JsonValueTest > testDecodeString PASSED

kafka.utils.json.JsonValueTest > testJsonValueToString STARTED

kafka.utils.json.JsonValueTest > testJsonValueToString PASSED

kafka.utils.json.JsonValueTest > testAsJsonObjectOption STARTED

kafka.utils.json.JsonValueTest > testAsJsonObjectOption PASSED

kafka.utils.json.JsonValueTest > testAsJsonArrayOption STARTED

kafka.utils.json.JsonValueTest > testAsJsonArrayOption PASSED

kafka.utils.json.JsonValueTest > testAsJsonArray STARTED

kafka.utils.json.JsonValueTest > testAsJsonArray PASSED

kafka.utils.json.JsonValueTest > testJsonValueHashCode STARTED

kafka.utils.json.JsonValueTest > testJsonValueHashCode PASSED

kafka.utils.json.JsonValueTest > testDecodeInt STARTED

kafka.utils.json.JsonValueTest > testDecodeInt PASSED

kafka.utils.json.JsonValueTest > testDecodeMap STARTED

kafka.utils.json.JsonValueTest > testDecodeMap PASSED

kafka.utils.json.JsonValueTest > testDecodeSeq STARTED

kafka.utils.json.JsonValueTest > testDecodeSeq PASSED


Re: [VOTE] 2.0.0 RC0

2018-06-25 Thread Thomas Crayford
+1 (non-binding) Heroku has run our usual set of upgrade and performance
tests, and we haven't found any notable issues through that.

On Sat, Jun 23, 2018 at 12:30 AM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> +1 (non-binding)
>
> Built from source and ran quickstart successfully on Ubuntu (with Java 8
> and Java 9).
>
> Thanks Rajini!
> --Vahid
>
>


Re: [DISCUSS] - KIP-314: KTable to GlobalKTable Bi-directional Join

2018-06-25 Thread Adam Bellemare
Thanks for your help so far guys.

While I do think that I have a fairly reasonable way forward for
restructuring the topologies and threads, there is, unfortunately, what I
believe is a fatal flaw that cannot be easily resolved. I have updated the
page (
https://cwiki.apache.org/confluence/display/KAFKA/KIP-314%3A+KTable+to+GlobalKTable+Bi-directional+Join
) with the impediments to the solution, all of which revolve around
ensuring that data consistency is maintained. It seems to me that
GlobalKTables are not the way forward here and that I may be best
redirecting my efforts towards KIP-213 (https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable).

I would appreciate being proven wrong on my impediment listings, but if
there are no other ideas I think we should close this KIP and the
associated JIRA. A KTable to GlobalKTable join driven just by the KTable is
simply performed by a stream-to-GKT join with a groupByKey and reduce to
form a state-store, so I would see no need to keep it open otherwise
(unless just for the shorthand notation).

Thanks again

Adam

On Fri, Jun 22, 2018 at 9:00 PM, Guozhang Wang  wrote:

> Hello Adam,
>
> Please see my comments inline.
>
> On Thu, Jun 21, 2018 at 8:14 AM, Adam Bellemare 
> wrote:
>
> > Hi Guozhang
> >
> > *Re: Questions*
> > *1)* I do not yet have a solution to this, but I also did not look that
> > closely at it when I begun this KIP. I admit that I was unaware of
> exactly
> > how the GlobalKTable worked alongside the KTable/KStream topologies. You
> > mention "It means the two topologies will be merged, and that merged
> > topology can only be executed as a single task, by a single thread. " -
> is
> > the problem here that the merged topology would be parallelized to other
> > threads/instances? While I am becoming familiar with how the topologies
> are
> > created under the hood, I am not yet fully clear on the implications of
> > your statement. I will look into this further.
> >
> >
> Yes. The issue is that today each task is executed by a single thread only
> at any given time, and hence any state stores are only accessed by a single
> thread (except for interactive queries, and for global tables where the
> global update thread writes to the global store, and the local thread reads
> from it). If we let the global store update thread also trigger joins and
> send the results into the downstream operators, then the global store
> update thread can access any state stores in the subsequent part of the
> topology, breaking our current threading model.
>
>
> > *2)* " do you mean that although we have a duplicated state store:
> > ModifiedEvents in addition to the original Events with only the enhanced
> > key, this is not avoidable anyways even if we do re-keying?" Yes, that is
> > correct, that is what I meant. I need to improve my knowledge around this
> > component too. I have been browsing the KIP-213 discussion thread and
> > looking at Jan's code
> >
> > *Re: Comments*
> > *1) *Makes sense. I will update the diagram accordingly. Thanks!
> >
> > *2)* Wouldn't outer join require that we emit records from the right
> > GlobalKTable that have no match in the left KTable? This seems undefined
> to
> > me with the current proposal (above issues aside), since multiple threads
> > would be producing the same output event for a single GlobalKTable
> update.
> >
> >
> I was considering mainly the semantics of table-table joins, i.e. whether
> we should add this operator to our API. Implementation-wise, we will only
> have one global store update thread per instance, so there will not be
> multiple threads producing the same output, but there are still other
> issues we should consider, as mentioned above. Again, this comment is not
> about implementation, but about whether it is desirable API-wise to add it.
>
>
> >
> > Questions for you both:
> > Q1) Is a KTable always materialized? I am looking at the code under the
> > hood, and it seems to me that it's either materialized with an explicit
> > Materialized object, or it's given an anonymous name and the default
> serdes
> > are used. Am I correct in this observation?
> >
> >
> A KTable is not always materialized. For example, a KTable generated from
> `KTable#filter` or `KTable#mapValues` does not create a new materialized
> state store, but we use the caller `KTable` 's state store for anyone who
> wants to query it in joins.
>
> Moving forward, we are also trying to optimize the topology to only
> "logically" materialize a KTable when necessary, this is summarized in
> https://issues.apache.org/jira/browse/KAFKA-6761
>
>
> >
> > Thanks,
> > Adam
> >
> >
>
> --
> -- Guozhang
>


Request access to create KIP

2018-06-25 Thread Yishun Guan
Hi, my wiki id is gyishun. Thanks! - Yishun


[jira] [Created] (KAFKA-7096) Consumer should drop the data for unassigned topic partitions

2018-06-25 Thread Mayuresh Gharat (JIRA)
Mayuresh Gharat created KAFKA-7096:
--

 Summary: Consumer should drop the data for unassigned topic 
partitions
 Key: KAFKA-7096
 URL: https://issues.apache.org/jira/browse/KAFKA-7096
 Project: Kafka
  Issue Type: Improvement
Reporter: Mayuresh Gharat
Assignee: Mayuresh Gharat


Currently, if a client has assigned topics T1, T2, T3 and calls poll(), the
poll might fetch data for partitions for all 3 topics T1, T2, T3. Now if the 
client unassigns some topics (for example T3) and calls poll() we still hold 
the data (for T3) in the completedFetches queue until we actually reach the 
buffered data for the unassigned Topics (T3 in our example) on subsequent 
poll() calls, at which point we drop that data. This process of holding the 
data is unnecessary.

When a client creates a topic, it takes time for the broker to fetch ACLs for 
the topic. But during this time, the client will issue fetchRequest for the 
topic, it will get response for the partitions of this topic. The response 
consist of TopicAuthorizationException for each of the partitions. This 
response for each partition is wrapped with a completedFetch and added to the 
completedFetches queue. Now when the client calls the next poll() it sees the 
TopicAuthorizationException from the first buffered CompletedFetch. At this 
point the client chooses to sleep for 1.5 min as a backoff (as per the design), 
hoping that the Broker fetches the ACL from ACL store in the meantime. Actually 
the Broker has already fetched the ACL by this time. When the client calls 
poll() after the sleep, it again sees the TopicAuthorizationException from the 
second completedFetch and it sleeps again. So it takes (1.5 * 60 * partitions) 
seconds before the client can see any data. With this patch, the client when it 
sees the first TopicAuthorizationException, it can call assign(emptySet), which
will get rid of the buffered completedFetches (those with 
TopicAuthorizationException) and it can again call assign(TopicPartitions) 
before calling poll(). With this patch we found that client was able to get the 
records as soon as the Broker fetched the ACLs from ACL store.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-312: Add Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2018-06-25 Thread Damian Guy
Thanks Bill! +1

On Mon, 25 Jun 2018 at 18:57 Ted Yu  wrote:

> +1
>
> On Mon, Jun 25, 2018 at 9:45 AM, Guozhang Wang  wrote:
>
> > +1.
> >
> > On Mon, Jun 25, 2018 at 8:12 AM, Matthias J. Sax 
> > wrote:
> >
> > > +1 (binding)
> > >
> > > On 6/25/18 6:11 AM, Bill Bejeck wrote:
> > > > All,
> > > > I'd like to start a vote for this KIP now.
> > > >
> > > > Thanks,
> > > > Bill
> > > >
> > >
> > >
> >
> >
> > --
> > -- Guozhang
> >
>


Re: [kafka-clients] Re: [VOTE] 1.1.1 RC1

2018-06-25 Thread Manikumar
+1 (non-binding)  Ran tests,  Verified quick start,  producer/consumer perf
tests


On Sat, Jun 23, 2018 at 8:11 AM Dong Lin  wrote:

> Thank you for testing and voting the release!
>
> I noticed that the date for 1.1.1-rc1 is wrong. Please kindly test and
> vote by Tuesday, June 26, 12 pm PT.
>
> Thanks,
> Dong
>
> On Fri, Jun 22, 2018 at 10:09 AM, Dong Lin  wrote:
>
>> Hello Kafka users, developers and client-developers,
>>
>> This is the second candidate for release of Apache Kafka 1.1.1.
>>
>> Apache Kafka 1.1.1 is a bug-fix release for the 1.1 branch that was
>> first released with 1.1.0 about 3 months ago. We have fixed about 25 issues
>> since that release. A few of the more significant fixes include:
>>
>> KAFKA-6925  - Fix
>> memory leak in StreamsMetricsThreadImpl
>> KAFKA-6937  - In-sync
>> replica delayed during fetch if replica throttle is exceeded
>> KAFKA-6917  - Process
>> txn completion asynchronously to avoid deadlock
>> KAFKA-6893  - Create
>> processors before starting acceptor to avoid ArithmeticException
>> KAFKA-6870  -
>> Fix ConcurrentModificationException in SampledStat
>> KAFKA-6878  - Fix
>> NullPointerException when querying global state store
>> KAFKA-6879  - Invoke
>> session init callbacks outside lock to avoid Controller deadlock
>> KAFKA-6857  - Prevent
>> follower from truncating to the wrong offset if undefined leader epoch is
>> requested
>> KAFKA-6854  - Log
>> cleaner fails with transaction markers that are deleted during clean
>> KAFKA-6747  - Check
>> whether there is in-flight transaction before aborting transaction
>> KAFKA-6748  - Double
>> check before scheduling a new task after the punctuate call
>> KAFKA-6739  -
>> Fix IllegalArgumentException when down-converting from V2 to V0/V1
>> KAFKA-6728  -
>> Fix NullPointerException when instantiating the HeaderConverter
>>
>> Kafka 1.1.1 release plan:
>> https://cwiki.apache.org/confluence/display/KAFKA/Release+Plan+1.1.1
>>
>> Release notes for the 1.1.1 release:
>> http://home.apache.org/~lindong/kafka-1.1.1-rc1/RELEASE_NOTES.html
>>
>> *** Please download, test and vote by Thursday, Jun 22, 12pm PT ***
>>
>> Kafka's KEYS file containing PGP keys we use to sign the release:
>> http://kafka.apache.org/KEYS
>>
>> * Release artifacts to be voted upon (source and binary):
>> http://home.apache.org/~lindong/kafka-1.1.1-rc1/
>>
>> * Maven artifacts to be voted upon:
>> https://repository.apache.org/content/groups/staging/
>>
>> * Javadoc:
>> http://home.apache.org/~lindong/kafka-1.1.1-rc1/javadoc/
>>
>> * Tag to be voted upon (off 1.1 branch) is the 1.1.1-rc1 tag:
>> https://github.com/apache/kafka/tree/1.1.1-rc1
>>
>> * Documentation:
>> http://kafka.apache.org/11/documentation.html
>>
>> * Protocol:
>> http://kafka.apache.org/11/protocol.html
>>
>> * Successful Jenkins builds for the 1.1 branch:
>> Unit/integration tests: https://builds.apache.org/job/kafka-1.1-jdk7/152/
>> System tests:
>> https://jenkins.confluent.io/job/system-test-kafka-branch-builder/1817
>>
>>
>> Please test and verify the release artifacts and submit a vote for this
>> RC,
>> or report any issues so we can fix them and get a new RC out ASAP.
>> Although
>> this release vote requires PMC votes to pass, testing, votes, and bug
>> reports are valuable and appreciated from everyone.
>>
>> Cheers,
>> Dong
>>
>>
>>
>


Re: [Discuss] KIP-321: Add method to get TopicNameExtractor in TopologyDescription

2018-06-25 Thread Guozhang Wang
Yup I agree, generally speaking the `toString()` output is not recommended
to be relied on programmatically in user's code, but we've observed
convenience-beats-any-other-reasons again and again in development
unfortunately. I think we should still not claim it is part of the
public APIs that will not change in the future, but just
mentioning it in the wiki so people are aware is fine.


Guozhang

On Sun, Jun 24, 2018 at 5:01 PM, Matthias J. Sax 
wrote:

> Thanks for the KIP!
>
> I don't have any further comments.
>
> For Guozhang's comment: if we mention anything about `toString()`, we
> should make explicit that `toString()` output is still not public
> contract and users should not rely on the output.
>
> Furthermore, for the actual user-facing output, I would replace "topic:" with
> "extractor class:" to make the difference obvious.
>
> I am just hoping that people actually do not rely on `toString()`, which
> defeats the purpose of the `TopologyDescription` class that was
> introduced to avoid that dependency... (Just a side comment, not really
> related to this KIP proposal itself).
>
>
> If there are no further comments in the next days, feel free to start
> the VOTE and open a PR.
>
>
>
>
> -Matthias
>
> On 6/22/18 6:04 PM, Guozhang Wang wrote:
> > Thanks for writing the KIP!
> >
> > I'm +1 on the proposed changes overall. One minor suggestion: we should
> > also mention that `Sink#toString` will be updated so that, if `topic()`
> > returns null, the other call is used, etc. This is because
> > although we do not explicitly state the following logic as public
> protocols:
> >
> > ```
> >
> > "Sink: " + name + " (topic: " + topic() + ")\n  <-- " +
> > nodeNames(predecessors);
> >
> >
> > ```
> >
> > There are already some users that rely on `topology.describe().toString()`
> > to have runtime checks on the returned string values, so changing this
> > means that their app will break and hence need to make code changes.
> >
> > Guozhang
> >
> > On Wed, Jun 20, 2018 at 7:20 PM, Nishanth Pradeep  >
> > wrote:
> >
> >> Hello Everyone,
> >>
> >> I have created a new KIP to discuss extending TopologyDescription. You
> can
> >> find it here:
> >>
> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >> 321%3A+Add+method+to+get+TopicNameExtractor+in+TopologyDescription
> >>
> >> Please provide any feedback that you might have.
> >>
> >> Best,
> >> Nishanth Pradeep
> >>
> >
> >
> >
>
>


-- 
-- Guozhang
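Guozhang's suggestion above, that `Sink#toString` should fall back to the extractor class when `topic()` returns null, can be sketched roughly as follows. This is a simplified, hypothetical stand-in class, not the actual Streams implementation:

```java
// Hypothetical sketch of the Sink#toString change discussed above: a sink
// created with a dynamic TopicNameExtractor has no fixed topic, so topic()
// returns null and the description falls back to the extractor class.
public class SinkDescription {
    private final String name;
    private final String topic;            // null when an extractor is used
    private final Class<?> extractorClass; // null when a fixed topic is used

    SinkDescription(String name, String topic, Class<?> extractorClass) {
        this.name = name;
        this.topic = topic;
        this.extractorClass = extractorClass;
    }

    @Override
    public String toString() {
        if (topic != null) {
            return "Sink: " + name + " (topic: " + topic + ")";
        }
        return "Sink: " + name + " (extractor class: " + extractorClass.getName() + ")";
    }
}
```

This is also why the string change is noted in the KIP: any code that greps `describe().toString()` for `"topic: "` would stop matching sinks that use an extractor.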


Re: [DISCUSS] KIP-319: Replace segments with segmentSize in WindowBytesStoreSupplier

2018-06-25 Thread Guozhang Wang
Re `segmentInterval` parameter in Windows: currently it is used in two
places, the windowed stream aggregation, and the stream-stream joins. For
the former, we can potentially move the parameter from windowedBy() to
Materialized, but for the latter we currently do not expose a Materialized
object yet, only the Windows spec. So I think in this KIP we probably
cannot move it immediately.

But in future KIPs, if we decide to expose the stream-stream join's store /
changelog / repartition topic names, we may well add the Materialized
object to the operator, and we can then move the parameter to
Materialized.


Guozhang

On Sun, Jun 24, 2018 at 5:16 PM, Matthias J. Sax 
wrote:

> Thanks for the KIP. Overall, I think it makes sense to clean up the API.
>
> Couple of comments:
>
> > Sadly there's no way to "deprecate" this
> > exposure
>
> I disagree. We can just mark the variable as deprecated and I advocate
> to do this. When the deprecation period passed, we can make it private
> (or actually remove it; cf. my next comment).
>
>
> Parameter, `segmentInterval` is semantically not a "window"
> specification parameter but an implementation detail and thus a store
> parameter. Would it be better to add it to `Materialized`?
>
>
> -Matthias
>
> On 6/22/18 5:13 PM, Guozhang Wang wrote:
> > Thanks John.
> >
> > On Fri, Jun 22, 2018 at 5:05 PM, John Roesler  wrote:
> >
> >> Thanks for the feedback, Bill and Guozhang,
> >>
> >> I've updated the KIP accordingly.
> >>
> >> Thanks,
> >> -John
> >>
> >> On Fri, Jun 22, 2018 at 5:51 PM Guozhang Wang 
> wrote:
> >>
> >>> Thanks for the KIP. I'm +1 on the proposal. One minor comment on the
> >> wiki:
> >>> the `In Windows, we will:` section code snippet is empty.
> >>>
> >>> On Fri, Jun 22, 2018 at 3:29 PM, Bill Bejeck 
> wrote:
> >>>
>  Hi John,
> 
>  Thanks for the KIP, and overall it's a +1 for me.
> 
>  In the JavaDoc for the segmentInterval method, there's no mention of
> >> the
>  number of segments there can be at any one time.  While it's implied
> >> that
>  the number of segments is potentially unbounded, would be better to
>  explicitly state that the previous limit on the number of segments is
> >>> going
>  to be removed as well?
> 
>  I have a couple of nit comments.   The method name is still
> segmentSize
> >>> in
>  the code block vs segmentInterval and the order of the parameters for
> >> the
>  third persistentWindowStore doesn't match the order in the JavaDoc.
> 
>  Thanks,
>  Bill
> 
> 
> 
>  On Thu, Jun 21, 2018 at 3:32 PM John Roesler 
> >> wrote:
> 
> > I've updated the KIP and draft PR accordingly.
> >
> > On Thu, Jun 21, 2018 at 2:03 PM John Roesler 
> >>> wrote:
> >
> >> Interesting... I did not initially consider it because I didn't
> >> want
> >>> to
> >> have an impact on anyone's Streams apps, but now I see that unless
> >> developers have subclassed `Windows`, the number of segments would
>  always
> >> be 3!
> >>
> >> There's one caveat to this, which I think was a mistake. The field
> >> `segments` in Windows is public, which means that anyone can
> >> actually
>  set
> >> it directly on any Window instance like:
> >>
> >> TimeWindows tw = TimeWindows.of(100);
> >> tw.segments = 12345;
> >>
> >> Bypassing the bounds check and contradicting the javadoc in Windows
>  that
> >> says users can't directly set it. Sadly there's no way to
> >> "deprecate"
> > this
> >> exposure, so I propose just to make it private.
> >>
> >> With this new knowledge, I agree, I think we can switch to
> >> "segmentInterval" throughout the interface.
> >>
> >> On Wed, Jun 20, 2018 at 5:06 PM Guozhang Wang 
> > wrote:
> >>
> >>> Hello John,
> >>>
> >>> Thanks for the KIP.
> >>>
> >>> Should we consider making the change on
> >>> `Stores#persistentWindowStore`
> >>> parameters as well?
> >>>
> >>>
> >>> Guozhang
> >>>
> >>>
> >>> On Wed, Jun 20, 2018 at 1:31 PM, John Roesler 
> > wrote:
> >>>
>  Hi Ted,
> 
>  Ah, when you made that comment to me before, I thought you meant
> >>> as
> >>> opposed
>  to "segments". Now it makes sense that you meant as opposed to
>  "segmentSize".
> 
>  I named it that way to match the peer method "windowSize", which
> >>> is
> >>> also a
>  quantity of milliseconds.
> 
>  I agree that "interval" is more intuitive, but I think I favor
> >>> consistency
>  in this case. Does that seem reasonable?
> 
>  Thanks,
>  -John
> 
>  On Wed, Jun 20, 2018 at 1:06 PM Ted Yu 
> >>> wrote:
> 
> > Normally size is not measured in time unit, such as
> >>> milliseconds.
> > How about naming the new method 
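The pitfall John describes earlier in this thread (a public mutable `segments` field that bypasses the documented bounds check) can be reproduced with a minimal hypothetical class:

```java
// Minimal illustration (hypothetical names, not the real Windows class) of
// the pitfall discussed above: a public mutable field lets callers skip the
// validation that the builder-style method enforces.
public class WindowsExample {
    public int segments = 3; // public: can be assigned directly, skipping validation

    // Builder-style setter with the bounds check callers are supposed to hit.
    public WindowsExample segmentsChecked(int n) {
        if (n < 2) {
            throw new IllegalArgumentException("segments must be at least 2");
        }
        this.segments = n;
        return this;
    }
}
```

Here `w.segments = 1;` compiles and runs despite violating the documented minimum, which is why the proposal is to make the field private once the deprecation period passes.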

Re: [DISCUSS] KIP-308: Support dynamic update of max.connections.per.ip/max.connections.per.ip.overrides configs

2018-06-25 Thread Dong Lin
Hey Manikumar,

Thanks much for the KIP. It looks pretty good.

Thanks,
Dong

On Thu, Jun 21, 2018 at 11:38 PM, Manikumar 
wrote:

> Hi all,
>
> I have created a KIP to add support for dynamic update of
> max.connections.per.ip/max.connections.per.ip.overrides configs
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=85474993
>
> Any feedback is appreciated.
>
> Thanks
>


Re: [DISCUSS]KIP-216: IQ should throw different exceptions for different errors

2018-06-25 Thread Guozhang Wang
I'm wondering if StreamThreadNotStarted could be merged into
StreamThreadNotRunning, because I think users' handling logic for the third
case would likely be the same as for the second. Do you have some scenarios
where users may want to handle them differently?

Guozhang

On Sun, Jun 24, 2018 at 5:25 PM, Matthias J. Sax 
wrote:

> Sorry to hear! Get well soon!
>
> It's not a big deal if the KIP stalls a little bit. Feel free to pick it
> up again when you find time.
>
> >>> Is `StreamThreadNotRunningException` really a retryable error?
> >>
> >> When KafkaStream state is REBALANCING, I think it is a retryable error.
> >>
> >> StreamThreadStateStoreProvider#stores() will throw
> >> StreamThreadNotRunningException when StreamThread state is not
> RUNNING. The
> >> user can retry until KafkaStream state is RUNNING.
>
> I see. If this is the intention, than I would suggest to have two (or
> maybe three) different exceptions:
>
>  - StreamThreadRebalancingException (retryable)
>  - StreamThreadNotRunning (not retryable -- thrown if in state
> PENDING_SHUTDOWN or DEAD)
>  - maybe StreamThreadNotStarted (for state CREATED)
>
> The last one is tricky and could also be merged into one of the first
> two, depending on whether you want to argue that it's retryable or not. (Just
> food for thought -- not sure what others think.)
>
>
>
> -Matthias
>
> On 6/22/18 8:06 AM, vito jeng wrote:
> > Matthias,
> >
> > Thank you for your assistance.
> >
> >> what is the status of this KIP?
> >
> > Unfortunately, there is no further progress.
> > About seven weeks ago, I was injured playing sports and broke my left wrist.
> > Many things were affected, including this KIP and its implementation.
> >
> >
> >> I just re-read it, and have a couple of follow up comments. Why do we
> >> discuss the internal exceptions you want to add? Also, do we really need
> >> them? Can't we just throw the correct exception directly instead of
> >> wrapping it later?
> >
> > I think you may be right. As I say in the previous:
> > "The original idea is that we can distinguish different state store
> > exception for different handling. But to be honest, I am not quite sure
> > this is necessary. Maybe have some change during implementation."
> >
> > During the implementation, I also felt we may not need to wrap it.
> > We can just throw the correct exception directly.
> >
> >
> >> Is `StreamThreadNotRunningException` really a retryable error?
> >
> > When KafkaStream state is REBALANCING, I think it is a retryable error.
> >
> > StreamThreadStateStoreProvider#stores() will throw
> > StreamThreadNotRunningException when StreamThread state is not RUNNING.
> The
> > user can retry until KafkaStream state is RUNNING.
> >
> >
> >> When would we throw an `StateStoreEmptyException`? The semantics is
> > unclear to me atm.
> >
> >> When the state is RUNNING, is `StateStoreClosedException` a retryable
> > error?
> >
> > These two comments will be answered in another mail.
> >
> >
> >
> > ---
> > Vito
> >
> > On Mon, Jun 11, 2018 at 8:12 AM, Matthias J. Sax 
> > wrote:
> >
> >> Vito,
> >>
> >> what is the status of this KIP?
> >>
> >> I just re-read it, and have a couple of follow up comments. Why do we
> >> discuss the internal exceptions you want to add? Also, do we really need
> >> them? Can't we just throw the correct exception directly instead of
> >> wrapping it later?
> >>
> >> When would we throw an `StateStoreEmptyException`? The semantics is
> >> unclear to me atm.
> >>
> >> Is `StreamThreadNotRunningException` really a retryable error?
> >>
> >> When the state is RUNNING, is `StateStoreClosedException` a retryable
> >> error?
> >>
> >> One more nits: ReadOnlyWindowStore got a new method #fetch(K key, long
> >> time); that should be added
> >>
> >>
> >> Overall I like the KIP but some details are still unclear. Maybe it
> >> might help if you open an PR in parallel?
> >>
> >>
> >> -Matthias
> >>
> >> On 4/24/18 8:18 AM, vito jeng wrote:
> >>> Hi, Guozhang,
> >>>
> >>> Thanks for the comment!
> >>>
> >>>
> >>>
> >>> Hi, Bill,
> >>>
> >>> I'll try to make some update to make the KIP better.
> >>>
> >>> Thanks for the comment!
> >>>
> >>>
> >>> ---
> >>> Vito
> >>>
> >>> On Sat, Apr 21, 2018 at 5:40 AM, Bill Bejeck 
> wrote:
> >>>
>  Hi Vito,
> 
>  Thanks for the KIP, overall it's a +1 from me.
> 
>  At this point, the only thing I would change is possibly removing the
>  listing of all methods called by the user and the listing of all store
>  types and focus on what states result in which exceptions thrown to
> the
>  user.
> 
>  Thanks,
>  Bill
> 
>  On Fri, Apr 20, 2018 at 2:10 PM, Guozhang Wang 
> >> wrote:
> 
> > Thanks for the KIP Vito!
> >
> > I made a pass over the wiki and it looks great to me. I'm +1 on the
> >> KIP.
> >
> > About the base class InvalidStateStoreException itself, I'd actually
> > suggest we do not deprecate it but still expose it as part of the
> >> public
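The retryable/non-retryable split proposed earlier in this thread could look roughly like the following. The class names are taken from the discussion but the shape is hypothetical, and the outcome iterator merely simulates successive query attempts against a store:

```java
import java.util.Arrays;
import java.util.Iterator;

// Hypothetical sketch of the exception split discussed in this thread: a
// retryable subtype lets callers loop while the instance is REBALANCING and
// stop on a terminal state.
class InvalidStateStoreException extends RuntimeException {
    InvalidStateStoreException(String msg) { super(msg); }
}

class StreamThreadRebalancingException extends InvalidStateStoreException {
    StreamThreadRebalancingException(String msg) { super(msg); } // retryable
}

class StreamThreadNotRunningException extends InvalidStateStoreException {
    StreamThreadNotRunningException(String msg) { super(msg); } // not retryable
}

public class RetryableQuery {
    // Retry only while the failure is the retryable kind; rethrow otherwise.
    static String queryWithRetry(Iterator<Object> attemptOutcomes) {
        while (true) {
            Object outcome = attemptOutcomes.next();
            if (outcome instanceof StreamThreadRebalancingException) {
                continue; // retryable: state may become RUNNING shortly
            }
            if (outcome instanceof RuntimeException) {
                throw (RuntimeException) outcome; // terminal: surface to caller
            }
            return (String) outcome; // successful query result
        }
    }
}
```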

Re: [DISCUSS] KIP-308: Support dynamic update of max.connections.per.ip/max.connections.per.ip.overrides configs

2018-06-25 Thread Jason Gustafson
Hey Manikumar,

Thanks for the KIP. This seems useful.

-Jason

On Thu, Jun 21, 2018 at 11:38 PM, Manikumar 
wrote:

> Hi all,
>
> I have created a KIP to add support for dynamic update of
> max.connections.per.ip/max.connections.per.ip.overrides configs
>
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=85474993
>
> Any feedback is appreciated.
>
> Thanks
>


[DISCUSS] KIP-323: Schedulable KTable as Graph source

2018-06-25 Thread Flávio Stutz
Hey, guys, I've just started a KIP discussion here:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-323%3A+Schedulable+KTable+as+Graph+source


Re: [VOTE] KIP-312: Add Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2018-06-25 Thread Ted Yu
+1

On Mon, Jun 25, 2018 at 9:45 AM, Guozhang Wang  wrote:

> +1.
>
> On Mon, Jun 25, 2018 at 8:12 AM, Matthias J. Sax 
> wrote:
>
> > +1 (binding)
> >
> > On 6/25/18 6:11 AM, Bill Bejeck wrote:
> > > All,
> > > I'd like to start a vote for this KIP now.
> > >
> > > Thanks,
> > > Bill
> > >
> >
> >
>
>
> --
> -- Guozhang
>


Re: [VOTE] KIP-312: Add Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2018-06-25 Thread Guozhang Wang
+1.

On Mon, Jun 25, 2018 at 8:12 AM, Matthias J. Sax 
wrote:

> +1 (binding)
>
> On 6/25/18 6:11 AM, Bill Bejeck wrote:
> > All,
> > I'd like to start a vote for this KIP now.
> >
> > Thanks,
> > Bill
> >
>
>


-- 
-- Guozhang


Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-25 Thread Guozhang Wang
+1 from me as well.

On Mon, Jun 25, 2018 at 8:16 AM, Matthias J. Sax 
wrote:

> +1 from my side for using `compaction.strategy` with values "offset",
> "timestamp" and "header" and `compaction.strategy.header`
>
> -Matthias
>
> On 6/25/18 1:25 AM, Luís Cabral wrote:
> >  Hi,
> >
> > So, is everyone OK using the approach with 2 properties?
> >
> > E.g.:
> >
> > Scenario 1:
> > compaction.strategy: offset
> >
> > :- Behaviour is the same as what currently exists, where the
> compaction is done only via the 'offset'
> >
> >
> > Scenario 2:
> > compaction.strategy: timestamp
> >
> > :- Similar to 'offset', but the record timestamp is used instead
> >
> >
> > Scenario 3:
> > compaction.strategy: header
> >
> > compaction.strategy.header: xyz
> >
> > :- Searches the headers for 'xyz' key when performing the
> compaction. Defaults to 'offset' strategy if this header does not exist
> (special note on the '.header' suffix, as this would allow additional
> strategies to add whatever extra configuration they need).
> >
> > Scenario 4 (hypothetical future):
> > compaction.strategy: foo
> >
> > compaction.strategy.foo.name: bar
> > compaction.strategy.foo.order: DESC
> > compaction.strategy.foo.fallback: timestamp
> >
> >
> > :- This one is just to show what I meant with the '.header' suffix
> mentioned in {Scenario 3}
> >
> >
> >
> > Regards,
> > Luís
> >
> >
> > On Monday, June 18, 2018, 11:56:51 PM GMT+2, Guozhang Wang <
> wangg...@gmail.com> wrote:
> >
> >  Hi Matthias,
> >
> > Yes, we are effectively assigning the whole space of Strings minus the
> > currently reserved ones as header keys; honestly I think in practice users
> > wanting to use `_something_` would be very rare, but I admit it may still
> > be possible in theory.
> >
> > I think Luis' point about "header=" is that having an expression
> > evaluation as the config value is a bit weird, and thinking about it twice
> > it is still not flawless: we can still argue that we are effectively
> > assigning the whole sub-space of "header=*" of Strings for headers, and
> > what if users want to use preserved value falling into that sub-space
> > (again, should not really happen in practice, just being paranoid here).
> >
> > It seems that two configs are the common choice that everyone is happy
> with.
> >
> > Guozhang
> >
> >
> > On Mon, Jun 18, 2018 at 2:35 PM, Matthias J. Sax 
> > wrote:
> >
> >> Luis,
> >>
>> I meant to update the "Rejected Alternative" sections, which you have
>> done already. Thx.
> >>
> >> Originally, I also had the idea about a second config, but thought it
> >> might be easier to just change the allowed values to be `offset`,
> >> `timestamp`, `header=`. (We try to keep the number of configs small
> >> if possible, as more configs are more confusing to users.)
> >>
> >> I don't think that using `_offset_`, `_timestamp_` and `` solves
> >> the problem because users still might use `_something_` as header key --
> >> and if we want to introduce a new compaction strategy "something" later
> >> we face the same issues as without the underscores. We only reduce the
> >> likelihood that it happens.
> >>
> >> Using `header=` as prefix or introducing a second config, that is only
> >> effective if the strategy is set to `header` seems to be a cleaner
> >> solution.
> >>
> >> @Luis: why do you think that using `header=` is an "incorrect
> >> approach"?
> >>
> >>> Though I would still prefer to keep it as it is, as it's a much simpler
> >>> and cleaner approach – I'm not so sure that a potential client would
> >>> really be so inconvenienced by having to use "_offset_" or
> >>> "_timestamp_" as a header
> >>
> >> I don't think that it's about the issue that people cannot use
> >> `_offset_` or `_timestamp_` in their header (by "use" I mean for
> >> compaction). With the current KIP, they cannot use `offset` or
> >> `timestamp` either. The issue is, that we cannot introduce a new system
> >> supported compaction strategy in the future without potentially breaking
> >> something, as we basically assign the whole space of Strings (minus
> >> `offset`, `timestamp`) as valid configs to enable header based
> compaction.
> >>
> >> Personally, I prefer either adding a config or going with
> >> `header=`. Using `_timestamp_`, `_offset_`, and `` might be
> >> good enough (even if this is the solution I like least)---for this case,
> >> we should state explicitly, that the whole space of `_*_` is reserved
> >> and users are not allowed to set those for header compaction. In fact, I
> >> would also add a check for the config that only allows for `_offset_`
> >> and `_timestamp_` and throws an exception for all other `_*_` configs.
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 6/18/18 2:03 PM, Luís Cabral wrote:
> >>> I’m ok with that...
> >>>
> >>> Ted / Matthias?
> >>>
> >>>
> >>> From: Guozhang Wang
> >>> Sent: 18 June 2018 22:49
> >>> To: dev@kafka.apache.org
> >>> Subject: Re: [VOTE] KIP-280: 

Re: Request permission to assign JIRA

2018-06-25 Thread Jason Gustafson
Added. Thanks for contributing!

-Jason

On Mon, Jun 25, 2018 at 9:27 AM, lambdaliu(刘少波) 
wrote:

> Hi Team,
>
> I am trying to claim a bug in JIRA. Could you please help me gain the
> required permissions?
> My JIRA username is lambdaliu.
>
> thanks,
> lambdaliu.
>


Request permission to assign JIRA

2018-06-25 Thread 刘少波
Hi Team,

I am trying to claim a bug in JIRA. Could you please help me gain the required
permissions?
My JIRA username is lambdaliu.

thanks,
lambdaliu.


[DISCUSS] KIP-320: Allow fetchers to detect and handle log truncation

2018-06-25 Thread Jason Gustafson
Hey All,

I wrote up a KIP to handle one more edge case in the replication protocol
and to support better handling of truncation in the consumer when unclean
leader election is enabled. Let me know what you think.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-320%3A+Allow+fetchers+to+detect+and+handle+log+truncation

Thanks to Anna Povzner and Dong Lin for initial feedback.

Thanks,
Jason


[jira] [Created] (KAFKA-7095) Low traffic consumer is not consuming messages after the offsets is deleted by Kafka

2018-06-25 Thread Aldo Sinanaj (JIRA)
Aldo Sinanaj created KAFKA-7095:
---

 Summary: Low traffic consumer is not consuming messages after the 
offsets is deleted by Kafka
 Key: KAFKA-7095
 URL: https://issues.apache.org/jira/browse/KAFKA-7095
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.1.1
Reporter: Aldo Sinanaj


Hello guys.

I have a low-traffic consumer in a given consumer group, and the broker uses 
the default setting for the property *offsets.retention.minutes*. So if a 
message arrives after 2 days and Kafka has deleted the offsets for that 
consumer group, the consumer will not consume the new incoming messages. If I 
restart the application, it consumes from the earliest offset, which is 
expected since the committed offsets were deleted.

My question is: why doesn't it consume the new messages if I don't restart the 
application? And how does this version of Kafka decide whether a consumer is 
active or inactive? Is my consumer considered inactive in this case?

Thanks,

Aldo
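For reference, the broker property in question is an existing setting; in the 0.10.x/1.x broker line its default was 1440 minutes (one day), and raising it is the usual mitigation so that committed offsets for low-traffic consumers survive gaps between messages (the value below is illustrative, not a recommendation):

```properties
# server.properties (broker side). Default in 0.10.x/1.x was 1440 (1 day);
# a week gives low-traffic consumer groups more headroom before their
# committed offsets are deleted.
offsets.retention.minutes=10080
```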





Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-25 Thread Matthias J. Sax
+1 from my side for using `compaction.strategy` with values "offset",
"timestamp" and "header" and `compaction.strategy.header`

-Matthias

On 6/25/18 1:25 AM, Luís Cabral wrote:
>  Hi,
> 
> So, is everyone OK using the approach with 2 properties?
> 
> E.g.:
> 
> Scenario 1:
>     compaction.strategy: offset
> 
>     :- Behaviour is the same as what currently exists, where the compaction 
> is done only via the 'offset'
> 
> 
> Scenario 2:
>     compaction.strategy: timestamp
> 
>     :- Similar to 'offset', but the record timestamp is used instead
> 
> 
> Scenario 3:
>     compaction.strategy: header
> 
>     compaction.strategy.header: xyz
> 
>     :- Searches the headers for 'xyz' key when performing the compaction. 
> Defaults to 'offset' strategy if this header does not exist (special note on 
> the '.header' suffix, as this would allow additional strategies to add 
> whatever extra configuration they need).
> 
> Scenario 4 (hypothetical future):
>     compaction.strategy: foo
> 
>     compaction.strategy.foo.name: bar
>     compaction.strategy.foo.order: DESC
> compaction.strategy.foo.fallback: timestamp
> 
> 
>     :- This one is just to show what I meant with the '.header' suffix 
> mentioned in {Scenario 3}
> 
> 
> 
> Regards,
> Luís
> 
> 
> On Monday, June 18, 2018, 11:56:51 PM GMT+2, Guozhang Wang 
>  wrote:  
>  
>  Hi Matthias,
> 
> Yes, we are effectively assigning the whole space of Strings minus the
> currently reserved ones as header keys; honestly I think in practice users
> wanting to use `_something_` would be very rare, but I admit it may still
> be possible in theory.
> 
> I think Luis' point about "header=" is that having an expression
> evaluation as the config value is a bit weird, and thinking about it twice
> it is still not flawless: we can still argue that we are effectively
> assigning the whole sub-space of "header=*" of Strings for headers, and
> what if users want to use preserved value falling into that sub-space
> (again, should not really happen in practice, just being paranoid here).
> 
> It seems that two configs are the common choice that everyone is happy with.
> 
> Guozhang
> 
> 
> On Mon, Jun 18, 2018 at 2:35 PM, Matthias J. Sax 
> wrote:
> 
>> Luis,
>>
>> I meant to update the "Rejected Alternative" sections, which you have
>> done already. Thx.
>>
>> Originally, I also had the idea about a second config, but thought it
>> might be easier to just change the allowed values to be `offset`,
>> `timestamp`, `header=`. (We try to keep the number of configs small
>> if possible, as more configs are more confusing to users.)
>>
>> I don't think that using `_offset_`, `_timestamp_` and `` solves
>> the problem because users still might use `_something_` as header key --
>> and if we want to introduce a new compaction strategy "something" later
>> we face the same issues as without the underscores. We only reduce the
>> likelihood that it happens.
>>
>> Using `header=` as prefix or introducing a second config, that is only
>> effective if the strategy is set to `header` seems to be a cleaner
>> solution.
>>
>> @Luis: why do you think that using `header=` is an "incorrect
>> approach"?
>>
>>> Though I would still prefer to keep it as it is, as it's a much simpler
>>> and cleaner approach – I'm not so sure that a potential client would
>>> really be so inconvenienced by having to use "_offset_" or
>>> "_timestamp_" as a header
>>
>> I don't think that it's about the issue that people cannot use
>> `_offset_` or `_timestamp_` in their header (by "use" I mean for
>> compaction). With the current KIP, they cannot use `offset` or
>> `timestamp` either. The issue is, that we cannot introduce a new system
>> supported compaction strategy in the future without potentially breaking
>> something, as we basically assign the whole space of Strings (minus
>> `offset`, `timestamp`) as valid configs to enable header based compaction.
>>
>> Personally, I prefer either adding a config or going with
>> `header=`. Using `_timestamp_`, `_offset_`, and `` might be
>> good enough (even if this is the solution I like least)---for this case,
>> we should state explicitly, that the whole space of `_*_` is reserved
>> and users are not allowed to set those for header compaction. In fact, I
>> would also add a check for the config that only allows for `_offset_`
>> and `_timestamp_` and throws an exception for all other `_*_` configs.
>>
>>
>> -Matthias
>>
>>
>> On 6/18/18 2:03 PM, Luís Cabral wrote:
>>> I’m ok with that...
>>>
>>> Ted / Matthias?
>>>
>>>
>>> From: Guozhang Wang
>>> Sent: 18 June 2018 22:49
>>> To: dev@kafka.apache.org
>>> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>>>
>>> How about make the preserved values to be "_offset_" and "_timestamp_"
>>> then? Currently in the KIP they are reserved as "offset" and "timestamp".
>>>
>>>
>>> Guozhang
>>>
>>> On Mon, Jun 18, 2018 at 1:40 PM, Luís Cabral
>> 
>>> wrote:
>>>
 Hi Guozhang,

 Yes, that is what I meant 

Re: [VOTE] KIP-312: Add Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2018-06-25 Thread Matthias J. Sax
+1 (binding)

On 6/25/18 6:11 AM, Bill Bejeck wrote:
> All,
> I'd like to start a vote for this KIP now.
> 
> Thanks,
> Bill
> 





[VOTE] KIP-312: Add Overloaded StreamsBuilder Build Method to Accept java.util.Properties

2018-06-25 Thread Bill Bejeck
All,
I'd like to start a vote for this KIP now.

Thanks,
Bill


Build failed in Jenkins: kafka-2.0-jdk8 #56

2018-06-25 Thread Apache Jenkins Server
See 


Changes:

[rajinisivaram] MINOR: Fix timing issue in advertised listener update test 
(#5256)

--
[...truncated 432.85 KB...]


kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingAlreadyDone PASSED

kafka.network.SocketServerTest > controlThrowable STARTED

kafka.network.SocketServerTest > controlThrowable PASSED

kafka.network.SocketServerTest > testRequestMetricsAfterStop STARTED

kafka.network.SocketServerTest > testRequestMetricsAfterStop PASSED

kafka.network.SocketServerTest > testConnectionIdReuse STARTED

kafka.network.SocketServerTest > testConnectionIdReuse PASSED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
STARTED

kafka.network.SocketServerTest > testClientDisconnectionUpdatesRequestMetrics 
PASSED

kafka.network.SocketServerTest > testProcessorMetricsTags STARTED

kafka.network.SocketServerTest > testProcessorMetricsTags PASSED

kafka.network.SocketServerTest > testMaxConnectionsPerIp STARTED

kafka.network.SocketServerTest > testMaxConnectionsPerIp PASSED

kafka.network.SocketServerTest > testConnectionId STARTED

kafka.network.SocketServerTest > testConnectionId PASSED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics STARTED

kafka.network.SocketServerTest > 
testBrokerSendAfterChannelClosedUpdatesRequestMetrics PASSED

kafka.network.SocketServerTest > testNoOpAction STARTED

kafka.network.SocketServerTest > testNoOpAction PASSED

kafka.network.SocketServerTest > simpleRequest STARTED

kafka.network.SocketServerTest > simpleRequest PASSED

kafka.network.SocketServerTest > closingChannelException STARTED

kafka.network.SocketServerTest > closingChannelException PASSED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingInProgress STARTED

kafka.network.SocketServerTest > 
testSendActionResponseWithThrottledChannelWhereThrottlingInProgress PASSED

kafka.network.SocketServerTest 

[VOTE] KIP-293: Add new metrics for consumer/replication fetch requests

2018-06-25 Thread Adam Kotwasinski
Hello,

In the absence of additional feedback on this KIP I'd like to start a vote.

To summarize, the KIP simply proposes to add a consumer metric that tracks
the number of fetch requests made by (real) consumer clients, as opposed to
replicating brokers.

KIP link: 
https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=80452537
PR: https://github.com/apache/kafka/pull/4936
Discussion thread:
https://www.mail-archive.com/dev@kafka.apache.org/msg87465.html

Yours faithfully,
Adam Kotwasinski
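The consumer-vs-replica distinction the KIP relies on can be sketched as follows. In the Kafka fetch protocol, every FetchRequest carries a replicaId: ordinary consumer clients send -1, while follower brokers send their own (non-negative) broker id. The helper names below are illustrative only, not Kafka source code.

```python
# Illustrative sketch (not Kafka source): classify fetch requests by the
# replicaId field carried in the Kafka FetchRequest. Consumer clients send
# -1; follower brokers send their own non-negative broker id.

def is_consumer_fetch(replica_id: int) -> bool:
    """True when the fetch request originates from a (real) client consumer."""
    return replica_id < 0

def count_consumer_fetches(replica_ids):
    """Count the fetch requests in a stream that came from consumers."""
    return sum(1 for rid in replica_ids if is_consumer_fetch(rid))
```

A broker-side metric like the one proposed would increment on exactly those requests for which this predicate holds.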


[jira] [Created] (KAFKA-7094) Variate should unify code style in one method, and use camel name

2018-06-25 Thread Matt Wang (JIRA)
Matt Wang created KAFKA-7094:


 Summary: Variate should unify code style in one method, and use  
camel name
 Key: KAFKA-7094
 URL: https://issues.apache.org/jira/browse/KAFKA-7094
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 1.0.1
Reporter: Matt Wang


In one method there are two variables, partitionsTobeLeader and 
partitionsToBeFollower, which should follow a single, consistent camelCase 
naming style; unifying them will help code maintenance.

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


Re: [VOTE] KIP-280: Enhanced log compaction

2018-06-25 Thread Luís Cabral
 Hi,

So, is everyone OK using the approach with 2 properties?

E.g.:

Scenario 1:
    compaction.strategy: offset

    :- Behaviour is the same as what currently exists, where the compaction is 
done only via the 'offset'


Scenario 2:
    compaction.strategy: timestamp

    :- Similar to 'offset', but the record timestamp is used instead


Scenario 3:
    compaction.strategy: header

    compaction.strategy.header: xyz

    :- Searches the headers for the 'xyz' key when performing the compaction, 
defaulting to the 'offset' strategy if this header does not exist (note the 
'.header' suffix, which would allow additional strategies to add whatever 
extra configuration they need).

Scenario 4 (hypothetical future):
    compaction.strategy: foo

    compaction.strategy.foo.name: bar
    compaction.strategy.foo.order: DESC
    compaction.strategy.foo.fallback: timestamp


    :- This one is just to show what I meant with the '.header' suffix 
mentioned in {Scenario 3}
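
For illustration, Scenario 3 above would correspond to topic-level configuration like the following. Note that these keys are only proposed in KIP-280; they do not exist in any released Kafka version.

```properties
# Hypothetical topic configuration per the KIP-280 proposal (not released Kafka)
cleanup.policy=compact
compaction.strategy=header
compaction.strategy.header=xyz
# compaction falls back to the 'offset' strategy for records lacking the 'xyz' header
```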



Regards,
Luís


On Monday, June 18, 2018, 11:56:51 PM GMT+2, Guozhang Wang 
 wrote:  
 
 Hi Matthias,

Yes, we are effectively assigning the whole space of Strings minus the
currently reserved ones as header keys; honestly, I think users wanting to
use `_something_` would be very rare in practice, but I admit it may still
be possible in theory.

I think Luís's point about "header=<key>" is that having an expression
evaluation as the config value is a bit weird, and on second thought it is
still not flawless: one can still argue that we are effectively reserving
the whole "header=*" sub-space of Strings for headers, and what if users
want to use a reserved value falling into that sub-space (again, this should
not really happen in practice; just being paranoid here).

It seems that two configs are the common choice that everyone is happy with.

Guozhang


On Mon, Jun 18, 2018 at 2:35 PM, Matthias J. Sax 
wrote:

> Luis,
>
> I meant to update the "Rejected Alternative" sections, what you have
> done already. Thx.
>
> Originally, I also had the idea about a second config, but thought it
> might be easier to just change the allowed values to be `offset`,
> `timestamp`, `header=<key>`. (We try to keep the number of configs small
> if possible, as more configs are more confusing to users.)
>
> I don't think that using `_offset_`, `_timestamp_` and `<key>` solves
> the problem because users still might use `_something_` as header key --
> and if we want to introduce a new compaction strategy "something" later
> we face the same issues as without the underscores. We only reduce the
> likelihood that it happens.
>
> Using `header=` as prefix or introducing a second config, that is only
> effective if the strategy is set to `header` seems to be a cleaner
> solution.
>
> @Luis: why do you think that using `header=` is an "incorrect
> approach"?
>
> > Though I would still prefer to keep it as it is, as it's a much simpler
> > and cleaner approach – I’m not so sure that a potential client would
> > really be so inconvenienced by having to use “_offset_” or
> > “_timestamp_” as a header
>
> I don't think that it's about the issue that people cannot use
> `_offset_` or `_timestamp_` in their header (by "use" I mean for
> compaction). With the current KIP, they cannot use `offset` or
> `timestamp` either. The issue is, that we cannot introduce a new system
> supported compaction strategy in the future without potentially breaking
> something, as we basically assign the whole space of Strings (minus
> `offset`, `timestamp`) as valid configs to enable header based compaction.
>
> Personally, I prefer either adding a config or going with
> `header=<key>`. Using `_timestamp_`, `_offset_`, and `<key>` might be
> good enough (even if this is the solution I like least). For this case,
> we should state explicitly, that the whole space of `_*_` is reserved
> and users are not allowed to set those for header compaction. In fact, I
> would also add a check for the config that only allows for `_offset_`
> and `_timestamp_` and throws an exception for all other `_*_` configs.
>
>
> -Matthias
>
>
> On 6/18/18 2:03 PM, Luís Cabral wrote:
> > I’m ok with that...
> >
> > Ted / Matthias?
> >
> >
> > From: Guozhang Wang
> > Sent: 18 June 2018 22:49
> > To: dev@kafka.apache.org
> > Subject: Re: [VOTE] KIP-280: Enhanced log compaction
> >
> > How about making the reserved values "_offset_" and "_timestamp_"
> > then? Currently in the KIP they are reserved as "offset" and "timestamp".
> >
> >
> > Guozhang
> >
> > On Mon, Jun 18, 2018 at 1:40 PM, Luís Cabral
> 
> > wrote:
> >
> >> Hi Guozhang,
> >>
> >> Yes, that is what I meant (separate configs).
> >> Though I would still prefer to keep it as it is, as it's a much simpler
> >> and cleaner approach – I’m not so sure that a potential client would
> >> really be so inconvenienced by having to use “_offset_” or
> >> “_timestamp_” as a header
> >>
> >> Cheers,
> >> Luís
> >>
> >>
> >> From: Guozhang Wang
> >> Sent: 18 June 2018 19:35
> >> To: 
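
The reserved-name check Matthias suggests for the `_*_` alternative could be sketched as below. This is a hypothetical helper for illustration only, not code from Kafka or the KIP; `resolve_strategy` and its return shape are invented names.

```python
# Hypothetical validator for the single-config `_*_` alternative discussed
# above: `_offset_` and `_timestamp_` select the built-in strategies, any
# other `_*_` value is rejected as reserved, and everything else is treated
# as a header key for header-based compaction.

RESERVED = {"_offset_": "offset", "_timestamp_": "timestamp"}

def resolve_strategy(value: str):
    """Map a config value to a (strategy, header_key_or_None) pair."""
    if value in RESERVED:
        return RESERVED[value], None
    if len(value) > 1 and value.startswith("_") and value.endswith("_"):
        # the whole `_*_` space is reserved for future built-in strategies
        raise ValueError(f"reserved compaction strategy name: {value}")
    return "header", value
```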

Re: Using Kafka as Primary Datastore without compaction

2018-06-25 Thread Aman Rastogi
As per my understanding, you should set the retention period to
Long.MAX_VALUE hours (or simply retention.ms=-1). This ensures your messages
are never deleted, because the retention period is effectively unbounded.
Note that compaction is a separate mechanism, controlled by cleanup.policy.

Regards,
Aman
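
Concretely, the topic-level settings that keep every record are shown below. These are real Kafka topic configs: compaction only runs when cleanup.policy includes 'compact' (the default is 'delete'), and retention.ms=-1 / retention.bytes=-1 disable time- and size-based deletion.

```properties
# Topic-level configuration to retain every record indefinitely
cleanup.policy=delete   # the default; log compaction never runs
retention.ms=-1         # disable time-based deletion
retention.bytes=-1      # the default; disable size-based deletion
```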

On Mon, Jun 25, 2018 at 8:27 AM, Barathan Kulothongan  wrote:

> Hi there, I am currently reading the Kafka: The Definitive Guide book, as we
> want to architect an application using Kafka as the primary data store.
> In the Log Compaction section you mention that dirty records are
> compacted, leaving just the latest record per key in the topic. The
> application we are architecting needs to maintain bi-temporal data with all
> the message versions of the same key intact. What properties should we
> set to make sure none of the messages gets compacted? Or is it
> safer to store the messages in another datastore (PostgreSQL) rather than
> relying on Kafka as our only data store?
>
> PS: The application that will use Kafka as the datastore is expected to
> store less than 1GB worth of data in 200+ topics.
>
> Regards,
> Barathan Kulothongan