[jira] [Resolved] (KAFKA-6085) Streams rebalancing may cause a first batch of fetched records to be dropped

2017-10-18 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6085?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-6085.
--
   Resolution: Not A Problem
Fix Version/s: (was: 1.0.0)

This is actually not a bug: the behavior was only introduced in 
https://github.com/apache/kafka/pull/4085 while trying to improve restoration 
latency. Resolving this as Not A Problem.

> Streams rebalancing may cause a first batch of fetched records to be dropped
> 
>
> Key: KAFKA-6085
> URL: https://issues.apache.org/jira/browse/KAFKA-6085
> Project: Kafka
>  Issue Type: Bug
>Affects Versions: 0.11.0.1
>Reporter: Guozhang Wang
>Assignee: Guozhang Wang
>Priority: Blocker
>
> This is a regression introduced in KAFKA-5152:
> Assume you have one task without any state stores (and hence no restoration 
> needed for that task), and a rebalance happens during a {{records = 
> pollRequests(pollTimeMs);}} call:
> 1. We name this `pollRequests` call A. Within call A the rebalance will 
> happen, which moves the thread state from RUNNING to PARTITION_REVOKED, and 
> then from PARTITION_REVOKED to PARTITION_ASSIGNED. Assuming the same task gets 
> assigned again, this task will be in the initialized set of tasks but NOT in 
> the running tasks yet.
> 2. Within the same call A, a fetch request may be sent and a response with a 
> batch of records could be returned, and it will be returned from 
> `pollRequests`. At this time the thread state becomes PARTITION_ASSIGNED and 
> the task is not "running" yet.
> 3. Now the bug comes in this line:
> {{!records.isEmpty() && taskManager.hasActiveRunningTasks()}}
> Since the task is not in the active running set yet, this returned set of 
> records would be skipped. Effectively these records are dropped on the floor 
> and would never be consumed again.
> 4. In the next run loop, the same `pollRequests()` call will happen again. Let's 
> call it B. After B is called we will set the thread state to RUNNING and put 
> the task to the running task set. But at this point the previous batch of 
> records will not be returned any more.
> So the bug lies in the fact that, within a single run loop of the stream 
> thread, we may complete a rebalance with tasks assigned but not yet 
> initialized, AND we can fetch a batch of records for that not-yet-initialized 
> task and drop it on the floor.
> With further investigation I can confirm that this bug is also the root cause 
> of the flaky test https://issues.apache.org/jira/browse/KAFKA-5140, and a 
> recent PR https://github.com/apache/kafka/pull/4086 exposed this bug by 
> making the reset integration test fail more frequently.
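The race described in step 3 can be sketched with a minimal plain-Java model. The guard condition mirrors the real check quoted above, but everything else here is an illustration, not the actual StreamThread code:

```java
import java.util.ArrayList;
import java.util.List;

// Minimal model of the guard `!records.isEmpty() && taskManager.hasActiveRunningTasks()`.
// Records fetched in the same poll that completed a rebalance are discarded,
// because the re-assigned task is initialized but not yet "running".
public class DroppedRecordsSketch {
    static List<String> maybeProcess(List<String> records, boolean hasActiveRunningTasks) {
        if (!records.isEmpty() && hasActiveRunningTasks) {
            return new ArrayList<>(records);
        }
        // The batch is dropped here; the consumer position has already
        // advanced past it, so it is never fetched again.
        return new ArrayList<>();
    }

    public static void main(String[] args) {
        // Call A: rebalance completed inside poll, task assigned but not running.
        System.out.println(maybeProcess(List.of("r1", "r2"), false)); // prints []
        // Call B: task is running now, but the earlier batch is gone.
        System.out.println(maybeProcess(List.of(), true));            // prints []
    }
}
```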



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #4096: HOTFIX: Poll with zero milliseconds during restora...

2017-10-18 Thread guozhangwang
GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/4096

HOTFIX: Poll with zero milliseconds during restoration phase

1. After the poll call, re-check if the state has been changed or not; if 
yes, initialize the tasks again.

2. Cherry-pick the flaky test fix from #4095

3. Minor log4j improvements.
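The first point can be sketched as follows; this is a plain-Java stand-in for the actual StreamThread logic, with illustrative enum and method names:

```java
public class RestorePollSketch {
    enum State { PARTITIONS_ASSIGNED, RUNNING }

    // During restoration (PARTITIONS_ASSIGNED) poll with 0 ms so the call
    // returns immediately; after each poll the caller re-checks the state
    // and, if a rebalance changed it, initializes the newly assigned tasks.
    static long pollTimeFor(State state, long configuredPollMs) {
        return state == State.PARTITIONS_ASSIGNED ? 0L : configuredPollMs;
    }

    public static void main(String[] args) {
        System.out.println(pollTimeFor(State.PARTITIONS_ASSIGNED, 100L)); // prints 0
        System.out.println(pollTimeFor(State.RUNNING, 100L));             // prints 100
    }
}
```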

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka KHotfix-restore-only

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4096.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4096






---


[GitHub] kafka pull request #4086: [WIP] KAFKA-6085: Pause all partitions before task...

2017-10-18 Thread guozhangwang
Github user guozhangwang closed the pull request at:

https://github.com/apache/kafka/pull/4086


---




[GitHub] kafka pull request #4086: [WIP] KAFKA-6085: Pause all partitions before task...

2017-10-18 Thread guozhangwang
GitHub user guozhangwang reopened a pull request:

https://github.com/apache/kafka/pull/4086

[WIP] KAFKA-6085: Pause all partitions before tasks are initialized

Mirror of #4085 against trunk. This PR contains two fixes (one major and 
one minor):

Major: on rebalance, pause all partitions instead of the partitions for 
tasks with state stores only, so that no records will be returned in the same 
`pollRecords()` call.

Minor: during the restoration phase, when thread state is still 
PARTITION_ASSIGNED, call consumer.poll with hard-coded pollMs = 0.
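The major fix can be modeled with a small set computation; partition names and helper methods below are illustrative, not the real StreamThread API surface:

```java
import java.util.HashSet;
import java.util.Set;

public class PauseAllSketch {
    // Old behavior paused only partitions whose tasks need state restoration,
    // so a stateless task's partitions could return records in the very
    // poll() that completed the rebalance, before the task was initialized.
    // Pausing the full assignment closes that window.
    static Set<String> fetchablePartitions(Set<String> assignment, Set<String> paused) {
        Set<String> fetchable = new HashSet<>(assignment);
        fetchable.removeAll(paused);
        return fetchable;
    }

    public static void main(String[] args) {
        Set<String> assignment = Set.of("t-0", "t-1"); // t-0 stateless, t-1 stateful
        Set<String> pausedOld = Set.of("t-1");         // only restoring partitions paused
        Set<String> pausedNew = assignment;            // the fix: pause everything
        System.out.println(fetchablePartitions(assignment, pausedOld)); // prints [t-0]
        System.out.println(fetchablePartitions(assignment, pausedNew)); // prints []
    }
}
```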

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka KHotfix-restore-only

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4086.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4086


commit 62bf4784779f7379e849289c4456363f352cb850
Author: Guozhang Wang 
Date:   2017-10-12T21:18:46Z

dummy

commit 5726e39cba8a79e6858e8b932c5116b60f2ae314
Author: Guozhang Wang 
Date:   2017-10-12T21:18:46Z

dummy

fix issues

Remove debugging information

commit 8214a3ee340791eb18f7e5fa77f2510470cf977a
Author: Matthias J. Sax 
Date:   2017-10-17T00:38:31Z

MINOR: update exception message for KIP-120

Author: Matthias J. Sax 

Reviewers: Guozhang Wang 

Closes #4078 from mjsax/hotfix-streams

commit 637b76342801cf4a32c9e65aa89bfe0bf76c24a7
Author: Jason Gustafson 
Date:   2017-10-17T00:49:35Z

MINOR: A few javadoc fixes

Author: Jason Gustafson 

Reviewers: Guozhang Wang 

Closes #4076 from hachikuji/javadoc-fixes

commit f57c505f6e714b891a6d30e5501b463f14316708
Author: Damian Guy 
Date:   2017-10-17T01:01:32Z

MINOR: add equals to SessionWindows

Author: Damian Guy 

Reviewers: Guozhang Wang , Matthias J. 
Sax, Bill Bejeck 

Closes #4074 from dguy/minor-session-window-equals

commit 2f1dd0d4da24eee352f20436902d825d7851c45b
Author: Guozhang Wang 
Date:   2017-10-18T01:27:35Z

normal poll with zero during restoration

commit 043f28ac89b50f9145ac719449f03a427376dcde
Author: Guozhang Wang 
Date:   2017-10-19T04:58:36Z

recheck state change




---


[GitHub] kafka pull request #4095: KAFKA-5140: Fix reset integration test

2017-10-18 Thread guozhangwang
GitHub user guozhangwang opened a pull request:

https://github.com/apache/kafka/pull/4095

KAFKA-5140: Fix reset integration test

The MockTime was incorrectly shared across multiple test methods within the 
class as a class rule. Instead we now set it up in each test case; we also remove the 
Scala MockTime dependency.
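The essence of the fix can be illustrated with a plain-Java stand-in; this `MockTime` is a simplified mock clock, not Kafka's actual test utility:

```java
public class FreshTimePerTest {
    // Simplified stand-in for a mock clock shared between test methods.
    static class MockTime {
        private long ms;
        MockTime(long startMs) { this.ms = startMs; }
        long milliseconds() { return ms; }
        void sleep(long deltaMs) { ms += deltaMs; }
    }

    // Before: one MockTime as a class rule, so time advanced by one test
    // method leaked into the next. After: each test constructs its own.
    static MockTime newTestTime() { return new MockTime(0L); }

    public static void main(String[] args) {
        MockTime first = newTestTime();
        first.sleep(5_000L);                       // "test one" advances time
        System.out.println(first.milliseconds());  // prints 5000
        MockTime second = newTestTime();           // "test two" starts clean
        System.out.println(second.milliseconds()); // prints 0
    }
}
```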

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/guozhangwang/kafka 
KMinor-reset-integration-test

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4095.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4095


commit 1aac7bec94de5b2d26d72bade9e654192f18d576
Author: Guozhang Wang 
Date:   2017-10-12T21:18:46Z

dummy

commit 83f176c26dd092f924e2d273a4fa763ce5d7cdcf
Author: Guozhang Wang 
Date:   2017-10-19T04:44:18Z

fix issues




---


[jira] [Resolved] (KAFKA-6088) Kafka Consumer slows down when reading from highly compacted topics

2017-10-18 Thread James Cheng (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Cheng resolved KAFKA-6088.

Resolution: Won't Fix

This is fixed in Kafka client 0.11.0.0, and 0.11.0.0 clients can be used against 
brokers as far back as 0.10.0.0. Anyone affected can therefore update 
their Kafka clients to pick up the fix, so we won't issue a patch fix for 
older releases.

> Kafka Consumer slows down when reading from highly compacted topics
> ---
>
> Key: KAFKA-6088
> URL: https://issues.apache.org/jira/browse/KAFKA-6088
> Project: Kafka
>  Issue Type: Bug
>  Components: clients
>Affects Versions: 0.10.2.1
>Reporter: James Cheng
> Fix For: 0.11.0.0
>

[jira] [Created] (KAFKA-6088) Kafka Consumer slows down when reading from highly compacted topics

2017-10-18 Thread James Cheng (JIRA)
James Cheng created KAFKA-6088:
--

 Summary: Kafka Consumer slows down when reading from highly 
compacted topics
 Key: KAFKA-6088
 URL: https://issues.apache.org/jira/browse/KAFKA-6088
 Project: Kafka
  Issue Type: Bug
  Components: clients
Affects Versions: 0.10.2.1
Reporter: James Cheng
 Fix For: 0.11.0.0


Summary of the issue
-
We found a performance issue with the Kafka Consumer where it gets less 
efficient if you have frequent gaps in offsets (which happens when there is 
lots of compaction on the topic).

The issue is present in 0.10.2.1 and possibly prior.

It is fixed in 0.11.0.0.

Summary of cause
-
The fetcher code assumes that there will be no gaps in message offsets. If 
there are, it does an additional round trip to the broker. For topics with 
large gaps in offsets, it is possible that most calls to {{poll()}} will 
generate a roundtrip to the broker.

Background and details 
-
We have a topic with roughly 8 million records. The topic is log compacted. It 
turns out that most of the initial records in the topic were never overwritten, 
whereas in the 2nd half of the topic we had lots of overwritten records. That 
means that for the first part of the topic, there are no gaps in offsets. But 
in the 2nd part of the topic, there are frequent gaps in the offsets (due to 
records being compacted away).

We have a consumer that starts up and reads the entire topic from beginning to 
end. We noticed that the consumer would read through the first part of the 
topic very quickly. When it got to the part of the topic with frequent gaps in 
offsets, consumption rate slowed down dramatically. This slowdown was 
consistent across multiple runs.

What is happening is this:
1) A call to {{poll()}} happens. The consumer goes to the broker and returns 
1MB of data (the default of {{max.partition.fetch.bytes}}). It then returns to 
the caller just 500 records (the default of {{max.poll.records}}), and keeps 
the rest of the data in memory to use in future calls to {{poll()}}. 
2) Before returning the 500 records, the consumer library records the *next* 
offset it should return. It does so by taking the offset of the last record, 
and adds 1 to it. (The offset of the 500th message from the set, plus 1). It 
calls this the {{nextOffset}}
3) The application finishes processing the 500 messages, and makes another call 
to {{poll()}}. During this call, the consumer library does a sanity 
check. It checks that the first message of the set *it is about to return* has 
an offset that matches the value of {{nextOffset}}. That is, it checks whether the 
501st record has an offset that is 1 greater than that of the 500th record.
a. If it matches, then it returns an additional 500 records, and 
increments the {{nextOffset}} to (offset of the 1000th record, plus 1)
b. If it doesn't match, then it throws away the remainder of the 1MB of 
data that it stored in memory in step 1, and it goes back to the broker to 
fetch an additional 1MB of data, starting at the offset {{nextOffset}}.

In topics that have no gaps (i.e., non-compacted topics), the code will always hit 
the 3a code path.
If the topic has gaps in offsets and the call to {{poll()}} happens to fall 
onto a gap, then the code will hit code path 3b.

If the gaps are frequent, then it will frequently hit code path 3b.

The worst case scenario that can happen is if you have a large number of gaps, 
and you run with {{max.poll.records=1}}. Every gap will result in a new fetch 
to the broker. You may possibly end up only processing one message per fetch. 
Or, said another way, you will end up doing a single fetch for every single 
message in the partition.
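The round-trip amplification in step 3b can be modeled with a small simulation. This is a simplified model of the 0.10.2.1 Fetcher behavior, assuming the whole partition fits into one fetched buffer:

```java
import java.util.List;

public class GapFetchSketch {
    // Counts broker round trips: a buffered batch is only reused when its
    // next record's offset equals nextOffset (last returned offset + 1);
    // on a mismatch the buffer is discarded and a new fetch is issued.
    static int countRoundTrips(List<Long> offsets, int maxPollRecords) {
        int roundTrips = 1;                 // initial fetch fills the buffer
        long nextOffset = offsets.get(0);
        int i = 0;
        while (i < offsets.size()) {
            if (offsets.get(i) != nextOffset) {
                roundTrips++;               // gap: path 3b, refetch from broker
            }
            int end = Math.min(i + maxPollRecords, offsets.size());
            nextOffset = offsets.get(end - 1) + 1;  // path 3a bookkeeping
            i = end;
        }
        return roundTrips;
    }

    public static void main(String[] args) {
        // No gaps: one fetch serves every poll().
        System.out.println(countRoundTrips(List.of(0L, 1L, 2L, 3L, 4L), 1)); // prints 1
        // Every other offset compacted away: one fetch per record.
        System.out.println(countRoundTrips(List.of(1L, 3L, 5L, 7L, 9L), 1)); // prints 5
    }
}
```

With `max.poll.records=1` and a gap before every surviving record, the simulation degenerates to one round trip per message, matching the observed slowdown.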


Repro
-

We created a repro. It appears that the bug is in 0.10.2.1, but was fixed in 
0.11. I've attached the tarball with all the code and instructions. 

The repro is:
1) Create a single partition topic with log compaction turned on 
2) Write messages with the following keys: 1 1 2 2 3 3 4 4 5 5 ... (each 
message key written twice in a row) 
3) Let compaction happen. This would mean that offsets 0 2 4 6 8 10 ... 
would be compacted away 
4) Consume from this topic with {{max.poll.records=1}}

More concretely,

Here is the producer code:
{code}
Producer<String, String> producer = new KafkaProducer<>(props);
for (int i = 0; i < 100; i++) {
    producer.send(new ProducerRecord<>("compacted",
            Integer.toString(i), Integer.toString(i)));
    producer.send(new ProducerRecord<>("compacted",
            Integer.toString(i), Integer.toString(i)));
}
producer.flush();
producer.close();
{code}


When consuming with a 0.10.2.1 consumer, you can see this pattern (with Fetcher 
logs at DEBUG, see file consumer_0.10.2/debug.log):

{code}
offset = 1, key = 0, value = 0 
22:58:51.262 [main] DEBUG 

Re: [DISCUSS] KIP-205: Add getAllKeys() API to ReadOnlyWindowStore

2017-10-18 Thread Richard Yu
Soliciting more feedback before vote.


Re: [DISCUSS] KIP-205: Add getAllKeys() API to ReadOnlyWindowStore

2017-10-18 Thread Richard Yu
Is this KIP close to completion? Because we could start working on the code
itself now. (It's at about this stage.)

On Mon, Oct 16, 2017 at 7:37 PM, Richard Yu 
wrote:

> As Guozhang Wang mentioned earlier, we want to mirror the structure of
> similar Store class (namely KTable). The WindowedStore class might be
> unique in itself as it uses fetch() methods, but in my opinion, uniformity
> is preferable for the sake of simplicity.
>
> On Mon, Oct 16, 2017 at 11:54 AM, Xavier Léauté 
> wrote:
>
>> Thank you Richard! Do you or Guozhang have any thoughts on my suggestions
>> to use fetchAll() and fetchAll(timeFrom, timeTo) and reserve the "range"
>> keyword for when we query a specific range of keys?
>>
>> Xavier
>>
>> On Sat, Oct 14, 2017 at 2:32 PM Richard Yu 
>> wrote:
>>
>> > Thanks for the clarifications, Xavier.
>> > I have removed most of the methods except for keys() and all() which has
>> > been renamed to Guozhang Wang's suggestions.
>> >
>> > Hope this helps.
>> >
>> > On Fri, Oct 13, 2017 at 3:28 PM, Xavier Léauté 
>> > wrote:
>> >
>> > > Thanks for the KIP Richard, this is a very useful addition!
>> > >
>> > > As far as the API changes, I just have a few comments on the methods
>> that
>> > > don't seem directly related to the KIP title, and naming of course :).
>> > > On the implementation, see my notes further down that will hopefully
>> > > clarify a few things.
>> > >
>> > > Regarding the "bonus" methods:
>> > > I agree with Guozhang that the KIP lacks proper motivation for adding
>> the
>> > > min, max, and allLatest methods.
>> > > It is also not clear to me what min and max would really mean, what
>> > > ordering do we refer to here? Are we first ordering by time, then
>> key, or
>> > > first by key, then time?
>> > > The allLatest method might be useful, but I don't really see how it
>> would
>> > > be used in practice if we have to scan the entire range of keys for
>> all
>> > the
>> > > state stores, every single time.
>> > >
>> > > Maybe we could flesh the motivation behind those extra methods, but in
>> > the
>> > > interest of time, and moving the KIP forward it might make sense to
>> file
>> > a
>> > > follow-up once we have more concrete use-cases.
>> > >
>> > > On naming:
>> > > I also agree with Guozhang that "keys()" should be renamed. It feels a
>> > bit
>> > > of a misnomer, since it not only returns keys, but also the values.
>> > >
>> > > As far as what to rename it to, I would argue we already have some
>> > > discrepancy between key-value stores using range() vs. window stores
>> > using
>> > > fetch().
>> > > I assume we called the window method "fetch" instead of "get" because
>> you
>> > > might get back more than one window for the requested key.
>> > >
>> > > If we wanted to make things consistent with both existing key-value
>> store
>> > > naming and window store naming, we could do the following:
>> > > Decide that "all" always refers to the entire range of keys,
>> independent
>> > of
>> > > the window and similarly "range" always refers to a particular range
>> of
>> > > keys, irrespective of the window.
>> > > We can then prefix methods with "fetch" to indicate that more than one
>> > > window may be returned for each key in the range.
>> > >
>> > > This would give us:
>> > > - a new fetchAll() method for all the keys, which makes it clear that
>> you
>> > > might get back the same key in different windows
>> > > - a new fetchAll(timeFrom, timeTo) method to get all the keys in a
>> given
>> > > time range, again with possibly more than one window per key
>> > > - and we'd have to rename fetch(K,K,long, long) to fetchRange(K, K,
>> long,
>> > > long)  and deprecate the old one to indicate a range of keys
>> > >
>> > > One inconsistency I noted: the "Proposed Changes" section in your KIP
>> > talks
>> > > about a "range(timeFrom, timeTo)" method, I think you meant to refer
>> to
>> > the
>> > > all(from, to) method, but I'm sure you'll fix that once we decide on
>> > > naming.
>> > >
>> > > On the implementation side:
>> > > You mentioned that caching and rocksdb store have very different
>> > key/value
>> > > structures, and while it appears to be that way on the surface, the
>> > > structure between the two is actually very similar. Keys in the cache
>> are
>> > > prefixed with a segment ID to ensure the ordering in the cache stays
>> > > consistent with the rocksdb implementation, which maintains multiple
>> > > rocksdb instances, one for each segment. So we just "artificially"
>> mirror
>> > > the segment structure in the cache.
>> > >
>> > > The reason for keeping the ordering consistent is pretty simple: keep
>> in
>> > > mind that when we query a cached window store we are effectively
>> querying
>> > > both the cache and the persistent rocksdb store at the same time,
>> merging
>> > > results from both. To make that merge as painless as possible, we
>> ensure
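Xavier's naming proposal would give the read-only window store an interface shaped roughly like this. The signatures are illustrative only; the KIP had not settled on them, and the return types here are simplified stand-ins for the real windowed iterators:

```java
import java.util.Iterator;

// Sketch of the proposed naming: "all"/"fetchAll" spans every key,
// "range"/"fetchRange" spans a key range; the "fetch" prefix signals that
// more than one window may come back per key.
interface WindowStoreSketch<K, V> {
    Iterator<V> fetchAll();                                           // all keys, all windows
    Iterator<V> fetchAll(long timeFrom, long timeTo);                 // all keys, bounded time range
    Iterator<V> fetchRange(K from, K to, long timeFrom, long timeTo); // key range, time range
}
```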

[GitHub] kafka pull request #4094: MINOR: Correct KafkaProducer Javadoc spelling of p...

2017-10-18 Thread hmcl
GitHub user hmcl opened a pull request:

https://github.com/apache/kafka/pull/4094

MINOR: Correct KafkaProducer Javadoc spelling of property 
'max.in.flight.requests.per.connection'

Currently, in branches _trunk_, _0.11.0_, and _1.0_, the property 
**max.in.flight.requests.per.connection** is misspelled as 
_max.inflight.requests.per.connection_.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/hmcl/kafka-apache trunk_MINOR_Doc_InflightProp

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4094.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4094


commit caff1e1b78d12a8086bfa52b2b9a307097d8840b
Author: Hugo Louro 
Date:   2017-10-19T02:30:30Z

MINOR: Correct KafkaProducer Javadoc spelling of property 
'max.in.flight.requests.per.connection'




---


[GitHub] kafka pull request #4093: KAFKA-6083: The Fetcher should add the InvalidReco...

2017-10-18 Thread efeg
Github user efeg closed the pull request at:

https://github.com/apache/kafka/pull/4093


---


[GitHub] kafka pull request #4093: KAFKA-6083: The Fetcher should add the InvalidReco...

2017-10-18 Thread efeg
GitHub user efeg reopened a pull request:

https://github.com/apache/kafka/pull/4093

KAFKA-6083: The Fetcher should add the InvalidRecordException as a cause to 
the KafkaException when invalid record is found.
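The change amounts to passing the low-level exception as the cause when rethrowing, as the plain-Java analogue below illustrates. The real code wraps `InvalidRecordException` in `KafkaException`; the names and message here are generic stand-ins:

```java
public class CauseWrapSketch {
    // Wrapping preserves the original diagnostic: callers catching the
    // high-level exception can still inspect getCause() for the root error.
    static RuntimeException wrap(Exception rootCause) {
        return new RuntimeException("Received exception when fetching the next record", rootCause);
    }

    public static void main(String[] args) {
        RuntimeException e = wrap(new IllegalStateException("CRC mismatch"));
        System.out.println(e.getCause().getMessage()); // prints CRC mismatch
    }
}
```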



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/efeg/kafka bug/KAFKA-6083

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4093.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4093


commit 88e07d3a3115c4342ad3714d4397ff39b326f12a
Author: Adem Efe Gencer 
Date:   2017-10-19T00:20:04Z

Add the InvalidRecordException as a cause to the KafkaException when 
invalid record is found.




---


[GitHub] kafka pull request #4093: [KAFKA-6083] The Fetcher should add the InvalidRec...

2017-10-18 Thread efeg
GitHub user efeg opened a pull request:

https://github.com/apache/kafka/pull/4093

[KAFKA-6083] The Fetcher should add the InvalidRecordException as a cause 
to the KafkaException when invalid record is found.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/efeg/kafka bug/KAFKA-6083

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4093.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4093


commit 88e07d3a3115c4342ad3714d4397ff39b326f12a
Author: Adem Efe Gencer 
Date:   2017-10-19T00:20:04Z

Add the InvalidRecordException as a cause to the KafkaException when 
invalid record is found.




---


Re: [DISCUSS] KIP-211: Revise Expiration Semantics of Consumer Group Offsets

2017-10-18 Thread Ted Yu
Please fill out 'Rejected Alternatives' section.

Thanks

On Wed, Oct 18, 2017 at 4:45 PM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> Hi all,
>
> I created a KIP to address the group offset expiration issue reported in
> KAFKA-4682:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets
>
> Your feedback is welcome!
>
> Thanks.
> --Vahid
>
>


[GitHub] kafka pull request #4092: KAFKA-6087: Scanning plugin.path needs to support ...

2017-10-18 Thread kkonstantine
GitHub user kkonstantine opened a pull request:

https://github.com/apache/kafka/pull/4092

KAFKA-6087: Scanning plugin.path needs to support relative symlinks.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kkonstantine/kafka 
KAFKA-6087-Scanning-plugin.path-needs-to-support-relative-symlinks

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4092.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4092


commit 1348ea60fcdf4a10ff2b27039b2dc7d5ad16141e
Author: Konstantine Karantasis 
Date:   2017-10-18T23:49:19Z

KAFKA-6087: Scanning plugin.path needs to support relative symlinks.




---


[jira] [Created] (KAFKA-6087) Scanning plugin.path needs to support relative symlinks

2017-10-18 Thread Konstantine Karantasis (JIRA)
Konstantine Karantasis created KAFKA-6087:
-

 Summary: Scanning plugin.path needs to support relative symlinks
 Key: KAFKA-6087
 URL: https://issues.apache.org/jira/browse/KAFKA-6087
 Project: Kafka
  Issue Type: Bug
  Components: KafkaConnect
Affects Versions: 0.11.0.0
Reporter: Konstantine Karantasis
Assignee: Konstantine Karantasis
 Fix For: 1.0.0, 0.11.0.2



Discovery of Kafka Connect plugins supports symbolic links from within the 
{{plugin.path}} locations, but this ability is restricted to absolute symbolic 
links.

It's essential to support relative symbolic links, as this is the most common 
use case from within the plugin locations. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[DISCUSS] KIP-211: Revise Expiration Semantics of Consumer Group Offsets

2017-10-18 Thread Vahid S Hashemian
Hi all,

I created a KIP to address the group offset expiration issue reported in 
KAFKA-4682:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-211%3A+Revise+Expiration+Semantics+of+Consumer+Group+Offsets

Your feedback is welcome!

Thanks.
--Vahid



[jira] [Created] (KAFKA-6086) KIP-210 Provide for custom error handling when Kafka Streams fails to produce

2017-10-18 Thread Matt Farmer (JIRA)
Matt Farmer created KAFKA-6086:
--

 Summary: KIP-210 Provide for custom error handling when Kafka 
Streams fails to produce
 Key: KAFKA-6086
 URL: https://issues.apache.org/jira/browse/KAFKA-6086
 Project: Kafka
  Issue Type: Improvement
Reporter: Matt Farmer


This is an issue related to the following KIP:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-210+-+Provide+for+custom+error+handling++when+Kafka+Streams+fails+to+produce





Re: [DISCUSS] KIP-210: Provide for custom error handling when Kafka Streams fails to produce

2017-10-18 Thread Matt Farmer
I’ll create the JIRA ticket.

I think that config name will work. I’ll update the KIP accordingly.
On Wed, Oct 18, 2017 at 6:09 PM Ted Yu  wrote:

> Can you create JIRA that corresponds to the KIP ?
>
> For the new config, how about naming it
> production.exception.processor.class
> ? This way it is clear that a class name should be specified.
>
> Cheers
>
> On Wed, Oct 18, 2017 at 2:40 PM, Matt Farmer  wrote:
>
> > Hello everyone,
> >
> > This is the discussion thread for the KIP that I just filed here:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 210+-+Provide+for+custom+error+handling++when+Kafka+
> > Streams+fails+to+produce
> >
> > Looking forward to getting some feedback from folks about this idea and
> > working toward a solution we can contribute back. :)
> >
> > Cheers,
> > Matt Farmer
> >
>


[jira] [Resolved] (KAFKA-5911) Avoid creation of extra Map for futures in KafkaAdminClient

2017-10-18 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved KAFKA-5911.
---
Resolution: Later

> Avoid creation of extra Map for futures in KafkaAdminClient
> ---
>
> Key: KAFKA-5911
> URL: https://issues.apache.org/jira/browse/KAFKA-5911
> Project: Kafka
>  Issue Type: Bug
>Reporter: Ted Yu
>  Labels: client
> Attachments: 5911.v1.txt
>
>
> In various methods of KafkaAdminClient, an extra Map is created when 
> constructing the XXResult instance.
> e.g.
> {code}
> return new DescribeReplicaLogDirResult(new 
> HashMap(futures));
> {code}
> Prior to returning, the futures Map is already filled.
> Calling get() and values() does not involve the internals of HashMap when we 
> consider thread-safety.
> The extra Map doesn't need to be created.
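To illustrate the pattern the ticket questions, here is a minimal Java sketch. The names are hypothetical (`Result` stands in for classes like `DescribeReplicaLogDirResult`); it shows that wrapping the already-filled map directly is observably equivalent to copying it first, since the map is never mutated after construction:

```java
import java.util.HashMap;
import java.util.Map;

public class FuturesWrapper {
    // Hypothetical stand-in for result classes such as DescribeReplicaLogDirResult.
    static class Result {
        private final Map<String, String> futures;
        Result(Map<String, String> futures) { this.futures = futures; }
        Map<String, String> values() { return futures; }
    }

    public static void main(String[] args) {
        Map<String, String> futures = new HashMap<>();
        futures.put("topic-0", "future"); // map is fully populated before returning

        // Pattern the ticket questions: a defensive copy of the filled map.
        Result copied = new Result(new HashMap<>(futures));
        // Proposed alternative: wrap the existing map, avoiding the extra allocation.
        Result wrapped = new Result(futures);

        System.out.println(copied.values().equals(wrapped.values())); // true
    }
}
```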





Jenkins build is back to normal : kafka-trunk-jdk7 #2901

2017-10-18 Thread Apache Jenkins Server
See 



Build failed in Jenkins: kafka-trunk-jdk9 #134

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Remove dead code

--
[...truncated 1.41 MB...]
kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyDeletionOfResourceAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyDeletionOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFound STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFound PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAclInheritance STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAclInheritance PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDistributedConcurrentModificationOfResourceAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testDistributedConcurrentModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testAclManagementAPIs PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testWildCardAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testTopicAcl PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testSuperUserHasAccess PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testDenyTakesPrecedence PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testNoAclFoundOverride PASSED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls STARTED

kafka.security.auth.SimpleAclAuthorizerTest > 
testHighConcurrencyModificationOfResourceAcls PASSED

kafka.security.auth.SimpleAclAuthorizerTest > testLoadCache STARTED

kafka.security.auth.SimpleAclAuthorizerTest > testLoadCache PASSED

kafka.integration.PrimitiveApiTest > testMultiProduce STARTED

kafka.integration.PrimitiveApiTest > testMultiProduce PASSED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch STARTED

kafka.integration.PrimitiveApiTest > testDefaultEncoderProducerAndFetch PASSED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize 
STARTED

kafka.integration.PrimitiveApiTest > testFetchRequestCanProperlySerialize PASSED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests STARTED

kafka.integration.PrimitiveApiTest > testPipelinedProduceRequests PASSED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch STARTED

kafka.integration.PrimitiveApiTest > testProduceAndMultiFetch PASSED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression STARTED

kafka.integration.PrimitiveApiTest > 
testDefaultEncoderProducerAndFetchWithCompression PASSED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic STARTED

kafka.integration.PrimitiveApiTest > testConsumerEmptyTopic PASSED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest STARTED

kafka.integration.PrimitiveApiTest > testEmptyFetchRequest PASSED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete STARTED

kafka.integration.MetricsDuringTopicCreationDeletionTest > 
testMetricsDuringTopicCreateDelete PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooLow PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooLow 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToEarliestWhenOffsetTooHigh 
PASSED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
STARTED

kafka.integration.AutoOffsetResetTest > testResetToLatestWhenOffsetTooHigh 
PASSED

kafka.integration.TopicMetadataTest > testIsrAfterBrokerShutDownAndJoinsBack 
STARTED

kafka.integration.TopicMetadataTest > testIsrAfterBrokerShutDownAndJoinsBack 
PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithCollision STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopicWithCollision PASSED

kafka.integration.TopicMetadataTest > testAliveBrokerListWithNoTopics STARTED

kafka.integration.TopicMetadataTest > testAliveBrokerListWithNoTopics PASSED

kafka.integration.TopicMetadataTest > testAutoCreateTopic STARTED

kafka.integration.TopicMetadataTest > testAutoCreateTopic PASSED

kafka.integration.TopicMetadataTest > testGetAllTopicMetadata STARTED

kafka.integration.TopicMetadataTest > testGetAllTopicMetadata PASSED


[jira] [Created] (KAFKA-6085) Streams rebalancing may cause a first batch of fetched records to be dropped

2017-10-18 Thread Guozhang Wang (JIRA)
Guozhang Wang created KAFKA-6085:


 Summary: Streams rebalancing may cause a first batch of fetched 
records to be dropped
 Key: KAFKA-6085
 URL: https://issues.apache.org/jira/browse/KAFKA-6085
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.11.0.1
Reporter: Guozhang Wang
Assignee: Guozhang Wang
Priority: Blocker
 Fix For: 1.0.0


This is a regression introduced in KAFKA-5152:

Assuming you have one task without any state stores (and hence no restoration 
needed for that task), and a rebalance happened in a {{records = 
pollRequests(pollTimeMs);}} call:

1. We name this `pollRequests` call A. Within call A the rebalance will 
happen, which moves the thread state from RUNNING to PARTITION_REVOKED, and then 
from PARTITION_REVOKED to PARTITION_ASSIGNED. Assuming the same task gets 
assigned again, it will be in the initialized set of tasks but NOT in the 
running tasks yet.

2. Within the same call A, a fetch request may be sent and a response with a 
batch of records could be returned, and it will be returned from 
`pollRequests`. At this time the thread state becomes PARTITION_ASSIGNED and the 
task is not "running" yet.

3. Now the bug comes in this line:

{{!records.isEmpty() && taskManager.hasActiveRunningTasks()}}

Since the task is not in the active running set yet, this returned set of 
records is skipped. Effectively these records are dropped on the floor 
and will never be consumed again.

4. In the next run loop, the same `pollRequests()` will be called again; let's 
call it B. After B is called we will set the thread state to RUNNING and put 
the task in the running task set. But at this point the previous batch of 
records will not be returned any more.

So the bug lies in the fact that, within a single run loop of the stream thread, 
we may complete a rebalance with tasks assigned but not yet initialized, AND we 
may fetch a batch of records for that not-yet-initialized task and drop them on 
the floor.

With further investigation I can confirm that the root cause of the flaky test 
https://issues.apache.org/jira/browse/KAFKA-5140 is also this bug, and a recent 
PR https://github.com/apache/kafka/pull/4086 exposed it by making the reset 
integration test fail more frequently.
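The four steps above can be modeled with a simplified sketch (plain Java, not the actual StreamThread code; the broker queue and task set are illustrative stand-ins). It shows how a batch fetched in the same poll call that completes a rebalance is silently dropped by the guard condition:

```java
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

// Simplified model of the run loop described above: the rebalance completes
// inside poll() (task assigned but not yet "running"), and the batch returned
// by that same poll() is skipped by the buggy guard.
public class DroppedBatchSketch {
    static Queue<List<String>> broker = new ArrayDeque<>();
    static Set<String> runningTasks = new HashSet<>();

    static List<String> poll() {
        return broker.isEmpty() ? List.of() : broker.poll();
    }

    public static void main(String[] args) {
        broker.add(List.of("r1", "r2"));

        List<String> records = poll(); // call A: rebalance + fetched batch
        // The buggy guard: records are only processed if the task is already
        // in the running set; otherwise the batch is dropped on the floor.
        if (!records.isEmpty() && !runningTasks.isEmpty()) {
            System.out.println("processed " + records.size());
        } else {
            System.out.println("dropped " + records.size()); // dropped 2
        }

        runningTasks.add("task-0"); // task becomes "running"...
        List<String> next = poll(); // ...but call B returns nothing
        System.out.println("next batch size = " + next.size()); // 0
    }
}
```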





[jira] [Resolved] (KAFKA-3083) a soft failure in controller may leave a topic partition in an inconsistent state

2017-10-18 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-3083.

   Resolution: Fixed
 Assignee: Onur Karaman  (was: Mayuresh Gharat)
Fix Version/s: 1.1.0

This is now fixed in KAFKA-5642.

> a soft failure in controller may leave a topic partition in an inconsistent 
> state
> -
>
> Key: KAFKA-3083
> URL: https://issues.apache.org/jira/browse/KAFKA-3083
> Project: Kafka
>  Issue Type: Bug
>  Components: core
>Affects Versions: 0.9.0.0
>Reporter: Jun Rao
>Assignee: Onur Karaman
>  Labels: reliability
> Fix For: 1.1.0
>
>
> The following sequence can happen.
> 1. Broker A is the controller and is in the middle of processing a broker 
> change event. As part of this process, let's say it's about to shrink the isr 
> of a partition.
> 2. Then broker A's session expires and broker B takes over as the new 
> controller. Broker B sends the initial leaderAndIsr request to all brokers.
> 3. Broker A continues by shrinking the isr of the partition in ZK and sends 
> the new leaderAndIsr request to the broker (say C) that leads the partition. 
> Broker C will reject this leaderAndIsr since the request comes from a 
> controller with an older epoch. Now we could be in a situation where Broker C 
> thinks the isr has all replicas, but the isr stored in ZK is different.
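The rejection in step 3 relies on controller-epoch fencing. A minimal sketch of that check (hypothetical names, in Java for brevity; the real broker logic lives in the Scala core module) looks like this:

```java
// Minimal sketch of controller-epoch fencing: a broker remembers the highest
// controller epoch it has seen and rejects requests from older epochs, which
// is why broker C rejects the stale leaderAndIsr from broker A in step 3.
public class EpochFencing {
    static int highestSeenControllerEpoch = 0;

    static boolean handleLeaderAndIsr(int controllerEpoch) {
        if (controllerEpoch < highestSeenControllerEpoch) {
            return false; // reject: request from a zombie controller
        }
        highestSeenControllerEpoch = controllerEpoch;
        return true;
    }

    public static void main(String[] args) {
        System.out.println(handleLeaderAndIsr(2)); // broker B's new epoch: true
        System.out.println(handleLeaderAndIsr(1)); // broker A's stale epoch: false
    }
}
```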





[jira] [Resolved] (KAFKA-3038) Speeding up partition reassignment after broker failure

2017-10-18 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3038?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-3038.

Resolution: Duplicate

This is now fixed in KAFKA-5642.

> Speeding up partition reassignment after broker failure
> ---
>
> Key: KAFKA-3038
> URL: https://issues.apache.org/jira/browse/KAFKA-3038
> Project: Kafka
>  Issue Type: Improvement
>  Components: controller, core
>Affects Versions: 0.9.0.0
>Reporter: Eno Thereska
>
> After a broker failure the controller does several writes to Zookeeper for 
> each partition on the failed broker. Writes are done one at a time, in a 
> closed loop, which is slow, especially over high-latency networks. Zookeeper has 
> support for batching operations (the "multi" API). It is expected that 
> substituting serial writes with batched ones should reduce failure handling 
> time by an order of magnitude.
> This is identified as an issue in 
> https://cwiki.apache.org/confluence/display/KAFKA/kafka+Detailed+Replication+Design+V3
>  (section End-to-end latency during a broker failure)
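As a back-of-envelope illustration of why the "multi" API helps (the counter and method names below are illustrative only, not the ZooKeeper client API): serial writes pay one round trip per partition, while a batched multi-op pays one round trip for the whole set.

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative round-trip accounting for serial vs. batched metadata writes.
public class BatchedWrites {
    static AtomicInteger roundTrips = new AtomicInteger();

    static void writeSerial(List<String> paths) {
        for (String p : paths) roundTrips.incrementAndGet(); // one RTT per write
    }

    static void writeBatched(List<String> paths) {
        roundTrips.incrementAndGet(); // one RTT for the whole "multi" op
    }

    public static void main(String[] args) {
        List<String> partitions = List.of("p0", "p1", "p2", "p3");
        writeSerial(partitions);
        System.out.println("serial round trips = " + roundTrips.getAndSet(0));  // 4
        writeBatched(partitions);
        System.out.println("batched round trips = " + roundTrips.get());        // 1
    }
}
```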





[jira] [Resolved] (KAFKA-4444) Aggregate requests sent from controller to broker during controlled shutdown

2017-10-18 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-.

Resolution: Duplicate

This is now fixed in KAFKA-5642.

> Aggregate requests sent from controller to broker during controlled shutdown
> 
>
> Key: KAFKA-
> URL: https://issues.apache.org/jira/browse/KAFKA-
> Project: Kafka
>  Issue Type: Bug
>Reporter: Dong Lin
>Assignee: Dong Lin
>






[GitHub] kafka pull request #4091: Test branch

2017-10-18 Thread dnangel026
Github user dnangel026 closed the pull request at:

https://github.com/apache/kafka/pull/4091


---


[GitHub] kafka pull request #4091: Test branch

2017-10-18 Thread dnangel026
GitHub user dnangel026 opened a pull request:

https://github.com/apache/kafka/pull/4091

Test branch

Test

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/Microsoft/kafka testBrach

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4091.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4091


commit 114ac44e8068a75cf3bd942f565c3660b918
Author: yezhu 
Date:   2017-10-18T21:46:42Z

Test branch




---


Re: [DISCUSS] KIP-210: Provide for custom error handling when Kafka Streams fails to produce

2017-10-18 Thread Ted Yu
Can you create JIRA that corresponds to the KIP ?

For the new config, how about naming it production.exception.processor.class
? This way it is clear that a class name should be specified.

Cheers

On Wed, Oct 18, 2017 at 2:40 PM, Matt Farmer  wrote:

> Hello everyone,
>
> This is the discussion thread for the KIP that I just filed here:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 210+-+Provide+for+custom+error+handling++when+Kafka+
> Streams+fails+to+produce
>
> Looking forward to getting some feedback from folks about this idea and
> working toward a solution we can contribute back. :)
>
> Cheers,
> Matt Farmer
>


Build failed in Jenkins: kafka-trunk-jdk7 #2900

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[ismael] MINOR: Remove dead code

--
[...truncated 379.38 KB...]

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotResetEpochHistoryTailIfUndefinedPassed STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotResetEpochHistoryTailIfUndefinedPassed PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfNoEpochRecorded STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnUnsupportedIfNoEpochRecorded PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliest STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliest PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPersistEpochsBetweenInstances STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPersistEpochsBetweenInstances PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotClearAnythingIfOffsetToFirstOffset STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotClearAnythingIfOffsetToFirstOffset PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotLetOffsetsGoBackwardsEvenIfEpochsProgress STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotLetOffsetsGoBackwardsEvenIfEpochsProgress PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldGetFirstOffsetOfSubsequentEpochWhenOffsetRequestedForPreviousEpoch STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldGetFirstOffsetOfSubsequentEpochWhenOffsetRequestedForPreviousEpoch PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest2 STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest2 PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearEarliestOnEmptyCache 
STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearEarliestOnEmptyCache 
PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPreserveResetOffsetOnClearEarliestIfOneExists STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldPreserveResetOffsetOnClearEarliestIfOneExists PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldUpdateOffsetBetweenEpochBoundariesOnClearEarliest PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnInvalidOffsetIfEpochIsRequestedWhichIsNotCurrentlyTracked STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldReturnInvalidOffsetIfEpochIsRequestedWhichIsNotCurrentlyTracked PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldFetchEndOffsetOfEmptyCache 
STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldFetchEndOffsetOfEmptyCache 
PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliestAndUpdateItsOffset STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldRetainLatestEpochOnClearAllEarliestAndUpdateItsOffset PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearAllEntries STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearAllEntries PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearLatestOnEmptyCache 
STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > shouldClearLatestOnEmptyCache 
PASSED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotResetEpochHistoryHeadIfUndefinedPassed STARTED

kafka.server.epoch.LeaderEpochFileCacheTest > 
shouldNotResetEpochHistoryHeadIfUndefinedPassed PASSED

kafka.server.epoch.LeaderEpochIntegrationTest > 
shouldIncreaseLeaderEpochBetweenLeaderRestarts STARTED

kafka.server.epoch.LeaderEpochIntegrationTest > 
shouldIncreaseLeaderEpochBetweenLeaderRestarts PASSED

kafka.server.epoch.LeaderEpochIntegrationTest > 
shouldAddCurrentLeaderEpochToMessagesAsTheyAreWrittenToLeader STARTED

kafka.server.epoch.LeaderEpochIntegrationTest > 
shouldAddCurrentLeaderEpochToMessagesAsTheyAreWrittenToLeader PASSED

kafka.server.epoch.LeaderEpochIntegrationTest > 
shouldSendLeaderEpochRequestAndGetAResponse STARTED

kafka.server.epoch.LeaderEpochIntegrationTest > 
shouldSendLeaderEpochRequestAndGetAResponse PASSED

kafka.server.epoch.OffsetsForLeaderEpochTest > shouldGetEpochsFromReplica 
STARTED

kafka.server.epoch.OffsetsForLeaderEpochTest > shouldGetEpochsFromReplica PASSED

kafka.server.epoch.OffsetsForLeaderEpochTest > 
shouldReturnUnknownTopicOrPartitionIfThrown STARTED

kafka.server.epoch.OffsetsForLeaderEpochTest > 
shouldReturnUnknownTopicOrPartitionIfThrown PASSED

kafka.server.epoch.OffsetsForLeaderEpochTest > 
shouldReturnNoLeaderForPartitionIfThrown STARTED

kafka.server.epoch.OffsetsForLeaderEpochTest > 
shouldReturnNoLeaderForPartitionIfThrown PASSED

kafka.server.epoch.EpochDrivenReplicationProtocolAcceptanceTest > 
shouldSurviveFastLeaderChange STARTED

[GitHub] kafka pull request #4090: [KAFKA-6084] Propagate JSON parsing errors in Reas...

2017-10-18 Thread viktorsomogyi
GitHub user viktorsomogyi opened a pull request:

https://github.com/apache/kafka/pull/4090

[KAFKA-6084] Propagate JSON parsing errors in ReassignPartitionsCommand



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/viktorsomogyi/kafka KAFKA-6084

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4090.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4090


commit 3f1a24014e022ad351ea669ba73efa645ccca5f3
Author: Viktor Somogyi 
Date:   2017-10-14T11:16:35Z

[KAFKA-6084] Propagate JSON parsing errors in ReassignPartitionsCommand




---


[DISCUSS] KIP-210: Provide for custom error handling when Kafka Streams fails to produce

2017-10-18 Thread Matt Farmer
Hello everyone,

This is the discussion thread for the KIP that I just filed here:
https://cwiki.apache.org/confluence/display/KAFKA/KIP-210+-+Provide+for+custom+error+handling++when+Kafka+Streams+fails+to+produce

Looking forward to getting some feedback from folks about this idea and
working toward a solution we can contribute back. :)

Cheers,
Matt Farmer


[jira] [Created] (KAFKA-6084) ReassignPartitionsCommand should propagate JSON parsing failures

2017-10-18 Thread Viktor Somogyi (JIRA)
Viktor Somogyi created KAFKA-6084:
-

 Summary: ReassignPartitionsCommand should propagate JSON parsing 
failures
 Key: KAFKA-6084
 URL: https://issues.apache.org/jira/browse/KAFKA-6084
 Project: Kafka
  Issue Type: Improvement
  Components: admin
Affects Versions: 0.11.0.0
Reporter: Viktor Somogyi
Assignee: Viktor Somogyi
Priority: Minor
 Attachments: Screen Shot 2017-10-18 at 23.31.22.png

Looking at Json.scala, it will always swallow any parsing errors:
{code}
  def parseFull(input: String): Option[JsonValue] =
try Option(mapper.readTree(input)).map(JsonValue(_))
catch { case _: JsonProcessingException => None }
{code}

However, while sometimes it is easy to figure out the problem by simply looking 
at the JSON, in some cases it is not trivial: certain invisible characters 
(like a byte order mark) won't be displayed by most text editors, and people can 
spend time figuring out what the problem is.

As Jackson provides a very detailed exception describing what failed and how, it 
is easy to propagate the failure to the user.

As an example I attached a BOM-prefixed JSON file which fails with the 
following counterintuitive error:
{noformat}
[root@localhost ~]# kafka-reassign-partitions --zookeeper localhost:2181 
--reassignment-json-file /root/increase-replication-factor.json --execute
Partitions reassignment failed due to Partition reassignment data file 
/root/increase-replication-factor.json is empty
kafka.common.AdminCommandFailedException: Partition reassignment data file 
/root/increase-replication-factor.json is empty
at 
kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:120)
at 
kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:52)
at kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
...
{noformat}

For the above error it would be much better to see exactly what fails:
{noformat}
kafka.common.AdminCommandFailedException: Admin command failed
at 
kafka.admin.ReassignPartitionsCommand$.parsePartitionReassignmentData(ReassignPartitionsCommand.scala:267)
at 
kafka.admin.ReassignPartitionsCommand$.parseAndValidate(ReassignPartitionsCommand.scala:275)
at 
kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:197)
at 
kafka.admin.ReassignPartitionsCommand$.executeAssignment(ReassignPartitionsCommand.scala:193)
at 
kafka.admin.ReassignPartitionsCommand$.main(ReassignPartitionsCommand.scala:64)
at 
kafka.admin.ReassignPartitionsCommand.main(ReassignPartitionsCommand.scala)
Caused by: com.fasterxml.jackson.core.JsonParseException: Unexpected character 
('' (code 65279 / 0xfeff)): expected a valid value (number, String, array, 
object, 'true', 'false' or 'null')
 at [Source: (String)"{"version":1,
  "partitions":[
   {"topic": "test1", "partition": 0, "replicas": [1,2]},
   {"topic": "test2", "partition": 1, "replicas": [2,3]}
]}"; line: 1, column: 2]
at 
com.fasterxml.jackson.core.JsonParser._constructError(JsonParser.java:1798)
at 
com.fasterxml.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:663)
at 
com.fasterxml.jackson.core.base.ParserMinimalBase._reportUnexpectedChar(ParserMinimalBase.java:561)
at 
com.fasterxml.jackson.core.json.ReaderBasedJsonParser._handleOddValue(ReaderBasedJsonParser.java:1892)
at 
com.fasterxml.jackson.core.json.ReaderBasedJsonParser.nextToken(ReaderBasedJsonParser.java:747)
at 
com.fasterxml.jackson.databind.ObjectMapper._readTreeAndClose(ObjectMapper.java:4030)
at 
com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2539)
at kafka.utils.Json$.kafka$utils$Json$$doParseFull(Json.scala:46)
at kafka.utils.Json$$anonfun$tryParseFull$1.apply(Json.scala:44)
at kafka.utils.Json$$anonfun$tryParseFull$1.apply(Json.scala:44)
at scala.util.Try$.apply(Try.scala:192)
at kafka.utils.Json$.tryParseFull(Json.scala:44)
at 
kafka.admin.ReassignPartitionsCommand$.parsePartitionReassignmentData(ReassignPartitionsCommand.scala:241)
... 5 more
{noformat}
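The proposed change boils down to not discarding the parser's exception. A minimal sketch of the two behaviors (in Java for brevity; `parse()` is a stand-in for Jackson's `mapper.readTree`, and all names here are hypothetical):

```java
import java.util.Optional;

// Sketch: mapping every parse failure to Optional.empty loses the cause,
// while letting the exception propagate surfaces the actionable detail.
public class PropagateParseError {
    // Stand-in for a real JSON parser; rejects BOM-prefixed input the way
    // Jackson does ("Unexpected character ... code 65279 / 0xfeff").
    static Object parse(String input) {
        if (input.startsWith("\uFEFF")) {
            throw new IllegalArgumentException(
                "Unexpected character (code 65279 / 0xfeff)");
        }
        return new Object();
    }

    // Current behavior: the error is swallowed.
    static Optional<Object> parseFull(String input) {
        try { return Optional.of(parse(input)); }
        catch (RuntimeException e) { return Optional.empty(); }
    }

    // Proposed behavior: let the detailed exception reach the caller.
    static Object parseFullOrThrow(String input) {
        return parse(input);
    }

    public static void main(String[] args) {
        String bomJson = "\uFEFF{\"version\":1}";
        System.out.println(parseFull(bomJson).isPresent()); // false: cause lost
        try {
            parseFullOrThrow(bomJson);
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // the actionable detail survives
        }
    }
}
```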





[GitHub] kafka pull request #4087: MINOR: Remove dead code

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4087


---


[GitHub] kafka-site pull request #100: MINOR: Pinterest link correction

2017-10-18 Thread manjuapu
Github user manjuapu commented on a diff in the pull request:

https://github.com/apache/kafka-site/pull/100#discussion_r145538033
  
--- Diff: 0110/streams/index.html ---
@@ -111,23 +93,41 @@ Streams API use cases


  
-   https://www.nytimes.com; target="_blank" 
class="grid__logo__link">
+   https://medium.com/@Pinterest_Engineering/using-kafka-streams-api-for-predictive-budgeting-9f58d206c996;
 target="_blank" class="grid__logo__link">
--- End diff --

@guozhangwang I have link for all logos as well as link in the content.


---


Re: [kafka-clients] [VOTE] 1.0.0 RC2

2017-10-18 Thread Guozhang Wang
Thanks for pointing out, Jun, Ismael.

Will update the statement.


Guozhang

On Wed, Oct 18, 2017 at 9:51 AM, Ismael Juma  wrote:

> Also, only part 1 of KIP-113 landed. The release planning page has the
> correct info for what it's worth.
>
> Ismael
>
> On 18 Oct 2017 5:42 pm, "Jun Rao"  wrote:
>
>> Hi, Guozhang,
>>
>> Thanks for running the release. Just a quick clarification. The statement
>> that "* Controller improvements: async ZK access for faster administrative
>> request handling" is not accurate. What's included in 1.0.0 is a logging
>> change improvement in the controller, which does give significant perf
>> benefit. However, the async ZK changes are in trunk and will be in 1.1.0.
>>
>> Jun
>>
>> On Tue, Oct 17, 2017 at 9:47 AM, Guozhang Wang 
>> wrote:
>>
>> > Hello Kafka users, developers and client-developers,
>> >
>> > This is the third candidate for release of Apache Kafka 1.0.0. The main
>> > PRs that gets merged in after RC1 are the following:
>> >
>> > https://github.com/apache/kafka/commit/dc6bfa553e73ffccd1e604963e076c
>> > 78d8ddcd69
>> >
>> > It's worth noting that starting in this version we are using a different
>> > version protocol with three digits: *major.minor.bug-fix*
>> >
>> > Any and all testing is welcome, but the following areas are worth
>> > highlighting:
>> >
>> > 1. Client developers should verify that their clients can
>> produce/consume
>> > to/from 1.0.0 brokers (ideally with compressed and uncompressed data).
>> > 2. Performance and stress testing. Heroku and LinkedIn have helped with
>> > this in the past (and issues have been found and fixed).
>> > 3. End users can verify that their apps work correctly with the new
>> > release.
>> >
>> > This is a major version release of Apache Kafka. It includes 29 new
>> KIPs.
>> > See the release notes and release plan (*https://cwiki.apache.org/con
>> fluence/pages/viewpage.action?pageId=71764913
>> > > pageId=71764913>*)
>> > for more details. A few feature highlights:
>> >
>> > * Java 9 support with significantly faster TLS and CRC32C
>> implementations
>> > * JBOD improvements: disk failure only disables failed disk but not the
>> > broker (KIP-112/KIP-113)
>> > * Controller improvements: async ZK access for faster administrative
>> > request handling
>> > * Newly added metrics across all the modules (KIP-164, KIP-168, KIP-187,
>> > KIP-188, KIP-196)
>> > * Kafka Streams API improvements (KIP-120 / 130 / 138 / 150 / 160 /
>> 161),
>> > and drop compatibility "Evolving" annotations
>> >
>> > Release notes for the 1.0.0 release:
>> > *http://home.apache.org/~guozhang/kafka-1.0.0-rc2/RELEASE_NOTES.html
>> > *
>> >
>> >
>> >
>> > *** Please download, test and vote by Friday, October 20, 8pm PT
>> >
>> > Kafka's KEYS file containing PGP keys we use to sign the release:
>> > http://kafka.apache.org/KEYS
>> >
>> > * Release artifacts to be voted upon (source and binary):
>> > *http://home.apache.org/~guozhang/kafka-1.0.0-rc2/
>> > *
>> >
>> > * Maven artifacts to be voted upon:
>> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
>> >
>> > * Javadoc:
>> > *http://home.apache.org/~guozhang/kafka-1.0.0-rc2/javadoc/
>> > *
>> >
>> > * Tag to be voted upon (off 1.0 branch) is the 1.0.0-rc2 tag:
>> >
>> > https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
>> > 51d5f12e190a38547839c7d2710c97faaeaca586
>> >
>> > * Documentation:
>> > Note the documentation can't be pushed live due to changes that will
>> not go
>> > live until the release. You can manually verify by downloading
>> > http://home.apache.org/~guozhang/kafka-1.0.0-rc2/
>> > kafka_2.11-1.0.0-site-docs.tgz
>> >
>> > * Successful Jenkins builds for the 1.0.0 branch:
>> > Unit/integration tests: https://builds.apache.org/job/
>> kafka-1.0-jdk7/40/
>> > System test: https://jenkins.confluent.io/job/system-test-kafka-1.0/6/
>> >
>> >
>> > /**
>> >
>> >
>> > Thanks,
>> > -- Guozhang
>> >
>> > --
>> > You received this message because you are subscribed to the Google
>> Groups
>> > "kafka-clients" group.
>> > To unsubscribe from this group and stop receiving emails from it, send
>> an
>> > email to kafka-clients+unsubscr...@googlegroups.com.
>> > To post to this group, send email to kafka-clie...@googlegroups.com.
>> > Visit this group at https://groups.google.com/group/kafka-clients.
>> > To view this discussion on the web visit https://groups.google.com/d/
>> > msgid/kafka-clients/CAHwHRrXD0nLUqFV0HV_Mtz5eY%
>> > 2B2RhXCSk_xX1FPkHV%3D0s6u7pQ%40mail.gmail.com
>> > > LUqFV0HV_Mtz5eY%2B2RhXCSk_xX1FPkHV%3D0s6u7pQ%40mail.
>> 

[GitHub] kafka-site pull request #100: MINOR: Pinterest link correction

2017-10-18 Thread guozhangwang
Github user guozhangwang commented on a diff in the pull request:

https://github.com/apache/kafka-site/pull/100#discussion_r145530925
  
--- Diff: 0110/streams/index.html ---
@@ -82,26 +82,8 @@ The easiest way to write 
mission-critical real-time ap
 Streams API use cases
  

- 
-   https://linecorp.com/; target="_blank" 
class="grid__logo__link">
- 
-   
- https://engineering.linecorp.com/en/blog/detail/80; 
target="_blank">LINE uses Apache Kafka as a central datahub for our 
services to communicate to one another. Hundreds of billions of messages are 
produced daily and are used to execute various business logic, threat 
detection, search indexing and data analysis. LINE leverages Kafka Streams to 
reliably transform and filter topics enabling sub topics consumers can 
efficiently consume, meanwhile retaining easy maintainability thanks to its 
sophisticated yet minimal code base.
- 
-   
-   
- 
-   http://www.zalando.com; target="_blank" 
class="grid__logo__link">
- 
-   
-   As the leading online fashion retailer in Europe, Zalando uses 
Kafka as an ESB (Enterprise Service Bus), which helps us in transitioning from 
a monolithic to a micro services architecture. Using Kafka for processing
- https://kafka-summit.org/sessions/using-kstreams-ktables-calculate-real-time-domain-rankings/;
 target='blank'> event streams enables our technical team to do near-real 
time business intelligence.
-   
- 
-   
-   
  
-   https://www.nytimes.com; target="_blank" 
class="grid__logo__link">
+   https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/; 
target="_blank" class="grid__logo__link">
--- End diff --

Maybe better to use this URL instead, as it is from the original NYT website?


https://open.nytimes.com/publishing-with-apache-kafka-at-the-new-york-times-7f0e3b7d2077


---


[GitHub] kafka-site pull request #100: MINOR: Pinterest link correction

2017-10-18 Thread guozhangwang
Github user guozhangwang commented on a diff in the pull request:

https://github.com/apache/kafka-site/pull/100#discussion_r145531542
  
--- Diff: 0110/streams/index.html ---
@@ -111,23 +93,41 @@ Streams API use cases


  
-   https://www.nytimes.com; target="_blank" 
class="grid__logo__link">
+   https://medium.com/@Pinterest_Engineering/using-kafka-streams-api-for-predictive-budgeting-9f58d206c996;
 target="_blank" class="grid__logo__link">
--- End diff --

Hmm.. why do we need two ref links for Pinterest (one for its logo and one for the 
content) while for the others we use a single ref link for the logo only?


---


[jira] [Created] (KAFKA-6083) The Fetcher should add the InvalidRecordException as a cause to the KafkaException when invalid record is found.

2017-10-18 Thread Jiangjie Qin (JIRA)
Jiangjie Qin created KAFKA-6083:
---

 Summary: The Fetcher should add the InvalidRecordException as a 
cause to the KafkaException when invalid record is found.
 Key: KAFKA-6083
 URL: https://issues.apache.org/jira/browse/KAFKA-6083
 Project: Kafka
  Issue Type: Improvement
  Components: clients, consumer
Affects Versions: 1.0.0
Reporter: Jiangjie Qin
 Fix For: 1.0.1


In the Fetcher, when an InvalidRecordException is thrown, we convert it to a 
KafkaException. We should also attach the original InvalidRecordException to it 
as the cause.
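The fix amounts to ordinary exception chaining. A self-contained sketch of the idea (the two exception classes below are stand-ins that only mirror the Kafka names, and `convert` is an invented helper, not the actual Fetcher code):

```java
// Stand-in classes for illustration; not the real Kafka exception types.
public class ChainCauseSketch {
    static class InvalidRecordException extends RuntimeException {
        InvalidRecordException(String m) { super(m); }
    }
    static class KafkaException extends RuntimeException {
        KafkaException(String m, Throwable cause) { super(m, cause); }
    }

    static RuntimeException convert(InvalidRecordException e) {
        // Passing `e` as the cause preserves the original stack trace,
        // so callers can see why the record was rejected.
        return new KafkaException("Error processing record: " + e.getMessage(), e);
    }

    public static void main(String[] args) {
        RuntimeException ke = convert(new InvalidRecordException("bad CRC"));
        System.out.println(ke.getCause().getMessage()); // prints: bad CRC
    }
}
```

Without the second constructor argument, `getCause()` returns null and the root cause is lost to the caller, which is exactly the gap this issue describes.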



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Build failed in Jenkins: kafka-trunk-jdk7 #2899

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[junrao] KAFKA-6051; Close the ReplicaFetcherBlockingSend earlier on shutdown

--
[...truncated 1.83 MB...]

org.apache.kafka.streams.KafkaStreamsTest > testStateGlobalThreadClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingUncaughtExceptionHandlerNotInCreateState STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingUncaughtExceptionHandlerNotInCreateState PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCleanup PASSED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingStateListenerNotInCreateState STARTED

org.apache.kafka.streams.KafkaStreamsTest > 
shouldThrowExceptionSettingStateListenerNotInCreateState PASSED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics STARTED

org.apache.kafka.streams.KafkaStreamsTest > testNumberDefaultMetrics PASSED

org.apache.kafka.streams.KafkaStreamsTest > shouldReturnThreadMetadata STARTED

org.apache.kafka.streams.KafkaStreamsTest > shouldReturnThreadMetadata PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCloseIsIdempotent PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning 
STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotCleanupWhileRunning PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateThreadClose STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateThreadClose PASSED

org.apache.kafka.streams.KafkaStreamsTest > testStateChanges STARTED

org.apache.kafka.streams.KafkaStreamsTest > testStateChanges PASSED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice STARTED

org.apache.kafka.streams.KafkaStreamsTest > testCannotStartTwice PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfProducerEnableIdempotenceIsOverriddenIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfProducerEnableIdempotenceIsOverriddenIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldUseNewConfigsWhenPresent 
STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldUseNewConfigsWhenPresent 
PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce STARTED

org.apache.kafka.streams.StreamsConfigTest > shouldAcceptAtLeastOnce PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldUseCorrectDefaultsWhenNoneSpecified STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldUseCorrectDefaultsWhenNoneSpecified PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerEnableIdempotenceIfEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
STARTED

org.apache.kafka.streams.StreamsConfigTest > defaultSerdeShouldBeConfigured 
PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSetDifferentDefaultsIfEosEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldNotOverrideUserConfigRetriesIfExactlyOnceEnabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldSupportPrefixedConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldOverrideStreamsDefaultConsumerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs STARTED

org.apache.kafka.streams.StreamsConfigTest > testGetProducerConfigs PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerMaxInFlightRequestPerConnectionsWhenEosDisabled 
STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldAllowSettingProducerMaxInFlightRequestPerConnectionsWhenEosDisabled PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowStreamsExceptionIfValueSerdeConfigFails STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldThrowStreamsExceptionIfValueSerdeConfigFails PASSED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerAutoCommitIsOverridden STARTED

org.apache.kafka.streams.StreamsConfigTest > 
shouldResetToDefaultIfConsumerAutoCommitIsOverridden PASSED

org.apache.kafka.streams.StreamsConfigTest > 

[jira] [Created] (KAFKA-6082) consider fencing zookeeper updates with controller epoch zkVersion

2017-10-18 Thread Onur Karaman (JIRA)
Onur Karaman created KAFKA-6082:
---

 Summary: consider fencing zookeeper updates with controller epoch 
zkVersion
 Key: KAFKA-6082
 URL: https://issues.apache.org/jira/browse/KAFKA-6082
 Project: Kafka
  Issue Type: Sub-task
Reporter: Onur Karaman


If we want, we can use multi-op to fence zookeeper updates with the controller 
epoch's zkVersion.
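The idea can be illustrated without a live ZooKeeper ensemble: a multi-op bundles a version check on the controller-epoch znode with the actual write, so a write from a deposed controller (holding a stale zkVersion) fails atomically. Below is a toy in-memory model of that check-then-write; all names and paths are assumptions for illustration, not the real ZooKeeper API:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of ZooKeeper's multi-op fencing: the write succeeds only if the
// controller-epoch znode still has the zkVersion the caller last observed.
public class MultiOpFenceSketch {
    static class Znode { byte[] data; int version; }
    final Map<String, Znode> store = new HashMap<>();

    // Emulates multi(check(epochPath, expectedVersion), setData(path, data)).
    synchronized boolean fencedWrite(String epochPath, int expectedEpochVersion,
                                     String path, byte[] data) {
        Znode epoch = store.get(epochPath);
        if (epoch == null || epoch.version != expectedEpochVersion)
            return false;            // stale controller: whole batch is rejected
        Znode target = store.computeIfAbsent(path, p -> new Znode());
        target.data = data;
        target.version++;
        return true;
    }

    synchronized void bumpEpoch(String epochPath) {  // a new controller took over
        store.computeIfAbsent(epochPath, p -> new Znode()).version++;
    }

    public static void main(String[] args) {
        MultiOpFenceSketch zk = new MultiOpFenceSketch();
        zk.bumpEpoch("/controller_epoch");                      // zkVersion -> 1
        System.out.println(zk.fencedWrite("/controller_epoch", 1, "/isr", new byte[]{1})); // true
        zk.bumpEpoch("/controller_epoch");                      // new controller, zkVersion -> 2
        System.out.println(zk.fencedWrite("/controller_epoch", 1, "/isr", new byte[]{2})); // false
    }
}
```

In real ZooKeeper the same shape would use `ZooKeeper.multi` with an `Op.check` on the epoch path followed by the `Op.setData`; the point is that the check and the write commit or abort together.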



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6081) response error code checking

2017-10-18 Thread Onur Karaman (JIRA)
Onur Karaman created KAFKA-6081:
---

 Summary: response error code checking
 Key: KAFKA-6081
 URL: https://issues.apache.org/jira/browse/KAFKA-6081
 Project: Kafka
  Issue Type: Sub-task
Reporter: Onur Karaman


In most cases in the controller, we assume that requests succeed. We should 
instead check their responses for errors.

Example: partition reassignment has the following todo:
{code}
// TODO: Eventually partition reassignment could use a callback that does 
retries if deletion failed
{code}
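A hedged sketch of what per-response error checking with retries could look like. Every name here is invented for illustration; it is not the controller's actual request machinery:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

// Illustration only: retry an operation until its response reports no error,
// instead of assuming the request succeeded.
public class ResponseCheckSketch {
    static class Response {
        final short errorCode;   // 0 == NONE, following the Kafka convention
        Response(short errorCode) { this.errorCode = errorCode; }
    }

    static Response sendWithRetries(Supplier<Response> send, int maxRetries) {
        Response last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            last = send.get();
            if (last.errorCode == 0) return last;   // success: stop retrying
        }
        throw new IllegalStateException(
                "gave up after " + maxRetries + " retries, error=" + last.errorCode);
    }

    public static void main(String[] args) {
        // Simulated broker that fails twice, then succeeds.
        List<Short> script = new ArrayList<>(List.of((short) 7, (short) 7, (short) 0));
        Response r = sendWithRetries(() -> new Response(script.remove(0)), 5);
        System.out.println(r.errorCode); // 0
    }
}
```

The partition-reassignment TODO above is one caller that would plug into something like this: on a deletion error code, retry instead of silently moving on.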



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : kafka-trunk-jdk8 #2145

2017-10-18 Thread Apache Jenkins Server
See 




[jira] [Resolved] (KAFKA-5083) always leave the last surviving member of the ISR in ZK

2017-10-18 Thread Onur Karaman (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Onur Karaman resolved KAFKA-5083.
-
Resolution: Fixed

This has been fixed in KAFKA-5642.

> always leave the last surviving member of the ISR in ZK
> ---
>
> Key: KAFKA-5083
> URL: https://issues.apache.org/jira/browse/KAFKA-5083
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
>
> Currently we erase ISR membership if the replica to be removed from the ISR 
> is the last surviving member of the ISR and unclean leader election is 
> enabled for the corresponding topic.
> We should investigate leaving the last replica in ISR in ZK, independent of 
> whether unclean leader election is enabled or not. That way, if people 
> re-disabled unclean leader election, we can still try to elect the leader 
> from the last in-sync replica.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka pull request #4089: KAFKA-6071: Use ZookeeperClient in LogManager

2017-10-18 Thread omkreddy
GitHub user omkreddy opened a pull request:

https://github.com/apache/kafka/pull/4089

KAFKA-6071: Use ZookeeperClient in LogManager



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/omkreddy/kafka KAFKA-6071-ZK-LOGMANAGER

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4089.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4089


commit 0f19b38f702094f3543a490f22104a6f275c38a7
Author: Manikumar Reddy 
Date:   2017-10-18T18:24:37Z

KAFKA-6071: Use ZookeeperClient in LogManager




---


[jira] [Created] (KAFKA-6080) Transactional EoS for source connectors

2017-10-18 Thread Antony Stubbs (JIRA)
Antony Stubbs created KAFKA-6080:


 Summary: Transactional EoS for source connectors
 Key: KAFKA-6080
 URL: https://issues.apache.org/jira/browse/KAFKA-6080
 Project: Kafka
  Issue Type: New Feature
  Components: KafkaConnect
Reporter: Antony Stubbs


Exactly once (eos) message production for source connectors.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (KAFKA-6079) Idempotent production for source connectors

2017-10-18 Thread Antony Stubbs (JIRA)
Antony Stubbs created KAFKA-6079:


 Summary: Idempotent production for source connectors
 Key: KAFKA-6079
 URL: https://issues.apache.org/jira/browse/KAFKA-6079
 Project: Kafka
  Issue Type: New Feature
  Components: KafkaConnect
Reporter: Antony Stubbs


Idempotent production for source connectors, to reduce duplicates caused at 
least by retries.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Jenkins build is back to normal : kafka-trunk-jdk9 #133

2017-10-18 Thread Apache Jenkins Server
See 




Build failed in Jenkins: kafka-trunk-jdk7 #2898

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[junrao] KAFKA-5642; Use async ZookeeperClient in Controller

--
[...truncated 1.83 MB...]


[GitHub] kafka pull request #4088: MINOR: Controller and async ZookeeperClient improv...

2017-10-18 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/4088

MINOR: Controller and async ZookeeperClient improvements (WIP)

More to come.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka async-zkclient-cleanups

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4088.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4088


commit e538ba1acbcfc97e752f596e746fea2f2d4c8b38
Author: Ismael Juma 
Date:   2017-10-18T17:27:46Z

WIP




---


[GitHub] kafka pull request #4056: KAFKA-6051 Close the ReplicaFetcherBlockingSend ea...

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4056


---


Re: [kafka-clients] [VOTE] 1.0.0 RC2

2017-10-18 Thread Ismael Juma
Also, only part 1 of KIP-113 landed. The release planning page has the
correct info for what it's worth.

Ismael

On 18 Oct 2017 5:42 pm, "Jun Rao"  wrote:

> Hi, Guozhang,
>
> Thanks for running the release. Just a quick clarification. The statement
> that "* Controller improvements: async ZK access for faster administrative
> request handling" is not accurate. What's included in 1.0.0 is a logging
> change improvement in the controller, which does give significant perf
> benefit. However, the async ZK changes are in trunk and will be in 1.1.0.
>
> Jun
>
> On Tue, Oct 17, 2017 at 9:47 AM, Guozhang Wang  wrote:
>
> > Hello Kafka users, developers and client-developers,
> >
> > This is the third candidate for release of Apache Kafka 1.0.0. The main
> > PRs that got merged in after RC1 are the following:
> >
> > https://github.com/apache/kafka/commit/dc6bfa553e73ffccd1e604963e076c
> > 78d8ddcd69
> >
> > It's worth noting that starting in this version we are using a different
> > version protocol with three digits: *major.minor.bug-fix*
> >
> > Any and all testing is welcome, but the following areas are worth
> > highlighting:
> >
> > 1. Client developers should verify that their clients can produce/consume
> > to/from 1.0.0 brokers (ideally with compressed and uncompressed data).
> > 2. Performance and stress testing. Heroku and LinkedIn have helped with
> > this in the past (and issues have been found and fixed).
> > 3. End users can verify that their apps work correctly with the new
> > release.
> >
> > This is a major version release of Apache Kafka. It includes 29 new KIPs.
> > See the release notes and release plan (*https://cwiki.apache.org/
> confluence/pages/viewpage.action?pageId=71764913
> >  action?pageId=71764913>*)
> > for more details. A few feature highlights:
> >
> > * Java 9 support with significantly faster TLS and CRC32C implementations
> > * JBOD improvements: disk failure only disables failed disk but not the
> > broker (KIP-112/KIP-113)
> > * Controller improvements: async ZK access for faster administrative
> > request handling
> > * Newly added metrics across all the modules (KIP-164, KIP-168, KIP-187,
> > KIP-188, KIP-196)
> > * Kafka Streams API improvements (KIP-120 / 130 / 138 / 150 / 160 / 161),
> > and drop compatibility "Evolving" annotations
> >
> > Release notes for the 1.0.0 release:
> > *http://home.apache.org/~guozhang/kafka-1.0.0-rc2/RELEASE_NOTES.html
> > *
> >
> >
> >
> > *** Please download, test and vote by Friday, October 20, 8pm PT
> >
> > Kafka's KEYS file containing PGP keys we use to sign the release:
> > http://kafka.apache.org/KEYS
> >
> > * Release artifacts to be voted upon (source and binary):
> > *http://home.apache.org/~guozhang/kafka-1.0.0-rc2/
> > *
> >
> > * Maven artifacts to be voted upon:
> > https://repository.apache.org/content/groups/staging/org/apache/kafka/
> >
> > * Javadoc:
> > *http://home.apache.org/~guozhang/kafka-1.0.0-rc2/javadoc/
> > *
> >
> > * Tag to be voted upon (off 1.0 branch) is the 1.0.0-rc2 tag:
> >
> > https://git-wip-us.apache.org/repos/asf?p=kafka.git;a=tag;h=
> > 51d5f12e190a38547839c7d2710c97faaeaca586
> >
> > * Documentation:
> > Note the documentation can't be pushed live due to changes that will not
> go
> > live until the release. You can manually verify by downloading
> > http://home.apache.org/~guozhang/kafka-1.0.0-rc2/
> > kafka_2.11-1.0.0-site-docs.tgz
> >
> > * Successful Jenkins builds for the 1.0.0 branch:
> > Unit/integration tests: https://builds.apache.org/job/kafka-1.0-jdk7/40/
> > System test: https://jenkins.confluent.io/job/system-test-kafka-1.0/6/
> >
> >
> > /**
> >
> >
> > Thanks,
> > -- Guozhang
> >
>


Re: [kafka-clients] [VOTE] 1.0.0 RC2

2017-10-18 Thread Jun Rao
Hi, Guozhang,

Thanks for running the release. Just a quick clarification. The statement
that "* Controller improvements: async ZK access for faster administrative
request handling" is not accurate. What's included in 1.0.0 is a logging
change improvement in the controller, which does give significant perf
benefit. However, the async ZK changes are in trunk and will be in 1.1.0.

Jun



Unit test failure in ClientUtilsTest - corporate firewall - InetSocketAddress.isUnresolved

2017-10-18 Thread michael.guyver
Hi everyone, 

I've tried to build kafka-0.11.0.1 and see a unit-test error with the following 
stack trace:

org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls 
given in bootstrap.servers
  at 
org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:64)
  at org.apache.kafka.clients.ClientUtilsTest.check(ClientUtilsTest.java:53)
  at 
org.apache.kafka.clients.ClientUtilsTest.testParseAndValidateAddresses(ClientUtilsTest.java:32)

This appears to result from the offered address remaining unresolved (in short):

  new InetSocketAddress(host, port).isUnresolved()

I'm building this on a corporate intranet with restrictive firewall settings 
(not that they would obviously impact DNS queries, but it's a possibility).
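The behaviour is easy to reproduce in isolation. Note that `InetSocketAddress` does not throw on DNS failure; it simply marks the address unresolved, which is what trips the bootstrap-server validation. The hostname below is just an example (`.invalid` is reserved and never resolves); any name your resolver cannot look up behaves the same:

```java
import java.net.InetSocketAddress;

// Demonstrates how a failed DNS lookup surfaces as isUnresolved() == true
// rather than as an exception.
public class UnresolvedCheck {
    public static void main(String[] args) {
        // ".invalid" is reserved (RFC 2606) and is guaranteed not to resolve.
        InetSocketAddress bad = new InetSocketAddress("broker.invalid", 9092);
        System.out.println(bad.isUnresolved()); // true

        InetSocketAddress ok = new InetSocketAddress("localhost", 9092);
        System.out.println(ok.isUnresolved()); // false
    }
}
```

So behind a firewall or restrictive resolver, hostnames used by `ClientUtilsTest` can fail to resolve, and `parseAndValidateAddresses` then rejects them with the "No resolvable bootstrap urls" ConfigException seen above.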

Regards

Michael 


[GitHub] kafka pull request #3765: KAFKA-5642: Use async ZookeeperClient in Controlle...

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/3765


---


Re: [VOTE] KIP-207:The Offsets which ListOffsetsResponse returns should monotonically increase even during a partition leader change

2017-10-18 Thread Colin McCabe
On Wed, Oct 18, 2017, at 04:09, Ismael Juma wrote:
> Thanks for the KIP, +1 (binding). A few comments:
> 
> 1. I agree with Jun about LEADER_NOT_AVAILABLE for the error code for
> older
> versions.
> 2. OffsetNotAvailableException seems clear enough (i.e. we don't need the
> "ForPartition" part)

Yeah, that is shorter and probably clearer.  Changed.

> 3. The KIP seems to be missing the compatibility section.

Added.

> 4. It would be good to mention that it's now possible for a fetch to
> succeed while list offsets will not for a period of time. And for older
> versions, the latter will return LeaderNotAvailable while the former
> would
> work fine, which is a bit unexpected. Not much we can do about it, but
> worth mentioning it in my opinion.

Fair enough

cheers,
Colin

> 
> Ismael
> 
> On Tue, Oct 17, 2017 at 9:26 PM, Jun Rao  wrote:
> 
> > Hi, Colin,
> >
> > Thanks for the KIP. +1. Just a minor comment. For the old client requests,
> > would it be better to return a LEADER_NOT_AVAILABLE error instead?
> >
> > Jun
> >
> > On Tue, Oct 17, 2017 at 11:11 AM, Colin McCabe  wrote:
> >
> > > Hi all,
> > >
> > > I'd like to start the voting process for KIP-207: The Offsets which
> > > ListOffsetsResponse returns should monotonically increase even during a
> > > partition leader change.
> > >
> > > See
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 207%3A+Offsets+returned+by+ListOffsetsResponse+should+be+
> > > monotonically+increasing+even+during+a+partition+leader+change
> > > for details.
> > >
> > > The voting process will run for at least 72 hours.
> > >
> > > regards,
> > > Colin
> > >
> >
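The transient window described in point 4 above can be handled client-side with a bounded retry. A hedged sketch (the exception class here is a local stand-in for the one proposed in the KIP, and the retry policy is illustrative):

```java
import java.util.function.Supplier;

public class ListOffsetsRetry {
    // Local stand-in for the new exception discussed in this thread.
    static class OffsetNotAvailableException extends RuntimeException { }

    // Retry a listOffsets-style call that may transiently fail right after a
    // leader change; the attempt count and (omitted) backoff are illustrative.
    static long withRetries(Supplier<Long> listOffsets, int maxAttempts) {
        for (int attempt = 1; ; attempt++) {
            try {
                return listOffsets.get();
            } catch (OffsetNotAvailableException e) {
                if (attempt == maxAttempts) throw e;
                // in real code: back off before retrying
            }
        }
    }

    public static void main(String[] args) {
        // Simulated broker: fails twice while the new leader's high water mark
        // catches up, then answers.
        int[] calls = {0};
        long offset = withRetries(() -> {
            if (++calls[0] < 3) throw new OffsetNotAvailableException();
            return 42L;
        }, 5);
        System.out.println(offset);  // prints 42
    }
}
```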


[jira] [Resolved] (KAFKA-5642) Use async ZookeeperClient in Controller

2017-10-18 Thread Jun Rao (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jun Rao resolved KAFKA-5642.

   Resolution: Fixed
Fix Version/s: 1.1.0

Issue resolved by pull request 3765
[https://github.com/apache/kafka/pull/3765]

> Use async ZookeeperClient in Controller
> ---
>
> Key: KAFKA-5642
> URL: https://issues.apache.org/jira/browse/KAFKA-5642
> Project: Kafka
>  Issue Type: Sub-task
>Reporter: Onur Karaman
>Assignee: Onur Karaman
> Fix For: 1.1.0
>
>
> Synchronous zookeeper writes means that we wait an entire round trip before 
> doing the next write. These synchronous writes are happening at a 
> per-partition granularity in several places, so partition-heavy clusters 
> suffer from the controller doing many sequential round trips to zookeeper.
> * PartitionStateMachine.electLeaderForPartition updates leaderAndIsr in 
> zookeeper on transition to OnlinePartition. This gets triggered per-partition 
> sequentially with synchronous writes during controlled shutdown of the 
> shutting down broker's replicas for which it is the leader.
> * ReplicaStateMachine updates leaderAndIsr in zookeeper on transition to 
> OfflineReplica when calling KafkaController.removeReplicaFromIsr. This gets 
> triggered per-partition sequentially with synchronous writes for failed or 
> controlled shutdown brokers.
> KAFKA-5501 introduced an async ZookeeperClient that encourages pipelined 
> requests to zookeeper. We should replace ZkClient's usage with this client.
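The win from pipelining can be seen in miniature: instead of N sequential round trips, all N requests are put in flight at once and awaited together. A self-contained sketch, with CompletableFuture standing in for the real async ZookeeperClient:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;

public class PipelineSketch {
    // Stand-in for an async ZooKeeper write; the real client and its latency
    // characteristics are assumptions here.
    static CompletableFuture<Void> asyncWrite(String path) {
        return CompletableFuture.runAsync(() -> { /* network round trip */ });
    }

    public static void main(String[] args) {
        List<String> partitionPaths = List.of("/p0", "/p1", "/p2");
        // Pipelined: all requests in flight concurrently, then wait for all,
        // rather than paying one full round trip per partition.
        CompletableFuture<?>[] futures = partitionPaths.stream()
                .map(PipelineSketch::asyncWrite)
                .toArray(CompletableFuture[]::new);
        CompletableFuture.allOf(futures).join();
        System.out.println("all writes acknowledged");
    }
}
```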



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: working on KAFKA-4928

2017-10-18 Thread Ted Yu
There was a PR https://github.com/apache/kafka/pull/2889 which was closed.

It would be good for committers to chime in on the previous attempt.

On Wed, Oct 18, 2017 at 8:39 AM, Pavel Drankov  wrote:

> Hi,
>
> My name is Pavel and I'm very new to Kafka. I would like to work
> on KAFKA-4928, which is labeled for newbies.
>
> Can someone point me to what DumpLogSegments is and where I can start?
> Which cases should I use to test it?
>
> Best wishes,
> Pavel
>


Re: [VOTE] KIP-207: The Offsets which ListOffsetsResponse returns should monotonically increase even during a partition leader change

2017-10-18 Thread Colin McCabe
On Tue, Oct 17, 2017, at 13:26, Jun Rao wrote:
> Hi, Colin,
> 
> Thanks for the KIP. +1. Just a minor comment. For the old client
> requests,
> would it be better to return a LEADER_NOT_AVAILABLE error instead?

Good idea.  I changed it to LeaderNotAvailableException.

best,
Colin

> 
> Jun
> 
> On Tue, Oct 17, 2017 at 11:11 AM, Colin McCabe 
> wrote:
> 
> > Hi all,
> >
> > I'd like to start the voting process for KIP-207: The Offsets which
> > ListOffsetsResponse returns should monotonically increase even during a
> > partition leader change.
> >
> > See
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 207%3A+Offsets+returned+by+ListOffsetsResponse+should+be+
> > monotonically+increasing+even+during+a+partition+leader+change
> > for details.
> >
> > The voting process will run for at least 72 hours.
> >
> > regards,
> > Colin
> >


Re: [DISCUSS] KIP-204 : adding records deletion operation to the new Admin Client API

2017-10-18 Thread Colin McCabe
Having a class there makes sense to me.  It also helps clarify what the
Long represents (a record offset).

regards,
Colin


On Wed, Oct 18, 2017, at 06:19, Dong Lin wrote:
> Sure. This makes sense. I agree it is better to replace Long with a new
> class.
> 
> On Wed, Oct 18, 2017 at 6:16 AM, Ismael Juma  wrote:
> 
> > Hi Dong,
> >
> > Yes, I mean replacing the `Long` with a class in the map. The class would
> > have static factory methods for the various use cases. To use the
> > `createPartitions` example, there is a `NewPartitions.increaseTo` method.
> >
> > Not sure why you think it's too complicated. It provides better type
> > safety, it's more informative and makes it easier to evolve. Thankfully
> > Java has lambdas for a while now and mapping a collection from one type to
> > another is reasonably simple.
> >
> > Your suggestion doesn't work because both methods would have the same
> > "erased" signature. You can't have two overloaded methods that have the
> > same signature apart from generic parameters. Also, we'd end up with 2
> > methods in AdminClient.
> >
> > Ismael
> >
> >
> > On Wed, Oct 18, 2017 at 1:42 PM, Dong Lin  wrote:
> >
> > > Hey Ismael,
> > >
> > > To clarify, I think you are saying that we should replace
> > > "deleteRecords(Map<TopicPartition, Long> partitionsAndOffsets)" with
> > > "deleteRecords(Map<TopicPartition, DeleteRecordsParameter>
> > > partitionsAndOffsets)", where DeleteRecordsParameter should include a
> > > "Long value", and probably "Boolean isBeforeOrAfter" and "Boolean
> > > isOffsetOrTime" in the future.
> > >
> > > I get the point that we want to only include additional parameter
> > > in DeleteRecordsOptions. I just feel it is a bit overkill to have a new
> > > class DeleteRecordsParameter which will only support offset in the near
> > > future. This method signature seems a bit too complicated.
> > >
> > > How about we use deleteRecords(Map<TopicPartition, Long>
> > > partitionsAndOffsets) for now, and add deleteRecords(Map<TopicPartition,
> > > DeleteRecordsParameter> partitionsAndOffsets) when we need to support the
> > > core parameter in the future? By doing this we can make the user's
> > > experience better (i.e. not having to instantiate DeleteRecordsParameter)
> > > until it is necessary to be more complicated.
> > >
> > > Dong
> > >
> > > On Wed, Oct 18, 2017 at 4:46 AM, Ismael Juma  wrote:
> > >
> > > > Hi Dong,
> > > >
> > > > I think it makes sense to use the parameters to define the specifics of
> > > the
> > > > request. However, we would probably want to replace the `Long` with a
> > > class
> > > > (similar to `createPartitions`) instead of relying on
> > > > `DeleteRecordsOptions`. The latter is typically used for defining
> > > > additional options, not for defining the core behaviour.
> > > >
> > > > Ismael
> > > >
> > > > On Wed, Oct 18, 2017 at 12:27 AM, Dong Lin 
> > wrote:
> > > >
> > > > > Hey Colin,
> > > > >
> > > > > I have also thought about deleteRecordsBeforeOffset so that we can
> > keep
> > > > the
> > > > > name consistent with the existing API in the Scala AdminClient. But
> > > then
> > > > I
> > > > > think it may be better to be able to specify in DeleteRecordsOptions
> > > > > whether the deletion is before/after timestamp or offset. By doing
> > this
> > > > we
> > > > > have one API rather than four API in Java AdminClient going forward.
> > > What
> > > > > do you think?
> > > > >
> > > > > Thanks,
> > > > > Dong
> > > > >
> > > > > On Tue, Oct 17, 2017 at 11:35 AM, Colin McCabe 
> > > > wrote:
> > > > >
> > > > > > Hi Paolo,
> > > > > >
> > > > > > This is a nice improvement.
> > > > > >
> > > > > > I agree that the discussion of creating a DeleteTopicPolicy can
> > wait
> > > > > > until later.  Perhaps we can do it in a follow-on KIP.  However, we
> > > do
> > > > > > need to specify what ACL permissions are needed to invoke this API.
> > > > > > That should be in the JavaDoc comments as well.  Based on the
> > > previous
> > > > > > discussion, I am assuming that this means DELETE on the TOPIC
> > > resource?
> > > > > > Can you add this to the KIP?
> > > > > >
> > > > > > Right now you have the signature:
> > > > > > > DeleteRecordsResult deleteRecords(Map<TopicPartition, Long>
> > > > > > partitionsAndOffsets)
> > > > > >
> > > > > > Since this function is all about deleting records that come before
> > a
> > > > > > certain offset, how about calling it deleteRecordsBeforeOffset?
> > That
> > > > > > way, if we come up with another way of deleting records in the
> > future
> > > > > > (such as a timestamp or transaction-based way) it will not be
> > > > confusing.
> > > > > >
> > > > > > On Mon, Oct 16, 2017, at 20:50, Becket Qin wrote:
> > > > > > > Hi Paolo,
> > > > > > >
> > > > > > > Thanks for the KIP and sorry for being late on the thread. I am
> > > > > wondering
> > > > > > > what is the KafkaFuture 

[GitHub] kafka pull request #4087: MINOR: Remove dead code

2017-10-18 Thread ijuma
GitHub user ijuma opened a pull request:

https://github.com/apache/kafka/pull/4087

MINOR: Remove dead code



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ijuma/kafka remove-dead-code

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/4087.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #4087


commit 9acf274dab1f9df0818d9fce3819b7a0543188f0
Author: Ismael Juma 
Date:   2017-10-18T15:54:55Z

MINOR: Remove dead code




---


[GitHub] kafka-site pull request #100: MINOR: Pinterest link correction

2017-10-18 Thread manjuapu
GitHub user manjuapu opened a pull request:

https://github.com/apache/kafka-site/pull/100

MINOR: Pinterest link correction

@guozhangwang Please review

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/manjuapu/kafka-site asf-site

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/100.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #100


commit c8b319c4cc3e6018bc973209876003785c7c4406
Author: Manjula Kumar 
Date:   2017-10-18T15:52:31Z

MINOR: Pinterest link correction




---


working on KAFKA-4928

2017-10-18 Thread Pavel Drankov
Hi,

My name is Pavel and I'm very new to Kafka. I would like to work
on KAFKA-4928, which is labeled for newbies.

Can someone point me to what DumpLogSegments is and where I can start?
Which cases should I use to test it?

Best wishes,
Pavel
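For context, DumpLogSegments is a command-line tool in the kafka.tools package that prints the contents of on-disk log segment files. A typical invocation looks roughly like this (the log directory and segment name are illustrative):

```shell
# Dump a topic-partition's segment file, decoding record payloads
# (paths are illustrative)
bin/kafka-run-class.sh kafka.tools.DumpLogSegments \
  --files /tmp/kafka-logs/my-topic-0/00000000000000000000.log \
  --print-data-log
```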


Re: KIP Access

2017-10-18 Thread Damian Guy
Hi Matt,

You should have permission now.
Thanks,
Damian

On Wed, 18 Oct 2017 at 15:54 Matt Farmer  wrote:

> Hey everyone,
>
> I'm a software engineer at MailChimp working on our Data Systems team. I'm
> looking to file a KIP to improve the error handling hooks that Kafka
> Streams exposes when producing messages. We've got a use case internally
> that requires us to be able to ignore certain classes of errors (in this
> case RecordTooLargeException), log some details about the message, and
> carry on processing other messages. We developed some changes to allow this
> internally, and would like to kick off the process of contributing back a
> similar change upstream. (I kind of expect what we contribute back to have
> a bit of a different shape than what we built internally.)
>
> It looks like filing a KIP is the correct next step here, but it looks like
> I need some additional permissions on Confluence. What's the process for
> getting those permissions? My Confluence username is farmdawgnation, if
> that helps.
>
> Thanks,
> Matt
>


KIP Access

2017-10-18 Thread Matt Farmer
Hey everyone,

I'm a software engineer at MailChimp working on our Data Systems team. I'm
looking to file a KIP to improve the error handling hooks that Kafka
Streams exposes when producing messages. We've got a use case internally
that requires us to be able to ignore certain classes of errors (in this
case RecordTooLargeException), log some details about the message, and
carry on processing other messages. We developed some changes to allow this
internally, and would like to kick off the process of contributing back a
similar change upstream. (I kind of expect what we contribute back to have
a bit of a different shape than what we built internally.)

It looks like filing a KIP is the correct next step here, but it looks like
I need some additional permissions on Confluence. What's the process for
getting those permissions? My Confluence username is farmdawgnation, if
that helps.

Thanks,
Matt
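A rough sketch of the kind of hook described above — skip oversized records, log details, keep processing. All names are hypothetical; this was not an existing Streams API at the time of the message, and a local exception class stands in for Kafka's RecordTooLargeException so the snippet is self-contained:

```java
public class SkipLargeRecords {
    // Stand-in for org.apache.kafka.common.errors.RecordTooLargeException,
    // so the sketch runs without the Kafka client on the classpath.
    static class RecordTooLargeException extends RuntimeException { }

    // Hypothetical produce-error hook; interface and enum names are
    // illustrative, not a real Streams API.
    interface ProduceExceptionHandler {
        enum Response { CONTINUE, FAIL }
        Response handle(String topic, Exception exception);
    }

    // Ignore too-large records, fail on everything else.
    static final ProduceExceptionHandler SKIP_TOO_LARGE = (topic, e) -> {
        if (e instanceof RecordTooLargeException) {
            // log some details, then carry on processing other records
            System.err.println("skipping oversized record for topic " + topic);
            return ProduceExceptionHandler.Response.CONTINUE;
        }
        return ProduceExceptionHandler.Response.FAIL;
    };

    public static void main(String[] args) {
        System.out.println(SKIP_TOO_LARGE.handle("events", new RecordTooLargeException()));
        System.out.println(SKIP_TOO_LARGE.handle("events", new RuntimeException("other")));
    }
}
```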


Re: [VOTE] KIP-203: Add toLowerCase support to sasl.kerberos.principal.to.local rule

2017-10-18 Thread Rajini Sivaram
+1 (binding)

On Mon, Oct 9, 2017 at 5:32 PM, Manikumar  wrote:

> I'm bumping this up to get some attention :)
>
> On Wed, Sep 27, 2017 at 8:46 PM, Tom Bentley 
> wrote:
>
> > +1 (nonbinding)
> >
> > On 27 September 2017 at 16:10, Manikumar 
> > wrote:
> >
> > > Hi All,
> > >
> > > I'd like to start the vote on KIP-203. Details are here:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > 203%3A+Add+toLowerCase+support+to+sasl.kerberos.
> principal.to.local+rule
> > >
> > > Thanks,
> > >
> >
>
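The proposed toLowerCase support, sketched as a broker config fragment (the realm and rule are illustrative; the trailing /L is the lowercase modifier the KIP adds):

```properties
# Map User@EXAMPLE.COM to the local name "user": strip the realm with the
# sed-style substitution, then lowercase the result via the proposed /L flag.
sasl.kerberos.principal.to.local.rules=RULE:[1:$1@$0](.*@EXAMPLE.COM)s/@.*//L,DEFAULT
```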


Re: [DISCUSS] KIP-204 : adding records deletion operation to the new Admin Client API

2017-10-18 Thread Dong Lin
Sure. This makes sense. I agree it is better to replace Long with a new
class.

On Wed, Oct 18, 2017 at 6:16 AM, Ismael Juma  wrote:

> Hi Dong,
>
> Yes, I mean replacing the `Long` with a class in the map. The class would
> have static factory methods for the various use cases. To use the
> `createPartitions` example, there is a `NewPartitions.increaseTo` method.
>
> Not sure why you think it's too complicated. It provides better type
> safety, it's more informative and makes it easier to evolve. Thankfully
> Java has lambdas for a while now and mapping a collection from one type to
> another is reasonably simple.
>
> Your suggestion doesn't work because both methods would have the same
> "erased" signature. You can't have two overloaded methods that have the
> same signature apart from generic parameters. Also, we'd end up with 2
> methods in AdminClient.
>
> Ismael
>
>
> On Wed, Oct 18, 2017 at 1:42 PM, Dong Lin  wrote:
>
> > Hey Ismael,
> >
> > To clarify, I think you are saying that we should replace
> > "deleteRecords(Map<TopicPartition, Long> partitionsAndOffsets)" with
> > "deleteRecords(Map<TopicPartition, DeleteRecordsParameter>
> > partitionsAndOffsets)", where DeleteRecordsParameter should include a
> > "Long value", and probably "Boolean isBeforeOrAfter" and "Boolean
> > isOffsetOrTime" in the future.
> >
> > I get the point that we want to only include additional parameter
> > in DeleteRecordsOptions. I just feel it is a bit overkill to have a new
> > class DeleteRecordsParameter which will only support offset in the near
> > future. This method signature seems a bit too complicated.
> >
> > How about we use deleteRecords(Map<TopicPartition, Long>
> > partitionsAndOffsets) for now, and add deleteRecords(Map<TopicPartition,
> > DeleteRecordsParameter> partitionsAndOffsets) when we need to support the
> > core parameter in the future? By doing this we can make the user's
> > experience better (i.e. not having to instantiate DeleteRecordsParameter)
> > until it is necessary to be more complicated.
> >
> > Dong
> >
> > On Wed, Oct 18, 2017 at 4:46 AM, Ismael Juma  wrote:
> >
> > > Hi Dong,
> > >
> > > I think it makes sense to use the parameters to define the specifics of
> > the
> > > request. However, we would probably want to replace the `Long` with a
> > class
> > > (similar to `createPartitions`) instead of relying on
> > > `DeleteRecordsOptions`. The latter is typically used for defining
> > > additional options, not for defining the core behaviour.
> > >
> > > Ismael
> > >
> > > On Wed, Oct 18, 2017 at 12:27 AM, Dong Lin 
> wrote:
> > >
> > > > Hey Colin,
> > > >
> > > > I have also thought about deleteRecordsBeforeOffset so that we can
> keep
> > > the
> > > > name consistent with the existing API in the Scala AdminClient. But
> > then
> > > I
> > > > think it may be better to be able to specify in DeleteRecordsOptions
> > > > whether the deletion is before/after timestamp or offset. By doing
> this
> > > we
> > > > have one API rather than four API in Java AdminClient going forward.
> > What
> > > > do you think?
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > > > On Tue, Oct 17, 2017 at 11:35 AM, Colin McCabe 
> > > wrote:
> > > >
> > > > > Hi Paolo,
> > > > >
> > > > > This is a nice improvement.
> > > > >
> > > > > I agree that the discussion of creating a DeleteTopicPolicy can
> wait
> > > > > until later.  Perhaps we can do it in a follow-on KIP.  However, we
> > do
> > > > > need to specify what ACL permissions are needed to invoke this API.
> > > > > That should be in the JavaDoc comments as well.  Based on the
> > previous
> > > > > discussion, I am assuming that this means DELETE on the TOPIC
> > resource?
> > > > > Can you add this to the KIP?
> > > > >
> > > > > Right now you have the signature:
> > > > > > DeleteRecordsResult deleteRecords(Map<TopicPartition, Long>
> > > > > partitionsAndOffsets)
> > > > >
> > > > > Since this function is all about deleting records that come before
> a
> > > > > certain offset, how about calling it deleteRecordsBeforeOffset?
> That
> > > > > way, if we come up with another way of deleting records in the
> future
> > > > > (such as a timestamp or transaction-based way) it will not be
> > > confusing.
> > > > >
> > > > > On Mon, Oct 16, 2017, at 20:50, Becket Qin wrote:
> > > > > > Hi Paolo,
> > > > > >
> > > > > > Thanks for the KIP and sorry for being late on the thread. I am
> > > > wondering
> > > > > > what is the KafkaFuture returned by all() call? Should it be
> > > > > > a Map instead?
> > > > >
> > > > > Good point.
> > > > >
> > > > > cheers,
> > > > > Colin
> > > > >
> > > > >
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Thu, Sep 28, 2017 at 3:48 AM, Paolo Patierno <
> > ppatie...@live.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi,
> > > > > 

Re: [DISCUSS] KIP-204 : adding records deletion operation to the new Admin Client API

2017-10-18 Thread Ismael Juma
Hi Dong,

Yes, I mean replacing the `Long` with a class in the map. The class would
have static factory methods for the various use cases. To use the
`createPartitions` example, there is a `NewPartitions.increaseTo` method.

Not sure why you think it's too complicated. It provides better type
safety, it's more informative and makes it easier to evolve. Thankfully
Java has lambdas for a while now and mapping a collection from one type to
another is reasonably simple.

Your suggestion doesn't work because both methods would have the same
"erased" signature. You can't have two overloaded methods that have the
same signature apart from generic parameters. Also, we'd end up with 2
methods in AdminClient.

Ismael
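The erasure point can be seen with any two generic overloads; a tiny self-contained illustration:

```java
import java.util.List;

public class ErasureDemo {
    // Both overloads would erase to f(List); uncommenting the second one
    // is a compile error ("both methods have same erasure").
    static void f(List<Long> xs) { }
    // static void f(List<String> xs) { }  // error: erasure clash with f(List<Long>)

    public static void main(String[] args) {
        // At runtime the type argument is gone: both lists share one class.
        System.out.println(List.of(1L).getClass() == List.of("a").getClass());  // prints true
    }
}
```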


On Wed, Oct 18, 2017 at 1:42 PM, Dong Lin  wrote:

> Hey Ismael,
>
> To clarify, I think you are saying that we should replace
> "deleteRecords(Map<TopicPartition, Long> partitionsAndOffsets)" with
> "deleteRecords(Map<TopicPartition, DeleteRecordsParameter>
> partitionsAndOffsets)", where DeleteRecordsParameter should include a
> "Long value", and probably "Boolean isBeforeOrAfter" and "Boolean
> isOffsetOrTime" in the future.
>
> I get the point that we want to only include additional parameter
> in DeleteRecordsOptions. I just feel it is a bit overkill to have a new
> class DeleteRecordsParameter which will only support offset in the near
> future. This method signature seems a bit too complicated.
>
> How about we use deleteRecords(Map<TopicPartition, Long>
> partitionsAndOffsets) for now, and add deleteRecords(Map<TopicPartition,
> DeleteRecordsParameter> partitionsAndOffsets) when we need to support the
> core parameter in the future? By doing this we can make the user's experience
> better (i.e. not having to instantiate DeleteRecordsParameter) until it is
> necessary to be more complicated.
>
> Dong
>
> On Wed, Oct 18, 2017 at 4:46 AM, Ismael Juma  wrote:
>
> > Hi Dong,
> >
> > I think it makes sense to use the parameters to define the specifics of
> the
> > request. However, we would probably want to replace the `Long` with a
> class
> > (similar to `createPartitions`) instead of relying on
> > `DeleteRecordsOptions`. The latter is typically used for defining
> > additional options, not for defining the core behaviour.
> >
> > Ismael
> >
> > On Wed, Oct 18, 2017 at 12:27 AM, Dong Lin  wrote:
> >
> > > Hey Colin,
> > >
> > > I have also thought about deleteRecordsBeforeOffset so that we can keep
> > the
> > > name consistent with the existing API in the Scala AdminClient. But
> then
> > I
> > > think it may be better to be able to specify in DeleteRecordsOptions
> > > whether the deletion is before/after timestamp or offset. By doing this
> > we
> > > have one API rather than four API in Java AdminClient going forward.
> What
> > > do you think?
> > >
> > > Thanks,
> > > Dong
> > >
> > > On Tue, Oct 17, 2017 at 11:35 AM, Colin McCabe 
> > wrote:
> > >
> > > > Hi Paolo,
> > > >
> > > > This is a nice improvement.
> > > >
> > > > I agree that the discussion of creating a DeleteTopicPolicy can wait
> > > > until later.  Perhaps we can do it in a follow-on KIP.  However, we
> do
> > > > need to specify what ACL permissions are needed to invoke this API.
> > > > That should be in the JavaDoc comments as well.  Based on the
> previous
> > > > discussion, I am assuming that this means DELETE on the TOPIC
> resource?
> > > > Can you add this to the KIP?
> > > >
> > > > Right now you have the signature:
> > > > > DeleteRecordsResult deleteRecords(Map<TopicPartition, Long>
> > > > partitionsAndOffsets)
> > > >
> > > > Since this function is all about deleting records that come before a
> > > > certain offset, how about calling it deleteRecordsBeforeOffset?  That
> > > > way, if we come up with another way of deleting records in the future
> > > > (such as a timestamp or transaction-based way) it will not be
> > confusing.
> > > >
> > > > On Mon, Oct 16, 2017, at 20:50, Becket Qin wrote:
> > > > > Hi Paolo,
> > > > >
> > > > > Thanks for the KIP and sorry for being late on the thread. I am
> > > wondering
> > > > > what is the KafkaFuture returned by all() call? Should it be
> > > > > a Map instead?
> > > >
> > > > Good point.
> > > >
> > > > cheers,
> > > > Colin
> > > >
> > > >
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Thu, Sep 28, 2017 at 3:48 AM, Paolo Patierno <
> ppatie...@live.com>
> > > > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > >
> > > > > > maybe we want to start without the delete records policy for now,
> > > > > > waiting for a real need. So I'm removing it from the KIP.
> > > > > >
> > > > > > I hope for more comments on this KIP-204 so that we can start a
> > vote
> > > on
> > > > > > Monday.
> > > > > >
> > > > > >
> > > > > > Thanks.
> > > > > >
> > > > > >
> > > > > > Paolo Patierno
> > > > > > Senior Software 
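The parameter-class approach under discussion, sketched concretely. The class name and factory are proposals from this thread, not a shipped API; the factory style mirrors the `NewPartitions.increaseTo` example Ismael mentions:

```java
// Hedged sketch of the parameter class discussed in this thread; the name
// DeleteRecordsParameter and its factory method are proposals, not a real API.
public final class DeleteRecordsParameter {
    private final long offset;

    private DeleteRecordsParameter(long offset) {
        this.offset = offset;
    }

    // Static factory in the style of NewPartitions.increaseTo; other factories
    // (e.g. beforeTimestamp) could be added later without changing the map's
    // value type in deleteRecords(Map<TopicPartition, DeleteRecordsParameter>).
    public static DeleteRecordsParameter beforeOffset(long offset) {
        return new DeleteRecordsParameter(offset);
    }

    public long offset() {
        return offset;
    }
}
```

A caller would then build entries like `DeleteRecordsParameter.beforeOffset(100L)` per partition, which is what makes the map's value type self-describing compared with a bare `Long`.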

[jira] [Created] (KAFKA-6078) Investigate failure of ReassignPartitionsClusterTest.shouldExpandCluster

2017-10-18 Thread Dong Lin (JIRA)
Dong Lin created KAFKA-6078:
---

 Summary: Investigate failure of 
ReassignPartitionsClusterTest.shouldExpandCluster
 Key: KAFKA-6078
 URL: https://issues.apache.org/jira/browse/KAFKA-6078
 Project: Kafka
  Issue Type: Bug
Reporter: Dong Lin
Assignee: Dong Lin


See https://github.com/apache/kafka/pull/4084



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [DISCUSS] KIP-204 : adding records deletion operation to the new Admin Client API

2017-10-18 Thread Dong Lin
Hey Ismael,

To clarify, I think you are saying that we should replace
"deleteRecords(Map<TopicPartition, Long> partitionsAndOffsets)" with
"deleteRecords(Map<TopicPartition, DeleteRecordsParameter>
partitionsAndOffsets)", where DeleteRecordsParameter should include a
"Long value", and probably "Boolean isBeforeOrAfter" and "Boolean
isOffsetOrTime" in the future.

I get the point that we want to only include additional parameter
in DeleteRecordsOptions. I just feel it is a bit overkill to have a new
class DeleteRecordsParameter which will only support offset in the near
future. This method signature seems a bit too complicated.

How about we use deleteRecords(Map<TopicPartition, Long>
partitionsAndOffsets) for now, and add deleteRecords(Map<TopicPartition,
DeleteRecordsParameter> partitionsAndOffsets) when we need to support the core
parameter in the future? By doing this we can make the user's experience better
(i.e. not having to instantiate DeleteRecordsParameter) until it is
necessary to be more complicated.

Dong

On Wed, Oct 18, 2017 at 4:46 AM, Ismael Juma  wrote:

> Hi Dong,
>
> I think it makes sense to use the parameters to define the specifics of the
> request. However, we would probably want to replace the `Long` with a class
> (similar to `createPartitions`) instead of relying on
> `DeleteRecordsOptions`. The latter is typically used for defining
> additional options, not for defining the core behaviour.
>
> Ismael
>
> On Wed, Oct 18, 2017 at 12:27 AM, Dong Lin  wrote:
>
> > Hey Colin,
> >
> > I have also thought about deleteRecordsBeforeOffset so that we can keep
> the
> > name consistent with the existing API in the Scala AdminClient. But then
> I
> > think it may be better to be able to specify in DeleteRecordsOptions
> > whether the deletion is before/after timestamp or offset. By doing this
> we
> > have one API rather than four API in Java AdminClient going forward. What
> > do you think?
> >
> > Thanks,
> > Dong
> >
> > On Tue, Oct 17, 2017 at 11:35 AM, Colin McCabe 
> wrote:
> >
> > > Hi Paolo,
> > >
> > > This is a nice improvement.
> > >
> > > I agree that the discussion of creating a DeleteTopicPolicy can wait
> > > until later.  Perhaps we can do it in a follow-on KIP.  However, we do
> > > need to specify what ACL permissions are needed to invoke this API.
> > > That should be in the JavaDoc comments as well.  Based on the previous
> > > discussion, I am assuming that this means DELETE on the TOPIC resource?
> > > Can you add this to the KIP?
> > >
> > > Right now you have the signature:
> > > > DeleteRecordsResult deleteRecords(Map<TopicPartition, Long>
> > > partitionsAndOffsets)
> > >
> > > Since this function is all about deleting records that come before a
> > > certain offset, how about calling it deleteRecordsBeforeOffset?  That
> > > way, if we come up with another way of deleting records in the future
> > > (such as a timestamp or transaction-based way) it will not be
> confusing.
> > >
> > > On Mon, Oct 16, 2017, at 20:50, Becket Qin wrote:
> > > > Hi Paolo,
> > > >
> > > > Thanks for the KIP and sorry for being late on the thread. I am
> > wondering
> > > > what is the KafkaFuture returned by all() call? Should it be a
> > > > Map instead?
> > >
> > > Good point.
> > >
> > > cheers,
> > > Colin
> > >
> > >
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Thu, Sep 28, 2017 at 3:48 AM, Paolo Patierno 
> > > > wrote:
> > > >
> > > > > Hi,
> > > > >
> > > > >
> > > > > maybe we want to start without the delete records policy for now,
> > > > > waiting for a real need. So I'm removing it from the KIP.
> > > > >
> > > > > I hope for more comments on this KIP-204 so that we can start a
> vote
> > on
> > > > > Monday.
> > > > >
> > > > >
> > > > > Thanks.
> > > > >
> > > > >
> > > > > Paolo Patierno
> > > > > Senior Software Engineer (IoT) @ Red Hat
> > > > > Microsoft MVP on Azure & IoT
> > > > > Microsoft Azure Advisor
> > > > >
> > > > > Twitter : @ppatierno
> > > > > Linkedin : paolopatierno
> > > > > Blog : DevExperience
> > > > >
> > > > >
> > > > > 
> > > > > From: Paolo Patierno 
> > > > > Sent: Thursday, September 28, 2017 5:56 AM
> > > > > To: dev@kafka.apache.org
> > > > > Subject: Re: [DISCUSS] KIP-204 : adding records deletion operation
> to
> > > the
> > > > > new Admin Client API
> > > > >
> > > > > Hi,
> > > > >
> > > > >
> > > > > I have just updated the KIP-204 description with the new
> > > > > TopicDeletionPolicy suggested by the KIP-201.
> > > > >
> > > > >
> > > > > Paolo Patierno
> > > > > Senior Software Engineer (IoT) @ Red Hat
> > > > > Microsoft MVP on Azure & IoT
> > > > > Microsoft Azure Advisor
> > > > >
> > > > > Twitter : 

Re: [DISCUSS]: KIP-159: Introducing Rich functions to Streams

2017-10-18 Thread Jeyhun Karimov
Hi,

The main intuition is to solve [1], which is part of this KIP.
I agree with you that this might not seem semantically correct as we are
not committing record state.
Alternatively, we can remove commit() from RecordContext and add
ProcessorContext (which has commit() method) as an extra argument to Rich
methods:

instead of
public interface RichValueMapper {
VR apply(final V value,
 final K key,
 final RecordContext recordContext);
}

we can adopt

public interface RichValueMapper {
VR apply(final V value,
 final K key,
 final RecordContext recordContext,
 final ProcessorContext processorContext);
}


However, in this case, a user can get confused as ProcessorContext and
RecordContext share some methods with the same name.


Cheers,
Jeyhun


[1] https://issues.apache.org/jira/browse/KAFKA-3907


On Tue, Oct 17, 2017 at 3:19 AM Guozhang Wang  wrote:

> Regarding #6 above, I'm still not clear why we would need `commit()` in
> both ProcessorContext and RecordContext, could you elaborate a bit more?
>
> To me `commit()` is really a processor context not a record context
> logically: when you call that function, it means we would commit the state
> of the whole task up to this processed record, not only that single record
> itself.
>
>
> Guozhang
>
> On Mon, Oct 16, 2017 at 9:19 AM, Jeyhun Karimov 
> wrote:
>
> > Hi,
> >
> > Thanks for the feedback.
> >
> >
> > 0. RichInitializer definition seems missing.
> >
> >
> >
> > - Fixed.
> >
> >
> >  I'd suggest moving the key parameter in the RichValueXX and RichReducer
> > > after the value parameters, as well as in the templates; e.g.
> > > public interface RichValueJoiner {
> > > VR apply(final V1 value1, final V2 value2, final K key, final
> > > RecordContext
> > > recordContext);
> > > }
> >
> >
> >
> > - Fixed.
> >
> >
> > 2. Some of the listed functions are not necessary since their pairing
> APIs
> > > are being deprecated in 1.0 already:
> > >  KGroupedStream groupBy(final RichKeyValueMapper ?
> > > super V, KR> selector,
> > >final Serde keySerde,
> > >final Serde valSerde);
> > >  KStream leftJoin(final KTable table,
> > >  final RichValueJoiner super
> > > V,
> > > ? super VT, ? extends VR> joiner,
> > >  final Serde keySerde,
> > >  final Serde valSerde);
> >
> >
> > -Fixed
> >
> > 3. For a few functions where we are adding three APIs for a combo of both
> > > mapper / joiner, or both initializer / aggregator, or adder /
> subtractor,
> > > I'm wondering if we can just keep one that use "rich" functions for
> both;
> > > so that we can have less overloads and let users who only want to
> access
> > > one of them to just use dummy parameter declarations. For example:
> > >
> > >  KStream join(final GlobalKTable
> globalKTable,
> > >  final RichKeyValueMapper > > super
> > >  V, ? extends GK> keyValueMapper,
> > >  final RichValueJoiner super
> > > V,
> > > ? super GV, ? extends RV> joiner);
> >
> >
> >
> > -Agreed. Fixed.
> >
> >
> > 4. For TimeWindowedKStream, I'm wondering why we do not make its
> > > Initializer also "rich" functions? I.e.
> >
> >
> > - It was a typo. Fixed.
> >
> >
> > 5. We need to move "RecordContext" from o.a.k.processor.internals to
> > > o.a.k.processor.
> > >
> > > 6. I'm not clear why we want to move `commit()` from ProcessorContext
> to
> > > RecordContext?
> > >
> >
> > -
> > Because it makes sense logically, and to reduce code maintenance (both
> > interfaces have offset(), timestamp(), topic() and partition() methods), I
> > made ProcessorContext inherit from RecordContext.
> > Since we need commit() method both in ProcessorContext and in
> RecordContext
> > I move commit() method to parent class (RecordContext).
> >
> >
> > Cheers,
> > Jeyhun
> >
> >
> >
> > On Wed, Oct 11, 2017 at 12:59 AM, Guozhang Wang 
> > wrote:
> >
> > > Jeyhun,
> > >
> > > Thanks for the updated KIP, here are my comments.
> > >
> > > 0. RichInitializer definition seems missing.
> > >
> > > 1. I'd suggest moving the key parameter in the RichValueXX and
> > RichReducer
> > > after the value parameters, as well as in the templates; e.g.
> > >
> > > public interface RichValueJoiner<V1, V2, K, VR> {
> > > VR apply(final V1 value1, final V2 value2, final K key,
> > >  final RecordContext recordContext);
> > > }
> > >
> > > My motivation is that for lambda expressions in Java 8, users that would
> > > not care about the key but only the context, or vice versa, are likely to
> > > write it as (value1, value2, dummy, context) -> ... rather than putting
> > > the dummy at the beginning of the parameter list.
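A toy sketch of the ergonomics being argued here (the `RichValueJoiner` shape follows the snippet above; the stub `RecordContext` and helper are assumptions for the example):

```java
public class JoinerOrderSketch {
    // Stub with a single abstract method so a lambda can stand in for it.
    interface RecordContext { long offset(); }

    // Hypothetical rich joiner with the key placed AFTER the values,
    // matching the parameter order suggested in the email.
    interface RichValueJoiner<V1, V2, K, VR> {
        VR apply(V1 value1, V2 value2, K key, RecordContext ctx);
    }

    static int joinIgnoringKey(int v1, int v2) {
        // A caller who only needs the values leaves the trailing parameters
        // as unused lambda arguments instead of a leading dummy.
        RichValueJoiner<Integer, Integer, String, Integer> joiner =
                (a, b, ignoredKey, ctx) -> a + b;
        return joiner.apply(v1, v2, "unused-key", () -> 0L);
    }

    public static void main(String[] args) {
        System.out.println(joinIgnoringKey(3, 4)); // prints 7
    }
}
```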

[GitHub] kafka pull request #2555: MINOR: Javadoc typo

2017-10-18 Thread astubbs
Github user astubbs closed the pull request at:

https://github.com/apache/kafka/pull/2555


---


Re: [DISCUSS] KIP-204 : adding records deletion operation to the new Admin Client API

2017-10-18 Thread Ismael Juma
Hi Dong,

I think it makes sense to use the parameters to define the specifics of the
request. However, we would probably want to replace the `Long` with a class
(similar to `createPartitions`) instead of relying on
`DeleteRecordsOptions`. The latter is typically used for defining
additional options, not for defining the core behaviour.

Ismael
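An illustrative sketch of Ismael's suggestion: a small value class per partition instead of a bare `Long`, leaving `DeleteRecordsOptions` for auxiliary options. The class name `RecordsToDelete` and its factory method are assumptions for the example, not the agreed API:

```java
public class DeleteTargetSketch {
    // A dedicated value class makes the deletion target explicit and leaves
    // room to grow (e.g. timestamp-based deletion) without new overloads.
    static final class RecordsToDelete {
        private final long beforeOffset;

        private RecordsToDelete(long beforeOffset) {
            this.beforeOffset = beforeOffset;
        }

        // Named factory method, in the style of createPartitions' helper types.
        static RecordsToDelete beforeOffset(long offset) {
            return new RecordsToDelete(offset);
        }

        long beforeOffset() {
            return beforeOffset;
        }
    }

    public static void main(String[] args) {
        // A deleteRecords(Map<TopicPartition, RecordsToDelete>) call site
        // would then read explicitly:
        RecordsToDelete target = RecordsToDelete.beforeOffset(100L);
        System.out.println(target.beforeOffset()); // prints 100
    }
}
```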

On Wed, Oct 18, 2017 at 12:27 AM, Dong Lin  wrote:

> Hey Colin,
>
> I have also thought about deleteRecordsBeforeOffset so that we can keep the
> name consistent with the existing API in the Scala AdminClient. But then I
> think it may be better to be able to specify in DeleteRecordsOptions
> whether the deletion is before/after timestamp or offset. By doing this we
> have one API rather than four APIs in the Java AdminClient going forward. What
> do you think?
>
> Thanks,
> Dong
>
> On Tue, Oct 17, 2017 at 11:35 AM, Colin McCabe  wrote:
>
> > Hi Paolo,
> >
> > This is a nice improvement.
> >
> > I agree that the discussion of creating a DeleteTopicPolicy can wait
> > until later.  Perhaps we can do it in a follow-on KIP.  However, we do
> > need to specify what ACL permissions are needed to invoke this API.
> > That should be in the JavaDoc comments as well.  Based on the previous
> > discussion, I am assuming that this means DELETE on the TOPIC resource?
> > Can you add this to the KIP?
> >
> > Right now you have the signature:
> > > DeleteRecordsResult deleteRecords(Map<TopicPartition, Long> partitionsAndOffsets)
> >
> > Since this function is all about deleting records that come before a
> > certain offset, how about calling it deleteRecordsBeforeOffset?  That
> > way, if we come up with another way of deleting records in the future
> > (such as a timestamp or transaction-based way) it will not be confusing.
> >
> > On Mon, Oct 16, 2017, at 20:50, Becket Qin wrote:
> > > Hi Paolo,
> > >
> > > Thanks for the KIP and sorry for being late on the thread. I am
> wondering
> > > what is the KafkaFuture returned by the all() call? Should it be a
> > > Map instead?
> >
> > Good point.
> >
> > cheers,
> > Colin
> >
> >
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Thu, Sep 28, 2017 at 3:48 AM, Paolo Patierno 
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > >
> > > > maybe we want to start without the delete records policy for now,
> > > > waiting for a real need. So I'm removing it from the KIP.
> > > >
> > > > I hope for more comments on this KIP-204 so that we can start a vote
> on
> > > > Monday.
> > > >
> > > >
> > > > Thanks.
> > > >
> > > >
> > > > Paolo Patierno
> > > > Senior Software Engineer (IoT) @ Red Hat
> > > > Microsoft MVP on Azure & IoT
> > > > Microsoft Azure Advisor
> > > >
> > > > Twitter : @ppatierno
> > > > Linkedin : paolopatierno
> > > > Blog : DevExperience
> > > >
> > > >
> > > > 
> > > > From: Paolo Patierno 
> > > > Sent: Thursday, September 28, 2017 5:56 AM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [DISCUSS] KIP-204 : adding records deletion operation to
> > the
> > > > new Admin Client API
> > > >
> > > > Hi,
> > > >
> > > >
> > > > I have just updated the KIP-204 description with the new
> > > > TopicDeletionPolicy suggested by the KIP-201.
> > > >
> > > >
> > > > Paolo Patierno
> > > > Senior Software Engineer (IoT) @ Red Hat
> > > > Microsoft MVP on Azure & IoT
> > > > Microsoft Azure Advisor
> > > >
> > > > Twitter : @ppatierno
> > > > Linkedin : paolopatierno
> > > > Blog : DevExperience
> > > >
> > > >
> > > > 
> > > > From: Paolo Patierno 
> > > > Sent: Tuesday, September 26, 2017 4:57 PM
> > > > To: dev@kafka.apache.org
> > > > Subject: Re: [DISCUSS] KIP-204 : adding records deletion operation to
> > the
> > > > new Admin Client API
> > > >
> > > > Hi Tom,
> > > >
> > > > as I said in the KIP-201 discussion I'm ok with having a unique
> > > > DeleteTopicPolicy, but then it could be useful to have more information
> > > > than just the topic name; partitions and offsets for message deletion
> > > > could be useful for fine-grained use cases.
> > > >
> > > >
> > > > Paolo Patierno
> > > > Senior Software Engineer (IoT) @ Red Hat
> > > > Microsoft MVP on Azure & IoT
> > > > Microsoft Azure Advisor
> > > >
> > > > Twitter : @ppatierno
> > > > Linkedin : paolopatierno
> > > > Blog : DevExperience
> > > >
> > > >
> > > > 
> > > > From: Tom Bentley 
> > > > Sent: Tuesday, September 26, 2017 4:32 PM
> > > > To: dev@kafka.apache.org
> > > > 

Re: [VOTE] KIP-207: The Offsets which ListOffsetsResponse returns should monotonically increase even during a partition leader change

2017-10-18 Thread Ismael Juma
Thanks for the KIP, +1 (binding). A few comments:

1. I agree with Jun about LEADER_NOT_AVAILABLE for the error code for older
versions.
2. OffsetNotAvailableException seems clear enough (i.e. we don't need the
"ForPartition" part)
3. The KIP seems to be missing the compatibility section.
4. It would be good to mention that it's now possible for a fetch to
succeed while list offsets will not for a period of time. And for older
versions, the latter will return LeaderNotAvailable while the former would
work fine, which is a bit unexpected. Not much we can do about it, but
worth mentioning it in my opinion.

Ismael

On Tue, Oct 17, 2017 at 9:26 PM, Jun Rao  wrote:

> Hi, Colin,
>
> Thanks for the KIP. +1. Just a minor comment. For the old client requests,
> would it be better to return a LEADER_NOT_AVAILABLE error instead?
>
> Jun
>
> On Tue, Oct 17, 2017 at 11:11 AM, Colin McCabe  wrote:
>
> > Hi all,
> >
> > I'd like to start the voting process for KIP-207: The Offsets which
> > ListOffsetsResponse returns should monotonically increase even during a
> > partition leader change.
> >
> > See
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 207%3A+Offsets+returned+by+ListOffsetsResponse+should+be+
> > monotonically+increasing+even+during+a+partition+leader+change
> > for details.
> >
> > The voting process will run for at least 72 hours.
> >
> > regards,
> > Colin
> >
>


Re: [VOTE] KIP-204 : adding records deletion operation to the new Admin Client API

2017-10-18 Thread Viktor Somogyi
+1 (non-binding)

On Wed, Oct 18, 2017 at 8:23 AM, Manikumar 
wrote:

> +1 (non-binding)
>
>
> Thanks,
> Manikumar
>
> On Tue, Oct 17, 2017 at 7:42 AM, Dong Lin  wrote:
>
> > Thanks for the KIP. +1 (non-binding)
> >
> > On Wed, Oct 11, 2017 at 2:27 AM, Ted Yu  wrote:
> >
> > > +1
> > >
> > > On Mon, Oct 2, 2017 at 10:51 PM, Paolo Patierno 
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > I didn't see any further discussion around this KIP, so I'd like to
> > start
> > > > the vote for it.
> > > >
> > > > Just for reference : https://cwiki.apache.org/
> > > > confluence/display/KAFKA/KIP-204+%3A+adding+records+
> > > > deletion+operation+to+the+new+Admin+Client+API
> > > >
> > > >
> > > > Thanks,
> > > >
> > > > Paolo Patierno
> > > > Senior Software Engineer (IoT) @ Red Hat
> > > > Microsoft MVP on Azure & IoT
> > > > Microsoft Azure Advisor
> > > >
> > > > Twitter : @ppatierno
> > > > Linkedin : paolopatierno
> > > > Blog : DevExperience
> > > >
> > >
> >
>


Build failed in Jenkins: kafka-trunk-jdk9 #132

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-6023 ThreadCache#sizeBytes() should check overflow

--
[...truncated 1.40 MB...]
kafka.message.MessageTest > testFieldValues STARTED

kafka.message.MessageTest > testFieldValues PASSED

kafka.message.MessageTest > testInvalidMagicByte STARTED

kafka.message.MessageTest > testInvalidMagicByte PASSED

kafka.message.MessageTest > testEquality STARTED

kafka.message.MessageTest > testEquality PASSED

kafka.message.MessageCompressionTest > testCompressSize STARTED

kafka.message.MessageCompressionTest > testCompressSize PASSED

kafka.message.MessageCompressionTest > testSimpleCompressDecompress STARTED

kafka.message.MessageCompressionTest > testSimpleCompressDecompress PASSED

kafka.message.ByteBufferMessageSetTest > testMessageWithProvidedOffsetSeq 
STARTED

kafka.message.ByteBufferMessageSetTest > testMessageWithProvidedOffsetSeq PASSED

kafka.message.ByteBufferMessageSetTest > testValidBytes STARTED

kafka.message.ByteBufferMessageSetTest > testValidBytes PASSED

kafka.message.ByteBufferMessageSetTest > testValidBytesWithCompression STARTED

kafka.message.ByteBufferMessageSetTest > testValidBytesWithCompression PASSED

kafka.message.ByteBufferMessageSetTest > 
testWriteToChannelThatConsumesPartially STARTED

kafka.message.ByteBufferMessageSetTest > 
testWriteToChannelThatConsumesPartially PASSED

kafka.message.ByteBufferMessageSetTest > testIteratorIsConsistent STARTED

kafka.message.ByteBufferMessageSetTest > testIteratorIsConsistent PASSED

kafka.message.ByteBufferMessageSetTest > testWrittenEqualsRead STARTED

kafka.message.ByteBufferMessageSetTest > testWrittenEqualsRead PASSED

kafka.message.ByteBufferMessageSetTest > testWriteTo STARTED

kafka.message.ByteBufferMessageSetTest > testWriteTo PASSED

kafka.message.ByteBufferMessageSetTest > testEquals STARTED

kafka.message.ByteBufferMessageSetTest > testEquals PASSED

kafka.message.ByteBufferMessageSetTest > testSizeInBytes STARTED

kafka.message.ByteBufferMessageSetTest > testSizeInBytes PASSED

kafka.message.ByteBufferMessageSetTest > testIterator STARTED

kafka.message.ByteBufferMessageSetTest > testIterator PASSED

kafka.metrics.KafkaTimerTest > testKafkaTimer STARTED

kafka.metrics.KafkaTimerTest > testKafkaTimer PASSED

kafka.metrics.MetricsTest > testMetricsReporterAfterDeletingTopic STARTED

kafka.metrics.MetricsTest > testMetricsReporterAfterDeletingTopic PASSED

kafka.metrics.MetricsTest > 
testBrokerTopicMetricsUnregisteredAfterDeletingTopic STARTED

kafka.metrics.MetricsTest > 
testBrokerTopicMetricsUnregisteredAfterDeletingTopic PASSED

kafka.metrics.MetricsTest > testClusterIdMetric STARTED

kafka.metrics.MetricsTest > testClusterIdMetric PASSED

kafka.metrics.MetricsTest > testControllerMetrics STARTED

kafka.metrics.MetricsTest > testControllerMetrics PASSED

kafka.metrics.MetricsTest > testMetricsLeak STARTED

kafka.metrics.MetricsTest > testMetricsLeak PASSED

kafka.metrics.MetricsTest > testBrokerTopicMetricsBytesInOut STARTED

kafka.metrics.MetricsTest > testBrokerTopicMetricsBytesInOut PASSED

kafka.javaapi.consumer.ZookeeperConsumerConnectorTest > testBasic STARTED

kafka.javaapi.consumer.ZookeeperConsumerConnectorTest > testBasic PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testSizeInBytesWithCompression 
STARTED

kafka.javaapi.message.ByteBufferMessageSetTest > testSizeInBytesWithCompression 
PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > 
testIteratorIsConsistentWithCompression STARTED

kafka.javaapi.message.ByteBufferMessageSetTest > 
testIteratorIsConsistentWithCompression PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testIteratorIsConsistent 
STARTED

kafka.javaapi.message.ByteBufferMessageSetTest > testIteratorIsConsistent PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testEqualsWithCompression 
STARTED

kafka.javaapi.message.ByteBufferMessageSetTest > testEqualsWithCompression 
PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testWrittenEqualsRead STARTED

kafka.javaapi.message.ByteBufferMessageSetTest > testWrittenEqualsRead PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testEquals STARTED

kafka.javaapi.message.ByteBufferMessageSetTest > testEquals PASSED

kafka.javaapi.message.ByteBufferMessageSetTest > testSizeInBytes STARTED

kafka.javaapi.message.ByteBufferMessageSetTest > testSizeInBytes PASSED

kafka.security.auth.ResourceTypeTest > testJavaConversions STARTED

kafka.security.auth.ResourceTypeTest > testJavaConversions PASSED

kafka.security.auth.ResourceTypeTest > testFromString STARTED

kafka.security.auth.ResourceTypeTest > testFromString PASSED

kafka.security.auth.OperationTest > testJavaConversions STARTED

kafka.security.auth.OperationTest > testJavaConversions PASSED

kafka.security.auth.PermissionTypeTest > testJavaConversions STARTED

kafka.security.auth.PermissionTypeTest > 

Build failed in Jenkins: kafka-trunk-jdk8 #2144

2017-10-18 Thread Apache Jenkins Server
See 


Changes:

[damian.guy] KAFKA-6023 ThreadCache#sizeBytes() should check overflow

--
[...truncated 1.77 MB...]

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenCreated PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testMultipleConsumersCanReadFromPartitionedTopic PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testRegexMatchesTopicsAWhenDeleted PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
testNoMessagesSentExceptionFromOverlappingPatterns PASSED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource STARTED

org.apache.kafka.streams.integration.RegexSourceIntegrationTest > 
shouldAddStateStoreToRegexDefinedSource PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithZeroSizedCache PASSED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache STARTED

org.apache.kafka.streams.integration.KStreamRepartitionJoinTest > 
shouldCorrectlyRepartitionOnJoinOperationsWithNonZeroSizedCache PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldCompactTopicsForStateChangelogs PASSED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs STARTED

org.apache.kafka.streams.integration.InternalTopicIntegrationTest > 
shouldUseCompactAndDeleteForWindowStoreChangelogs PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduce PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregate PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCount PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldGroupByKey PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountWithInternalStore PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldReduceWindowed PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldCountSessionWindows PASSED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed STARTED

org.apache.kafka.streams.integration.KStreamAggregationIntegrationTest > 
shouldAggregateWindowed PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableLeftJoin PASSED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin STARTED

org.apache.kafka.streams.integration.GlobalKTableIntegrationTest > 
shouldKStreamGlobalKTableJoin PASSED

org.apache.kafka.streams.integration.ResetIntegrationTest > 
testReprocessingFromScratchAfterResetWithIntermediateUserTopic STARTED

org.apache.kafka.streams.integration.ResetIntegrationTest > 

[GitHub] kafka pull request #4041: KAFKA-6023 ThreadCache#sizeBytes() should check ov...

2017-10-18 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/4041


---


[jira] [Resolved] (KAFKA-6023) ThreadCache#sizeBytes() should check overflow

2017-10-18 Thread Damian Guy (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-6023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Damian Guy resolved KAFKA-6023.
---
   Resolution: Fixed
Fix Version/s: 1.1.0

Issue resolved by pull request 4041
[https://github.com/apache/kafka/pull/4041]

> ThreadCache#sizeBytes() should check overflow
> -
>
> Key: KAFKA-6023
> URL: https://issues.apache.org/jira/browse/KAFKA-6023
> Project: Kafka
>  Issue Type: Bug
>  Components: streams
>Reporter: Ted Yu
>Assignee: siva santhalingam
>Priority: Minor
> Fix For: 1.1.0
>
>
> {code}
> long sizeBytes() {
> long sizeInBytes = 0;
> for (final NamedCache namedCache : caches.values()) {
> sizeInBytes += namedCache.sizeInBytes();
> }
> return sizeInBytes;
> }
> {code}
> The summation w.r.t. sizeInBytes may overflow.
> A check similar to what is done in size() should be performed.
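One way to guard the summation against long overflow is `Math.addExact`, which throws `ArithmeticException` on overflow; the sketch below saturates to `Long.MAX_VALUE` in that case. This is an illustrative fix under the assumptions stated in the comments, not necessarily the change that was merged:

```java
import java.util.Arrays;
import java.util.List;

public class SizeBytesSketch {
    // Stand-in for summing NamedCache.sizeInBytes() over caches.values();
    // the List<Long> parameter replaces the cache map for this sketch.
    static long sizeBytes(List<Long> cacheSizes) {
        long sizeInBytes = 0;
        for (long s : cacheSizes) {
            try {
                sizeInBytes = Math.addExact(sizeInBytes, s);
            } catch (ArithmeticException overflow) {
                // Saturate instead of silently wrapping to a negative value.
                return Long.MAX_VALUE;
            }
        }
        return sizeInBytes;
    }

    public static void main(String[] args) {
        System.out.println(sizeBytes(Arrays.asList(1L, 2L, 3L))); // prints 6
        // Overflowing input saturates rather than going negative:
        System.out.println(sizeBytes(Arrays.asList(Long.MAX_VALUE, 1L)));
    }
}
```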



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[GitHub] kafka-site pull request #99: Rephrase mailing lists section to avoid incorre...

2017-10-18 Thread scholzj
Github user scholzj commented on a diff in the pull request:

https://github.com/apache/kafka-site/pull/99#discussion_r145347106
  
--- Diff: contact.html ---
@@ -11,16 +11,16 @@



- us...@kafka.apache.org: A list for general user 
questions about Kafka. To subscribe, send an email to users-subscr...@kafka.apache.org. 
Archives are available http://mail-archives.apache.org/mod_mbox/kafka-users;>here.
+   User mailing list: A list for general 
user questions about Kafka. To subscribe, send an email to users-subscr...@kafka.apache.org. Once 
subscribed, send your emails to us...@kafka.apache.org. Archives are available http://mail-archives.apache.org/mod_mbox/kafka-users;>here.


- dev@kafka.apache.org: A list for discussion on Kafka 
development. To subscribe, send an email to dev-subscr...@kafka.apache.org. Archives 
available http://mail-archives.apache.org/mod_mbox/kafka-dev;>here.
+   Developer mailing list: A list for 
discussion on Kafka development. To subscribe, send an email to dev-subscr...@kafka.apache.org. Once 
subscribed, send your emails to dev@kafka.apache.org. Archives are available http://mail-archives.apache.org/mod_mbox/kafka-dev;>here.


- j...@kafka.apache.org: A list to track Kafka https://issues.apache.org/jira/projects/KAFKA;>JIRA notifications. To 
subscribe, send an email to jira-subscr...@kafka.apache.org. Archives 
available http://mail-archives.apache.org/mod_mbox/kafka-jira;>here.
+   JIRA mailing list: A list to track 
Kafka https://issues.apache.org/jira/projects/KAFKA;>JIRA 
notifications. To subscribe, send an email to jira-subscr...@kafka.apache.org. Once 
subscribed, send your emails to j...@kafka.apache.org. Archives are available http://mail-archives.apache.org/mod_mbox/kafka-jira;>here.
--- End diff --

Fixed.


---


[GitHub] kafka-site pull request #99: Rephrase mailing lists section to avoid incorre...

2017-10-18 Thread scholzj
Github user scholzj commented on a diff in the pull request:

https://github.com/apache/kafka-site/pull/99#discussion_r145345710
  
--- Diff: contact.html ---
@@ -11,16 +11,16 @@



- us...@kafka.apache.org: A list for general user 
questions about Kafka. To subscribe, send an email to users-subscr...@kafka.apache.org. 
Archives are available http://mail-archives.apache.org/mod_mbox/kafka-users;>here.
+   User maling list: A list for general 
user questions about Kafka. To subscribe, send an email to users-subscr...@kafka.apache.org. Once 
subscribed, send your emails to us...@kafka.apache.org. Archives are available http://mail-archives.apache.org/mod_mbox/kafka-users;>here.
--- End diff --

Fixed, thanks for noticing.


---


[GitHub] kafka-site pull request #99: Rephrase mailing lists section to avoid incorre...

2017-10-18 Thread tombentley
Github user tombentley commented on a diff in the pull request:

https://github.com/apache/kafka-site/pull/99#discussion_r145345640
  
--- Diff: contact.html ---
@@ -11,16 +11,16 @@



- us...@kafka.apache.org: A list for general user 
questions about Kafka. To subscribe, send an email to users-subscr...@kafka.apache.org. 
Archives are available http://mail-archives.apache.org/mod_mbox/kafka-users;>here.
+   User mailing list: A list for general 
user questions about Kafka. To subscribe, send an email to users-subscr...@kafka.apache.org. Once 
subscribed, send your emails to us...@kafka.apache.org. Archives are available http://mail-archives.apache.org/mod_mbox/kafka-users;>here.


- dev@kafka.apache.org: A list for discussion on Kafka 
development. To subscribe, send an email to dev-subscr...@kafka.apache.org. Archives 
available http://mail-archives.apache.org/mod_mbox/kafka-dev;>here.
+   Developer mailing list: A list for 
discussion on Kafka development. To subscribe, send an email to dev-subscr...@kafka.apache.org. Once 
subscribed, send your emails to dev@kafka.apache.org. Archives are available http://mail-archives.apache.org/mod_mbox/kafka-dev;>here.


- j...@kafka.apache.org: A list to track Kafka https://issues.apache.org/jira/projects/KAFKA;>JIRA notifications. To 
subscribe, send an email to jira-subscr...@kafka.apache.org. Archives 
available http://mail-archives.apache.org/mod_mbox/kafka-jira;>here.
+   JIRA mailing list: A list to track 
Kafka https://issues.apache.org/jira/projects/KAFKA;>JIRA 
notifications. To subscribe, send an email to jira-subscr...@kafka.apache.org. Once 
subscribed, send your emails to j...@kafka.apache.org. Archives are available http://mail-archives.apache.org/mod_mbox/kafka-jira;>here.
--- End diff --

"Once subscribed, send your emails to..." I'm not sure that's how the JIRA 
list is intended to work. It's simply about notification of changes in JIRA; 
people won't normally mail the list themselves.


---


Re: KafkaSpout not consuming the first uncommitted offset data from kafka

2017-10-18 Thread Michael Noll
Hi Senthil,

you should ask this question in the Apache Storm mailing list.

At first sight this looks like a problem with Storm's KafkaSpout
implementation, not with Kafka.

Best wishes,
Michael




On Thu, Sep 28, 2017 at 8:47 PM, senthil kumar 
wrote:

> Hi Kafka,
>
> I have a Trident topology in Storm which consumes data from Kafka. Now I am
> seeing an issue in KafkaSpout. It is not consuming the very first
> uncommitted offset's data from Kafka.
>
> My storm version is 1.1.1 and kafka version is 0.11.0.0. I have a topic say
> X and partition of the topic is 3.
>
> I have following configuration to consume data using KafkaSpout
>
>
> KafkaSpoutConfig<String, String> kafkaConfig =
> KafkaSpoutConfig.builder(PropertyUtil.getStringValue(
> PropertyUtil.KAFKA_BROKERS),
> PropertyUtil.getStringValue(PropertyUtil.TOPIC_NAME))
> .setProp(ConsumerConfig.FETCH_MAX_BYTES_CONFIG, "4194304")
> .setProp(ConsumerConfig.GROUP_ID_CONFIG,PropertyUtil.
> getStringValue(PropertyUtil.CONSUMER_ID))
> .setProp(ConsumerConfig.RECEIVE_BUFFER_CONFIG, "4194304")
> .setProp(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "true")
> .setFirstPollOffsetStrategy(FirstPollOffsetStrategy.UNCOMMITTED_LATEST)
> .build();
>
> TopologyBuilder builder = new TopologyBuilder();
>
> builder.setSpout("spout", new KafkaSpout<>(kafkaConfig), 3);
>
> Following are my test cases
>
> 1. Processor started with a new consumer id. The very first time, it starts
> to read the data from the latest offset. Fine.
> 2. Sending some messages to kafka and i am seeing all the messages are
> consumed by my trident topology.
> 3. Stopped my trident topology.
> 4. Sending some messages to kafka (partition_0). Say example
> > msg_1
> > msg_2
> > msg_3
> > msg_4
> > msg_5
>
> 5. Started the topology. And KafkaSpout consumes the data from msg_2. It is
> not consuming msg_1.
> 6. Stopped the topology.
> 7. Sending some messages to kafka to all the partitions (_0, _1, _2). Say
> example
> Partition_0
> > msg_6
> > msg_7
> > msg_8
> Partition_1
> > msg_9
> > msg_10
> > msg_11
> Partition_2
> > msg_12
> > msg_13
> > msg_14
>
> 8. Started the topology. And KafkaSpout consumes the following messages:
> > msg_7
> > msg_8
> > msg_10
> > msg_11
> > msg_13
> > msg_14
>
> It skipped the earliest uncommitted message in each partition.
>
> Below is the definition of UNCOMMITTED_LATEST in the JavaDoc.
>
> UNCOMMITTED_LATEST means that the kafka spout polls records from the last
> committed offset, if any. If no offset has been committed, it behaves as
> LATEST.
>
> As per the definition, it should read from the last committed offset. But it
> looks like it is reading from the earliest uncommitted offset + 1. I mean the
> pointer seems to be wrong.
>
> Please have a look and let me know if anything wrong in my tests.
>
> I am expecting a response from you, even if it is not an issue.
>
> Thanks,
> Senthil
>
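One plausible explanation, consistent with the symptoms described above (and with Kafka's convention that the committed offset names the *next* record to consume, not the last one consumed), is an off-by-one seek in the spout. This is a hypothesis about the behavior, not a confirmed diagnosis; the toy sketch below only illustrates the distinction, with hypothetical helper names:

```java
import java.util.Arrays;
import java.util.List;

public class CommittedOffsetSketch {
    // Kafka convention: a committed offset of N means records 0..N-1 were
    // consumed and record N is next. Seeking to committed + 1 therefore
    // skips exactly one record per partition, matching the report above.
    static List<Integer> consumeFrom(List<Integer> partitionLog,
                                     int committed,
                                     boolean offByOne) {
        int start = offByOne ? committed + 1 : committed;
        return partitionLog.subList(Math.min(start, partitionLog.size()),
                                    partitionLog.size());
    }

    public static void main(String[] args) {
        List<Integer> log = Arrays.asList(0, 1, 2, 3, 4); // offsets in one partition
        // committed == 0: nothing consumed yet; offset 0 is still pending.
        System.out.println(consumeFrom(log, 0, false)); // prints [0, 1, 2, 3, 4]
        System.out.println(consumeFrom(log, 0, true));  // prints [1, 2, 3, 4]
    }
}
```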


[GitHub] kafka-site pull request #99: Rephrase mailing lists section to avoid incorre...

2017-10-18 Thread tombentley
Github user tombentley commented on a diff in the pull request:

https://github.com/apache/kafka-site/pull/99#discussion_r145344978
  
--- Diff: contact.html ---
@@ -11,16 +11,16 @@



- us...@kafka.apache.org: A list for general user 
questions about Kafka. To subscribe, send an email to users-subscr...@kafka.apache.org. 
Archives are available http://mail-archives.apache.org/mod_mbox/kafka-users;>here.
+   User maling list: A list for general 
user questions about Kafka. To subscribe, send an email to users-subscr...@kafka.apache.org. Once 
subscribed, send your emails to us...@kafka.apache.org. Archives are available http://mail-archives.apache.org/mod_mbox/kafka-users;>here.
--- End diff --

maling → mailing


---


[GitHub] kafka-site issue #81: MINOR: Fix typo in the gradle command for code style c...

2017-10-18 Thread scholzj
Github user scholzj commented on the issue:

https://github.com/apache/kafka-site/pull/81
  
Can someone have a look at this? It is just a simple typo, so it should be 
very easy to review and merge.


---


[GitHub] kafka-site pull request #99: Rephrase mailing lists section to avoid incorre...

2017-10-18 Thread scholzj
GitHub user scholzj opened a pull request:

https://github.com/apache/kafka-site/pull/99

Rephrase mailing lists section to avoid incorrect subscribe emails

The user@ and dev@ mailing lists seem to receive a lot of emails from users
who are not subscribed but are trying to subscribe. I wonder if it has
something to do with the UX of the website, and whether the change in this PR
can help make it clearer to users that they subscribe via a different email
address.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/scholzj/kafka-site mailing-lists

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka-site/pull/99.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #99


commit 3216f194a676238b42bc653925428d6de57ddfe2
Author: Jakub Scholz 
Date:   2017-10-18T07:27:09Z

Rephrase mailing lists section to avoid incorrect subscribe emails

commit ef812699426e13cf0eeb4ddb1b7756dc42cb
Author: Jakub Scholz 
Date:   2017-10-18T07:30:37Z

Remove you can and add are to the "archives are ..."




---


[jira] [Created] (KAFKA-6077) Let SimpleConsumer support Kerberos authentication

2017-10-18 Thread huangjianan (JIRA)
huangjianan created KAFKA-6077:
--

 Summary: Let SimpleConsumer support Kerberos authentication
 Key: KAFKA-6077
 URL: https://issues.apache.org/jira/browse/KAFKA-6077
 Project: Kafka
  Issue Type: Improvement
  Components: consumer
Affects Versions: 0.10.0.0
Reporter: huangjianan


Cannot use SimpleConsumer in Kafka Kerberos environment



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [VOTE] KIP-204 : adding records deletion operation to the new Admin Client API

2017-10-18 Thread Manikumar
+1 (non-binding)


Thanks,
Manikumar

On Tue, Oct 17, 2017 at 7:42 AM, Dong Lin  wrote:

> Thanks for the KIP. +1 (non-binding)
>
> On Wed, Oct 11, 2017 at 2:27 AM, Ted Yu  wrote:
>
> > +1
> >
> > On Mon, Oct 2, 2017 at 10:51 PM, Paolo Patierno 
> > wrote:
> >
> > > Hi all,
> > >
> > > I didn't see any further discussion around this KIP, so I'd like to
> start
> > > the vote for it.
> > >
> > > Just for reference : https://cwiki.apache.org/
> > > confluence/display/KAFKA/KIP-204+%3A+adding+records+
> > > deletion+operation+to+the+new+Admin+Client+API
> > >
> > >
> > > Thanks,
> > >
> > > Paolo Patierno
> > > Senior Software Engineer (IoT) @ Red Hat
> > > Microsoft MVP on Azure & IoT
> > > Microsoft Azure Advisor
> > >
> > > Twitter : @ppatierno
> > > Linkedin : paolopatierno
> > > Blog : DevExperience
> > >
> >
>