Re: [DISCUSS] KIP-68 Add a consumed log retention before log retention

2016-10-30 Thread David
Hi All, 


As per our discussion, there are two ways to clean the consumed log:


1) Use an admin tool to find the minimum committed offset for the relevant topics 
across the specified set of consumer groups, then send the trim request to all the 
replicas on the brokers; the brokers then start to trim the log segments of these 
topics.


The benefit of this method is that it keeps the broker simple and gives users more 
flexibility, but it makes it more complicated for users to clean all the messages 
that have already been consumed (see the sketch in the P.S. below).


2) The broker periodically performs the consumed log retention, as described in the KIP. 
This method is simple for users and cleans the consumed log automatically, but it 
adds more query work to the brokers.


Which method is better?


Thanks,
David
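
P.S. For reference, here is a minimal, hypothetical sketch of what the admin tool in 
option 1 could look like: it computes, per partition, the smallest offset committed by 
a given set of consumer groups using the existing KafkaConsumer.committed() API. The 
trim request itself is the new API proposed by this KIP and does not exist yet, so it 
only appears as a placeholder comment; the group names, topic and bootstrap servers 
are illustrative.

{code:java}
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.PartitionInfo;
import org.apache.kafka.common.TopicPartition;

public class MinCommittedOffsetTool {

    // Returns, for every partition of the topic, the minimum offset committed by
    // the given consumer groups; anything below that offset is safe to trim.
    public static Map<TopicPartition, Long> minCommitted(String bootstrapServers,
                                                         String topic,
                                                         List<String> groups) {
        Map<TopicPartition, Long> min = new HashMap<>();
        for (String group : groups) {
            Properties props = new Properties();
            props.put("bootstrap.servers", bootstrapServers);
            props.put("group.id", group);
            props.put("key.deserializer",
                      "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            props.put("value.deserializer",
                      "org.apache.kafka.common.serialization.ByteArrayDeserializer");
            try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
                for (PartitionInfo p : consumer.partitionsFor(topic)) {
                    TopicPartition tp = new TopicPartition(topic, p.partition());
                    OffsetAndMetadata committed = consumer.committed(tp);
                    // A group with no committed offset has consumed nothing yet,
                    // so nothing is safe to trim for that partition.
                    long offset = (committed == null) ? 0L : committed.offset();
                    min.merge(tp, offset, Math::min);
                }
            }
        }
        // For each partition, the tool would then send the proposed trim request,
        // e.g. trim(tp, minOffset), to the partition's replicas.
        return min;
    }
}
{code}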








------------------ Original ------------------
From: "Mayuresh Gharat"
Date: Sat, Oct 29, 2016, 1:43
To: "dev"

Subject: Re: [DISCUSS] KIP-68 Add a consumed log retention before log retention



I do agree with Guozhang on having applications request an external
service (admin) that talks to Kafka for trimming, in which case this
external service (admin) can check whether it's OK to send the trim request to
the Kafka brokers based on certain conditions.
On the broker side we could have authorization by way of ACLs, say, so that
only this external admin service is allowed to call trim(). In this way we
can actually move the main decision-making process out of core.

Thanks,

Mayuresh

On Fri, Oct 28, 2016 at 10:33 AM, Guozhang Wang  wrote:

> Yes, trim() should be an admin API and, if security is a concern, it should
> be under admin authorization as well.
>
> For applications that need this feature, it then boils down to the problem
> that they should request the authorization token from whoever operates Kafka
> before starting their app to use it in their own client, which I think is a
> feasible requirement.
>
>
> Guozhang
>
>
> On Fri, Oct 28, 2016 at 9:42 AM, Mayuresh Gharat <
> gharatmayures...@gmail.com
> > wrote:
>
> > Hi Guozhang,
> >
> > I agree that pushing the complexity of coordination out to the client
> > application makes it simpler for the broker, in the sense that it does
> > not have to be the decision maker regarding when to trim and up to what
> > offset. And I agree that if we go in this direction, providing an offset
> > parameter makes sense.
> >
> >
> > But since the main motivation for this seems to be saving or reclaiming the
> > disk space on the broker side, I am not 100% sure how good it is to rely on
> > the client application to be a good citizen and call the trim API.
> > Also, I see the trim() API as more of an admin API rather than a client API.
> >
> >
> > Thanks,
> >
> > Mayuresh
> >
> > On Fri, Oct 28, 2016 at 7:12 AM, Guozhang Wang 
> wrote:
> >
> > > Here are my thoughts:
> > >
> > > If there are indeed multiple consumer groups on the same topic that need
> > > to coordinate, it is equally complex whether the coordination is on the
> > > broker or among the applications themselves: for the latter case, you would
> > > imagine some coordination service being used (like ZK) to register groups
> > > for that topic and let these groups agree upon the minimum offset that is
> > > safe to trim for all of them; for the former case, we just need to move this
> > > coordination service into the broker side, which to me is not a good design
> > > under the principle of keeping the broker simple.
> > >
> > > And as we discussed, there are scenarios where the offset to trim is not
> > > necessarily dependent on the committed offsets, even if the topic is only
> > > consumed by a single consumer group and we do not need any coordination. So
> > > I think it is appropriate to require an "offset parameter" in the trim API.
> > >
> > > Guozhang
> > >
> > >
> > >
> > >
> > > On Fri, Oct 28, 2016 at 1:27 AM, Becket Qin 
> > wrote:
> > >
> > > > Hey Guozhang,
> > > >
> > > > I think the trim() interface is generally useful. What I was wondering
> > > > is the following:
> > > > if the user has multiple applications to coordinate, it seems simpler for
> > > > the broker to coordinate instead of asking the applications to coordinate
> > > > among themselves. If we let the broker do the coordination and do not want
> > > > to reuse the committed offset for trim(), we kind of need something like an
> > > > "offset for trim", which does not seem to be general enough to have. But if
> > > > there is a single application then we don't need to worry about the
> > > > coordination, hence this is no longer a problem.
> > > >
> > > > The use case for multiple consumer groups I am thinking of is some kind of
> > > > fork in the DAG, i.e. one intermediate result stream used by multiple
> > > > downstream jobs. But that may not be a big deal if the processing is within
> > > > the same 

[GitHub] kafka pull request #2032: KAFKA-3559: Recycle old tasks when possible

2016-10-30 Thread enothereska
Github user enothereska closed the pull request at:

https://github.com/apache/kafka/pull/2032


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3559) Task creation time taking too long in rebalance callback

2016-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15619883#comment-15619883
 ] 

ASF GitHub Bot commented on KAFKA-3559:
---

Github user enothereska closed the pull request at:

https://github.com/apache/kafka/pull/2032


> Task creation time taking too long in rebalance callback
> 
>
> Key: KAFKA-3559
> URL: https://issues.apache.org/jira/browse/KAFKA-3559
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Eno Thereska
>  Labels: architecture
> Fix For: 0.10.2.0
>
>
> Currently in Kafka Streams, we create stream tasks upon getting newly 
> assigned partitions in the rebalance callback function {code} onPartitionAssigned 
> {code}, which involves initialization of the processor state stores as well 
> (including opening the RocksDB store, restoring the store from the changelog, etc., 
> which takes time).
> With a large number of state stores, the initialization time itself could 
> take tens of seconds, which usually is longer than the consumer session 
> timeout. As a result, when the callback is completed, the consumer has already 
> been treated as failed by the coordinator and a rebalance is triggered again.
> We need to consider whether we can optimize the initialization process, or move it 
> out of the callback function, and while initializing the stores one by one, 
> use the poll call to send heartbeats to avoid being kicked out by the coordinator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-3559) Task creation time taking too long in rebalance callback

2016-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15619884#comment-15619884
 ] 

ASF GitHub Bot commented on KAFKA-3559:
---

GitHub user enothereska reopened a pull request:

https://github.com/apache/kafka/pull/2032

KAFKA-3559: Recycle old tasks when possible



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka 
KAFKA-3559-onPartitionAssigned

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2032.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2032


commit 28a50430e4136ba75f0d9b957a67f22e7b1e86a0
Author: Eno Thereska 
Date:   2016-10-17T10:46:45Z

Recycle old tasks when possible

commit b3dc438bf1665b9364b19f5efa908dd35d2b7af3
Author: Eno Thereska 
Date:   2016-10-19T15:13:36Z

Adjusted based on Damian's comments

commit f8cfe74d85e0a8cd5efacca87eced236319c83b9
Author: Eno Thereska 
Date:   2016-10-19T17:44:39Z

Refactor

commit 62bb3fd4a90dd28bc7bb58bf077b7ecb60207c7e
Author: Eno Thereska 
Date:   2016-10-24T14:24:48Z

Merge remote-tracking branch 'origin/trunk' into 
KAFKA-3559-onPartitionAssigned

commit 841caa3721172d2d89ec16ef6dfd149f25498649
Author: Eno Thereska 
Date:   2016-10-24T17:32:05Z

Addressed Guozhang's comments

commit c4498564907243c35df832407933b8a9cf32f4ef
Author: Eno Thereska 
Date:   2016-10-25T11:07:28Z

Refactor

commit 4ba24c1ecb8c6293adce426a92b6021e86c9e8b7
Author: Eno Thereska 
Date:   2016-10-25T12:20:04Z

Merge remote-tracking branch 'origin/trunk' into 
KAFKA-3559-onPartitionAssigned

commit 0fe12633b8593eda3b5b7b75bc87244276c95ce2
Author: Eno Thereska 
Date:   2016-10-28T20:46:18Z

Minor reshuffle

commit 7bf5d96cd66ab77130cad39fbff821fccd83aa06
Author: Eno Thereska 
Date:   2016-10-28T21:44:48Z

Guozhang's suggestion to clear queue

commit ecc5e8a54f908507cb32ab785ee748e1d9e2cfb4
Author: Eno Thereska 
Date:   2016-10-29T19:01:59Z

Clear another queue

commit dffa9a2896eade6501794596bb08a9a2545e81b0
Author: Eno Thereska 
Date:   2016-10-29T19:14:56Z

Merge with trunk

commit 8f07571d28d137eaf7951e05c799e9458dca703f
Author: Eno Thereska 
Date:   2016-10-29T22:22:21Z

Separate cache test into another file




> Task creation time taking too long in rebalance callback
> 
>
> Key: KAFKA-3559
> URL: https://issues.apache.org/jira/browse/KAFKA-3559
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Eno Thereska
>  Labels: architecture
> Fix For: 0.10.2.0
>
>
> Currently in Kafka Streams, we create stream tasks upon getting newly 
> assigned partitions in the rebalance callback function {code} onPartitionAssigned 
> {code}, which involves initialization of the processor state stores as well 
> (including opening the RocksDB store, restoring the store from the changelog, etc., 
> which takes time).
> With a large number of state stores, the initialization time itself could 
> take tens of seconds, which usually is longer than the consumer session 
> timeout. As a result, when the callback is completed, the consumer has already 
> been treated as failed by the coordinator and a rebalance is triggered again.
> We need to consider whether we can optimize the initialization process, or move it 
> out of the callback function, and while initializing the stores one by one, 
> use the poll call to send heartbeats to avoid being kicked out by the coordinator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2032: KAFKA-3559: Recycle old tasks when possible

2016-10-30 Thread enothereska
GitHub user enothereska reopened a pull request:

https://github.com/apache/kafka/pull/2032

KAFKA-3559: Recycle old tasks when possible



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/enothereska/kafka 
KAFKA-3559-onPartitionAssigned

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2032.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2032


commit 28a50430e4136ba75f0d9b957a67f22e7b1e86a0
Author: Eno Thereska 
Date:   2016-10-17T10:46:45Z

Recycle old tasks when possible

commit b3dc438bf1665b9364b19f5efa908dd35d2b7af3
Author: Eno Thereska 
Date:   2016-10-19T15:13:36Z

Adjusted based on Damian's comments

commit f8cfe74d85e0a8cd5efacca87eced236319c83b9
Author: Eno Thereska 
Date:   2016-10-19T17:44:39Z

Refactor

commit 62bb3fd4a90dd28bc7bb58bf077b7ecb60207c7e
Author: Eno Thereska 
Date:   2016-10-24T14:24:48Z

Merge remote-tracking branch 'origin/trunk' into 
KAFKA-3559-onPartitionAssigned

commit 841caa3721172d2d89ec16ef6dfd149f25498649
Author: Eno Thereska 
Date:   2016-10-24T17:32:05Z

Addressed Guozhang's comments

commit c4498564907243c35df832407933b8a9cf32f4ef
Author: Eno Thereska 
Date:   2016-10-25T11:07:28Z

Refactor

commit 4ba24c1ecb8c6293adce426a92b6021e86c9e8b7
Author: Eno Thereska 
Date:   2016-10-25T12:20:04Z

Merge remote-tracking branch 'origin/trunk' into 
KAFKA-3559-onPartitionAssigned

commit 0fe12633b8593eda3b5b7b75bc87244276c95ce2
Author: Eno Thereska 
Date:   2016-10-28T20:46:18Z

Minor reshuffle

commit 7bf5d96cd66ab77130cad39fbff821fccd83aa06
Author: Eno Thereska 
Date:   2016-10-28T21:44:48Z

Guozhang's suggestion to clear queue

commit ecc5e8a54f908507cb32ab785ee748e1d9e2cfb4
Author: Eno Thereska 
Date:   2016-10-29T19:01:59Z

Clear another queue

commit dffa9a2896eade6501794596bb08a9a2545e81b0
Author: Eno Thereska 
Date:   2016-10-29T19:14:56Z

Merge with trunk

commit 8f07571d28d137eaf7951e05c799e9458dca703f
Author: Eno Thereska 
Date:   2016-10-29T22:22:21Z

Separate cache test into another file




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Build failed in Jenkins: kafka-trunk-jdk8 #1011

2016-10-30 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-3559: Recycle old tasks when possible

[wangguoz] KAFKA-4302: Simplify KTableSource

[wangguoz] HOTFIX: improve error message on invalid input record timestamp

--
[...truncated 7505 lines...]
kafka.log.LogSegmentTest > testChangeFileSuffixes STARTED

kafka.log.LogSegmentTest > testChangeFileSuffixes PASSED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptTimeIndex STARTED

kafka.log.LogSegmentTest > testRecoveryFixesCorruptTimeIndex PASSED

kafka.log.LogSegmentTest > testReloadLargestTimestampAfterTruncation STARTED

kafka.log.LogSegmentTest > testReloadLargestTimestampAfterTruncation PASSED

kafka.log.LogSegmentTest > testMaxOffset STARTED

kafka.log.LogSegmentTest > testMaxOffset PASSED

kafka.log.LogSegmentTest > testNextOffsetCalculation STARTED

kafka.log.LogSegmentTest > testNextOffsetCalculation PASSED

kafka.log.LogSegmentTest > testFindOffsetByTimestamp STARTED

kafka.log.LogSegmentTest > testFindOffsetByTimestamp PASSED

kafka.log.LogSegmentTest > testReadOnEmptySegment STARTED

kafka.log.LogSegmentTest > testReadOnEmptySegment PASSED

kafka.log.LogSegmentTest > testReadAfterLast STARTED

kafka.log.LogSegmentTest > testReadAfterLast PASSED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown STARTED

kafka.log.LogSegmentTest > testCreateWithInitFileSizeClearShutdown PASSED

kafka.log.LogSegmentTest > testTruncateFull STARTED

kafka.log.LogSegmentTest > testTruncateFull PASSED

kafka.log.LogCleanerTest > testCleanCorruptMessageSet STARTED

kafka.log.LogCleanerTest > testCleanCorruptMessageSet PASSED

kafka.log.LogCleanerTest > testBuildOffsetMap STARTED

kafka.log.LogCleanerTest > testBuildOffsetMap PASSED

kafka.log.LogCleanerTest > testBuildOffsetMapFakeLarge STARTED

kafka.log.LogCleanerTest > testBuildOffsetMapFakeLarge PASSED

kafka.log.LogCleanerTest > testSegmentGrouping STARTED

kafka.log.LogCleanerTest > testSegmentGrouping PASSED

kafka.log.LogCleanerTest > testCleanSegmentsWithAbort STARTED

kafka.log.LogCleanerTest > testCleanSegmentsWithAbort PASSED

kafka.log.LogCleanerTest > testSegmentGroupingWithSparseOffsets STARTED

kafka.log.LogCleanerTest > testSegmentGroupingWithSparseOffsets PASSED

kafka.log.LogCleanerTest > testLargeMessage STARTED

kafka.log.LogCleanerTest > testLargeMessage PASSED

kafka.log.LogCleanerTest > testRecoveryAfterCrash STARTED

kafka.log.LogCleanerTest > testRecoveryAfterCrash PASSED

kafka.log.LogCleanerTest > testCleaningWithUncleanableSection STARTED

kafka.log.LogCleanerTest > testCleaningWithUncleanableSection PASSED

kafka.log.LogCleanerTest > testLogToClean STARTED

kafka.log.LogCleanerTest > testLogToClean PASSED

kafka.log.LogCleanerTest > testCleaningWithDeletes STARTED

kafka.log.LogCleanerTest > testCleaningWithDeletes PASSED

kafka.log.LogCleanerTest > testClientHandlingOfCorruptMessageSet STARTED

kafka.log.LogCleanerTest > testClientHandlingOfCorruptMessageSet PASSED

kafka.log.LogCleanerTest > testCleanSegments STARTED

kafka.log.LogCleanerTest > testCleanSegments PASSED

kafka.log.LogCleanerTest > testLogToCleanWithUncleanableSection STARTED

kafka.log.LogCleanerTest > testLogToCleanWithUncleanableSection PASSED

kafka.log.LogCleanerTest > testBuildPartialOffsetMap STARTED

kafka.log.LogCleanerTest > testBuildPartialOffsetMap PASSED

kafka.log.LogCleanerTest > testCleaningWithUnkeyedMessages STARTED

kafka.log.LogCleanerTest > testCleaningWithUnkeyedMessages PASSED

kafka.log.LogCleanerTest > testPartialSegmentClean STARTED

kafka.log.LogCleanerTest > testPartialSegmentClean PASSED

kafka.log.LogConfigTest > shouldValidateThrottledReplicasConfig STARTED

kafka.log.LogConfigTest > shouldValidateThrottledReplicasConfig PASSED

kafka.log.LogConfigTest > testFromPropsEmpty STARTED

kafka.log.LogConfigTest > testFromPropsEmpty PASSED

kafka.log.LogConfigTest > testKafkaConfigToProps STARTED

kafka.log.LogConfigTest > testKafkaConfigToProps PASSED

kafka.log.LogConfigTest > testFromPropsInvalid STARTED

kafka.log.LogConfigTest > testFromPropsInvalid PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[0] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[0] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[0] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleaningNestedMessagesWithMultipleVersions[0] PASSED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] STARTED

kafka.log.LogCleanerIntegrationTest > cleanerTest[0] PASSED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[0] STARTED

kafka.log.LogCleanerIntegrationTest > testCleanerWithMessageFormatV0[0] PASSED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[1] STARTED

kafka.log.LogCleanerIntegrationTest > 
testCleansCombinedCompactAndDeleteTopic[1] PASSED


[jira] [Commented] (KAFKA-4348) On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on Kafka server

2016-10-30 Thread huxi (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620906#comment-15620906
 ] 

huxi commented on KAFKA-4348:
-

Are there any non-default configs relating to the timeouts on the server side 
or client side? 

> On Mac OS, KafkaConsumer.poll returns 0 when there are still messages on 
> Kafka server
> -
>
> Key: KAFKA-4348
> URL: https://issues.apache.org/jira/browse/KAFKA-4348
> Project: Kafka
>  Issue Type: Bug
>  Components: consumer
>Affects Versions: 0.9.0.0, 0.9.0.1, 0.10.0.1
> Environment: Mac OS X El Capitan, Java 1.8.0_111
>Reporter: Yiquan Zhou
>  Labels: consumer, mac, polling
>
> Steps to reproduce:
> 1. start the zookeeper and kafka server using the default properties from the 
> distribution: 
> $ bin/zookeeper-server-start.sh config/zookeeper.properties
> $ bin/kafka-server-start.sh config/server.properties 
> 2. create a Kafka consumer using the Java API KafkaConsumer.poll(long 
> timeout). It polls the records from the server every second (timeout set to 
> 1000) and prints the number of records polled. The code can be found here: 
> https://gist.github.com/yiquanzhou/a94569a2c4ec8992444c83f3c393f596
> 3. use bin/kafka-verifiable-producer.sh to generate some messages: 
> $ bin/kafka-verifiable-producer.sh --topic connect-test --max-messages 20 
> --broker-list localhost:9092
> wait until all 200k messages are generated and sent to the server. 
> 4. Run the consumer Java code. In the output console of the consumer, we can 
> see that the consumer starts to poll some records, then it polls 0 records 
> for several seconds before polling some more. like this:
> polled 27160 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 26886 records
> polled 26886 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 0 records
> polled 26701 records
> polled 26214 records
> The bug slows down the consumption of messages a lot. And in our use case, 
> the consumer wrongly assumes that all messages are read from the topic.
> It is only reproducible on Mac OS X, not on Linux or Windows.
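
The actual reproduction code is in the gist linked in step 2 above; the following is 
only a minimal sketch of such a poll loop for readers who do not want to follow the 
link (the topic name and configs here are illustrative, not the reporter's exact code):

{code:java}
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class PollCountConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");
        props.put("group.id", "poll-count-test");
        props.put("auto.offset.reset", "earliest");
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("connect-test"));
            while (true) {
                // Poll with a 1 second timeout and print how many records came back.
                ConsumerRecords<String, String> records = consumer.poll(1000);
                System.out.println("polled " + records.count() + " records");
            }
        }
    }
}
{code}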



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2076: HOTFIX: improve error message on invalid input rec...

2016-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2076


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-81: Max in-flight fetches

2016-10-30 Thread Jun Rao
Hi, Mickael,

I agree with others that it's better to be able to control the bytes the
consumer can read from sockets, instead of limiting the fetch requests.
KIP-72 has a proposal to bound the memory size at the socket selector
level. Perhaps that can be leveraged in this KIP too.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-72%3A+Allow+putting+a+bound+on+memory+consumed+by+Incoming+requests

Thanks,

Jun

On Thu, Oct 27, 2016 at 3:23 PM, Jay Kreps  wrote:

> This is a good observation on limiting total memory usage. If I understand
> the proposal I think it is that the consumer client would stop sending
> fetch requests once a certain number of in-flight fetch requests is met. I
> think a better approach would be to always issue one fetch request to each
> broker immediately, allow the server to process that request, and send data
> back to the local machine where it would be stored in the socket buffer (up
> to that buffer size). Instead of throttling the requests sent, the consumer
> should ideally throttle the responses read from the socket buffer at any
> given time. That is, in a single poll call, rather than reading from every
> single socket it should just read until it has a given amount of memory
> used then bail out early. It can come back and read more from the other
> sockets after those messages are processed.
>
> The advantage of this approach is that you don't incur the additional
> latency.
>
> -Jay
>
> On Mon, Oct 10, 2016 at 6:41 AM, Mickael Maison 
> wrote:
>
> > Hi all,
> >
> > I would like to discuss the following KIP proposal:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > 81%3A+Max+in-flight+fetches
> >
> >
> > Feedback and comments are welcome.
> > Thanks !
> >
> > Mickael
> >
>
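
Purely as a toy illustration of the idea Jay describes above (this is not consumer 
internals, and every name here is made up for the example): in a single poll, drain 
per-broker response queues only until a fixed memory budget is used, then bail out 
early and leave the rest for the next poll.

{code:java}
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;

public class BudgetedDrain {

    // Drains buffered responses until the memory budget is reached; whatever is
    // left stays queued (or in the OS socket buffer) and is picked up later.
    public static List<byte[]> drain(Map<String, Deque<byte[]>> pendingByBroker,
                                     long memoryBudgetBytes) {
        List<byte[]> drained = new ArrayList<>();
        long used = 0;
        for (Deque<byte[]> queue : pendingByBroker.values()) {
            while (!queue.isEmpty()) {
                byte[] next = queue.peekFirst();
                if (used + next.length > memoryBudgetBytes) {
                    return drained;   // bail out early; come back on the next poll
                }
                drained.add(queue.pollFirst());
                used += next.length;
            }
        }
        return drained;
    }
}
{code}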


[GitHub] kafka pull request #2077: MINOR: Fix document header/footer links

2016-10-30 Thread gwenshap
GitHub user gwenshap opened a pull request:

https://github.com/apache/kafka/pull/2077

MINOR: Fix document header/footer links

Based on:  https://github.com/apache/kafka-site/pull/27

I recommend also merging into 10.1.0.0 branch to avoid mishaps when 
re-publishing docs.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/gwenshap/kafka protocol_doc_fix

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/2077.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2077


commit 28d9c8913cb5d9e6d5e604dec6e57d8920a89f15
Author: Gwen Shapira 
Date:   2016-10-30T19:07:11Z

MINOR: Fix document header/footer links




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-72 Allow Sizing Incoming Request Queue in Bytes

2016-10-30 Thread Jun Rao
Hi, Radai,

Sorry for the late response. How should the benchmark results be
interpreted? The higher the ops/s, the better? It would also be useful to
test this out on LinkedIn's traffic with enough socket connections to see
if there is any performance degradation.

Also, there is a separate proposal KIP-81 to bound the consumer memory
usage. Perhaps you can chime in there on whether this proposal can be
utilized there too.

https://cwiki.apache.org/confluence/display/KAFKA/KIP-81%3A+Bound+Fetch+memory+usage+in+the+consumer

Thanks,

Jun

On Tue, Sep 27, 2016 at 10:23 AM, radai  wrote:

> Hi Jun,
>
> 10 - mute/unmute functionality has been added in
> https://github.com/radai-rosenblatt/kafka/tree/broker-
> memory-pool-with-muting.
> I have yet to run stress tests to see how it behaves versus without muting
>
> 11 - I've added a SimplePool implementation (nothing more than an
> AtomicLong really) and compared it with my GC pool (that uses weak refs) -
> https://github.com/radai-rosenblatt/kafka-benchmarks/
> tree/master/memorypool-benchmarks.
> the results show no noticeable difference. what the results _do_ show
> though is that for large requests (1M) performance drops very sharply.
> since the SimplePool is essentially identical to current kafka code
> behaviour (the benchmark never reaches out-of-memory conditions) it would
> suggest to me that kafka performance for large requests suffers greatly from
> the cost of allocating (and releasing) large buffers (instead of actually
> pooling them for later re-use). this means that an implementation of memory
> pool that actually pools ( :-) ) is very likely to improve broker
> performance for large requests.
>
> 12 - if there was a single thread iterating over selection keys then
> stopping at 1st unsatisfiable request might work (if iteration order over
> selection keys is deterministic, which is OS-dependent). however, kafka
> spawns multiple selectors sharing the same pool so i doubt the approach
> would gain anything. also notice that the current code already shuffles the
> selection keys if memory is low (<10%) to try and guarantee fairness.
>
> attached the benchmark results for the pool implementations:
>
> Benchmark                                        Mode  Cnt        Score        Error  Units
> GarbageCollectedMemoryPoolBenchmark.alloc_100k  thrpt    5   198272.519 ±  16045.965  ops/s
> GarbageCollectedMemoryPoolBenchmark.alloc_10k   thrpt    5  2781439.307 ± 185287.072  ops/s
> GarbageCollectedMemoryPoolBenchmark.alloc_1k    thrpt    5  6029199.952 ± 465936.118  ops/s
> GarbageCollectedMemoryPoolBenchmark.alloc_1m    thrpt    5    18464.272 ±    332.861  ops/s
> SimpleMemoryPoolBenchmark.alloc_100k            thrpt    5   204240.066 ±   2207.619  ops/s
> SimpleMemoryPoolBenchmark.alloc_10k             thrpt    5  3000794.525 ±  83510.836  ops/s
> SimpleMemoryPoolBenchmark.alloc_1k              thrpt    5  5893671.778 ± 274239.541  ops/s
> SimpleMemoryPoolBenchmark.alloc_1m              thrpt    5    18728.085 ±    792.563  ops/s
>
>
>
> On Sat, Sep 24, 2016 at 9:07 AM, Jun Rao  wrote:
>
> > Hi, Radai,
> >
> > For 10, yes, we don't want the buffer pool to wake up the selector every
> > time some memory is freed up. We only want to do that when there are pending
> > requests to the buffer pool that were not honored due to not enough memory.
> >
> > For 11, we probably want to be a bit careful with Weak References. In
> > https://issues.apache.org/jira/browse/KAFKA-1989, we initially tried an
> > implementation based on Weak Reference, but abandoned it due to too much
> GC
> > overhead. It probably also makes the code a bit harder to understand. So,
> > perhaps it would be better if we can avoid it.
> >
> > For 12, that's a good point. I thought the selector will do some
> shuffling
> > for fairness. Perhaps we should stop allocating from the buffer pool when
> > we see the first key whose memory can't be honored?
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Sat, Sep 24, 2016 at 8:44 AM, radai 
> wrote:
> >
> > > Hi Jun,
> > >
> > > 10 - I'll add this functionality to the mute/unmute branch. as every
> > > mute/unmute operation is O(#connections / #selectorThreads) maybe a
> > > watermark approach is better than waking when _any_ mem is available?
> > >
> > > 11 - "gc notifications" are done by using a ReferenceQueue (
> > > https://docs.oracle.com/javase/8/docs/api/java/lang/
> > > ref/ReferenceQueue.html)
> > > in combination with weak references to allocated buffers. when a buffer
> > is
> > > reclaimed by the GC the corresponding weak ref to it is enqueued. the
> > pool
> > > maintains a set of outstanding buffer IDs (every allocated buffer gets
> a
> > > unique id - basically a sequence). a buffer explicitly returned has its
> > id
> > > removed from the tracking set and the weak reference to it destroyed,
> so
> > > its reference will never be enqueued by the GC even if it is GC'ed
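
For readers unfamiliar with the pattern, here is a minimal, hypothetical sketch of the 
GC-notification mechanism described above (WeakReference + ReferenceQueue); it is not 
the actual pool implementation linked in the branch, and it only shows the 
leak-detection path. The real design also supports explicit release keyed by a unique 
buffer id, as described in the mail.

{code:java}
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.nio.ByteBuffer;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

public class LeakDetectingPoolSketch {

    private final AtomicLong availableBytes;
    private final ReferenceQueue<ByteBuffer> reclaimed = new ReferenceQueue<>();
    // Weak ref -> size of the buffer it tracked, so the pool can be credited back
    // when the GC reclaims a buffer that was never explicitly released.
    private final Map<Reference<ByteBuffer>, Integer> outstanding = new ConcurrentHashMap<>();

    public LeakDetectingPoolSketch(long capacityBytes) {
        this.availableBytes = new AtomicLong(capacityBytes);
    }

    public ByteBuffer tryAllocate(int size) {
        reclaimLeaked();
        if (availableBytes.addAndGet(-size) < 0) {   // not enough memory: undo and fail
            availableBytes.addAndGet(size);
            return null;
        }
        ByteBuffer buffer = ByteBuffer.allocate(size);
        // The weak reference does not keep the buffer alive; once the buffer is
        // GC'ed, the reference is enqueued on 'reclaimed'.
        outstanding.put(new WeakReference<>(buffer, reclaimed), size);
        return buffer;
    }

    private void reclaimLeaked() {
        Reference<? extends ByteBuffer> ref;
        while ((ref = reclaimed.poll()) != null) {
            Integer size = outstanding.remove(ref);
            if (size != null) {
                availableBytes.addAndGet(size);      // buffer leaked (never released)
                System.err.println("detected leaked buffer of " + size + " bytes");
            }
        }
    }

    // An explicit release(buffer), as in the real design, would look up the buffer's
    // id, drop its weak reference from 'outstanding' and credit availableBytes, so
    // the GC path above only ever fires for genuinely leaked buffers.
}
{code}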
> 

Build failed in Jenkins: kafka-trunk-jdk7 #1662

2016-10-30 Thread Apache Jenkins Server
See 

Changes:

[wangguoz] KAFKA-3559: Recycle old tasks when possible

[wangguoz] KAFKA-4302: Simplify KTableSource

[wangguoz] HOTFIX: improve error message on invalid input record timestamp

--
[...truncated 14302 lines...]
org.apache.kafka.streams.processor.internals.assignment.TaskAssignorTest > 
testAssignWithStandby STARTED

org.apache.kafka.streams.processor.internals.assignment.TaskAssignorTest > 
testAssignWithStandby PASSED

org.apache.kafka.streams.processor.internals.assignment.TaskAssignorTest > 
testAssignWithoutStandby STARTED

org.apache.kafka.streams.processor.internals.assignment.TaskAssignorTest > 
testAssignWithoutStandby PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingMultiplexingTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingMultiplexingTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingStatefulTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingStatefulTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingSimpleTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingSimpleTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingSimpleMultiSourceTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingSimpleMultiSourceTopology PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testTopologyMetadata STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testTopologyMetadata PASSED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingMultiplexByNameTopology STARTED

org.apache.kafka.streams.processor.internals.ProcessorTopologyTest > 
testDrivingMultiplexByNameTopology PASSED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
testSpecificPartition STARTED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
testSpecificPartition PASSED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
shouldThrowStreamsExceptionAfterMaxAttempts STARTED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
shouldThrowStreamsExceptionAfterMaxAttempts PASSED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
shouldRetryWhenTimeoutExceptionOccursOnSend STARTED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
shouldRetryWhenTimeoutExceptionOccursOnSend PASSED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
testStreamPartitioner STARTED

org.apache.kafka.streams.processor.internals.RecordCollectorTest > 
testStreamPartitioner PASSED

org.apache.kafka.streams.processor.internals.PunctuationQueueTest > 
testPunctuationInterval STARTED

org.apache.kafka.streams.processor.internals.PunctuationQueueTest > 
testPunctuationInterval PASSED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowProcessorStateExceptionOnInitializeOffsetsWhenAuthorizationException 
STARTED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowProcessorStateExceptionOnInitializeOffsetsWhenAuthorizationException 
PASSED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowProcessorStateExceptionOnInitializeOffsetsWhenKafkaException STARTED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowProcessorStateExceptionOnInitializeOffsetsWhenKafkaException PASSED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowWakeupExceptionOnInitializeOffsetsWhenWakeupException STARTED

org.apache.kafka.streams.processor.internals.AbstractTaskTest > 
shouldThrowWakeupExceptionOnInitializeOffsetsWhenWakeupException PASSED

org.apache.kafka.streams.processor.internals.StreamPartitionAssignorTest > 
shouldThrowExceptionIfApplicationServerConfigIsNotHostPortPair STARTED

org.apache.kafka.streams.processor.internals.StreamPartitionAssignorTest > 
shouldThrowExceptionIfApplicationServerConfigIsNotHostPortPair PASSED

org.apache.kafka.streams.processor.internals.StreamPartitionAssignorTest > 
shouldMapUserEndPointToTopicPartitions STARTED

org.apache.kafka.streams.processor.internals.StreamPartitionAssignorTest > 
shouldMapUserEndPointToTopicPartitions PASSED

org.apache.kafka.streams.processor.internals.StreamPartitionAssignorTest > 
shouldAddUserDefinedEndPointToSubscription STARTED

org.apache.kafka.streams.processor.internals.StreamPartitionAssignorTest > 
shouldAddUserDefinedEndPointToSubscription PASSED

org.apache.kafka.streams.processor.internals.StreamPartitionAssignorTest > 
testAssignWithStandbyReplicas STARTED


[GitHub] kafka-site pull request #27: fix includes to missing files in 0101 docs and ...

2016-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka-site/pull/27


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


Re: [DISCUSS] KIP-72 Allow Sizing Incoming Request Queue in Bytes

2016-10-30 Thread radai
Hi Jun,

The benchmarks just spawn 16 threads where each thread allocates a chunk of
memory from the pool and immediately releases it. 16 was chosen because it's
typical for LinkedIn setups. The benchmarks never "consume" more than 16 *
[single allocation size] and so do not test out-of-memory performance, but
rather "normal" operating conditions. Tests were run with 4 memory
allocation sizes - 1k, 10k, 100k and 1M (1M being the largest typical
single request size setting at LinkedIn). The results are in ops/sec (for
context, a single request involves a single allocation/release cycle;
typical LinkedIn setups don't go beyond 20k requests/sec on a single broker).
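
For readers unfamiliar with the setup, a hypothetical JMH skeleton matching that
description (16 threads, each allocating and immediately releasing a buffer of a fixed
size, throughput mode) might look like the following; the MemoryPool interface and all
names here are illustrative, and the real benchmark code is in the linked repository.

{code:java}
import java.nio.ByteBuffer;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.Setup;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Threads;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.Throughput)
@Threads(16)                                         // 16 threads, typical for LinkedIn setups
public class MemoryPoolAllocBenchmark {

    public interface MemoryPool {
        ByteBuffer allocate(int size);
        void release(ByteBuffer buffer);
    }

    @Param({"1024", "10240", "102400", "1048576"})   // 1k, 10k, 100k, 1M
    public int bufferSize;

    private MemoryPool pool;

    @Setup
    public void setUp() {
        // Stand-in for the GarbageCollected/Simple pool implementations under test.
        pool = new MemoryPool() {
            public ByteBuffer allocate(int size) { return ByteBuffer.allocate(size); }
            public void release(ByteBuffer buffer) { /* no-op in this stand-in */ }
        };
    }

    @Benchmark
    public ByteBuffer allocateAndRelease() {
        // One benchmark op == one allocate/release cycle, reported in ops/s.
        ByteBuffer buffer = pool.allocate(bufferSize);
        pool.release(buffer);
        return buffer;                               // return to defeat dead-code elimination
    }
}
{code}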

The results show that the GC pool (which is a combination of an AtomicLong
outstanding-bytes count + weak references for allocated buffers) has a
negligible performance cost vs the simple pool benchmark (which does nothing,
same as the current code).

The more interesting thing the results show is that as the requested
buffer size gets larger, a single allocate/release cycle becomes more
expensive. Since the benchmark never holds a lot of outstanding memory (16 *
buf size tops), I suspect the issue is memory fragmentation - it's harder to
find larger contiguous chunks of heap.

This indicates that for throughput scenarios (large request batches) broker
performance may actually be impacted by the overhead of allocating and
releasing buffers (the situation may even be worse - inter-broker requests
are much larger), and an implementation of a memory pool that actually
recycles buffers (mine just acts as a limiter and leak detector) might
improve broker performance under high-throughput conditions (but that's
probably a separate follow-up change).

I expect to stress test my code this week (though no guarantees).

I'll look at KIP-81.

On Sun, Oct 30, 2016 at 12:27 PM, Jun Rao  wrote:

> Hi, Radai,
>
> Sorry for the late response. How should the benchmark results be
> interpreted? The higher the ops/s, the better? It would also be useful to
> test this out on LinkedIn's traffic with enough socket connections to see
> if there is any performance degradation.
>
> Also, there is a separate proposal KIP-81 to bound the consumer memory
> usage. Perhaps you can chime in there on whether this proposal can be
> utilized there too.
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> 81%3A+Bound+Fetch+memory+usage+in+the+consumer
>
> Thanks,
>
> Jun
>
> On Tue, Sep 27, 2016 at 10:23 AM, radai 
> wrote:
>
> > Hi Jun,
> >
> > 10 - mute/unmute functionality has been added in
> > https://github.com/radai-rosenblatt/kafka/tree/broker-
> > memory-pool-with-muting.
> > I have yet to run stress tests to see how it behaves versus without
> muting
> >
> > 11 - I've added a SimplePool implementation (nothing more than an
> > AtomicLong really) and compared it with my GC pool (that uses weak refs)
> -
> > https://github.com/radai-rosenblatt/kafka-benchmarks/
> > tree/master/memorypool-benchmarks.
> > the results show no noticeable difference. what the results _do_ show
> > though is that for large requests (1M) performance drops very sharply.
> > since the SimplePool is essentially identical to current kafka code
> > behaviour (the benchmark never reaches out-of-memory conditions) it would
> > suggest to me that kafka performance for large requests suffers greatly from
> > the cost of allocating (and releasing) large buffers (instead of actually
> > pooling them for later re-use). this means that an implementation of
> memory
> > pool that actually pools ( :-) ) is very likely to improve broker
> > performance for large requests.
> >
> > 12 - if there was a single thread iterating over selection keys then
> > stopping at 1st unsatisfiable request might work (if iteration order over
> > selection keys is deterministic, which is OS-dependent). however, kafka
> > spawns multiple selectors sharing the same pool so i doubt the approach
> > would gain anything. also notice that the current code already shuffles
> the
> > selection keys if memory is low (<10%) to try and guarantee fairness.
> >
> > attached the benchmark results for the pool implementations:
> >
> > Benchmark                                        Mode  Cnt        Score        Error  Units
> > GarbageCollectedMemoryPoolBenchmark.alloc_100k  thrpt    5   198272.519 ±  16045.965  ops/s
> > GarbageCollectedMemoryPoolBenchmark.alloc_10k   thrpt    5  2781439.307 ± 185287.072  ops/s
> > GarbageCollectedMemoryPoolBenchmark.alloc_1k    thrpt    5  6029199.952 ± 465936.118  ops/s
> > GarbageCollectedMemoryPoolBenchmark.alloc_1m    thrpt    5    18464.272 ±    332.861  ops/s
> > SimpleMemoryPoolBenchmark.alloc_100k            thrpt    5   204240.066 ±   2207.619  ops/s
> > SimpleMemoryPoolBenchmark.alloc_10k             thrpt    5  3000794.525 ±  83510.836  ops/s
> > SimpleMemoryPoolBenchmark.alloc_1k              thrpt    5  5893671.778 ± 274239.541  ops/s
> > 

[jira] [Commented] (KAFKA-4350) Can't mirror from Kafka 0.9 to Kafka 0.10.1

2016-10-30 Thread Emanuele Cesena (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621249#comment-15621249
 ] 

Emanuele Cesena commented on KAFKA-4350:


[~hachikuji] Sorry, I'm a bit confused. I need to replicate from Kafka 0.9 to 
0.10.

I don't think MM 0.9 works for writing to Kafka 0.10 - at least when I tried it 
in my setup (i.e., K 0.9 -> MM 0.9 -> K 0.10) it didn't work, and then I read 
around and always saw recommendations to use the new MM against the old broker, 
not vice versa.

So I went for MM 0.10: K 0.9 -> MM 0.10 -> K 0.10.
In this case, I can use either the new client or the old client (setting 
bootstrap.servers vs zookeeper.connect).
I understand that the new client may not work for reading from K 0.9.

But supposedly the old client (in MM 0.10) should work, shouldn't it? This is 
the core of the issue I'm reporting: I'm getting errors even with the old 
client.

> Can't mirror from Kafka 0.9 to Kafka 0.10.1
> ---
>
> Key: KAFKA-4350
> URL: https://issues.apache.org/jira/browse/KAFKA-4350
> Project: Kafka
>  Issue Type: Bug
>Reporter: Emanuele Cesena
>
> I'm running 2 clusters: K9 with Kafka 0.9 and K10 with Kafka 0.10.1.
> In K10, I've set up mirror maker to clone a topic from K9 to K10.
> Mirror maker immediately fails while starting, any suggestions? The error 
> message and configs follow.
> Error message:
> {code:java} 
> [2016-10-26 23:54:01,663] FATAL [mirrormaker-thread-0] Mirror maker thread 
> failure due to  (kafka.tools.MirrorMaker$MirrorMakerThread)
> org.apache.kafka.common.protocol.types.SchemaException: Error reading field 
> 'cluster_id': Error reading string of length 418, only 43 bytes available
> at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:73)
> at 
> org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:380)
> at 
> org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:449)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:180)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:193)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:248)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1013)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:979)
> at 
> kafka.tools.MirrorMaker$MirrorMakerNewConsumer.receive(MirrorMaker.scala:582)
> at 
> kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:431)
> [2016-10-26 23:54:01,679] FATAL [mirrormaker-thread-0] Mirror maker thread 
> exited abnormally, stopping the whole mirror maker. 
> (kafka.tools.MirrorMaker$MirrorMakerThread)
> {code} 
> Consumer:
> {code:} 
> group.id=mirrormaker001
> client.id=mirrormaker001
> bootstrap.servers=...K9...
> security.protocol=PLAINTEXT
> auto.offset.reset=earliest
> {code} 
> (note that I first ran without client.id, then tried adding a client.id 
> because -- same error in both cases)
> Producer:
> {code:}
> bootstrap.servers=...K10...
> security.protocol=PLAINTEXT
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (KAFKA-4350) Can't mirror from Kafka 0.9 to Kafka 0.10.1

2016-10-30 Thread Jiangjie Qin (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15621305#comment-15621305
 ] 

Jiangjie Qin commented on KAFKA-4350:
-

[~ecesena] In Kafka, wire protocol compatibility is taken care of by the 
brokers. The clients will always send the latest wire protocol version they are 
aware of. It is the brokers' responsibility to adapt to the client's protocol 
version and send the corresponding response back. For example, if the broker is 
running 0.10 and it receives a request sent by a 0.9 client, the broker will 
send the response in the 0.9 version. That means newer brokers support 
older clients.

However, if a broker is running 0.9 and receives a request sent by a 0.10 
client, the broker does not even know the 0.10 protocol, so depending on 
the type of request, different exceptions might be seen on the client side. In 
your case, it seems the 0.9 broker sent back a metadata response to a 0.10 
client. The 0.10 client expects a cluster_id field in the response, which 
does not exist in the 0.9 metadata response. Therefore a schema exception was 
thrown, as you saw. This is the limitation mentioned by [~hachikuji].

So if you want to replicate data between 0.9 and 0.10 brokers, the mirror maker from 
0.9 Kafka should be used so that it can talk to both 0.9 and 0.10 brokers.



> Can't mirror from Kafka 0.9 to Kafka 0.10.1
> ---
>
> Key: KAFKA-4350
> URL: https://issues.apache.org/jira/browse/KAFKA-4350
> Project: Kafka
>  Issue Type: Bug
>Reporter: Emanuele Cesena
>
> I'm running 2 clusters: K9 with Kafka 0.9 and K10 with Kafka 0.10.1.
> In K10, I've set up mirror maker to clone a topic from K9 to K10.
> Mirror maker immediately fails while starting, any suggestions? The error 
> message and configs follow.
> Error message:
> {code:java} 
> [2016-10-26 23:54:01,663] FATAL [mirrormaker-thread-0] Mirror maker thread 
> failure due to  (kafka.tools.MirrorMaker$MirrorMakerThread)
> org.apache.kafka.common.protocol.types.SchemaException: Error reading field 
> 'cluster_id': Error reading string of length 418, only 43 bytes available
> at org.apache.kafka.common.protocol.types.Schema.read(Schema.java:73)
> at 
> org.apache.kafka.clients.NetworkClient.parseResponse(NetworkClient.java:380)
> at 
> org.apache.kafka.clients.NetworkClient.handleCompletedReceives(NetworkClient.java:449)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:269)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:232)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:180)
> at 
> org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:193)
> at 
> org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:248)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1013)
> at 
> org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:979)
> at 
> kafka.tools.MirrorMaker$MirrorMakerNewConsumer.receive(MirrorMaker.scala:582)
> at 
> kafka.tools.MirrorMaker$MirrorMakerThread.run(MirrorMaker.scala:431)
> [2016-10-26 23:54:01,679] FATAL [mirrormaker-thread-0] Mirror maker thread 
> exited abnormally, stopping the whole mirror maker. 
> (kafka.tools.MirrorMaker$MirrorMakerThread)
> {code} 
> Consumer:
> {code:} 
> group.id=mirrormaker001
> client.id=mirrormaker001
> bootstrap.servers=...K9...
> security.protocol=PLAINTEXT
> auto.offset.reset=earliest
> {code} 
> (note that I first ran without client.id, then tried adding a client.id 
> because -- same error in both cases)
> Producer:
> {code:}
> bootstrap.servers=...K10...
> security.protocol=PLAINTEXT
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: [VOTE] KIP-85: Dynamic JAAS configuration for Kafka clients

2016-10-30 Thread Jun Rao
Rajini,

Thanks for the KIP. +1

Jun

On Wed, Oct 26, 2016 at 8:26 AM, Rajini Sivaram <
rajinisiva...@googlemail.com> wrote:

> I would like to initiate the voting process for KIP-85: Dynamic JAAS
> configuration for Kafka Clients:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-85%3A+Dynamic+JAAS+
> configuration+for+Kafka+clients
>
> This KIP enables Java clients to connect to Kafka using SASL without a
> physical jaas.conf file. This will also be useful to configure multiple
> KafkaClient login contexts when multiple users are supported within a JVM.
>
> Thank you...
>
> Regards,
>
> Rajini
>
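
A minimal sketch of what this enables on the client side, assuming the new client
property is named sasl.jaas.config as proposed in the KIP (the broker address and
credentials below are illustrative):

{code:java}
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;

public class SaslClientWithoutJaasFile {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9093");
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "PLAIN");
        // The JAAS configuration is passed as a client property, so no physical
        // jaas.conf file or java.security.auth.login.config system property is needed.
        props.put("sasl.jaas.config",
            "org.apache.kafka.common.security.plain.PlainLoginModule required " +
            "username=\"alice\" password=\"alice-secret\";");
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        KafkaProducer<String, String> producer = new KafkaProducer<>(props);
        producer.close();
    }
}
{code}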


[GitHub] kafka pull request #2032: KAFKA-3559: Recycle old tasks when possible

2016-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2032


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-3559) Task creation time taking too long in rebalance callback

2016-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620357#comment-15620357
 ] 

ASF GitHub Bot commented on KAFKA-3559:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2032


> Task creation time taking too long in rebalance callback
> 
>
> Key: KAFKA-3559
> URL: https://issues.apache.org/jira/browse/KAFKA-3559
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Eno Thereska
>  Labels: architecture
> Fix For: 0.10.2.0
>
>
> Currently in Kafka Streams, we create stream tasks upon getting newly 
> assigned partitions in the rebalance callback function {code} onPartitionAssigned 
> {code}, which involves initialization of the processor state stores as well 
> (including opening the RocksDB store, restoring the store from the changelog, etc., 
> which takes time).
> With a large number of state stores, the initialization time itself could 
> take tens of seconds, which usually is longer than the consumer session 
> timeout. As a result, when the callback is completed, the consumer has already 
> been treated as failed by the coordinator and a rebalance is triggered again.
> We need to consider whether we can optimize the initialization process, or move it 
> out of the callback function, and while initializing the stores one by one, 
> use the poll call to send heartbeats to avoid being kicked out by the coordinator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (KAFKA-3559) Task creation time taking too long in rebalance callback

2016-10-30 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-3559.
--
Resolution: Fixed

Issue resolved by pull request 2032
[https://github.com/apache/kafka/pull/2032]

> Task creation time taking too long in rebalance callback
> 
>
> Key: KAFKA-3559
> URL: https://issues.apache.org/jira/browse/KAFKA-3559
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Guozhang Wang
>Assignee: Eno Thereska
>  Labels: architecture
> Fix For: 0.10.2.0
>
>
> Currently in Kafka Streams, we create stream tasks upon getting newly 
> assigned partitions in the rebalance callback function {code} onPartitionAssigned 
> {code}, which involves initialization of the processor state stores as well 
> (including opening the RocksDB store, restoring the store from the changelog, etc., 
> which takes time).
> With a large number of state stores, the initialization time itself could 
> take tens of seconds, which usually is longer than the consumer session 
> timeout. As a result, when the callback is completed, the consumer has already 
> been treated as failed by the coordinator and a rebalance is triggered again.
> We need to consider whether we can optimize the initialization process, or move it 
> out of the callback function, and while initializing the stores one by one, 
> use the poll call to send heartbeats to avoid being kicked out by the coordinator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[GitHub] kafka pull request #2065: KAFKA-4302: Simplify KTableSource

2016-10-30 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2065


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[jira] [Commented] (KAFKA-4302) Simplify KTableSource

2016-10-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15620376#comment-15620376
 ] 

ASF GitHub Bot commented on KAFKA-4302:
---

Github user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/2065


> Simplify KTableSource
> -
>
> Key: KAFKA-4302
> URL: https://issues.apache.org/jira/browse/KAFKA-4302
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> With the new "interactive queries" feature, source tables are always 
> materialized. Thus, we can remove the stale flag {{KTableSoure#materialized}} 
> (which is always true now) to simply to code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (KAFKA-4302) Simplify KTableSource

2016-10-30 Thread Guozhang Wang (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-4302?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang updated KAFKA-4302:
-
   Resolution: Fixed
Fix Version/s: 0.10.2.0
   Status: Resolved  (was: Patch Available)

Issue resolved by pull request 2065
[https://github.com/apache/kafka/pull/2065]

> Simplify KTableSource
> -
>
> Key: KAFKA-4302
> URL: https://issues.apache.org/jira/browse/KAFKA-4302
> Project: Kafka
>  Issue Type: Improvement
>  Components: streams
>Reporter: Matthias J. Sax
>Assignee: Matthias J. Sax
>Priority: Minor
> Fix For: 0.10.2.0
>
>
> With the new "interactive queries" feature, source tables are always 
> materialized. Thus, we can remove the stale flag {{KTableSoure#materialized}} 
> (which is always true now) to simply to code.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)