[GitHub] kafka pull request: KAFKA-2837: fix transient failure of kafka.api...

2015-12-13 Thread asfgit
GitHub user asfgit closed the pull request at:

https://github.com/apache/kafka/pull/648


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] kafka pull request: KAFKA-2837: fix transient failure of kafka.api...

2015-12-11 Thread ZoneMayor
GitHub user ZoneMayor closed the pull request at:

https://github.com/apache/kafka/pull/648




[GitHub] kafka pull request: KAFKA-2837: fix transient failure of kafka.api...

2015-12-11 Thread ZoneMayor
GitHub user ZoneMayor reopened a pull request:

https://github.com/apache/kafka/pull/648

KAFKA-2837: fix transient failure of kafka.api.ProducerBounceTest > 
testBrokerFailure

I can reproduce this transient failure, although it seldom happens.
The test code looks like this:

    // rolling bounce brokers
    for (i <- 0 until numServers) {
      for (server <- servers) {
        server.shutdown()
        server.awaitShutdown()
        server.startup()
        Thread.sleep(2000)
      }

      // Make sure the producer does not see any exception
      // in returned metadata due to broker failures
      assertTrue(scheduler.failed == false)

      // Make sure the leader still exists after bouncing brokers
      (0 until numPartitions).foreach(partition =>
        TestUtils.waitUntilLeaderIsElectedOrChanged(zkUtils, topic1, partition))
    }

The brokers keep going through rolling restarts while the producer keeps
sending messages.
In every iteration of the loop, the test waits for the partition leaders
to be elected.
If the election is slow, more messages are buffered in RecordAccumulator's
BufferPool.
The limit for the buffer is set to 3;
TimeoutException("Failed to allocate memory within the configured max
blocking time") shows up when the buffer runs out of memory.
Because the test sleeps for 2000 ms after every broker restart, this
transient failure seldom happens.
If I reduce the sleep period, the failure becomes more likely.
For example, if the broker acting as controller is restarted, it takes
time to elect a new controller first and then the partition leaders, which
leads to more messages being blocked in
KafkaProducer:RecordAccumulator:BufferPool.
In this fix, I simply enlarge the producer's buffer size to 1MB.
@guozhangwang, could you give some comments?
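
The buffer exhaustion described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's BufferPool: the names BoundedBuffer and failedSends, the 3000-byte limit, the 1000-byte record size and the 10 ms blocking time are all hypothetical values chosen for illustration. It models a producer that keeps allocating space while nothing drains the buffer, which is roughly what happens while no partition leader is available.

```scala
import java.util.concurrent.{Semaphore, TimeUnit, TimeoutException}

// Hypothetical stand-in for a bounded producer buffer; NOT Kafka's BufferPool.
final class BoundedBuffer(totalBytes: Int, maxBlockMs: Long) {
  private val free = new Semaphore(totalBytes)

  // Reserve `size` bytes, waiting at most maxBlockMs for space to be released.
  def allocate(size: Int): Unit =
    if (!free.tryAcquire(size, maxBlockMs, TimeUnit.MILLISECONDS))
      throw new TimeoutException(
        "Failed to allocate memory within the configured max blocking time")

  // Give `size` bytes back once a batch has been sent (unused in this demo,
  // because the demo models the window in which nothing can be sent).
  def release(size: Int): Unit = free.release(size)
}

object BufferDemo {
  // Count how many sends fail while nothing drains the buffer.
  def failedSends(totalBytes: Int, recordBytes: Int, records: Int): Int = {
    val buffer = new BoundedBuffer(totalBytes, maxBlockMs = 10L)
    (1 to records).count { _ =>
      try { buffer.allocate(recordBytes); false }
      catch { case _: TimeoutException => true }
    }
  }

  def main(args: Array[String]): Unit = {
    // Small buffer (assumed size): only the first three records fit.
    println(failedSends(totalBytes = 3000, recordBytes = 1000, records = 10))
    // 1MB buffer, as in the fix: every record fits.
    println(failedSends(totalBytes = 1024 * 1024, recordBytes = 1000, records = 10))
  }
}
```

In the real producer the corresponding settings are `buffer.memory` and `max.block.ms`. A larger buffer does not make leader election faster; it only gives the producer more room to absorb records while the election is in progress.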

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZoneMayor/kafka trunk-KAFKA-2837

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/648.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #648


commit 95374147a28208d4850f6e73f714bf418935fc2d
Author: ZoneMayor 
Date:   2015-11-27T03:49:34Z

Merge pull request #1 from apache/trunk

merge

commit cec5b48b651a7efd3900cfa3c1fd0ab1eeeaa3ec
Author: ZoneMayor 
Date:   2015-12-01T10:44:02Z

Merge pull request #2 from apache/trunk

2015-12-1

commit a119d547bf1741625ce0627073c7909992a20f15
Author: ZoneMayor 
Date:   2015-12-04T13:42:27Z

Merge pull request #3 from apache/trunk

2015-12-04#KAFKA-2893

commit b767a8dff85fc71c75d4cf5178c3f6f03ff81bfc
Author: ZoneMayor 
Date:   2015-12-09T10:42:30Z

Merge pull request #5 from apache/trunk

2015-12-9

commit cd5e6f4700a4387f9383b84aca0ee9c4639b1033
Author: jinxing 
Date:   2015-12-09T13:49:07Z

KAFKA-2837: fix transient failure kafka.api.ProducerBounceTest > 
testBrokerFailure

commit 8ded9104a04861f789a7a990c2ddd4fc38a899cd
Author: ZoneMayor 
Date:   2015-12-10T04:47:06Z

Merge pull request #6 from apache/trunk

2015-12-10

commit 2bcf010c73923bb24bbd9cece7e39983b2bdce0c
Author: jinxing 
Date:   2015-12-10T04:47:39Z

KAFKA-2837: WIP

commit dae4a3cc0b564bb25121d54e65b5ad363c3e866d
Author: jinxing 
Date:   2015-12-10T04:48:21Z

Merge branch 'trunk-KAFKA-2837' of https://github.com/ZoneMayor/kafka into 
trunk-KAFKA-2837

commit 7118e11813e445bca3eab65a23028e76138b136a
Author: jinxing 
Date:   2015-12-10T04:51:43Z

KAFKA-2837: WIP

commit 310dd6b34547b52aad21a35dcf631bda3e15ab64
Author: jinxing 
Date:   2015-12-11T03:43:32Z

KAFKA-2837: WIP






[GitHub] kafka pull request: KAFKA-2837: fix transient failure of kafka.api...

2015-12-10 Thread ZoneMayor
GitHub user ZoneMayor reopened a pull request:

https://github.com/apache/kafka/pull/648

KAFKA-2837: fix transient failure of kafka.api.ProducerBounceTest > 
testBrokerFailure

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZoneMayor/kafka trunk-KAFKA-2837

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/648.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #648


commit 95374147a28208d4850f6e73f714bf418935fc2d
Author: ZoneMayor 
Date:   2015-11-27T03:49:34Z

Merge pull request #1 from apache/trunk

merge

commit cec5b48b651a7efd3900cfa3c1fd0ab1eeeaa3ec
Author: ZoneMayor 
Date:   2015-12-01T10:44:02Z

Merge pull request #2 from apache/trunk

2015-12-1

commit a119d547bf1741625ce0627073c7909992a20f15
Author: ZoneMayor 
Date:   2015-12-04T13:42:27Z

Merge pull request #3 from apache/trunk

2015-12-04#KAFKA-2893

commit b767a8dff85fc71c75d4cf5178c3f6f03ff81bfc
Author: ZoneMayor 
Date:   2015-12-09T10:42:30Z

Merge pull request #5 from apache/trunk

2015-12-9

commit cd5e6f4700a4387f9383b84aca0ee9c4639b1033
Author: jinxing 
Date:   2015-12-09T13:49:07Z

KAFKA-2837: fix transient failure kafka.api.ProducerBounceTest > 
testBrokerFailure

commit 8ded9104a04861f789a7a990c2ddd4fc38a899cd
Author: ZoneMayor 
Date:   2015-12-10T04:47:06Z

Merge pull request #6 from apache/trunk

2015-12-10

commit 2bcf010c73923bb24bbd9cece7e39983b2bdce0c
Author: jinxing 
Date:   2015-12-10T04:47:39Z

KAFKA-2837: WIP

commit dae4a3cc0b564bb25121d54e65b5ad363c3e866d
Author: jinxing 
Date:   2015-12-10T04:48:21Z

Merge branch 'trunk-KAFKA-2837' of https://github.com/ZoneMayor/kafka into 
trunk-KAFKA-2837

commit 7118e11813e445bca3eab65a23028e76138b136a
Author: jinxing 
Date:   2015-12-10T04:51:43Z

KAFKA-2837: WIP






[GitHub] kafka pull request: KAFKA-2837: fix transient failure of kafka.api...

2015-12-10 Thread ZoneMayor
GitHub user ZoneMayor closed the pull request at:

https://github.com/apache/kafka/pull/648




[GitHub] kafka pull request: KAFKA-2837: fix transient failure of kafka.api...

2015-12-10 Thread ZoneMayor
GitHub user ZoneMayor closed the pull request at:

https://github.com/apache/kafka/pull/648




[GitHub] kafka pull request: KAFKA-2837: fix transient failure of kafka.api...

2015-12-10 Thread ZoneMayor
GitHub user ZoneMayor reopened a pull request:

https://github.com/apache/kafka/pull/648

KAFKA-2837: fix transient failure of kafka.api.ProducerBounceTest > 
testBrokerFailure

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZoneMayor/kafka trunk-KAFKA-2837

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/648.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #648


commit 95374147a28208d4850f6e73f714bf418935fc2d
Author: ZoneMayor 
Date:   2015-11-27T03:49:34Z

Merge pull request #1 from apache/trunk

merge

commit cec5b48b651a7efd3900cfa3c1fd0ab1eeeaa3ec
Author: ZoneMayor 
Date:   2015-12-01T10:44:02Z

Merge pull request #2 from apache/trunk

2015-12-1

commit a119d547bf1741625ce0627073c7909992a20f15
Author: ZoneMayor 
Date:   2015-12-04T13:42:27Z

Merge pull request #3 from apache/trunk

2015-12-04#KAFKA-2893

commit b767a8dff85fc71c75d4cf5178c3f6f03ff81bfc
Author: ZoneMayor 
Date:   2015-12-09T10:42:30Z

Merge pull request #5 from apache/trunk

2015-12-9

commit cd5e6f4700a4387f9383b84aca0ee9c4639b1033
Author: jinxing 
Date:   2015-12-09T13:49:07Z

KAFKA-2837: fix transient failure kafka.api.ProducerBounceTest > 
testBrokerFailure

commit 8ded9104a04861f789a7a990c2ddd4fc38a899cd
Author: ZoneMayor 
Date:   2015-12-10T04:47:06Z

Merge pull request #6 from apache/trunk

2015-12-10

commit 2bcf010c73923bb24bbd9cece7e39983b2bdce0c
Author: jinxing 
Date:   2015-12-10T04:47:39Z

KAFKA-2837: WIP

commit dae4a3cc0b564bb25121d54e65b5ad363c3e866d
Author: jinxing 
Date:   2015-12-10T04:48:21Z

Merge branch 'trunk-KAFKA-2837' of https://github.com/ZoneMayor/kafka into 
trunk-KAFKA-2837

commit 7118e11813e445bca3eab65a23028e76138b136a
Author: jinxing 
Date:   2015-12-10T04:51:43Z

KAFKA-2837: WIP






[GitHub] kafka pull request: KAFKA-2837: fix transient failure of kafka.api...

2015-12-09 Thread ZoneMayor
GitHub user ZoneMayor opened a pull request:

https://github.com/apache/kafka/pull/648

KAFKA-2837: fix transient failure of kafka.api.ProducerBounceTest > 
testBrokerFailure

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ZoneMayor/kafka trunk-KAFKA-2837

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/kafka/pull/648.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #648


commit 95374147a28208d4850f6e73f714bf418935fc2d
Author: ZoneMayor 
Date:   2015-11-27T03:49:34Z

Merge pull request #1 from apache/trunk

merge

commit cec5b48b651a7efd3900cfa3c1fd0ab1eeeaa3ec
Author: ZoneMayor 
Date:   2015-12-01T10:44:02Z

Merge pull request #2 from apache/trunk

2015-12-1

commit a119d547bf1741625ce0627073c7909992a20f15
Author: ZoneMayor 
Date:   2015-12-04T13:42:27Z

Merge pull request #3 from apache/trunk

2015-12-04#KAFKA-2893

commit b767a8dff85fc71c75d4cf5178c3f6f03ff81bfc
Author: ZoneMayor 
Date:   2015-12-09T10:42:30Z

Merge pull request #5 from apache/trunk

2015-12-9

commit cd5e6f4700a4387f9383b84aca0ee9c4639b1033
Author: jinxing 
Date:   2015-12-09T13:49:07Z

KAFKA-2837: fix transient failure kafka.api.ProducerBounceTest > 
testBrokerFailure



