Re: Kafka TLP website move

2012-12-10 Thread Joe Schaefer


You need to tell us HOW to create it,
either as a CMS site or using svnpubsub!




 From: Jay Kreps jay.kr...@gmail.com
To: infrastruct...@apache.org 
Cc: dev@kafka.apache.org 
Sent: Monday, December 10, 2012 12:30 PM
Subject: Re: Kafka TLP website move
 

Ooops, wrong ticket:
  https://issues.apache.org/jira/browse/INFRA-5586

:-)

-Jay




On Mon, Dec 10, 2012 at 9:29 AM, Jay Kreps jay.kr...@gmail.com wrote:

Hey guys,

It's been a few weeks and we are still waiting on getting a top-level website 
URL for Kafka. I tried just making it myself, but that didn't work:
  jkreps@minotaur:/www$ mkdir kafka.apache.org
  mkdir: kafka.apache.org: Permission denied

Are we confused? Can anyone help?

Here is the ticket:
  https://issues.apache.org/jira/browse/KAFKA-654

Thanks!

-Jay





[jira] [Commented] (KAFKA-604) Add missing metrics in 0.8

2012-12-10 Thread Yang Ye (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528116#comment-13528116
 ] 

Yang Ye commented on KAFKA-604:
---

Sure, I'll do that soon

Best,
---
Victor Yang Ye
+1(650)283-6547

http://www.linkedin.com/pub/victor-yang-ye/13/ba3/4b8
http://www.linkedin.com/profile/view?id=47740172
http://www.facebook.com/yeyangever

Founder, Uyan.cc
Software Engineer in Distributed Data System Group, LinkedIn Corporation
Dept. of Computer Science, Graduate School of Arts and Sciences, Columbia
University
Special Pilot Computer Science Class, Tsinghua University

yeyange...@gmail.com
y...@linkedin.com
yy2...@columbia.edu






 Add missing metrics in 0.8
 --

 Key: KAFKA-604
 URL: https://issues.apache.org/jira/browse/KAFKA-604
 Project: Kafka
  Issue Type: Bug
  Components: core
Affects Versions: 0.8
Reporter: Jun Rao
 Attachments: kafka_604_v1.patch, kafka_604_v2.patch

   Original Estimate: 24h
  Remaining Estimate: 24h

 It would be good if we add the following metrics:
 Producer: droppedMessageRate per topic
 ReplicaManager: partition count on the broker
 FileMessageSet: logFlushTimer per log (i.e., partition). Also, logFlushTime 
 should probably be moved to LogSegment since the flush now includes index 
 flush time.
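
As a rough illustration only (not the attached patch), a per-topic meter such as
droppedMessageRate could be registered through the Yammer Metrics 2.x API that
Kafka 0.8 builds on; all class and metric names below are assumed:

  import com.yammer.metrics.Metrics;
  import com.yammer.metrics.core.Meter;
  import java.util.concurrent.TimeUnit;

  // Hypothetical per-topic producer stats holder; metric names are made up.
  public class ProducerTopicMetrics {
      private final Meter droppedMessageRate;

      public ProducerTopicMetrics(String topic) {
          droppedMessageRate = Metrics.newMeter(ProducerTopicMetrics.class,
                  topic + "-DroppedMessagesPerSec", "drops", TimeUnit.SECONDS);
      }

      // Called whenever the async producer drops a message for this topic.
      public void droppedMessage() {
          droppedMessageRate.mark();
      }
  }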

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Resolved] (KAFKA-654) Irrecoverable error while trying to roll a segment that already exists

2012-12-10 Thread Neha Narkhede (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-654?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neha Narkhede resolved KAFKA-654.
-

Resolution: Fixed
  Assignee: Neha Narkhede

Thanks for the review; committed patch v1 to the 0.8 branch.

 Irrecoverable error while trying to roll a segment that already exists
 --

 Key: KAFKA-654
 URL: https://issues.apache.org/jira/browse/KAFKA-654
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Neha Narkhede
Priority: Blocker
 Attachments: kafka-654-v1.patch


 I tried setting up a 5-broker 0.8 cluster and sending messages to 100s of 
 topics on it. For a couple of topic partitions, the produce requests never 
 succeed since they fail on the leader with the following error - 
 [2012-12-05 22:54:05,711] WARN [Kafka Log on Broker 2], Newly rolled segment 
 file 00000000000000000000.log already exists; deleting it first (kafka.log.Log)
 [2012-12-05 22:54:05,711] WARN [Kafka Log on Broker 2], Newly rolled segment 
 file 00000000000000000000.index already exists; deleting it first (kafka.log.Log)
 [2012-12-05 22:54:05,715] ERROR [ReplicaFetcherThread-1-0-on-broker-2], Error 
 due to  (kafka.server.ReplicaFetcherThread)
 kafka.common.KafkaException: Trying to roll a new log segment for topic 
 partition NusWriteEvent-4 with start offset 0 while it already exsits
 at kafka.log.Log.rollToOffset(Log.scala:456)
 at kafka.log.Log.roll(Log.scala:434)
 at kafka.log.Log.maybeRoll(Log.scala:423)
 at kafka.log.Log.append(Log.scala:257)
 at 
 kafka.server.ReplicaFetcherThread.processPartitionData(ReplicaFetcherThread.scala:51)
 at 
 kafka.server.AbstractFetcherThread$$anonfun$doWork$5.apply(AbstractFetcherThread.scala:125)
 at 
 kafka.server.AbstractFetcherThread$$anonfun$doWork$5.apply(AbstractFetcherThread.scala:108)
 at 
 scala.collection.immutable.HashMap$HashMap1.foreach(HashMap.scala:125)
 at 
 scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:344)
 at 
 scala.collection.immutable.HashMap$HashTrieMap.foreach(HashMap.scala:344)
 at 
 kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:108)
 at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Kafka TLP website move

2012-12-10 Thread Jay Kreps
Cool, makes sense. Let's go with SVN and svnpubsub then. The site
subdirectory that we would publish is
  https://svn.apache.org/repos/asf/kafka/site

In the future if we switch to git we will just leave the site in svn and
continue to use that for site updates.

-Jay


On Mon, Dec 10, 2012 at 10:15 AM, Ted Dunning ted.dunn...@gmail.com wrote:

 We have an SVN repo for the web site and a git repo for code.


 On Mon, Dec 10, 2012 at 10:13 AM, Jay Kreps jay.kr...@gmail.com wrote:

 I am confused. The CMS documentation says this:

 "Instead of developing versioning support and a notification scheme into
 a database driven CMS, Apache's subversion infrastructure
 (http://svn.apache.org/) was chosen as the central data store for
 everything. The fact that the web interface to the CMS interacts with the
 subversion repository in a LAN environment, combined with the
 lightning-fast SSDs that serve as l2arc cache for the underlying FreeBSD
 ZFS filesystem, eliminates virtually all subversion network/disk latency.
 Subversion continues to scale past 1M commits to deliver high performance
 to Apache developers, as well as to our internal programs that rely on it."

 How do the ASF's git projects maintain their websites if both website
 options require SVN? Should we just have a separate repository for the site
 that is in SVN (currently they are in the same repository)? Basically the
 project voted to move to git, so I don't want to make any choices that
 block that.

 -Jay


 On Mon, Dec 10, 2012 at 9:53 AM, Joe Schaefer joe_schae...@yahoo.com wrote:

 No, you may not stick with a manual process. If the
 CMS doesn't suit you (there is no requirement to use
 markdown; other CMS sites use HTML), you must use
 svnpubsub.  There is no gitpubsub and there are no
 plans to write one.


   --
 From: Jay Kreps jay.kr...@gmail.com
 To: Joe Schaefer joe_schae...@yahoo.com
 Sent: Monday, December 10, 2012 12:50 PM
 Subject: Re: Kafka TLP website move

 The CMS sounds like it requires some kind of markdown format. Our site
 is in HTML, so that won't work.

 svnpubsub sounds like it requires svn. We are trying to move to git, so
 that probably isn't good either.

 Is it possible to stick with the manual update process we had for the
 incubator site?

 Thanks!

 -Jay

 On Mon, Dec 10, 2012 at 9:32 AM, Joe Schaefer joe_schae...@yahoo.com wrote:

 You need to tell us HOW to create it, either as a CMS site or using svnpubsub!









Async segment delete patch

2012-12-10 Thread Jay Kreps
Hello fellow log maintainers, I have a patch and would love your feedback:
https://issues.apache.org/jira/browse/KAFKA-636

There will be a few more in this series as I finish off the log compaction
work. This patch is against trunk, though we may end up needing to backport
it if we hit delete-related issues in 0.8.

Cheers,

-Jay
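
For readers skimming the thread: a common shape for asynchronous segment
deletion is to rename the segment out of the way immediately (so the log can
keep rolling) and schedule the physical removal on a background thread. A
minimal standalone sketch of that idea, assuming a ".deleted" suffix and a
fixed delay (neither is taken from the patch):

  import java.io.File;
  import java.util.concurrent.Executors;
  import java.util.concurrent.ScheduledExecutorService;
  import java.util.concurrent.TimeUnit;

  public class AsyncSegmentDelete {
      private static final ScheduledExecutorService scheduler =
              Executors.newSingleThreadScheduledExecutor();

      // Rename the segment immediately, then delete the renamed file later.
      // The ".deleted" suffix and 60s delay are illustrative choices only.
      public static void asyncDelete(File segment) {
          File renamed = new File(segment.getPath() + ".deleted");
          if (!segment.renameTo(renamed))
              throw new IllegalStateException("Failed to rename " + segment);
          scheduler.schedule(() -> { renamed.delete(); }, 60, TimeUnit.SECONDS);
      }
  }

The rename makes the segment invisible to readers right away, while the actual
disk I/O happens off the request path.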


Re: Kafka TLP website move

2012-12-10 Thread Ted Dunning
Our experience with CMS has been good, btw.  Consider it as an option.  You
get a very simple markdown-based web site with browser editing if you like
it.  Works very well.

On Mon, Dec 10, 2012 at 10:26 AM, Jay Kreps jay.kr...@gmail.com wrote:

 Cool, makes sense. Let's go with SVN and svnpubsub then. The site
 subdirectory that we would publish is
   https://svn.apache.org/repos/asf/kafka/site

 In the future if we switch to git we will just leave the site in svn and
 continue to use that for site updates.

 -Jay


 On Mon, Dec 10, 2012 at 10:15 AM, Ted Dunning ted.dunn...@gmail.com wrote:

 We have an SVN repo for the web site and a git repo for code.


 On Mon, Dec 10, 2012 at 10:13 AM, Jay Kreps jay.kr...@gmail.com wrote:

 I am confused. The CMS documentation says this:

 "Instead of developing versioning support and a notification scheme into
 a database driven CMS, Apache's subversion infrastructure
 (http://svn.apache.org/) was chosen as the central data store for
 everything. The fact that the web interface to the CMS interacts with the
 subversion repository in a LAN environment, combined with the
 lightning-fast SSDs that serve as l2arc cache for the underlying FreeBSD
 ZFS filesystem, eliminates virtually all subversion network/disk latency.
 Subversion continues to scale past 1M commits to deliver high performance
 to Apache developers, as well as to our internal programs that rely on it."

 How do the ASF's git projects maintain their websites if both website
 options require SVN? Should we just have a separate repository for the site
 that is in SVN (currently they are in the same repository)? Basically the
 project voted to move to git, so I don't want to make any choices that
 block that.

 -Jay


 On Mon, Dec 10, 2012 at 9:53 AM, Joe Schaefer joe_schae...@yahoo.com wrote:

 No, you may not stick with a manual process. If the
 CMS doesn't suit you (there is no requirement to use
 markdown; other CMS sites use HTML), you must use
 svnpubsub.  There is no gitpubsub and there are no
 plans to write one.


   --
 From: Jay Kreps jay.kr...@gmail.com
 To: Joe Schaefer joe_schae...@yahoo.com
 Sent: Monday, December 10, 2012 12:50 PM
 Subject: Re: Kafka TLP website move

 The CMS sounds like it requires some kind of markdown format. Our site
 is in HTML, so that won't work.

 svnpubsub sounds like it requires svn. We are trying to move to git, so
 that probably isn't good either.

 Is it possible to stick with the manual update process we had for the
 incubator site?

 Thanks!

 -Jay

 On Mon, Dec 10, 2012 at 9:32 AM, Joe Schaefer joe_schae...@yahoo.com wrote:

 You need to tell us HOW to create it, either as a CMS site or using svnpubsub!










[jira] Subscription: outstanding kafka patches

2012-12-10 Thread jira
Issue Subscription
Filter: outstanding kafka patches (57 issues)
The list of outstanding kafka patches
Subscriber: kafka-mailing-list

Key Summary
KAFKA-664   Kafka server threads die due to OOME during long running test
https://issues.apache.org/jira/browse/KAFKA-664
KAFKA-651   Create testcases on auto create topics
https://issues.apache.org/jira/browse/KAFKA-651
KAFKA-646   Provide aggregate stats at the high level Producer and 
ZookeeperConsumerConnector level
https://issues.apache.org/jira/browse/KAFKA-646
KAFKA-645   Create a shell script to run System Test with DEBUG details and 
tee console output to a file
https://issues.apache.org/jira/browse/KAFKA-645
KAFKA-637   Separate log4j environment variable from KAFKA_OPTS in 
kafka-run-class.sh
https://issues.apache.org/jira/browse/KAFKA-637
KAFKA-636   Make log segment delete asynchronous
https://issues.apache.org/jira/browse/KAFKA-636
KAFKA-628   System Test Failure Case 5005 (Mirror Maker bouncing) - Data Loss 
in ConsoleConsumer
https://issues.apache.org/jira/browse/KAFKA-628
KAFKA-621   System Test 9051 : ConsoleConsumer doesn't receives any data for 20 
topics but works for 10
https://issues.apache.org/jira/browse/KAFKA-621
KAFKA-607   System Test Transient Failure (case 4011 Log Retention) - 
ConsoleConsumer receives less data
https://issues.apache.org/jira/browse/KAFKA-607
KAFKA-606   System Test Transient Failure (case 0302 GC Pause) - Log segments 
mismatched across replicas
https://issues.apache.org/jira/browse/KAFKA-606
KAFKA-604   Add missing metrics in 0.8
https://issues.apache.org/jira/browse/KAFKA-604
KAFKA-598   decouple fetch size from max message size
https://issues.apache.org/jira/browse/KAFKA-598
KAFKA-597   Refactor KafkaScheduler
https://issues.apache.org/jira/browse/KAFKA-597
KAFKA-583   SimpleConsumerShell may receive less data inconsistently
https://issues.apache.org/jira/browse/KAFKA-583
KAFKA-552   No error messages logged for those failing-to-send messages from 
Producer
https://issues.apache.org/jira/browse/KAFKA-552
KAFKA-547   The ConsumerStats MBean name should include the groupid
https://issues.apache.org/jira/browse/KAFKA-547
KAFKA-530   kafka.server.KafkaApis: kafka.common.OffsetOutOfRangeException
https://issues.apache.org/jira/browse/KAFKA-530
KAFKA-493   High CPU usage on inactive server
https://issues.apache.org/jira/browse/KAFKA-493
KAFKA-479   ZK EPoll taking 100% CPU usage with Kafka Client
https://issues.apache.org/jira/browse/KAFKA-479
KAFKA-465   Performance test scripts - refactoring leftovers from tools to perf 
package
https://issues.apache.org/jira/browse/KAFKA-465
KAFKA-438   Code cleanup in MessageTest
https://issues.apache.org/jira/browse/KAFKA-438
KAFKA-419   Updated PHP client library to support kafka 0.7+
https://issues.apache.org/jira/browse/KAFKA-419
KAFKA-414   Evaluate mmap-based writes for Log implementation
https://issues.apache.org/jira/browse/KAFKA-414
KAFKA-411   Message Error in high cocurrent environment
https://issues.apache.org/jira/browse/KAFKA-411
KAFKA-404   When using chroot path, create chroot on startup if it doesn't exist
https://issues.apache.org/jira/browse/KAFKA-404
KAFKA-399   0.7.1 seems to show less performance than 0.7.0
https://issues.apache.org/jira/browse/KAFKA-399
KAFKA-398   Enhance SocketServer to Enable Sending Requests
https://issues.apache.org/jira/browse/KAFKA-398
KAFKA-397   kafka.common.InvalidMessageSizeException: null
https://issues.apache.org/jira/browse/KAFKA-397
KAFKA-388   Add a highly available consumer co-ordinator to a Kafka cluster
https://issues.apache.org/jira/browse/KAFKA-388
KAFKA-374   Move to java CRC32 implementation
https://issues.apache.org/jira/browse/KAFKA-374
KAFKA-346   Don't call commitOffsets() during rebalance
https://issues.apache.org/jira/browse/KAFKA-346
KAFKA-345   Add a listener to ZookeeperConsumerConnector to get notified on 
rebalance events
https://issues.apache.org/jira/browse/KAFKA-345
KAFKA-319   compression support added to php client does not pass unit tests
https://issues.apache.org/jira/browse/KAFKA-319
KAFKA-318   update zookeeper dependency to 3.3.5
https://issues.apache.org/jira/browse/KAFKA-318
KAFKA-314   Go Client Multi-produce
https://issues.apache.org/jira/browse/KAFKA-314
KAFKA-313   Add JSON output and looping options to ConsumerOffsetChecker
https://issues.apache.org/jira/browse/KAFKA-313
KAFKA-312   Add 'reset' operation for AsyncProducerDroppedEvents
https://issues.apache.org/jira/browse/KAFKA-312
KAFKA-298   Go Client support max message size
https://issues.apache.org/jira/browse/KAFKA-298

[jira] [Commented] (KAFKA-646) Provide aggregate stats at the high level Producer and ZookeeperConsumerConnector level

2012-12-10 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528183#comment-13528183
 ] 

Neha Narkhede commented on KAFKA-646:
-

Patch v2 looks good to me. A few minor questions -
1. Producer
You probably don't validate the client id anymore in the secondary constructor.
Shouldn't we do that?

2. ZookeeperConsumerConnector
consumerTopicStats is unused

3. Do the singleton validate() APIs need to be synchronized?


 Provide aggregate stats at the high level Producer and 
 ZookeeperConsumerConnector level
 ---

 Key: KAFKA-646
 URL: https://issues.apache.org/jira/browse/KAFKA-646
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Priority: Blocker
  Labels: bugs
 Fix For: 0.8

 Attachments: kafka-646-patch-num1-v1.patch, 
 kafka-646-patch-num1-v2.patch


 With KAFKA-622, we measure ProducerRequestStats and 
 FetchRequestAndResponseStats at the SyncProducer and SimpleConsumer level 
 respectively. We could also aggregate them at the high level Producer and 
 ZookeeperConsumerConnector level to provide an overall sense of 
 request/response rate/size at the client level. Currently, I am not 
 completely clear about the math that might be necessary for such aggregation 
 or if metrics already provides an API for aggregating stats of the same type.
 We should also address the comments by Jun at KAFKA-622; I am copy-pasting 
 them here:
 60. What happens if we have 2 instances of Consumers with the same clientid in 
 the same jvm? Does one of them fail because it fails to register metrics? 
 Ditto for Producers.
 61. ConsumerTopicStats: What if a topic is named AllTopics? We used to handle 
 this by adding a - in topic specific stats.
 62. ZookeeperConsumerConnector: Do we need to validate groupid?
 63. ClientId: Does the clientid length need to be different from topic length?
 64. AbstractFetcherThread: When building a fetch request, do we need to pass 
 in brokerInfo as part of the client id? BrokerInfo contains the source broker 
 info and the fetch requests are always made to the source broker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-646) Provide aggregate stats at the high level Producer and ZookeeperConsumerConnector level

2012-12-10 Thread Swapnil Ghike (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapnil Ghike updated KAFKA-646:


Attachment: kafka-646-patch-num1-v3.patch

Thanks for reviewing.

Patch v3:
1. Oh, that's because clientId is validated at the end of ProducerConfig 
constructor.

2. Removed it. 

3. Currently the validate() APIs only check for illegal chars and they don't 
yet check whether the incoming clientId has already been taken. (I am planning 
to do it in a separate patch in the same jira, after this patch has been 
checked in). 

 Provide aggregate stats at the high level Producer and 
 ZookeeperConsumerConnector level
 ---

 Key: KAFKA-646
 URL: https://issues.apache.org/jira/browse/KAFKA-646
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Priority: Blocker
  Labels: bugs
 Fix For: 0.8

 Attachments: kafka-646-patch-num1-v1.patch, 
 kafka-646-patch-num1-v2.patch, kafka-646-patch-num1-v3.patch


 With KAFKA-622, we measure ProducerRequestStats and 
 FetchRequestAndResponseStats at the SyncProducer and SimpleConsumer level 
 respectively. We could also aggregate them at the high level Producer and 
 ZookeeperConsumerConnector level to provide an overall sense of 
 request/response rate/size at the client level. Currently, I am not 
 completely clear about the math that might be necessary for such aggregation 
 or if metrics already provides an API for aggregating stats of the same type.
 We should also address the comments by Jun at KAFKA-622; I am copy-pasting 
 them here:
 60. What happens if we have 2 instances of Consumers with the same clientid in 
 the same jvm? Does one of them fail because it fails to register metrics? 
 Ditto for Producers.
 61. ConsumerTopicStats: What if a topic is named AllTopics? We used to handle 
 this by adding a - in topic specific stats.
 62. ZookeeperConsumerConnector: Do we need to validate groupid?
 63. ClientId: Does the clientid length need to be different from topic length?
 64. AbstractFetcherThread: When building a fetch request, do we need to pass 
 in brokerInfo as part of the client id? BrokerInfo contains the source broker 
 info and the fetch requests are always made to the source broker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-647) Provide a property in System Test for no. of topics and topics string will be generated automatically

2012-12-10 Thread John Fung (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Fung updated KAFKA-647:


Attachment: kafka-647-v1.patch

Uploaded kafka-647-v1.patch for the following changes:

1. To let System Test generate the topic string with a specified no. of 
topics, add the testcase argument num_topics_for_auto_generated_string to 
system_test/_testsuite/testcase_/testcase__properties.json such as:

  "testcase_args": {
    "broker_type": "leader",
    "bounce_broker": "false",
    . . .
    "num_topics_for_auto_generated_string": "20",
    . . .
  },

2. The topics string generated would be in the format of:
topic_0001,topic_0002,topic_0003, . . .,topic_. (See the sketch after this list.)

3. The topic prefix is hard-coded to be topic_. As long as the topic indexes 
are incremented by 1, the exact prefix doesn't matter much.

4. The existing topics specification in other test cases is still supported. 
The multi-topic string is only generated if the testcase argument in #1 is 
specified.
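
For concreteness, the string generation described in #2 could look like the
following sketch (the real generation lives in the system test framework; this
class and its names are hypothetical):

  // Hypothetical sketch: build "topic_0001,...,topic_0020" for 20 topics.
  public class TopicStringDemo {
      public static void main(String[] args) {
          int numTopics = 20;  // from num_topics_for_auto_generated_string
          StringBuilder topics = new StringBuilder();
          for (int i = 1; i <= numTopics; i++) {
              if (i > 1) topics.append(',');
              topics.append(String.format("topic_%04d", i));
          }
          System.out.println(topics);  // topic_0001,topic_0002,...,topic_0020
      }
  }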

 Provide a property in System Test for no. of topics and topics string will be 
 generated automatically
 -

 Key: KAFKA-647
 URL: https://issues.apache.org/jira/browse/KAFKA-647
 Project: Kafka
  Issue Type: Task
Reporter: John Fung
Assignee: John Fung
  Labels: replication-testing
 Attachments: kafka-647-v1.patch


 Currently the topics string is specified in the testcase__properties.json 
 file such as:
 testcase_9051_properties.json:
 "topic": 
 "t001,t002,t003,t004,t005,t006,t007,t008,t009,t010,t011,t012,t013,t014,t015,t016,t017,t018,t019,t020,"

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-647) Provide a property in System Test for no. of topics and topics string will be generated automatically

2012-12-10 Thread John Fung (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

John Fung updated KAFKA-647:


Status: Patch Available  (was: Open)

 Provide a property in System Test for no. of topics and topics string will be 
 generated automatically
 -

 Key: KAFKA-647
 URL: https://issues.apache.org/jira/browse/KAFKA-647
 Project: Kafka
  Issue Type: Task
Reporter: John Fung
Assignee: John Fung
  Labels: replication-testing
 Attachments: kafka-647-v1.patch


 Currently the topics string is specified in the testcase__properties.json 
 file such as:
 testcase_9051_properties.json:
 "topic": 
 "t001,t002,t003,t004,t005,t006,t007,t008,t009,t010,t011,t012,t013,t014,t015,t016,t017,t018,t019,t020,"

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (KAFKA-646) Provide aggregate stats at the high level Producer and ZookeeperConsumerConnector level

2012-12-10 Thread Swapnil Ghike (JIRA)

 [ 
https://issues.apache.org/jira/browse/KAFKA-646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Swapnil Ghike updated KAFKA-646:


Attachment: kafka-646-patch-num1-v4.patch

Fixed a typo in FetchRequestAndResponseStats mbean creation.

 Provide aggregate stats at the high level Producer and 
 ZookeeperConsumerConnector level
 ---

 Key: KAFKA-646
 URL: https://issues.apache.org/jira/browse/KAFKA-646
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Priority: Blocker
  Labels: bugs
 Fix For: 0.8

 Attachments: kafka-646-patch-num1-v1.patch, 
 kafka-646-patch-num1-v2.patch, kafka-646-patch-num1-v3.patch, 
 kafka-646-patch-num1-v4.patch


 With KAFKA-622, we measure ProducerRequestStats and 
 FetchRequestAndResponseStats at the SyncProducer and SimpleConsumer level 
 respectively. We could also aggregate them at the high level Producer and 
 ZookeeperConsumerConnector level to provide an overall sense of 
 request/response rate/size at the client level. Currently, I am not 
 completely clear about the math that might be necessary for such aggregation 
 or if metrics already provides an API for aggregating stats of the same type.
 We should also address the comments by Jun at KAFKA-622; I am copy-pasting 
 them here:
 60. What happens if we have 2 instances of Consumers with the same clientid in 
 the same jvm? Does one of them fail because it fails to register metrics? 
 Ditto for Producers.
 61. ConsumerTopicStats: What if a topic is named AllTopics? We used to handle 
 this by adding a - in topic specific stats.
 62. ZookeeperConsumerConnector: Do we need to validate groupid?
 63. ClientId: Does the clientid length need to be different from topic length?
 64. AbstractFetcherThread: When building a fetch request, do we need to pass 
 in brokerInfo as part of the client id? BrokerInfo contains the source broker 
 info and the fetch requests are always made to the source broker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (KAFKA-667) Rename .highwatermark file

2012-12-10 Thread Jay Kreps (JIRA)
Jay Kreps created KAFKA-667:
---

 Summary: Rename .highwatermark file
 Key: KAFKA-667
 URL: https://issues.apache.org/jira/browse/KAFKA-667
 Project: Kafka
  Issue Type: Improvement
Affects Versions: 0.8
Reporter: Jay Kreps
Assignee: Jay Kreps
Priority: Minor


The 0.8 branch currently has a file in each log directory called
  .highwatermark
Soon we hope to add two more files in the same format. One will hold the 
cleaner position for log deduplication, and the other will hold the flusher 
position for log flush. Each of these is sort of a highwater mark. It would 
be good to rename .highwatermark to be a little bit more intuitive when we add 
these other files. I propose:
  replication-offset-checkpoint
  flusher-offset-checkpoint
  cleaner-offset-checkpoint
replication-offset-checkpoint would replace the .highwatermark file. I am not 
making them dot files since they represent an important part of the persistent 
state and so the user should see them. Also, shell * doesn't match hidden 
files, so if you did something like cp my_log/* my_backup_log/ you would not 
get the corresponding .highwatermark file.
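
To make the globbing point concrete, a hypothetical shell session (file names
invented for illustration):

  $ ls -a my_log
  .  ..  .highwatermark  00000000000000000000.log
  $ cp my_log/* my_backup_log/
  $ ls -a my_backup_log
  .  ..  00000000000000000000.log

The .highwatermark file is silently left behind, which is exactly the copy
problem described above.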

I am filing this bug now because it might be nice to just make this trivial 
change now and avoid having to handle backwards compatibility later. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Objections to doing this in 0.8?

2012-12-10 Thread Jay Kreps
https://issues.apache.org/jira/browse/KAFKA-667

Goal would just be forward compatibility with a more sane naming scheme...

-Jay


Re: Objections to doing this in 0.8?

2012-12-10 Thread Neha Narkhede
Prefer doing this now than later.

Thanks,
Neha

On Mon, Dec 10, 2012 at 3:05 PM, Jay Kreps jay.kr...@gmail.com wrote:
 https://issues.apache.org/jira/browse/KAFKA-667

 Goal would just be forward compatibility with a more sane naming scheme...

 -Jay


[jira] [Commented] (KAFKA-513) Add state change log to Kafka brokers

2012-12-10 Thread Jay Kreps (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528432#comment-13528432
 ] 

Jay Kreps commented on KAFKA-513:
-

It would be nice if we updated our log4j.properties as part of this ticket so 
that this log went to a different log file (and not to console) since this is 
meant for debugging and will confuse everyone except for Neha :-). Would 
probably make it easier to read the state transitions too...
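
If it helps, a sketch of the kind of log4j.properties fragment that would route
such a log to its own file and off the console (the logger name
state.change.logger and the file name are assumptions, not what the patch does):

  log4j.appender.stateChangeAppender=org.apache.log4j.DailyRollingFileAppender
  log4j.appender.stateChangeAppender.DatePattern='.'yyyy-MM-dd-HH
  log4j.appender.stateChangeAppender.File=logs/state-change.log
  log4j.appender.stateChangeAppender.layout=org.apache.log4j.PatternLayout
  log4j.appender.stateChangeAppender.layout.ConversionPattern=[%d] %p %m (%c)%n

  # Route the state change logger to its own file; additivity=false keeps it
  # from also flowing to the root logger (and hence the console).
  log4j.logger.state.change.logger=TRACE, stateChangeAppender
  log4j.additivity.state.change.logger=false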

 Add state change log to Kafka brokers
 -

 Key: KAFKA-513
 URL: https://issues.apache.org/jira/browse/KAFKA-513
 Project: Kafka
  Issue Type: Sub-task
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Swapnil Ghike
Priority: Blocker
  Labels: replication, tools
 Fix For: 0.8

   Original Estimate: 96h
  Remaining Estimate: 96h

 Once KAFKA-499 is checked in, every controller to broker communication can be 
 modelled as a state change for one or more partitions. Every state change 
 request will carry the controller epoch. If there is a problem with the state 
 of some partitions, it will be good to have a tool that can create a timeline 
 of requested and completed state changes. This will require each broker to 
 output a state change log that has entries like
 [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() for 
 partition [foo, 0] from controller 2, epoch 1
 [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() for 
 partition [foo, 0] from controller 2, epoch 1
 On controller, this will look like -
 [2012-09-10 10:06:17,198] controller 2, epoch 1, initiated state change 
 request LeaderAndIsr() for partition [foo, 0]
 We need a tool that can collect the state change log from all brokers and 
 create a per-partition timeline of state changes -
 [foo, 0]
 [2012-09-10 10:06:17,198] controller 2, epoch 1 initiated state change 
 request LeaderAndIsr() 
 [2012-09-10 10:06:17,280] broker 1 received request LeaderAndIsr() from 
 controller 2, epoch 1
 [2012-09-10 10:06:17,350] broker 1 completed request LeaderAndIsr() from 
 controller 2, epoch 1
 This JIRA involves adding the state change log to each broker and adding the 
 tool to create the timeline 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-646) Provide aggregate stats at the high level Producer and ZookeeperConsumerConnector level

2012-12-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-646?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528559#comment-13528559
 ] 

Jun Rao commented on KAFKA-646:
---

Thanks for patch v4. Looks good overall. A few minor comments:

40. GroupId,ClientId: The validation code is identical. Could we combine them 
into one utility? We can throw a generic InvalidConfigurationException with the 
right text.

41. The patch doesn't apply cleanly because of changes in 
system_test/testcase_to_run.json. Do you actually intend to change this file?
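
A possible shape for the combined utility suggested in 40, with the legal
character set assumed purely for illustration (the real rules, names, and
exception type belong to the patch):

  import java.util.regex.Pattern;

  // Hypothetical combined validator for clientId/groupId strings.
  public final class ConfigIds {
      // Assumed legal character set; the actual rules live in the patch.
      private static final Pattern LEGAL = Pattern.compile("[a-zA-Z0-9._-]*");

      private ConfigIds() {}

      // Stateless and uses only locals, so no synchronization is needed
      // (which also answers the earlier question about synchronized validate()).
      // A generic InvalidConfigurationException could replace the exception here.
      public static void validate(String kind, String id) {
          if (id == null || !LEGAL.matcher(id).matches())
              throw new IllegalArgumentException(
                      "Invalid " + kind + " '" + id + "': illegal characters");
      }
  }

Callers would then share one code path, e.g. ConfigIds.validate("clientid", id)
and ConfigIds.validate("groupid", id).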

 Provide aggregate stats at the high level Producer and 
 ZookeeperConsumerConnector level
 ---

 Key: KAFKA-646
 URL: https://issues.apache.org/jira/browse/KAFKA-646
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Swapnil Ghike
Assignee: Swapnil Ghike
Priority: Blocker
  Labels: bugs
 Fix For: 0.8

 Attachments: kafka-646-patch-num1-v1.patch, 
 kafka-646-patch-num1-v2.patch, kafka-646-patch-num1-v3.patch, 
 kafka-646-patch-num1-v4.patch


 With KAFKA-622, we measure ProducerRequestStats and 
 FetchRequestAndResponseStats at the SyncProducer and SimpleConsumer level 
 respectively. We could also aggregate them at the high level Producer and 
 ZookeeperConsumerConnector level to provide an overall sense of 
 request/response rate/size at the client level. Currently, I am not 
 completely clear about the math that might be necessary for such aggregation 
 or if metrics already provides an API for aggregating stats of the same type.
 We should also address the comments by Jun at KAFKA-622; I am copy-pasting 
 them here:
 60. What happens if we have 2 instances of Consumers with the same clientid in 
 the same jvm? Does one of them fail because it fails to register metrics? 
 Ditto for Producers.
 61. ConsumerTopicStats: What if a topic is named AllTopics? We used to handle 
 this by adding a - in topic specific stats.
 62. ZookeeperConsumerConnector: Do we need to validate groupid?
 63. ClientId: Does the clientid length need to be different from topic length?
 64. AbstractFetcherThread: When building a fetch request, do we need to pass 
 in brokerInfo as part of the client id? BrokerInfo contains the source broker 
 info and the fetch requests are always made to the source broker.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: Objections to doing this in 0.8?

2012-12-10 Thread Jun Rao
That sounds good to me.

Thanks,

Jun

On Mon, Dec 10, 2012 at 3:05 PM, Jay Kreps jay.kr...@gmail.com wrote:

 https://issues.apache.org/jira/browse/KAFKA-667

 Goal would just be forward compatibility with a more sane naming scheme...

 -Jay



[jira] [Commented] (KAFKA-581) provides windows batch script for starting Kafka/Zookeeper

2012-12-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528688#comment-13528688
 ] 

Jun Rao commented on KAFKA-581:
---

Thanks for the patch. Committed the stop scripts to 0.8.

 provides windows batch script for starting Kafka/Zookeeper
 --

 Key: KAFKA-581
 URL: https://issues.apache.org/jira/browse/KAFKA-581
 Project: Kafka
  Issue Type: Improvement
  Components: config
Affects Versions: 0.8
 Environment: Windows
Reporter: antoine vianey
Priority: Trivial
  Labels: features, run, windows
 Fix For: 0.8

 Attachments: kafka-console-consumer.bat, kafka-console-producer.bat, 
 kafka-run-class.bat, kafka-server-start.bat, kafka-server-stop.bat, sbt.bat, 
 zookeeper-server-start.bat, zookeeper-server-stop.bat

   Original Estimate: 24h
  Remaining Estimate: 24h

 Provide a port for quickstarting Kafka dev on Windows:
 - kafka-run-class.bat
 - kafka-server-start.bat
 - zookeeper-server-start.bat
 This will help the Kafka community grow.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-374) Move to java CRC32 implementation

2012-12-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528690#comment-13528690
 ] 

Jun Rao commented on KAFKA-374:
---

Should this be a post-0.8 item?

 Move to java CRC32 implementation
 -

 Key: KAFKA-374
 URL: https://issues.apache.org/jira/browse/KAFKA-374
 Project: Kafka
  Issue Type: New Feature
  Components: core
Affects Versions: 0.8
Reporter: Jay Kreps
Priority: Minor
  Labels: newbie
 Attachments: KAFKA-374-draft.patch, KAFKA-374.patch


 We keep a per-record crc32. This is a fairly cheap algorithm, but the Java 
 implementation uses JNI and it seems to be a bit expensive for small records. 
 I have seen this before in Kafka profiles, and I noticed it in another 
 application I was working on. Basically, with small records the native 
 implementation can only checksum around 100MB/sec. Hadoop has done some 
 analysis of this and replaced it with a Java implementation that is 2x faster 
 for large values and 5-10x faster for small values. Details are in HADOOP-6148.
 We should do a quick read/write benchmark on log and message set iteration 
 and see if this improves things.
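
For anyone who wants to sanity-check the small-record numbers, a minimal probe
of the built-in java.util.zip.CRC32 (record size and iteration count are
arbitrary choices, not from the issue):

  import java.util.zip.CRC32;

  public class CrcBench {
      public static void main(String[] args) {
          byte[] record = new byte[100];     // small record, per the discussion
          int iterations = 10000000;         // hypothetical iteration count
          CRC32 crc = new CRC32();
          long start = System.nanoTime();
          for (int i = 0; i < iterations; i++) {
              crc.reset();
              crc.update(record, 0, record.length);
          }
          double seconds = (System.nanoTime() - start) / 1e9;
          double mbPerSec =
                  (double) record.length * iterations / (1024.0 * 1024.0) / seconds;
          System.out.printf("%d-byte records: %.1f MB/sec%n", record.length, mbPerSec);
      }
  }

Swapping in a candidate pure-Java implementation for the CRC32 class would give
the comparison the issue asks for.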

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-664) Kafka server threads die due to OOME during long running test

2012-12-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528707#comment-13528707
 ] 

Jun Rao commented on KAFKA-664:
---

If the problem is due to an expired request not being removed from the request 
LinkedList in the watcher, then there should be at most 1 such outstanding 
request per topic/partition. So, if the number of topic/partitions is fixed, the 
memory space taken by those outstanding requests should be bounded too, right? 
Not sure why this causes memory usage to keep going up.

 Kafka server threads die due to OOME during long running test
 -

 Key: KAFKA-664
 URL: https://issues.apache.org/jira/browse/KAFKA-664
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Jay Kreps
Priority: Blocker
  Labels: bugs
 Fix For: 0.8

 Attachments: kafka-664-draft-2.patch, kafka-664-draft.patch, Screen 
 Shot 2012-12-09 at 11.22.50 AM.png, Screen Shot 2012-12-09 at 11.23.09 
 AM.png, Screen Shot 2012-12-09 at 11.31.29 AM.png, thread-dump.log, 
 watchersForKey.png


 I set up a Kafka cluster with 5 brokers (JVM memory 512M) and set up a long 
 running producer process that sends data to 100s of partitions continuously 
 for ~15 hours. After ~4 hours of operation, a few server threads (acceptor and 
 processor) exited due to OOME -
 [2012-12-07 08:24:44,355] ERROR OOME with size 1700161893 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:44,356] ERROR Uncaught exception in thread 
 'kafka-acceptor': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:44,356] ERROR Uncaught exception in thread 
 'kafka-processor-9092-1': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:46,344] INFO Unable to reconnect to ZooKeeper service, 
 session 0x13afd0753870103 has expired, closing socket connection 
 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:24:46,344] INFO zookeeper state changed (Expired) 
 (org.I0Itec.zkclient.ZkClient)
 [2012-12-07 08:24:46,344] INFO Initiating client connection, 
 connectString=eat1-app309.corp:12913,eat1-app310.corp:12913,eat1-app311.corp:12913,eat1-app312.corp:12913,eat1-app313.corp:12913
  sessionTimeout=15000 watcher=org.I0Itec.zkclient.ZkClient@19202d69 
 (org.apache.zookeeper.ZooKeeper)
 [2012-12-07 08:24:55,702] ERROR OOME with size 2001040997 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:01,192] ERROR Uncaught exception in thread 
 'kafka-request-handler-0': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:08,739] INFO Opening socket connection to server 
 eat1-app311.corp/172.20.72.75:12913 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:14,221] INFO Socket connection established to 
 eat1-app311.corp/172.20.72.75:12913, initiating session 
 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:17,943] INFO Client session timed out, have not heard from 
 server in 3722ms for sessionid 0x0, closing socket connection and attempting 
 reconnect (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:19,805] ERROR error in loggedRunnable (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:23,528] ERROR OOME with size 1853095936 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 It seems like it runs out of memory while trying to read the producer 
 request, but it's unclear so far. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (KAFKA-664) Kafka server threads die due to OOME during long running test

2012-12-10 Thread Jun Rao (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528723#comment-13528723
 ] 

Jun Rao commented on KAFKA-664:
---

Got it. So the issue is that for low volume topics, the fetch requests made by 
the followers keep getting timed out. Those timed-out requests won't be removed 
from the request LinkedList in the watcher until the next produce request for 
that topic comes, which could be a long time.
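
A standalone sketch of the purge idea (not Kafka's actual purgatory code): walk
each watcher list periodically and drop satisfied or expired entries, rather
than waiting for the next produce request to touch the list.

  import java.util.Iterator;
  import java.util.LinkedList;

  class DelayedRequest {
      final long expiresAtMs;
      volatile boolean satisfied;
      DelayedRequest(long expiresAtMs) { this.expiresAtMs = expiresAtMs; }
  }

  class Watcher {
      private final LinkedList<DelayedRequest> requests = new LinkedList<DelayedRequest>();

      synchronized void add(DelayedRequest r) { requests.add(r); }

      // Drop satisfied or expired entries; without a sweep like this, an expired
      // fetch request for a quiet topic lingers until the next produce arrives.
      synchronized int purgeExpired(long nowMs) {
          int purged = 0;
          Iterator<DelayedRequest> it = requests.iterator();
          while (it.hasNext()) {
              DelayedRequest r = it.next();
              if (r.satisfied || r.expiresAtMs <= nowMs) {
                  it.remove();
                  purged++;
              }
          }
          return purged;
      }
  }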

 Kafka server threads die due to OOME during long running test
 -

 Key: KAFKA-664
 URL: https://issues.apache.org/jira/browse/KAFKA-664
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Jay Kreps
Priority: Blocker
  Labels: bugs
 Fix For: 0.8

 Attachments: kafka-664-draft-2.patch, kafka-664-draft.patch, Screen 
 Shot 2012-12-09 at 11.22.50 AM.png, Screen Shot 2012-12-09 at 11.23.09 
 AM.png, Screen Shot 2012-12-09 at 11.31.29 AM.png, thread-dump.log, 
 watchersForKey.png


 I set up a Kafka cluster with 5 brokers (JVM memory 512M) and set up a long 
 running producer process that sends data to 100s of partitions continuously 
 for ~15 hours. After ~4 hours of operation, a few server threads (acceptor and 
 processor) exited due to OOME -
 [2012-12-07 08:24:44,355] ERROR OOME with size 1700161893 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:44,356] ERROR Uncaught exception in thread 
 'kafka-acceptor': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:44,356] ERROR Uncaught exception in thread 
 'kafka-processor-9092-1': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:46,344] INFO Unable to reconnect to ZooKeeper service, 
 session 0x13afd0753870103 has expired, closing socket connection 
 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:24:46,344] INFO zookeeper state changed (Expired) 
 (org.I0Itec.zkclient.ZkClient)
 [2012-12-07 08:24:46,344] INFO Initiating client connection, 
 connectString=eat1-app309.corp:12913,eat1-app310.corp:12913,eat1-app311.corp:12913,eat1-app312.corp:12913,eat1-app313.corp:12913
  sessionTimeout=15000 watcher=org.I0Itec.zkclient.ZkClient@19202d69 
 (org.apache.zookeeper.ZooKeeper)
 [2012-12-07 08:24:55,702] ERROR OOME with size 2001040997 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:01,192] ERROR Uncaught exception in thread 
 'kafka-request-handler-0': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:08,739] INFO Opening socket connection to server 
 eat1-app311.corp/172.20.72.75:12913 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:14,221] INFO Socket connection established to 
 eat1-app311.corp/172.20.72.75:12913, initiating session 
 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:17,943] INFO Client session timed out, have not heard from 
 server in 3722ms for sessionid 0x0, closing socket connection and attempting 
 reconnect (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:19,805] ERROR error in loggedRunnable (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:23,528] ERROR OOME with size 1853095936 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 It seems like it runs out of memory while trying to read the producer 
 request, but it's unclear so far. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Comment Edited] (KAFKA-664) Kafka server threads die due to OOME during long running test

2012-12-10 Thread Neha Narkhede (JIRA)

[ 
https://issues.apache.org/jira/browse/KAFKA-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13528733#comment-13528733
 ] 

Neha Narkhede edited comment on KAFKA-664 at 12/11/12 6:45 AM:
---

That's correct. I'm tempted to check in v2 for now and wait for the purgatory 
refactor patch. Until then, we can probably keep the JIRA open. Thoughts?

  was (Author: nehanarkhede):
That's correct.
  
 Kafka server threads die due to OOME during long running test
 -

 Key: KAFKA-664
 URL: https://issues.apache.org/jira/browse/KAFKA-664
 Project: Kafka
  Issue Type: Bug
Affects Versions: 0.8
Reporter: Neha Narkhede
Assignee: Jay Kreps
Priority: Blocker
  Labels: bugs
 Fix For: 0.8

 Attachments: kafka-664-draft-2.patch, kafka-664-draft.patch, Screen 
 Shot 2012-12-09 at 11.22.50 AM.png, Screen Shot 2012-12-09 at 11.23.09 
 AM.png, Screen Shot 2012-12-09 at 11.31.29 AM.png, thread-dump.log, 
 watchersForKey.png


 I set up a Kafka cluster with 5 brokers (JVM memory 512M) and set up a long 
 running producer process that sends data to 100s of partitions continuously 
 for ~15 hours. After ~4 hours of operation, a few server threads (acceptor and 
 processor) exited due to OOME -
 [2012-12-07 08:24:44,355] ERROR OOME with size 1700161893 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:44,356] ERROR Uncaught exception in thread 
 'kafka-acceptor': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:44,356] ERROR Uncaught exception in thread 
 'kafka-processor-9092-1': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:24:46,344] INFO Unable to reconnect to ZooKeeper service, 
 session 0x13afd0753870103 has expired, closing socket connection 
 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:24:46,344] INFO zookeeper state changed (Expired) 
 (org.I0Itec.zkclient.ZkClient)
 [2012-12-07 08:24:46,344] INFO Initiating client connection, 
 connectString=eat1-app309.corp:12913,eat1-app310.corp:12913,eat1-app311.corp:12913,eat1-app312.corp:12913,eat1-app313.corp:12913
  sessionTimeout=15000 watcher=org.I0Itec.zkclient.ZkClient@19202d69 
 (org.apache.zookeeper.ZooKeeper)
 [2012-12-07 08:24:55,702] ERROR OOME with size 2001040997 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:01,192] ERROR Uncaught exception in thread 
 'kafka-request-handler-0': (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:08,739] INFO Opening socket connection to server 
 eat1-app311.corp/172.20.72.75:12913 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:14,221] INFO Socket connection established to 
 eat1-app311.corp/172.20.72.75:12913, initiating session 
 (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:17,943] INFO Client session timed out, have not heard from 
 server in 3722ms for sessionid 0x0, closing socket connection and attempting 
 reconnect (org.apache.zookeeper.ClientCnxn)
 [2012-12-07 08:25:19,805] ERROR error in loggedRunnable (kafka.utils.Utils$)
 java.lang.OutOfMemoryError: Java heap space
 [2012-12-07 08:25:23,528] ERROR OOME with size 1853095936 
 (kafka.network.BoundedByteBufferReceive)
 java.lang.OutOfMemoryError: Java heap space
 It seems like it runs out of memory while trying to read the producer 
 request, but it's unclear so far. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira