[jira] [Updated] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Ozeritskiy updated KAFKA-1414:
- Attachment: parallel-dir-loading-trunk.patch, parallel-dir-loading-0.8.patch

Speedup broker startup after hard reset
---
Key: KAFKA-1414
URL: https://issues.apache.org/jira/browse/KAFKA-1414
Project: Kafka
Issue Type: Improvement
Components: log
Affects Versions: 0.8.2, 0.9.0, 0.8.1.1
Reporter: Dmitry Bugaychenko
Assignee: Jay Kreps
Attachments: parallel-dir-loading-0.8.patch, parallel-dir-loading-trunk.patch

After a hard reset due to power failure, the broker takes far too much time recovering unflushed segments in a single thread. This could easily be improved by launching multiple threads (one per data directory, assuming that each data directory is typically on a dedicated drive). Locally we tried this simple patch to LogManager.loadLogs and it seems to work; however, I'm too new to Scala, so do not take it literally:
{code}
/**
 * Recover and load all logs in the given data directories
 */
private def loadLogs(dirs: Seq[File]) {
  val threads: Array[Thread] = new Array[Thread](dirs.size)
  var i: Int = 0
  val me = this
  for (dir <- dirs) {
    val thread = new Thread(new Runnable {
      def run() {
        val recoveryPoints = me.recoveryPointCheckpoints(dir).read
        /* load the logs */
        val subDirs = dir.listFiles()
        if (subDirs != null) {
          val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
          if (cleanShutDownFile.exists())
            info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
          for (dir <- subDirs) {
            if (dir.isDirectory) {
              info("Loading log '" + dir.getName + "'")
              val topicPartition = Log.parseTopicPartitionName(dir.getName)
              val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
              val log = new Log(dir, config, recoveryPoints.getOrElse(topicPartition, 0L), scheduler, time)
              val previous = addLogWithLock(topicPartition, log)
              if (previous != null)
                throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
            }
          }
          cleanShutDownFile.delete()
        }
      }
    })
    thread.start()
    threads(i) = thread
    i = i + 1
  }
  for (thread <- threads) {
    thread.join()
  }
}

def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
  logCreationOrDeletionLock synchronized {
    this.logs.put(topicPartition, log)
  }
}
{code}
--
This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Updated] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexey Ozeritskiy updated KAFKA-1414:
- Affects Version/s: (was: 0.8.1) 0.9.0, 0.8.2, 0.8.1.1
Status: Patch Available (was: Open)
[jira] [Commented] (KAFKA-1521) Producer Graceful Shutdown issue in Container (Kafka version 0.8.x.x)
[ https://issues.apache.org/jira/browse/KAFKA-1521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053139#comment-14053139 ] Jun Rao commented on KAFKA-1521: In general, we probably can't just shut down Metrics on producer close; there could be other producers/consumers or other usage of Metrics in the same JVM. Those metrics-meter-tick threads seem to be created globally, not per producer instance, so they should be terminated when the JVM exits. Metrics does register a shutdown hook to achieve that. If your container somehow overrides those shutdown hooks, you can explicitly call the Metrics shutdown hook on JVM exit.

Producer Graceful Shutdown issue in Container (Kafka version 0.8.x.x)
-
Key: KAFKA-1521
URL: https://issues.apache.org/jira/browse/KAFKA-1521
Project: Kafka
Issue Type: Bug
Components: producer
Affects Versions: 0.8.0, 0.8.1.1
Environment: Tomcat Container or any other J2EE container
Reporter: Bhavesh Mistry
Assignee: Jun Rao
Priority: Minor

Hi Kafka Team,
We are running multiple webapps in a Tomcat container, and we have producers that are managed by a ServletContextListener (lifecycle). Upon contextInitialized we create the producer, and on contextDestroyed we call producer.close(), but the underlying Metrics library does not shut down, so we have a thread leak. I had to call Metrics.defaultRegistry().shutdown() to resolve this issue. Is this a known issue? I know the Metrics library has a JVM shutdown hook, but it will not be invoked, since the container thread is un-deploying the web app: the class loader goes away and the leaked thread can no longer find the underlying Kafka classes. Because of this, Tomcat does not shut down gracefully. Are you planning to un-register metrics when producer close is called, or shut down the Metrics pool for the client.id?
Here are the logs:
SEVERE: The web application [ ] appears to have started a thread named [metrics-meter-tick-thread-1] but has failed to stop it. This is very likely to create a memory leak.
SEVERE: The web application [ ] appears to have started a thread named [metrics-meter-tick-thread-2] but has failed to stop it. This is very likely to create a memory leak.
Thanks, Bhavesh
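The workaround described above (calling the registry shutdown explicitly from the listener) can be sketched as follows. This is a minimal, self-contained model, not the real API: MetricsRegistry.shutdown() stands in for com.yammer.metrics.Metrics.defaultRegistry().shutdown(), and contextDestroyed() stands in for the ServletContextListener callback.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the Yammer Metrics registry; in the real library,
// shutdown() stops the global metrics-meter-tick threads.
class MetricsRegistry {
    final AtomicBoolean running = new AtomicBoolean(true);

    void shutdown() {
        running.set(false);
    }
}

// Hypothetical stand-in for a ServletContextListener managing the producer.
class ProducerLifecycleListener {
    private final MetricsRegistry metrics;

    ProducerLifecycleListener(MetricsRegistry metrics) {
        this.metrics = metrics;
    }

    void contextDestroyed() {
        // producer.close() would go here; it does not stop the global metrics
        // threads. The library's JVM shutdown hook never fires on a webapp
        // undeploy, so invoke the registry shutdown explicitly.
        metrics.shutdown();
    }
}

public class Main {
    public static void main(String[] args) {
        MetricsRegistry registry = new MetricsRegistry();
        new ProducerLifecycleListener(registry).contextDestroyed();
        System.out.println(registry.running.get()); // false: metrics stopped
    }
}
```

The point of the sketch is only the ordering: close the producer first, then shut the metrics registry down from the same undeploy callback, before the webapp's class loader is discarded.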
Re: topic's partition have no leader and isr
Also, which version of Kafka are you using? Thanks, Jun

On Thu, Jul 3, 2014 at 2:26 AM, 鞠大升 dashen...@gmail.com wrote:
hi, all
I have a topic with 32 partitions; after some reassign operations, 2 partitions ended up with no leader and no ISR.
---
Topic:org.mobile_nginx  PartitionCount:32  ReplicationFactor:1  Configs:
Topic: org.mobile_nginx  Partition: 0   Leader: 3   Replicas: 3    Isr: 3
Topic: org.mobile_nginx  Partition: 1   Leader: 4   Replicas: 4    Isr: 4
Topic: org.mobile_nginx  Partition: 2   Leader: 5   Replicas: 5    Isr: 5
Topic: org.mobile_nginx  Partition: 3   Leader: 6   Replicas: 6    Isr: 6
Topic: org.mobile_nginx  Partition: 4   Leader: 3   Replicas: 3    Isr: 3
Topic: org.mobile_nginx  Partition: 5   Leader: 4   Replicas: 4    Isr: 4
Topic: org.mobile_nginx  Partition: 6   Leader: 5   Replicas: 5    Isr: 5
Topic: org.mobile_nginx  Partition: 7   Leader: 6   Replicas: 6    Isr: 6
Topic: org.mobile_nginx  Partition: 8   Leader: 3   Replicas: 3    Isr: 3
Topic: org.mobile_nginx  Partition: 9   Leader: 4   Replicas: 4    Isr: 4
Topic: org.mobile_nginx  Partition: 10  Leader: 2   Replicas: 1    Isr: 2
Topic: org.mobile_nginx  Partition: 11  Leader: 2   Replicas: 2    Isr: 2
Topic: org.mobile_nginx  Partition: 12  Leader: 3   Replicas: 1    Isr: 3
Topic: org.mobile_nginx  Partition: 13  Leader: 2   Replicas: 2    Isr: 2
Topic: org.mobile_nginx  Partition: 14  Leader: 4   Replicas: 4    Isr: 4
Topic: org.mobile_nginx  Partition: 15  Leader: 2   Replicas: 2    Isr: 2
Topic: org.mobile_nginx  Partition: 16  Leader: 4   Replicas: 4    Isr: 4
Topic: org.mobile_nginx  Partition: 17  Leader: 5   Replicas: 5    Isr: 5
Topic: org.mobile_nginx  Partition: 18  Leader: 6   Replicas: 6    Isr: 6
Topic: org.mobile_nginx  Partition: 19  Leader: 5   Replicas: 5    Isr: 5
Topic: org.mobile_nginx  Partition: 20  Leader: 2   Replicas: 2    Isr: 2
Topic: org.mobile_nginx  Partition: 21  Leader: 3   Replicas: 3    Isr: 3
Topic: org.mobile_nginx  Partition: 22  Leader: 4   Replicas: 4    Isr: 4
Topic: org.mobile_nginx  Partition: 23  Leader: 5   Replicas: 5    Isr: 5
Topic: org.mobile_nginx  Partition: 24  Leader: 6   Replicas: 6    Isr: 6
Topic: org.mobile_nginx  Partition: 25  Leader: -1  Replicas: 6,1  Isr:
Topic: org.mobile_nginx  Partition: 26  Leader: 2   Replicas: 2    Isr: 2
Topic: org.mobile_nginx  Partition: 27  Leader: 3   Replicas: 3    Isr: 3
Topic: org.mobile_nginx  Partition: 28  Leader: 4   Replicas: 4    Isr: 4
Topic: org.mobile_nginx  Partition: 29  Leader: 5   Replicas: 5    Isr: 5
Topic: org.mobile_nginx  Partition: 30  Leader: 6   Replicas: 6    Isr: 6
Topic: org.mobile_nginx  Partition: 31  Leader: -1  Replicas: 3,1  Isr:
---
Partition 25 and partition 31 have no leader and no ISR. No matter what reassign or leader-election operation we run, we cannot reduce the replica count, and we have not been able to elect a leader for 4 days. Does anyone have any idea how to resolve this problem?
--
dashengju
+86 13810875910
dashen...@gmail.com
[jira] [Commented] (KAFKA-1517) Messages is a required argument to Producer Performance Test
[ https://issues.apache.org/jira/browse/KAFKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053141#comment-14053141 ] Jun Rao commented on KAFKA-1517: Yes, I think it would be better if we document it as required and get rid of the default.

Messages is a required argument to Producer Performance Test
Key: KAFKA-1517
URL: https://issues.apache.org/jira/browse/KAFKA-1517
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.1.1
Reporter: Daniel Compton
Priority: Trivial

When running the producer performance test without providing a messages argument, you get an error:
{noformat}
$ bin/kafka-producer-perf-test.sh --topics mirrormirror --broker-list kafka-dc21:9092
Missing required argument [messages]
Option      Description
------      -----------
..
--messages  The number of messages to send or consume (default: 9223372036854775807)
{noformat}
However, the [shell command documentation|https://github.com/apache/kafka/blob/c66e408b244de52f1c5c5bbd7627aa1f028f9a87/perf/src/main/scala/kafka/perf/PerfConfig.scala#L25] doesn't say that this is required, and implies that [2^63-1|http://en.wikipedia.org/wiki/9223372036854775807] (Long.MaxValue) messages will be sent. It should probably look like the [ConsoleProducer|https://github.com/apache/kafka/blob/c66e408b244de52f1c5c5bbd7627aa1f028f9a87/core/src/main/scala/kafka/producer/ConsoleProducer.scala#L32] and prefix the documentation with REQUIRED, or we should make this a non-required argument and set the default value to something sane like 100,000 messages. Which option is preferable for this?
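The "document as required, drop the default" option amounts to failing fast with a clear message instead of silently defaulting to Long.MaxValue. A minimal sketch of that behavior, with a hypothetical parseMessages helper and plain-map parsing (the real tool uses joptsimple):

```java
import java.util.HashMap;
import java.util.Map;

public class PerfArgs {
    // Hypothetical helper: reject a missing --messages argument outright,
    // mirroring the tool's existing error message, rather than defaulting
    // to Long.MaxValue (9223372036854775807).
    static long parseMessages(Map<String, String> opts) {
        String v = opts.get("messages");
        if (v == null)
            throw new IllegalArgumentException("Missing required argument [messages]");
        return Long.parseLong(v);
    }

    public static void main(String[] args) {
        Map<String, String> opts = new HashMap<>();
        opts.put("messages", "100000");
        System.out.println(parseMessages(opts)); // 100000
    }
}
```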
Review Request 23291: Patch for KAFKA-1528
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23291/ --- Review request for kafka. Bugs: KAFKA-1528 https://issues.apache.org/jira/browse/KAFKA-1528 Repository: kafka Description --- See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ See https://help.github.com/articles/dealing-with-line-endings Normalize line endings. Contents the same. Diffs - .gitattributes PRE-CREATION bin/windows/kafka-run-class.bat f4d2904a3320ae756ed4171f6e99566f1f0cf963 bin/windows/kafka-server-stop.bat b9496ce0fe7438b9df1bbdb10bd86ddefab27ff4 bin/windows/zookeeper-server-stop.bat 29bdee81ac3ac4a96b081ac369d8f6d85c237747 Diff: https://reviews.apache.org/r/23291/diff/ Testing --- Thanks, Evgeny Vereshchagin
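For reference, a .gitattributes along the lines this patch describes might look like the sketch below (the exact file is in the attached diff; this is only an illustration of the attributes involved):

```
# Auto-detect text files and normalize their line endings to LF in the repository
* text=auto

# Windows batch scripts must keep CRLF endings in the working tree
*.bat text eol=crlf

# Treat Markdown as text so it is normalized too
*.md text
```

With this in place, Git stores LF in the repository for all detected text files while checking the .bat scripts out with CRLF, which is what the Windows tooling expects.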
[jira] [Updated] (KAFKA-1528) Normalize all the line endings
[ https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Evgeny Vereshchagin updated KAFKA-1528:
--- Attachment: KAFKA-1528.patch

Normalize all the line endings
--
Key: KAFKA-1528
URL: https://issues.apache.org/jira/browse/KAFKA-1528
Project: Kafka
Issue Type: Improvement
Reporter: Evgeny Vereshchagin
Priority: Trivial
Attachments: KAFKA-1528.patch

Hi! I added a .gitattributes file and removed all '\r' from some .bat files. See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ and https://help.github.com/articles/dealing-with-line-endings for an explanation. Maybe also add https://help.github.com/articles/dealing-with-line-endings to the contributing/committing guide?
[jira] [Created] (KAFKA-1528) Normalize all the line endings
Evgeny Vereshchagin created KAFKA-1528:
-- Summary: Normalize all the line endings
Key: KAFKA-1528
URL: https://issues.apache.org/jira/browse/KAFKA-1528
Project: Kafka
Issue Type: Improvement
Reporter: Evgeny Vereshchagin
Priority: Trivial
Attachments: KAFKA-1528.patch
[jira] [Commented] (KAFKA-1528) Normalize all the line endings
[ https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053142#comment-14053142 ] Evgeny Vereshchagin commented on KAFKA-1528: Created reviewboard https://reviews.apache.org/r/23291/diff/ against branch trunk
[jira] [Assigned] (KAFKA-1519) Console consumer: expose configuration option to enable/disable writing the line separator
[ https://issues.apache.org/jira/browse/KAFKA-1519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jun Rao reassigned KAFKA-1519:
-- Assignee: Gwen Shapira (was: Neha Narkhede)

Console consumer: expose configuration option to enable/disable writing the line separator
--
Key: KAFKA-1519
URL: https://issues.apache.org/jira/browse/KAFKA-1519
Project: Kafka
Issue Type: Improvement
Components: consumer
Affects Versions: 0.8.1.1
Reporter: Michael Noll
Assignee: Gwen Shapira
Priority: Minor
Attachments: KAFKA-1519.patch

The current console consumer includes a {{DefaultMessageFormatter}}, which exposes a few user-configurable options that can be set on the command line via --property, e.g. --property line.separator=XYZ. Unfortunately, the current implementation does not allow the user to completely disable writing any such line separator. This functionality would be helpful to enable users to capture data as-is from a Kafka topic to a snapshot file. Capturing data as-is -- without an artificial line separator -- is particularly nice for data in a binary format (including Avro).
*No workaround*
A potential workaround would be to pass an empty string as the property value of line.separator, but this doesn't work in the current implementation. The following variants throw an "Invalid parser arguments" exception:
{code}
--property line.separator=    # nothing
--property line.separator=""  # double quotes
--property line.separator=''  # single quotes
{code}
Escape tricks via a backslash don't work either. If there actually is a workaround, please let me know.
*How to fix*
We can introduce a print.line option to enable/disable writing line.separator, similar to how the code already uses print.key to enable/disable writing key.separator. This change is trivial. To preserve backwards compatibility, the print.line option would default to true (unlike the print.key option, which defaults to false).
*Alternatives*
Apart from modifying the built-in {{DefaultMessageFormatter}}, users could of course implement their own custom {{MessageFormatter}}. But given that it's a) a trivial change to the {{DefaultMessageFormatter}} and b) a nice user feature, I'd say changing the built-in {{DefaultMessageFormatter}} is the better approach. This way, Kafka would support writing data as-is to a file out of the box.
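The proposed print.line behavior can be sketched as follows. This is a Java stand-in for illustration only (the real DefaultMessageFormatter is Scala, and its exact interface may differ); the property names follow the proposal above.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.util.Properties;

class SketchFormatter {
    // Defaults preserve backwards compatibility: the separator is written
    // unless --property print.line=false is passed.
    private boolean printLine = true;
    private byte[] lineSeparator = "\n".getBytes();

    void init(Properties props) {
        if (props.containsKey("print.line"))
            printLine = Boolean.parseBoolean(props.getProperty("print.line"));
        if (props.containsKey("line.separator"))
            lineSeparator = props.getProperty("line.separator").getBytes();
    }

    void writeTo(byte[] value, OutputStream out) throws IOException {
        out.write(value);              // message bytes exactly as received
        if (printLine)
            out.write(lineSeparator);  // skipped entirely when print.line=false
    }
}

public class Main {
    public static void main(String[] args) throws IOException {
        Properties props = new Properties();
        props.setProperty("print.line", "false");
        SketchFormatter formatter = new SketchFormatter();
        formatter.init(props);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        formatter.writeTo("binary-payload".getBytes(), out);
        System.out.println(out.size()); // 14: no separator appended
    }
}
```

Because the separator write is skipped entirely rather than replaced with an empty string, binary payloads round-trip byte-for-byte, which is the snapshot use case described above.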
[jira] [Commented] (KAFKA-1528) Normalize all the line endings
[ https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053146#comment-14053146 ] Evgeny Vereshchagin commented on KAFKA-1528: Updated reviewboard against branch trunk
[jira] [Updated] (KAFKA-1528) Normalize all the line endings
[ https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Evgeny Vereshchagin updated KAFKA-1528:
--- Attachment: KAFKA-1528_2014-07-06_16:20:28.patch
Re: Review Request 23291: Patch for KAFKA-1528
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23291/ --- (Updated July 6, 2014, 4:23 p.m.) Review request for kafka. Bugs: KAFKA-1528 https://issues.apache.org/jira/browse/KAFKA-1528 Repository: kafka Description (updated) --- See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ See https://help.github.com/articles/dealing-with-line-endings Normalize line endings. Contents the same. Add text attribute to *.md files Diffs (updated) - .gitattributes PRE-CREATION bin/windows/kafka-run-class.bat f4d2904a3320ae756ed4171f6e99566f1f0cf963 bin/windows/kafka-server-stop.bat b9496ce0fe7438b9df1bbdb10bd86ddefab27ff4 bin/windows/zookeeper-server-stop.bat 29bdee81ac3ac4a96b081ac369d8f6d85c237747 Diff: https://reviews.apache.org/r/23291/diff/ Testing --- Thanks, Evgeny Vereshchagin
[jira] [Updated] (KAFKA-1528) Normalize all the line endings
[ https://issues.apache.org/jira/browse/KAFKA-1528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Evgeny Vereshchagin updated KAFKA-1528:
--- Attachment: KAFKA-1528_2014-07-06_16:23:21.patch
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053151#comment-14053151 ] Jun Rao commented on KAFKA-1414: Thanks for the patch. The compilation fails, though, since ParArray is not supported in Scala 2.8. Is it only supported in Scala 2.11?
Re: Review Request 23291: Patch for KAFKA-1528
--- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23291/#review47357 --- .gitattributes https://reviews.apache.org/r/23291/#comment83093 Thanks for the patch. Since we explicitly specify the text types, do we still need this line? - Jun Rao On July 6, 2014, 4:23 p.m., Evgeny Vereshchagin wrote: --- This is an automatically generated e-mail. To reply, visit: https://reviews.apache.org/r/23291/ --- (Updated July 6, 2014, 4:23 p.m.) Review request for kafka. Bugs: KAFKA-1528 https://issues.apache.org/jira/browse/KAFKA-1528 Repository: kafka Description --- See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/ See https://help.github.com/articles/dealing-with-line-endings Normalize line endings. Contents the same. Add text attribute to *.md files Diffs - .gitattributes PRE-CREATION bin/windows/kafka-run-class.bat f4d2904a3320ae756ed4171f6e99566f1f0cf963 bin/windows/kafka-server-stop.bat b9496ce0fe7438b9df1bbdb10bd86ddefab27ff4 bin/windows/zookeeper-server-stop.bat 29bdee81ac3ac4a96b081ac369d8f6d85c237747 Diff: https://reviews.apache.org/r/23291/diff/ Testing --- Thanks, Evgeny Vereshchagin
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053167#comment-14053167 ] Evgeny Vereshchagin commented on KAFKA-1414: No, parallel collections arrived in Scala 2.9; see the changelog: [2.9.0|http://www.scala-lang.org/download/changelog.html#2.9.0].
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053171#comment-14053171 ] Jun Rao commented on KAFKA-1414: Scala 2.8 is really old and we could drop the 2.8 support. However, I am not sure about using ParArray. It seems that the application doesn't have control over the degree of parallelism. For this particular task, the degree of parallelism should probably be based on the underlying I/O system, not just the number of CPU cores, which seems to be what the Scala library bases it on.
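Making the degree of parallelism explicit, as the comment above suggests, can be sketched with a fixed-size pool (shown in Java for illustration; the pool size here stands in for a hypothetical per-I/O-system setting rather than ParArray's CPU-derived default):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedRecovery {
    // Submit one task per data directory but bound concurrency explicitly,
    // so parallelism can be tuned to the I/O system instead of the CPU count.
    static List<String> recoverAll(List<String> dirs, int ioParallelism) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(ioParallelism);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String dir : dirs)
                futures.add(pool.submit(() -> dir + ":recovered")); // stand-in for segment recovery
            List<String> done = new ArrayList<>();
            for (Future<String> f : futures)
                done.add(f.get()); // join, preserving submission order
            return done;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(recoverAll(List.of("/data1", "/data2", "/data3"), 2));
    }
}
```

With three directories and a pool of two, at most two directories recover concurrently; raising the pool size to the directory count recovers the thread-per-directory behavior of the original patch.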
[jira] [Created] (KAFKA-1529) transient unit test failure in testAutoCreateAfterDeleteTopic
Jun Rao created KAFKA-1529:
-- Summary: transient unit test failure in testAutoCreateAfterDeleteTopic
Key: KAFKA-1529
URL: https://issues.apache.org/jira/browse/KAFKA-1529
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8.2
Reporter: Jun Rao
Assignee: Jun Rao

Saw the following transient failure.
kafka.admin.DeleteTopicTest testAutoCreateAfterDeleteTopic FAILED
    org.scalatest.junit.JUnitTestFailedError: Topic should have been auto created
        at org.scalatest.junit.AssertionsForJUnit$class.newAssertionFailedException(AssertionsForJUnit.scala:101)
        at org.scalatest.junit.JUnit3Suite.newAssertionFailedException(JUnit3Suite.scala:149)
        at org.scalatest.Assertions$class.fail(Assertions.scala:711)
        at org.scalatest.junit.JUnit3Suite.fail(JUnit3Suite.scala:149)
        at kafka.admin.DeleteTopicTest.testAutoCreateAfterDeleteTopic(DeleteTopicTest.scala:222)
[jira] [Commented] (KAFKA-1529) transient unit test failure in testAutoCreateAfterDeleteTopic
[ https://issues.apache.org/jira/browse/KAFKA-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14053174#comment-14053174 ] Jun Rao commented on KAFKA-1529: The issue is probably due to not enough retries in the producer. However, testAutoCreateAfterDeleteTopic seems redundant, since it should already be covered by testRecreateTopicAfterDeletion.
Review Request 23294: Patch for KAFKA-1529
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23294/
-----------------------------------------------------------

Review request for kafka.

Bugs: KAFKA-1529
    https://issues.apache.org/jira/browse/KAFKA-1529

Repository: kafka

Description
-------

Remove testAutoCreateAfterDeleteTopic since it is redundant.

Diffs
-----

  core/src/test/scala/unit/kafka/admin/DeleteTopicTest.scala 5d3c57a5ccf3eee52888cc5a34c2a3acc3dd3679

Diff: https://reviews.apache.org/r/23294/diff/

Testing
-------

Thanks,

Jun Rao
[jira] [Commented] (KAFKA-1529) transient unit test failure in testAutoCreateAfterDeleteTopic
[ https://issues.apache.org/jira/browse/KAFKA-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053175#comment-14053175 ]

Jun Rao commented on KAFKA-1529:
--------------------------------

Created reviewboard https://reviews.apache.org/r/23294/ against branch origin/trunk.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Updated] (KAFKA-1529) transient unit test failure in testAutoCreateAfterDeleteTopic
[ https://issues.apache.org/jira/browse/KAFKA-1529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jun Rao updated KAFKA-1529:
---------------------------

    Attachment: KAFKA-1529.patch

--
This message was sent by Atlassian JIRA
(v6.2#6252)
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053203#comment-14053203 ]

Evgeny Vereshchagin commented on KAFKA-1414:
--------------------------------------------

{quote}
scala 2.8 is really old and we could drop the 2.8 support.
{quote}
+1 for this

{quote}
It seems that the application doesn't have control on the degree of parallelism
{quote}
Since [2.10.0|http://www.scala-lang.org/download/changelog.html#parallel_collections_are_now_configurable_with_custom_thread_pools], parallel collections are [configurable|http://docs.scala-lang.org/overviews/parallel-collections/configuration.html] with custom thread pools.

Speedup broker startup after hard reset
---------------------------------------

                 Key: KAFKA-1414
                 URL: https://issues.apache.org/jira/browse/KAFKA-1414
             Project: Kafka
          Issue Type: Improvement
          Components: log
    Affects Versions: 0.8.2, 0.9.0, 0.8.1.1
            Reporter: Dmitry Bugaychenko
            Assignee: Jay Kreps
         Attachments: parallel-dir-loading-0.8.patch, parallel-dir-loading-trunk.patch

After a hard reset due to power failure, the broker takes way too much time recovering unflushed segments in a single thread. This could easily be improved by launching multiple threads (one per data directory, assuming that typically each data directory is on a dedicated drive). Locally we tried this simple patch to LogManager.loadLogs and it seems to work; however, I'm too new to scala, so do not take it literally:

{code}
/**
 * Recover and load all logs in the given data directories
 */
private def loadLogs(dirs: Seq[File]) {
  val threads: Array[Thread] = new Array[Thread](dirs.size)
  var i: Int = 0
  val me = this
  for (dir <- dirs) {
    val thread = new Thread(new Runnable {
      def run() {
        val recoveryPoints = me.recoveryPointCheckpoints(dir).read
        /* load the logs */
        val subDirs = dir.listFiles()
        if (subDirs != null) {
          val cleanShutDownFile = new File(dir, Log.CleanShutdownFile)
          if (cleanShutDownFile.exists())
            info("Found clean shutdown file. Skipping recovery for all logs in data directory '%s'".format(dir.getAbsolutePath))
          for (logDir <- subDirs) {
            if (logDir.isDirectory) {
              info("Loading log '" + logDir.getName + "'")
              val topicPartition = Log.parseTopicPartitionName(logDir.getName)
              val config = topicConfigs.getOrElse(topicPartition.topic, defaultConfig)
              val log = new Log(logDir, config, recoveryPoints.getOrElse(topicPartition, 0L), scheduler, time)
              val previous = addLogWithLock(topicPartition, log)
              if (previous != null)
                throw new IllegalArgumentException("Duplicate log directories found: %s, %s!".format(log.dir.getAbsolutePath, previous.dir.getAbsolutePath))
            }
          }
          cleanShutDownFile.delete()
        }
      }
    })
    thread.start()
    threads(i) = thread
    i = i + 1
  }
  for (thread <- threads) {
    thread.join()
  }
}

def addLogWithLock(topicPartition: TopicAndPartition, log: Log): Log = {
  logCreationOrDeletionLock synchronized {
    this.logs.put(topicPartition, log)
  }
}
{code}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
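The configurability Evgeny points to can be sketched like this. Note the assumptions: this targets Scala 2.13 with the separate scala-parallel-collections module (in 2.10-2.12, `.par` is built in and the `CollectionConverters` import is not needed), the directory names are placeholders, and `loadDataDir` merely stands in for the real per-directory log recovery:

```scala
import java.util.concurrent.ForkJoinPool
import scala.collection.parallel.CollectionConverters._
import scala.collection.parallel.ForkJoinTaskSupport

// Placeholder for per-directory recovery work.
def loadDataDir(dir: String): String = "loaded:" + dir

val dirs = Seq("/data/kafka1", "/data/kafka2", "/data/kafka3")

// Convert to a parallel collection and cap the degree of parallelism at one
// worker per data directory -- the knob that, per the comment above, exists
// since Scala 2.10 via custom task support.
val parDirs = dirs.par
parDirs.tasksupport = new ForkJoinTaskSupport(new ForkJoinPool(dirs.size))
val loaded = parDirs.map(loadDataDir).seq.sorted
```

The point is that the pool size, and hence the parallelism, is now chosen by the application rather than by the default global fork-join pool.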
Re: Review Request 23208: Patch for KAFKA-1512
-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23208/#review47362
-----------------------------------------------------------

core/src/main/scala/kafka/network/SocketServer.scala
https://reviews.apache.org/r/23208/#comment83102
    We probably should get remoteSocketAddress?

core/src/main/scala/kafka/network/SocketServer.scala
https://reviews.apache.org/r/23208/#comment83097
    Would it be useful to add some metrics such as avg/max connections per host?

core/src/main/scala/kafka/network/SocketServer.scala
https://reviews.apache.org/r/23208/#comment83096
    An InetAddress has a host and a port. It seems that we need to key on the hostName of an InetAddress.

core/src/main/scala/kafka/server/KafkaConfig.scala
https://reviews.apache.org/r/23208/#comment83100
    This may not work well with ipv6 since the ip will contain ':'. This is also an issue for broker.list. So, do we want to address the issue now or fix it later together with broker.list?

core/src/test/scala/unit/kafka/network/SocketServerTest.scala
https://reviews.apache.org/r/23208/#comment83098
    Should we just fail the test if we reach here?

core/src/test/scala/unit/kafka/network/SocketServerTest.scala
https://reviews.apache.org/r/23208/#comment83099
    Do we really need to print the stack trace if this is expected?

- Jun Rao

On July 3, 2014, 10:18 p.m., Jay Kreps wrote:

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23208/
-----------------------------------------------------------

(Updated July 3, 2014, 10:18 p.m.)

Review request for kafka.

Bugs: KAFKA-1512
    https://issues.apache.org/jira/browse/KAFKA-1512

Repository: kafka

Description
-------

KAFKA-1512 Add per-ip connection limits.

Diffs
-----

  core/src/main/scala/kafka/network/SocketServer.scala 4976d9c3a66bc965f5870a0736e21c7b32650bab
  core/src/main/scala/kafka/server/KafkaConfig.scala ef75b67b67676ae5b8931902cbc8c0c2cc72c0d3
  core/src/main/scala/kafka/server/KafkaServer.scala c22e51e0412843ec993721ad3230824c0aadd2ba
  core/src/test/scala/unit/kafka/network/SocketServerTest.scala 1c492de8fde6582ca2342842a551739575d1f46c

Diff: https://reviews.apache.org/r/23208/diff/

Testing
-------

Thanks,

Jay Kreps
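The review comments above revolve around counting connections per host. A minimal standalone sketch of such a quota (hypothetical names, not the patch's actual SocketServer changes) might look like the following; it keys on the host string alone, as the review suggests, rather than a full host:port address:

```scala
import scala.collection.mutable

// Hypothetical per-host connection quota: count open connections keyed on the
// host name and reject new ones once the limit is reached.
class ConnectionQuotaSketch(maxPerHost: Int) {
  private val counts = mutable.Map.empty[String, Int]

  def connect(host: String): Unit = counts.synchronized {
    val n = counts.getOrElse(host, 0) + 1
    if (n > maxPerHost)
      throw new IllegalStateException(s"Too many connections from $host")
    counts(host) = n
  }

  def disconnect(host: String): Unit = counts.synchronized {
    counts.getOrElse(host, 0) - 1 match {
      case n if n <= 0 => counts.remove(host)
      case n           => counts(host) = n
    }
  }

  def count(host: String): Int = counts.synchronized(counts.getOrElse(host, 0))
}
```

Keying on the host name (rather than an `InetSocketAddress` rendered as `host:port`) also sidesteps the IPv6 concern raised for KafkaConfig, since the `:` separators never enter the key.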
[jira] [Commented] (KAFKA-1414) Speedup broker startup after hard reset
[ https://issues.apache.org/jira/browse/KAFKA-1414?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053207#comment-14053207 ]

Jun Rao commented on KAFKA-1414:
--------------------------------

Hmm, I am not sure if we should drop the support of scala 2.9 yet. If not, we probably should just manage the thread pool in Kafka itself, instead of using scala parallel collections.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
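Managing the pool in the application itself, as Jun suggests, avoids any dependence on a particular Scala version's parallel-collections API. A minimal sketch under stated assumptions (this is not Kafka's actual code; `loadOne` stands in for the real per-directory log recovery, and directories are represented by plain strings):

```scala
import java.util.concurrent.{Callable, Executors}

// Sketch of application-managed parallelism: one task per data directory,
// submitted to a fixed-size pool whose size the application controls.
object ParallelLoadSketch {
  def loadAll(dirs: Seq[String], numThreads: Int,
              loadOne: (String => String) = (d: String) => "loaded:" + d): Seq[String] = {
    val pool = Executors.newFixedThreadPool(numThreads)
    try {
      val tasks = new java.util.ArrayList[Callable[String]]()
      dirs.foreach { d =>
        tasks.add(new Callable[String] { def call(): String = loadOne(d) })
      }
      // invokeAll blocks until every task finishes and returns futures
      // in the same order the tasks were submitted.
      val futures = pool.invokeAll(tasks)
      (0 until futures.size()).map(i => futures.get(i).get())
    } finally {
      pool.shutdown()
    }
  }
}
```

Using plain java.util.concurrent also works unchanged on scala 2.9, which is the compatibility concern raised in the comment.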
Re: Review Request 23291: Patch for KAFKA-1528
On July 6, 2014, 5:01 p.m., Jun Rao wrote:
> .gitattributes, lines 1-2
> https://reviews.apache.org/r/23291/diff/2/?file=624384#file624384line1
>
> Thanks for the patch. Since we explicitly specify the text types, do we still need this line? `* text=auto` sets the default behavior, in case people don't have core.autocrlf set. But it means that you really trust Git to do binary detection properly.

Yes, you're right. I will drop this line and add other text file names (for example, HEADER, LICENSE, NOTICE etc.) if you approve.

- Evgeny

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23291/#review47357
-----------------------------------------------------------

On July 6, 2014, 4:23 p.m., Evgeny Vereshchagin wrote:

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/23291/
-----------------------------------------------------------

(Updated July 6, 2014, 4:23 p.m.)

Review request for kafka.

Bugs: KAFKA-1528
    https://issues.apache.org/jira/browse/KAFKA-1528

Repository: kafka

Description
-------

See http://adaptivepatchwork.com/2012/03/01/mind-the-end-of-your-line/
See https://help.github.com/articles/dealing-with-line-endings

Normalize line endings. Contents stay the same. Add the text attribute to *.md files.

Diffs
-----

  .gitattributes PRE-CREATION
  bin/windows/kafka-run-class.bat f4d2904a3320ae756ed4171f6e99566f1f0cf963
  bin/windows/kafka-server-stop.bat b9496ce0fe7438b9df1bbdb10bd86ddefab27ff4
  bin/windows/zookeeper-server-stop.bat 29bdee81ac3ac4a96b081ac369d8f6d85c237747

Diff: https://reviews.apache.org/r/23291/diff/

Testing
-------

Thanks,

Evgeny Vereshchagin
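For readers unfamiliar with the file under review, a .gitattributes along the lines discussed (dropping `* text=auto` in favor of explicitly listed text types) could look like the sketch below. The pattern list is illustrative, not the committed patch:

{code}
# Explicitly mark text files so Git normalizes their line endings to LF.
*.java   text
*.scala  text
*.md     text
HEADER   text
LICENSE  text
NOTICE   text

# Windows batch scripts must keep CRLF endings on checkout.
*.bat    text eol=crlf
{code}

Listing types explicitly trades Git's automatic binary detection (the concern raised above) for a maintained but predictable whitelist.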
[jira] [Updated] (KAFKA-1517) Messages is a required argument to Producer Performance Test
[ https://issues.apache.org/jira/browse/KAFKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Daniel Compton updated KAFKA-1517:
----------------------------------

    Description:

Messages is a required argument to Producer Performance Test
------------------------------------------------------------

                 Key: KAFKA-1517
                 URL: https://issues.apache.org/jira/browse/KAFKA-1517
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.8.1.1
            Reporter: Daniel Compton
            Priority: Trivial

When running the producer performance test without providing a messages argument, you get an error:
{noformat}
$ bin/kafka-producer-perf-test.sh --topics mirrormirror --broker-list kafka-dc21:9092
Missing required argument [messages]
Option                        Description
------                        -----------
..
--messages <Long: count>      The number of messages to send or
                                consume (default: 9223372036854775807)
{noformat}
However, the [shell command documentation|https://github.com/apache/kafka/blob/c66e408b244de52f1c5c5bbd7627aa1f028f9a87/perf/src/main/scala/kafka/perf/PerfConfig.scala#L25] doesn't say that this is required, and implies that [2^63-1|http://en.wikipedia.org/wiki/9223372036854775807] (Long.MaxValue) messages will be sent.

It should probably look like the [ConsoleProducer|https://github.com/apache/kafka/blob/c66e408b244de52f1c5c5bbd7627aa1f028f9a87/core/src/main/scala/kafka/producer/ConsoleProducer.scala#L32] and prefix the documentation with REQUIRED. Or should we make this a non-required argument and set the default value to something sane like 100,000 messages? Which option is preferable for this?

--
This message was sent by Atlassian JIRA
(v6.2#6252)
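The REQUIRED prefix referenced above is a help-text convention, not a parser feature. A tiny hypothetical illustration of the two alternatives discussed (this is not Kafka's actual PerfConfig or ConsoleProducer code; the `PerfOption` class is invented for the sketch):

```scala
// Either document the option as REQUIRED in its help text (the
// ConsoleProducer convention), or give it a sane default instead
// of Long.MaxValue.
case class PerfOption(name: String, required: Boolean, default: Option[Long], desc: String) {
  def helpLine: String = {
    val prefix = if (required) "REQUIRED: " else ""
    val suffix = default.map(d => s" (default: $d)").getOrElse("")
    s"--$name  $prefix$desc$suffix"
  }
}

// Option 1: mandatory, no default, clearly labeled in the help output.
val requiredMessages = PerfOption("messages", required = true, None,
  "The number of messages to send or consume.")

// Option 2: optional with a sane default, so the tool runs without the flag.
val defaultedMessages = PerfOption("messages", required = false, Some(100000L),
  "The number of messages to send or consume.")
```

Either way the help output and the parser's actual behavior end up agreeing, which is the inconsistency the issue reports.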
[jira] [Commented] (KAFKA-1517) Messages is a required argument to Producer Performance Test
[ https://issues.apache.org/jira/browse/KAFKA-1517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14053248#comment-14053248 ]

Daniel Compton commented on KAFKA-1517:
---------------------------------------

Great, I'll prepare a patch, documenting it as mandatory and removing the default value.

--
This message was sent by Atlassian JIRA
(v6.2#6252)