[ https://issues.apache.org/jira/browse/KAFKA-545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jay Kreps updated KAFKA-545:
----------------------------
    Attachment: KAFKA-545-draft.patch

Attaching a WIP patch that has this command. Usage is:

jkreps-mn:kafka-git jkreps$ ./bin/kafka-run-class.sh kafka.perf.LogPerformance --help
Option                                            Description
------                                            -----------
--batch-size <Integer: size>                      Number of messages to write in a single batch.
                                                    (default: 200)
--compression-codec <Integer: compression codec>  If set, messages are sent compressed (default: 0)
--date-format <date format>                       The date format to use for formatting the time
                                                    field. See java.text.SimpleDateFormat for
                                                    options. (default: yyyy-MM-dd HH:mm:ss:SSS)
--dir <path>                                      The log directory. (default:
                                                    /var/folders/wV/wVHRnnYrEX0ZFMG7ypsUXE+++TM/-Tmp-/kafka-8193339)
--flush-interval <Integer: num_messages>          The number of messages in a partition between
                                                    flushes. (default: 2147483647)
--flush-time <Integer: ms>                        The time between flushes. (default: 2147483647)
--help                                            Print usage.
--hide-header                                     If set, skips printing the header for the stats
--index-interval <Integer: bytes>                 The number of bytes in between index entries.
                                                    (default: 4096)
--message-size <Integer: size>                    The size of each message. (default: 100)
--messages <Long: count>                          The number of messages to send or consume
                                                    (default: 9223372036854775807)
--partitions <Integer: num_partitions>            The number of partitions. (default: 1)
--reader-batch-size <Integer: num_messages>       The number of messages to write at once.
                                                    (default: 200)
--readers <Integer: num_threads>                  The number of reader threads. (default: 1)
--reporting-interval <Integer: size>              Interval at which to print progress info.
                                                    (default: 5000)
--show-detailed-stats                             If set, stats are reported for each reporting
                                                    interval as configured by reporting-interval
--topic <topic>                                   REQUIRED: The topic to consume from.
--writer-batch-size <Integer: num_messages>       The number of messages to write at once.
                                                    (default: 200)
--writers <Integer: num_threads>                  The number of writer threads. (default: 1)

> Add a Performance Suite for the Log subsystem
> ---------------------------------------------
>
>                 Key: KAFKA-545
>                 URL: https://issues.apache.org/jira/browse/KAFKA-545
>             Project: Kafka
>          Issue Type: New Feature
>    Affects Versions: 0.8
>            Reporter: Jay Kreps
>            Priority: Blocker
>              Labels: features
>         Attachments: KAFKA-545-draft.patch
>
>
> We have had several performance concerns or potential improvements for the
> logging subsystem. To evaluate these in a data-driven way, it would be good to
> have a single-machine performance test that isolates the performance of the
> log.
> The performance optimizations we would like to evaluate include
> - Special-casing appends in a follower which already have the correct offset
> to avoid decompression and recompression
> - Memory mapping either all or some of the segment files to improve the
> performance of small appends and lookups
> - Supporting multiple data directories and avoiding RAID
> Having a standalone tool is nice to isolate the component and makes profiling
> more intelligible.
> This test would drive load against Log/LogManager controlled by a set of
> command line options. This command-line program could then be scripted up
> into a suite of tests that covers variations in message size, message set
> size, compression, number of partitions, etc.
> Here is a proposed usage for the tool:
> ./bin/kafka-log-perf-test.sh
> Option           Description
> ------           -----------
> --partitions     The number of partitions to write to
> --dir            The directory in which to write the log
> --message-size   The size of the messages
> --set-size       The number of messages per write
> --compression    Compression alg
> --messages       The number of messages to write
> --readers        The number of reader threads reading the data
> The tool would capture latency and throughput for the append() and read()
> operations.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators.
For more information on JIRA, see: http://www.atlassian.com/software/jira
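
The issue description mentions scripting the command-line tool into a suite of tests that varies message size, message set size, compression, and partition count. A minimal sketch of such a sweep is given here, using only flags that appear in the --help output of the draft patch. The message count, topic name, the specific values swept, and the meaning of codec 1 are assumptions for illustration, not part of the patch. The script defaults to a dry run that only prints the commands it would execute.

```shell
#!/bin/sh
# Hypothetical sweep over kafka.perf.LogPerformance options (sketch only).
# DRY_RUN=1 (the default) prints each command instead of running it.

DRY_RUN="${DRY_RUN:-1}"
MESSAGES="${MESSAGES:-1000000}"   # assumed message count per run
TOPIC="${TOPIC:-log-perf-test}"   # assumed topic name

run_case() {
  # $1 = message size, $2 = batch size, $3 = compression codec, $4 = partitions
  set -- ./bin/kafka-run-class.sh kafka.perf.LogPerformance \
    --topic "$TOPIC" --messages "$MESSAGES" \
    --message-size "$1" --batch-size "$2" \
    --compression-codec "$3" --partitions "$4"
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

sweep() {
  for size in 100 1000 10000; do      # message sizes in bytes (assumed values)
    for batch in 1 200; do            # messages per write (assumed values)
      for codec in 0 1; do            # 0 = default; 1 assumed to enable compression
        for parts in 1 8; do          # partition counts (assumed values)
          run_case "$size" "$batch" "$codec" "$parts"
        done
      done
    done
  done
}

sweep
```

Running it with DRY_RUN=0 from the Kafka source root would execute each combination in turn; redirecting the output to a file per run would keep the stats for later comparison.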