[ https://issues.apache.org/jira/browse/KAFKA-545?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Kreps updated KAFKA-545:
----------------------------

    Attachment: KAFKA-545-draft.patch

Attaching a work-in-progress patch that adds this command. Usage:

jkreps-mn:kafka-git jkreps$ ./bin/kafka-run-class.sh kafka.perf.LogPerformance --help
Option                                  Description                            
------                                  -----------                            
--batch-size <Integer: size>            Number of messages to write in a       
                                          single batch. (default: 200)         
--compression-codec <Integer:           If set, messages are sent compressed   
  compression codec >                     (default: 0)                         
--date-format <date format>             The date format to use for formatting  
                                          the time field. See java.text.       
                                          SimpleDateFormat for options.        
                                          (default: yyyy-MM-dd HH:mm:ss:SSS)   
--dir <path>                            The log directory. (default:           
                                          /var/folders/wV/wVHRnnYrEX0ZFMG7ypsUXE+++TM/-Tmp-/kafka-8193339)
--flush-interval <Integer:              The number of messages in a partition  
  num_messages>                           between flushes. (default:           
                                          2147483647)                          
--flush-time <Integer: ms>              The time between flushes. (default:    
                                          2147483647)                          
--help                                  Print usage.                           
--hide-header                           If set, skips printing the header for  
                                          the stats                            
--index-interval <Integer: bytes>       The number of bytes in between index   
                                          entries. (default: 4096)             
--message-size <Integer: size>          The size of each message. (default:    
                                          100)                                 
--messages <Long: count>                The number of messages to send or      
                                          consume (default:                    
                                          9223372036854775807)                 
--partitions <Integer: num_partitions>  The number of partitions. (default: 1) 
--reader-batch-size <Integer:           The number of messages to write at     
  num_messages>                           once. (default: 200)                 
--readers <Integer: num_threads>        The number of reader threads.          
                                          (default: 1)                         
--reporting-interval <Integer: size>    Interval at which to print progress    
                                          info. (default: 5000)                
--show-detailed-stats                   If set, stats are reported for each    
                                          reporting interval as configured by  
                                          reporting-interval                   
--topic <topic>                         REQUIRED: The topic to consume from.   
--writer-batch-size <Integer:           The number of messages to write at     
  num_messages>                           once. (default: 200)                 
--writers <Integer: num_threads>        The number of writer threads.          
                                          (default: 1)     
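
As a rough illustration of how the options above might be scripted into a suite, here is a minimal Python sketch that builds one command line per combination of settings. The flag names come from the help output above. The swept values, the topic name "perf-test", the message count, and the helper name build_commands are assumptions made for the example and are not part of the patch. The script only prints the commands (a dry run) and does not execute them.

```python
import itertools
import shlex

# Base invocation, as shown in the usage above.
BASE_CMD = ["./bin/kafka-run-class.sh", "kafka.perf.LogPerformance"]

# Flags are from the --help output; the values to sweep are assumptions.
SWEEP = {
    "--message-size": [100, 1000, 10000],
    "--batch-size": [1, 200],
    "--compression-codec": [0, 1],
    "--partitions": [1, 8],
}


def build_commands(sweep, topic="perf-test", messages=1000000):
    """Return one argument list per combination of swept option values."""
    flags = list(sweep)
    commands = []
    for values in itertools.product(*(sweep[flag] for flag in flags)):
        # --topic is marked REQUIRED in the help output, so always pass it.
        cmd = BASE_CMD + ["--topic", topic, "--messages", str(messages)]
        for flag, value in zip(flags, values):
            cmd += [flag, str(value)]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    # Dry run: print each command line instead of executing it.
    for cmd in build_commands(SWEEP):
        print(" ".join(shlex.quote(part) for part in cmd))
```

To actually run the suite, each argument list could be passed to subprocess.run and its output collected per configuration.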
                
> Add a Performance Suite for the Log subsystem
> ---------------------------------------------
>
>                 Key: KAFKA-545
>                 URL: https://issues.apache.org/jira/browse/KAFKA-545
>             Project: Kafka
>          Issue Type: New Feature
>    Affects Versions: 0.8
>            Reporter: Jay Kreps
>            Priority: Blocker
>              Labels: features
>         Attachments: KAFKA-545-draft.patch
>
>
> We have had several performance concerns about, and potential improvements
> for, the logging subsystem. To evaluate these in a data-driven way, it would
> be good to have a single-machine performance test that isolates the
> performance of the log.
> The performance optimizations we would like to evaluate include:
> - Special casing appends in a follower which already have the correct offset 
> to avoid decompression and recompression
> - Memory mapping either all or some of the segment files to improve the 
> performance of small appends and lookups
> - Supporting multiple data directories and avoiding RAID
> Having a standalone tool is nice because it isolates the component and makes
> profiling more intelligible.
> This test would drive load against Log/LogManager, controlled by a set of 
> command-line options. This command-line program could then be scripted up 
> into a suite of tests covering variations in message size, message set 
> size, compression, number of partitions, etc.
> Here is a proposed usage for the tool:
> ./bin/kafka-log-perf-test.sh
> Option            Description
> ------            -----------
> --partitions      The number of partitions to write to
> --dir             The directory in which to write the log
> --message-size    The size of the messages
> --set-size        The number of messages per write
> --compression     The compression algorithm
> --messages        The number of messages to write
> --readers         The number of reader threads reading the data
> The tool would capture latency and throughput for the append() and read() 
> operations.
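
For concreteness, the kind of latency and throughput capture described above could look roughly like the following Python sketch. It is not the code in the attached patch: the helpers measure and summarize are hypothetical names, a plain file write stands in for the log append() call, and percentiles use the simple nearest-rank method.

```python
import math
import tempfile
import time


def measure(op, payloads):
    """Time each op(payload) call; return latencies, bytes written, elapsed."""
    latencies = []
    total_bytes = 0
    start = time.perf_counter()
    for payload in payloads:
        t0 = time.perf_counter()
        op(payload)
        latencies.append(time.perf_counter() - t0)
        total_bytes += len(payload)
    elapsed = time.perf_counter() - start
    return latencies, total_bytes, elapsed


def summarize(latencies, total_bytes, elapsed):
    """Reduce raw timings (in seconds) to throughput and latency figures."""
    ordered = sorted(latencies)

    def percentile_ms(p):
        # Nearest-rank percentile, converted from seconds to milliseconds.
        rank = max(1, math.ceil(p * len(ordered)))
        return ordered[rank - 1] * 1000.0

    return {
        "messages_per_sec": len(ordered) / elapsed,
        "mb_per_sec": total_bytes / elapsed / 1e6,
        "p50_ms": percentile_ms(0.50),
        "p99_ms": percentile_ms(0.99),
        "max_ms": ordered[-1] * 1000.0,
    }


if __name__ == "__main__":
    # Stand-in for append(): write 100-byte messages to a temporary file.
    messages = [b"x" * 100 for _ in range(10000)]
    with tempfile.TemporaryFile() as log_file:
        stats = summarize(*measure(log_file.write, messages))
    for name, value in stats.items():
        print(f"{name}: {value:.3f}")
```

The same measure() wrapper could be pointed at a read operation to report the two side by side.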

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
