[ https://issues.apache.org/jira/browse/KAFKA-414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jay Kreps resolved KAFKA-414.
-----------------------------
    Resolution: Won't Fix

> Evaluate mmap-based writes for Log implementation
> -------------------------------------------------
>
>                 Key: KAFKA-414
>                 URL: https://issues.apache.org/jira/browse/KAFKA-414
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Jay Kreps
>            Priority: Minor
>         Attachments: TestLinearWritePerformance.java,
>                      linear_write_performance.txt
>
>
> Working on another project I noticed that small-write performance for
> FileChannel is really very bad. This likely affects Kafka in the case where
> messages are produced one at a time or in small batches. I wrote a quick
> program to evaluate the following options:
>   raf = RandomAccessFile
>   mmap = MappedByteBuffer
>   channel = FileChannel
> For both of the latter two I tried both direct-allocated and non-direct
> allocated buffers (direct allocation is supposed to be faster).
> Here are the results I saw:
> [jkreps@jkreps-ld valencia]$ java -XX:+UseConcMarkSweepGC -cp
> target/test-classes -server -Xmx1G -Xms1G valencia.TestLinearWritePerformance
> $((256*1024)) $((1*1024*1024*1024)) 2
>
> file_length  size (bytes)  raf (mb/sec)  channel_direct (mb/sec)  mmap_direct (mb/sec)  channel_heap (mb/sec)  mmap_heap (mb/sec)
>     1000000             1          0.60                     0.52                 28.66                   0.55               50.40
>     2000000             2          1.18                     1.16                 67.84                   1.13               74.17
>     4000000             4          2.33                     2.26                121.52                   2.23              122.14
>     8000000             8          4.72                     4.51                228.39                   4.41              175.20
>    16000000            16          9.25                     8.96                393.24                   8.88              314.11
>    32000000            32         18.43                    17.93                601.83                  17.28              482.25
>    64000000            64         36.25                    35.21                799.98                  34.39              680.39
>   128000000           128         69.80                    67.52                963.30                  66.21              870.82
>   256000000           256        134.24                   129.25               1064.13                 129.01             1014.00
>   512000000           512        247.38                   238.24               1124.71                 235.57             1091.81
>  1024000000          1024        420.42                   411.43               1170.94                 406.57             1138.80
>  1073741824          2048        671.93                   658.96               1133.63                 650.39             1151.81
>  1073741824          4096       1007.84                   989.88               1165.60                 976.10             1158.49
>  1073741824          8192       1137.12                  1145.01               1189.38                1128.30             1174.66
>  1073741824         16384       1172.63                  1228.33               1192.19                1206.58             1156.37
>  1073741824         32768       1221.13                  1295.37               1170.96                1262.28             1156.65
>  1073741824         65536       1255.23                  1306.33               1160.22                1268.24             1142.52
>  1073741824        131072       1240.65                  1292.06               1101.90                1269.00             1119.14
>
> The size column gives the size of each write, and the file_length column
> gives the total length of the file written.
> Now over a period of time the 1GB/sec performance is unsustainable because
> the disk on my machine would not be able to keep up. Nonetheless it is worth
> noting that even for writes of up to 256 bytes the disk is not the
> bottleneck; the bottleneck is the per-write overhead.
> This would indicate that a better strategy for the log would be to
> pre-allocate the segment and mmap it. Then use the memory map for writes and
> continue to use the FileChannel for reads.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
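
The strategy proposed at the end of the description (pre-allocate the segment, mmap it, write through the map, keep reading through the FileChannel) could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name MmapSegmentSketch, its methods, and the 1 MB segment size are hypothetical, and none of it is taken from the attached TestLinearWritePerformance.java or from Kafka's Log code. Whether a FileChannel read immediately observes bytes written through the mapping depends on the operating system, so treat the read path as a demonstration rather than a guarantee.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.ByteBuffer;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a pre-allocated, memory-mapped log segment. */
final class MmapSegmentSketch implements AutoCloseable {
    private final RandomAccessFile file;
    private final FileChannel channel;
    private final MappedByteBuffer map;

    MmapSegmentSketch(Path path, int segmentBytes) throws IOException {
        file = new RandomAccessFile(path.toFile(), "rw");
        file.setLength(segmentBytes); // pre-allocate the whole segment up front
        channel = file.getChannel();
        map = channel.map(FileChannel.MapMode.READ_WRITE, 0, segmentBytes);
    }

    /** Appends via the mapping; returns the start offset, or -1 if the segment is full. */
    int append(byte[] message) {
        if (map.remaining() < message.length) {
            return -1;
        }
        int offset = map.position();
        map.put(message); // a memory copy, no system call per write
        return offset;
    }

    /** Reads via the FileChannel using positional reads (assumes OS-level coherence). */
    byte[] read(int offset, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        while (buf.hasRemaining()) {
            if (channel.read(buf, (long) offset + buf.position()) < 0) {
                break; // end of file reached before the buffer was filled
            }
        }
        return buf.array();
    }

    /** Asks the OS to write dirty mapped pages to the storage device. */
    void flush() {
        map.force();
    }

    @Override
    public void close() throws IOException {
        map.force();
        channel.close();
        file.close();
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("segment", ".log");
        path.toFile().deleteOnExit();
        try (MmapSegmentSketch segment = new MmapSegmentSketch(path, 1024 * 1024)) {
            byte[] message = "hello".getBytes(StandardCharsets.UTF_8);
            int offset = segment.append(message);
            byte[] back = segment.read(offset, message.length);
            System.out.println(new String(back, StandardCharsets.UTF_8));
        }
    }
}
```

One consequence of pre-allocating is that the file length no longer marks the end of valid data, so a real implementation would need to track the logical end of the log separately (here it lives only in the buffer position) and recover it after a restart.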