[ https://issues.apache.org/jira/browse/HADOOP-5318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712491#action_12712491 ]
Ben Maurer commented on HADOOP-5318: ------------------------------------ When I tested stuff out, I got a boost from doing buffering that can't be replicated with just the grouping of atomic increments -- however, if we're just going to go with the simple version, this patch is as good as it gets. A similar patch is needed for input streams. > Poor IO Performance due to AtomicLong operations > ------------------------------------------------ > > Key: HADOOP-5318 > URL: https://issues.apache.org/jira/browse/HADOOP-5318 > Project: Hadoop Core > Issue Type: Bug > Affects Versions: 0.19.0 > Environment: 2x quad core xeon linux 64 bit > Reporter: Ben Maurer > Attachments: buf.patch, buffer-output.patch, hadoop-5318.txt, > hadoop-5318.txt, TestWriteConcurrency.java > > > The AtomicLong operations in counting file system statistics can cause high > levels of contention with multiple threads. This test demonstrates having > multiple threads writing to different sequence files: > {code:java} > import java.io.IOException; > import org.apache.hadoop.conf.Configuration; > import org.apache.hadoop.fs.FileSystem; > import org.apache.hadoop.fs.Path; > import org.apache.hadoop.io.ByteWritable; > import org.apache.hadoop.io.SequenceFile; > import org.apache.hadoop.io.SequenceFile.Writer; > import org.apache.hadoop.io.SequenceFile.CompressionType; > public class Test { > public static void main(String[] args) throws IOException { > final Configuration c = new Configuration(); > final FileSystem fs = FileSystem.get(c); > > final int NUM = 1000*1000; > for (int i = 0; i < Integer.valueOf(args[0]); i ++) { > final int ii = i; > new Thread(new Runnable() { > @Override > public void run() { > > try { > Writer f = > SequenceFile.createWriter(fs, c, new Path("/test/" + ii ), > ByteWritable.class, ByteWritable.class, CompressionType.NONE); > ByteWritable v = new > ByteWritable(); > > long time = > System.currentTimeMillis(); > for (int i = 0; i < NUM; i ++) > f.append(v, v); > f.close(); > long end = > System.currentTimeMillis(); > > System.out.printf("%d opartions > in %d msec. %f/second\n", NUM, end - time, (float)(1000 * NUM)/(end - time)); > > } catch (Exception e) { > // TODO Auto-generated catch > block > e.printStackTrace(); > } > > } > }).start(); > } > } > } > {code} > The results of this benchmark are > {code} > ==== 1 threads ==== > 1000000 opartions in 1431 msec. 698812.000000/second > ==== 2 threads ==== > 1000000 opartions in 3001 msec. 333222.250000/second > 1000000 opartions in 2985 msec. 335008.375000/second > ==== 3 threads ==== > 1000000 opartions in 4923 msec. 203128.171875/second > 1000000 opartions in 4924 msec. 203086.921875/second > 1000000 opartions in 4981 msec. 200762.906250/second > ==== 4 threads ==== > 1000000 opartions in 6716 msec. 148898.156250/second > 1000000 opartions in 7048 msec. 141884.218750/second > 1000000 opartions in 7342 msec. 136202.671875/second > 1000000 opartions in 7344 msec. 136165.578125/second > ==== 5 threads ==== > 1000000 opartions in 10366 msec. 96469.226563/second > 1000000 opartions in 11085 msec. 90212.000000/second > 1000000 opartions in 11121 msec. 89919.968750/second > 1000000 opartions in 11464 msec. 87229.585938/second > 1000000 opartions in 11538 msec. 86670.132813/second > ==== 6 threads ==== > 1000000 opartions in 16513 msec. 60558.347656/second > 1000000 opartions in 17704 msec. 56484.410156/second > 1000000 opartions in 18219 msec. 54887.753906/second > 1000000 opartions in 18550 msec. 53908.355469/second > 1000000 opartions in 18605 msec. 53748.992188/second > 1000000 opartions in 18663 msec. 53581.953125/second > ==== 7 threads ==== > 1000000 opartions in 22207 msec. 45030.847656/second > 1000000 opartions in 23275 msec. 42964.554688/second > 1000000 opartions in 23484 msec. 42582.183594/second > 1000000 opartions in 24378 msec. 41020.593750/second > 1000000 opartions in 24425 msec. 40941.656250/second > 1000000 opartions in 24533 msec. 40761.421875/second > 1000000 opartions in 24645 msec. 40576.183594/second > ==== 8 threads ==== > 1000000 opartions in 26375 msec. 37914.691406/second > 1000000 opartions in 26420 msec. 37850.113281/second > 1000000 opartions in 26532 msec. 37690.335938/second > 1000000 opartions in 26670 msec. 37495.312500/second > 1000000 opartions in 29772 msec. 33588.605469/second > 1000000 opartions in 29859 msec. 33490.738281/second > 1000000 opartions in 30098 msec. 33224.800781/second > 1000000 opartions in 30082 msec. 33242.468750/second > {code} > However, if I comment out the file system statistics increments, the > benchmark improves to: > {code} > ==== 1 threads ==== > 1000000 opartions in 1194 msec. 837520.937500/second > ==== 2 threads ==== > 1000000 opartions in 1433 msec. 697836.687500/second > 1000000 opartions in 1433 msec. 697836.687500/second > ==== 3 threads ==== > 1000000 opartions in 1643 msec. 608642.750000/second > 1000000 opartions in 1643 msec. 608642.750000/second > 1000000 opartions in 1639 msec. 610128.125000/second > ==== 4 threads ==== > 1000000 opartions in 1886 msec. 530222.687500/second > 1000000 opartions in 1886 msec. 530222.687500/second > 1000000 opartions in 1886 msec. 530222.687500/second > 1000000 opartions in 1899 msec. 526592.937500/second > ==== 5 threads ==== > 1000000 opartions in 2065 msec. 484261.500000/second > 1000000 opartions in 2066 msec. 484027.093750/second > 1000000 opartions in 2067 msec. 483792.937500/second > 1000000 opartions in 2066 msec. 484027.093750/second > 1000000 opartions in 2066 msec. 484027.093750/second > ==== 6 threads ==== > 1000000 opartions in 2151 msec. 464900.031250/second > 1000000 opartions in 2111 msec. 473709.156250/second > 1000000 opartions in 2153 msec. 464468.187500/second > 1000000 opartions in 2114 msec. 473036.906250/second > 1000000 opartions in 2113 msec. 473260.781250/second > 1000000 opartions in 2112 msec. 473484.843750/second > ==== 7 threads ==== > 1000000 opartions in 2368 msec. 422297.312500/second > 1000000 opartions in 2334 msec. 428449.000000/second > 1000000 opartions in 2332 msec. 428816.468750/second > 1000000 opartions in 2330 msec. 429184.562500/second > 1000000 opartions in 2332 msec. 428816.468750/second > 1000000 opartions in 2375 msec. 421052.625000/second > 1000000 opartions in 2394 msec. 417710.937500/second > ==== 8 threads ==== > 1000000 opartions in 2517 msec. 397298.375000/second > 1000000 opartions in 2538 msec. 394011.031250/second > 1000000 opartions in 2538 msec. 394011.031250/second > 1000000 opartions in 2538 msec. 394011.031250/second > 1000000 opartions in 2539 msec. 393855.843750/second > 1000000 opartions in 2614 msec. 382555.468750/second > 1000000 opartions in 2666 msec. 375093.781250/second > 1000000 opartions in 2701 msec. 370233.250000/second > {code} -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.