The following page has been changed by stack: http://wiki.apache.org/hadoop/Hbase/NewFileFormat/Performance

The comment on the change is: First commit

New page:

Numbers comparing MapFile and RFile (TFile+mods has dropped out of the running, for the moment anyway). The code used to run the tests is available over in [http://github.com/ryanobjc/hbase-rfile/tree/rfile github]. I ran the following on the local filesystem and on a 4-node HDFS cluster:

{{{
$ ./bin/hadoop org.apache.hadoop.hbase.MapFilePerformanceEvaluation ; ./bin/hadoop org.apache.hadoop.hbase.RFilePerformanceEvaluation
}}}

== Local Filesystem ==

Mac OS X, 10-byte cells and keys.

=== MapFile ===

{{{
2009-02-06 10:40:53,553 INFO [main] hbase.MapFilePerformanceEvaluation(86): Running SequentialWriteBenchmark for 100000 rows.
2009-02-06 10:40:53,621 WARN [main] util.NativeCodeLoader(52): Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2009-02-06 10:40:56,379 INFO [main] hbase.MapFilePerformanceEvaluation(89): Running SequentialWriteBenchmark for 100000 rows took 2713ms.
2009-02-06 10:40:56,380 INFO [main] hbase.MapFilePerformanceEvaluation(86): Running UniformRandomSmallScan for 100000 rows.
2009-02-06 10:41:00,367 INFO [main] hbase.MapFilePerformanceEvaluation(89): Running UniformRandomSmallScan for 100000 rows took 3969ms.
2009-02-06 10:41:00,367 INFO [main] hbase.MapFilePerformanceEvaluation(86): Running UniformRandomReadBenchmark for 100000 rows.
2009-02-06 10:41:07,791 INFO [main] hbase.MapFilePerformanceEvaluation(89): Running UniformRandomReadBenchmark for 100000 rows took 7418ms.
2009-02-06 10:41:07,796 INFO [main] hbase.MapFilePerformanceEvaluation(86): Running GaussianRandomReadBenchmark for 100000 rows.
2009-02-06 10:41:14,303 INFO [main] hbase.MapFilePerformanceEvaluation(89): Running GaussianRandomReadBenchmark for 100000 rows took 6483ms.
2009-02-06 10:41:14,303 INFO [main] hbase.MapFilePerformanceEvaluation(86): Running SequentialReadBenchmark for 100000 rows.
2009-02-06 10:41:15,158 INFO [main] hbase.MapFilePerformanceEvaluation(89): Running SequentialReadBenchmark for 100000 rows took 852ms.
}}}

=== rfile 8k buffer ===

{{{
2009-02-06 11:19:03,630 INFO [main] hbase.RFilePerformanceEvaluation(86): Running SequentialWriteBenchmark for 100000 rows.
2009-02-06 11:19:04,512 INFO [main] hbase.RFilePerformanceEvaluation(89): Running SequentialWriteBenchmark for 100000 rows took 835ms.
2009-02-06 11:19:04,516 INFO [main] hbase.RFilePerformanceEvaluation(86): Running UniformRandomSmallScan for 100000 rows.
2009-02-06 11:19:07,075 INFO [main] hbase.RFilePerformanceEvaluation(89): Running UniformRandomSmallScan for 100000 rows took 2424ms.
2009-02-06 11:19:07,078 INFO [main] hbase.RFilePerformanceEvaluation(86): Running UniformRandomReadBenchmark for 100000 rows.
2009-02-06 11:19:13,801 INFO [main] hbase.RFilePerformanceEvaluation(89): Running UniformRandomReadBenchmark for 100000 rows took 6715ms.
2009-02-06 11:19:13,806 INFO [main] hbase.RFilePerformanceEvaluation(86): Running GaussianRandomReadBenchmark for 100000 rows.
2009-02-06 11:19:19,646 INFO [main] hbase.RFilePerformanceEvaluation(89): Running GaussianRandomReadBenchmark for 100000 rows took 5835ms.
2009-02-06 11:19:19,647 INFO [main] hbase.RFilePerformanceEvaluation(86): Running SequentialReadBenchmark for 100000 rows.
2009-02-06 11:19:19,740 INFO [main] hbase.RFilePerformanceEvaluation(89): Running SequentialReadBenchmark for 100000 rows took 89ms.
}}}

== HDFS ==

4-node HDFS cluster, 10-byte keys and cells.

=== MapFile ===

{{{
$ ./bin/hadoop org.apache.hadoop.hbase.MapFilePerformanceEvaluation
09/02/06 20:00:01 INFO hbase.MapFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows.
09/02/06 20:00:01 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
09/02/06 20:00:01 INFO compress.CodecPool: Got brand-new compressor
09/02/06 20:00:01 INFO compress.CodecPool: Got brand-new compressor
09/02/06 20:00:04 INFO hbase.MapFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows took 2754ms.
09/02/06 20:00:04 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows.
09/02/06 20:00:26 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows took 22265ms.
09/02/06 20:00:26 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows.
09/02/06 20:02:31 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows took 124587ms.
09/02/06 20:02:31 INFO hbase.MapFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows.
09/02/06 20:04:36 INFO hbase.MapFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows took 125150ms.
09/02/06 20:04:36 INFO hbase.MapFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows.
09/02/06 20:04:37 INFO hbase.MapFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows took 960ms.
}}}

=== rfile 8k buffer using seek+read ===

==== First Run ====

{{{
$ ./bin/hadoop org.apache.hadoop.hbase.RFilePerformanceEvaluation
09/02/06 20:05:23 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows.
09/02/06 20:05:24 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows took 578ms.
09/02/06 20:05:24 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows.
09/02/06 20:05:41 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows took 17492ms.
09/02/06 20:05:41 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows.
09/02/06 20:07:41 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows took 119389ms.
09/02/06 20:07:41 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows.
09/02/06 20:09:36 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows took 115376ms.
09/02/06 20:09:36 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows.
09/02/06 20:09:36 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows took 102ms.
}}}

==== Second Run ====

{{{
$ ./bin/hadoop org.apache.hadoop.hbase.RFilePerformanceEvaluation
09/02/06 20:35:14 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows.
09/02/06 20:35:15 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows took 674ms.
09/02/06 20:35:15 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows.
09/02/06 20:35:29 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows took 13320ms.
09/02/06 20:35:29 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows.
09/02/06 20:37:26 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows took 117903ms.
09/02/06 20:37:26 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows.
09/02/06 20:39:26 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows took 119625ms.
09/02/06 20:39:26 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows.
09/02/06 20:39:26 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows took 112ms.
}}}

=== rfile 8k using pread ===

{{{
$ ./bin/hadoop org.apache.hadoop.hbase.RFilePerformanceEvaluation
09/02/06 20:44:27 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows.
09/02/06 20:44:28 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows took 568ms.
09/02/06 20:44:28 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows.
09/02/06 20:44:42 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows took 14239ms.
09/02/06 20:44:42 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows.
09/02/06 20:46:20 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows took 97716ms.
09/02/06 20:46:20 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows.
09/02/06 20:47:54 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows took 93736ms.
09/02/06 20:47:54 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows.
09/02/06 20:47:54 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows took 389ms.
}}}

== HDFS using 1k keys ==

=== rfile w/ 8k buffer and pread ===

==== First Run ====

{{{
$ ./bin/hadoop org.apache.hadoop.hbase.RFilePerformanceEvaluation
09/02/06 20:52:36 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows.
09/02/06 20:52:39 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows took 2949ms.
09/02/06 20:52:39 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows.
09/02/06 20:53:24 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows took 44921ms.
09/02/06 20:53:24 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows.
09/02/06 20:55:07 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows took 102617ms.
09/02/06 20:55:07 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows.
09/02/06 21:01:45 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows took 398033ms.
09/02/06 21:01:45 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows.
09/02/06 21:01:56 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows took 10784ms.
}}}

==== Second Run ====

{{{
09/02/06 22:10:51 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows.
09/02/06 22:10:54 INFO hbase.RFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows took 3151ms.
09/02/06 22:10:54 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows.
09/02/06 22:11:37 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows took 42660ms.
09/02/06 22:11:37 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows.
09/02/06 22:13:18 INFO hbase.RFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows took 100919ms.
09/02/06 22:13:18 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows.
09/02/06 22:19:49 INFO hbase.RFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows took 390413ms.
09/02/06 22:19:49 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows.
09/02/06 22:19:59 INFO hbase.RFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows took 10883ms.
}}}

=== MapFile ===

==== First Run ====

{{{
$ ./bin/hadoop org.apache.hadoop.hbase.MapFilePerformanceEvaluation
09/02/06 21:03:19 INFO hbase.MapFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows.
09/02/06 21:03:19 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
09/02/06 21:03:19 INFO compress.CodecPool: Got brand-new compressor
09/02/06 21:03:19 INFO compress.CodecPool: Got brand-new compressor
09/02/06 21:03:34 INFO hbase.MapFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows took 14293ms.
09/02/06 21:03:34 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows.
09/02/06 21:04:03 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows took 29751ms.
09/02/06 21:04:03 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows.
09/02/06 21:07:50 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows took 226938ms.
09/02/06 21:07:50 INFO hbase.MapFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows.
09/02/06 21:11:41 INFO hbase.MapFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows took 230951ms.
09/02/06 21:11:41 INFO hbase.MapFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows.
09/02/06 21:11:44 INFO hbase.MapFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows took 2560ms.
}}}

==== Second Run ====

{{{
$ ./bin/hadoop org.apache.hadoop.hbase.MapFilePerformanceEvaluation ; ./bin/hadoop org.apache.hadoop.hbase.RFilePerformanceEvaluation
09/02/06 22:02:09 INFO hbase.MapFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows.
09/02/06 22:02:09 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
09/02/06 22:02:09 INFO compress.CodecPool: Got brand-new compressor
09/02/06 22:02:09 INFO compress.CodecPool: Got brand-new compressor
09/02/06 22:02:23 INFO hbase.MapFilePerformanceEvaluation: Running SequentialWriteBenchmark for 100000 rows took 14016ms.
09/02/06 22:02:23 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows.
09/02/06 22:02:56 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomSmallScan for 100000 rows took 32547ms.
09/02/06 22:02:56 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows.
09/02/06 22:06:50 INFO hbase.MapFilePerformanceEvaluation: Running UniformRandomReadBenchmark for 100000 rows took 234207ms.
09/02/06 22:06:50 INFO hbase.MapFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows.
09/02/06 22:10:48 INFO hbase.MapFilePerformanceEvaluation: Running GaussianRandomReadBenchmark for 100000 rows took 237558ms.
09/02/06 22:10:48 INFO hbase.MapFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows.
09/02/06 22:10:50 INFO hbase.MapFilePerformanceEvaluation: Running SequentialReadBenchmark for 100000 rows took 2625ms.
}}}
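A quick way to read the HDFS numbers with 10-byte keys/cells is as speedup ratios. The sketch below (not part of the benchmark code, just an illustration) recomputes the ratios from the millisecond figures reported in the MapFile and rfile-seek+read first runs above:

```python
# Speedup of RFile (8k buffer, seek+read) over MapFile on the
# 4-node HDFS cluster, 10-byte keys/cells, first runs.
# Values are milliseconds copied from the logs above.
timings_ms = {
    # benchmark: (MapFile, RFile)
    "SequentialWrite":        (2754, 578),
    "UniformRandomSmallScan": (22265, 17492),
    "UniformRandomRead":      (124587, 119389),
    "GaussianRandomRead":     (125150, 115376),
    "SequentialRead":         (960, 102),
}

for name, (mapfile, rfile) in timings_ms.items():
    print(f"{name}: RFile is {mapfile / rfile:.2f}x faster")
```

On these runs RFile wins every benchmark: roughly 4.8x on sequential write and 9.4x on sequential read, but only a few percent on the uniform/Gaussian random reads, which presumably are dominated by HDFS seek latency rather than file-format overhead.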
