[Hadoop Wiki] Update of "Hbase/PerformanceEvaluation" by stack

Apache Wiki Sat, 17 Jan 2009 14:53:42 -0800

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change 
notification.


The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/PerformanceEvaluation

------------------------------------------------------------------------------
  
  Start the cluster fresh for each test, then wait for all regions to be 
deployed before starting the tests (this means there is no content in memcache, 
so for tests such as random read we always go to the filesystem and never get 
values from memcache).
  
- ||<rowbgcolor="#ececec">Experiment 
Run||0.2.0java6||mapfile0.17.1||0.19.0RC1!Java6||0.19.0RC1!Java6!Zlib||mapfile0.19.0||!BigTable||
+ ||<rowbgcolor="#ececec">Experiment 
Run||0.2.0java6||mapfile0.17.1||0.19.0RC1!Java6||0.19.0RC1!Java6!Zlib||0.19.0RC1!Java6,8Clients||mapfile0.19.0||!BigTable||
- ||random reads ||428||568||540||80||768||1212||
+ ||random reads ||428||568||540||80||768||768||1212||
- ||random reads (mem)||-||-||-||-||-||10811||
+ ||random reads (mem)||-||-||-||-||-||-||10811||
- ||random writes||2167||2218||9986||-||-||8850||
+ ||random writes||2167||2218||9986||-||-||-||8850||
- ||sequential reads||427||582||464||-||-||4425||
+ ||sequential reads||427||582||464||-||-||-||4425||
- ||sequential writes||2076||5684||9892||7182||7519||8547||
+ ||sequential writes||2076||5684||9892||7182||14027||7519||8547||
- ||scans||3737||55692||20971||20560||55555||15385||
+ ||scans||3737||55692||20971||20560||14742||55555||15385||
  
  Some improvement in writing and scanning (seemingly faster than the figures 
in the BigTable paper).  Random reads still lag.  Sequential reads lag badly.  
A bit of fetch-ahead, as we did for scanning, should help here.
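
As a rough illustration of the comparison above, the following minimal sketch divides the 0.19.0RC1!Java6 column by the !BigTable column for each experiment. The numbers are copied from the table; the dictionary names and the `ratios` helper are hypothetical and not part of PerformanceEvaluation.

```python
# Figures copied from the table above: the 0.19.0RC1!Java6 column and the
# !BigTable column. The names below are illustrative only.
HBASE_0_19 = {
    "random reads": 540,
    "random writes": 9986,
    "sequential reads": 464,
    "sequential writes": 9892,
    "scans": 20971,
}
BIGTABLE = {
    "random reads": 1212,
    "random writes": 8850,
    "sequential reads": 4425,
    "sequential writes": 8547,
    "scans": 15385,
}


def ratios(ours, reference):
    """Return ours/reference for every experiment present in both tables."""
    return {name: ours[name] / reference[name] for name in ours if name in reference}


if __name__ == "__main__":
    # A ratio above 1.0 means the hbase figure exceeds the BigTable figure.
    for name, ratio in ratios(HBASE_0_19, BIGTABLE).items():
        print(f"{name:17s} {ratio:5.2f}x of BigTable")
```

Ratios above 1.0 (writes and scans) correspond to the "faster than BigTable" remark; ratios well below 1.0 (random and sequential reads) correspond to the lag noted above.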
  
@@ -197, +197 @@

  
  Of note, the mapfile numbers are less than those of hbase when writing 
because the mapfile tests write one file whereas hbase after first split is 
writing to multiple files concurrently.  On the other hand, hbase random read 
is very like mapfile random read, at least in single client case; we're 
effectively asking the filesystem for a random value from the midst of a file 
in both cases.  The mapfile numbers are useful as a gauge of how much hdfs has 
come on since the last time we ran PE.
  
- Block compression (zlib -- hbase bug won't let you specify lzo) is a little 
slower writing, way slower random-reading but about same scanning.
+ Block compression (native zlib -- an hbase bug won't let you specify any 
codec other than the DefaultCodec, so lzo, for example, cannot be used) is a 
little slower when writing, much slower for random reads, but about the same 
for scanning.
  
- Will post a new state, 8 concurrent clients, in a while so we can start 
tracking how we are doing when contending clients.
+ The 8 concurrent clients write to a single regionserver instance.  Our 
cluster is four computers.  Load was put up by running an MR job as follows: 
{{{$ ./bin/hadoop org.apache.hadoop.hbase.PerformanceEvaluation randomRead  8}}} 
The MR job ran two mappers per computer, so 8 clients were running 
concurrently.  Timings are those reported at the head of the MR job page in 
the UI.
