This is a fantastic test and should be made more public. Great work.

Dave

-----Original Message-----
From: M.Deniz OKTAR [mailto:[email protected]]
Sent: Friday, May 13, 2011 2:20 PM
To: [email protected]; [email protected]
Subject: Re: VMWare and Hadoop/Hbase

I've tried this in our internal tests with Xen. We wanted to see whether
performance degrades consistently with the amount of resources we take away
from the standalone machine and give to the virtualized one.

The cluster I tested is described below. I gave one disk (25% of 4), 1 core
with medium priority (12.5% of 8) and ~1100 MB of memory (10%) to the second
virtual machine, and the results of the virtual machines with the remaining
resources were awful.

Write performance degraded by around 25%, which is cool. But reads were around
3x slower, which pointed out that the hypervisor is not so good at handling
huge random disk accesses.

Tests:
Yahoo benchmark, 100M

Cluster:
5x cluster (1 namenode, 4 data nodes)
Xeon 5620
12 GB RAM
4x SATA 7200 rpm drives

Results:

=== TEST 1: 100M inserts ===

---STANDALONE CLUSTER---
YCSB Client 0.1
Command line: -db com.yahoo.ycsb.db.HBaseClient -load -p columnfamily=values -P myworkloads/4 -threads 100 -s
[OVERALL], RunTime(ms), 5374309.0
[OVERALL], Throughput(ops/sec), 18607.043249653118
[INSERT], Operations, 100000000
[INSERT], AverageLatency(ms), 4.88454509
[INSERT], MinLatency(ms), 0
[INSERT], MaxLatency(ms), 92837

---VIRTUAL CLUSTER---
Command line: -db com.yahoo.ycsb.db.HBaseClient -load -p columnfamily=values -P myworkloads/4 -threads 100 -s
[OVERALL], RunTime(ms), 6912776.0
[OVERALL], Throughput(ops/sec), 14465.968519737946
[INSERT], Operations, 100000000
[INSERT], AverageLatency(ms), 6.36836144
[INSERT], MinLatency(ms), 0
[INSERT], MaxLatency(ms), 104990

=== TEST 2: 100M read/update/write (transaction) ===

---STANDALONE CLUSTER---
YCSB Client 0.1
Command line: -db com.yahoo.ycsb.db.HBaseClient -p columnfamily=values -P myworkloads/4 -threads 100 -s
[OVERALL], RunTime(ms), 9012970.0
[OVERALL], Throughput(ops/sec), 3328.53654233843
[UPDATE], Operations, 4500829
[UPDATE], AverageLatency(ms), 0.09758402285445637
[UPDATE], MinLatency(ms), 0
[UPDATE], MaxLatency(ms), 5477

---VIRTUAL CLUSTER---
YCSB Client 0.1
Command line: -db com.yahoo.ycsb.db.HBaseClient -p columnfamily=values -P myworkloads/4 -threads 100 -s
[OVERALL], RunTime(ms), 2.0272831E7
[OVERALL], Throughput(ops/sec), 1479.813056203152
[UPDATE], Operations, 4502501
[UPDATE], AverageLatency(ms), 2.803586717693122
[UPDATE], MinLatency(ms), 0
[UPDATE], MaxLatency(ms), 560605

--
M.Deniz OKTAR
iletken Recommendation Technologies
http://www.iletken.com.tr
Tel: +90(212)328-0290
GSM: +90(533)477-6358

On Mon, May 9, 2011 at 10:41 PM, Andrew Purtell <[email protected]> wrote:

> It is not advisable to do this.
>
> Hadoop/HBase is very I/O intensive. They should have dedicated hardware.
> Why add the overhead of Hypervisor mediation on the I/O path then?
>
> --- On Mon, 5/9/11, Vishal Kapoor <[email protected]> wrote:
>
> > From: Vishal Kapoor <[email protected]>
> > Subject: VMWare and Hadoop/Hbase
> > To: [email protected]
> > Date: Monday, May 9, 2011, 12:24 PM
> > We were wondering if it's advisable to
> > provision hbase/hadoop nodes as VMWare
> > instances?
> > any suggestions?
> >
> > thanks,
> > Vishal
> >
>
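
As a rough cross-check of the percentages mentioned in this thread, the sketch below recomputes the standalone-versus-virtual ratios from the YCSB figures quoted above. The function and variable names are illustrative assumptions and are not part of YCSB. The quoted output contains no [READ] lines, so the "around 3x slower" read figure cannot be recomputed from this message; only the overall and [UPDATE] numbers are used.

```python
# Figures copied verbatim from the YCSB output quoted in the thread.
# All names below are illustrative (hypothetical); they are not YCSB identifiers.


def relative_drop(baseline: float, degraded: float) -> float:
    """Fraction by which `degraded` falls short of `baseline`."""
    return (baseline - degraded) / baseline


def slowdown(baseline: float, degraded: float) -> float:
    """How many times larger `degraded` is than `baseline`."""
    return degraded / baseline


# TEST 1: 100M inserts
insert_tput_standalone = 18607.043249653118  # ops/sec
insert_tput_virtual = 14465.968519737946     # ops/sec
insert_runtime_standalone = 5374309.0        # ms
insert_runtime_virtual = 6912776.0           # ms

# TEST 2: 100M read/update/write (transaction)
txn_tput_standalone = 3328.53654233843       # ops/sec
txn_tput_virtual = 1479.813056203152         # ops/sec
update_latency_standalone = 0.09758402285445637  # ms, average
update_latency_virtual = 2.803586717693122       # ms, average

insert_tput_drop = relative_drop(insert_tput_standalone, insert_tput_virtual)
insert_runtime_ratio = slowdown(insert_runtime_standalone, insert_runtime_virtual)
txn_tput_ratio = slowdown(txn_tput_virtual, txn_tput_standalone)
update_latency_ratio = slowdown(update_latency_standalone, update_latency_virtual)

print(f"Insert throughput drop:      {insert_tput_drop:.1%}")
print(f"Insert runtime ratio:        {insert_runtime_ratio:.2f}x")
print(f"Transaction throughput gap:  {txn_tput_ratio:.2f}x")
print(f"Update avg latency ratio:    {update_latency_ratio:.1f}x")
```

On these numbers the insert throughput drop comes out near 22%, in line with the "around 25%" write degradation described, and the overall transaction throughput on the standalone cluster is a little over twice that of the virtual cluster.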
