Hi, could anybody suggest what the issue may be? I ran YCSB on both the development and production servers.
The loading of data performs better on the production cluster, but the 50% read / 50% write workloada performs better on development. The average read latency shoots up to 30-40 ms on production; on development it is between 10-20 ms. This was while running with 10 threads maintaining 1000 tps, using this command:

java -cp build/ycsb.jar:db/hbase/conf:db/hbase/lib/* com.yahoo.ycsb.Client -t -db com.yahoo.ycsb.db.HBaseClient -P workloads/workloada -p columnfamily=data -p operationcount=1000000 -s -threads 10 -target 1000

The clusters seem to perform similarly under YCSB when the tps and operationcount are lowered to 500 and 100000 respectively. We ran our MapReduce jobs on the two clusters (assuming that we would not reach 1000 tps or that many operations from the MapReduce), but strangely the development cluster performed better.

Any suggestions would be really helpful.

Thanks
Himanish

On Mon, May 16, 2011 at 4:43 PM, Himanish Kushary <[email protected]> wrote:

> *PRODUCTION SERVER CPU INFO*
> processor       : 0
> vendor_id       : AuthenticAMD
> cpu family      : 16
> model           : 9
> model name      : AMD Opteron(tm) Processor 6174
> stepping        : 1
> cpu MHz         : 2200.022
> cache size      : 512 KB
> physical id     : 1
> siblings        : 12
> core id         : 0
> cpu cores       : 12
> apicid          : 16
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 5
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm 3dnowext 3dnow constant_tsc nonstop_tsc pni cx16 popcnt lahf_lm cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse 3dnowprefetch osvw
> bogomips        : 4400.03
> TLB size        : 1024 4K pages
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 48 bits physical, 48 bits virtual
> power management: ts ttp tm stc 100mhzsteps hwpstate [8]
>
> *DEVELOPMENT SERVER CPU INFO*
> processor       : 0
> vendor_id       : GenuineIntel
> cpu family      : 6
> model           : 30
> model name      : Intel(R) Core(TM) i7 CPU Q 740 @ 1.73GHz
> stepping        : 5
> cpu MHz         : 933.000
> cache size      : 6144 KB
> physical id     : 0
> siblings        : 8
> core id         : 0
> cpu cores       : 4
> apicid          : 0
> initial apicid  : 0
> fpu             : yes
> fpu_exception   : yes
> cpuid level     : 11
> wp              : yes
> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2 popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
> bogomips        : 3457.61
> clflush size    : 64
> cache_alignment : 64
> address sizes   : 36 bits physical, 48 bits virtual
> power management:
>
> On Mon, May 16, 2011 at 4:26 PM, Jack Levin <[email protected]> wrote:
>
>> What is the clock rate of your CPUs (desktop vs blade)?
>>
>> -Jack
>>
>> On Mon, May 16, 2011 at 1:24 PM, Himanish Kushary <[email protected]> wrote:
>> > Yes, it is only the HW that was changed. All the configurations are kept at default from the Cloudera installer.
>> >
>> > The regionserver logs seem OK.
>> >
>> > On Mon, May 16, 2011 at 3:20 PM, Jean-Daniel Cryans <[email protected]> wrote:
>> >
>> >> Ok I see... so the only thing that changed is the HW, right? No upgrades to a new version? Also, could it be possible that you changed some configs (or missed them)? BTW, counting has a parameter for scanner caching; you would write: count "myTable", CACHE => 1000
>> >> and it should stream through your data.
>> >>
>> >> Anything weird in the region server logs?
>> >>
>> >> J-D
>> >>
>> >> On Mon, May 16, 2011 at 12:13 PM, Himanish Kushary <[email protected]> wrote:
>> >> > Thanks for the reply. We ran the TestDFSIO benchmark on both the development and production and found the production to be better. The statistics are shown below.
>> >> >
>> >> > But once we bring HBase into the picture things get reversed :-(
>> >> >
>> >> > The count operation, map-reduces, etc. perform worse on the production box. We are using pseudo-distributed mode on both the development and production servers for both Hadoop and HBase.
>> >> >
>> >> > *DEVELOPMENT SERVER*
>> >> >
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:            Date & time: Sun May 15 21:26:26 EDT 2011
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:        Number of files: 10
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: Total MBytes processed: 10000
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:      Throughput mb/sec: 58.09495038691237
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: Average IO rate mb/sec: 59.699485778808594
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:  IO rate std deviation: 10.54547265175703
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:     Test exec time sec: 163.354
>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:
>> >> >
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:            Date & time: Sun May 15 21:28:44 EDT 2011
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:        Number of files: 10
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: Total MBytes processed: 10000
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:      Throughput mb/sec: 682.4075337791729
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: Average IO rate mb/sec: 755.5845947265625
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:  IO rate std deviation: 229.60029445080488
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:     Test exec time sec: 63.896
>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:
>> >> >
>> >> > *PRODUCTION SERVER*
>> >> >
>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: ----- TestDFSIO ----- : *WRITE PERFORMANCE*
>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO:            Date & time: Mon May 16 01:00:43 GMT+00:00 2011
>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO:        Number of files: 10
>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Total MBytes processed: 10000
>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO:      Throughput mb/sec: 69.25447557048375
>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Average IO rate mb/sec: 70.06581115722656
>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO:  IO rate std deviation: 7.243961483443693
>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO:     Test exec time sec: 126.896
>> >> >
>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: ----- TestDFSIO ----- : *READ PERFORMANCE*
>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO:            Date & time: Mon May 16 01:25:01 GMT+00:00 2011
>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO:        Number of files: 10
>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Total MBytes processed: 10000
>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO:      Throughput mb/sec: 1487.20999405116
>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Average IO rate mb/sec: 1525.230712890625
>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO:  IO rate std deviation: 239.54492784268226
>> >>
>> >
>> > --
>> > Thanks & Regards
>> > Himanish
>
> --
> Thanks & Regards
> Himanish

--
Thanks & Regards
Himanish
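P.S. A back-of-the-envelope sketch (my own arithmetic, not from YCSB) of why the -target 1000 run is so sensitive to read latency: with N client threads sharing a target of T ops/sec, each thread must average T/N ops/sec, so each operation has a budget of N/T seconds before the target slips.

```python
def per_op_budget_ms(threads: int, target_tps: int) -> float:
    """Milliseconds each operation may take, per thread, before the
    aggregate target throughput can no longer be sustained."""
    return threads / target_tps * 1000.0

# Our run: 10 threads, -target 1000  ->  10 ms budget per operation.
print(per_op_budget_ms(threads=10, target_tps=1000))   # 10.0

# The lowered run: 10 threads, -target 500  ->  20 ms budget.
print(per_op_budget_ms(threads=10, target_tps=500))    # 20.0
```

By this arithmetic the development reads (10-20 ms) sit near the 10 ms budget, while the production reads (30-40 ms) exceed it 3-4x, so the client cannot sustain 1000 tps there and queueing would inflate the measured average further; at -target 500 both clusters fit inside the 20 ms budget, which matches the two clusters looking similar in that run.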
