Himanish, it's hard to say without trend graphs.  Set up Ganglia and
graph fs read latency, as well as thread count, to see what the issue
might be.
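For reference, on an HBase of that era the Ganglia hookup goes in hadoop-metrics.properties on each node; a minimal sketch (the gmetad host and port are placeholders, adjust for your setup):

```
# Sketch of $HBASE_HOME/conf/hadoop-metrics.properties, sending
# HBase metrics (including fsReadLatency) to Ganglia every 10s.
hbase.class=org.apache.hadoop.metrics.ganglia.GangliaContext
hbase.period=10
hbase.servers=gmetad.example.com:8649

# JVM metrics (GC time, thread count) go through the same context.
jvm.class=org.apache.hadoop.metrics.ganglia.GangliaContext
jvm.period=10
jvm.servers=gmetad.example.com:8649
```

Restart the region servers after editing the file so the new metrics context is picked up.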

-Jack

On Thu, May 19, 2011 at 11:46 AM, Himanish Kushary <[email protected]> wrote:
> Hi,
>
> Could anybody suggest what the issue may be? I ran YCSB on both the
> development and production servers.
>
> The loading of data performs better on the production cluster, but the 50%
> read / 50% write workload (workloada) performs better on development. The
> average read latency shoots up to 30-40 ms on production, while on
> development it is between 10-20 ms. This was while running with 10 threads
> targeting 1000 tps, using this command: [*java -cp
> build/ycsb.jar:db/hbase/conf:db/hbase/lib/*
> com.yahoo.ycsb.Client -t -db com.yahoo.ycsb.db.HBaseClient -P
> workloads/workloada -p columnfamily=data -p operationcount=1000000 -s
> -threads 10 -target 1000*]
>
> The clusters seem to perform similarly under YCSB when the tps and
> operation count are lowered to 500 and 100000 respectively.
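A back-of-the-envelope latency-budget check may explain why the clusters separate only at the higher tps target (this just restates the figures above, it is not a measurement):

```python
# With N client threads pushing a shared target of T ops/sec, each
# thread must finish one operation every N / T seconds on average,
# or the client falls behind the target and latencies pile up.

def per_op_budget_ms(threads, target_ops_per_sec):
    """Average time each thread can spend per operation, in ms."""
    return threads / target_ops_per_sec * 1000.0

# 10 threads at 1000 ops/sec -> 10 ms budget per operation.
# Production reads at 30-40 ms blow through that budget; development
# at 10-20 ms is marginal but much closer.
print(per_op_budget_ms(10, 1000))  # 10.0

# 10 threads at 500 ops/sec -> 20 ms budget, which both clusters'
# read latencies can roughly meet, so they look similar there.
print(per_op_budget_ms(10, 500))   # 20.0
```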
>
> We ran our map-reduces on the two clusters (assuming that we would not
> reach 1000 tps or that much of an operation count from the map-reduce),
> but strangely the development cluster performed better.
>
> Any suggestions would be really helpful.
>
> Thanks
> Himanish
>
>
>
> On Mon, May 16, 2011 at 4:43 PM, Himanish Kushary <[email protected]> wrote:
>
>> *PRODUCTION SERVER CPU INFO*
>> processor : 0
>> vendor_id : AuthenticAMD
>> cpu family : 16
>> model : 9
>> model name : AMD Opteron(tm) Processor 6174
>> stepping : 1
>> cpu MHz : 2200.022
>> cache size : 512 KB
>> physical id : 1
>> siblings : 12
>> core id : 0
>> cpu cores : 12
>> apicid : 16
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 5
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
>> pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp
>> lm 3dnowext 3dnow constant_tsc nonstop_tsc pni cx16 popcnt lahf_lm
>> cmp_legacy svm extapic cr8_legacy altmovcr8 abm sse4a misalignsse
>> 3dnowprefetch osvw
>> bogomips : 4400.03
>> TLB size : 1024 4K pages
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 48 bits physical, 48 bits virtual
>> power management: ts ttp tm stc 100mhzsteps hwpstate [8]
>>
>>
>> *DEVELOPMENT SERVER CPU INFO*
>>
>> processor : 0
>> vendor_id : GenuineIntel
>> cpu family : 6
>> model : 30
>> model name : Intel(R) Core(TM) i7 CPU       Q 740  @ 1.73GHz
>> stepping : 5
>> cpu MHz : 933.000
>> cache size : 6144 KB
>> physical id : 0
>> siblings : 8
>> core id : 0
>> cpu cores : 4
>> apicid : 0
>> initial apicid : 0
>> fpu : yes
>> fpu_exception : yes
>> cpuid level : 11
>> wp : yes
>> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat
>> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm
>> constant_tsc arch_perfmon pebs bts rep_good xtopology nonstop_tsc aperfmperf
>> pni dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm sse4_1 sse4_2
>> popcnt lahf_lm ida tpr_shadow vnmi flexpriority ept vpid
>> bogomips : 3457.61
>> clflush size : 64
>> cache_alignment : 64
>> address sizes : 36 bits physical, 48 bits virtual
>> power management:
>>
>>
>>
>> On Mon, May 16, 2011 at 4:26 PM, Jack Levin <[email protected]> wrote:
>>
>>> What is the clock rate of your CPUs (desktop vs blade)?
>>>
>>> -Jack
>>>
>>> On Mon, May 16, 2011 at 1:24 PM, Himanish Kushary <[email protected]>
>>> wrote:
>>> > Yes, it is only the HW that was changed. All the configurations are
>>> > kept at default from the Cloudera installer.
>>> >
>>> > The regionserver logs seem ok.
>>> >
>>> > On Mon, May 16, 2011 at 3:20 PM, Jean-Daniel Cryans <
>>> [email protected]>wrote:
>>> >
>>> >> Ok I see... so the only thing that changed is the HW right? No
>>> >> upgrades to a new version? Also could it be possible that you changed
>>> >> some configs (or missed them)? BTW count has a parameter for
>>> >> scanner caching; you would write: count "myTable", CACHE => 1000
>>> >>
>>> >> and it should stream through your data.
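Scanner caching matters because each scanner next() RPC fetches up to CACHE rows; a rough sketch of the round-trip count (the row counts below are made up for illustration):

```python
import math

def scan_rpcs(total_rows, cache):
    """Approximate number of scanner RPCs needed to stream all rows
    when each round trip returns up to `cache` rows."""
    return math.ceil(total_rows / cache)

rows = 10_000_000                 # hypothetical table size
print(scan_rpcs(rows, 1))         # default caching of 1 -> 10,000,000 round trips
print(scan_rpcs(rows, 1000))      # CACHE => 1000        -> 10,000 round trips
```

With the default of one row per round trip, a count is dominated by RPC latency rather than disk throughput, which is why raising the cache makes it stream.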
>>> >>
>>> >> Anything weird in the region server logs?
>>> >>
>>> >> J-D
>>> >>
>>> >> On Mon, May 16, 2011 at 12:13 PM, Himanish Kushary <[email protected]
>>> >
>>> >> wrote:
>>> >> > Thanks for the reply. We ran the TestDFSIO benchmark on both the
>>> >> > development and production machines and found production to be
>>> >> > better. The statistics are shown below.
>>> >> >
>>> >> > But once we bring HBase into the picture things get reversed :-(
>>> >> >
>>> >> > The count operation, map-reduces etc. perform worse on the
>>> >> > production box. We are using pseudo-distributed mode on both the
>>> >> > development and production servers for both Hadoop and HBase.
>>> >> >
>>> >> > *DEVELOPMENT SERVER*
>>> >> >
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: ----- TestDFSIO ----- : write
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:            Date & time: Sun May
>>> 15
>>> >> > 21:26:26 EDT 2011
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:        Number of files: 10
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: Total MBytes processed: 10000
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:      Throughput mb/sec:
>>> >> > 58.09495038691237
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO: Average IO rate mb/sec:
>>> >> > 59.699485778808594
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:  IO rate std deviation:
>>> >> > 10.54547265175703
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:     Test exec time sec: 163.354
>>> >> > 11/05/15 21:26:26 INFO fs.TestDFSIO:
>>> >> >
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: ----- TestDFSIO ----- : read
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:            Date & time: Sun May
>>> 15
>>> >> > 21:28:44 EDT 2011
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:        Number of files: 10
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: Total MBytes processed: 10000
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:      Throughput mb/sec:
>>> >> > 682.4075337791729
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO: Average IO rate mb/sec:
>>> >> > 755.5845947265625
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:  IO rate std deviation:
>>> >> > 229.60029445080488
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:     Test exec time sec: 63.896
>>> >> > 11/05/15 21:28:44 INFO fs.TestDFSIO:
>>> >> >
>>> >> >
>>> >> >
>>> >> >
>>> >> > *PRODUCTION SERVER*
>>> >> >
>>> >> > 5/16 01:00:43 INFO fs.TestDFSIO: ----- TestDFSIO ----- : *WRITE
>>> >> PERFORMANCE*
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Date & time: Mon May 16 01:00:43
>>> >> > GMT+00:00 2011
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Number of files: 10
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Total MBytes processed: 10000
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Throughput mb/sec:
>>> 69.25447557048375
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Average IO rate mb/sec:
>>> >> > 70.06581115722656
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: IO rate std deviation:
>>> >> > 7.243961483443693
>>> >> >
>>> >> > 11/05/16 01:00:43 INFO fs.TestDFSIO: Test exec time sec: 126.896
>>> >> >
>>> >> >
>>> >> > 5/16 01:25:01 INFO fs.TestDFSIO: ----- TestDFSIO ----- : *READ
>>> >> PERFORMANCE*
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Date & time: Mon May 16 01:25:01
>>> >> > GMT+00:00 2011
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Number of files: 10
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Total MBytes processed: 10000
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Throughput mb/sec:
>>> 1487.20999405116
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: Average IO rate mb/sec:
>>> >> > 1525.230712890625
>>> >> >
>>> >> > 11/05/16 01:25:01 INFO fs.TestDFSIO: IO rate std deviation:
>>> >> > 239.54492784268226
>>> >> >
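For what it's worth, plugging the TestDFSIO throughput figures above into a quick ratio (just restating the reported numbers, not a new measurement):

```python
# Throughput figures copied from the TestDFSIO runs above (MB/sec).
dev  = {"write": 58.09, "read": 682.41}
prod = {"write": 69.25, "read": 1487.21}

for op in ("write", "read"):
    ratio = prod[op] / dev[op]
    print(f"{op}: production is about {ratio:.1f}x development")
```

So at the raw-HDFS level production wins on both writes (~1.2x) and reads (~2.2x), which makes the reversal under HBase all the more suspicious.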
>>> >>
>>> >
>>> >
>>> >
>>> > --
>>> > Thanks & Regards
>>> > Himanish
>>> >
>>>
>>
>>
>>
>> --
>> Thanks & Regards
>> Himanish
>>
>
>
>
> --
> Thanks & Regards
> Himanish
>
