It does sound familiar to me.

J-D

On Sat, Mar 19, 2011 at 11:10 AM, Andrew Purtell <[email protected]> wrote:
> I have a MapReduce task put together for experimentation that does a lot of 
> Increments over three tables and Puts to another. I set writeToWAL to false. 
> My HBase includes the patch that fixes serialization of writeToWAL for 
> Increments. MemstoreLAB is enabled; it is probably not a factor, but I still 
> need to test to exclude it.
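>
> (For reference, a minimal sketch of one such update with the WAL skipped -- 
> the table, row, family, and qualifier names below are just placeholders, not 
> the job's real schema:)
>
>   import org.apache.hadoop.conf.Configuration;
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.client.HTable;
>   import org.apache.hadoop.hbase.client.Increment;
>   import org.apache.hadoop.hbase.util.Bytes;
>
>   Configuration conf = HBaseConfiguration.create();
>   HTable table = new HTable(conf, "counters");             // placeholder table name
>   Increment inc = new Increment(Bytes.toBytes("row-1"));   // placeholder row key
>   inc.addColumn(Bytes.toBytes("f"), Bytes.toBytes("q"), 1L);
>   inc.setWriteToWAL(false);   // skip the WAL; this is the flag the patch serializes
>   table.increment(inc);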
>
> After starting the job on a test cluster on EC2 with 20 mappers over 10 
> slaves, I initially see 10-15K ops/sec/server. This performance drops over a 
> short time and stabilizes around 1K ops/sec/server. So I flush the tables with 
> the shell. Immediately after flushing the tables, performance is back up to 
> 10-15K ops/sec/server. If I don't flush, performance remains low 
> indefinitely. If I flush only the table receiving the Gets, performance 
> remains low.
>
> If I set the shell to flush in a loop every 60 seconds, performance 
> repeatedly drops during that interval, then recovers after flushing.
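>
> (That loop amounts to something like the following, expressed through the 
> Java admin API instead of the shell; the table name is a placeholder:)
>
>   import org.apache.hadoop.hbase.HBaseConfiguration;
>   import org.apache.hadoop.hbase.client.HBaseAdmin;
>
>   HBaseAdmin admin = new HBaseAdmin(HBaseConfiguration.create());
>   while (true) {
>     admin.flush("t1");        // placeholder table name; requests a flush of its regions
>     Thread.sleep(60 * 1000L); // roughly the 60 second interval described above
>   }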
>
> When Gary and I went to NCHC in Taiwan, we saw a guy from PhiCloud present 
> something similar to this regarding 0.89DR. He measured the performance of 
> the memstore for a get-and-put use case over time and graphed it; the time 
> increased in a staircase with a trend toward O(n). This was a surprising 
> result, since ConcurrentSkipListMap#put is supposed to run in O(log n). His 
> workaround was to flush after some fixed number of gets+puts, 1000 I think. 
> At the time we weren't sure what was going on, given the language barrier.
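>
> (A toy measurement along those lines could look like the sketch below: time 
> batches of put() calls into a growing ConcurrentSkipListMap and see whether 
> per-batch cost stays roughly O(log n) or climbs. It only exercises a bare 
> skip list, not the memstore itself:)
>
>   import java.util.concurrent.ConcurrentSkipListMap;
>
>   public class SkipListPutTiming {
>     public static void main(String[] args) {
>       ConcurrentSkipListMap<Long, Long> map = new ConcurrentSkipListMap<Long, Long>();
>       long key = 0;
>       for (int batch = 0; batch < 50; batch++) {
>         long start = System.nanoTime();
>         for (int i = 0; i < 100000; i++) {
>           map.put(key, key);   // monotonically increasing keys, like time-ordered edits
>           key++;
>         }
>         long ms = (System.nanoTime() - start) / 1000000L;
>         System.out.println("batch " + batch + ": " + ms + " ms, size=" + map.size());
>       }
>     }
>   }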
>
> Sound familiar?
>
> I don't claim to really understand what is going on, but I need to get to the 
> bottom of this. I'm going to look at it in depth starting Monday.
>
>   - Andy
>
