Thanks for the reply. Here is the info:
What is your concurrency like (how many concurrent mappers?) - We have 8
concurrent mappers running.
Where is the time being spent? In the server, in the mapper? - Most of the
time is spent calling HTable.batch(...) inside the mapper.
Why are you having scanner timeouts if you are doing big batch Gets? - We
are getting the scanner timeout from the original Scan that serves the input
records to the mapper. Scanner caching is set to 100. I think that because
the mapper takes too long (due to the batch Gets inside it) to process the
initial 100 records, fetching the next batch of scanned records throws the
exception.
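One possible mitigation (a sketch only - the property name below is the 0.90/0.92-era scanner-lease setting, so please verify it against the version you are running) is to raise the region server's scanner lease above the mapper's worst-case processing time per batch, e.g. in hbase-site.xml:

```
<!-- Hypothetical hbase-site.xml fragment: lengthen the scanner lease so a
     slow mapper does not lose its lease between next() calls. -->
<property>
  <name>hbase.regionserver.lease.period</name>
  <value>900000</value> <!-- 15 min, up from the current 300000 -->
</property>
```

Alternatively, lowering the caching on the input Scan (e.g. Scan.setCaching(10) instead of 100) shrinks each batch, so the lease is renewed more often without any server-side change.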
Also, could this be happening due to concurrency? I am currently on a single
region server. When I run the test case, the batch Gets happen sequentially,
whereas from the map-reduce job the batch Gets happen concurrently against the
same region server. Could the performance degradation during map-reduce be due
to thrashing on that single region server? Thoughts?
Thanks,
Himanish
On Mon, Feb 20, 2012 at 3:39 PM, Stack <[email protected]> wrote:
> On Mon, Feb 20, 2012 at 12:04 PM, Himanish Kushary <[email protected]>
> wrote:
> > Also to add , from the map-reduce we have started seeing
> >
> > org.apache.hadoop.hbase.client.ScannerTimeoutException: 360388ms
> > passed since the last invocation, timeout is currently set to 300000
> >
> > due to the extremely high time spent on firing the batch Gets
> >
>
> What is your concurrency like (How many concurrent mappers?). Where
> is the time being spent? In the server, in the mapper? Why are you
> having scanner timeouts if you are doing big batch Gets?
>
> St.Ack
>
--
Thanks & Regards
Himanish