On Wed, Oct 19, 2011 at 12:51 PM, Eran Kutner <[email protected]> wrote:

> Hi J-D,
> Thanks for the detailed explanation.
> So if I understand correctly, the lease we're talking about is a scanner
> lease and the timeout is between two scanner calls, correct? I think that
> makes sense, because I now realize that the jobs that fail (some jobs
> continued to fail even after reducing the number of map tasks as Stack
> suggested) use filters to fetch relatively few rows out of a very large
> table, so they could be spending a lot of time on the region server
> scanning rows until it reached my setCaching value, which was 1000. Setting
> the caching value to 1 seems to allow these jobs to complete.
> I think it has to be the above, since my rows are small, with just a few
> columns and processing them is very quick.
>

Excellent!
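
To put rough numbers on that reasoning, here is a back-of-the-envelope
sketch in plain Java. It is not HBase code. The class and method names, the
scan rate, the filter selectivity and the 60-second timeout are all
assumptions picked for illustration. It estimates how long the region server
spends inside one next() call before it has collected enough matching rows
to fill the caching value.

```java
public class LeaseModel {
    // Estimated seconds the server spends in one next() call.
    // caching:        rows the client asks for per call (the setCaching value)
    // rowsPerMatch:   assumed filter selectivity, one matching row per this many rows scanned
    // rowsPerSecond:  assumed server-side scan rate
    static double secondsPerNextCall(int caching, long rowsPerMatch, double rowsPerSecond) {
        long rowsToScan = (long) caching * rowsPerMatch;
        return rowsToScan / rowsPerSecond;
    }

    public static void main(String[] args) {
        double timeoutSeconds = 60.0;   // assumed timeout, for comparison only
        long rowsPerMatch = 10_000L;    // assumed: filter keeps 1 row in 10,000
        double scanRate = 100_000.0;    // assumed: rows scanned per second

        for (int caching : new int[] {1000, 1}) {
            double seconds = secondsPerNextCall(caching, rowsPerMatch, scanRate);
            System.out.println("caching=" + caching + " -> " + seconds + " s per next() call"
                    + (seconds > timeoutSeconds ? " (longer than the assumed timeout)" : ""));
        }
    }
}
```

Under these assumed numbers a caching value of 1000 keeps the server busy
well past the timeout on a single call, while a caching value of 1 returns
in a fraction of a second, which matches what you observed.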


>
> However, there are still a couple of things I don't understand:
> 1. What is the difference between setCaching and setBatch?
>

setBatch:
* Set the maximum number of values to return for each call to next()

VS

setCaching:
* Set the number of rows for caching that will be passed to scanners.

The former is useful if you have rows with millions of columns: you could
setBatch to get only 1000 of them at a time. You could call that intra-row
scanning.
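
A toy model of the two knobs, in plain Java rather than the HBase client
API, may make the distinction concrete. Everything here (class name, method
names, the sample sizes) is hypothetical and only meant as a sketch: one
method splits a single wide row into pieces the way a batch limit would, the
other counts how many round trips it takes to ship a number of rows at a
given caching value.

```java
import java.util.ArrayList;
import java.util.List;

public class ScanKnobsModel {
    // Models the batch idea: split one wide row into chunks of at most `batch` cells.
    static List<List<String>> splitRow(List<String> cells, int batch) {
        List<List<String>> chunks = new ArrayList<>();
        for (int i = 0; i < cells.size(); i += batch) {
            int end = Math.min(i + batch, cells.size());
            chunks.add(new ArrayList<>(cells.subList(i, end)));
        }
        return chunks;
    }

    // Models the caching idea: round trips needed to ship `rows` rows, `caching` per trip.
    static int roundTrips(int rows, int caching) {
        return (rows + caching - 1) / caching; // ceiling division
    }

    public static void main(String[] args) {
        // One row with 2500 columns, read 1000 columns at a time.
        List<String> wideRow = new ArrayList<>();
        for (int i = 0; i < 2500; i++) {
            wideRow.add("col" + i);
        }
        System.out.println("chunks for one row, batch=1000: "
                + splitRow(wideRow, 1000).size());

        // 5000 narrow rows, fetched with two different caching values.
        System.out.println("round trips, caching=1000: " + roundTrips(5000, 1000));
        System.out.println("round trips, caching=1: " + roundTrips(5000, 1));
    }
}
```

So batch limits how much of one row comes back at a time, while caching
limits how many rows travel per round trip. A real scanner has extra
bookkeeping that this sketch ignores.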


> 2. Examining the region server logs more closely than I did yesterday, I see
> a lot of ClosedChannelExceptions in addition to the expired leases (but no
> UnknownScannerException), is that expected? You can see an excerpt of the
> log from one of the region servers here: http://pastebin.com/NLcZTzsY


It means that by the time the server got to process that client request and
started reading from the socket, the client was already gone. Killing a
client does that (or killing a MapReduce job that scans), and so does a
SocketTimeoutException. This should probably go in the book. We should also
print something nicer :)

J-D
