Hi Jan,

I am facing the same issue you did. If I switch from hbase-0.90.3 to hbase-0.90.5, I get a lot of scanner timeout exceptions. Did you file the JIRA?
Also, in my case, the setup method in my mapper takes a long time (about 10 minutes), and by the time the map function is invoked, it throws a scanner timeout exception. My map is not executed even once.

thanks
Vrushali

________________________________
From: Jan Lukavský <[email protected]>
To: "[email protected]" <[email protected]>
Cc: Jonathan Hsieh <[email protected]>; "[email protected]" <[email protected]>
Sent: Monday, March 12, 2012 3:03 AM
Subject: Re: Why ScannerTimeoutException is not handled in TableInputFormat?

Hi Jon,

sorry I forgot to include the details of our environment. We are using cdh3u3, which includes the patch (I think cdh3u2 did not). I went a bit through the patches for the two issues (HBASE-4196 and HBASE-4269), and I think the problem is that the patch to HBASE-4196 did not change the semantics for the *mapreduce* package. The change was related to *mapred* (the older API). We are using the new one, and hence for us the semantics were in fact changed by HBASE-4269. IMO, the correct way of restoring the semantics would be to catch DNRIOEx only in the mapred implementation of the record reader. Shall I file another JIRA?

Thanks,
Jan

On 7.3.2012 00:25, Jonathan Hsieh wrote:
> Hi Jan,
>
> What version were you on before and what version are you on currently?
>
> The HBASE-4269 tried to restore the semantics changed in HBASE-4196 from
> 0.90.3 to 0.90.5. If we caught the exception we may skip records if the
> timeout exceptions are skipped, which is undesirable.
>
> It might make sense to change Scanner Timeout to be derived from a
> different exception, but this will take a bit of digging and testing to
> verify (DNRIOEx is used in about 40 places).
>
> Jon.
>
> On Fri, Feb 24, 2012 at 5:08 AM, Jan Lukavský
> <[email protected]> wrote:
>
>> Hi all,
>>
>> patch to HBASE-4269 removed handling of ScannerTimeoutException from
>> TableRecordReader. Is there a reason for this? We are now seeing a lot of
>> ScannerTimeoutExceptions in our jobs, which were previously handled. The
>> problem is caused by ScannerTimeoutException being derived from
>> DoNotRetryIOException (why's that?). Would it be safe to just derive this
>> exception from IOException? And if not, should the TableRecordReader
>> explicitly handle this exception? The same problem may arise with
>> LeaseException, which is also derived from DoNotRetryIOException. Both of
>> these exceptions may have the same cause and in my understanding can be
>> handled in the RecordReader without any harm. Or am I wrong?
>>
>> Thanks for answer,
>> Jan
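
The handling Jan asks about can be illustrated with a small, self-contained Java sketch. It does not use the HBase API: DoNotRetryIOException and ScannerTimeoutException here are local stand-ins that only mirror the inheritance described in the thread, and ToyTable, ToyScanner, and readAll are hypothetical names. The idea, assuming rows come back in sorted order, is that a reader which remembers the last row it returned can reopen the scan just after that row when the lease expires, so no row is skipped or repeated.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ScannerRestartSketch {

    // Local stand-ins (NOT the HBase classes) mirroring the hierarchy in the thread.
    static class DoNotRetryIOException extends IOException {
        DoNotRetryIOException(String msg) { super(msg); }
    }

    static class ScannerTimeoutException extends DoNotRetryIOException {
        ScannerTimeoutException(String msg) { super(msg); }
    }

    // Hypothetical in-memory table that can make some of its scanners expire.
    static class ToyTable {
        final List<String> rows;      // sorted row keys
        int timeoutsLeft;             // how many scanners will still expire
        final int rowsBeforeTimeout;  // rows an expiring scanner serves first

        ToyTable(List<String> rows, int timeoutsLeft, int rowsBeforeTimeout) {
            this.rows = rows;
            this.timeoutsLeft = timeoutsLeft;
            this.rowsBeforeTimeout = rowsBeforeTimeout;
        }

        // Opens a scanner positioned just after lastRow (or at the start if null).
        ToyScanner scanAfter(String lastRow) {
            int start = (lastRow == null) ? 0 : rows.indexOf(lastRow) + 1;
            boolean willExpire = timeoutsLeft > 0;
            if (willExpire) {
                timeoutsLeft--;
            }
            return new ToyScanner(rows, start, willExpire, rowsBeforeTimeout);
        }
    }

    static class ToyScanner {
        private final List<String> rows;
        private final boolean willExpire;
        private final int rowsBeforeTimeout;
        private int pos;
        private int served;

        ToyScanner(List<String> rows, int start, boolean willExpire, int rowsBeforeTimeout) {
            this.rows = rows;
            this.pos = start;
            this.willExpire = willExpire;
            this.rowsBeforeTimeout = rowsBeforeTimeout;
        }

        // Returns the next row, null at the end, or throws once the simulated lease expires.
        String next() throws IOException {
            if (willExpire && served == rowsBeforeTimeout) {
                throw new ScannerTimeoutException("simulated lease expiry");
            }
            if (pos >= rows.size()) {
                return null;
            }
            served++;
            return rows.get(pos++);
        }
    }

    // Reads every row, reopening the scan after the last returned row on timeout.
    // After maxRestarts restarts, the next timeout is rethrown to the caller.
    static List<String> readAll(ToyTable table, int maxRestarts) throws IOException {
        List<String> out = new ArrayList<>();
        String lastRow = null;
        int restarts = 0;
        ToyScanner scanner = table.scanAfter(lastRow);
        while (true) {
            String row;
            try {
                row = scanner.next();
            } catch (ScannerTimeoutException e) {
                if (restarts++ >= maxRestarts) {
                    throw e;
                }
                scanner = table.scanAfter(lastRow);
                continue;
            }
            if (row == null) {
                return out;
            }
            out.add(row);
            lastRow = row;
        }
    }

    public static void main(String[] args) throws IOException {
        ToyTable table = new ToyTable(Arrays.asList("a", "b", "c", "d", "e"), 1, 2);
        System.out.println(readAll(table, 3)); // prints [a, b, c, d, e]
    }
}
```

With maxRestarts set to 0, the first timeout propagates to the caller, which is roughly the situation the thread reports after HBASE-4269. Setting rowsBeforeTimeout to 0 models Vrushali's case, where the lease has already expired before the first row is read.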
