[jira] [Commented] (HBASE-5757) TableInputFormat should handle as much errors as possible

2012-04-11 Thread Jan Lukavsky (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13251460#comment-13251460
 ] 

Jan Lukavsky commented on HBASE-5757:
-

The problem of rows being fetched multiple times doesn't exist. I thought (I 
don't know why) that a ScannerTimeoutException could be thrown while processing 
rows cached in the scanner on the client side. This is not the case. Adding a 
counter for the number of retries to the input format might be interesting 
nevertheless.

> TableInputFormat should handle as much errors as possible
> -
>
> Key: HBASE-5757
> URL: https://issues.apache.org/jira/browse/HBASE-5757
> Project: HBase
>  Issue Type: Bug
>  Components: mapred, mapreduce
>Affects Versions: 0.90.6
>Reporter: Jan Lukavsky
>
> Prior to HBASE-4196, IOExceptions thrown from the scanner were handled 
> differently in the mapred and mapreduce APIs. The patch for HBASE-4196 unified 
> this handling so that when an exception is caught, a reconnect is attempted 
> (without bothering the mapred client). After that, HBASE-4269 changed this 
> behavior back, but in both the mapred and mapreduce APIs. The question is: is 
> there any reason not to handle all errors that the input format can handle? In 
> other words, why not try to reissue the request after *any* IOException? I see 
> the following disadvantages of the current approach:
>  * the client may see exceptions like LeaseException and 
> ScannerTimeoutException if it fails to process all fetched data within the 
> timeout
>  * to avoid ScannerTimeoutException, the client must raise 
> hbase.regionserver.lease.period
>  * timeouts for tasks are already configured in mapred.task.timeout, so this 
> seems a bit redundant to me, because typically one needs to update both of 
> these parameters
>  * I don't see any way to get rid of LeaseException (this is 
> configured on the server side)
> I think all of these issues would go away if the DoNotRetryIOException were 
> not rethrown. On the other hand, handling errors in the InputFormat has the 
> disadvantage that it may hide some inefficiency from the user. E.g., if I have 
> a very large scanner.caching and manage to process only a few rows within the 
> timeout, I will end up with a single row being fetched many times (and will 
> not be explicitly notified about this). Could we solve this problem by adding 
> a counter to the InputFormat?
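
The retry-and-count idea from the description above can be sketched as 
follows. This is a minimal illustration under stated assumptions, not the 
HBase implementation: RowScanner, ScannerFactory, RetryingReaderSketch and the 
flaky() test source are hypothetical stand-ins, and the "reopen at the last 
delivered row, then skip it" strategy is one possible way to resume without 
handing the same row to the caller twice.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// All types here are hypothetical stand-ins; none of them are HBase classes.
class RetryingReaderSketch {

    /** Stand-in for a scanner: returns the next row key, or null at the end. */
    interface RowScanner {
        String next() throws IOException;
    }

    /** Stand-in for "open a scanner at startRow (inclusive); null means the first row". */
    interface ScannerFactory {
        RowScanner open(String startRow) throws IOException;
    }

    private final ScannerFactory factory;
    private RowScanner scanner;
    private String lastRow;   // last row successfully handed to the caller
    private long retries;     // in a real job this could be exposed as a counter

    RetryingReaderSketch(ScannerFactory factory) throws IOException {
        this.factory = factory;
        this.scanner = factory.open(null);
    }

    /** Returns the next row, reopening the scanner once if an IOException occurs. */
    String next() throws IOException {
        String row;
        try {
            row = scanner.next();
        } catch (IOException e) {
            retries++;
            scanner = factory.open(lastRow);
            if (lastRow != null) {
                scanner.next();       // skip the row that was already delivered
            }
            row = scanner.next();     // a second failure propagates to the caller
        }
        if (row != null) {
            lastRow = row;
        }
        return row;
    }

    long getRetries() {
        return retries;
    }

    /** Test source over a fixed list whose failOnCall-th next() call throws once. */
    static ScannerFactory flaky(List<String> rows, int failOnCall) {
        int[] calls = {0};
        return startRow -> {
            int[] pos = {startRow == null ? 0 : rows.indexOf(startRow)};
            return () -> {
                if (++calls[0] == failOnCall) {
                    throw new IOException("simulated scanner timeout");
                }
                return pos[0] < rows.size() ? rows.get(pos[0]++) : null;
            };
        };
    }

    public static void main(String[] args) throws IOException {
        RetryingReaderSketch reader =
            new RetryingReaderSketch(flaky(Arrays.asList("a", "b", "c"), 2));
        for (String row = reader.next(); row != null; row = reader.next()) {
            System.out.println(row);
        }
        System.out.println("retries=" + reader.getRetries());
    }
}
```

In this sketch the caller never sees the simulated failure; the only trace of 
it is the retry count, which is the kind of visibility the proposed counter 
would provide.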

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4269) Add tests and restore semantics to TableInputFormat/TableRecordReader

2012-04-10 Thread Jan Lukavsky (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13250541#comment-13250541
 ] 

Jan Lukavsky commented on HBASE-4269:
-

Hi Jonathan,

I will file a new issue describing the problems it causes us.

Thanks.

> Add tests and restore semantics to TableInputFormat/TableRecordReader
> -
>
> Key: HBASE-4269
> URL: https://issues.apache.org/jira/browse/HBASE-4269
> Project: HBase
>  Issue Type: Improvement
>  Components: mapred, mapreduce, test
>Affects Versions: 0.90.5, 0.92.0
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-4269-Add-tests-and-restore-semantics-to-TableI.patch, 
> 0001-HBASE-4269-Add-tests-and-restore-semantics-to-TableI.patch
>
>
> HBASE-4196 modified the failure semantics of 
> TableInputFormat/TableRecordReader and had no test cases.  This patch 
> restores the semantics to rethrow when a DoNotRetryIOException is triggered, 
> and adds test cases.
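
The failure-handling variants mentioned in this thread can be compared side 
by side. The following is a rough sketch only: RetryPolicySketch and its 
Policy enum are hypothetical names, the two exception classes are local 
stand-ins named after the HBase ones, and the hierarchy used here 
(UnknownScannerException as a subclass of DoNotRetryIOException) is an 
assumption of the sketch rather than something stated in the thread.

```java
import java.io.IOException;

// Hypothetical comparison of retry decisions; not code from any HBase patch.
class RetryPolicySketch {

    // Local stand-ins named after the HBase exceptions.
    // The subclass relationship below is an assumption of this sketch.
    static class DoNotRetryIOException extends IOException {}
    static class UnknownScannerException extends DoNotRetryIOException {}

    enum Policy {
        /** Narrow catch: only UnknownScannerException triggers a scanner restart. */
        UNKNOWN_SCANNER_ONLY {
            boolean shouldRetry(IOException e) {
                return e instanceof UnknownScannerException;
            }
        },
        /** Catch widened to IOException, as described for HBASE-4196. */
        ANY_IO_EXCEPTION {
            boolean shouldRetry(IOException e) {
                return true;
            }
        },
        /** Retry on IOException but rethrow DoNotRetryIOException, as described for HBASE-4269. */
        ALL_BUT_DO_NOT_RETRY {
            boolean shouldRetry(IOException e) {
                return !(e instanceof DoNotRetryIOException);
            }
        };

        abstract boolean shouldRetry(IOException e);
    }

    public static void main(String[] args) {
        IOException[] samples = {
            new IOException("connection reset"),
            new DoNotRetryIOException(),
            new UnknownScannerException()
        };
        for (Policy policy : Policy.values()) {
            for (IOException e : samples) {
                System.out.println(policy + " / " + e.getClass().getSimpleName()
                    + " -> retry=" + policy.shouldRetry(e));
            }
        }
    }
}
```

Under the assumed hierarchy, the third policy stops retrying after an 
UnknownScannerException, which illustrates how rethrowing 
DoNotRetryIOException can change behavior for clients that relied on the 
earlier semantics.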





[jira] [Commented] (HBASE-4269) Add tests and restore semantics to TableInputFormat/TableRecordReader

2012-03-16 Thread Jan Lukavsky (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230972#comment-13230972
 ] 

Jan Lukavsky commented on HBASE-4269:
-

Hi,

I think the patch for this issue changed the semantics of the mapreduce API. In 
HBASE-4196 there was no change in semantics in 
org.apache.hadoop.hbase.mapreduce.TableRecordReaderImpl; the only change was in 
org.apache.hadoop.hbase.mapred.TableRecordReaderImpl (where the catch of 
UnknownScannerException was changed to IOException). Now the semantics of the 
mapreduce API differ from those before HBASE-4196, and I think this should be 
reverted. Is there any reason to have different semantics for the two APIs? 
Wouldn't it be better to accept the change of semantics in HBASE-4196? Are 
there any negative side effects of this change? I don't see any discussion 
along the lines of "do we need to change the semantics back?".

Thanks for the reply :)

 Jan
