[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080304#comment-14080304 ]
Ishan Chhabra commented on HBASE-11558:
---------------------------------------
Updated release notes. It makes sense to remove the second method. Do you
propose to delete the method or mark it as deprecated for now? Which branches
should get this patch? I can open a separate JIRA and put in the patch there
once the answers are clear.
> Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
> ---------------------------------------------------------------------------
>
> Key: HBASE-11558
> URL: https://issues.apache.org/jira/browse/HBASE-11558
> Project: HBase
> Issue Type: Bug
> Components: mapreduce, Scanners
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0
>
> Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch,
> HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch,
> HBASE_11558_v2.patch, HBASE_11558_v2.patch
>
>
> In 0.94 and earlier, if one sets caching on the Scan object in the Job by
> calling scan.setCaching(int) and passes it to TableMapReduceUtil, the value is
> correctly read and used by the mappers during a mapreduce job. This works
> because TableMapReduceUtil internally uses Scan.write to serialize and transfer
> the scan object to the mappers, and Scan.write serializes caching.
> In 0.95+, after the move to protobuf, ProtobufUtil.toScan no longer preserves
> caching, because ClientProtos.Scan does not have a caching field. Caching
> is passed to the server via the ScanRequest object and so is not needed in
> the Scan object. However, this breaks application code that relies on the
> earlier behavior: users who set caching only on the Scan object will see a
> sudden degradation in Scan performance in 0.96+.
> There are 2 options here:
> 1. Add caching to Scan object, adding an extra int to the payload for the
> Scan object which is really not needed in the general case.
> 2. Document and preach that TableMapReduceUtil.setScannerCaching must be
> called by the client.
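
The failure mode described above can be sketched without HBase on the classpath. The following is a minimal, self-contained model, not HBase code: ScanModel, toWire, fromWire, and effectiveCaching are hypothetical names invented for illustration, and the fallback default of 1 is an arbitrary value chosen for the demo. The only real identifier is the property name hbase.client.scanner.caching, which option 2 is assumed to set on the job configuration.

```java
import java.util.HashMap;
import java.util.Map;

public class ScanCachingDemo {

    /** Hypothetical stand-in for a Scan; only the fields needed here. */
    public static final class ScanModel {
        public String startRow = "";
        public int caching = -1; // -1 means "not set on the scan"
    }

    /** Models a wire format that has no caching field (the 0.95+ situation). */
    public static Map<String, String> toWire(ScanModel scan) {
        Map<String, String> wire = new HashMap<>();
        wire.put("startRow", scan.startRow);
        // caching is deliberately not written: the wire format has no such field
        return wire;
    }

    /** Rebuilds a scan on the mapper side from the wire representation. */
    public static ScanModel fromWire(Map<String, String> wire) {
        ScanModel scan = new ScanModel();
        scan.startRow = wire.get("startRow");
        return scan; // caching stays at -1 because it was never transferred
    }

    /** Picks the caching value: scan first, then job configuration, then a default. */
    public static int effectiveCaching(ScanModel scan, Map<String, String> jobConf,
                                       int defaultCaching) {
        if (scan.caching > 0) {
            return scan.caching;
        }
        String configured = jobConf.get("hbase.client.scanner.caching");
        return configured != null ? Integer.parseInt(configured) : defaultCaching;
    }

    public static void main(String[] args) {
        ScanModel original = new ScanModel();
        original.startRow = "row-000";
        original.caching = 500;

        // Round trip through the wire format, as the job setup would do.
        ScanModel received = fromWire(toWire(original));

        Map<String, String> jobConf = new HashMap<>();
        System.out.println("without job setting: "
                + effectiveCaching(received, jobConf, 1));

        // Option 2: carry the value in the job configuration instead.
        jobConf.put("hbase.client.scanner.caching", "500");
        System.out.println("with job setting: "
                + effectiveCaching(received, jobConf, 1));
    }
}
```

In this model, option 1 corresponds to writing caching in toWire and reading it back in fromWire, while option 2 leaves the wire format alone and relies on the job configuration carrying the value.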