[ https://issues.apache.org/jira/browse/HBASE-11558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14080158#comment-14080158 ]
Hudson commented on HBASE-11558:
--------------------------------
SUCCESS: Integrated in HBase-0.98 #425 (See [https://builds.apache.org/job/HBase-0.98/425/])
HBASE-11558 Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+ (Ishan Chhabra) (ndimiduk: rev 61de4e47835f98dd7d2cec92bf33641c9de072a8)
* hbase-client/src/main/java/org/apache/hadoop/hbase/protobuf/ProtobufUtil.java
* hbase-protocol/src/main/protobuf/Client.proto
* hbase-server/src/test/java/org/apache/hadoop/hbase/protobuf/TestProtobufUtil.java
* hbase-protocol/src/main/java/org/apache/hadoop/hbase/protobuf/generated/ClientProtos.java
> Caching set on Scan object gets lost when using TableMapReduceUtil in 0.95+
> ---------------------------------------------------------------------------
>
> Key: HBASE-11558
> URL: https://issues.apache.org/jira/browse/HBASE-11558
> Project: HBase
> Issue Type: Bug
> Components: mapreduce, Scanners
> Reporter: Ishan Chhabra
> Assignee: Ishan Chhabra
> Fix For: 0.99.0, 0.96.3, 0.98.5, 2.0.0
>
> Attachments: HBASE_11558-0.96.patch, HBASE_11558-0.96_v2.patch,
> HBASE_11558-0.98.patch, HBASE_11558-0.98_v2.patch, HBASE_11558.patch,
> HBASE_11558_v2.patch, HBASE_11558_v2.patch
>
>
> In 0.94 and earlier, if one sets caching on the Scan object in the Job by
> calling scan.setCaching(int) and passes it to TableMapReduceUtil, the value is
> correctly read and used by the mappers during a MapReduce job. This works
> because TableMapReduceUtil internally uses Scan.write to serialize and
> transfer the Scan object to the mappers, and Scan.write serializes caching.
> In 0.95+, after the move to protobuf, ProtobufUtil.toScan no longer preserves
> caching, because ClientProtos.Scan does not have a caching field. Caching
> is passed to the server via the ScanRequest object, so it is not needed in
> the Scan object. However, this breaks application code that relies on the
> earlier behavior, and it will lead to a sudden degradation in Scan
> performance in 0.96+ for users relying on that behavior.
> There are two options here:
> 1. Add caching to the Scan object, which adds an extra int to the payload of
> the Scan object that is not needed in the general case.
> 2. Document and advertise that TableMapReduceUtil.setScannerCaching must be
> called by the client.
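
The regression described above can be modelled without HBase. The following is a minimal sketch in plain Java, not HBase code: SimpleScan and the serialize/deserialize helpers are hypothetical stand-ins for Scan and the protobuf conversion, and -1 is an assumed marker for "caching not set". It only illustrates how a setting is silently dropped when the serialized form has no field for it, and how adding the field (option 1) preserves it.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Simplified model of the round trip; all names here are hypothetical.
final class ScanCachingSketch {

    /** Stand-in for a scan: a start row plus a caching hint (-1 = not set). */
    static final class SimpleScan {
        final String startRow;
        final int caching;

        SimpleScan(String startRow, int caching) {
            this.startRow = startRow;
            this.caching = caching;
        }
    }

    // Models a wire format with no caching field: only the start row is written.
    static byte[] serializeWithoutCaching(SimpleScan scan) {
        byte[] row = scan.startRow.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + row.length)
                .putInt(row.length)
                .put(row)
                .array();
    }

    // The reader has nothing to restore caching from, so it falls back to -1.
    static SimpleScan deserializeWithoutCaching(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte[] row = new byte[buf.getInt()];
        buf.get(row);
        return new SimpleScan(new String(row, StandardCharsets.UTF_8), -1);
    }

    // Models option 1: one extra int in the payload carries the caching value.
    static byte[] serializeWithCaching(SimpleScan scan) {
        byte[] row = scan.startRow.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + row.length + 4)
                .putInt(row.length)
                .put(row)
                .putInt(scan.caching)
                .array();
    }

    static SimpleScan deserializeWithCaching(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte[] row = new byte[buf.getInt()];
        buf.get(row);
        int caching = buf.getInt();
        return new SimpleScan(new String(row, StandardCharsets.UTF_8), caching);
    }

    public static void main(String[] args) {
        SimpleScan scan = new SimpleScan("row-0001", 500);

        SimpleScan lost = deserializeWithoutCaching(serializeWithoutCaching(scan));
        SimpleScan kept = deserializeWithCaching(serializeWithCaching(scan));

        System.out.println("without caching field: " + lost.caching); // prints -1
        System.out.println("with caching field:    " + kept.caching); // prints 500
    }
}
```

In this model the cost of option 1 is the four extra bytes per serialized scan. Option 2 leaves the wire format alone and instead requires the job setup code to pass the caching value separately, which in the issue's terms means calling TableMapReduceUtil.setScannerCaching on the job.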