[ https://issues.apache.org/jira/browse/HBASE-8571?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663034#comment-13663034 ]
Doug Meil commented on HBASE-8571:
----------------------------------
I realized that it does actually handle setCaching; my comment was more
about the implementation. The Scan is explicitly configured in the job with
every option *except* setCaching (e.g., startRow/stopRow, columns,
cacheBlocks). The caching value is implicit: it gets picked up much further
down the line, via the Config instance in ClientScanner. It's just a little
confusing, that's all.
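To make the fallback described above concrete, here is a minimal stdlib-only sketch of the resolution order: a caching value set on the Scan wins, otherwise the value comes from the job configuration. This is a model, not HBase code. The class name, the method name, the UNSET sentinel, and the builtInDefault parameter are hypothetical, and the built-in default differs between HBase versions, so it is passed in rather than hard-coded. The only name taken from HBase is the configuration key hbase.client.scanner.caching.

```java
import java.util.HashMap;
import java.util.Map;

/** Stdlib-only model of the scanner-caching fallback; not actual HBase code. */
class ScanCachingSketch {
    // Real HBase configuration key for the client-side scanner caching default.
    static final String CACHING_KEY = "hbase.client.scanner.caching";

    // Hypothetical sentinel meaning "setCaching was never called on the Scan".
    static final int UNSET = -1;

    /**
     * Resolves the caching value a scanner would end up using.
     *
     * @param scanCaching    value set on the Scan, or UNSET
     * @param conf           stand-in for the job's configuration
     * @param builtInDefault assumed default when neither source sets a value
     */
    static int effectiveCaching(int scanCaching, Map<String, String> conf,
                                int builtInDefault) {
        if (scanCaching > 0) {
            return scanCaching; // explicit per-Scan setting takes precedence
        }
        String configured = conf.get(CACHING_KEY);
        return configured != null
                ? Integer.parseInt(configured.trim())
                : builtInDefault;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<String, String>();
        conf.put(CACHING_KEY, "500");

        // Scan without setCaching: the configuration value is used.
        System.out.println(effectiveCaching(UNSET, conf, 1));
        // Scan with an explicit value: the configuration is ignored.
        System.out.println(effectiveCaching(100, conf, 1));
        // Neither is set: the assumed built-in default applies.
        System.out.println(effectiveCaching(UNSET, new HashMap<String, String>(), 1));
    }
}
```

If the real tools behave like this model, the caching for RowCounter or CopyTable would be tuned through the job configuration (for example, by setting hbase.client.scanner.caching) rather than through the Scan built in the job setup code, which is why the setting is easy to miss when reading that code.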
> CopyTable and RowCounter don't seem to use setCaching setting
> -------------------------------------------------------------
>
> Key: HBASE-8571
> URL: https://issues.apache.org/jira/browse/HBASE-8571
> Project: HBase
> Issue Type: Bug
> Reporter: Doug Meil
>
> Maybe it's just me, but I've been looking on trunk and I don't see where
> either RowCounter or CopyTable MapReduce can adjust the setCaching setting on
> the Scan instance.
> Example from RowCounter...
> {code}
> Job job = new Job(conf, NAME + "_" + tableName);
> job.setJarByClass(RowCounter.class);
> Scan scan = new Scan();
> scan.setCacheBlocks(false);
> Set<byte []> qualifiers = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
> if (startKey != null && !startKey.equals("")) {
>   scan.setStartRow(Bytes.toBytes(startKey));
> }
> if (endKey != null && !endKey.equals("")) {
>   scan.setStopRow(Bytes.toBytes(endKey));
> }
> scan.setFilter(new FirstKeyOnlyFilter());
> if (sb.length() > 0) {
>   for (String columnName : sb.toString().trim().split(" ")) {
>     String [] fields = columnName.split(":");
>     if(fields.length == 1) {
>       scan.addFamily(Bytes.toBytes(fields[0]));
>     } else {
>       byte[] qualifier = Bytes.toBytes(fields[1]);
>       qualifiers.add(qualifier);
>       scan.addColumn(Bytes.toBytes(fields[0]), qualifier);
>     }
>   }
> }
> // specified column may or may not be part of first key value for the row.
> // Hence do not use FirstKeyOnlyFilter if scan has columns, instead use
> // FirstKeyValueMatchingQualifiersFilter.
> if (qualifiers.size() == 0) {
>   scan.setFilter(new FirstKeyOnlyFilter());
> } else {
>   scan.setFilter(new FirstKeyValueMatchingQualifiersFilter(qualifiers));
> }
> job.setOutputFormatClass(NullOutputFormat.class);
> TableMapReduceUtil.initTableMapperJob(tableName, scan,
>   RowCounterMapper.class, ImmutableBytesWritable.class, Result.class,
>   job);
> job.setNumReduceTasks(0);
> return job;
> {code}
> TableMapReduceUtil only serializes the Scan into the job; it doesn't adjust
> any of the settings.
> Maybe I'm missing something, but this seems like a problem.