sandeepvinayak commented on a change in pull request #2591:
URL: https://github.com/apache/hbase/pull/2591#discussion_r513599178
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormatBase.java
##########
@@ -323,7 +323,7 @@ public boolean nextKeyValue() throws IOException,
InterruptedException {
}
List<InputSplit> splits = new ArrayList<>(1);
long regionSize =
sizeCalculator.getRegionSize(regLoc.getRegionInfo().getRegionName());
- TableSplit split = new TableSplit(tableName, scan,
+ TableSplit split = new TableSplit(tableName,
Review comment:
@saintstack that is correct! The JIRA description includes heap dump
screenshots showing that the scan object can occupy a lot of memory for
tables with a large number of regions. This patch only fixes
TableInputFormat for the single-table case, where we don’t use the scan
object from TableSplit because we read it directly from the MR job conf.
MultiTableInputFormat needs a similar fix, but that one requires more code
changes, so I will try to do it in a separate patch.
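
To illustrate the memory argument, here is a rough, self-contained sketch. It does not use the HBase classes: SketchSplit, buildSplits and scanChars are hypothetical names, and it assumes each split keeps its own serialized copy of the scan, which is what the heap dump suggests. The region count and scan size are made-up numbers.

```java
import java.util.ArrayList;
import java.util.List;

public class SplitScanFootprint {

    // Hypothetical stand-in for a table split; NOT the HBase TableSplit class.
    static final class SketchSplit {
        final String tableName;
        // Assumption: the split holds its own serialized scan.
        // Empty when the record reader takes the scan from the job conf instead.
        final String serializedScan;

        SketchSplit(String tableName, String serializedScan) {
            this.tableName = tableName;
            this.serializedScan = serializedScan;
        }
    }

    // Builds one split per region, optionally embedding a private copy of the scan.
    static List<SketchSplit> buildSplits(String table, String scan, int regions,
                                         boolean embedScan) {
        List<SketchSplit> splits = new ArrayList<>(regions);
        for (int i = 0; i < regions; i++) {
            // new String(...) forces a distinct copy per split, mimicking a
            // scan that is serialized separately for every split.
            splits.add(new SketchSplit(table, embedScan ? new String(scan) : ""));
        }
        return splits;
    }

    // Total number of scan characters retained across all splits.
    static long scanChars(List<SketchSplit> splits) {
        long total = 0;
        for (SketchSplit s : splits) {
            total += s.serializedScan.length();
        }
        return total;
    }

    public static void main(String[] args) {
        String scan = new String(new char[2000]); // made-up serialized scan size
        int regions = 10000;                      // made-up region count

        long withScan = scanChars(buildSplits("t1", scan, regions, true));
        long withoutScan = scanChars(buildSplits("t1", scan, regions, false));

        System.out.println("scan chars held by splits, scan embedded: " + withScan);
        System.out.println("scan chars held by splits, scan from conf: " + withoutScan);
    }
}
```

The retained size grows linearly with the number of regions when every split carries the scan, and drops to zero when the scan comes from the job conf only.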