Sandeep Pal created HBASE-25226:
-----------------------------------

             Summary: Optimize in-memory representation for HBase map reduce 
table splits for MultiTableInputFormat
                 Key: HBASE-25226
                 URL: https://issues.apache.org/jira/browse/HBASE-25226
             Project: HBase
          Issue Type: Improvement
            Reporter: Sandeep Pal
            Assignee: Sandeep Pal


It has been observed that when a table has a large number of regions, MR jobs 
consume a lot of memory in the client. This is because we keep region-level 
information in memory, and the memory-heavy object is TableSplit, since each 
TableSplit carries a Scan object.

The jira 
[HBASE-24859|https://issues.apache.org/jira/projects/HBASE/issues/HBASE-24859] 
fixes this for the single-table TableInputFormat, because the Scan object from 
TableSplit is not used in that case. 
However, it looks like a similar optimization is possible for 
MultiTableInputFormat as well, since each split is not required to carry a 
memory-heavy Scan object.
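
To make the idea concrete, here is a minimal sketch of one possible in-memory 
layout. It is not the HBase implementation and not the actual patch: ScanSpec, 
LightSplit, buildSplits and scanFor are hypothetical names, and the assumption 
is that each split can refer to its scan by an index into the job's scan list 
instead of holding its own copy.

```java
import java.util.ArrayList;
import java.util.List;

public class SharedScanSplits {

    /** Hypothetical stand-in for a memory-heavy scan (NOT the HBase Scan class). */
    static final class ScanSpec {
        final String tableName;
        final byte[] payload; // models the bulk of the scan's state

        ScanSpec(String tableName, int sizeBytes) {
            this.tableName = tableName;
            this.payload = new byte[sizeBytes];
        }
    }

    /** Hypothetical lightweight split: region boundaries plus a scan index. */
    static final class LightSplit {
        final int scanIndex; // position of the owning scan in the shared list
        final String startRow;
        final String endRow;

        LightSplit(int scanIndex, String startRow, String endRow) {
            this.scanIndex = scanIndex;
            this.startRow = startRow;
            this.endRow = endRow;
        }
    }

    /** Builds one split per (simulated) region; none of them holds a scan. */
    static List<LightSplit> buildSplits(int scanIndex, int regionCount) {
        List<LightSplit> splits = new ArrayList<>(regionCount);
        for (int i = 0; i < regionCount; i++) {
            String start = String.format("row-%05d", i);
            String end = String.format("row-%05d", i + 1);
            splits.add(new LightSplit(scanIndex, start, end));
        }
        return splits;
    }

    /** Resolves the shared scan for a split only when it is actually needed. */
    static ScanSpec scanFor(LightSplit split, List<ScanSpec> scans) {
        return scans.get(split.scanIndex);
    }

    public static void main(String[] args) {
        // Two scans, one per table, kept once for the whole job.
        List<ScanSpec> scans = new ArrayList<>();
        scans.add(new ScanSpec("table_a", 4096));
        scans.add(new ScanSpec("table_b", 4096));

        List<LightSplit> splits = new ArrayList<>();
        splits.addAll(buildSplits(0, 1000));
        splits.addAll(buildSplits(1, 1000));

        System.out.println("splits: " + splits.size());
        System.out.println("scan objects held: " + scans.size());
        System.out.println("first split scans table: "
                + scanFor(splits.get(0), scans).tableName);
    }
}
```

With this layout the number of scan objects held by the client grows with the 
number of scans configured for the job rather than with the number of regions.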



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
