[
https://issues.apache.org/jira/browse/HBASE-24859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183523#comment-17183523
]
Bharath Vissapragada commented on HBASE-24859:
----------------------------------------------
Can you add some data that gives insight into the memory usage? For example,
the Xmx limit on the client JVM, the number of regions, key lengths, and the
top 5-10 contributors (by %, based on a heap dump analysis). I'm wondering
whether some simple optimizations, such as deduplicating with interning and
avoiding unnecessary copies, would give a reasonable improvement in memory usage.
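
To make the interning idea concrete, here is a minimal sketch, assuming row
keys are held as byte[] and are never mutated once created. KeyInterner is a
hypothetical helper written for illustration, not an HBase class. Adjacent
regions share a boundary (one region's end key is the next region's start
key), so keeping a single array per distinct key might avoid holding
duplicate copies. Whether that matters in practice depends on what the heap
dump shows.

```java
import java.nio.ByteBuffer;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper (not part of HBase): one shared byte[] per distinct key. */
final class KeyInterner {
  // ByteBuffer.wrap provides content-based equals/hashCode over the array.
  private final Map<ByteBuffer, byte[]> pool = new HashMap<>();

  /** Returns the first array seen with this content; callers must not mutate it. */
  byte[] intern(byte[] key) {
    return pool.computeIfAbsent(ByteBuffer.wrap(key), k -> key);
  }

  /** Number of distinct keys currently pooled. */
  int size() {
    return pool.size();
  }
}
```

Passing two equal-content arrays through intern() returns the same instance
both times, so the second copy becomes eligible for garbage collection.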
> Remove the empty regions from the hbase mapreduce splits
> --------------------------------------------------------
>
> Key: HBASE-24859
> URL: https://issues.apache.org/jira/browse/HBASE-24859
> Project: HBase
> Issue Type: Improvement
> Components: mapreduce
> Reporter: Sandeep Pal
> Assignee: Sandeep Pal
> Priority: Major
>
> When a table has a very large number of regions, MR jobs consume more
> memory in the client. This is because we keep region-level information in
> memory, and the memory-heavy object is TableSplit, because a Scan object is
> part of it.
> We can reduce memory consumption by not loading the region-level
> information for regions that are empty.
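
The proposal in the description could look roughly like the sketch below. It
is only an illustration under stated assumptions: RegionSketch and
SplitPlanner are hypothetical names, not HBase classes, and sizeBytes is
assumed to come from some per-region size estimate available to the client.
A region that reports zero bytes on disk may still hold unflushed data, so a
real change would need to account for that.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for per-region split info; not an HBase class. */
final class RegionSketch {
  final String startKey;
  final String endKey;
  final long sizeBytes; // assumed: an estimated region size in bytes

  RegionSketch(String startKey, String endKey, long sizeBytes) {
    this.startKey = startKey;
    this.endKey = endKey;
    this.sizeBytes = sizeBytes;
  }
}

final class SplitPlanner {
  /** Keeps only regions whose estimated size is non-zero, preserving order. */
  static List<RegionSketch> nonEmptyRegions(List<RegionSketch> regions) {
    List<RegionSketch> kept = new ArrayList<>();
    for (RegionSketch r : regions) {
      if (r.sizeBytes > 0) {
        kept.add(r); // only these would get a split (and its Scan) built
      }
    }
    return kept;
  }
}
```

With this shape, the per-split objects are created only for the regions that
survive the filter, so the client never holds them for the empty ones.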