stack created HBASE-21081:
-----------------------------
Summary: Trim Master memory usage, part 2
Key: HBASE-21081
URL: https://issues.apache.org/jira/browse/HBASE-21081
Project: HBase
Issue Type: Bug
Affects Versions: 2.0.1
Reporter: stack
Assignee: stack
Good one, found by [[email protected]] while spelunking with jxray on a 700-node
cluster with 500k+ regions. For some reason, there are >1M instances of each
column family when there should be only 500k, one per region. (By rights there
should be only as many as there are column families in the table, rather than
these bytes being repeated per region -- TODO.)
The code below, added by HBASE-19496, looks suspicious. It builds HashMaps
with byte[]s for keys. byte[] does not override hashCode/equals, so two arrays
with identical contents count as different keys. Usually when we have byte[]s
for keys, we use a sorted map (e.g. TreeMap or ConcurrentSkipListMap) and pass
the constructor a Comparator that knows how to compare byte[]s.
{code}
.setStoreSequenceIds(regionLoadPB.getStoreCompleteSequenceIdList().stream()
  .collect(Collectors.toMap(
    (ClusterStatusProtos.StoreSequenceId s) -> s.getFamilyName().toByteArray(),
    ClusterStatusProtos.StoreSequenceId::getSequenceId)))
{code}
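To show the byte[] key problem in isolation, here is a minimal standalone
sketch. It is not HBase code: the class and method names are made up for
illustration, and java.util.Arrays.compareUnsigned (Java 9+) is only a
stand-in for the comparator HBase would use.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

public class ByteArrayKeyDemo {
  // Inserts the same family name twice, each time as a freshly allocated array,
  // and reports how many entries the map ends up with.
  static int putSameFamilyTwice(Map<byte[], Long> map) {
    map.put("cf".getBytes(StandardCharsets.UTF_8), 1L);
    map.put("cf".getBytes(StandardCharsets.UTF_8), 2L);
    return map.size();
  }

  // HashMap falls back to identity hashCode/equals for arrays, so the two
  // equal-content keys are stored as separate entries.
  static int hashMapSize() {
    return putSameFamilyTwice(new HashMap<byte[], Long>());
  }

  // A TreeMap with a content-based comparator treats the two keys as equal.
  static int treeMapSize() {
    return putSameFamilyTwice(new TreeMap<byte[], Long>(Arrays::compareUnsigned));
  }

  public static void main(String[] args) {
    System.out.println("HashMap entries: " + hashMapSize());
    System.out.println("TreeMap entries: " + treeMapSize());
  }
}
```

Note that within a single region's list this only produces duplicates if the
same family name appears more than once, which is the open question here.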
But looking back through the code, even with a HashMap, the map should only
have one item in it. Where is the other one coming from?
Here's how to get a TreeMap w/ a Comparator into the mix... but I still need
to check whether this fixes the issue (I don't think so).
{code}
@@ -66,12 +70,13 @@ public final class RegionMetricsBuilder {
         .setStoreCount(regionLoadPB.getStores())
         .setStoreFileCount(regionLoadPB.getStorefiles())
         .setStoreFileSize(new Size(regionLoadPB.getStorefileSizeMB(), Size.Unit.MEGABYTE))
-        .setStoreSequenceIds(regionLoadPB.getStoreCompleteSequenceIdList().stream()
-          .collect(Collectors.toMap(
-            (ClusterStatusProtos.StoreSequenceId s) -> s.getFamilyName().toByteArray(),
-              ClusterStatusProtos.StoreSequenceId::getSequenceId)))
+        .setStoreSequenceIds(regionLoadPB.getStoreCompleteSequenceIdList().stream().collect(
+          Collectors.toMap(s -> s.getFamilyName().toByteArray(),
+            ClusterStatusProtos.StoreSequenceId::getSequenceId,
+            (k1, k2) -> k1, // Should never happen; only one completed sequenceid per Store
+            () -> new TreeMap<byte [], Long>(Bytes.BYTES_COMPARATOR))))
         .setUncompressedStoreFileSize(
-          new Size(regionLoadPB.getStoreUncompressedSizeMB(),Size.Unit.MEGABYTE))
+          new Size(regionLoadPB.getStoreUncompressedSizeMB(), Size.Unit.MEGABYTE))
         .build();
   }
{code}
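For anyone who wants to try the four-argument Collectors.toMap pattern outside
of HBase, a self-contained sketch follows. StoreSeqId is a hypothetical
stand-in for ClusterStatusProtos.StoreSequenceId, and Arrays.compareUnsigned
(Java 9+) is assumed here in place of Bytes.BYTES_COMPARATOR; neither is the
real HBase type.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class StoreSequenceIdCollectDemo {
  // Hypothetical stand-in for the protobuf StoreSequenceId message.
  static final class StoreSeqId {
    final byte[] family;
    final long sequenceId;

    StoreSeqId(String family, long sequenceId) {
      this.family = family.getBytes(StandardCharsets.UTF_8);
      this.sequenceId = sequenceId;
    }
  }

  // Collects into a TreeMap ordered by array contents, so families with equal
  // bytes collapse into one key; on a collision the first value is kept.
  static Map<byte[], Long> collect(List<StoreSeqId> ids) {
    return ids.stream().collect(Collectors.toMap(
        s -> s.family,
        s -> s.sequenceId,
        (first, second) -> first,
        () -> new TreeMap<byte[], Long>(Arrays::compareUnsigned)));
  }

  // Two entries for "cf" (distinct array instances) plus one for "meta".
  static Map<byte[], Long> demo() {
    return collect(List.of(
        new StoreSeqId("cf", 5L),
        new StoreSeqId("cf", 9L),
        new StoreSeqId("meta", 7L)));
  }

  public static void main(String[] args) {
    demo().forEach((family, seqId) ->
        System.out.println(new String(family, StandardCharsets.UTF_8) + " -> " + seqId));
  }
}
```

Without the merge function, the three-argument toMap would throw
IllegalStateException on the duplicate key once the map compares keys by
content, which is why the patch supplies one.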