[ https://issues.apache.org/jira/browse/HBASE-15205?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136872#comment-15136872 ]
ramkrishna.s.vasudevan commented on HBASE-15205:
------------------------------------------------

Let me explain with the code here. Replication is enabled by default, so for every append we try to scope the WALEdits with the replication scope. Even if there is only one CF and replication is not enabled for it at all, we still iterate over the cells and look up the scope associated with that CF on every append.
{code}
} else {
  family = CellUtil.cloneFamily(cell);
  // Unexpected, has a tendency to happen in unit tests
  assert htd.getFamily(family) != null;

  if (!scopes.containsKey(family)) {
    int scope = htd.getFamily(family).getScope();
    if (scope != REPLICATION_SCOPE_LOCAL) {
      scopes.put(family, scope);
    }
  }
{code}
The line 'int scope = htd.getFamily(family).getScope();' generates a lot of garbage because it performs a new String() operation.

In the multi-CF case, this same piece of code defines a local map
{code}
NavigableMap<byte[], Integer> scopes = new TreeMap<byte[], Integer>(Bytes.BYTES_COMPARATOR);
{code}
into which we copy every CF that has a non-default scope, along with that scope. So for every append we iterate through all the cells and find the scope of each cell's CF (if it is not already in the 'scopes' map). This map is then serialized in the 'pb'.

The above logic makes sense for multiple CFs: if only one of the CFs has GLOBAL scope, then only that information is added to the WALKey.

So the first thing we can do is reduce the garbage created by new String() by getting the scope once in the HRegion and using that in append(). This avoids all the garbage created by new String() and the UTF8 encoder on every append.

The other option is to add all the non-default CFs and serialize them for every WAL key. That would also avoid creating the local map and the checks we perform on it, but at the cost of serializing more information per WAL key.
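To make the first suggestion concrete, here is a minimal, self-contained sketch of the "compute the scopes once, reuse them in every append" idea. It is not the attached patch and does not use HBase classes: `ReplicationScopeCache`, `scopesForAppend()`, and the plain `Map` standing in for the table descriptor are hypothetical names chosen for illustration, and `Arrays::compareUnsigned` (Java 9+) is only assumed to play the role of `Bytes.BYTES_COMPARATOR`.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/**
 * Hypothetical sketch (not HBase API): resolves the non-local replication
 * scopes once, when the region is opened, instead of on every WAL append.
 */
final class ReplicationScopeCache {
    // Assumed to mirror the meaning of the constants used in the snippet above.
    static final int REPLICATION_SCOPE_LOCAL = 0;
    static final int REPLICATION_SCOPE_GLOBAL = 1;

    // Unsigned lexicographic byte[] ordering, standing in for Bytes.BYTES_COMPARATOR.
    private static final Comparator<byte[]> BYTES_COMPARATOR = Arrays::compareUnsigned;

    private final NavigableMap<byte[], Integer> nonLocalScopes;

    /**
     * @param familyScopes family name -> replication scope; a stand-in for
     *                     what the table descriptor would report per family
     */
    ReplicationScopeCache(Map<byte[], Integer> familyScopes) {
        NavigableMap<byte[], Integer> scopes = new TreeMap<>(BYTES_COMPARATOR);
        for (Map.Entry<byte[], Integer> e : familyScopes.entrySet()) {
            // Keep only families whose scope differs from the default (LOCAL).
            if (e.getValue() != REPLICATION_SCOPE_LOCAL) {
                scopes.put(e.getKey(), e.getValue());
            }
        }
        this.nonLocalScopes = Collections.unmodifiableNavigableMap(scopes);
    }

    /**
     * Intended to be called on every append: returns the precomputed map,
     * so there is no descriptor lookup and no per-append map allocation.
     */
    NavigableMap<byte[], Integer> scopesForAppend() {
        return nonLocalScopes;
    }
}
```

In this sketch the cached map holds every non-default CF of the table rather than only the CFs present in a given edit, which is the trade-off mentioned at the end of the comment: less work per append, but potentially more scope information serialized per WAL key.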
> Do not find the replication scope for every WAL#append()
> --------------------------------------------------------
>
>                 Key: HBASE-15205
>                 URL: https://issues.apache.org/jira/browse/HBASE-15205
>             Project: HBase
>          Issue Type: Sub-task
>          Components: regionserver
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Minor
>             Fix For: 2.0.0
>
>         Attachments: HBASE-15205.patch, ScopeWALEdits.jpg,
> ScopeWALEdits_afterpatch.jpg
>
>
> After byte[] and char[], the next largest contributor to GC (though
> it is only 2.86%) is UTF_8.newDecoder.
> This happens because for every WAL append we try to calculate the replication
> scope associated with the families in the TableDescriptor. I
> think doing this per WAL append is very costly and creates a lot of garbage.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)