[ https://issues.apache.org/jira/browse/HBASE-18116?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16025196#comment-16025196 ]
Andrew Purtell edited comment on HBASE-18116 at 5/25/17 7:00 PM:
-----------------------------------------------------------------
In addition, when estimating the size of a replication queue entry we only
consider the WALEdit objects, not the associated WALKey objects.
was (Author: apurtell):
Also, when calculating the heap size of a replication queue entry we only track
the WALEdit objects, not the associated WALKey objects.
> Replication buffer quota accounting should not include bulk transfer hfiles
> ---------------------------------------------------------------------------
>
> Key: HBASE-18116
> URL: https://issues.apache.org/jira/browse/HBASE-18116
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Andrew Purtell
>
> In ReplicationSourceWALReaderThread we maintain a global quota on enqueued
> replication work to prevent OOM from queuing up too many edits on heap.
> When calculating the size of a given replication queue entry, if it
> has associated hfiles (is a bulk load to be replicated as a batch of hfiles),
> we get the file sizes and include the sum. We then apply that result to the
> quota. This isn't quite right. Those hfiles will be pulled by the sink as a
> file copy, not pushed by the source. The cells in those files are not queued
> in memory at the source and therefore shouldn't be counted against the quota.
> Related, the sum of the hfile sizes is also included when checking if queued
> work exceeds the configured replication queue capacity, which is by default
> 64 MB. HFiles are commonly much larger than this.
> So when we encounter a bulk load replication entry, typically both the
> quota and capacity limits are exceeded, we break out of the loops, and
> send right away. What is transferred on the wire via HBase RPC, though, has
> only a partial relationship to the calculation.
> Depending on how you look at it, it makes sense to factor hfile file sizes
> against replication queue capacity limits. The sink will be occupied
> transferring those files at the HDFS level. Anyway, this is how we have been
> doing it and it is too late to change now. I do not however think it is
> correct to apply hfile file sizes against a quota for in memory state on the
> source. The source doesn't queue or even transfer those bytes.
> Something I noticed while working on HBASE-18027.
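
To make the distinction concrete, here is a minimal sketch of the accounting the description argues for. It is not the HBase code: every class, field, and method name below is hypothetical, and the sizes are passed in as plain numbers rather than read from real WALEdit, WALKey, or hfile objects. It assumes the heap quota should count the edit plus the key (per the comment above) and exclude hfile sizes, while the batch capacity check keeps including hfile sizes as it does today.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch only; none of these names are HBase APIs. */
final class ReplicationAccountingSketch {

    /** Stand-in for one replication queue entry. */
    static final class Entry {
        final long editHeapBytes; // heap held by the edit (the WALEdit in the issue)
        final long keyHeapBytes;  // heap held by the key (the WALKey in the comment)
        final long[] hfileBytes;  // sizes of bulk-loaded hfiles the entry refers to

        Entry(long editHeapBytes, long keyHeapBytes, long... hfileBytes) {
            this.editHeapBytes = editHeapBytes;
            this.keyHeapBytes = keyHeapBytes;
            this.hfileBytes = hfileBytes;
        }

        /** Bytes resident on the source heap: edit plus key, never hfile contents. */
        long heapBytes() {
            return editHeapBytes + keyHeapBytes;
        }

        /** Bytes counted toward batch capacity: heap bytes plus referenced hfile sizes. */
        long capacityBytes() {
            long total = heapBytes();
            for (long size : hfileBytes) {
                total += size;
            }
            return total;
        }
    }

    /** Totals for one batch handed to the shipper. */
    static final class Batch {
        int entries;
        long heapBytes;     // what would be charged to the in-memory quota
        long capacityBytes; // what would be compared with the capacity limit
    }

    /** Adds entries until either the capacity limit or the remaining heap quota is reached. */
    static Batch readBatch(List<Entry> pending, long capacityLimit, long quotaRemaining) {
        Batch batch = new Batch();
        for (Entry entry : pending) {
            batch.entries++;
            batch.heapBytes += entry.heapBytes();
            batch.capacityBytes += entry.capacityBytes();
            if (batch.capacityBytes >= capacityLimit || batch.heapBytes >= quotaRemaining) {
                break; // stop reading and send what we have
            }
        }
        return batch;
    }

    public static void main(String[] args) {
        long capacityLimit = 64L * 1024 * 1024; // the 64 MB default from the description
        long quotaRemaining = 1024L * 1024;     // illustrative value, not a real default
        List<Entry> pending = Arrays.asList(
            new Entry(1024, 100),                        // ordinary edit
            new Entry(200, 100, 256L * 1024 * 1024),     // bulk load marker with one hfile
            new Entry(1024, 100));                       // not reached in this batch
        Batch batch = readBatch(pending, capacityLimit, quotaRemaining);
        System.out.println("entries=" + batch.entries
            + " heapBytes=" + batch.heapBytes
            + " capacityBytes=" + batch.capacityBytes);
    }
}
```

In this sketch the bulk load entry still ends the batch because it trips the capacity limit, but only the edit and key bytes are charged to the heap quota.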