[ https://issues.apache.org/jira/browse/HBASE-3473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14311870#comment-14311870 ]
Lars Hofhansl commented on HBASE-3473:
--------------------------------------

I'm interested in this as we've seen fairly low replication speeds (compared to the speed with which we can load data into the primary cluster).

It turns out the TCP receive buffer can be set to anything bigger than 64K only before the negotiation phase, and only on the ServerSocket. Since we listen on only one port, we have only a single ServerSocket (unless I am missing something), so I am not sure how we can set this for the replication RPCs alone. [~ghelmling], [~stack].

Part of the slower replication speed - presumably - is that we have only a single thread writing to the remote cluster, while we have potentially many threads writing into the primary cluster. If we want to preserve ordering, I also do not see an easy way out of this.

> Add a configuration property for socket receive buffer size for replication endpoints
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-3473
>                 URL: https://issues.apache.org/jira/browse/HBASE-3473
>             Project: HBase
>          Issue Type: Improvement
>          Components: Replication
>            Reporter: Gary Helmling
>            Priority: Minor
>
> Looking at this blog post about optimizing replication throughput for LinkedIn's Kafka:
> http://sna-projects.com/blog/2011/01/optimizing-tcp-socket-across-data-centers/
> It seems worth testing whether HBase replication connections can also benefit from increasing the socket receive buffer size on (expected to be) high-latency connections.
> To this end, we would add a new configuration property for the receive buffer size of replication connections and do some benchmarking to evaluate throughput with different values, verifying that making this configurable would have a significant impact. For the moment, it seems best to scope the configuration setting to replication connections only, in order to avoid also impacting (negatively) intra-cluster communications.
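
For anyone who wants to poke at the ServerSocket constraint described in the comment, here is a minimal Java sketch. It is not HBase code: the class name ReceiveBufferSketch, the helper openListener, and the 256K value are all made up for illustration, and it uses a plain java.net.ServerSocket rather than whatever the RPC server actually does. It only shows the ordering the JDK documents for windows above 64K: create the ServerSocket unbound, call setReceiveBufferSize, then bind. The OS may clamp or round the requested size, so the sketch prints what it got instead of assuming a value.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ReceiveBufferSketch {
    // Hypothetical value for illustration; the issue proposes making this configurable.
    static final int REQUESTED_RCVBUF = 256 * 1024;

    // Creates an unbound ServerSocket, requests the receive buffer size,
    // and only then binds. Port 0 lets the OS pick a free port.
    static ServerSocket openListener(int port, int rcvBuf) throws IOException {
        ServerSocket server = new ServerSocket();   // unbound on purpose
        server.setReceiveBufferSize(rcvBuf);        // must precede bind() for > 64K windows
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), port));
        return server;
    }

    public static void main(String[] args) throws IOException {
        try (ServerSocket server = openListener(0, REQUESTED_RCVBUF)) {
            // The OS may adjust the request, so report the effective size.
            System.out.println("listener SO_RCVBUF = " + server.getReceiveBufferSize());

            // Connect a local client so there is something to accept.
            try (Socket client = new Socket(InetAddress.getLoopbackAddress(), server.getLocalPort());
                 Socket accepted = server.accept()) {
                System.out.println("accepted SO_RCVBUF = " + accepted.getReceiveBufferSize());
            }
        }
    }
}
```

Because the size is requested on the listening socket before bind, it applies to every connection accepted on that port. That is the difficulty raised above: with a single listening port there is no obvious place to request a larger buffer for replication RPCs only.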