[ https://issues.apache.org/jira/browse/HBASE-12728?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14290400#comment-14290400 ]

Ted Yu commented on HBASE-12728:
--------------------------------

After doing a clean build, the test doesn't pass with addendum 2. There are a lot of exceptions:
{code}
2015-01-23 18:56:53,209 WARN  [PriorityRpcServer.handler=2,queue=0,port=52379] hdfs.DFSInputStream(1078): Connection failure: Failed to connect to /127.0.0.1:52369 for file /user/tyu/test-data/58b7b2ea-9919-4535-baf5-c3ed27fce466/data/hbase/meta/1588230740/info/517adf8439d34234bf13b9c98d8ebdfd for block BP-1225607801-192.168.0.19-1422068063192:blk_1073741838_1014:java.net.ConnectException: Connection refused
java.net.ConnectException: Connection refused
  at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
  at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:739)
...
  at org.apache.hadoop.hbase.regionserver.HRegion.getScanner(HRegion.java:2118)
  at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:2033)
{code}
The exceptions are not bubbled up to the caller.
Debugging.

> buffered writes substantially less useful after removal of HTablePool
> ---------------------------------------------------------------------
>
>                 Key: HBASE-12728
>                 URL: https://issues.apache.org/jira/browse/HBASE-12728
>             Project: HBase
>          Issue Type: Bug
>          Components: hbase
>    Affects Versions: 0.98.0
>            Reporter: Aaron Beppu
>            Assignee: Nick Dimiduk
>            Priority: Blocker
>             Fix For: 1.0.0, 2.0.0, 1.1.0
>
>         Attachments: 12728-1.0-addendum-2.txt, 
> 12728.connection-owns-buffers.example.branch-1.0.patch, HBASE-12728-2.patch, 
> HBASE-12728-3.patch, HBASE-12728-4.patch, HBASE-12728-5.patch, 
> HBASE-12728-6.patch, HBASE-12728-6.patch, HBASE-12728.05-branch-1.0.patch, 
> HBASE-12728.05-branch-1.patch, HBASE-12728.06-branch-1.0.patch, 
> HBASE-12728.06-branch-1.patch, HBASE-12728.addendum.patch, HBASE-12728.patch, 
> bulk-mutator.patch
>
>
> In previous versions of HBase, when use of HTablePool was encouraged, HTable 
> instances were long-lived in that pool, and for that reason, if autoFlush was 
> set to false, the table instance could accumulate a full buffer of writes 
> before a flush was triggered. Writes from the client to the cluster could 
> then be substantially larger and less frequent than without buffering.
> However, when HTablePool was deprecated, the primary justification seems to 
> have been that creating HTable instances is cheap, so long as the connection 
> and executor service passed to it are pre-provided. The encouraged pattern was
> for users to create a new HTable instance for every operation, using an
> existing connection and executor service, and then close the table. In this
> pattern, buffered writes are substantially less useful;
> writes are as small and as frequent as they would have been with 
> autoflush=true, except the synchronous write is moved from the operation 
> itself to the table close call which immediately follows.
> More concretely:
> ```
> // Given these two helpers ...
> private HTableInterface getAutoFlushTable(String tableName) throws IOException {
>   // (autoflush is true by default)
>   return storedConnection.getTable(tableName, executorService);
> }
> private HTableInterface getBufferedTable(String tableName) throws IOException {
>   HTableInterface table = getAutoFlushTable(tableName);
>   table.setAutoFlush(false);
>   return table;
> }
> // it's my contention that these two methods would behave almost identically,
> // except the first will hit a synchronous flush during the put call, and the
> // second will flush during the (hidden) close call on the table.
> private void writeAutoFlushed(String tableName, Put somePut) throws IOException {
>   try (HTableInterface table = getAutoFlushTable(tableName)) {
>     table.put(somePut); // will do synchronous flush
>   }
> }
> private void writeBuffered(String tableName, Put somePut) throws IOException {
>   try (HTableInterface table = getBufferedTable(tableName)) {
>     table.put(somePut);
>   } // auto-close will trigger synchronous flush
> }
> ```
> For buffered writes to actually provide a performance benefit to users, one 
> of two things must happen:
> - The writeBuffer itself shouldn't live, flush and die with the lifecycle of 
> its HTable instance. If the writeBuffer were managed elsewhere and had a long 
> lifespan, this could cease to be an issue. However, if the same writeBuffer 
> is appended to by multiple tables, then some additional concurrency control 
> will be needed around it.
> - Alternatively, there should be some pattern for having long-lived HTable 
> instances. However, since HTable is not thread-safe, we'd need multiple 
> instances, and a mechanism for leasing them out safely -- which sure sounds a 
> lot like the old HTablePool to me.
> See discussion on mailing list here : 
> http://mail-archives.apache.org/mod_mbox/hbase-user/201412.mbox/%3CCAPdJLkEzmUQZ_kvD%3D8mrxi4V%3DhCmUp3g9MUZsddD%2Bmon%2BAvNtg%40mail.gmail.com%3E
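The first option in the quoted description (a long-lived write buffer managed outside the table handle, with its own concurrency control) can be sketched in plain Java. This is a hypothetical illustration, not HBase API: `SharedWriteBuffer` and `flushSink` are invented names standing in for a connection-owned buffer and the batched RPC it would send.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical sketch: one buffer shared by many short-lived table handles.
// Because HTable itself is not thread-safe, the shared buffer supplies the
// concurrency control (synchronized methods here).
class SharedWriteBuffer<T> {
  private final int capacity;
  private final Consumer<List<T>> flushSink; // stand-in for a batched write RPC
  private List<T> pending = new ArrayList<>();

  SharedWriteBuffer(int capacity, Consumer<List<T>> flushSink) {
    this.capacity = capacity;
    this.flushSink = flushSink;
  }

  // Called by any handle; flushes automatically once a full batch accumulates.
  synchronized void add(T mutation) {
    pending.add(mutation);
    if (pending.size() >= capacity) {
      flush();
    }
  }

  // Explicit flush, e.g. at connection close.
  synchronized void flush() {
    if (pending.isEmpty()) return;
    List<T> batch = pending;
    pending = new ArrayList<>();
    flushSink.accept(batch);
  }

  public static void main(String[] args) {
    List<Integer> batchSizes = new ArrayList<>();
    SharedWriteBuffer<String> buf =
        new SharedWriteBuffer<>(3, batch -> batchSizes.add(batch.size()));
    for (int i = 0; i < 7; i++) {
      buf.add("mutation-" + i);
    }
    buf.flush(); // drains the partial batch; batches seen: 3, 3, 1
    System.out.println(batchSizes);
  }
}
```

With a buffer like this owned by the connection, `writeBuffered` from the example above could append to it and return immediately, and writes would batch across many short-lived table handles instead of flushing on every close.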



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
