[ 
https://issues.apache.org/jira/browse/HBASE-17361?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15825762#comment-15825762
 ] 

Yu Li commented on HBASE-17361:
-------------------------------

Thanks for the reminder [~Apache9]

For HTable, I think we still have some interface compatibility issues, such as:
1. The {{setWriteBufferSize}} interface, which has existed from the very 
beginning
2. The {{setOperationTimeout}} interface, which has existed for more than 6 
years (since HBASE-2937)
3. The {{setRpcTimeout}} interface, which was introduced by HBASE-15645 and 
went into all 1.0+ branches (1.0.4, 1.1.5, 1.2.2, 1.3.0, 1.4.0)
4. The {{setRead/WriteRpcTimeout}} interfaces, which were introduced by 
HBASE-15866 and also went into branch-1 (1.4.0)

And to make HTable thread safe, we need to move all the above methods into a 
table builder similar to {{AsyncTableBuilder}}, right? I guess the change 
should be only for 2.0, and we need to add the incompatible flag in the 
release note?
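
To make the builder idea concrete, here is a minimal sketch. It assumes 
hypothetical names ({{TableSettings}} and its nested {{Builder}}) and a 
placeholder default value; none of this is existing HBase API. The point is 
only that once the setters live on a builder, the built object has final 
fields and can be shared across threads without synchronization.

```java
// Hypothetical sketch: TableSettings and Builder are illustrative names,
// not HBase classes. Default values are placeholders, not HBase defaults.
final class TableSettings {
  private final long writeBufferSize;
  private final int operationTimeout;
  private final int readRpcTimeout;
  private final int writeRpcTimeout;

  private TableSettings(Builder b) {
    this.writeBufferSize = b.writeBufferSize;
    this.operationTimeout = b.operationTimeout;
    this.readRpcTimeout = b.readRpcTimeout;
    this.writeRpcTimeout = b.writeRpcTimeout;
  }

  long getWriteBufferSize() { return writeBufferSize; }
  int getOperationTimeout() { return operationTimeout; }
  int getReadRpcTimeout() { return readRpcTimeout; }
  int getWriteRpcTimeout() { return writeRpcTimeout; }

  static Builder newBuilder() { return new Builder(); }

  // All mutation happens here, before the settings object is published.
  static final class Builder {
    private long writeBufferSize = 2L * 1024 * 1024; // placeholder default
    private int operationTimeout = 60_000;           // placeholder default (ms)
    private int readRpcTimeout = 60_000;             // placeholder default (ms)
    private int writeRpcTimeout = 60_000;            // placeholder default (ms)

    Builder setWriteBufferSize(long size) { this.writeBufferSize = size; return this; }
    Builder setOperationTimeout(int millis) { this.operationTimeout = millis; return this; }
    Builder setReadRpcTimeout(int millis) { this.readRpcTimeout = millis; return this; }
    Builder setWriteRpcTimeout(int millis) { this.writeRpcTimeout = millis; return this; }

    TableSettings build() { return new TableSettings(this); }
  }
}
```

A caller would write something like 
{{TableSettings.newBuilder().setWriteBufferSize(4096L).build()}} once and hand 
the result to the table at construction time, instead of calling setters on a 
live instance.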

Please let me know your thoughts, and I'll prepare the patch as soon as we 
reach consensus. Thanks.

> Make HTable thread safe
> -----------------------
>
>                 Key: HBASE-17361
>                 URL: https://issues.apache.org/jira/browse/HBASE-17361
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Yu Li
>            Assignee: Yu Li
>            Priority: Critical
>         Attachments: HBASE-17361.patch, HBASE-17361.patch
>
>
> Currently HTable is marked as NOT thread safe, and this JIRA aims to 
> improve this and make better use of the thread-safe BufferedMutator.
> Some findings/work done:
> If we try to put to the same HTable instance in parallel, there will be a 
> problem, since {{HTable#getBufferedMutator}} currently looks like
> {code}
>   BufferedMutator getBufferedMutator() throws IOException {
>     if (mutator == null) {
>       this.mutator = (BufferedMutatorImpl) connection.getBufferedMutator(
>           new BufferedMutatorParams(tableName)
>               .pool(pool)
>               .writeBufferSize(connConfiguration.getWriteBufferSize())
>               .maxKeyValueSize(connConfiguration.getMaxKeyValueSize())
>       );
>     }
>     mutator.setRpcTimeout(writeRpcTimeout);
>     mutator.setOperationTimeout(operationTimeout);
>     return mutator;
>   }
> {code}
> And {{HTable#flushCommits}}:
> {code}
>   void flushCommits() throws IOException {
>     if (mutator == null) {
>       // nothing to flush if there's no mutator; don't bother creating one.
>       return;
>     }
>     getBufferedMutator().flush();
>   }
> {code}
> For {{HTable#put}}
> {code}
>   public void put(final Put put) throws IOException {
>     getBufferedMutator().mutate(put);
>     flushCommits();
>   }
> {code}
> If we launch multiple threads to put in parallel, the sequence below might 
> happen because {{HTable#getBufferedMutator}} is not thread safe:
> {noformat}
> 1. ThreadA enters getBufferedMutator and finds mutator==null
> 2. ThreadB enters getBufferedMutator and finds mutator==null
> 3. ThreadA initializes mutator to instanceA, then calls mutator#mutate,
> adding one put (putA) into {{writeAsyncBuffer}}
> 4. ThreadB initializes mutator to instanceB
> 5. ThreadA enters flushCommits; mutator is now instanceB, so it calls
> instanceB's flush method and putA is lost
> {noformat}
> After fixing this, we find considerable contention on 
> {{BufferedMutatorImpl#flush}}, so more effort is required to make HTable 
> thread safe while keeping good performance.
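
The lost-put sequence in the quoted description can be avoided by creating the 
mutator exactly once and by having put use a single local reference for both 
mutate and flush. The following is a minimal sketch of that idea using 
double-checked locking on a volatile field. {{SketchMutator}} and 
{{SketchTable}} are hypothetical stand-ins, not the HBase classes, and the 
sketch addresses only the initialization race, not the flush contention 
mentioned above.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a thread-safe buffered mutator (not HBase API).
final class SketchMutator {
  private final ConcurrentLinkedQueue<String> buffer = new ConcurrentLinkedQueue<>();
  private final AtomicInteger flushed = new AtomicInteger();

  void mutate(String put) { buffer.add(put); }

  // Drains the buffer; each buffered put is counted exactly once.
  void flush() {
    while (buffer.poll() != null) {
      flushed.incrementAndGet();
    }
  }

  int flushedCount() { return flushed.get(); }
}

// Hypothetical stand-in for the table; shows only the lazy-init pattern.
final class SketchTable {
  private volatile SketchMutator mutator;
  private final AtomicInteger creations = new AtomicInteger();

  SketchMutator getMutator() {
    SketchMutator m = mutator;        // first check, no lock
    if (m == null) {
      synchronized (this) {
        m = mutator;                  // second check, under the lock
        if (m == null) {
          m = new SketchMutator();
          creations.incrementAndGet();
          mutator = m;                // volatile write publishes the instance
        }
      }
    }
    return m;
  }

  void put(String put) {
    // Use one local reference so mutate and flush hit the same instance.
    SketchMutator m = getMutator();
    m.mutate(put);
    m.flush();
  }

  int creations() { return creations.get(); }

  int flushedCount() {
    SketchMutator m = mutator;
    return m == null ? 0 : m.flushedCount();
  }
}
```

With this shape, two threads that both observe a null field still end up with 
the same instance, so a put buffered by one thread cannot be stranded in a 
mutator that is later replaced.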



