[ 
https://issues.apache.org/jira/browse/HBASE-6339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Harsh J updated HBASE-6339:
---------------------------

    Description: 
I noticed that right now, under a bulkLoadHFiles call to an RS, we grab the
HRegion write lock as soon as we determine that we are attempting a
multi-family bulk load. The file copy from the caller's source FS is then done
while holding the lock.

This doesn't seem right. For instance, we had a recent use case where the
cluster running the bulk load is a separate HDFS instance/cluster from the one
that runs HBase, and transfers between these FSes can be slower than an
intra-cluster transfer. Hence I think we should begin to hold the write lock
only after we've got a successful destinationFS copy of the requested file, and
thereby allow more write throughput in the meantime.
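
The proposed ordering can be sketched as follows. This is only a minimal
illustration under stated assumptions, not HBase code: BulkLoadSketch,
regionLock, storeDir and wasLockHeldDuringCopy are hypothetical names, and
plain java.nio file operations stand in for the cross-FS transfer. The point
is that the slow copy runs before the write lock is taken, and the lock is
held only for the cheap move into place.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a region; none of these names are HBase APIs.
class BulkLoadSketch {
    // Stand-in for the region-level read/write lock.
    private final ReentrantReadWriteLock regionLock = new ReentrantReadWriteLock();
    // Stand-in for the destination store directory on the HBase-side FS.
    private final Path storeDir;
    // Records whether this thread held the write lock during the copy.
    private boolean lockHeldDuringCopy;

    BulkLoadSketch(Path storeDir) {
        this.storeDir = storeDir;
    }

    Path bulkLoad(Path source) throws IOException {
        // Step 1: do the potentially slow copy without holding the write lock,
        // staging the file next to its final location.
        Path staged = Files.createTempFile(storeDir, "staged-", ".tmp");
        try {
            Files.copy(source, staged, StandardCopyOption.REPLACE_EXISTING);
        } catch (IOException e) {
            // Do not leave a partial staged file behind on failure.
            Files.deleteIfExists(staged);
            throw e;
        }
        lockHeldDuringCopy = regionLock.isWriteLockedByCurrentThread();

        // Step 2: take the write lock only for the cheap move into place.
        regionLock.writeLock().lock();
        try {
            Path target = storeDir.resolve(source.getFileName());
            return Files.move(staged, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            regionLock.writeLock().unlock();
        }
    }

    boolean wasLockHeldDuringCopy() {
        return lockHeldDuringCopy;
    }
}
```

With this ordering, other writers are blocked only for the duration of the
move, not for the duration of the transfer between the two FSes.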

Does this sound reasonable to do?

  was:
I noticed that right now, under a bulkLoadHFiles call to an RS, we grab the
write lock as soon as we determine that we are attempting a multi-family bulk
load. The file copy from the caller's source FS is then done while holding the
lock.

This doesn't seem right. For instance, we had a recent use case where the
cluster running the bulk load is a separate HDFS instance/cluster from the one
that runs HBase, and transfers between these FSes can be slower than an
intra-cluster transfer. Hence I think we should begin to hold the write lock
only after we've got a successful destinationFS copy of the requested file, and
thereby allow more write throughput in the meantime.

Does this sound reasonable to do?

    
> Bulkload call to RS should begin holding write lock only after the file has 
> been transferred
> --------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6339
>                 URL: https://issues.apache.org/jira/browse/HBASE-6339
>             Project: HBase
>          Issue Type: Improvement
>          Components: client, regionserver
>    Affects Versions: 0.90.0
>            Reporter: Harsh J
>            Assignee: Harsh J
>
> I noticed that right now, under a bulkLoadHFiles call to an RS, we grab the
> HRegion write lock as soon as we determine that we are attempting a
> multi-family bulk load. The file copy from the caller's source FS is then
> done while holding the lock.
> This doesn't seem right. For instance, we had a recent use case where the
> cluster running the bulk load is a separate HDFS instance/cluster from the
> one that runs HBase, and transfers between these FSes can be slower than an
> intra-cluster transfer. Hence I think we should begin to hold the write lock
> only after we've got a successful destinationFS copy of the requested file,
> and thereby allow more write throughput in the meantime.
> Does this sound reasonable to do?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
