[ 
https://issues.apache.org/jira/browse/HBASE-21342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mazhenlin updated HBASE-21342:
------------------------------
    Description: 
As mentioned in HBASE-15291 
(https://issues.apache.org/jira/browse/HBASE-15291), there is a race 
condition. If two secure bulk load calls from the same UGI go to two different 
regions and one region finishes earlier, it closes the bulk load FileSystem, 
and the load in the other region then fails.

Another case is more serious. FileSystem.close() has to synchronize on two 
objects: CACHE and deleteOnExit. If one region calls 
FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
another region is trying to close srcFS (in 
SecureBulkLoadListener.closeSrcFs), a deadlock can occur.

I have written a UT for this and fixed it using a reference counter.

 


> FileSystem in use was closed by others  in secure bulkLoad
> ----------------------------------------------------------
>
>                 Key: HBASE-21342
>                 URL: https://issues.apache.org/jira/browse/HBASE-21342
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 3.0.0, 2.1.0, 1.5.0, 1.3.3, 1.4.4, 2.0.1, 1.2.7
>            Reporter: mazhenlin
>            Priority: Major
>         Attachments: race.patch
>
>
> As mentioned in HBASE-15291 
> (https://issues.apache.org/jira/browse/HBASE-15291), there is a race 
> condition. If two secure bulk load calls from the same UGI go to two different 
> regions and one region finishes earlier, it closes the bulk load FileSystem, 
> and the load in the other region then fails.
>
> Another case is more serious. FileSystem.close() has to synchronize on two 
> objects: CACHE and deleteOnExit. If one region calls 
> FileSystem.closeAllForUGI (in SecureBulkLoadManager.cleanupBulkLoad) while 
> another region is trying to close srcFS (in 
> SecureBulkLoadListener.closeSrcFs), a deadlock can occur.
>
> I have written a UT for this and fixed it using a reference counter.
>  
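
For readers unfamiliar with the approach, here is a minimal sketch of what a per-user reference counter could look like. It is not the code in race.patch: the class UgiRefCounter, its method names, and the use of a plain String in place of a UGI are all assumptions made for illustration, and the real FileSystem.closeAllForUGI call is replaced by a counter so the example runs without Hadoop on the classpath.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not the code in race.patch: counts in-flight bulk
// loads per user so the shared FileSystem is closed only by the last one.
final class UgiRefCounter {
    // user name -> number of bulk loads currently using that user's FileSystem
    private final ConcurrentHashMap<String, Integer> refs = new ConcurrentHashMap<>();
    // Stand-in for FileSystem.closeAllForUGI(ugi); it only counts invocations.
    private final AtomicInteger closeCalls = new AtomicInteger();

    // Call when a bulk load starts.
    void acquire(String user) {
        refs.merge(user, 1, Integer::sum);
    }

    // Call when a bulk load finishes; returns true if this call did the close.
    boolean release(String user) {
        boolean[] closed = {false};
        // compute() runs atomically per key, so a concurrent acquire() for the
        // same user cannot slip in between the decrement and the close.
        refs.compute(user, (k, count) -> {
            if (count == null) {
                throw new IllegalStateException("release without acquire: " + k);
            }
            if (count > 1) {
                return count - 1;         // other loads still use the FileSystem
            }
            closeCalls.incrementAndGet(); // last user: the real close would go here
            closed[0] = true;
            return null;                  // drop the entry
        });
        return closed[0];
    }

    int closeCount() {
        return closeCalls.get();
    }

    public static void main(String[] args) {
        UgiRefCounter counter = new UgiRefCounter();
        counter.acquire("alice");                      // region A starts
        counter.acquire("alice");                      // region B starts
        System.out.println(counter.release("alice"));  // region A finishes first
        System.out.println(counter.release("alice"));  // region B finishes last
        System.out.println(counter.closeCount());
    }
}
```

In this sketch the region that finishes first only decrements the count, so the FileSystem stays open for the region that is still loading. Because a single code path performs the close, two regions also never attempt to close the same FileSystem at the same time.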



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
