[ 
https://issues.apache.org/jira/browse/HBASE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francis Liu updated HBASE-5498:
-------------------------------

    Attachment: HBASE-5498_94.patch

Here's a patch that is close to final. It is only missing unit tests for 
secure bulk load. I'm putting it up for initial comments.

After thinking more about how to deal with failure scenarios, I changed the 
design a bit. The secure bulk load RPC now mimics the existing 
bulkLoadHFiles() API. Failures are easier to deal with if all the necessary 
checks are done before an HFile is staged for the actual bulk load.
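
To illustrate the "check everything first, then stage" ordering, here is a 
minimal sketch. It is not code from the patch: the class name, the method 
stageAll(), the use of plain strings for HFile paths, and the in-memory 
staging list are all hypothetical stand-ins chosen to keep the example 
self-contained.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical illustration only; none of these names come from the patch.
final class ValidateThenStage {

    /**
     * Runs the check against every requested path before staging any of them.
     * If any check fails, an exception is thrown and stagingArea is untouched,
     * so there is no partially staged state to clean up.
     */
    static void stageAll(List<String> requested,
                         Predicate<String> check,
                         List<String> stagingArea) {
        // Phase 1: validate all inputs without side effects.
        for (String path : requested) {
            if (!check.test(path)) {
                throw new IllegalArgumentException(
                    "check failed, nothing staged: " + path);
            }
        }
        // Phase 2: reached only when every check passed.
        // A real implementation would move the file here; this sketch
        // just records the path.
        stagingArea.addAll(requested);
    }
}
```

Because the failure path never reaches phase 2, a rejected request needs no 
rollback of staged files.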

Given the similarity between the secure and non-secure APIs, we should 
probably consider integrating the secure bulk load RPC into the non-security 
classes (i.e. HRegionServer, HRegion, etc.) in 0.96, which would streamline 
the implementation.

Secure mode is now used under the covers: LoadIncrementalHFiles automatically 
switches to secure mode if HBase security is enabled.
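
A rough sketch of what such a switch could look like follows. It assumes the 
decision is driven by the hbase.security.authentication property being set to 
"kerberos"; how the patch actually detects security may differ. The class 
BulkLoadModeSelector, the Mode enum, and the plain Map standing in for the 
HBase Configuration are hypothetical.

```java
import java.util.Map;

// Hypothetical illustration only; not the logic from the patch.
final class BulkLoadModeSelector {

    // Assumption: security is signalled by this property being "kerberos".
    static final String SECURITY_KEY = "hbase.security.authentication";

    enum Mode { SECURE, NON_SECURE }

    /** Picks the bulk load path from the configuration; the caller never chooses. */
    static Mode select(Map<String, String> conf) {
        String auth = conf.getOrDefault(SECURITY_KEY, "simple");
        return "kerberos".equalsIgnoreCase(auth) ? Mode.SECURE : Mode.NON_SECURE;
    }
}
```

With this shape, client code calls the loader the same way in both 
deployments and the mode follows from the cluster configuration.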

> Secure Bulk Load
> ----------------
>
>                 Key: HBASE-5498
>                 URL: https://issues.apache.org/jira/browse/HBASE-5498
>             Project: HBase
>          Issue Type: Improvement
>          Components: mapred, security
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>             Fix For: 0.96.0
>
>         Attachments: HBASE-5498_94.patch, HBASE-5498_draft_94.patch, 
> HBASE-5498_draft.patch
>
>
> Design doc: 
> https://cwiki.apache.org/confluence/display/HCATALOG/HBase+Secure+Bulk+Load
> Short summary:
> Security as it stands does not cover the bulkLoadHFiles() feature. Users 
> calling this method will bypass ACLs. Loading is also more cumbersome in a 
> secure setting because of HDFS privileges: bulkLoadHFiles() moves the data 
> from the user's directory to the hbase directory, which requires certain 
> write privileges to be set.
> Our solution is to create a coprocessor that uses AuthManager to verify 
> that a user has write access to the table. If so, it launches an MR job as 
> the hbase user to do the importing (i.e. rewriting text into HFiles). One 
> tricky part is that this job has to impersonate the calling user when 
> reading the input files. We can do this by requiring the user to pass an 
> HDFS delegation token as part of the secureBulkLoad() coprocessor call and 
> extending an InputFormat to make use of that token. The output is written 
> to a temporary directory accessible only by hbase, and then 
> bulkLoadHFiles() is called.
