[ 
https://issues.apache.org/jira/browse/HBASE-5498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13467033#comment-13467033
 ] 

Francis Liu commented on HBASE-5498:
------------------------------------

[[email protected]]
{quote}
"InternalBulkLoadListener" isn't necessary because there is no 
"BulkLoadListener" – just call it BulkLoadListener?
{quote}
I added "Internal" to make clear that it listens only to the actual moving of 
the file, not to the entire bulk load process, which is what the coprocessor 
hook does. I'm fine either way; I was just worried it would be misunderstood.
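
To illustrate the distinction, here is a minimal sketch of a listener whose 
callbacks fire only around the move of a single file. Every name in it 
(FileMoveListener, RecordingListener, FileMoveSketch, moveAll) is hypothetical 
and not taken from the patch, and the file-system rename is simulated by a 
predicate rather than real HDFS calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical interface (not from the patch): callbacks fire around the
// move of ONE file, not around the bulk load as a whole.
interface FileMoveListener {
    void beforeMove(String family, String srcPath);
    void moveDone(String family, String srcPath);
    void moveFailed(String family, String srcPath);
}

// Records which callbacks fired, in order, so the scope is easy to inspect.
class RecordingListener implements FileMoveListener {
    final List<String> events = new ArrayList<>();

    @Override public void beforeMove(String family, String srcPath) {
        events.add("before " + srcPath);
    }
    @Override public void moveDone(String family, String srcPath) {
        events.add("done " + srcPath);
    }
    @Override public void moveFailed(String family, String srcPath) {
        events.add("failed " + srcPath);
    }
}

class FileMoveSketch {
    // 'mover' is an assumption standing in for the real rename; it returns
    // true when the move succeeds. Returns the number of files moved.
    static int moveAll(String family, List<String> srcPaths,
                       Predicate<String> mover, FileMoveListener listener) {
        int moved = 0;
        for (String src : srcPaths) {
            listener.beforeMove(family, src);
            if (mover.test(src)) {
                listener.moveDone(family, src);
                moved++;
            } else {
                listener.moveFailed(family, src);
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        // Simulate a rename that fails for any path ending in "b".
        int moved = moveAll("cf", Arrays.asList("/stage/a", "/stage/b"),
                            p -> !p.endsWith("b"), listener);
        System.out.println(moved + " moved; events: " + listener.events);
    }
}
```

A coprocessor hook, by contrast, would be invoked once around the whole 
request rather than once per file, which is the ambiguity the "Internal" 
prefix was meant to avoid.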

{quote}
The new '// TODO deal with viewFS' in HStore gives me concern. I think this 
should be implemented, but don't have a strong opinion. There are other places 
where this is going to be an issue I suspect.
{quote}
My assumption was that HBase wasn't federation compatible yet. If that is true 
I think it's safe to push this to that future effort.

{quote}
In BaseRegionObserver we have "//TODO this should end up as a coprocessor hook" 
– Those proposed hooks should be added as part of this change IMO. I don't like 
the idea of BaseRegionObserver exporting something not part of the 
RegionObserver interface. It is supposed to be a default implementation of that 
interface not a superset.
{quote}
I didn't add this as a coprocessor hook because these are security-only 
methods, which we don't want to bleed into the core code in 0.94. I added it as 
a TODO so we can address this in 0.96 as part of streamlining things, since we 
don't need to have the artificial security separation in that codebase?

{quote}
In SecureBulkLoadEndpoint we have "//TODO make this configurable" – This should 
either be done or not?
{quote}
It is already configurable; I seem to have forgotten to remove the TODO.

{quote}
In SecureTestUtil, should we be loading the SecureBulkLoad support 
unconditionally? How about just for the relevant tests?
{quote}
Not sure what the downside would be? Since it is expected to always be enabled 
in a secure deployment, shouldn't it always be available in the tests?

{quote}
And maybe SecureBulkLoadProxy could be moved out of LoadIncrementalHFiles to a 
util class? Perhaps others will want to programatically import HFiles securely. 
{quote}
I added the proxy to prevent the security code from bleeding into the core 
code. I can extract this as a helper class if you think it's useful? It seemed 
to me that LoadIncrementalHFiles was the entry point for users who wanted to do 
bulk loads, as it does a lot of things that I believe users wouldn't want to 
re-roll themselves.

                
> Secure Bulk Load
> ----------------
>
>                 Key: HBASE-5498
>                 URL: https://issues.apache.org/jira/browse/HBASE-5498
>             Project: HBase
>          Issue Type: Improvement
>          Components: security
>            Reporter: Francis Liu
>            Assignee: Francis Liu
>             Fix For: 0.94.3, 0.96.0
>
>         Attachments: HBASE-5498_94.patch, HBASE-5498_94.patch, 
> HBASE-5498_draft_94.patch, HBASE-5498_draft.patch, HBASE-5498_trunk.patch
>
>
> Design doc: 
> https://cwiki.apache.org/confluence/display/HCATALOG/HBase+Secure+Bulk+Load
> Short summary:
> Security as it stands does not cover the bulkLoadHFiles() feature. Users 
> calling this method will bypass ACLs. Also loading is made more cumbersome in 
> a secure setting because of hdfs privileges. bulkLoadHFiles() moves the data 
> from user's directory to the hbase directory, which would require certain 
> write access privileges set.
> Our solution is to create a coprocessor which makes use of AuthManager to 
> verify if a user has write access to the table. If so, it launches an MR job 
> as the hbase user to do the importing (i.e. rewrite from text to hfiles). One 
> tricky part is that this job will have to impersonate the calling user when 
> reading the input files. We can do this by expecting the user to pass an hdfs 
> delegation token as part of the secureBulkLoad() coprocessor call and extend 
> an inputformat to make use of that token. The output is written to a 
> temporary directory accessible only by hbase and then bulkLoadHFiles() is 
> called.
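
The control flow in the summary above (check write access first, then write 
into a staging directory before loading) can be sketched roughly as follows. 
This is only an illustration under stated assumptions: SecureLoadSketch, 
WriteAccessCheck and stageForLoad are hypothetical names, the ACL check is a 
plain callback rather than AuthManager, and the MR job and delegation token 
handling are left out entirely.

```java
import java.util.ArrayList;
import java.util.List;

class SecureLoadSketch {
    // Hypothetical stand-in for the write-access check; not the real
    // AuthManager API.
    interface WriteAccessCheck {
        boolean canWrite(String user, String table);
    }

    // Rejects callers without write access, then maps each input file to a
    // path under the staging directory. Only the paths are computed here;
    // no files are read or moved.
    static List<String> stageForLoad(String user, String table,
                                     List<String> inputFiles,
                                     String stagingDir,
                                     WriteAccessCheck acl) {
        if (!acl.canWrite(user, table)) {
            throw new SecurityException(
                user + " lacks write access to " + table);
        }
        List<String> staged = new ArrayList<>();
        for (String in : inputFiles) {
            String name = in.substring(in.lastIndexOf('/') + 1);
            staged.add(stagingDir + "/" + name);
        }
        return staged;
    }
}
```

The point of the ordering is that nothing is staged on behalf of a user who 
fails the access check, so the ACL cannot be bypassed by calling the load path 
directly.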

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
