[ 
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151448#comment-16151448
 ] 

Yongjun Zhang commented on HDFS-12357:
--------------------------------------

Hi [~manojg],

{quote}
Having UserFilterINodeAttributeProvider seems like a cleaner approach. Is it 
possible to examine the bypassUser config and skip the wrapper 
UserFilterINodeAttributeProvider if the user list is empty. Most of the times, 
the bypass user list is going to empty and we can totally skip the wrapper if 
so.
{quote}
Thanks for the good point, and sorry I missed it again; there were too many 
updates today.

If we move the code that loads the conf and checks isBypassUser into the 
{{FSDirectory}} class (as done in v001), we could skip the wrapper when the 
bypassUser list is empty. However, even when the list is not empty, it will 
usually contain only one or two users, so the wrapper is still created and 
consulted for the many other users who are not in the list. Any further 
thoughts?
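To make the "skip the wrapper when the list is empty" idea concrete, here is a minimal sketch. The names {{AttrProvider}}, {{UserFilterProvider}}, and {{ProviderSetup.maybeWrap}} are hypothetical stand-ins for illustration only; they are not the Hadoop classes or the code in any of the attached patches, and attributes are reduced to plain strings.

```java
import java.util.Set;

// Hypothetical stand-in for an external attribute provider (not the Hadoop API).
interface AttrProvider {
  String getAttributes(String user, String path, String defaultAttrs);
}

// Hypothetical wrapper: bypass users get the default (HDFS) attributes,
// everyone else is forwarded to the wrapped external provider.
final class UserFilterProvider implements AttrProvider {
  private final AttrProvider delegate;
  private final Set<String> bypassUsers;

  UserFilterProvider(AttrProvider delegate, Set<String> bypassUsers) {
    this.delegate = delegate;
    this.bypassUsers = bypassUsers;
  }

  @Override
  public String getAttributes(String user, String path, String defaultAttrs) {
    if (bypassUsers.contains(user)) {
      return defaultAttrs;
    }
    return delegate.getAttributes(user, path, defaultAttrs);
  }
}

final class ProviderSetup {
  private ProviderSetup() {}

  // Install the wrapper only when at least one bypass user is configured;
  // otherwise hand back the external provider untouched.
  static AttrProvider maybeWrap(AttrProvider external, Set<String> bypassUsers) {
    if (external == null || bypassUsers.isEmpty()) {
      return external;
    }
    return new UserFilterProvider(external, bypassUsers);
  }
}
```

With an empty list this adds no per-call overhead. With a non-empty list, every user, including the many who are not in the list, still goes through the wrapper, which is the concern raised above.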

Hi [~chris.douglas],

Looking again at the change I made in rev5: it saves the extra cost of 
{{components = Arrays.copyOfRange(components, 1, components.length);}}, but it 
introduces another one: {{isBypassUser()}} is called twice. The first call is at
{code}
    if (attributeProvider != null &&
        !attributeProvider.isBypassUser()) {
{code}
The other is inside the wrapper implementation, reached through
{code}
nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
{code} 

After the first check finds that the user is not a bypassUser, the second one 
checks again. Unfortunately, this extra call happens for most users. It does 
not seem easy to avoid both extra costs with the wrapper approach.
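The double check can be illustrated with a small counting sketch. {{CountingFilter}} and {{Caller}} are hypothetical names invented for this illustration, attributes are plain strings, and the code only mimics the shape of the rev5 call site; it is not the patch itself.

```java
import java.util.Set;

// Hypothetical wrapper that counts how often the bypass check runs.
final class CountingFilter {
  private final Set<String> bypassUsers;
  int checks = 0;

  CountingFilter(Set<String> bypassUsers) {
    this.bypassUsers = bypassUsers;
  }

  boolean isBypassUser(String user) {
    checks++;
    return bypassUsers.contains(user);
  }

  // The wrapper has to re-check, because it cannot assume its caller did.
  String getAttributes(String user, String defaultAttrs) {
    if (isBypassUser(user)) {
      return defaultAttrs;
    }
    return "external";
  }
}

final class Caller {
  private Caller() {}

  // Mimics the call site: check first so that the per-call preparation
  // (for example, copying the path components) is skipped for bypass users.
  static String resolve(CountingFilter provider, String user, String defaultAttrs) {
    if (provider != null && !provider.isBypassUser(user)) {
      return provider.getAttributes(user, defaultAttrs);
    }
    return defaultAttrs;
  }
}
```

In this sketch a bypass user costs one check, while every other user costs two: one at the call site and one inside the wrapper.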

The v001 implementation doesn't have either of these extra costs, but the 
wrapper class is certainly a better abstraction. I can go with either approach 
once we agree, and we can keep improving the solution.

Thanks a lot.






> Let NameNode to bypass external attribute provider for special user
> -------------------------------------------------------------------
>
>                 Key: HDFS-12357
>                 URL: https://issues.apache.org/jira/browse/HDFS-12357
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Yongjun Zhang
>            Assignee: Yongjun Zhang
>         Attachments: HDFS-12357.001.patch, HDFS-12357.002.patch, 
> HDFS-12357.003.patch, HDFS-12357.004.patch, HDFS-12357.005.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is, when we do distcp from one cluster to another (or within the 
> same cluster), in addition to copying file data, we copy the metadata from 
> source to target. If external attribute provider is enabled, the metadata may 
> be read from the provider, thus provider data read from source may be saved 
> to target HDFS. 
> We want to avoid saving metadata from external provider to HDFS, so we want 
> to bypass external provider when doing the distcp (or hadoop fs -cp) 
> operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202, the 
> other in HDFS-12294. The proposal here is the third one.
> The idea is, we introduce a new config, that specifies a special user (or a 
> list of users), and let NN bypass external provider when the current user is 
> a special user.
> If we run applications as the special user that need data from external 
> attribute provider, then it won't work. So the constraint on this approach 
> is, the special users here should not run applications that need data from 
> external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn], 
> [~manojg] for the discussions in the other jiras. 
> I'm creating this one to discuss further.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
