[
https://issues.apache.org/jira/browse/HDFS-12357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16151362#comment-16151362
]
Yongjun Zhang commented on HDFS-12357:
--------------------------------------
Hi [~chris.douglas],
I uploaded rev005 to avoid the {{components = Arrays.copyOfRange(components, 1,
components.length);}} overhead.
Basically, I added a new package-scope API, {{boolean isBypassUser()}}, to the
{{INodeAttributeProvider}} class, with a default implementation that returns
false, and let {{UserFilterINodeAttributeProvider}} override it. The caller
then does the following:
{code}
if (attributeProvider != null &&
    !attributeProvider.isBypassUser()) {
  // permission checking sends the full components array including the
  // first empty component for the root. however file status
  // related calls are expected to strip out the root component according
  // to TestINodeAttributeProvider.
  byte[][] components = iip.getPathComponents();
  components = Arrays.copyOfRange(components, 1, components.length);
  nodeAttrs = attributeProvider.getAttributes(components, nodeAttrs);
}
return nodeAttrs;
......
{code}
This is similar to the logic in v001.
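To make the pattern concrete, here is a minimal, self-contained sketch of it. The names {{SketchAttributeProvider}}, {{SketchUserFilterProvider}}, and {{SketchResolver}} are hypothetical stand-ins, not the real Hadoop classes, and both the String-based types and the fixed set of bypass users are assumptions made only for illustration; the actual patch may determine the bypass user differently.

```java
import java.util.Arrays;
import java.util.Set;

// Hypothetical stand-in for INodeAttributeProvider (not the real Hadoop class).
abstract class SketchAttributeProvider {
  // Package-scope hook. The default never bypasses, so providers that do
  // not override it keep their existing behavior.
  boolean isBypassUser() {
    return false;
  }

  abstract String getAttributes(String[] components, String defaultAttrs);
}

// Hypothetical stand-in for UserFilterINodeAttributeProvider.
class SketchUserFilterProvider extends SketchAttributeProvider {
  private final Set<String> bypassUsers;
  private final String currentUser;

  // Assumption: the current user is passed in directly for illustration.
  SketchUserFilterProvider(Set<String> bypassUsers, String currentUser) {
    this.bypassUsers = bypassUsers;
    this.currentUser = currentUser;
  }

  @Override
  boolean isBypassUser() {
    return bypassUsers.contains(currentUser);
  }

  @Override
  String getAttributes(String[] components, String defaultAttrs) {
    // Pretend the external provider supplies its own attributes.
    return "external:" + String.join("/", components);
  }
}

class SketchResolver {
  // Mirrors the shape of the snippet above: the array copy and the provider
  // call are both skipped when the provider reports a bypass user.
  static String resolve(SketchAttributeProvider provider,
                        String[] components, String nodeAttrs) {
    if (provider != null && !provider.isBypassUser()) {
      // Strip the leading empty root component before calling the provider.
      String[] stripped =
          Arrays.copyOfRange(components, 1, components.length);
      nodeAttrs = provider.getAttributes(stripped, nodeAttrs);
    }
    return nodeAttrs;
  }
}
```

In this sketch the check happens before the array copy, so a bypass user never pays for {{Arrays.copyOfRange}}; the price is one extra method on the provider's surface.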
So the trade-off is between not exposing the {{isBypassUser}} API and paying
the {{Arrays.copyOfRange}} overhead, versus exposing it and saving that cost.
What do you think?
Thanks.
> Let NameNode to bypass external attribute provider for special user
> -------------------------------------------------------------------
>
> Key: HDFS-12357
> URL: https://issues.apache.org/jira/browse/HDFS-12357
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
> Attachments: HDFS-12357.001.patch, HDFS-12357.002.patch,
> HDFS-12357.003.patch, HDFS-12357.004.patch, HDFS-12357.005.patch
>
>
> This is a third proposal to solve the problem described in HDFS-12202.
> The problem is that when we run distcp from one cluster to another (or within
> the same cluster), we copy the metadata from source to target in addition to
> the file data. If an external attribute provider is enabled, the metadata may
> be read from the provider, so provider data read from the source may be saved
> to the target HDFS.
> We want to avoid saving metadata from the external provider to HDFS, so we
> want to bypass the external provider when doing the distcp (or hadoop fs -cp)
> operation.
> Two alternative approaches were proposed earlier, one in HDFS-12202, the
> other in HDFS-12294. The proposal here is the third one.
> The idea is to introduce a new config that specifies a special user (or a
> list of users), and let the NN bypass the external provider when the current
> user is a special user.
> Applications that need data from the external attribute provider won't work
> if they run as a special user. So the constraint of this approach is that
> the special users should not run applications that need data from the
> external provider.
> Thanks [~asuresh] for proposing this idea and [~chris.douglas], [~daryn],
> [~manojg] for the discussions in the other jiras.
> I'm creating this one to discuss further.