[
https://issues.apache.org/jira/browse/HDFS-12202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yongjun Zhang reassigned HDFS-12202:
------------------------------------
Assignee: Yongjun Zhang
> Provide new set of FileSystem API to bypass external attribute provider
> -----------------------------------------------------------------------
>
> Key: HDFS-12202
> URL: https://issues.apache.org/jira/browse/HDFS-12202
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: hdfs, hdfs-client
> Reporter: Yongjun Zhang
> Assignee: Yongjun Zhang
>
> The HDFS client uses the following FileSystem methods
> {code}
> /**
> * Return a file status object that represents the path.
> * @param f The path we want information from
> * @return a FileStatus object
> * @throws FileNotFoundException when the path does not exist
> * @throws IOException see specific implementation
> */
> public abstract FileStatus getFileStatus(Path f) throws IOException;
> /**
> * List the statuses of the files/directories in the given path if the path is
> * a directory.
> * <p>
> * Does not guarantee to return the List of files/directories status in a
> * sorted order.
> * <p>
> * Will not return null. Expect IOException upon access error.
> * @param f given path
> * @return the statuses of the files/directories in the given path
> * @throws FileNotFoundException when the path does not exist
> * @throws IOException see specific implementation
> */
> public abstract FileStatus[] listStatus(Path f)
> throws FileNotFoundException, IOException;
> {code}
> to get FileStatus of files.
> When an external attribute provider (INodeAttributeProvider) is enabled for a
> cluster, it is consulted for some of the relevant info (including ACL, group,
> etc.), and what it supplies is returned in FileStatus.
> This causes a problem: when we use distcp to copy files from srcCluster to
> tgtCluster and srcCluster has an external attribute provider enabled, the data
> we copy contains data from the attribute provider, which we may not want.
> This jira is created to add a new set of interfaces for distcp to use, so that
> distcp can copy HDFS data only and bypass external attribute provider data.
> The new set of APIs would look like
> {code}
> /**
> * Return a file status object that represents the path.
> * @param f The path we want information from
> * @param bypassExtAttrProvider if true, bypass external attr provider
> * when it's in use.
> * @return a FileStatus object
> * @throws FileNotFoundException when the path does not exist
> * @throws IOException see specific implementation
> */
> public FileStatus getFileStatus(Path f,
> final boolean bypassExtAttrProvider) throws IOException;
> /**
> * List the statuses of the files/directories in the given path if the path is
> * a directory.
> * <p>
> * Does not guarantee to return the List of files/directories status in a
> * sorted order.
> * <p>
> * Will not return null. Expect IOException upon access error.
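> * @param f given path
> * @param bypassExtAttrProvider if true, bypass external attr provider
> * when it's in use.
> * @return the statuses of the files/directories in the given path
> * @throws FileNotFoundException when the path does not exist
> * @throws IOException see specific implementation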
> */
> public FileStatus[] listStatus(Path f,
> final boolean bypassExtAttrProvider) throws FileNotFoundException,
> IOException;
> {code}
> So when bypassExtAttrProvider is true, the external attribute provider will be
> bypassed.
> Thanks.
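
To make the intended behavior concrete, here is a minimal sketch of how the bypass flag might behave. It is a toy model, not Hadoop code: BypassSketch, Status, AttrProvider and ToyFileSystem are hypothetical stand-ins for FileStatus, INodeAttributeProvider and FileSystem, and only the parameter name bypassExtAttrProvider comes from the proposal above. The real change would presumably have to carry the flag from the client down to the NameNode, which this sketch does not attempt to show.

```java
import java.io.FileNotFoundException;
import java.util.HashMap;
import java.util.Map;

/** Toy model only: every type here is a hypothetical stand-in, not a Hadoop class. */
final class BypassSketch {

    /** Stand-in for the owner/group part of FileStatus. */
    static final class Status {
        final String path;
        final String owner;
        final String group;

        Status(String path, String owner, String group) {
            this.path = path;
            this.owner = owner;
            this.group = group;
        }
    }

    /** Stand-in for an external attribute provider that can replace stored attributes. */
    interface AttrProvider {
        Status override(Status stored);
    }

    /** Stand-in for a file system that keeps its own attributes per path. */
    static final class ToyFileSystem {
        private final Map<String, Status> stored = new HashMap<>();
        private final AttrProvider provider; // null means no provider is enabled

        ToyFileSystem(AttrProvider provider) {
            this.provider = provider;
        }

        void put(String path, String owner, String group) {
            stored.put(path, new Status(path, owner, group));
        }

        /** Existing behavior: the provider is consulted whenever it is enabled. */
        Status getFileStatus(String path) throws FileNotFoundException {
            return getFileStatus(path, false);
        }

        /** Proposed behavior: skip the provider when bypassExtAttrProvider is true. */
        Status getFileStatus(String path, boolean bypassExtAttrProvider)
                throws FileNotFoundException {
            Status s = stored.get(path);
            if (s == null) {
                throw new FileNotFoundException(path);
            }
            if (provider == null || bypassExtAttrProvider) {
                return s; // attributes as stored in the file system itself
            }
            return provider.override(s); // attributes as seen through the provider
        }
    }

    public static void main(String[] args) throws FileNotFoundException {
        // Hypothetical provider that replaces the group of every file.
        ToyFileSystem fs = new ToyFileSystem(
                s -> new Status(s.path, s.owner, "policy-group"));
        fs.put("/data/a", "alice", "hdfs-group");

        System.out.println(fs.getFileStatus("/data/a").group);       // prints "policy-group"
        System.out.println(fs.getFileStatus("/data/a", true).group); // prints "hdfs-group"
    }
}
```

In this model, a copy tool in the role of distcp would call the two-argument overload with true on the source side, so that only the attributes stored in the file system itself are carried over to the target.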