[
https://issues.apache.org/jira/browse/ATLAS-2444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Madhan Neethiraj updated ATLAS-2444:
Description:
With Hadoop 3, HDFS supports multiple namespaces; each namespace is serviced by
a cluster of name-nodes identified by a nameServiceId. The HDFS path entity in
Atlas should be updated to capture the nameServiceId of the namenode
cluster for the path. Also, references to individual namenodes in the path
should be replaced with the corresponding nameServiceId.
Changes:
# HDFS model to include a new attribute called nameServiceId, default will be
empty
# hdfs path's qualified name will include the cluster info as well
# All Hooks dealing with HDFS path will resolve the host:port to the
respective nameServiceId
was:
With Hadoop 3, HDFS is introducing namespace/nameServiceId for data
segregation.
A cluster with a Hadoop 3 deployment can have multiple name nodes under a single
namespace and multiple namespaces/nameServiceIds as well.
This metadata can be captured for other interested applications/parties, e.g.
Ranger can use this to enforce policies per namespace.
Changes:
# HDFS model to include a new attribute called nameServiceId, default will be
empty
# hdfs path's qualified name will include the cluster info as well
# All Hooks dealing with HDFS path will resolve the host:port to the
respective nameServiceId
> HDFS NameNode Federation support
>
>
> Key: ATLAS-2444
> URL: https://issues.apache.org/jira/browse/ATLAS-2444
> Project: Atlas
> Issue Type: Bug
> Reporter: Apoorv Naik
> Assignee: Apoorv Naik
> Priority: Major
> Fix For: 1.0.0
>
>
> With Hadoop 3, HDFS supports multiple namespaces; each namespace is serviced
> by a cluster of name-nodes identified by a nameServiceId. The HDFS path entity in
> Atlas should be updated to capture the nameServiceId of the namenode
> cluster for the path. Also, references to individual namenodes in the path
> should be replaced with the corresponding nameServiceId.
> Changes:
> # HDFS model to include a new attribute called nameServiceId, default will
> be empty
> # hdfs path's qualified name will include the cluster info as well
> # All Hooks dealing with HDFS path will resolve the host:port to the
> respective nameServiceId
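
The third change above (resolving host:port to the respective nameServiceId) can be sketched roughly as follows. This is only an illustration and not the code attached to this issue: the class NameServiceResolver and its methods are hypothetical, and a plain Map stands in for the Hadoop configuration so the sketch stays self-contained. The property names (dfs.nameservices, dfs.ha.namenodes.<ns>, dfs.namenode.rpc-address.<ns>[.<nn>]) are the standard HDFS settings; a path that cannot be resolved yields an empty nameServiceId, matching the default described in the first change.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps namenode host:port addresses to their nameServiceId.
final class NameServiceResolver {
    private final Map<String, String> addressToNameService = new HashMap<>();

    // conf is assumed to hold hdfs-site.xml style key/value pairs.
    NameServiceResolver(Map<String, String> conf) {
        for (String rawNs : conf.getOrDefault("dfs.nameservices", "").split(",")) {
            String ns = rawNs.trim();
            if (ns.isEmpty()) continue;
            // Non-HA federation: a single RPC address per nameservice.
            register(conf.get("dfs.namenode.rpc-address." + ns), ns);
            // HA: one RPC address per namenode listed for the nameservice.
            for (String rawNn : conf.getOrDefault("dfs.ha.namenodes." + ns, "").split(",")) {
                String nn = rawNn.trim();
                if (!nn.isEmpty()) {
                    register(conf.get("dfs.namenode.rpc-address." + ns + "." + nn), ns);
                }
            }
        }
    }

    private void register(String address, String ns) {
        if (address != null && !address.trim().isEmpty()) {
            addressToNameService.put(address.trim().toLowerCase(Locale.ROOT), ns);
        }
    }

    // Returns the nameServiceId for a path, or "" when it cannot be resolved.
    String nameServiceIdFor(String path) {
        URI uri = URI.create(path);
        if (uri.getHost() == null || uri.getPort() == -1) return "";
        String key = (uri.getHost() + ":" + uri.getPort()).toLowerCase(Locale.ROOT);
        return addressToNameService.getOrDefault(key, "");
    }

    // Replaces the namenode host:port in the path with its nameServiceId;
    // paths that cannot be resolved are returned unchanged.
    String rewrite(String path) {
        String ns = nameServiceIdFor(path);
        if (ns.isEmpty()) return path;
        URI uri = URI.create(path);
        return uri.getScheme() + "://" + ns + uri.getRawPath();
    }
}
```

With dfs.nameservices=ns1 and a namenode of ns1 listening on nn1.example.com:8020 (made-up values), rewrite("hdfs://nn1.example.com:8020/data/sales") would give hdfs://ns1/data/sales, so every namenode of the same nameservice leads to the same path.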