[
https://issues.apache.org/jira/browse/HDFS-11400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15861287#comment-15861287
]
Hari Sekhon commented on HDFS-11400:
------------------------------------
[~aw] Good question. Where are such fake users coming from? Given that the NN
resolves users from the OS / Kerberos, wouldn't this mean the OS / Kerberos
systems had already been compromised for fake users to have been added?
A configurable user/group filter that only auto-creates home directories for
users/groups matching a whitelist regex could form a layer of protection. For
example, in a cluster integrated with an Active Directory of 20,000 users, you
may only want 100 of those users actually using the Hadoop cluster, although
in practice this filtering is usually already done at the OS level via SSSD
etc.
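
As a rough sketch of the whitelist idea (Python for brevity, not NameNode
code; the patterns and the function name are hypothetical, and no such setting
exists in HDFS today):

```python
import re

# Hypothetical whitelist patterns, purely illustrative; no such HDFS
# configuration property exists.
USER_WHITELIST = re.compile(r"(hadoop_.*|etl\d+)")
GROUP_WHITELIST = re.compile(r"hadoop-users")


def should_auto_create_home(user, groups):
    """Return True if the user, or any of the user's groups, matches a whitelist."""
    # fullmatch requires the whole name to match, so "xetl1" does not pass
    # just because it contains "etl1".
    if USER_WHITELIST.fullmatch(user):
        return True
    return any(GROUP_WHITELIST.fullmatch(group) for group in groups)
```

Only users for whom this check passes would get a home directory created
automatically; everyone else would be handled as they are today.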
Another layer of protection could be a limit on either the number of
enumerated users for which home directories would be automatically created, or
the number of home directories already in existence: if either count is too
high, e.g. 1000, then log it and disable auto-creation until resolved, to
prevent said memory explosion. The second idea, capping the number of existing
home directories, would really be better, since the NN shouldn't be
enumerating users at all but rather creating the home directory on the fly the
first time a new user is used on the cluster and no home directory exists for
that user.
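
To illustrate the second idea, here is a minimal sketch (again Python rather
than NameNode code; the class, its method names and the in-memory set standing
in for the children of /user are all assumptions made for illustration):

```python
import logging

log = logging.getLogger("home_dir_autocreate")


class HomeDirAutoCreator:
    """Creates a home directory on a user's first use, up to a fixed cap."""

    def __init__(self, max_home_dirs=1000):
        self.max_home_dirs = max_home_dirs  # hypothetical configurable limit
        self.home_dirs = set()              # stands in for the children of /user
        self.enabled = True

    def on_first_use(self, user):
        """Return True if a home directory was created for this user."""
        path = "/user/" + user
        if path in self.home_dirs or not self.enabled:
            return False
        if len(self.home_dirs) >= self.max_home_dirs:
            # Too many home directories already exist: log it and stop
            # auto-creating until an admin has looked into it.
            log.warning("%d home directories exist (limit %d); "
                        "disabling auto-creation",
                        len(self.home_dirs), self.max_home_dirs)
            self.enabled = False
            return False
        self.home_dirs.add(path)
        return True
```

Nothing here enumerates users: the check runs once per user, at the point
where that user first shows up without a home directory.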
How about these ideas?
Automatic creation would stop various jobs from breaking when they try to put
staging files etc. in home directories that don't exist yet because nobody has
created them manually or by script (in retrospect it seems silly for admins to
keep writing scripts to do this for every client when it could be solved once
and for all in NN logic).
> Automatic HDFS Home Directory Creation
> --------------------------------------
>
> Key: HDFS-11400
> URL: https://issues.apache.org/jira/browse/HDFS-11400
> Project: Hadoop HDFS
> Issue Type: New Feature
> Components: hdfs, namenode
> Affects Versions: 2.7.1
> Environment: HDP 2.4.2
> Reporter: Hari Sekhon
>
> Feature Request to add automatic home directory creation for HDFS users when
> they are first resolved by the NameNode if their home directory does not
> already exist, using configurable umask defaulting to 027.