[ https://issues.apache.org/jira/browse/HDFS-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Manoj Govindassamy updated HDFS-10480:
--------------------------------------
Attachment: HDFS-10480.02.patch
Thanks for the review, [~andrew.wang] and [~kihwal]. I have attached the v02
patch, which addresses the following comments. Could you please take a look?
bq. Is there a reason for dumping the info to a file on the NN? This makes it
more difficult for admins to get the information, and is more complicated than
just printing it out on the command line. Allowing a user-specified name that
isn't validated is also a possible security issue. This also means normal users
can't use this, since they won't have access to the NN's log directory.
The design has changed. The client now gets a RemoteIterator over the open
files, and the list is retrieved from the NameNode in batches. The batch size
is configurable. This lightweight model lets the NameNode serve even a very
large list of open files.
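
For illustration only, here is a minimal plain-Java sketch of the batched
iteration idea described above. It is not the code in the patch: the class
names (OpenFileStore, BatchedOpenFileIterator, OpenFilesDemo), the listBatch
method, and the use of an inode id as the paging cursor are all assumptions
made for this example, and it uses java.util.Iterator rather than Hadoop's
RemoteIterator so that it compiles without Hadoop on the classpath.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.TreeMap;

/** Hypothetical stand-in for the NameNode side: open files keyed by inode id. */
class OpenFileStore {
    private final TreeMap<Long, String> openFiles = new TreeMap<>();
    private int batchCalls = 0;

    void add(long inodeId, String path) {
        openFiles.put(inodeId, path);
    }

    /** One simulated RPC: at most batchSize entries with id > prevId, in id order. */
    List<Map.Entry<Long, String>> listBatch(long prevId, int batchSize) {
        batchCalls++;
        List<Map.Entry<Long, String>> batch = new ArrayList<>();
        for (Map.Entry<Long, String> e : openFiles.tailMap(prevId, false).entrySet()) {
            if (batch.size() >= batchSize) {
                break;
            }
            batch.add(new AbstractMap.SimpleImmutableEntry<>(e.getKey(), e.getValue()));
        }
        return batch;
    }

    int getBatchCalls() {
        return batchCalls;
    }
}

/** Hypothetical client-side iterator that fetches the list one batch at a time. */
class BatchedOpenFileIterator implements Iterator<String> {
    private final OpenFileStore store;
    private final int batchSize;
    private long prevId = Long.MIN_VALUE;   // cursor: last inode id handed out
    private Iterator<Map.Entry<Long, String>> current = Collections.emptyIterator();
    private boolean exhausted = false;

    BatchedOpenFileIterator(OpenFileStore store, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.store = store;
        this.batchSize = batchSize;
    }

    @Override
    public boolean hasNext() {
        if (current.hasNext()) {
            return true;
        }
        if (exhausted) {
            return false;
        }
        // The local batch is drained, so ask the store for the next one.
        List<Map.Entry<Long, String>> batch = store.listBatch(prevId, batchSize);
        if (batch.size() < batchSize) {
            exhausted = true;   // a short batch means nothing is left after it
        }
        current = batch.iterator();
        return current.hasNext();
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        Map.Entry<Long, String> e = current.next();
        prevId = e.getKey();
        return e.getValue();
    }
}

/** Small helper that drains the iterator into a list. */
class OpenFilesDemo {
    static List<String> collect(OpenFileStore store, int batchSize) {
        List<String> paths = new ArrayList<>();
        Iterator<String> it = new BatchedOpenFileIterator(store, batchSize);
        while (it.hasNext()) {
            paths.add(it.next());
        }
        return paths;
    }
}
```

In this sketch the server side never materializes more than one batch per
call, and the client holds only the current batch plus a cursor. A smaller
batch size means more round trips but less memory per call, which is the
trade-off a configurable batch size is meant to expose.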
bq. Let's not change the import to a wildcard, it makes backports harder.
Done.
bq. Shouldn't this only go to the active NN, since it has up-to-date info about
writers? This is in reference to the Operation.UNCHECKED and the HA logic in
DFSAdmin.
Done.
bq. Nit: "getUnderconstructionFiles" -> "getUnderConstructionFiles"
Done.
bq. Could you also add a Java API to HdfsAdmin?
Done.
bq. One more thing that would be nice here is to filter the output on a passed
path or DN. Use cases: An admin might already know a stale file by path (perhaps
from fsck's -openforwrite), and wants to figure out who the lease holder is. A
DN might be blocked from decommissioning by an open-for-write file, and the
admin wants to figure out what files those are.
bq. With thousand+ node clusters, where you might be adding and removing
machines regularly for maintenance, a huge use case on top of the directory
filter would be a "which open files are blocking server decommissioning" filter
(identify files with blocks on hosts that are currently in decommissioning
state).
With the attached patch, the infrastructure needed for the above enhancements
is now in place. In the interest of patch size and easy backports, I can take
up these enhancements in a new JIRA, if that is OK with you.
> Add an admin command to list currently open files
> -------------------------------------------------
>
> Key: HDFS-10480
> URL: https://issues.apache.org/jira/browse/HDFS-10480
> Project: Hadoop HDFS
> Issue Type: Improvement
> Reporter: Kihwal Lee
> Assignee: Manoj Govindassamy
> Attachments: HDFS-10480.02.patch, HDFS-10480-trunk-1.patch,
> HDFS-10480-trunk.patch
>
>
> Currently there is no easy way to obtain the list of active leases or files
> being written. It would be nice to have an admin command to list open files
> and their lease holders.