[ 
https://issues.apache.org/jira/browse/HDFS-10480?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-10480:
--------------------------------------
    Attachment: HDFS-10480.02.patch

Thanks for the review, [~andrew.wang] and [~kihwal]. I have attached the v02 
patch, which addresses the following comments. Could you please take a look?

bq. Is there a reason for dumping the info to a file on the NN? This makes it 
more difficult for admins to get the information, and is more complicated than 
just printing it out on the command line. Allowing a user-specified name that 
isn't validated is also a possible security issue. This also means normal users 
can't use this, since they won't have access to the NN's log directory.
The design has changed. The client now gets a RemoteIterator over the open 
files, and the list is retrieved from the NameNode in batches. The batch size 
is configurable. This lightweight model lets the NameNode serve even a very 
large list without returning it all in a single response.
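
To illustrate the batching model described above, here is a minimal, 
self-contained sketch. It is not the code in the patch: FakeNameNode, OpenFile, 
OpenFileIterator and the inode-id cursor are all hypothetical names and 
assumptions made for this example, and java.util.Iterator stands in for 
Hadoop's RemoteIterator (whose hasNext()/next() can also throw IOException).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.TreeMap;

// Hypothetical sketch of batched listing; none of these names come from the patch.
public class BatchedOpenFilesSketch {

    /** One open file, identified by an id that also serves as the paging cursor. */
    static final class OpenFile {
        final long id;
        final String path;
        OpenFile(long id, String path) { this.id = id; this.path = path; }
    }

    /** Stand-in for the server side: each call returns at most batchSize entries. */
    static final class FakeNameNode {
        private final TreeMap<Long, OpenFile> openFiles = new TreeMap<>();
        private final int batchSize;   // assumed to be the configurable batch size
        int rpcCount = 0;              // counts simulated round trips

        FakeNameNode(int batchSize) { this.batchSize = batchSize; }

        void addOpenFile(long id, String path) {
            openFiles.put(id, new OpenFile(id, path));
        }

        /** Returns the next batch of entries whose id is strictly greater than prevId. */
        List<OpenFile> listOpenFiles(long prevId) {
            rpcCount++;
            List<OpenFile> batch = new ArrayList<>();
            for (OpenFile f : openFiles.tailMap(prevId, false).values()) {
                if (batch.size() == batchSize) {
                    break;
                }
                batch.add(f);
            }
            return batch;
        }
    }

    /** Client-side iterator that fetches a new batch only when the current one is used up. */
    static final class OpenFileIterator implements Iterator<OpenFile> {
        private final FakeNameNode nn;
        private List<OpenFile> batch = new ArrayList<>();
        private int pos = 0;
        private long lastId = 0;       // cursor: id of the last entry handed out
        private boolean exhausted = false;

        OpenFileIterator(FakeNameNode nn) { this.nn = nn; }

        @Override
        public boolean hasNext() {
            if (pos < batch.size()) {
                return true;
            }
            if (exhausted) {
                return false;
            }
            batch = nn.listOpenFiles(lastId);
            pos = 0;
            if (batch.isEmpty()) {
                exhausted = true;      // an empty batch marks the end of the listing
                return false;
            }
            return true;
        }

        @Override
        public OpenFile next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            OpenFile f = batch.get(pos++);
            lastId = f.id;
            return f;
        }
    }

    public static void main(String[] args) {
        FakeNameNode nn = new FakeNameNode(2);
        for (long i = 1; i <= 5; i++) {
            nn.addOpenFile(i, "/user/test/file" + i);
        }
        Iterator<OpenFile> it = new OpenFileIterator(nn);
        while (it.hasNext()) {
            OpenFile f = it.next();
            System.out.println(f.id + "\t" + f.path);
        }
        System.out.println("round trips: " + nn.rpcCount);
    }
}
```

In this sketch the server never holds more than one batch per request, and the 
client only needs to remember the last id it has seen, so the cost of a request 
is bounded by the batch size rather than by the total number of open files.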

bq. Let's not change the import to a wildcard, it makes backports harder.
Done.

bq. Shouldn't this only go to the active NN, since it has up-to-date info about 
writers? This is in reference to the Operation.UNCHECKED and the HA logic in 
DFSAdmin.
Done.

bq. Nit: "getUnderconstructionFiles" -> "getUnderConstructionFiles"
Done.

bq. Could you also add a Java API to HdfsAdmin?
Done.

bq. One more thing that would be nice here is to filter the output on a passed 
path or DN. Use cases: An admin might already know a stale file by path (perhaps 
from fsck's -openforwrite), and wants to figure out who the lease holder is. A 
DN might be blocked from decommissioning by an open-for-write file, and the 
admin wants to figure out what files those are.
bq. With thousand+ node clusters, where you might be adding and removing 
machines regularly for maintenance, a huge use case on top of the directory 
filter would be a "which open files are blocking server decommissioning" filter 
(identify files with blocks on hosts that are currently in decommissioning 
state).
The attached patch puts in place the infrastructure needed for the above 
enhancements. In the interest of patch size and easy backports, I can take 
them up in a new JIRA, if that is OK with you.


> Add an admin command to list currently open files
> -------------------------------------------------
>
>                 Key: HDFS-10480
>                 URL: https://issues.apache.org/jira/browse/HDFS-10480
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Kihwal Lee
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-10480.02.patch, HDFS-10480-trunk-1.patch, 
> HDFS-10480-trunk.patch
>
>
> Currently there is no easy way to obtain the list of active leases or files 
> being written. It would be nice to have an admin command to list open files 
> and their lease holders.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
