[ 
https://issues.apache.org/jira/browse/HDFS-11847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Manoj Govindassamy updated HDFS-11847:
--------------------------------------
    Attachment: HDFS-11847.03.patch

Thanks for the detailed review, [~xiaochen]. Attached the v03 patch to address 
the following; please take a look.
1. Deprecated the old API and added a new one that accepts an additional type 
argument to filter the results by (see the first sketch after this list).
2. Updated {{FSN#listOpenFiles}} to check for the {{ALL_OPEN_FILES}} type first 
and only then apply the narrower filtering options (see the second sketch 
below). But the result set and the reporting don't differentiate the entries 
by type; for that, we need to add the type to {{OpenFileEntry}}. Will do this.
3. About printing DataNode details in the results: planning to take this 
enhancement, along with the pending item in (2), in a separate jira if you are 
ok, since I need to change the proto and the corresponding result handling.
4. Yes, better to return as many results as possible. Made 
{{DatanodeAdminManager#processBlocksInternal}} log a warning message on 
unexpected open files and continue to the next one.
5. In {{DatanodeAdminManager#processBlocksInternal}}, the computation is at the 
DataNode level. There can be multiple blocks across DNs for the same file, and 
the full block count needs to be tracked for JMX reporting purposes. So, 
retaining the existing {{lowRedundancyBlocksInOpenFiles}} field. When I removed 
this field and piggybacked on {{lowRedundancyOpenFiles.size()}}, the actual 
count was less than expected for a few tests (see the third sketch below).
6. In {{LeavingServiceStatus}}, both members are needed due to (5).

7. Updated the class comment for {{LeavingServiceStatus}}.
8. Added a {{hasReadLock()}} check to {{FSN#getFilesBlockingDecom}}.
9. In {{TestDecommission#verifyOpenFilesBlockingDecommission}}, the 
{{PrintStream}} is now copied before the exchange and restored to the old one 
afterwards (see the last sketch below). Good find.
10. The {{DFS_NAMENODE_REDUNDANCY_INTERVAL_SECONDS_KEY}} value is actually in 
seconds, so it is 1000 seconds and not 1 second. Anyway, updated this to the 
max value.
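
For reference, here's roughly what (1) looks like; the class and enum shapes 
below are placeholders sketched from this discussion, not necessarily the v03 
code:
{code:java}
// A minimal sketch of the API change in (1). Class/enum names here are
// illustrative placeholders based on this discussion, not the final patch.
import java.io.IOException;
import java.util.EnumSet;

import org.apache.hadoop.fs.RemoteIterator;
import org.apache.hadoop.hdfs.protocol.OpenFileEntry;

public abstract class ListOpenFilesApiSketch {
  public enum OpenFilesType { ALL_OPEN_FILES, BLOCKING_DECOMMISSION }

  /** @deprecated use {@link #listOpenFiles(EnumSet)} instead. */
  @Deprecated
  public RemoteIterator<OpenFileEntry> listOpenFiles() throws IOException {
    // The old no-arg API delegates to the new typed variant.
    return listOpenFiles(EnumSet.of(OpenFilesType.ALL_OPEN_FILES));
  }

  /** New API: filters the listing by the requested open-file types. */
  public abstract RemoteIterator<OpenFileEntry> listOpenFiles(
      EnumSet<OpenFilesType> openFilesTypes) throws IOException;
}
{code}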
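
And the ordering of the type check in (2), again as a sketch with hypothetical 
helper names standing in for the real lease/decommission lookups:
{code:java}
// Rough shape of the type check in FSN#listOpenFiles per (2). The helper
// methods here are hypothetical stand-ins, not the actual FSN internals.
import java.io.IOException;
import java.util.EnumSet;
import java.util.List;

import org.apache.hadoop.hdfs.protocol.OpenFileEntry;

public abstract class OpenFilesFilteringSketch {
  public enum OpenFilesType { ALL_OPEN_FILES, BLOCKING_DECOMMISSION }

  public List<OpenFileEntry> listOpenFiles(EnumSet<OpenFilesType> types)
      throws IOException {
    if (types.contains(OpenFilesType.ALL_OPEN_FILES)) {
      // ALL_OPEN_FILES subsumes any narrower filter, so check it first.
      return getAllUnderConstructionFiles();
    }
    if (types.contains(OpenFilesType.BLOCKING_DECOMMISSION)) {
      // Only files holding blocks on decommissioning DataNodes.
      return getFilesBlockingDecom();
    }
    throw new IllegalArgumentException("Unsupported open files types: " + types);
  }

  protected abstract List<OpenFileEntry> getAllUnderConstructionFiles()
      throws IOException;

  protected abstract List<OpenFileEntry> getFilesBlockingDecom()
      throws IOException;
}
{code}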
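
A small sketch of the point in (5) and (6), with assumed field types:
{code:java}
// Why LeavingServiceStatus keeps both members, per (5) and (6): one open
// file can contribute several low-redundancy blocks across DataNodes, so
// a per-file set alone under-counts blocks. Types here are assumptions.
import java.util.HashSet;
import java.util.Set;

public class LeavingServiceStatusSketch {
  private int lowRedundancyBlocksInOpenFiles;                  // every block, for JMX
  private final Set<Long> lowRedundancyOpenFiles = new HashSet<>(); // unique files

  /** Called once per low-redundancy block found under an open file. */
  public void recordLowRedundancyBlock(long inodeId) {
    lowRedundancyBlocksInOpenFiles++;     // counts each block on each DN
    lowRedundancyOpenFiles.add(inodeId);  // de-duplicated by inode id
  }

  public int getLowRedundancyBlocksInOpenFiles() {
    // Returning lowRedundancyOpenFiles.size() here would come out less than
    // the expected block count whenever a file has more than one such block.
    return lowRedundancyBlocksInOpenFiles;
  }
}
{code}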
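
Finally, the stream save/restore pattern from (9), with a hypothetical 
stand-in for the actual command the test runs:
{code:java}
// The stream handling pattern from (9): copy System.out before the
// exchange and restore it in a finally block. The Runnable is a
// hypothetical stand-in for the dfsadmin invocation in the test.
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

public class CaptureStdoutSketch {
  static String captureStdout(Runnable command) {
    PrintStream oldOut = System.out;            // copied before the exchange
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    try {
      System.setOut(new PrintStream(buf, true));
      command.run();
      return buf.toString();
    } finally {
      System.setOut(oldOut);                    // restored to the old one
    }
  }
}
{code}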


> Enhance dfsadmin listOpenFiles command to list files blocking datanode 
> decommissioning
> --------------------------------------------------------------------------------------
>
>                 Key: HDFS-11847
>                 URL: https://issues.apache.org/jira/browse/HDFS-11847
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs
>    Affects Versions: 3.0.0-alpha1
>            Reporter: Manoj Govindassamy
>            Assignee: Manoj Govindassamy
>         Attachments: HDFS-11847.01.patch, HDFS-11847.02.patch, 
> HDFS-11847.03.patch
>
>
> HDFS-10480 adds a {{listOpenFiles}} option to the {{dfsadmin}} command to 
> list all the open files in the system.
> Additionally, it would be very useful to list only the open files that are 
> blocking DataNode decommissioning. On 1000+ node clusters, where machines 
> may be added and removed regularly for maintenance, any option to monitor 
> and debug decommissioning status is very helpful. The proposal here is to 
> add suboptions to {{listOpenFiles}} for the above case.


