[ 
https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dhruba borthakur updated HDFS-729:
----------------------------------

    Attachment: corruptFiles.txt

This introduces a new method to ClientProtocol to retrieve the list of corrupted 
files from the namenode. The server returns at most 100 files per invocation of 
this call, so an application that wants to retrieve all corrupted files has to 
invoke this method iteratively. Only a superuser can invoke this call. Here is 
the javadoc:

{code}

  /**
   * Returns a list of files that are corrupted.
   * <p>
   * Returns a list of files that have at least one block with no valid
   * replicas. The returned list has at most numExpectedFiles files in it.
   * If the number of files returned is zero, then no more corrupted files
   * are available in the system. startingNumber is the index of the first
   * corrupted file to return. An application will typically invoke this
   * method as:
   *   int startingNumber = 0;
   *   LocatedBlocks[] l = getCorruptFiles(100, startingNumber);
   *   while (l.length > 0) {
   *     for (LocatedBlocks onefile : l) {
   *       processOneCorruptedFile(onefile);
   *     }
   *     startingNumber += l.length;
   *     l = getCorruptFiles(100, startingNumber);
   *   }
   *
   * @param numExpectedFiles the maximum number of files to be returned
   * @param startingNumber list files starting from the startingNumber-th to
   *                       the (startingNumber + numExpectedFiles)-th in the
   *                       list of corrupted files
   * @throws AccessControlException if the superuser privilege is violated
   * @throws IOException if unable to retrieve information of a corrupt file
   */
{code}
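To illustrate the pagination contract above, here is a minimal, self-contained sketch. `CorruptFileLister` and `drain` are hypothetical stand-ins (they are not part of the patch); the lister mimics the namenode's 100-file-per-call cap, and `drain` is the client-side loop from the javadoc, using plain `String` entries in place of `LocatedBlocks`:

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the namenode side of getCorruptFiles(). */
class CorruptFileLister {
    /** Server-side cap described in the comment above. */
    static final int MAX_PER_CALL = 100;

    private final List<String> corruptFiles;

    CorruptFileLister(List<String> corruptFiles) {
        this.corruptFiles = corruptFiles;
    }

    /** Returns at most min(numExpectedFiles, MAX_PER_CALL) entries,
     *  starting at index startingNumber. */
    List<String> getCorruptFiles(int numExpectedFiles, int startingNumber) {
        int batch = Math.min(numExpectedFiles, MAX_PER_CALL);
        int from = Math.min(startingNumber, corruptFiles.size());
        int to = Math.min(from + batch, corruptFiles.size());
        return new ArrayList<>(corruptFiles.subList(from, to));
    }

    /** Client-side pagination loop from the javadoc; returns the total
     *  number of corrupted files processed. */
    static int drain(CorruptFileLister nn) {
        int startingNumber = 0;
        int total = 0;
        List<String> l = nn.getCorruptFiles(100, startingNumber);
        while (!l.isEmpty()) {
            for (String onefile : l) {
                total++; // processOneCorruptedFile(onefile) would go here
            }
            startingNumber += l.size();
            l = nn.getCorruptFiles(100, startingNumber);
        }
        return total;
    }
}
```

The loop terminates because an empty batch signals that no more corrupted files remain, matching the "number of files returned is zero" condition in the javadoc.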

I am in the process of writing a unit test for this one.



> fsck option to list only corrupted files
> ----------------------------------------
>
>                 Key: HDFS-729
>                 URL: https://issues.apache.org/jira/browse/HDFS-729
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: corruptFiles.txt
>
>
> An option to fsck to list only corrupted files will be very helpful for 
> frequent monitoring.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.