[ https://issues.apache.org/jira/browse/HDFS-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334224#comment-16334224 ]

Rakesh R commented on HDFS-13025:
---------------------------------

Thanks [~umamaheswararao] for the quick turnaround. The 
{{TestExternalStoragePolicySatisfier}} unit testing idea is awesome. I've added 
a few comments, please take care of them.

# Please add a few details about the reason for ignoring, e.g. 
{{@Ignore("reason why this test is skipped")}}. See the sketch below.
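A minimal JUnit 4 sketch (the test name and reason text below are made up, just 
to show the shape):
{code}
import org.junit.Ignore;
import org.junit.Test;

public class TestExternalStoragePolicySatisfier {

  // Hypothetical example: say why the test is skipped and when to revisit.
  @Ignore("External SPS does not handle this scenario yet; revisit after the "
      + "branch merge")
  @Test
  public void testPendingScenario() throws Exception {
    // intentionally empty while the test is ignored
  }
}
{code}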
# Add javadoc for the {{boolean scanCompleted}} parameter in the 
BlockStorageMovementNeeded#addAll() and BlockStorageMovementNeeded#add() 
methods. Also, please remove the unwanted {{@param startId}} from 
BlockStorageMovementNeeded#add(). A possible wording is sketched below.
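Possible wording for the {{scanCompleted}} javadoc (just a suggestion, please 
adjust to the actual semantics):
{code}
  /**
   * @param scanCompleted
   *          true if the directory scan that produced this item/batch has
   *          finished and no more batches will follow for this startId;
   *          false if further batches are still expected
   */
{code}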
# Can we avoid duplicated code? How about extracting a common method and 
reusing it, like below:
{code}
  /**
   * Add the given list of items and update the directory's pending work
   * count by the size of the batch.
   */
  public synchronized void addAll(long startId,
      List<ItemInfo> itemInfoList, boolean scanCompleted) {
    storageMovementNeeded.addAll(itemInfoList);
    updatePendingDirScanStats(startId, itemInfoList.size(), scanCompleted);
  }

  /**
   * Add a single item and update the directory's pending work count by one.
   */
  public synchronized void add(ItemInfo itemInfo, boolean scanCompleted) {
    storageMovementNeeded.add(itemInfo);
    updatePendingDirScanStats(itemInfo.getStartId(), 1, scanCompleted);
  }

  private void updatePendingDirScanStats(long startId, int numScannedFiles,
      boolean scanCompleted) {
    DirPendingWorkInfo pendingWork =
        pendingWorkForDirectory.get(startId);
    if (pendingWork == null) {
      pendingWork = new DirPendingWorkInfo();
      pendingWorkForDirectory.put(startId, pendingWork);
    }
    pendingWork.addPendingWorkCount(numScannedFiles);
    if (scanCompleted) {
      pendingWork.markScanCompleted();
    }
  }
{code}
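Note that the private helper takes the number of scanned files, so 
{{addAll()}} bumps the pending work count by the whole batch in one call 
rather than by just one.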
# Below, the {{service.addAllFileIdsToProcess}} call goes to 
{{BlockStorageMovementNeeded#addAll}}, which then adds an entry via 
{{pendingWorkForDirectory.put(startId, pendingWork);}}. If the inode id 
belongs to a file rather than a directory, do we really need to add it to 
{{pendingWorkForDirectory}} and mark it as scanned? See the sketch after the 
snippet.
{code}
IntraSPSNameNodeFileIdCollector#scanAndCollectFileIds()
         ...// other logic
                 ...// other logic
         service.addAllFileIdsToProcess(startInode.getId(), currentBatch, true);
{code}
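A rough sketch of the file-vs-directory split ({{isFile()}} and 
{{addFileIdToProcess}} are assumptions here, not necessarily the patch's API):
{code}
if (startInode.isFile()) {
  // A single file needs no pendingWorkForDirectory entry or scan marking.
  service.addFileIdToProcess(
      new ItemInfo(startInode.getId(), startInode.getId()), false);
} else {
  // Directory: batch the children and mark the scan completed at the end.
  service.addAllFileIdsToProcess(startInode.getId(), currentBatch, true);
}
{code}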
# Please add the {{@InterfaceAudience.Private}} annotation to 
IntraSPSNameNodeFileIdCollector, ExternalSPSFileIDCollector and any other 
classes that are missing it. See the sketch below.
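For instance (assuming ExternalSPSFileIDCollector implements the 
FileIDCollector interface from HDFS-12911):
{code}
import org.apache.hadoop.classification.InterfaceAudience;

// Mark the class as private to Hadoop, not part of the public API.
@InterfaceAudience.Private
public class ExternalSPSFileIDCollector implements FileIDCollector {
  // ...
}
{code}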
# If {{dfs}} could not be created, skip #processPath and #scanAndCollectFileIds 
to avoid an NPE. I remember we discussed retries; for now we could add a 
{{dfs != null}} check before using dfs, leave a // TODO about the retry 
mechanism as a reminder, and implement it properly after the branch merge. See 
the sketch after the snippet below.
{code}
ExternalSPSFileIDCollector.java

LOG.error("Unable to get the filesystem. Make sure Namenode running and "
          + "configured namenode address is correct.", e);
{code}
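A minimal sketch of the guard (the {{dfs}}, {{conf}} and {{LOG}} fields and 
the {{getFS()}} helper are assumptions based on the discussion above):
{code}
  private void scanAndCollectFileIds(long startId) throws IOException {
    if (dfs == null) {
      // Retry creating the filesystem once before giving up on this call.
      dfs = getFS(conf);
    }
    if (dfs == null) {
      // TODO: add a proper retry mechanism post branch merge.
      LOG.warn("Filesystem is not yet available, skipping scan for {}",
          startId);
      return;
    }
    // ... existing scan logic that uses dfs ...
  }
{code}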
# Please add a debug log message here saying there are no children for this 
directory; see the sketch after the snippet.
{code}
ExternalSPSFileIDCollector.java

if (children == null) {
  return;
}
{code}
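Possible wording ({{dirPath}} is illustrative, use whatever identifies the 
directory being scanned):
{code}
if (children == null) {
  if (LOG.isDebugEnabled()) {
    LOG.debug("Directory {} has no children, nothing to scan.", dirPath);
  }
  return;
}
{code}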


> [SPS]: Implement a mechanism to scan the files for external SPS
> ---------------------------------------------------------------
>
>                 Key: HDFS-13025
>                 URL: https://issues.apache.org/jira/browse/HDFS-13025
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>            Reporter: Uma Maheswara Rao G
>            Assignee: Uma Maheswara Rao G
>            Priority: Major
>         Attachments: HDFS-13025-HDFS-10285-0.patch, 
> HDFS-13025-HDFS-10285-1.patch
>
>
> The HDFS-12911 modularization introduces the FileIDCollector interface for 
> scanning files, which will let us plug in different scanning mechanisms if 
> needed.
> For internal SPS we have INode-based scanning. For external SPS we should 
> scan via the client API, since we cannot access NN-internal structures. 
> This is the task to implement the scanning plugin for external SPS.



