[
https://issues.apache.org/jira/browse/HDFS-13025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16334224#comment-16334224
]
Rakesh R commented on HDFS-13025:
---------------------------------
Thanks [~umamaheswararao] for the quick turnaround.
{{TestExternalStoragePolicySatisfier}} unit testing idea is awesome. Added few
comments, please take care.
# Please add few details about the reason to ignore like, {{@Ignore("reason why
this test is skipping")}}
# Add javadoc for {{boolean scanCompleted}} BlockStorageMovementNeeded#addAll()
and BlockStorageMovementNeeded#add() methods. Aslo, please remove unwanted
{{@param startId}} from BlockStorageMovementNeeded#add()
# Can we avoid duplicate codes. How about create a method and reuse like below,
{code}
// add javadoc
public synchronized void addAll(long startId,
List<ItemInfo> itemInfoList, boolean scanCompleted) {
storageMovementNeeded.addAll(itemInfoList);
add(startId, scanCompleted);
}
// add javadoc
public synchronized void add(ItemInfo itemInfo, boolean scanCompleted) {
storageMovementNeeded.add(itemInfo);
add(itemInfo.getStartId(), scanCompleted);
}
private void add(long startId, boolean scanCompleted) {
DirPendingWorkInfo pendingWork =
pendingWorkForDirectory.get(startId);
if (pendingWork == null) {
pendingWork = new DirPendingWorkInfo();
pendingWorkForDirectory.put(startId, pendingWork);
}
pendingWork.addPendingWorkCount(1);
if (scanCompleted) {
pendingWork.markScanCompleted();
}
}
{code}
# Below {{service.addAllFileIdsToProcess}} call is going to
{{BlockStorageMovementNeeded#addAll}} then add an entry to
{{pendingWorkForDirectory.put(startId, pendingWork);}}. If the InodeId is a
file, should we need to add into the {{pendingWorkForDirectory}} and mark as
scanned?
{code}
IntraSPSNameNodeFileIdCollector#scanAndCollectFileIds()
...// other logic
...// other logic
service.addAllFileIdsToProcess(startInode.getId(), currentBatch, true);
{code}
# Please add annotation {{@InterfaceAudience.Private}} to
IntraSPSNameNodeFileIdCollector, ExternalSPSFileIDCollector & other missing
classes, if any.
# If {{dfs}} is not created then skip #processPath and #scanAndCollectFileIds
to avoid NPE. I remember we discussed about retries, probably we could add null
check {{dfs != null}} before using dfs. // TODO about retry mechanism now as a
reminder and later we could implement this post branch merge.
{code}
ExternalSPSFileIDCollector.java
LOG.error("Unable to get the filesystem. Make sure Namenode running and "
+ "configured namenode address is correct.", e);
{code}
# Please add debug log message here saying, no children for this directory.
{code}
ExternalSPSFileIDCollector .java
85 if (children == null) {
86 return;
87 }
{code}
> [SPS]: Implement a mechanism to scan the files for external SPS
> ---------------------------------------------------------------
>
> Key: HDFS-13025
> URL: https://issues.apache.org/jira/browse/HDFS-13025
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Reporter: Uma Maheswara Rao G
> Assignee: Uma Maheswara Rao G
> Priority: Major
> Attachments: HDFS-13025-HDFS-10285-0.patch,
> HDFS-13025-HDFS-10285-1.patch
>
>
> HDFS-12911 modularization is introducing FileIDCollector interface for
> scanning the files. That will help us to plugin different ways of scanning
> mechanisms if needed.
> For Internal SPS, we have INode based scanning. For external SPS, we should
> go via client API scanning as we can not access NN internal structures.
> This is the task to implement the scanning plugin for external SPS.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]