[
https://issues.apache.org/jira/browse/HDFS-11221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15811552#comment-15811552
]
Ewan Higgs commented on HDFS-11221:
-----------------------------------
Hi [~jiajia],
This looks like a good first cut at updating all the places where the interfaces
are used, and the changes to the tests look good as well. I suspect
Jenkins won't pick up the patch unless the name is of the form
{{HDFS-11221-0001.patch}}.
One thing that concerns me is that we're just putting {{.get()}} everywhere.
That already helps: the exception is thrown as soon as we know we have an empty
value, rather than only when the value is used, so we see the error where it
happens and not some time later as with null. But fundamentally we're still
just changing the code from raising {{NullPointerException}} to raising
{{NoSuchElementException}}. An incremental improvement would be to identify the
places where we can provide alternate behaviour with {{orElse}}, or use
{{orElseThrow}} to throw an {{IOException}} or {{RuntimeException}} with a
useful message. Personally, I would accept this patch and then update the call
sites on a case-by-case basis.
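For example, something like this (the call site and message are only
illustrative, not code from the patch):
{code}
// Illustrative only: fail fast with a descriptive message rather than the
// bare NoSuchElementException we would get from Optional.get().
// (The enclosing method needs to declare "throws IOException".)
File versionFile = sd.getVersionFile().orElseThrow(() ->
    new IOException("Storage directory " + sd + " has no local root; "
        + "cannot locate the VERSION file"));

// ...or fall back to a sensible default where one exists:
long lastModified = sd.getVersionFile().map(File::lastModified).orElse(0L);
{code}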
The thing that should be decided on now, however, is the signature of the
{{get...}} functions:
{code}
@@ -635,15 +672,15 @@ public StorageState analyzeStorage(StartupOption startOpt, Storage storage,
}
// check whether current directory is valid
- File versionFile = getVersionFile();
+ File versionFile = getVersionFile().get();
boolean hasCurrent = versionFile.exists();
// check which directories exist
- boolean hasPrevious = getPreviousDir().exists();
- boolean hasPreviousTmp = getPreviousTmp().exists();
- boolean hasRemovedTmp = getRemovedTmp().exists();
- boolean hasFinalizedTmp = getFinalizedTmp().exists();
- boolean hasCheckpointTmp = getLastCheckpointTmp().exists();
+ boolean hasPrevious = getPreviousDir().get().exists();
+ boolean hasPreviousTmp = getPreviousTmp().get().exists();
+ boolean hasRemovedTmp = getRemovedTmp().get().exists();
+ boolean hasFinalizedTmp = getFinalizedTmp().get().exists();
+ boolean hasCheckpointTmp = getLastCheckpointTmp().get().exists();
{code}
As a rule of thumb, we should limit the surface area of Optional values so that
we can get back to our actual types and work with those as soon as possible.
All of these files are relative to {{root}}, so it makes sense to change these
functions, which aren't used outside HDFS, to accept {{File root}} as an
argument and return {{File}} without the Optional wrapper.
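Roughly something like the following (only a sketch, reusing the existing
{{STORAGE_DIR_CURRENT}} and {{STORAGE_FILE_VERSION}} constants):
{code}
// Callers resolve the root once and decide there what an absent root means;
// the helpers then stay plain File-in, File-out.
public File getVersionFile(File root) {
  return new File(new File(root, STORAGE_DIR_CURRENT), STORAGE_FILE_VERSION);
}

// e.g. in analyzeStorage(), which already declares throws IOException:
File root = getRoot().orElseThrow(() ->
    new IOException("Storage directory has no local root"));
boolean hasCurrent = getVersionFile(root).exists();
boolean hasPrevious = getPreviousDir(root).exists();
{code}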
What do you think?
In StorageInfo.java:
{code}
@@ -187,7 +187,7 @@ protected void setClusterId(Properties props, int layoutVersion,
Feature.FEDERATION, layoutVersion)) {
String cid = getProperty(props, sd, "clusterID");
if (!(clusterID.equals("") || cid.equals("") || clusterID.equals(cid))) {
- throw new InconsistentFSStateException(sd.getRoot(),
+ throw new InconsistentFSStateException(sd.getRoot().get(),
"cluster Id is incompatible with others.");
}
{code}
There are a few other exceptions built in the same directory using just
{{sd.root}}. Could you make them all consistent by using {{sd.getRoot()}}? Then
add a constructor to {{InconsistentFSStateException}} which also accepts an
{{Optional<File>}} and, when the argument is empty, handles it by printing that
the property file is missing. This would look something like the following:
{code}
private static String msg(Optional<File> dir, String descr) {
  if (!dir.isPresent()) {
    return "Directory not found therefore it is in an inconsistent state: "
        + descr;
  } else {
    return InconsistentFSStateException.msg(dir.get(), descr);
  }
}

private static String msg(File dir, String descr) {
  return "Directory " + getFilePath(dir)
      + " is in an inconsistent state: " + descr;
}

public InconsistentFSStateException(Optional<File> dir, String descr) {
  super(InconsistentFSStateException.msg(dir, descr));
}

public InconsistentFSStateException(File dir, String descr) {
  super(InconsistentFSStateException.msg(dir, descr));
}
{code}
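With that overload in place, I think the call site in {{setClusterId}} above
can stay a one-liner (sketch):
{code}
// sd.getRoot() now returns Optional<File>, so the new overload is picked up
// directly; no .get() needed, and the empty case gets a sensible message.
throw new InconsistentFSStateException(sd.getRoot(),
    "cluster Id is incompatible with others.");
{code}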
Places like the following should probably use
{{sd.getRoot().map((File f) -> f.getCanonicalPath().toString()).orElse("<null path>")}}:
{code}
if (!blockpoolID.equals(nsInfo.getBlockPoolID())) {
throw new IOException("Incompatible blockpoolIDs in "
- + sd.getRoot().getCanonicalPath() + ": namenode blockpoolID = "
+ + sd.getRoot().get().getCanonicalPath() + ": namenode blockpoolID = "
+ nsInfo.getBlockPoolID() + "; datanode blockpoolID = "
+ blockpoolID);
}
{code}
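One caveat with that one-liner: {{getCanonicalPath()}} throws {{IOException}},
which a lambda passed to {{map}} can't propagate, so it may need to look more
like this (sketch only):
{code}
// getCanonicalPath() throws IOException, so resolve it outside the lambda...
String path = "<null path>";
if (sd.getRoot().isPresent()) {
  path = sd.getRoot().get().getCanonicalPath();
}
throw new IOException("Incompatible blockpoolIDs in " + path
    + ": namenode blockpoolID = " + nsInfo.getBlockPoolID()
    + "; datanode blockpoolID = " + blockpoolID);

// ...or use getAbsolutePath(), which doesn't throw, and keep the one-liner:
String where = sd.getRoot().map(File::getAbsolutePath).orElse("<null path>");
{code}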
> Have StorageDirectory return Optional<File> instead of File/null
> ----------------------------------------------------------------
>
> Key: HDFS-11221
> URL: https://issues.apache.org/jira/browse/HDFS-11221
> Project: Hadoop HDFS
> Issue Type: Task
> Reporter: Ewan Higgs
> Assignee: Jiajia Li
> Priority: Minor
> Attachments: HDFS-11221-v1.patch
>
>
> In HDFS-10675, {{StorageDirectory.root}} can be {{null}} because {{PROVIDED}}
> storage locations will not have any directories associated with them. Hence,
> we need to add checks to StorageDirectory to make sure we handle this. This
> would also lead to changes in code that call {{StorageDirectory.getRoot}},
> {{StorageDirectory.getCurrentDir}}, {{StorageDirectory.getVersionFile}} etc.
> as the return value can be {{null}} (if {{StorageDirectory.root}} is null).
> The proposal to handle this is to change the return type of the above
> functions to {{Optional<File>}}. According to my preliminary check, this will
> result in changes in ~70 places, which is why it's not appropriate to put it
> in the patch for HDFS-10675. But it is certainly a valuable fix.