[
https://issues.apache.org/jira/browse/HDFS-5431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841628#comment-13841628
]
Colin Patrick McCabe commented on HDFS-5431:
--------------------------------------------
{code}
if (in.readBoolean()) {
info.setOwnerName(Text.readString(in));
}
if (in.readBoolean()) {
info.setGroupName(Text.readString(in));
}
if (in.readBoolean()) {
info.setMode(FsPermission.read(in));
}
if (in.readBoolean()) {
info.setReservation(in.readLong());
}
if (in.readBoolean()) {
info.setQuota(in.readLong());
}
if (in.readBoolean()) {
info.setWeight(in.readInt());
}
{code}
I don't think the backwards-compatibility stuff here is really going to work.
The problem is, if we add more booleans (and the fields they guard), the old
code won't know they're there and won't consume them. It will then interpret
those bytes as whatever it expects to come next in the stream, which could
cause some really bad results.
I think the best way to do this is to start with a 32-bit word, which we can
treat as a bitfield. We can then load or not load field N according to whether
bit N is set. If there are bits set that we don't know how to interpret, we
can bail out with a nice error message rather than trying to load garbage
and possibly corrupting the fsimage. We probably should use this approach for
cache directives as well.
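To make the bitfield idea concrete, here is a rough sketch of what I have in mind. {{PoolInfoSketch}}, the flag names, and the field set are all made up for illustration, and it uses plain {{DataOutput.writeUTF}} / {{DataInput.readUTF}} instead of {{Text}} so that it stands alone. It is not a proposal for the actual on-disk layout.

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

// Hypothetical stand-in for the pool info class; not the real HDFS type.
final class PoolInfoSketch {
  // Bit N of the leading 32-bit word says whether field N follows.
  static final int HAS_OWNER = 1 << 0;
  static final int HAS_GROUP = 1 << 1;
  static final int HAS_MODE = 1 << 2;
  static final int HAS_MAXIMUM = 1 << 3;
  static final int KNOWN_FLAGS = HAS_OWNER | HAS_GROUP | HAS_MODE | HAS_MAXIMUM;

  String ownerName; // null means "not set"
  String groupName;
  Short mode;
  Long maximum;

  void write(DataOutput out) throws IOException {
    int flags = 0;
    if (ownerName != null) flags |= HAS_OWNER;
    if (groupName != null) flags |= HAS_GROUP;
    if (mode != null) flags |= HAS_MODE;
    if (maximum != null) flags |= HAS_MAXIMUM;
    // The flag word goes first, then only the fields whose bits are set.
    out.writeInt(flags);
    if (ownerName != null) out.writeUTF(ownerName);
    if (groupName != null) out.writeUTF(groupName);
    if (mode != null) out.writeShort(mode);
    if (maximum != null) out.writeLong(maximum);
  }

  static PoolInfoSketch read(DataInput in) throws IOException {
    int flags = in.readInt();
    int unknown = flags & ~KNOWN_FLAGS;
    if (unknown != 0) {
      // Fail loudly instead of misreading bytes written by newer code.
      throw new IOException("Unknown pool info flags 0x"
          + Integer.toHexString(unknown) + "; written by a newer version?");
    }
    PoolInfoSketch info = new PoolInfoSketch();
    if ((flags & HAS_OWNER) != 0) info.ownerName = in.readUTF();
    if ((flags & HAS_GROUP) != 0) info.groupName = in.readUTF();
    if ((flags & HAS_MODE) != 0) info.mode = in.readShort();
    if ((flags & HAS_MAXIMUM) != 0) info.maximum = in.readLong();
    return info;
  }
}
```

The point of the {{KNOWN_FLAGS}} check is that a reader that sees a bit it doesn't understand stops right there, before it has consumed any field data.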
{code}
int mode = Integer.parseInt(modeString, 8);
info.setMode(new FsPermission((short)mode));
{code}
hey, there's a {{Short.parseShort}} too :)
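Something along these lines would do, as a sketch. {{OctalModeSketch}} and {{parseMode}} are hypothetical names; the only real API used is {{Short.parseShort(String, int)}}.

```java
// Hypothetical helper; only Short.parseShort(String, int) is a real API here.
final class OctalModeSketch {
  // Parses an octal permission string such as "755" directly into a short.
  // Unlike Integer.parseInt followed by a (short) cast, an out-of-range value
  // throws NumberFormatException instead of being silently truncated.
  static short parseMode(String modeString) {
    return Short.parseShort(modeString, 8);
  }
}
```

The result could then be passed straight to {{new FsPermission(...)}} without the cast.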
About terminology: isn't "maximum" a better name for what we're implementing
here than "quota"? If we implement something more sophisticated later, it
could get confusing if we just use the term "quota" here. I also think we
should rip out weight completely if we're not going to support it any more. I
see a few places where "weight" is lingering now. The feature flag stuff
should allow us to add it forwards-compatibly (although not
backwards-compatibly) in the future, if we want to. I feel the same way about
"reservation."
I'm not sure that we want a cache directive addition to fail when the maximum
has been exceeded. The problem is, there isn't any good way to implement this
kind of simple check for more sophisticated quota methods like fair share or
minimum share, etc. Also, this is dependent on things like what we think the
sizes are of files and directories in the cluster, which may change. The
result is very inconsistent behavior from the user's point of view. For
example, maybe he can add cache directives if a datanode has not come up, but
can't add them once it comes up and we determine the full size of a certain
file. Or maybe he could add them by manually editing the edit log, but not
from the command-line. It just feels inconsistent. I would rather we teach
people to rely on looking at {{bytesNeeded}} versus {{bytesCached}} to
determine if they had enough space.
I wonder if we should add another metric that somehow allows users to
disambiguate between bytes not cached because of maximums / quotas / other
"executive decisions" and bytes not cached because the DN had an issue. Right
now all the user can do is subtract {{bytesCached}} from {{bytesNeeded}} and see
that there is some gap, but he would have to check the logs to know why.
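Just to illustrate the kind of breakdown I mean, here is a sketch that assumes a third, purely hypothetical counter, {{bytesOverLimit}}, recording how many bytes were deliberately left uncached because of a maximum. Nothing like it exists in the patch; {{CacheGapSketch}} and its method names are made up as well.

```java
// Hypothetical breakdown of the gap between bytesNeeded and bytesCached.
final class CacheGapSketch {
  final long bytesNeeded;
  final long bytesCached;
  final long bytesOverLimit; // hypothetical metric, not in the current patch

  CacheGapSketch(long bytesNeeded, long bytesCached, long bytesOverLimit) {
    this.bytesNeeded = bytesNeeded;
    this.bytesCached = bytesCached;
    this.bytesOverLimit = bytesOverLimit;
  }

  // Total bytes that should be cached but are not.
  long uncachedBytes() {
    return Math.max(0L, bytesNeeded - bytesCached);
  }

  // Portion of the gap explained by a maximum / quota decision.
  long uncachedByPolicy() {
    return Math.min(uncachedBytes(), bytesOverLimit);
  }

  // Whatever is left over is presumably due to datanode problems.
  long uncachedByDatanode() {
    return uncachedBytes() - uncachedByPolicy();
  }
}
```

With something like this, a user could see the split directly from the stats instead of digging through the logs.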
> support cachepool-based quota management in path-based caching
> --------------------------------------------------------------
>
> Key: HDFS-5431
> URL: https://issues.apache.org/jira/browse/HDFS-5431
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: datanode, namenode
> Affects Versions: 3.0.0
> Reporter: Colin Patrick McCabe
> Assignee: Andrew Wang
> Attachments: hdfs-5431-1.patch
>
>
> We should support cachepool-based quota management in path-based caching.