[
https://issues.apache.org/jira/browse/HDFS-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798421#comment-16798421
]
Daryn Sharp commented on HDFS-14355:
------------------------------------
I quickly skimmed the patch. At a high level, a plugin design should not leak
details. This patch breaks the abstraction and litters the code with
references to "pmem" and explicit conditionals. The details of pmem should be
pushed down and hidden in the pmem specific classes.
Adding another reference (cacheFilePath) to {{ReplicaInfo}} is less than
desirable from a memory perspective. For those not using the feature, it's not
nearly as bad as adding one to inodes but may be an issue for very dense DNs.
More importantly, those that DO use the feature will incur a substantial memory
hit to store the paths. Why does a SCM block need to track a completely
different path? Isn't the SCM just another storage dir?
{{getCacheLoaderClaZZ}} should be {{getCacheLoaderClaSS}} and the returned
{{Class}} must be instantiated. It's wrong to compare the class simple name
against {{MemoryMappableBlockLoader}}. Consider if a user configures with
{{my.custom.MemoryMappableBlockLoader}} and it instantiates
{{org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader}}
anyway because of the matching simple name? Catching
{{ReflectiveOperationException}} and just logging it masks a serious
misconfiguration or error condition.
There's quite a few other exceptions being swallowed or just logged when
dubious "bad things" happen. Or discarding of exceptions and rethrowing
generic "something went wrong" exceptions w/o the original exception as a
cause. That complicates debugging.
New methods in {{FsDatasetUtil}} are not ok. Invalid arguments are not
"ignorable". That's how bugs creep in which are insanely hard to debug. Don't
check if offset is valid, just seek, let it throw if invalid. Don't ignore
null filenames, just let it throw. Catching {{Throwable}} is a bad practice;
let alone catching, discarding, and throwing a bland exception.
Why throw a {{RuntimeException}} in "getOneLocation"? Is the intent to take
down the DN?
Why do a costly compute of the checksum for a mmap file? Isn't the client is
going to compute the checksum anyway?
> Implement HDFS cache on SCM by using pure java mapped byte buffer
> -----------------------------------------------------------------
>
> Key: HDFS-14355
> URL: https://issues.apache.org/jira/browse/HDFS-14355
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: caching, datanode
> Reporter: Feilong He
> Assignee: Feilong He
> Priority: Major
> Attachments: HDFS-14355.000.patch, HDFS-14355.001.patch,
> HDFS-14355.002.patch, HDFS-14355.003.patch
>
>
> This task is to implement the caching to persistent memory using pure
> {{java.nio.MappedByteBuffer}}, which could be useful in case native support
> isn't available or convenient in some environments or platforms.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]