[ 
https://issues.apache.org/jira/browse/HDFS-14355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16798421#comment-16798421
 ] 

Daryn Sharp commented on HDFS-14355:
------------------------------------

I quickly skimmed the patch.  At a high level, a plugin design should not leak 
details.  This patch breaks the abstraction and litters the code with 
references to "pmem" and explicit conditionals.  The details of pmem should be 
pushed down and hidden in the pmem specific classes. 

Adding another reference (cacheFilePath) to {{ReplicaInfo}} is less than 
desirable from a memory perspective.  For those not using the feature, it's not 
nearly as bad as adding one to inodes but may be an issue for very dense DNs.  
More importantly, those that DO use the feature will incur a substantial memory 
hit to store the paths.  Why does a SCM block need to track a completely 
different path?  Isn't the SCM just another storage dir?

{{getCacheLoaderClaZZ}} should be {{getCacheLoaderClaSS}} and the returned 
{{Class}} must be instantiated.  It's wrong to compare the class simple name 
against {{MemoryMappableBlockLoader}}.  Consider if a user configures with 
{{my.custom.MemoryMappableBlockLoader}} and it instantiates 
{{org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.MemoryMappableBlockLoader}}
 anyway because of the matching simple name?  Catching 
{{ReflectiveOperationException}} and just logging it masks a serious 
misconfiguration or error condition.

There's quite a few other exceptions being swallowed or just logged when 
dubious "bad things" happen.  Or discarding of exceptions and rethrowing 
generic "something went wrong" exceptions w/o the original exception as a 
cause.  That complicates debugging.

New methods in {{FsDatasetUtil}} are not ok.  Invalid arguments are not 
"ignorable".  That's how bugs creep in which are insanely hard to debug.  Don't 
check if offset is valid, just seek, let it throw if invalid.  Don't ignore 
null filenames, just let it throw.  Catching {{Throwable}} is a bad practice; 
let alone catching, discarding, and throwing a bland exception.

Why throw a {{RuntimeException}} in "getOneLocation"?  Is the intent to take 
down the DN?

Why do a costly compute of the checksum for a mmap file?  Isn't the client is 
going to compute the checksum anyway?

> Implement HDFS cache on SCM by using pure java mapped byte buffer
> -----------------------------------------------------------------
>
>                 Key: HDFS-14355
>                 URL: https://issues.apache.org/jira/browse/HDFS-14355
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: caching, datanode
>            Reporter: Feilong He
>            Assignee: Feilong He
>            Priority: Major
>         Attachments: HDFS-14355.000.patch, HDFS-14355.001.patch, 
> HDFS-14355.002.patch, HDFS-14355.003.patch
>
>
> This task is to implement the caching to persistent memory using pure 
> {{java.nio.MappedByteBuffer}}, which could be useful in case native support 
> isn't available or convenient in some environments or platforms.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to