[
https://issues.apache.org/jira/browse/HADOOP-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris Douglas updated HADOOP-5958:
----------------------------------
Fix Version/s: 0.22.0 (was: 0.21.0)
Status: Open (was: Patch Available)
bq. what happens when you run that same benchmark against an NFS drive?
Every class using DF assumes, reasonably, that the resource behaves like a
local drive. Configuring, e.g., LocalDirAllocator or FSDataset to use a remote
FS and then worrying that DF might take milliseconds instead of microseconds is
focusing on the noise.
I don't understand the virtue of the current approach, as DF seems like a fixed
set of functionality not meriting an abstract class, factory, etc. Is PosixDF
needed for anything but getMount and getFilesystem? The latter has no callers
and the former has one that creates an instance and discards all but the mount.
This suggests two approaches:
# If nearly all uses of DF involve delegating to a java.io.File, is there any
reason not to simply replace uses of DF with File? Everywhere DF is used, a
local FS is assumed. As Steve points out, if this were otherwise, other designs
would be preferred.
# If DF retained Shell as a supertype and used the java.io.File methods where
appropriate, is the cost really prohibitive? Few of these are created, and
other than a faster implementation of most calls, nothing changes. The costs
incurred for keeping everything intact appear trivial.
Neither of these is an incompatible change, assuming java.io.File is correctly
implemented. They're not even mutually exclusive.
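To make the first approach concrete, here is a minimal sketch of disk statistics read straight from java.io.File. The class name {{FileBackedDF}} is hypothetical and is not taken from the patch; {{getUsed}} and {{getAvailable}} are assumed names chosen to sit alongside {{getCapacity}}. It assumes the JDK 1.6 File methods report correct values on the target platform.

```java
import java.io.File;

/**
 * Hypothetical sketch, not the patch: disk statistics read through
 * java.io.File (JDK 1.6+) instead of parsing the output of a spawned process.
 */
class FileBackedDF {
  private final File dir;

  FileBackedDF(File path) {
    // Resolve against the working directory so relative paths behave predictably.
    this.dir = path.getAbsoluteFile();
  }

  /** Total size in bytes of the partition holding the directory. */
  long getCapacity() {
    return dir.getTotalSpace();
  }

  /** Bytes in use: total space minus unallocated space. */
  long getUsed() {
    return dir.getTotalSpace() - dir.getFreeSpace();
  }

  /** Bytes available to this JVM; may be less than the unallocated space. */
  long getAvailable() {
    return dir.getUsableSpace();
  }

  public static void main(String[] args) {
    FileBackedDF df = new FileBackedDF(new File(args.length > 0 ? args[0] : "."));
    System.out.println("capacity=" + df.getCapacity()
        + " used=" + df.getUsed()
        + " available=" + df.getAvailable());
  }
}
```

Nothing here forks a process, which is the memory concern raised in the issue description. Mount point and filesystem name are not available this way, which is why only those two calls would still need the shell.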
A few nits:
* Incompatibly moving {{DF::main}} seems unnecessary
* The comment on {{DF_INTERVAL_DEFAULT}} should be javadoc
* While the original didn't have it either, DF methods should have javadoc
* In {{getPercentUsed}}, using {{cap}} in the calculation of {{used}} avoids
the second call to {{getCapacity}}
* If the current design is retained (because some architecture has a faulty
java.io.File impl?), it should be possible to select PosixDF exclusively via
the config passed to {{DF::getDF}} (which could be named {{DF::get}}).
* The current patch also requires changes to HDFS that must be committed with
these. If they are retained, please open an issue and link it.
* {{getFilesystem}} and {{getMount}} should probably be deprecated, or even
removed from DF, since one needs to explicitly instantiate PosixDF to make the
call. The only reason {{getMount}} is there is that it's the command the value
is scraped from, anyway.
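For the {{getPercentUsed}} nit, something along these lines is what I have in mind. This is only a sketch: {{DiskUsage}} and {{getFree}} are hypothetical names, and the rounding (truncate to int, report 0 when capacity is unknown) is an assumption rather than what the patch does.

```java
import java.io.File;

/** Hypothetical stand-in for DF, reduced to what the getPercentUsed nit needs. */
class DiskUsage {
  private final File dir;

  DiskUsage(File dir) {
    this.dir = dir.getAbsoluteFile();
  }

  /** Total size in bytes of the partition holding the directory. */
  long getCapacity() {
    return dir.getTotalSpace();
  }

  /** Unallocated bytes on the partition. */
  long getFree() {
    return dir.getFreeSpace();
  }

  /** Percentage of the partition in use, truncated; 0 if capacity is unknown. */
  int getPercentUsed() {
    long cap = getCapacity();      // queried exactly once
    long used = cap - getFree();   // derived from cap, no second getCapacity() call
    return cap <= 0 ? 0 : (int) (used * 100.0 / cap);
  }
}
```

Besides saving a call, computing {{used}} from the same {{cap}} keeps the numerator and denominator consistent if the capacity reading changes between calls.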
> Use JDK 1.6 File APIs in DF.java wherever possible
> --------------------------------------------------
>
> Key: HADOOP-5958
> URL: https://issues.apache.org/jira/browse/HADOOP-5958
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs
> Reporter: Devaraj Das
> Assignee: Aaron Kimball
> Fix For: 0.22.0
>
> Attachments: HADOOP-5958-hdfs.patch, HADOOP-5958-mapred.patch,
> HADOOP-5958.2.patch, HADOOP-5958.3.patch, HADOOP-5958.4.patch,
> HADOOP-5958.patch
>
>
> JDK 1.6 has File APIs like File.getFreeSpace() which should be used instead
> of spawning a command process to get the various disk/partition related
> attributes. This would avoid spikes in memory consumption by tasks when
> things like LocalDirAllocator are used for creating paths on the filesystem.