[
https://issues.apache.org/jira/browse/HDFS-12093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103078#comment-16103078
]
Ewan Higgs commented on HDFS-12093:
-----------------------------------
Tested this on a simple 1 NN, 1 DN setup on a shared machine, and the DN
started much faster. So that aspect is fixed.
One issue I did run into, however, is an exception from FsVolumeSpi. I'm not
sure if it's related:
{code}
2017-07-27 13:18:45,599 INFO impl.FsVolumeImpl: Adding ScanInfo for blkid 1073741825
2017-07-27 13:18:45,600 ERROR datanode.DirectoryScanner: Error compiling report for the volume, StorageId: DS-e89a096e-ba2c-4e85-bf2b-5321e8f93852
java.util.concurrent.ExecutionException: java.lang.IllegalArgumentException: URI scheme is not "file"
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.getDiskReport(DirectoryScanner.java:544)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.scan(DirectoryScanner.java:393)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.reconcile(DirectoryScanner.java:375)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner.run(DirectoryScanner.java:320)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.IllegalArgumentException: URI scheme is not "file"
    at java.io.File.<init>(File.java:421)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.FsVolumeSpi$ScanInfo.<init>(FsVolumeSpi.java:319)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ProvidedVolumeImpl$ProvidedBlockPoolSlice.compileReport(ProvidedVolumeImpl.java:151)
    at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.ProvidedVolumeImpl.compileReport(ProvidedVolumeImpl.java:482)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner$ReportCompiler.call(DirectoryScanner.java:618)
    at org.apache.hadoop.hdfs.server.datanode.DirectoryScanner$ReportCompiler.call(DirectoryScanner.java:581)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 more
{code}
This is coming from the following constructor in {{FsVolumeSpi.ScanInfo}}:
{code}
public ScanInfo(long blockId, File blockFile, File metaFile,
    FsVolumeSpi vol, FileRegion fileRegion, long length) {
  this.blockId = blockId;
  String condensedVolPath =
      (vol == null || vol.getBaseURI() == null) ? null :
        getCondensedPath(new File(vol.getBaseURI()).getAbsolutePath());
        // <-- vol.getBaseURI() will return my volume's scheme (s3a).
  this.blockSuffix = blockFile == null ? null :
      getSuffix(blockFile, condensedVolPath);
  this.blockLength = length;
  if (metaFile == null) {
    this.metaSuffix = null;
  } else if (blockFile == null) {
    this.metaSuffix = getSuffix(metaFile, condensedVolPath);
  } else {
    this.metaSuffix = getSuffix(metaFile,
        condensedVolPath + blockSuffix);
  }
  this.volume = vol;
  this.fileRegion = fileRegion;
}
{code}
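The failure can be reproduced outside Hadoop: the {{java.io.File(URI)}} constructor rejects any URI whose scheme is not "file". A guard along these lines would route remote schemes such as s3a around the {{File}} constructor; the helper name {{condensedPathFor}} here is mine, not Hadoop's, and this is only a sketch of one possible check, not the actual fix:
{code}
import java.io.File;
import java.net.URI;

public class ScanInfoSchemeCheck {
  // Hypothetical guard: only build a java.io.File when the volume's base
  // URI is a local "file" URI; for remote schemes (e.g. s3a), new File(URI)
  // would throw IllegalArgumentException, so use the URI path instead.
  static String condensedPathFor(URI baseURI) {
    if (baseURI == null) {
      return null;
    }
    if ("file".equals(baseURI.getScheme())) {
      return new File(baseURI).getAbsolutePath();
    }
    return baseURI.getPath();  // remote volume: skip the File constructor
  }

  public static void main(String[] args) {
    // new File(URI.create("s3a://bucket/block")) would throw here;
    // the guard never hands a non-file URI to the File constructor.
    System.out.println(condensedPathFor(URI.create("s3a://bucket/block")));
    System.out.println(condensedPathFor(URI.create("file:///tmp/hadoop")));
  }
}
{code}
Whether {{ScanInfo}} should keep a local-path form at all for provided volumes is a separate design question.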
I'm not sure whether this is related, or whether it needs to be fixed under this ticket.
> [READ] Share remoteFS between ProvidedReplica instances.
> --------------------------------------------------------
>
> Key: HDFS-12093
> URL: https://issues.apache.org/jira/browse/HDFS-12093
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Ewan Higgs
> Attachments: HDFS-12093-HDFS-9806.001.patch
>
>
> When a Datanode comes online using Provided storage, it fills the
> {{ReplicaMap}} with the known replicas. With Provided Storage, this includes
> {{ProvidedReplica}} instances. Each of these objects, in their constructor,
> will construct a FileSystem using the Service Provider. This can result in
> contacting the remote file system and checking that the credentials are
> correct and that the data is there. For large systems this is a prohibitively
> expensive operation to perform per replica.
> Instead, the {{ProvidedVolumeImpl}} should own the reference to the
> {{remoteFS}} and should share it with the {{ProvidedReplica}} objects on
> their creation.
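The sharing described above can be sketched outside Hadoop as follows; {{RemoteFs}}, {{ProvidedVolume}}, and {{ProvidedReplica}} are illustrative stand-ins for the real types, with a counter standing in for the expensive FileSystem construction:
{code}
public class SharedRemoteFsSketch {
  static class RemoteFs {
    static int connections = 0;       // counts expensive set-ups
    RemoteFs() { connections++; }     // stands in for building a FileSystem
  }

  static class ProvidedReplica {
    final RemoteFs remoteFS;
    // Takes the shared reference; no remote connection is made here.
    ProvidedReplica(RemoteFs shared) { this.remoteFS = shared; }
  }

  static class ProvidedVolume {
    // Constructed once per volume, then handed to every replica.
    private final RemoteFs remoteFS = new RemoteFs();
    ProvidedReplica newReplica() { return new ProvidedReplica(remoteFS); }
  }

  public static void main(String[] args) {
    ProvidedVolume vol = new ProvidedVolume();
    for (int i = 0; i < 1000; i++) {
      vol.newReplica();
    }
    System.out.println(RemoteFs.connections); // prints 1: one set-up total
  }
}
{code}
The point is only that the number of remote connections becomes proportional to the number of volumes rather than the number of replicas.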
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]