[
https://issues.apache.org/jira/browse/HADOOP-10560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13991314#comment-13991314
]
Andrei Savu commented on HADOOP-10560:
--------------------------------------
I had a quick look over the patch. Some improvement ideas:
* that property should be named fs.s3n.copyThreads to be consistent with other
s3n related configs
* you need to shutdown() s3CopyExecutor in a finally {} block to avoid leaking
threads
* set recognizable thread names to make it easy to debug when stuck for
whatever reason (see
http://stackoverflow.com/questions/6113746/naming-threads-and-thread-pools-of-executorservice)
* fix typo in config description - directy - you should probably also add a
note on why this is necessary as a workaround
* nice to have: a better test that actually tests concurrency and not only
correctness
[[email protected]] just curious - is there a better way to get a handle on a
Hadoop managed ExecutorService?
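
To make the executor-related suggestions above concrete, here is a minimal
sketch, not the code from the patch. It assumes a fixed-size pool whose size
would come from fs.s3n.copyThreads; the class CopyPoolSketch, the Copier
interface (a stand-in for store.copy), the namedFactory helper, and the
"s3n-copy" thread-name prefix are all hypothetical names used only for
illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

class CopyPoolSketch {

  /** Hypothetical stand-in for store.copy(srcKey, dstKey). */
  interface Copier {
    void copy(String srcKey, String dstKey) throws Exception;
  }

  /** Names worker threads prefix-1, prefix-2, ... so they stand out in a thread dump. */
  static ThreadFactory namedFactory(final String prefix) {
    final AtomicInteger counter = new AtomicInteger(0);
    return new ThreadFactory() {
      @Override
      public Thread newThread(Runnable r) {
        return new Thread(r, prefix + "-" + counter.incrementAndGet());
      }
    };
  }

  /**
   * Copies every key under srcKey to the matching key under dstKey using
   * copyThreads workers (assumed to be read from fs.s3n.copyThreads).
   */
  static void copyAll(List<String> keys, final String srcKey, final String dstKey,
                      final Copier copier, int copyThreads) throws Exception {
    ExecutorService s3CopyExecutor =
        Executors.newFixedThreadPool(copyThreads, namedFactory("s3n-copy"));
    try {
      List<Future<Void>> pending = new ArrayList<Future<Void>>();
      for (final String key : keys) {
        pending.add(s3CopyExecutor.submit((Callable<Void>) () -> {
          copier.copy(key, dstKey + key.substring(srcKey.length()));
          return null;
        }));
      }
      // Wait for every copy; get() rethrows a failed copy as ExecutionException.
      for (Future<Void> f : pending) {
        f.get();
      }
    } finally {
      // Always release the worker threads, even if a copy failed.
      s3CopyExecutor.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    List<String> keys = Arrays.asList("src/a", "src/b", "src/c");
    copyAll(keys, "src", "dst",
        (s, d) -> System.out.println(Thread.currentThread().getName() + ": " + s + " -> " + d),
        2);
  }
}
```

Because the shutdown() call sits in the finally block, the pool is released
whether the copies succeed or one of them throws. The thread names also give a
concurrency test something to observe, for example by recording which worker
handled each key.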
> Update NativeS3FileSystem to issue copy commands for files with in a
> directory with a configurable number of threads
> --------------------------------------------------------------------------------------------------------------------
>
> Key: HADOOP-10560
> URL: https://issues.apache.org/jira/browse/HADOOP-10560
> Project: Hadoop Common
> Issue Type: Improvement
> Components: fs/s3
> Reporter: Ted Malaska
> Assignee: Ted Malaska
> Priority: Minor
> Labels: performance
> Attachments: HADOOP-10560-1.patch, HADOOP-10560.patch
>
>
> In NativeS3FileSystem if you do a copy of a directory it will copy all the
> files to the new location, but it will do this with one thread. Code is
> below. This jira will allow a configurable number of threads to be used to
> issue the copy commands to S3.
> do {
>   PartialListing listing = store.list(srcKey, S3_MAX_LISTING_LENGTH,
>       priorLastKey, true);
>   for (FileMetadata file : listing.getFiles()) {
>     keysToDelete.add(file.getKey());
>     store.copy(file.getKey(),
>         dstKey + file.getKey().substring(srcKey.length()));
>   }
>   priorLastKey = listing.getPriorLastKey();
> } while (priorLastKey != null);