[ https://issues.apache.org/jira/browse/HADOOP-13403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15412269#comment-15412269 ]
Hadoop QA commented on HADOOP-13403:
------------------------------------
| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s{color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 9m 18s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 15s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 21s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 1m 6s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 35s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 14s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 14s{color} | {color:green} hadoop-tools/hadoop-azure: The patch generated 0 new + 43 unchanged - 1 fixed = 43 total (was 44) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 39s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 29s{color} | {color:green} hadoop-azure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 18s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 17m 42s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12822628/HADOOP-13403-006.patch |
| JIRA Issue | HADOOP-13403 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux 8be9946d46ca 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 6255859 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| Test Results | https://builds.apache.org/job/PreCommit-HADOOP-Build/10201/testReport/ |
| modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure |
| Console output | https://builds.apache.org/job/PreCommit-HADOOP-Build/10201/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |
This message was automatically generated.
> AzureNativeFileSystem rename/delete performance improvements
> ------------------------------------------------------------
>
> Key: HADOOP-13403
> URL: https://issues.apache.org/jira/browse/HADOOP-13403
> Project: Hadoop Common
> Issue Type: Bug
> Components: azure
> Affects Versions: 2.7.2
> Reporter: Subramanyam Pattipaka
> Assignee: Subramanyam Pattipaka
> Fix For: 2.9.0
>
> Attachments: HADOOP-13403-001.patch, HADOOP-13403-002.patch,
> HADOOP-13403-003.patch, HADOOP-13403-004.patch, HADOOP-13403-005.patch,
> HADOOP-13403-006.patch
>
>
> WASB Performance Improvements
> Problem
> -----------
> Azure Native File System operations such as rename/delete perform poorly
> when the source directory contains a large number of directories and/or
> files. Possible reasons:
> a) We first list all files under the source directory hierarchically. This is
> a serial operation.
> b) After collecting the entire list of files under a folder, we delete or
> rename the files one by one, serially.
> c) No logging information is available for these costly operations,
> even in DEBUG mode, which makes WASB performance issues difficult to
> understand.
> Proposal
> -------------
> Step 1: Rename and delete operations generate a list of all files under the
> source folder. We need to use the Azure flat listing option to get that list
> with a single request to the Azure store. We have introduced the config
> fs.azure.flatlist.enable to enable this option. The default value is 'false',
> which means flat listing is disabled.
> Step 2: Create a thread pool and its threads dynamically, based on user
> configuration. The thread pool is deleted once the operation is over.
> We are introducing two new configs:
> a) fs.azure.rename.threads: Config to set the number of rename
> threads. The default value is 0, which means no threading.
> b) fs.azure.delete.threads: Config to set the number of delete
> threads. The default value is 0, which means no threading.
> We log, at DEBUG level, the number of threads that were not used
> for the operation, which can be useful for tuning.
> Failure Scenarios:
> If we fail to create the thread pool for ANY reason (for example, trying
> to create it with a very large thread count such as 1000000), we fall back
> to the serial operation.
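The fallback described above might be sketched as follows. This is only an illustration, not the code in the attached patch: the class name PoolFallbackSketch, the method tryCreatePool, and the 2-second keep-alive are assumptions, and a null return is used here to mean "take the serial path".

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; names are illustrative and not taken from the patch.
final class PoolFallbackSketch {

    /**
     * Tries to build a pool for the configured thread count.
     * Returns null when the caller should run the operation serially.
     */
    static ThreadPoolExecutor tryCreatePool(int threadCount) {
        if (threadCount == 0) {
            // Default config value: threading is disabled.
            return null;
        }
        try {
            // The constructor rejects invalid sizes with IllegalArgumentException.
            // Worker threads are only started lazily when tasks are submitted,
            // so resource exhaustion would surface later, at scheduling time.
            return new ThreadPoolExecutor(
                threadCount, threadCount,
                2L, TimeUnit.SECONDS,              // keep-alive value is an assumption
                new LinkedBlockingQueue<Runnable>());
        } catch (Exception e) {
            // Any failure to create the pool means: fall back to serial execution.
            return null;
        }
    }
}
```

A caller would check the result for null and, in that case, loop over the files on the calling thread.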
> Step 3: Blob operations can be done in parallel using multiple threads,
> each executing the following snippet:
> while ((currentIndex = fileIndex.getAndIncrement()) < files.length) {
> FileMetadata file = files[currentIndex];
> Rename/delete(file);
> }
> The above strategy depends on the fact that all files are stored in a
> final array and that each thread atomically obtains the next index to work
> on. The advantage of this strategy is that even if the user configures a
> large number of threads, some of which turn out to be unusable, the work
> never gets serialized behind lagging threads.
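A self-contained sketch of this shared-index strategy, under stated assumptions: SharedIndexSketch and processAll are hypothetical names, a generic array stands in for FileMetadata[], and the Consumer stands in for the rename or delete call. Each worker returns how many files it handled, which corresponds to the per-thread count mentioned in the logging list.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Consumer;

// Illustrative only; not the implementation from the patch.
final class SharedIndexSketch {

    /** Applies op to every file using the given number of workers; returns per-worker counts. */
    static <T> int[] processAll(final T[] files, int threads, final Consumer<T> op) {
        final AtomicInteger fileIndex = new AtomicInteger(0);

        // Every worker runs the same loop: atomically claim the next index, process that file.
        Callable<Integer> worker = () -> {
            int handled = 0;
            int currentIndex;
            while ((currentIndex = fileIndex.getAndIncrement()) < files.length) {
                op.accept(files[currentIndex]);
                handled++;
            }
            return handled;
        };

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                futures.add(pool.submit(worker));
            }
            int[] counts = new int[threads];
            for (int t = 0; t < threads; t++) {
                counts[t] = futures.get(t).get(); // wait for each worker to finish
            }
            return counts;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("operation failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

Because no file is assigned to a worker in advance, a slow or never-started worker simply claims fewer indexes, and the remaining workers pick up the rest.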
> We log the following information, which can be useful for tuning the
> number of threads:
> a) Number of unusable threads
> b) Time taken by each thread
> c) Number of files processed by each thread
> d) Total time taken for the operation
> Failure Scenarios:
> Failure to queue a thread execute request shouldn’t be an issue as long as
> we can ensure that at least one thread has completed execution successfully.
> If we couldn't schedule even one thread, we should take the serial path.
> Exceptions raised while executing threads are still considered regular
> exceptions and are returned to the client as an operation failure.
> Exceptions raised while stopping threads and deleting the thread pool can be
> ignored if the operation on all files completed without any issue.
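One possible reading of these failure rules, as a hedged sketch: FailurePolicySketch and runWithPolicy are hypothetical names, the Runnable stands in for the shared-index worker loop, and IllegalStateException stands in for whatever exception type is actually returned to the client. The sketch treats "at least one worker was queued" as sufficient to stay on the parallel path, which simplifies the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;

// Illustrative only; not the implementation from the patch.
final class FailurePolicySketch {

    /** Returns true if the work ran on the pool, false if it ran serially. */
    static boolean runWithPolicy(ExecutorService pool, int threads, Runnable worker) {
        try {
            List<Future<?>> scheduled = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                try {
                    scheduled.add(pool.submit(worker));
                } catch (RejectedExecutionException e) {
                    // Tolerated: workers that were queued still drain the shared file list.
                }
            }
            if (scheduled.isEmpty()) {
                worker.run(); // nothing could be scheduled: serial path on the calling thread
                return false;
            }
            for (Future<?> f : scheduled) {
                try {
                    f.get();
                } catch (ExecutionException e) {
                    // A failure inside a worker is a regular failure of the operation.
                    throw new IllegalStateException("operation failed", e.getCause());
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted", e);
                }
            }
            return true;
        } finally {
            try {
                pool.shutdown();
            } catch (RuntimeException ignored) {
                // Errors while tearing down the pool are ignored.
            }
        }
    }
}
```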
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)