[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=684640&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-684640
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 22/Nov/21 11:37
Start Date: 22/Nov/21 11:37
Worklog Time Spent: 10m 
  Work Description: sodonnel merged pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 684640)
Time Spent: 3h 50m  (was: 3h 40m)

> Enable get command run with multi-thread
> 
>
> Key: HADOOP-17998
> URL: https://issues.apache.org/jira/browse/HADOOP-17998
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
>Affects Versions: 3.3.1
>Reporter: Chengwei Wang
>Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-17998.001.patch, HADOOP-17998.002.patch
>
>  Time Spent: 3h 50m
>  Remaining Estimate: 0h
>
> CopyFromLocal/Put was enabled to run with multiple threads by HDFS-11786 and 
> HADOOP-14698, which makes putting directories or multiple files faster.
> So it is necessary to enable the get and copyToLocal commands to run with 
> multiple threads as well.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org



[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=684524&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-684524
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 22/Nov/21 05:47
Start Date: 22/Nov/21 05:47
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-975148937


   :confetti_ball: **+1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 41s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 4 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m  8s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 50s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  19m  5s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 41s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 13s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 27s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  22m  2s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m  9s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  21m  9s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m  8s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m  8s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  hadoop-common-project/hadoop-common: The patch generated 0 new + 51 unchanged - 8 fixed = 51 total (was 59)  |
   | +1 :green_heart: |  mvnsite  |   1m 38s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  2s |  |  The patch has no ill-formed XML file.  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 44s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 37s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  22m  1s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 23s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 56s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 194m 12s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/4/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3645 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint xml |
   | uname | Linux b83ce577d780 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / 320d5572c890768fbc8a9cf63b18a0f8c9488998 |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | /usr/lib/jvm/java-11-openjdk-amd64:Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04 /usr/lib/jvm/java-8-openjdk-amd64:Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Test Results | https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/4/testReport/ |
   | Max. process+thread count | 3152 (vs. ulimit of 5500) |
   | modules | C: 

[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-21 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=684494&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-684494
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 22/Nov/21 02:35
Start Date: 22/Nov/21 02:35
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-975017718


   @sodonnel thanks for the review; I have fixed the Javadoc errors.




Issue Time Tracking
---

Worklog Id: (was: 684494)
Time Spent: 3.5h  (was: 3h 20m)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683938&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683938
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 13:41
Start Date: 19/Nov/21 13:41
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-974082796


   There are 3 errors in the Javadoc build job related to the changes here:
   
   ```
   [ERROR] 
/home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-3645/ubuntu-focal/src/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java:42:
 error: malformed HTML
   [ERROR]* set thread count by option value, if the value <= 1, use 1 
instead.
   [ERROR] ^
   [ERROR] 
/home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-3645/ubuntu-focal/src/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java:53:
 error: malformed HTML
   [ERROR]* set thread pool queue size by option value, if the value < 1,
   [ERROR]   ^
   [ERROR] 
/home/jenkins/jenkins-agent/workspace/hadoop-multibranch_PR-3645/ubuntu-focal/src/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CommandWithDestination.java:400:
 error: unknown tag: target
   [ERROR]* file "<target>._COPYING_". If the copy is
   ```
   
   I guess it does not like the `<` symbol or the `<target>` tag, as it thinks 
they should be valid HTML.
   
   Could you try to fix those by changing the Javadoc comments? Then I think 
this change is good to commit.
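
One way the Javadoc complaints above could be addressed is to wrap the offending characters in `{@literal ...}` or `{@code ...}` so that doclint no longer parses them as HTML. The sketch below is only an illustration: `ThreadCountOption` and `parseThreadCount` are hypothetical names, not the classes touched by the pull request, and the `<target>` placeholder is inferred from the "unknown tag: target" error in the log.

```java
/** Hypothetical helper for illustration; not the actual Hadoop class. */
final class ThreadCountOption {

  private ThreadCountOption() {
  }

  /**
   * Parse the thread count from an option value; if the value is
   * {@literal <=} 1 (or absent), use 1 instead.
   *
   * <p>A temporary file name can be written as
   * {@code <target>._COPYING_} so that Javadoc does not treat
   * the placeholder as an HTML tag.
   *
   * @param optValue raw option value, may be null
   * @return a thread count of at least 1
   */
  static int parseThreadCount(String optValue) {
    if (optValue == null) {
      return 1;
    }
    int count = Integer.parseInt(optValue);
    // Clamp to a minimum of 1; no upper bound is applied in this sketch.
    return Math.max(count, 1);
  }
}
```

Writing `&lt;` instead of `{@literal <}` would also satisfy doclint; the inline tags simply keep the comment readable in source form.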




Issue Time Tracking
---

Worklog Id: (was: 683938)
Time Spent: 3h 20m  (was: 3h 10m)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683932&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683932
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 12:55
Start Date: 19/Nov/21 12:55
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-974047752


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 53s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  0s |  |  codespell was not available.  |
   | +0 :ok: |  markdownlint  |   0m  0s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 4 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  34m 57s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  23m 37s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  20m 13s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  0s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 36s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m  7s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  spotbugs  |   2m 27s | [/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/3/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html) |  hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings.  |
   | +1 :green_heart: |  shadedclient  |  24m 20s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   1m  1s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  22m 56s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  22m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m  7s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  20m  7s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   0m 57s |  |  hadoop-common-project/hadoop-common: The patch generated 0 new + 33 unchanged - 8 fixed = 33 total (was 41)  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  the patch passed  |
   | +1 :green_heart: |  xml  |   0m  1s |  |  The patch has no ill-formed XML file.  |
   | -1 :x: |  javadoc  |   1m  4s | [/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/3/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-common in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | +1 :green_heart: |  javadoc  |   1m 37s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  25m  8s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | +1 :green_heart: |  unit  |  17m 20s |  |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 49s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 206m 34s |  |  |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/3/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3645 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint xml |
   | uname | Linux 7ad858b6dcca 4.15.0-147-generic #151-Ubuntu SMP Fri Jun 18 19:21:19 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | 

[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683869&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683869
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 09:38
Start Date: 19/Nov/21 09:38
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-973909341


   > There is also a failing test in `hadoop.cli.TestCLI`. I have not looked 
into it, but it may be related to these changes as we are changing the CLI 
here. Can you check it too please?
   
   The failure was caused by the change to the command's usage and 
description; it is fixed now.




Issue Time Tracking
---

Worklog Id: (was: 683869)
Time Spent: 3h  (was: 2h 50m)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683867&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683867
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 09:33
Start Date: 19/Nov/21 09:33
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on a change in pull request 
#3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r753016551



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;
+
+  public static final Logger LOG =
+      LoggerFactory.getLogger(CopyCommandWithMultiThread.class);
+
+  protected void setThreadCount(String optValue) {
+    if (optValue != null) {
+      int count = Integer.parseInt(optValue);
+      threadCount = count < 1 ? 1 : Math.min(count, MAX_THREAD_COUNT);

Review comment:
   Fixed by adding docs to these methods.
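
The quoted class imports `ThreadPoolExecutor`, `ArrayBlockingQueue` and `TimeUnit`, and keeps a thread count and a queue size. As a rough sketch of how those pieces might fit together (an assumption, not the code in the pull request), the example below builds a fixed-size pool with a bounded queue and runs a batch of tasks through it. `BoundedCopyPool`, `newPool`, `runAll`, the `CallerRunsPolicy` back-pressure choice and the 30-second wait are all hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch; names and policies are assumptions. */
final class BoundedCopyPool {

  private BoundedCopyPool() {
  }

  /** Build a fixed-size pool with a bounded queue; values below 1 become 1. */
  static ThreadPoolExecutor newPool(int threadCount, int queueSize) {
    int threads = Math.max(threadCount, 1);
    int capacity = Math.max(queueSize, 1);
    return new ThreadPoolExecutor(threads, threads, 1, TimeUnit.SECONDS,
        new ArrayBlockingQueue<>(capacity),
        // When the queue is full, the submitting thread runs the task itself.
        new ThreadPoolExecutor.CallerRunsPolicy());
  }

  /** Submit taskCount no-op "copy" tasks and return how many completed. */
  static int runAll(int taskCount, int threadCount, int queueSize) {
    AtomicInteger done = new AtomicInteger();
    ThreadPoolExecutor pool = newPool(threadCount, queueSize);
    for (int i = 0; i < taskCount; i++) {
      pool.execute(done::incrementAndGet);
    }
    pool.shutdown();
    try {
      if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting", e);
    }
    return done.get();
  }
}
```

A bounded queue keeps memory use predictable when a directory tree yields many files; the trade-off in this sketch is that the submitting thread occasionally does copy work itself once the queue fills.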






Issue Time Tracking
---

Worklog Id: (was: 683867)
Time Spent: 2h 50m  (was: 2h 40m)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683866&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683866
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 09:32
Start Date: 19/Nov/21 09:32
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on a change in pull request 
#3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r753015750



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyToLocal.java
##
@@ -0,0 +1,230 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ThreadPoolExecutor;
+
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.commons.lang3.RandomStringUtils;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileSystemTestHelper;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.shell.CopyCommands.CopyToLocal;
+
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.DEFAULT_QUEUE_SIZE;
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.MAX_THREAD_COUNT;
+import static org.junit.Assert.assertEquals;
+
+public class TestCopyToLocal {
+
+  private static final String FROM_DIR_NAME = "fromDir";
+  private static final String TO_DIR_NAME = "toDir";
+
+  private static FileSystem fs;
+  private static Path testDir;
+  private static Configuration conf;
+
+  private static int initialize(Path dir) throws Exception {
+    fs.mkdirs(dir);
+    Path fromDirPath = new Path(dir, FROM_DIR_NAME);
+    fs.mkdirs(fromDirPath);
+    Path toDirPath = new Path(dir, TO_DIR_NAME);
+    fs.mkdirs(toDirPath);
+
+    int numTotalFiles = 0;
+    int numDirs = RandomUtils.nextInt(0, 5);
+    for (int dirCount = 0; dirCount < numDirs; ++dirCount) {
+      Path subDirPath = new Path(fromDirPath, "subdir" + dirCount);
+      fs.mkdirs(subDirPath);
+      int numFiles = RandomUtils.nextInt(0, 10);
+      for (int fileCount = 0; fileCount < numFiles; ++fileCount) {
+        numTotalFiles++;
+        Path subFile = new Path(subDirPath, "file" + fileCount);
+        fs.createNewFile(subFile);
+        FSDataOutputStream output = fs.create(subFile, true);
+        for (int i = 0; i < 100; ++i) {
+          output.writeInt(i);
+          output.writeChar('\n');
+        }
+        output.close();
+      }
+    }
+
+    return numTotalFiles;
+  }
+
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new Configuration(false);
+    conf.set("fs.file.impl", LocalFileSystem.class.getName());
+    fs = FileSystem.getLocal(conf);
+    testDir = new FileSystemTestHelper().getTestRootPath(fs);
+    // don't want scheme on the path, just an absolute path
+    testDir = new Path(fs.makeQualified(testDir).toUri().getPath());
+
+    FileSystem.setDefaultUri(conf, fs.getUri());
+    fs.setWorkingDirectory(testDir);
+  }
+
+  @AfterClass
+  public static void cleanup() throws Exception {
+    fs.delete(testDir, true);
+    fs.close();
+  }
+
+  private void run(CopyCommandWithMultiThread cmd, String... args) {
+    cmd.setConf(conf);
+    assertEquals(0, cmd.run(args));
+  }
+
+  @org.junit.Test(timeout = 1)
+  public void testCopy() throws Exception {
+    Path dir = new Path("dir" + RandomStringUtils.randomNumeric(4));
+    initialize(dir);
+    MultiThreadedCopy copy = new MultiThreadedCopy(1, DEFAULT_QUEUE_SIZE, 0);

Review comment:
   Fixed by creating a `@Before` method.






[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683865&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683865
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 09:31
Start Date: 19/Nov/21 09:31
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on a change in pull request 
#3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r753014895



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyToLocal.java
##
@@ -0,0 +1,230 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ThreadPoolExecutor;
+
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.commons.lang3.RandomStringUtils;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileSystemTestHelper;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.shell.CopyCommands.CopyToLocal;
+
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.DEFAULT_QUEUE_SIZE;
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.MAX_THREAD_COUNT;
+import static org.junit.Assert.assertEquals;
+
+public class TestCopyToLocal {
+
+  private static final String FROM_DIR_NAME = "fromDir";
+  private static final String TO_DIR_NAME = "toDir";
+
+  private static FileSystem fs;
+  private static Path testDir;
+  private static Configuration conf;
+
+  private static int initialize(Path dir) throws Exception {
+    fs.mkdirs(dir);
+    Path fromDirPath = new Path(dir, FROM_DIR_NAME);
+    fs.mkdirs(fromDirPath);
+    Path toDirPath = new Path(dir, TO_DIR_NAME);
+    fs.mkdirs(toDirPath);
+
+    int numTotalFiles = 0;
+    int numDirs = RandomUtils.nextInt(0, 5);
+    for (int dirCount = 0; dirCount < numDirs; ++dirCount) {
+      Path subDirPath = new Path(fromDirPath, "subdir" + dirCount);
+      fs.mkdirs(subDirPath);
+      int numFiles = RandomUtils.nextInt(0, 10);
+      for (int fileCount = 0; fileCount < numFiles; ++fileCount) {
+        numTotalFiles++;
+        Path subFile = new Path(subDirPath, "file" + fileCount);
+        fs.createNewFile(subFile);
+        FSDataOutputStream output = fs.create(subFile, true);
+        for (int i = 0; i < 100; ++i) {
+          output.writeInt(i);
+          output.writeChar('\n');
+        }
+        output.close();
+      }
+    }
+
+    return numTotalFiles;
+  }
+
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new Configuration(false);
+    conf.set("fs.file.impl", LocalFileSystem.class.getName());
+    fs = FileSystem.getLocal(conf);
+    testDir = new FileSystemTestHelper().getTestRootPath(fs);
+    // don't want scheme on the path, just an absolute path
+    testDir = new Path(fs.makeQualified(testDir).toUri().getPath());
+
+    FileSystem.setDefaultUri(conf, fs.getUri());
+    fs.setWorkingDirectory(testDir);
+  }
+
+  @AfterClass
+  public static void cleanup() throws Exception {
+    fs.delete(testDir, true);
+    fs.close();
+  }
+
+  private void run(CopyCommandWithMultiThread cmd, String... args) {
+    cmd.setConf(conf);
+    assertEquals(0, cmd.run(args));
+  }
+
+  @org.junit.Test(timeout = 1)

Review comment:
   Fixed 






Issue Time Tracking
---

Worklog Id: (was: 683865)
Time Spent: 2.5h  (was: 2h 20m)


[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-19 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683864=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683864
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 09:30
Start Date: 19/Nov/21 09:30
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-973903891


   @sodonnel Thanks for your detailed review. I have removed the thread-count 
limit and fixed the problems in the unit tests. Please review again.




Issue Time Tracking
---

Worklog Id: (was: 683864)
Time Spent: 2h 20m  (was: 2h 10m)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683754=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683754
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 03:04
Start Date: 19/Nov/21 03:04
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on a change in pull request 
#3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752825112



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;

Review comment:
   
   >  I guess it is best to avoid having no limit
   
   I agree with you: no limit, with the value decided by users, is the best 
way. I set the limit only to stay consistent with the original. I will remove it.
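   
   For illustration, a minimal sketch of a pool sized directly from the user's 
request, with only a lower bound of 1. The class and method names are 
hypothetical and this is not the code in the pull request. `CallerRunsPolicy` 
is an assumption, chosen so that a full queue slows the submitting thread 
instead of rejecting tasks.
   
   ```java
   import java.util.concurrent.ArrayBlockingQueue;
   import java.util.concurrent.ThreadPoolExecutor;
   import java.util.concurrent.TimeUnit;
   import java.util.concurrent.atomic.AtomicInteger;

   /** Hypothetical sketch: thread count chosen by the user, no upper cap. */
   final class UserSizedPoolSketch {
     static final int DEFAULT_QUEUE_SIZE = 1024;

     /** Only a lower bound is enforced; any larger value is accepted as-is. */
     static int normalize(int requestedThreads) {
       return Math.max(1, requestedThreads);
     }

     /** Runs taskCount trivial tasks and returns how many completed. */
     static int runTasks(int requestedThreads, int taskCount) {
       int threads = normalize(requestedThreads);
       ThreadPoolExecutor executor = new ThreadPoolExecutor(
           threads, threads, 1, TimeUnit.SECONDS,
           new ArrayBlockingQueue<>(DEFAULT_QUEUE_SIZE),
           // Assumption: when the bounded queue is full, run in the caller.
           new ThreadPoolExecutor.CallerRunsPolicy());
       AtomicInteger completed = new AtomicInteger();
       for (int i = 0; i < taskCount; i++) {
         executor.execute(completed::incrementAndGet);
       }
       executor.shutdown();
       try {
         if (!executor.awaitTermination(1, TimeUnit.MINUTES)) {
           throw new IllegalStateException("tasks did not finish in time");
         }
       } catch (InterruptedException e) {
         Thread.currentThread().interrupt();
         throw new IllegalStateException("interrupted while waiting", e);
       }
       return completed.get();
     }
   }
   ```
   
   With a bounded queue, a very large thread count costs memory and context 
switches but cannot queue unbounded work, which is one argument for leaving the 
choice to the user.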
   
   






Issue Time Tracking
---

Worklog Id: (was: 683754)
Time Spent: 2h 10m  (was: 2h)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683753=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683753
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 19/Nov/21 02:48
Start Date: 19/Nov/21 02:48
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on a change in pull request 
#3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752825112



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;

Review comment:
   I agree with you, no limit is the best way. I set the limit only to 
stay consistent with the original limit. I will remove it.






Issue Time Tracking
---

Worklog Id: (was: 683753)
Time Spent: 2h  (was: 1h 50m)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683288=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683288
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 18/Nov/21 13:09
Start Date: 18/Nov/21 13:09
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-972850753


   Thanks for working on this @smarthanwang. The change looks mostly good to 
me. I have just a few minor comments.
   
   There is also a failing test in `hadoop.cli.TestCLI`. I have not looked into 
it, but it may be related to these changes as we are changing the CLI here. Can 
you check it too please?




Issue Time Tracking
---

Worklog Id: (was: 683288)
Time Spent: 1h 50m  (was: 1h 40m)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683284=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683284
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 18/Nov/21 13:05
Start Date: 18/Nov/21 13:05
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752221314



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;
+
+  public static final Logger LOG =
+      LoggerFactory.getLogger(CopyCommandWithMultiThread.class);
+
+  protected void setThreadCount(String optValue) {
+    if (optValue != null) {
+      int count = Integer.parseInt(optValue);
+      threadCount = count < 1 ? 1 : Math.min(count, MAX_THREAD_COUNT);

Review comment:
   Should we warn here if the thread count is being reduced due to the 
MAX_THREAD_COUNT in a similar way to `setThreadPoolQueueSize`?
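   
   A minimal sketch of what such a warning could look like, assuming the clamp 
stays in place. The class and method names are hypothetical, and 
`java.util.logging` stands in for the slf4j `LOG` used in the patch so that the 
snippet is self-contained.
   
   ```java
   import java.util.logging.Logger;

   /** Hypothetical sketch; not the code from the pull request. */
   final class ThreadCountClampSketch {
     private static final Logger LOG =
         Logger.getLogger(ThreadCountClampSketch.class.getName());

     /**
      * Returns the effective thread count for the given option value,
      * logging a warning whenever the requested value is adjusted.
      */
     static int resolveThreadCount(String optValue, int max) {
       if (optValue == null) {
         return 1; // option not given: stay single-threaded
       }
       int requested = Integer.parseInt(optValue);
       int effective = Math.max(1, Math.min(requested, max));
       if (effective != requested) {
         LOG.warning(String.format(
             "Requested thread count %d is outside [1, %d]; using %d.",
             requested, max, effective));
       }
       return effective;
     }
   }
   ```
   
   Calling `resolveThreadCount("100", 8)` in this sketch returns 8 and logs one 
warning, while `resolveThreadCount("4", 8)` returns 4 silently.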






Issue Time Tracking
---

Worklog Id: (was: 683284)
Time Spent: 1h 40m  (was: 1.5h)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683281=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683281
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 18/Nov/21 13:04
Start Date: 18/Nov/21 13:04
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752220174



##
File path: 
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/CopyCommandWithMultiThread.java
##
@@ -0,0 +1,157 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ArrayBlockingQueue;
+import java.util.concurrent.ThreadPoolExecutor;
+import java.util.concurrent.TimeUnit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.classification.VisibleForTesting;
+
+/**
+ * Abstract command to enable sub copy commands run with multi-thread.
+ */
+public abstract class CopyCommandWithMultiThread
+    extends CommandWithDestination {
+
+  private int threadCount = 1;
+  private ThreadPoolExecutor executor = null;
+  private int threadPoolQueueSize = DEFAULT_QUEUE_SIZE;
+
+  public static final int DEFAULT_QUEUE_SIZE = 1024;
+  public static final int MAX_THREAD_COUNT =
+      Runtime.getRuntime().availableProcessors() * 2;

Review comment:
   I wonder if we should limit the number of threads like this. It's hard to 
say if the copy will be CPU bound, or disk / IO bound overall. I guess it is 
best to avoid having no limit, but I wonder if having 2 * cores would be enough 
for a small VM trying to put a large dir into the cluster. What do you think? 
Maybe setting the limit to 4 or 8 * cores would be more flexible for users and 
they can experiment with their own hardware to find the best setting?






Issue Time Tracking
---

Worklog Id: (was: 683281)
Time Spent: 1.5h  (was: 1h 20m)




[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683274=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683274
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 18/Nov/21 12:58
Start Date: 18/Nov/21 12:58
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752215572



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyToLocal.java
##
@@ -0,0 +1,230 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ThreadPoolExecutor;
+
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.commons.lang3.RandomStringUtils;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileSystemTestHelper;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.shell.CopyCommands.CopyToLocal;
+
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.DEFAULT_QUEUE_SIZE;
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.MAX_THREAD_COUNT;
+import static org.junit.Assert.assertEquals;
+
+public class TestCopyToLocal {
+
+  private static final String FROM_DIR_NAME = "fromDir";
+  private static final String TO_DIR_NAME = "toDir";
+
+  private static FileSystem fs;
+  private static Path testDir;
+  private static Configuration conf;
+
+  private static int initialize(Path dir) throws Exception {
+    fs.mkdirs(dir);
+    Path fromDirPath = new Path(dir, FROM_DIR_NAME);
+    fs.mkdirs(fromDirPath);
+    Path toDirPath = new Path(dir, TO_DIR_NAME);
+    fs.mkdirs(toDirPath);
+
+    int numTotalFiles = 0;
+    int numDirs = RandomUtils.nextInt(0, 5);
+    for (int dirCount = 0; dirCount < numDirs; ++dirCount) {
+      Path subDirPath = new Path(fromDirPath, "subdir" + dirCount);
+      fs.mkdirs(subDirPath);
+      int numFiles = RandomUtils.nextInt(0, 10);
+      for (int fileCount = 0; fileCount < numFiles; ++fileCount) {
+        numTotalFiles++;
+        Path subFile = new Path(subDirPath, "file" + fileCount);
+        fs.createNewFile(subFile);
+        FSDataOutputStream output = fs.create(subFile, true);
+        for (int i = 0; i < 100; ++i) {
+          output.writeInt(i);
+          output.writeChar('\n');
+        }
+        output.close();
+      }
+    }
+
+    return numTotalFiles;
+  }
+
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new Configuration(false);
+    conf.set("fs.file.impl", LocalFileSystem.class.getName());
+    fs = FileSystem.getLocal(conf);
+    testDir = new FileSystemTestHelper().getTestRootPath(fs);
+    // don't want scheme on the path, just an absolute path
+    testDir = new Path(fs.makeQualified(testDir).toUri().getPath());
+
+    FileSystem.setDefaultUri(conf, fs.getUri());
+    fs.setWorkingDirectory(testDir);
+  }
+
+  @AfterClass
+  public static void cleanup() throws Exception {
+    fs.delete(testDir, true);
+    fs.close();
+  }
+
+  private void run(CopyCommandWithMultiThread cmd, String... args) {
+    cmd.setConf(conf);
+    assertEquals(0, cmd.run(args));
+  }
+
+  @org.junit.Test(timeout = 10000)
+  public void testCopy() throws Exception {
+    Path dir = new Path("dir" + RandomStringUtils.randomNumeric(4));
+    initialize(dir);
+    MultiThreadedCopy copy = new MultiThreadedCopy(1, DEFAULT_QUEUE_SIZE, 0);

Review comment:
   Every test starts with these two lines:
   
   ```
   Path dir = new Path("dir" + RandomStringUtils.randomNumeric(4));
   initialize(dir);
   ```
   
   Do you think it would be better to create a `@Before` method to run before 
each test?





[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-18 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683272=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683272
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 18/Nov/21 12:56
Start Date: 18/Nov/21 12:56
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on a change in pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#discussion_r752213829



##
File path: 
hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/shell/TestCopyToLocal.java
##
@@ -0,0 +1,230 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.IOException;
+import java.util.LinkedList;
+import java.util.concurrent.ThreadPoolExecutor;
+
+import org.junit.AfterClass;
+import org.junit.Assert;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import org.apache.commons.lang3.RandomStringUtils;
+import org.apache.commons.lang3.RandomUtils;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FileSystemTestHelper;
+import org.apache.hadoop.fs.LocalFileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.shell.CopyCommands.CopyToLocal;
+
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.DEFAULT_QUEUE_SIZE;
+import static org.apache.hadoop.fs.shell.CopyCommandWithMultiThread.MAX_THREAD_COUNT;
+import static org.junit.Assert.assertEquals;
+
+public class TestCopyToLocal {
+
+  private static final String FROM_DIR_NAME = "fromDir";
+  private static final String TO_DIR_NAME = "toDir";
+
+  private static FileSystem fs;
+  private static Path testDir;
+  private static Configuration conf;
+
+  private static int initialize(Path dir) throws Exception {
+    fs.mkdirs(dir);
+    Path fromDirPath = new Path(dir, FROM_DIR_NAME);
+    fs.mkdirs(fromDirPath);
+    Path toDirPath = new Path(dir, TO_DIR_NAME);
+    fs.mkdirs(toDirPath);
+
+    int numTotalFiles = 0;
+    int numDirs = RandomUtils.nextInt(0, 5);
+    for (int dirCount = 0; dirCount < numDirs; ++dirCount) {
+      Path subDirPath = new Path(fromDirPath, "subdir" + dirCount);
+      fs.mkdirs(subDirPath);
+      int numFiles = RandomUtils.nextInt(0, 10);
+      for (int fileCount = 0; fileCount < numFiles; ++fileCount) {
+        numTotalFiles++;
+        Path subFile = new Path(subDirPath, "file" + fileCount);
+        fs.createNewFile(subFile);
+        FSDataOutputStream output = fs.create(subFile, true);
+        for (int i = 0; i < 100; ++i) {
+          output.writeInt(i);
+          output.writeChar('\n');
+        }
+        output.close();
+      }
+    }
+
+    return numTotalFiles;
+  }
+
+  @BeforeClass
+  public static void init() throws Exception {
+    conf = new Configuration(false);
+    conf.set("fs.file.impl", LocalFileSystem.class.getName());
+    fs = FileSystem.getLocal(conf);
+    testDir = new FileSystemTestHelper().getTestRootPath(fs);
+    // don't want scheme on the path, just an absolute path
+    testDir = new Path(fs.makeQualified(testDir).toUri().getPath());
+
+    FileSystem.setDefaultUri(conf, fs.getUri());
+    fs.setWorkingDirectory(testDir);
+  }
+
+  @AfterClass
+  public static void cleanup() throws Exception {
+    fs.delete(testDir, true);
+    fs.close();
+  }
+
+  private void run(CopyCommandWithMultiThread cmd, String... args) {
+    cmd.setConf(conf);
+    assertEquals(0, cmd.run(args));
+  }
+
+  @org.junit.Test(timeout = 10000)

Review comment:
   Nit: There is a mixture of `@org.junit.Test` and `@Test` annotations in 
this class - can you change them all to just `@Test`?






Issue Time Tracking
---

  

[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683123=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683123
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 18/Nov/21 07:26
Start Date: 18/Nov/21 07:26
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-972603877


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 34s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +0 :ok: |  markdownlint  |   0m  1s |  |  markdownlint was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 26s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 54s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  19m  9s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  8s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 39s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 12s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 45s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | -1 :x: |  spotbugs  |   2m 28s | [/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/artifact/out/branch-spotbugs-hadoop-common-project_hadoop-common-warnings.html) |  hadoop-common-project/hadoop-common in trunk has 1 extant spotbugs warnings.  |
   | +1 :green_heart: |  shadedclient  |  22m  8s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 57s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  21m 18s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  21m 18s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m  9s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m  9s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  7s |  |  hadoop-common-project/hadoop-common: The patch generated 0 new + 33 unchanged - 8 fixed = 33 total (was 41)  |
   | +1 :green_heart: |  mvnsite  |   1m 37s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m 10s | [/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-common in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | +1 :green_heart: |  javadoc  |   1m 42s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 36s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 53s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  17m 18s | 
[/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt)
 |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   0m 58s |  |  The patch does not 
generate ASF License warnings.  |
   |  |   | 194m 25s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.cli.TestCLI |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/2/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3645 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell markdownlint |
   | uname | 

[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=683078=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-683078
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 18/Nov/21 04:14
Start Date: 18/Nov/21 04:14
Worklog Time Spent: 10m 
  Work Description: smarthanwang commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-972522580


   @sodonnel Thanks for your comment. I have added the -t and -q parameters to the 
docs and fixed some other mistakes in the docs along the way.  




Issue Time Tracking
---

Worklog Id: (was: 683078)
Time Spent: 50m  (was: 40m)

> Enable get command run with multi-thread
> 
>
> Key: HADOOP-17998
> URL: https://issues.apache.org/jira/browse/HADOOP-17998
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
> Affects Versions: 3.3.1
> Reporter: Chengwei Wang
> Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-17998.001.patch, HADOOP-17998.002.patch
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
> CopyFromLocal/Put can already run with multiple threads since HDFS-11786 and 
> HADOOP-14698, which makes putting directories or multiple files faster.
> So it is necessary to enable the get and copyToLocal commands to run with 
> multiple threads as well.






[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=682480=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-682480
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 17/Nov/21 09:48
Start Date: 17/Nov/21 09:48
Worklog Time Spent: 10m 
  Work Description: sodonnel edited a comment on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-971410375


   This seems like a good change. I will try to review in more detail in the 
next few days.
   
   Could you add the new -t parameter to the docs in 
`./hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md` as 
part of this PR please? 




Issue Time Tracking
---

Worklog Id: (was: 682480)
Time Spent: 40m  (was: 0.5h)

> Enable get command run with multi-thread
> 
>
> Key: HADOOP-17998
> URL: https://issues.apache.org/jira/browse/HADOOP-17998
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
> Affects Versions: 3.3.1
> Reporter: Chengwei Wang
> Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-17998.001.patch, HADOOP-17998.002.patch
>
>  Time Spent: 40m
>  Remaining Estimate: 0h
>
> CopyFromLocal/Put can already run with multiple threads since HDFS-11786 and 
> HADOOP-14698, which makes putting directories or multiple files faster.
> So it is necessary to enable the get and copyToLocal commands to run with 
> multiple threads as well.






[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-17 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=682477=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-682477
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 17/Nov/21 09:48
Start Date: 17/Nov/21 09:48
Worklog Time Spent: 10m 
  Work Description: sodonnel commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-971410375


   This seems like a good change. I will try to review in more detail in the next 
few days.
   
   Could you add the new -t parameter to the docs in 
`./hadoop-common-project/hadoop-common/src/site/markdown/FileSystemShell.md` as 
part of this PR please? 




Issue Time Tracking
---

Worklog Id: (was: 682477)
Time Spent: 0.5h  (was: 20m)

> Enable get command run with multi-thread
> 
>
> Key: HADOOP-17998
> URL: https://issues.apache.org/jira/browse/HADOOP-17998
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
> Affects Versions: 3.3.1
> Reporter: Chengwei Wang
> Priority: Major
>  Labels: pull-request-available
> Attachments: HADOOP-17998.001.patch, HADOOP-17998.002.patch
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> CopyFromLocal/Put can already run with multiple threads since HDFS-11786 and 
> HADOOP-14698, which makes putting directories or multiple files faster.
> So it is necessary to enable the get and copyToLocal commands to run with 
> multiple threads as well.






[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=680178=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-680178
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 11/Nov/21 07:29
Start Date: 11/Nov/21 07:29
Worklog Time Spent: 10m 
  Work Description: hadoop-yetus commented on pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645#issuecomment-966055357


   :broken_heart: **-1 overall**
   
   
   
   
   
   
   | Vote | Subsystem | Runtime |  Logfile | Comment |
   |:----:|----------:|--------:|:--------:|:-------:|
   | +0 :ok: |  reexec  |   0m 47s |  |  Docker mode activated.  |
   |||| _ Prechecks _ |
   | +1 :green_heart: |  dupname  |   0m  0s |  |  No case conflicting files found.  |
   | +0 :ok: |  codespell  |   0m  1s |  |  codespell was not available.  |
   | +1 :green_heart: |  @author  |   0m  0s |  |  The patch does not contain any @author tags.  |
   | +1 :green_heart: |  test4tests  |   0m  0s |  |  The patch appears to include 3 new or modified test files.  |
   |||| _ trunk Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |  32m 23s |  |  trunk passed  |
   | +1 :green_heart: |  compile  |  21m 57s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  compile  |  18m 53s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  checkstyle  |   1m  9s |  |  trunk passed  |
   | +1 :green_heart: |  mvnsite  |   1m 38s |  |  trunk passed  |
   | +1 :green_heart: |  javadoc  |   1m 10s |  |  trunk passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javadoc  |   1m 40s |  |  trunk passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 26s |  |  trunk passed  |
   | +1 :green_heart: |  shadedclient  |  21m 33s |  |  branch has no errors when building and testing our client artifacts.  |
   |||| _ Patch Compile Tests _ |
   | +1 :green_heart: |  mvninstall  |   0m 56s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  20m 59s |  |  the patch passed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04  |
   | +1 :green_heart: |  javac  |  20m 59s |  |  the patch passed  |
   | +1 :green_heart: |  compile  |  19m 16s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  javac  |  19m 16s |  |  the patch passed  |
   | +1 :green_heart: |  blanks  |   0m  0s |  |  The patch has no blanks issues.  |
   | +1 :green_heart: |  checkstyle  |   1m  6s |  |  hadoop-common-project/hadoop-common: The patch generated 0 new + 33 unchanged - 8 fixed = 33 total (was 41)  |
   | +1 :green_heart: |  mvnsite  |   1m 34s |  |  the patch passed  |
   | -1 :x: |  javadoc  |   1m  6s | [/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/1/artifact/out/patch-javadoc-hadoop-common-project_hadoop-common-jdkUbuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.txt) |  hadoop-common in the patch failed with JDK Ubuntu-11.0.11+9-Ubuntu-0ubuntu2.20.04.  |
   | +1 :green_heart: |  javadoc  |   1m 41s |  |  the patch passed with JDK Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10  |
   | +1 :green_heart: |  spotbugs  |   2m 33s |  |  the patch passed  |
   | +1 :green_heart: |  shadedclient  |  21m 49s |  |  patch has no errors when building and testing our client artifacts.  |
   |||| _ Other Tests _ |
   | -1 :x: |  unit  |  17m 24s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/1/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) |  hadoop-common in the patch passed.  |
   | +1 :green_heart: |  asflicense  |   1m  0s |  |  The patch does not generate ASF License warnings.  |
   |  |   | 193m 40s |  |  |
   
   
   | Reason | Tests |
   |-------:|:------|
   | Failed junit tests | hadoop.cli.TestCLI |
   
   
   | Subsystem | Report/Notes |
   |----------:|:-------------|
   | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3645/1/artifact/out/Dockerfile |
   | GITHUB PR | https://github.com/apache/hadoop/pull/3645 |
   | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient spotbugs checkstyle codespell |
   | uname | Linux eb09f5500377 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
   | Build tool | maven |
   | Personality | dev-support/bin/hadoop.sh |
   | git revision | trunk / ee47b1a001a3f68d7d9bb0f6ac1878315a14cbce |
   | Default Java | Private Build-1.8.0_292-8u292-b10-0ubuntu1~20.04-b10 |
   | Multi-JDK versions | 

[jira] [Work logged] (HADOOP-17998) Enable get command run with multi-thread

2021-11-10 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HADOOP-17998?focusedWorklogId=680144=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-680144
 ]

ASF GitHub Bot logged work on HADOOP-17998:
---

Author: ASF GitHub Bot
Created on: 11/Nov/21 04:14
Start Date: 11/Nov/21 04:14
Worklog Time Spent: 10m 
  Work Description: smarthanwang opened a new pull request #3645:
URL: https://github.com/apache/hadoop/pull/3645


   ### Description
   CopyFromLocal/Put can already run with multiple threads since HDFS-11786 and 
HADOOP-14698, which makes putting directories or multiple files faster. So it is 
necessary to enable the get and copyToLocal commands to run with multiple threads as well.
   
   ### Tests
   Test case: 1 directory, 240 files, 13G in total
   
   **1. Get with single thread.**
   
   ```
   time hadoop fs -get /tmp/data/20211101 .
   real    6m28.785s
   user    0m18.844s
   sys     0m44.953s
   ```
   **2. Get with 10 threads.**
   ```
   time hadoop fs -get -t 10 /tmp/data/20211101 .
   real    0m58.721s
   user    0m21.386s
   sys     0m54.066s
   ```
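
The two timings above can be compared directly. The following is a minimal sketch, not part of the patch, that parses the `real` line of each `time` report and computes the wall-clock speedup; the helper name `parse_real_time` is hypothetical and only the two values reported above are used. For these numbers the speedup comes out at roughly 6.6x with 10 threads.

```python
import re


def parse_real_time(text: str) -> float:
    """Return the 'real' (wall-clock) time in seconds from shell `time` output.

    Hypothetical helper for illustration; accepts 'real 6m28.785s' with or
    without whitespace between the label and the value.
    """
    match = re.search(r"real\s*(\d+)m([\d.]+)s", text)
    if match is None:
        raise ValueError("no 'real' line found in time output")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Values copied from the two test runs in this description.
single = parse_real_time("real 6m28.785s")   # hadoop fs -get
multi = parse_real_time("real 0m58.721s")    # hadoop fs -get -t 10

speedup = single / multi
print(f"single thread: {single:.3f}s, 10 threads: {multi:.3f}s, speedup: {speedup:.2f}x")
```

Note that user and sys time are slightly higher in the multi-threaded run, so the gain is in elapsed time rather than total CPU work.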
   
   
   
   




Issue Time Tracking
---

Worklog Id: (was: 680144)
Remaining Estimate: 0h
Time Spent: 10m

> Enable get command run with multi-thread
> 
>
> Key: HADOOP-17998
> URL: https://issues.apache.org/jira/browse/HADOOP-17998
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: fs
> Affects Versions: 3.3.1
> Reporter: Chengwei Wang
> Priority: Major
> Attachments: HADOOP-17998.001.patch, HADOOP-17998.002.patch
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> CopyFromLocal/Put can already run with multiple threads since HDFS-11786 and 
> HADOOP-14698, which makes putting directories or multiple files faster.
> So it is necessary to enable the get and copyToLocal commands to run with 
> multiple threads as well.


