[GitHub] [hadoop] bshashikant commented on pull request #2355: HDFS-15611. Add list Snapshot command in WebHDFS.
bshashikant commented on pull request #2355:
URL: https://github.com/apache/hadoop/pull/2355#issuecomment-702536750

> Thanks @bshashikant . Should we add `SnapshotStatus[] getSnapshotListing(Path snapshotRoot)` to FileSystem? If yes, we should move SnapshotStatus to org.apache.hadoop.fs and declare dirStatus using FileStatus.

Thanks @szetszwo . I checked the code and saw that "getSnapshottableDirectoryList()" is also not moved to the FileSystem class yet. I would prefer not to move any of these to the FileSystem class here.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493820 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:19
Start Date: 02/Oct/20 05:19
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498622029

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java
## @@ -236,7 +237,7 @@ public synchronized int compress(byte[] b, int off, int len)
 }
 // Compress data
-n = useLz4HC ? compressBytesDirectHC() : compressBytesDirect();
+n = compressBytesDirect();

Review comment: fixed.

Issue Time Tracking
-------------------
Worklog Id: (was: 493820)
Time Spent: 1h 10m (was: 1h)

> Using lz4-java in Lz4Codec
> --------------------------
>
>                 Key: HADOOP-17292
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17292
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: common
>    Affects Versions: 3.3.0
>            Reporter: L. C. Hsieh
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the lz4 codec, which has several disadvantages. It requires native libhadoop to be installed in the system LD_LIBRARY_PATH, and it has to be installed separately on each node of the clusters, container images, or local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from source, which is non-trivial. Also, this approach is platform dependent; the binary may not work on a different platform, so it requires recompilation.
> It requires extra configuration of java.library.path to load the natives, and it results in higher application deployment and maintenance cost for users.
> Projects such as Spark use [lz4-java|https://github.com/lz4/lz4-java], which is a JNI-based implementation. It contains native binaries in the jar file, and it can automatically load the native binaries into the JVM from the jar without any setup. If a native implementation cannot be found for a platform, it can fall back to a pure-Java implementation of lz4.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
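The load-native-or-fall-back behaviour described in the issue can be sketched as follows. This is a minimal illustration of the pattern, not lz4-java's actual internals: the library name "lz4demo" and the identity "compressor" are placeholders.

```java
// Sketch of the fallback-loading pattern the issue describes: try to load a
// native library, and fall back to a pure-Java implementation when it is
// absent. Names here are illustrative; real lz4-java ships native binaries
// inside its jar and a real pure-Java LZ4 implementation.
public class NativeFallbackDemo {

    interface Compressor {
        byte[] compress(byte[] src);
    }

    // Pure-Java stand-in used when no native library is available.
    static class JavaCompressor implements Compressor {
        @Override
        public byte[] compress(byte[] src) {
            return src.clone(); // placeholder: a real impl would emit LZ4 blocks
        }
    }

    static Compressor load() {
        try {
            // Native path first; throws UnsatisfiedLinkError when the library
            // cannot be found for this platform.
            System.loadLibrary("lz4demo");
            throw new IllegalStateException("native path not wired in this sketch");
        } catch (UnsatisfiedLinkError e) {
            // Graceful fallback: no java.library.path configuration needed.
            return new JavaCompressor();
        }
    }

    public static void main(String[] args) {
        Compressor c = load();
        System.out.println(c.getClass().getSimpleName());
    }
}
```

Because the jar carries both paths, deployments need neither libhadoop on LD_LIBRARY_PATH nor per-node recompilation, which is the gain the issue is after.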
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493819&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493819 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:15
Start Date: 02/Oct/20 05:15
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498621489

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestCodec.java
## @@ -143,22 +143,16 @@ public void testSnappyCodec() throws IOException {
 @Test
 public void testLz4Codec() throws IOException {
-if (NativeCodeLoader.isNativeCodeLoaded()) {
-  if (Lz4Codec.isNativeCodeLoaded()) {
-    conf.setBoolean(
+conf.setBoolean(
     CommonConfigurationKeys.IO_COMPRESSION_CODEC_LZ4_USELZ4HC_KEY,

Review comment: fixed.

Issue Time Tracking
-------------------
Worklog Id: (was: 493819)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493818 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:13
Start Date: 02/Oct/20 05:13
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498621151

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
## @@ -494,8 +494,7 @@ public String getName() {
 private static boolean isAvailable(TesterPair pair) {
 Compressor compressor = pair.compressor;
-if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-    && (NativeCodeLoader.isNativeCodeLoaded()))
+if (compressor.getClass().isAssignableFrom(Lz4Compressor.class))

Review comment: Added compatibility test.

Issue Time Tracking
-------------------
Worklog Id: (was: 493818)
Time Spent: 50m (was: 40m)
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205970#comment-17205970 ]

Attila Doroszlai commented on HADOOP-16990:
-------------------------------------------

Sure, I ran these tests before uploading the patch:
{code}
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.web.oauth2.TestRefreshTokenTimeBasedTokenRefresher
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.006 s - in org.apache.hadoop.hdfs.web.oauth2.TestRefreshTokenTimeBasedTokenRefresher
[INFO] Running org.apache.hadoop.hdfs.web.oauth2.TestClientCredentialTimeBasedTokenRefresher
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.908 s - in org.apache.hadoop.hdfs.web.oauth2.TestClientCredentialTimeBasedTokenRefresher
[INFO] Running org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.2 s - in org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
{code}
But it looks like Yetus also runs these tests with the patch:
https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/85/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt

> Update Mockserver
> -----------------
>
>                 Key: HADOOP-16990
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16990
>             Project: Hadoop Common
>          Issue Type: Task
>            Reporter: Wei-Chiu Chuang
>            Assignee: Attila Doroszlai
>            Priority: Major
>         Attachments: HADOOP-16990.001.patch
>
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hadoop] szetszwo commented on a change in pull request #2355: HDFS-15611. Add list Snapshot command in WebHDFS.
szetszwo commented on a change in pull request #2355:
URL: https://github.com/apache/hadoop/pull/2355#discussion_r498605120

## File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
## @@ -1459,6 +1460,19 @@ SnapshotDiffReport decodeResponse(Map json) {
 }.run();
 }
+  public SnapshotStatus[] getSnapshotList(final Path snapshotDir)

Review comment: In DistributedFileSystem, it is called `public SnapshotStatus[] getSnapshotListing(Path snapshotRoot)`. Let's use the same name?

## File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotStatus.java
## @@ -25,6 +25,7 @@
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.fs.permission.FsPermission;
 import org.apache.hadoop.hdfs.DFSUtilClient;
+import org.apache.hadoop.util.StringUtils;

Review comment: Unused import.

## File path: hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java
## @@ -1582,6 +1584,18 @@ public SnapshotDiffReport getSnapshotDiffReport(Path path,
     return JsonUtilClient.toSnapshottableDirectoryList(json);
 }
+  public SnapshotStatus[] getSnapshotList(Path snapshotRoot)

Review comment: It should call getSnapshotListing.

## File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/JsonUtilClient.java
## @@ -872,4 +873,39 @@ private static SnapshottableDirectoryStatus toSnapshottableDirectoryStatus(
     snapshotQuota, parentFullPath);
     return snapshottableDirectoryStatus;
 }
+
+  public static SnapshotStatus[] toSnapshotList(final Map json) {
+    if (json == null) {
+      return null;
+    }
+    List list = (List) json.get("SnapshotList");
+    if (list == null) {
+      return null;
+    }
+    SnapshotStatus[] statuses = new SnapshotStatus[list.size()];
+    for (int i = 0; i < list.size(); i++) {
+      statuses[i] = toSnapshotStatus((Map) list.get(i));
+    }
+    return statuses;
+  }
+
+  private static SnapshotStatus toSnapshotStatus(Map json) {
+    if (json == null) {
+      return null;
+    }
+    int snapshotID = getInt(json, "snapshotID", 0);
+    boolean isDeleted = ((String) json.get("deletionStatus")).contentEquals("DELETED");

Review comment: Use `"DELETED".equals(..)` in order to avoid an NPE:
`final boolean isDeleted = "DELETED".equals(json.get("deletionStatus"));`
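The reviewer's last point is the classic null-safe comparison idiom: calling `.equals` on the string literal simply returns false when the JSON key is absent, while calling a method on the looked-up value throws a NullPointerException. A standalone illustration (the helper names are ours, not Hadoop's):

```java
// Demonstrates why "DELETED".equals(value) is preferred over
// value.equals("DELETED") / value.contentEquals("DELETED") when the value
// may be null (e.g. a missing "deletionStatus" key in parsed JSON).
public class NullSafeEqualsDemo {

    // Constant-first comparison: null-tolerant, returns false for null.
    static boolean isDeleted(Object deletionStatus) {
        return "DELETED".equals(deletionStatus);
    }

    // Value-first comparison: throws NullPointerException when the key is missing.
    static boolean isDeletedUnsafe(Object deletionStatus) {
        return ((String) deletionStatus).contentEquals("DELETED");
    }

    public static void main(String[] args) {
        System.out.println(isDeleted(null));      // false, no exception
        System.out.println(isDeleted("DELETED")); // true
        try {
            isDeletedUnsafe(null);
        } catch (NullPointerException e) {
            System.out.println("NPE from value-first comparison");
        }
    }
}
```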
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205919#comment-17205919 ]

Wei-Chiu Chuang commented on HADOOP-16990:
------------------------------------------

Looks good... Attila, can you also manually verify that the HDFS tests that use Mockserver pass? The Hadoop precommit bypasses HDFS tests, so you'll need an additional manual check.
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205903#comment-17205903 ]

Hadoop QA commented on HADOOP-16990:
------------------------------------

| (/) *{color:green}+1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 40m 49s{color} | | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 28s{color} | | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 30s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 27s{color} | | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 46s{color} | | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 28s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 23m 9s{color} | | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s{color} | | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 21s{color} | | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 32s{color} | | {color:blue} branch/hadoop-project no findbugs output file (findbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 15s{color} | | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 16s{color} | | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 16s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m 4s{color} | | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 24m 4s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 5s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 55s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 32s{color} | | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s{color} | | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color}
[jira] [Work logged] (HADOOP-17265) ABFS: Support for Client Correlation ID
[ https://issues.apache.org/jira/browse/HADOOP-17265?focusedWorklogId=493703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493703 ]

ASF GitHub Bot logged work on HADOOP-17265:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 01/Oct/20 21:42
Start Date: 01/Oct/20 21:42
Worklog Time Spent: 10m
Work Description: sumangala-patki commented on pull request #2344:
URL: https://github.com/apache/hadoop/pull/2344#issuecomment-702413487

Test Results

HNS-Enabled Account (Location: East US 2)
```
Authentication type: SharedKey
[INFO] Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 458, Failures: 0, Errors: 0, Skipped: 42
[WARNING] Tests run: 207, Failures: 0, Errors: 0, Skipped: 24

Authentication type: OAuth
[INFO] Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 458, Failures: 0, Errors: 0, Skipped: 75
[WARNING] Tests run: 207, Failures: 0, Errors: 0, Skipped: 140
```

HNS-Disabled Account (Location: East US 2, Central US)
```
Authentication type: SharedKey
[INFO] Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testSkipBounds:196->Assert.assertTrue:41->Assert.fail:88 There should not be any network I/O (elapsedTimeMs=53).
[ERROR] Tests run: 458, Failures: 1, Errors: 0, Skipped: 246
[WARNING] Tests run: 207, Failures: 0, Errors: 0, Skipped: 24

Authentication type: OAuth
[INFO] Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testValidateSeekBounds:245->Assert.assertTrue:41->Assert.fail:88 There should not be any network I/O (elapsedTimeMs=22).
[ERROR] Tests run: 458, Failures: 1, Errors: 0, Skipped: 246
[WARNING] Tests run: 207, Failures: 0, Errors: 0, Skipped: 24
```

Issue Time Tracking
-------------------
Worklog Id: (was: 493703)
Time Spent: 50m (was: 40m)

> ABFS: Support for Client Correlation ID
> ---------------------------------------
>
>                 Key: HADOOP-17265
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17265
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Sumangala Patki
>            Priority: Major
>              Labels: abfsactive, pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Introducing a client correlation ID that appears in the Azure diagnostic logs

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HADOOP-17265) ABFS: Support for Client Correlation ID
[ https://issues.apache.org/jira/browse/HADOOP-17265?focusedWorklogId=493698&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493698 ]

ASF GitHub Bot logged work on HADOOP-17265:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 01/Oct/20 21:33
Start Date: 01/Oct/20 21:33
Worklog Time Spent: 10m
Work Description: sumangala-patki commented on a change in pull request #2344:
URL: https://github.com/apache/hadoop/pull/2344#discussion_r498524223

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java
## @@ -0,0 +1,61 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import java.util.UUID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN;
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID;
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH;
+
+public class TrackingContext {
+  private String clientCorrelationID;
+  private String clientRequestID;
+  private static final Logger LOG = LoggerFactory.getLogger(
+      org.apache.hadoop.fs.azurebfs.services.AbfsClient.class);
+
+  public TrackingContext(String clientCorrelationID) {
+    // validation
+    if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) ||
+        (!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) {
+      this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID;
+      LOG.debug("Invalid config provided; correlation id not included in header.");
+    } else if (clientCorrelationID.length() > 0) {
+      this.clientCorrelationID = clientCorrelationID + ":";
+      LOG.debug("Client correlation id has been validated and set successfully.");

Review comment: Success log omitted

Issue Time Tracking
-------------------
Worklog Id: (was: 493698)
Time Spent: 40m (was: 0.5h)
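The validation step in the TrackingContext constructor above can be sketched standalone. Note the constants here are assumptions for illustration only: `MAX_LEN` and `ID_PATTERN` stand in for Hadoop's `MAX_CLIENT_CORRELATION_ID_LENGTH` and `CLIENT_CORRELATION_ID_PATTERN`, whose actual values are defined in FileSystemConfigurations.

```java
// Sketch of correlation-id validation: an id that is too long or contains
// disallowed characters is dropped; a valid non-empty id becomes a "<id>:"
// prefix for the per-request id. MAX_LEN and ID_PATTERN are assumed values,
// not the real Hadoop constants.
public class CorrelationIdValidation {
    static final int MAX_LEN = 72;                    // assumed length limit
    static final String ID_PATTERN = "[a-zA-Z0-9-]*"; // assumed allowed charset

    // Returns the header prefix: "<id>:" when valid and non-empty, "" otherwise.
    static String correlationPrefix(String id) {
        if (id.length() > MAX_LEN || !id.matches(ID_PATTERN)) {
            return ""; // invalid config: omit the correlation id from the header
        }
        return id.isEmpty() ? "" : id + ":";
    }

    public static void main(String[] args) {
        System.out.println(correlationPrefix("client-01")); // "client-01:"
        System.out.println(correlationPrefix("bad id!"));   // "" (space and '!' rejected)
    }
}
```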
[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2344: HADOOP-17265. ABFS: Support for Client Correlation ID
sumangala-patki commented on a change in pull request #2344: URL: https://github.com/apache/hadoop/pull/2344#discussion_r498524223 ## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java ## @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.utils; + +import java.util.UUID; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH; + +public class TrackingContext { + private String clientCorrelationID; + private String clientRequestID; + private static final Logger LOG = LoggerFactory.getLogger( + org.apache.hadoop.fs.azurebfs.services.AbfsClient.class); + + public TrackingContext(String clientCorrelationID) { +//validation +if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) || +(!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) { + this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; + LOG.debug("Invalid config provided; correlation id not included in header."); +} +else if (clientCorrelationID.length() > 0) { + this.clientCorrelationID = clientCorrelationID + ":"; + LOG.debug("Client correlation id has been validated and set successfully."); Review comment: Success log omitted This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
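The validation under review above checks both the length and the character pattern of a client-supplied correlation ID before including it in the request header. A minimal standalone sketch of that check follows; the constant values and the class name are assumptions for illustration only, since the real limits live in `FileSystemConfigurations`, which is not shown in this diff.

```java
// Hypothetical sketch of the correlation-id validation discussed in the
// review above. MAX_CLIENT_CORRELATION_ID_LENGTH and
// CLIENT_CORRELATION_ID_PATTERN are assumed values, not the real
// FileSystemConfigurations constants.
public class CorrelationIdCheck {

    // Assumed values; the real ones come from FileSystemConfigurations.
    static final int MAX_CLIENT_CORRELATION_ID_LENGTH = 72;
    static final String CLIENT_CORRELATION_ID_PATTERN = "[a-zA-Z0-9-]*";

    /** Mirrors the validation branch: reject over-long or ill-formed ids. */
    static boolean isValid(String id) {
        return id.length() <= MAX_CLIENT_CORRELATION_ID_LENGTH
            && id.matches(CLIENT_CORRELATION_ID_PATTERN);
    }

    /** An invalid id falls back to the default; a valid one gets ":" appended. */
    static String toHeaderPrefix(String id, String defaultId) {
        if (!isValid(id)) {
            return defaultId; // invalid config: correlation id not included
        }
        return id.isEmpty() ? id : id + ":";
    }
}
```

The empty-valid case (returning the id unchanged) is a guess here; the diff only shows the invalid branch and the `length() > 0` branch.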
[jira] [Work logged] (HADOOP-17183) ABFS: Enable checkaccess API
[ https://issues.apache.org/jira/browse/HADOOP-17183?focusedWorklogId=493676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493676 ]
ASF GitHub Bot logged work on HADOOP-17183:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 20:31
Start Date: 01/Oct/20 20:31
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2331: URL: https://github.com/apache/hadoop/pull/2331#issuecomment-702380473
+1, merged to 3.3 and trunk
Issue Time Tracking
---
Worklog Id: (was: 493676)
Time Spent: 40m (was: 0.5h)
> ABFS: Enable checkaccess API
>
> Key: HADOOP-17183
> URL: https://issues.apache.org/jira/browse/HADOOP-17183
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Bilahari T H
> Assignee: Bilahari T H
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.1
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Enable check access on ABFS. It is currently disabled by default.
[GitHub] [hadoop] steveloughran commented on pull request #2331: HADOOP-17183. ABFS: Enabling checkaccess on ABFS
steveloughran commented on pull request #2331: URL: https://github.com/apache/hadoop/pull/2331#issuecomment-702380473 +1, merged to 3.3 and trunk
[jira] [Updated] (HADOOP-17183) ABFS: Enable checkaccess API
[ https://issues.apache.org/jira/browse/HADOOP-17183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-17183:
---
Fix Version/s: 3.3.1
Resolution: Fixed
Status: Resolved (was: Patch Available)
> ABFS: Enable checkaccess API
>
> Key: HADOOP-17183
> URL: https://issues.apache.org/jira/browse/HADOOP-17183
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Bilahari T H
> Assignee: Bilahari T H
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.1
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Enable check access on ABFS. It is currently disabled by default.
[jira] [Work logged] (HADOOP-17183) ABFS: Enable checkaccess API
[ https://issues.apache.org/jira/browse/HADOOP-17183?focusedWorklogId=493675&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493675 ]
ASF GitHub Bot logged work on HADOOP-17183:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 20:29
Start Date: 01/Oct/20 20:29
Worklog Time Spent: 10m
Work Description: steveloughran merged pull request #2331: URL: https://github.com/apache/hadoop/pull/2331
Issue Time Tracking
---
Worklog Id: (was: 493675)
Time Spent: 0.5h (was: 20m)
> ABFS: Enable checkaccess API
>
> Key: HADOOP-17183
> URL: https://issues.apache.org/jira/browse/HADOOP-17183
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Bilahari T H
> Assignee: Bilahari T H
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Enable check access on ABFS. It is currently disabled by default.
[GitHub] [hadoop] steveloughran merged pull request #2331: HADOOP-17183. ABFS: Enabling checkaccess on ABFS
steveloughran merged pull request #2331: URL: https://github.com/apache/hadoop/pull/2331
[jira] [Comment Edited] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205803#comment-17205803 ]
Attila Doroszlai edited comment on HADOOP-16990 at 10/1/20, 8:25 PM:
---
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
With [^HADOOP-16990.001.patch]:
{code}
[INFO] +- org.mock-server:mockserver-netty:jar:5.11.1:test
[INFO] | +- org.mock-server:mockserver-client-java:jar:5.11.1:test
[INFO] | \- org.mock-server:mockserver-core:jar:5.11.1:test
[INFO] | +- com.lmax:disruptor:jar:3.4.2:test
[INFO] | +- io.netty:netty-codec-socks:jar:4.1.50.Final:test
{code}
BTW, MockServer is only used in HDFS Client:
{code}
hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSOAuth2.java
hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/web/oauth2/TestClientCredentialTimeBasedTokenRefresher.java
hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/web/oauth2/TestRefreshTokenTimeBasedTokenRefresher.java
{code}
was (Author: adoroszlai):
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
With [^HADOOP-16990.001.patch]:
{code}
[INFO] +- org.mock-server:mockserver-netty:jar:5.11.1:test
[INFO] | +- org.mock-server:mockserver-client-java:jar:5.11.1:test
[INFO] | \- org.mock-server:mockserver-core:jar:5.11.1:test
[INFO] | +- com.lmax:disruptor:jar:3.4.2:test
[INFO] | +- io.netty:netty-codec-socks:jar:4.1.50.Final:test
{code}
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
> Attachments: HADOOP-16990.001.patch
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Updated] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HADOOP-16990: -- Status: Patch Available (was: In Progress) > Update Mockserver > - > > Key: HADOOP-16990 > URL: https://issues.apache.org/jira/browse/HADOOP-16990 > Project: Hadoop Common > Issue Type: Task >Reporter: Wei-Chiu Chuang >Assignee: Attila Doroszlai >Priority: Major > Attachments: HADOOP-16990.001.patch > > > We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Comment Edited] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205803#comment-17205803 ]
Attila Doroszlai edited comment on HADOOP-16990 at 10/1/20, 8:22 PM:
---
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
With [^HADOOP-16990.001.patch]:
{code}
[INFO] +- org.mock-server:mockserver-netty:jar:5.11.1:test
[INFO] | +- org.mock-server:mockserver-client-java:jar:5.11.1:test
[INFO] | \- org.mock-server:mockserver-core:jar:5.11.1:test
[INFO] | +- com.lmax:disruptor:jar:3.4.2:test
[INFO] | +- io.netty:netty-codec-socks:jar:4.1.50.Final:test
{code}
was (Author: adoroszlai):
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
> Attachments: HADOOP-16990.001.patch
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[GitHub] [hadoop] steveloughran commented on a change in pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on a change in pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#discussion_r498491655 ## File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java ## @@ -348,6 +348,16 @@ public Path getWorkPath() throws IOException { * @param context the job's context */ public void setupJob(JobContext context) throws IOException { +// Downgrade v2 to v1 with a warning. +if (algorithmVersion == 2) { + Logger log = LoggerFactory.getLogger( + "org.apache.hadoop.mapreduce.lib.output." + + "FileOutputCommitter.Algorithm"); + + log.warn("The v2 commit algorithm is deprecated;" + + " please switch to the v1 algorithm"); Review comment: switching to your text
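The diff in the review above emits a deprecation warning at job setup when the v2 commit algorithm is configured. The guard can be modelled in isolation like this; the class, the list-based "log", and the method name are inventions for the sketch (the real code uses an SLF4J logger inside `FileOutputCommitter`), and only the warning path shown in the diff is modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the setupJob() guard from the diff above. Names are
// illustrative; warnings are collected in a list instead of SLF4J so the
// sketch is self-contained.
public class CommitterGuard {
    static final List<String> WARNINGS = new ArrayList<>();

    /** Warn when the deprecated v2 algorithm is requested. */
    static int algorithmWithWarning(int requested) {
        if (requested == 2) {
            WARNINGS.add("The v2 commit algorithm is deprecated;"
                + " please switch to the v1 algorithm");
        }
        return requested;
    }
}
```

Whether the job then actually runs v1 or v2 is decided elsewhere in the patch; this sketch deliberately leaves the requested value unchanged.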
[jira] [Updated] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HADOOP-16990: -- Attachment: HADOOP-16990.001.patch > Update Mockserver > - > > Key: HADOOP-16990 > URL: https://issues.apache.org/jira/browse/HADOOP-16990 > Project: Hadoop Common > Issue Type: Task >Reporter: Wei-Chiu Chuang >Assignee: Attila Doroszlai >Priority: Major > Attachments: HADOOP-16990.001.patch > > > We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205803#comment-17205803 ]
Attila Doroszlai commented on HADOOP-16990:
---
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
> Attachments: HADOOP-16990.001.patch
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Work logged] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493672&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493672 ]
ASF GitHub Bot logged work on HADOOP-17281:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 20:13
Start Date: 01/Oct/20 20:13
Worklog Time Spent: 10m
Work Description: steveloughran removed a comment on pull request #2354: URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522
Looks good. Annoying about the return types which force you to do that wrapping/casting. Can't you just forcibly cast the return type of the inner iterator? After all, type erasure means all type info will be lost in the actual compiled binary. I'd prefer that, as it will give you automatic passthrough of the IOStatistics stuff.
Add text to filesystem.md, something which:
* specifies the result is exactly the same as listStatus, provided no other caller updates the directory during the list
* declares that it's not atomic, and that high-performance implementations will page
* and that if a path isn't there, that fact may not surface until next/hasNext... that is, we do lazy eval for all file IO
We need similar new contract tests in AbstractContractGetFileStatusTest for all to use:
* that in a dir with files and subdirectories, you get both returned in the listing
* that you can iterate through with next() to failure as well as hasNext/next, and get the same results
* listStatusIterator(file) returns the file
* listStatusIterator("/") gives you a listing of root (put that in AbstractContractRootDirectoryTest)
And two for changes partway through the iteration:
* change the directory during a list to add/delete files
* delete the actual path
These tests can't assert on what will happen, and with paged IO aren't likely to pick up on changes... they're just there to show it can be done and to pick up on any major issues with implementations.
Issue Time Tracking
---
Worklog Id: (was: 493672)
Time Spent: 0.5h (was: 20m)
> Implement FileSystem.listStatusIterator() in S3AFileSystem
>
> Key: HADOOP-17281
> URL: https://issues.apache.org/jira/browse/HADOOP-17281
> Project: Hadoop Common
> Issue Type: Task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Mukund Thakur
> Assignee: Mukund Thakur
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements the listStatus() API, which returns an array. Once we implement listStatusIterator(), clients can benefit from the async listing done recently in https://issues.apache.org/jira/browse/HADOOP-17074 by performing some tasks on files while iterating over them.
>
> CC [~stevel]
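The lazy-evaluation point in the comment above (a missing path may only surface at `hasNext`/`next`) can be sketched with a toy paged iterator. This is an illustrative model under stated assumptions, not the actual S3A listing code: the class name and the page-fetching function are invented, and a single page stands in for real paging.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/**
 * Toy model of a lazy listing iterator: no IO happens at construction,
 * so failures (e.g. path not found) only surface on hasNext()/next().
 * Single page only, for brevity; real implementations page through results.
 */
public class LazyListingIterator<T> implements Iterator<T> {
    private final Supplier<List<T>> fetchPage; // may throw on first use
    private Iterator<T> page;                  // null until the first fetch

    public LazyListingIterator(Supplier<List<T>> fetchPage) {
        this.fetchPage = fetchPage;            // deliberately no fetch here
    }

    @Override
    public boolean hasNext() {
        if (page == null) {
            page = fetchPage.get().iterator(); // lazy eval: IO happens here
        }
        return page.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.next();
    }
}
```

Constructing the iterator over a nonexistent path succeeds in this model; the supplier's exception is only thrown once iteration starts, which is the behaviour the comment asks filesystem.md and the contract tests to pin down.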
[GitHub] [hadoop] steveloughran removed a comment on pull request #2354: HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem
steveloughran removed a comment on pull request #2354: URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522 (comment text identical to the worklog record above)
[jira] [Work logged] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493671&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493671 ]
ASF GitHub Bot logged work on HADOOP-17281:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 20:12
Start Date: 01/Oct/20 20:12
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2354: URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522
Looks good. Annoying about the return types which force you to do that wrapping/casting. Can't you just forcibly cast the return type of the inner iterator? After all, type erasure means all type info will be lost in the actual compiled binary. I'd prefer that, as it will give you automatic passthrough of the IOStatistics stuff.
Add text to filesystem.md, something which:
* specifies the result is exactly the same as listStatus, provided no other caller updates the directory during the list
* declares that it's not atomic, and that high-performance implementations will page
* and that if a path isn't there, that fact may not surface until next/hasNext... that is, we do lazy eval for all file IO
We need similar new contract tests in AbstractContractGetFileStatusTest for all to use:
* that in a dir with files and subdirectories, you get both returned in the listing
* that you can iterate through with next() to failure as well as hasNext/next, and get the same results
* listStatusIterator(file) returns the file
* listStatusIterator("/") gives you a listing of root (put that in AbstractContractRootDirectoryTest)
And two for changes partway through the iteration:
* change the directory during a list to add/delete files
* delete the actual path
These tests can't assert on what will happen, and with paged IO aren't likely to pick up on changes... they're just there to show it can be done and to pick up on any major issues with implementations.
Issue Time Tracking
---
Worklog Id: (was: 493671)
Time Spent: 20m (was: 10m)
> Implement FileSystem.listStatusIterator() in S3AFileSystem
>
> Key: HADOOP-17281
> URL: https://issues.apache.org/jira/browse/HADOOP-17281
> Project: Hadoop Common
> Issue Type: Task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Mukund Thakur
> Assignee: Mukund Thakur
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements the listStatus() API, which returns an array. Once we implement listStatusIterator(), clients can benefit from the async listing done recently in https://issues.apache.org/jira/browse/HADOOP-17074 by performing some tasks on files while iterating over them.
>
> CC [~stevel]
[GitHub] [hadoop] steveloughran commented on pull request #2354: HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem
steveloughran commented on pull request #2354: URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522 (comment text identical to the worklog record above)
[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API
[ https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=493666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493666 ]
ASF GitHub Bot logged work on HADOOP-16830:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 19:58
Start Date: 01/Oct/20 19:58
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2323: URL: https://github.com/apache/hadoop/pull/2323#issuecomment-702364693
@mehakmeet
* duration tracking for classic functions. Issue: are the names of the `trackDuration` calls correct now?
* class DurationStatisticSummary to store and extract duration stats from a statistic
* which is used in the tests to verify that the new functions all work
Issue Time Tracking
---
Worklog Id: (was: 493666)
Time Spent: 6h 40m (was: 6.5h)
> Add public IOStatistics API
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs, fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> Applications like to collect the statistics which specific operations take, by collecting exactly those operations done during the execution of FS API calls by their individual worker threads, and returning these to their job driver.
> * S3A has a statistics API for some streams, but it's a non-standard one; Impala can't use it
> * FileSystem storage statistics are public, but as they aren't cross-thread, they don't aggregate properly
> Proposed:
> # A new IOStatistics interface to serve up statistics
> # S3A to implement
> # other stores to follow
> # Pass-through from the usual wrapper classes (FS data input/output streams)
> It's hard to think about how best to offer an API for operation context stats, and how to actually implement it.
> ThreadLocal isn't enough because the helper threads need to update on the thread-local value of the instigator.
> My initial PoC doesn't address that issue, but it shows what I'm thinking of.
[GitHub] [hadoop] steveloughran commented on pull request #2323: HADOOP-16830. Add public IOStatistics API.
steveloughran commented on pull request #2323: URL: https://github.com/apache/hadoop/pull/2323#issuecomment-702364693 @mehakmeet * duration tracking for classic function. Issue: are the names of the `trackDuration` calls correct now? * class DurationStatisticSummary to store and extract duration stats from a statistic. * which is used in the tests to verify that the new functions all work
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205783#comment-17205783 ]
Wei-Chiu Chuang commented on HADOOP-16990:
---
{noformat}
[INFO] +- org.mock-server:mockserver-netty:jar:3.9.2:test
[INFO] | +- org.mock-server:mockserver-client-java:jar:3.9.2:test
[INFO] | +- org.mock-server:mockserver-core:jar:3.9.2:test
[INFO] | | +- io.netty:netty-codec-socks:jar:4.0.24.Final:test
{noformat}
I don't think we are really affected, especially since it's in test scope. But it is very annoying when users complain about a new CVE found in the classpath.
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205779#comment-17205779 ]
Ayush Saxena commented on HADOOP-16990:
---
What advantages do we get with upgrading MockServer? Is there a CVE, or some serious performance improvement?
{{MockServer-3.9.2}} uses guava-18. If the above holds true, what version do you intend to move to?
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205778#comment-17205778 ]
Wei-Chiu Chuang commented on HADOOP-16990:
---
Thanks for picking this up, [~adoroszlai]!
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Work started] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HADOOP-16990 started by Attila Doroszlai.
[jira] [Assigned] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai reassigned HADOOP-16990:
-
Assignee: Attila Doroszlai
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493632 ] ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 18:26
Start Date: 01/Oct/20 18:26
Worklog Time Spent: 10m
Work Description: ferhui commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498437024

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.
+     * User should not set an illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");

Review comment: Ok. Done

Worklog Id: (was: 493632) Time Spent: 8h 10m (was: 8h)

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Hui Fei
> Assignee: Hui Fei
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h 10m
> Remaining Estimate: 0h
>
> Now the context is a plain string. We need to extend CallerContext because the context may contain many items, including:
> * router ip
> * MR or CLI
> * etc
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493630 ] ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 18:19
Start Date: 01/Oct/20 18:19
Worklog Time Spent: 10m
Work Description: ferhui commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498433031

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -109,11 +114,53 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
-    private final String context;
+    private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '='.
+     * User should not set an illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");
+    private final String fieldSeparator;
+    private final StringBuilder sb = new StringBuilder();
     private byte[] signature;

     public Builder(String context) {
-      this.context = context;
+      this(context, new Configuration());
+    }
+
+    public Builder(String context, Configuration conf) {
+      if (isValid(context)) {
+        sb.append(context);
+      }
+      fieldSeparator = conf.get(HADOOP_CALLER_CONTEXT_SEPARATOR_KEY,
+          HADOOP_CALLER_CONTEXT_SEPARATOR_DEFAULT);
+      checkFieldSeparator(fieldSeparator);
+    }
+
+    /**
+     * Check whether the separator is legal.
+     * The illegal separators include '\t', '\n', '='.
+     * Throws IllegalArgumentException if the separator is illegal.
+     * @param separator the separator of fields.
+     */
+    private void checkFieldSeparator(String separator) {
+      if (ILLEGAL_SEPARATORS.stream()

Review comment: Ok. It's done

Worklog Id: (was: 493630) Time Spent: 8h (was: 7h 50m)
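The Builder changes reviewed above join context fields with a configurable separator while rejecting '\t', '\n', and '='. A standalone sketch of that idea — class and method names here are illustrative, not the real org.apache.hadoop.ipc.CallerContext API, and the separator is passed in directly instead of being read from a Hadoop Configuration:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the builder pattern in the diff above.
public class ContextBuilder {
    private static final String KEY_VALUE_SEPARATOR = ":";
    // '\t' and '\n' would corrupt audit-log lines; '=' clashes with key=value parsing.
    private static final List<String> ILLEGAL_SEPARATORS =
        Arrays.asList("\t", "\n", "=");

    private final String fieldSeparator;
    private final StringBuilder sb = new StringBuilder();

    public ContextBuilder(String context, String fieldSeparator) {
        if (ILLEGAL_SEPARATORS.contains(fieldSeparator)) {
            throw new IllegalArgumentException(
                "Illegal field separator: " + fieldSeparator);
        }
        this.fieldSeparator = fieldSeparator;
        if (context != null && !context.isEmpty()) {
            sb.append(context);
        }
    }

    // Append one key:value field, prefixed by the separator when needed.
    public ContextBuilder append(String key, String value) {
        if (sb.length() > 0) {
            sb.append(fieldSeparator);
        }
        sb.append(key).append(KEY_VALUE_SEPARATOR).append(value);
        return this;
    }

    public String build() {
        return sb.toString();
    }
}
```

Usage: `new ContextBuilder("clientIp:10.0.0.2", ",").append("from", "CLI").build()` yields `clientIp:10.0.0.2,from:CLI`, while passing `"\t"` as the separator throws at construction time.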
[GitHub] [hadoop] smengcl edited a comment on pull request #2352: HDFS-15607. Create trash dir when allowing snapshottable dir
smengcl edited a comment on pull request #2352: URL: https://github.com/apache/hadoop/pull/2352#issuecomment-701586546
> Do we need to add a provisionTrash command for WebHdfs as well?

Good point. I think so, yes.

Update: As I was attempting to add `PROVISIONSNAPSHOTTRASH` to WebHDFS, I realized `NamenodeWebHdfsMethods` is already server code, while the whole provision-trash logic is on the **client** side (same as provisioning EZ trash). This would also imply that WebHDFS `ALLOWSNAPSHOT` won't trigger provisioning snapshot trash at the moment. Note that WebHDFS doesn't support encryption zone commands (create, list, etc.). I have opened another jira, [HDFS-15612](https://issues.apache.org/jira/browse/HDFS-15612), for discussion on WebHDFS support for provisioning snapshot trash. Let's rule WebHDFS out of this jira for now. @bshashikant
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493609 ] ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 17:43
Start Date: 01/Oct/20 17:43
Worklog Time Spent: 10m
Work Description: goiri commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498413903

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -109,11 +114,53 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
-    private final String context;
+    private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '='.
+     * User should not set an illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");
+    private final String fieldSeparator;
+    private final StringBuilder sb = new StringBuilder();
     private byte[] signature;

     public Builder(String context) {
-      this.context = context;
+      this(context, new Configuration());
+    }
+
+    public Builder(String context, Configuration conf) {
+      if (isValid(context)) {
+        sb.append(context);
+      }
+      fieldSeparator = conf.get(HADOOP_CALLER_CONTEXT_SEPARATOR_KEY,
+          HADOOP_CALLER_CONTEXT_SEPARATOR_DEFAULT);
+      checkFieldSeparator(fieldSeparator);
+    }
+
+    /**
+     * Check whether the separator is legal.
+     * The illegal separators include '\t', '\n', '='.
+     * Throws IllegalArgumentException if the separator is illegal.
+     * @param separator the separator of fields.
+     */
+    private void checkFieldSeparator(String separator) {
+      if (ILLEGAL_SEPARATORS.stream()

Review comment: Not that it is wrong, but we could just do ILLEGAL_SEPARATORS.contains(separator), and if we made it a HashSet it would be faster.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.
+     * User should not set an illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");

Review comment: I would use a set: Collections.unmodifiableSet()

Worklog Id: (was: 493609) Time Spent: 7h 50m (was: 7h 40m)
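The reviewer's suggestion above — an unmodifiable HashSet instead of a List, giving O(1) membership checks and an immutable constant — might look like the following sketch (the class name is hypothetical, not the actual patch):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class IllegalSeparators {
    // Unmodifiable HashSet: constant-time contains(), and callers
    // cannot accidentally mutate the shared constant.
    private static final Set<String> ILLEGAL_SEPARATORS =
        Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("\t", "\n", "=")));

    public static boolean isIllegal(String separator) {
        return ILLEGAL_SEPARATORS.contains(separator);
    }
}
```

Compared with `ILLEGAL_SEPARATORS.stream().anyMatch(...)` over a List, `contains` on a HashSet avoids both the stream setup cost and the linear scan, which matters little for three entries but is the more idiomatic shape for a membership check.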
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493595=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493595 ] ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 17:07
Start Date: 01/Oct/20 17:07
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702273856

Hmm, for CompressDecompressTester.java, it seems to me that it is from the original code?

```java
 else if (compressor.getClass().isAssignableFrom(ZlibCompressor.class)) {
   return ZlibFactory.isNativeZlibLoaded(new Configuration());
-}
-else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)
-&& isNativeSnappyLoadable())
+}
+else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class))
```

Anyway, I can fix it here if you think it is ok.

Worklog Id: (was: 493595) Time Spent: 22h 40m (was: 22.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
> Issue Type: New Feature
> Components: common
> Affects Versions: 3.3.0
> Reporter: DB Tsai
> Priority: Major
> Labels: pull-request-available
> Time Spent: 22h 40m
> Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed in the system *LD_LIBRARY_PATH*, and they have to be installed separately on each node of the clusters, container images, or local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from sources, which is non-trivial. Also, this approach is platform dependent; the binary may not work on a different platform, so it requires recompilation.
> * It requires extra configuration of *java.library.path* to load the natives, and it results in higher application deployment and maintenance cost for users.
> Projects such as *Spark* and *Parquet* use [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based implementation. It contains native binaries for Linux, Mac, and IBM in the jar file, and it can automatically load the native binaries into the JVM from the jar without any setup. If a native implementation cannot be found for a platform, it can fall back to a pure-Java implementation of snappy based on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].
[jira] [Commented] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205644#comment-17205644 ] Michael Stack commented on HADOOP-17288:

{quote}Ideally if the downstream has to upgrade guava, then this patch has no meaning.
{quote}
+1
{quote}Then we might need to shade them as well? May be {{curator}} can be one of those.
{quote}
Yes. Unfortunately the tangles start to compound fast when a dependency's dependency is also a hadoop dependency (and the versions don't align). One thing to consider is removing problem dependencies (like curator) if they are not heavily used. Thanks [~ayushtkn]

> Use shaded guava from thirdparty
> 
>
> Key: HADOOP-17288
> URL: https://issues.apache.org/jira/browse/HADOOP-17288
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Use the shaded version of guava in hadoop-thirdparty
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493544=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493544 ] ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 15:34
Start Date: 01/Oct/20 15:34
Worklog Time Spent: 10m
Work Description: ferhui commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498337876

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.

Review comment: It's ok. Removed "etc" from here and the other places.

Worklog Id: (was: 493544) Time Spent: 7h 40m (was: 7.5h)
[GitHub] [hadoop] jbrennan333 commented on pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
jbrennan333 commented on pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#issuecomment-702176681
@steveloughran It's hard to think of a terse warning for this. I think your comment above gets close. Maybe something like "The v2 commit algorithm assumes that the content of generated output files is consistent across all task attempts - if this is not true for this job, the v1 commit algorithm is strongly recommended."
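For context on the discussion above: which commit algorithm a MapReduce job uses is selected through the `mapreduce.fileoutputcommitter.algorithm.version` job property, so a job whose task attempts may produce differing output can pin v1 explicitly. A sketch of doing so in mapred-site.xml (verify the default value for your Hadoop release before relying on it):

```xml
<!-- mapred-site.xml: prefer the v1 commit algorithm when task-attempt
     output is not guaranteed identical across attempts. -->
<property>
  <name>mapreduce.fileoutputcommitter.algorithm.version</name>
  <value>1</value>
</property>
```

The trade-off is that v1 renames task output twice (task commit, then job commit) and so is slower on stores where rename is expensive, which is why v2 was attractive in the first place.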
[GitHub] [hadoop] steveloughran commented on pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#issuecomment-702161929
(Yetus failure is from no new tests)
[jira] [Work logged] (HADOOP-17265) ABFS: Support for Client Correlation ID
[ https://issues.apache.org/jira/browse/HADOOP-17265?focusedWorklogId=493507=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493507 ] ASF GitHub Bot logged work on HADOOP-17265:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 14:08
Start Date: 01/Oct/20 14:08
Worklog Time Spent: 10m
Work Description: snvijaya commented on a change in pull request #2344: URL: https://github.com/apache/hadoop/pull/2344#discussion_r498266654

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java

@@ -0,0 +1,61 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import java.util.UUID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN;
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID;
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH;
+
+public class TrackingContext {
+  private String clientCorrelationID;
+  private String clientRequestID;
+  private static final Logger LOG = LoggerFactory.getLogger(
+      org.apache.hadoop.fs.azurebfs.services.AbfsClient.class);
+
+  public TrackingContext(String clientCorrelationID) {
+    // validation
+    if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) ||
+        (!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) {
+      this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID;
+      LOG.debug("Invalid config provided; correlation id not included in header.");
+    }
+    else if (clientCorrelationID.length() > 0) {
+      this.clientCorrelationID = clientCorrelationID + ":";
+      LOG.debug("Client correlation id has been validated and set successfully.");

Review comment: Logs usually incur a perf cost, so log only for the failure cases. The success-case log can be omitted.
[jira] [Work logged] (HADOOP-17021) Add concat fs command
[ https://issues.apache.org/jira/browse/HADOOP-17021?focusedWorklogId=493506=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493506 ] ASF GitHub Bot logged work on HADOOP-17021:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 14:08
Start Date: 01/Oct/20 14:08
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #1993: URL: https://github.com/apache/hadoop/pull/1993#issuecomment-702161263

OK, yetus is happy, it's just changed where it reports. I'm going to merge: what do you want to have in your "Contributed by:" credits as your full name? I think we need to stay with ASCII to avoid breaking things.

Worklog Id: (was: 493506) Time Spent: 4h 20m (was: 4h 10m)

> Add concat fs command
> -
>
> Key: HADOOP-17021
> URL: https://issues.apache.org/jira/browse/HADOOP-17021
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Minor
> Labels: pull-request-available
> Attachments: HADOOP-17021.001.patch
>
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> We should add a concat fs command for ease of use. It concatenates existing source files into the target file using FileSystem.concat().
[GitHub] [hadoop] snvijaya commented on a change in pull request #2344: HADOOP-17265. ABFS: Support for Client Correlation ID
snvijaya commented on a change in pull request #2344: URL: https://github.com/apache/hadoop/pull/2344#discussion_r498266654 ## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java ## @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.utils; + +import java.util.UUID; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH; + +public class TrackingContext { + private String clientCorrelationID; + private String clientRequestID; + private static final Logger LOG = LoggerFactory.getLogger( + org.apache.hadoop.fs.azurebfs.services.AbfsClient.class); + + public TrackingContext(String clientCorrelationID) { +//validation +if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) || +(!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) { + this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; + LOG.debug("Invalid config provided; correlation id not included in header."); +} +else if (clientCorrelationID.length() > 0) { + this.clientCorrelationID = clientCorrelationID + ":"; + LOG.debug("Client correlation id has been validated and set successfully."); Review comment: Log usually incur a perf cost so log for failure cases. Success case log can be omitted. ## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java ## @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.azurebfs.utils; + +import java.util.UUID; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH; + +public class TrackingContext { + private String clientCorrelationID; + private String clientRequestID; + private static final Logger LOG = LoggerFactory.getLogger( + org.apache.hadoop.fs.azurebfs.services.AbfsClient.class); + + public TrackingContext(String clientCorrelationID) { +//validation +if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) || +(!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) { + this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; + LOG.debug("Invalid config provided; correlation id not included in header."); +} +else if (clientCorrelationID.length() > 0) { + this.clientCorrelationID = clientCorrelationID + ":"; + LOG.debug("Client correlation id has been validated and set successfully."); +} +else { + this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; + LOG.debug( Review comment: This config will be not set for most cases. For
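Per the review feedback above (logging has a per-request cost, so debug-log only the failure path), a trimmed-down version of the validation might look like the following. The length limit, pattern, and default value here are illustrative stand-ins; the real constants live in `FileSystemConfigurations` and may differ.

```java
public class CorrelationIdValidator {
    // Stand-in values; the real limits come from FileSystemConfigurations.
    static final int MAX_LEN = 72;
    static final String PATTERN = "[a-zA-Z0-9-]*";
    static final String DEFAULT_ID = "";

    // Returns the validated correlation id with a ":" separator appended,
    // or the default when the configured value is invalid or empty.
    static String validate(String id) {
        if (id.length() > MAX_LEN || !id.matches(PATTERN)) {
            // Only the failure path is logged; the success-case log is
            // omitted, per the review comment above.
            System.err.println(
                "Invalid config provided; correlation id not included in header.");
            return DEFAULT_ID;
        }
        return id.isEmpty() ? DEFAULT_ID : id + ":";
    }

    public static void main(String[] args) {
        System.out.println(validate("client-01"));          // valid id, separator appended
        System.out.println(validate("bad id with spaces")); // rejected, falls back to default
    }
}
```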
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493505=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493505 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 01/Oct/20 14:04 Start Date: 01/Oct/20 14:04 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702159088 ok, yetus is running, it's just reporting isn't quite there...if you follow the link you see the results. Test failures in hdfs: unrelated. ASF licence warning: unrelated. Checkstyles are, sadly, related. ``` ./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:491: }:5: '}' at column 5 should be on the same line as the next part of a multi-block statement (one that directly contains multiple blocks: if/else-if/else, do/while or try/catch/finally). [RightCurly] ./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:492: else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)): 'if' construct must use '{}'s. [NeedBraces] ./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java:356: int[] size = { 4 * 1024, 64 * 1024, 128 * 1024, 1024 * 1024 };:18: '{' is followed by whitespace. [NoWhitespaceAfter] ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493505) Time Spent: 22.5h (was: 22h 20m) > Using snappy-java in SnappyCodec > > > Key: HADOOP-17125 > URL: https://issues.apache.org/jira/browse/HADOOP-17125 > Project: Hadoop Common > Issue Type: New Feature > Components: common >Affects Versions: 3.3.0 >Reporter: DB Tsai >Priority: Major > Labels: pull-request-available > Time Spent: 22.5h > Remaining Estimate: 0h > > In Hadoop, we use native libs for snappy codec which has several > disadvantages: > * It requires native *libhadoop* and *libsnappy* to be installed in system > *LD_LIBRARY_PATH*, and they have to be installed separately on each node of > the clusters, container images, or local test environments which adds huge > complexities from deployment point of view. In some environments, it requires > compiling the natives from sources which is non-trivial. Also, this approach > is platform dependent; the binary may not work in different platform, so it > requires recompilation. > * It requires extra configuration of *java.library.path* to load the > natives, and it results higher application deployment and maintenance cost > for users. > Projects such as *Spark* and *Parquet* use > [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based > implementation. It contains native binaries for Linux, Mac, and IBM in jar > file, and it can automatically load the native binaries into JVM from jar > without any setup. If a native implementation can not be found for a > platform, it can fallback to pure-java implementation of snappy based on > [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
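The three checkstyle complaints quoted in the report above are mechanical style fixes. A compliant rewrite of the flagged patterns looks like this; the method and variable names are illustrative, not the actual test code from the patch:

```java
public class CheckstyleFixes {
    static String classify(Object value) {
        String kind;
        if (value instanceof String) {
            kind = "string";
        } else if (value instanceof Integer) { // RightCurly: "} else if" stays on one line
            kind = "integer";                  // NeedBraces: braces even for one statement
        } else {
            kind = "other";
        }
        return kind;
    }

    public static void main(String[] args) {
        // NoWhitespaceAfter: no space after the opening brace of the initializer
        int[] size = {4 * 1024, 64 * 1024, 128 * 1024, 1024 * 1024};
        System.out.println(classify("x") + " " + size.length);
    }
}
```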
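The appeal of snappy-java described in HADOOP-17125 is that the native binaries ship inside the jar (with a pure-Java fallback), so no `libhadoop`/`libsnappy` or `LD_LIBRARY_PATH` setup is needed. Assuming the `org.xerial.snappy` dependency is on the classpath, a round trip through its byte-array API is a couple of lines; this is a usage sketch, not the codec integration from the PR:

```java
import java.io.IOException;
import org.xerial.snappy.Snappy;

public class SnappyRoundTrip {
    public static void main(String[] args) throws IOException {
        byte[] input =
            "hello snappy-java, no libhadoop or LD_LIBRARY_PATH needed".getBytes();
        // Loads the bundled native library for the platform, or falls back
        // to the pure-Java implementation when none matches.
        byte[] compressed = Snappy.compress(input);
        byte[] restored = Snappy.uncompress(compressed);
        System.out.println(new String(restored));
    }
}
```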
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493504=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493504 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 01/Oct/20 14:03 Start Date: 01/Oct/20 14:03 Worklog Time Spent: 10m Work Description: hadoop-yetus removed a comment on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690909475 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 30s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 1s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 35s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 27m 4s | trunk passed | | +1 :green_heart: | compile | 21m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 19m 19s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 3m 2s | trunk passed | | +1 :green_heart: | mvnsite | 2m 7s | trunk passed | | +1 :green_heart: | shadedclient | 21m 6s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 13s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 2m 7s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 2m 12s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +0 :ok: | findbugs | 0m 38s | branch/hadoop-project no findbugs output file (findbugsXml.xml) | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 2s | the patch passed | | +1 :green_heart: | compile | 18m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | -1 :x: | cc | 18m 38s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 127 unchanged - 36 fixed = 163 total (was 163) | | +1 :green_heart: | golang | 18m 38s | the patch passed | | +1 :green_heart: | javac | 18m 38s | the patch passed | | +1 :green_heart: | compile | 16m 51s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | -1 :x: | cc | 16m 51s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 29 new + 134 unchanged - 29 fixed = 163 total (was 163) | | +1 :green_heart: | golang | 16m 51s | the patch passed | | +1 :green_heart: | javac | 16m 51s | the patch passed | | -0 :warning: | checkstyle | 2m 41s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) | | +1 :green_heart: | mvnsite | 2m 1s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 14m 25s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 12s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 2m 9s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | findbugs | 0m 32s | hadoop-project has no data from findbugs | ||| _ Other Tests _ | | +1 :green_heart: | unit | 0m 32s | hadoop-project in the patch passed. 
| | -1 :x: | unit | 9m 32s | hadoop-common in the patch passed. | | +1 :green_heart: | asflicense | 0m 55s | The patch does not generate ASF License warnings. | | | | 177m 51s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.io.compress.snappy.TestSnappyCompressorDecompressor | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2297 | | Optional Tests | dupname asflicense compile
[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
hadoop-yetus removed a comment on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690909475 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 30s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 1s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 35s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 27m 4s | trunk passed | | +1 :green_heart: | compile | 21m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 19m 19s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 3m 2s | trunk passed | | +1 :green_heart: | mvnsite | 2m 7s | trunk passed | | +1 :green_heart: | shadedclient | 21m 6s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 13s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 2m 7s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 2m 12s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +0 :ok: | findbugs | 0m 38s | branch/hadoop-project no findbugs output file (findbugsXml.xml) | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 2s | the patch passed | | +1 :green_heart: | compile | 18m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | -1 :x: | cc | 18m 38s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 127 unchanged - 36 fixed = 163 total (was 163) | | +1 :green_heart: | golang | 18m 38s | the patch passed | | +1 :green_heart: | javac | 18m 38s | the patch passed | | +1 :green_heart: | compile | 16m 51s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | -1 :x: | cc | 16m 51s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 29 new + 134 unchanged - 29 fixed = 163 total (was 163) | | +1 :green_heart: | golang | 16m 51s | the patch passed | | +1 :green_heart: | javac | 16m 51s | the patch passed | | -0 :warning: | checkstyle | 2m 41s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) | | +1 :green_heart: | mvnsite | 2m 1s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 14m 25s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 12s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 2m 9s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | findbugs | 0m 32s | hadoop-project has no data from findbugs | ||| _ Other Tests _ | | +1 :green_heart: | unit | 0m 32s | hadoop-project in the patch passed. 
| | -1 :x: | unit | 9m 32s | hadoop-common in the patch passed. | | +1 :green_heart: | asflicense | 0m 55s | The patch does not generate ASF License warnings. | | | | 177m 51s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.io.compress.snappy.TestSnappyCompressorDecompressor | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2297 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml cc findbugs checkstyle golang | | uname | Linux c9432386914d 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 9960c01a25c | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions |
[jira] [Work logged] (HADOOP-17124) Support LZO using aircompressor
[ https://issues.apache.org/jira/browse/HADOOP-17124?focusedWorklogId=493503=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493503 ] ASF GitHub Bot logged work on HADOOP-17124: --- Author: ASF GitHub Bot Created on: 01/Oct/20 14:00 Start Date: 01/Oct/20 14:00 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #2159: URL: https://github.com/apache/hadoop/pull/2159#issuecomment-702155803 @dbtsai yes, lets do snappy first This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493503) Time Spent: 1h (was: 50m) > Support LZO using aircompressor > --- > > Key: HADOOP-17124 > URL: https://issues.apache.org/jira/browse/HADOOP-17124 > Project: Hadoop Common > Issue Type: New Feature > Components: common >Affects Versions: 3.3.0 >Reporter: DB Tsai >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > LZO codec was removed in HADOOP-4874 because the original LZO binding is GPL > which is problematic. However, many legacy data is still compressed by LZO > codec, and companies often use vendor's GPL LZO codec in the classpath which > might cause GPL contamination. > Presro and ORC-77 use [aircompressor| > [https://github.com/airlift/aircompressor]] (Apache V2 licensed) to compress > and decompress LZO data. Hadoop can add back LZO support using aircompressor > without GPL violation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
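aircompressor's API is allocation-explicit: the caller sizes the output buffer via `maxCompressedLength` and gets back the number of bytes written. Assuming the `io.airlift:aircompressor` dependency is available, an LZO round trip looks roughly like the following sketch (buffer sizing here is simplified for illustration):

```java
import io.airlift.compress.lzo.LzoCompressor;
import io.airlift.compress.lzo.LzoDecompressor;

public class LzoRoundTrip {
    public static void main(String[] args) {
        byte[] input = "pure-Java LZO via aircompressor, no GPL native lib".getBytes();

        LzoCompressor compressor = new LzoCompressor();
        byte[] compressed = new byte[compressor.maxCompressedLength(input.length)];
        int compressedLen = compressor.compress(
                input, 0, input.length, compressed, 0, compressed.length);

        LzoDecompressor decompressor = new LzoDecompressor();
        byte[] restored = new byte[input.length];
        int restoredLen = decompressor.decompress(
                compressed, 0, compressedLen, restored, 0, restored.length);

        System.out.println(new String(restored, 0, restoredLen));
    }
}
```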
[jira] [Work logged] (HADOOP-13327) Add OutputStream + Syncable to the Filesystem Specification
[ https://issues.apache.org/jira/browse/HADOOP-13327?focusedWorklogId=493502=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493502 ] ASF GitHub Bot logged work on HADOOP-13327: --- Author: ASF GitHub Bot Created on: 01/Oct/20 13:58 Start Date: 01/Oct/20 13:58 Worklog Time Spent: 10m Work Description: hadoop-yetus removed a comment on pull request #2102: URL: https://github.com/apache/hadoop/pull/2102#issuecomment-696402075 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 31s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. | | +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 32s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 29m 9s | trunk passed | | +1 :green_heart: | compile | 23m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 19m 29s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 2m 47s | trunk passed | | +1 :green_heart: | mvnsite | 4m 40s | trunk passed | | +1 :green_heart: | shadedclient | 23m 32s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 2m 29s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 4m 5s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 0m 51s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 7m 43s | trunk passed | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 3m 1s | the patch passed | | +1 :green_heart: | compile | 21m 10s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 21m 10s | the patch passed | | +1 :green_heart: | compile | 18m 10s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 18m 10s | the patch passed | | -0 :warning: | checkstyle | 2m 47s | root: The patch generated 3 new + 105 unchanged - 4 fixed = 108 total (was 109) | | +1 :green_heart: | mvnsite | 4m 57s | the patch passed | | -1 :x: | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | +1 :green_heart: | xml | 0m 6s | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 15m 51s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 2m 44s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 4m 21s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 8m 20s | the patch passed | ||| _ Other Tests _ | | -1 :x: | unit | 10m 1s | hadoop-common in the patch passed. | | -1 :x: | unit | 101m 4s | hadoop-hdfs in the patch passed. | | +1 :green_heart: | unit | 1m 58s | hadoop-azure in the patch passed. | | +1 :green_heart: | unit | 1m 19s | hadoop-azure-datalake in the patch passed. | | +1 :green_heart: | asflicense | 1m 5s | The patch does not generate ASF License warnings. 
| | | | 316m 36s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.fs.contract.rawlocal.TestRawlocalContractCreate | | | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.TestSnapshotCommands | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2102/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2102 | | Optional
[jira] [Work logged] (HADOOP-17021) Add concat fs command
[ https://issues.apache.org/jira/browse/HADOOP-17021?focusedWorklogId=493501=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493501 ] ASF GitHub Bot logged work on HADOOP-17021: --- Author: ASF GitHub Bot Created on: 01/Oct/20 13:58 Start Date: 01/Oct/20 13:58 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #1993: URL: https://github.com/apache/hadoop/pull/1993#issuecomment-702154756 there's been ongoing work with a yetus update this week and its been playing up. I've suggested some minor change, either do that or just a rebase and forced push to see if we can trigger it again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493501) Time Spent: 4h 10m (was: 4h) > Add concat fs command > - > > Key: HADOOP-17021 > URL: https://issues.apache.org/jira/browse/HADOOP-17021 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Labels: pull-request-available > Attachments: HADOOP-17021.001.patch > > Time Spent: 4h 10m > Remaining Estimate: 0h > > We should add one concat fs command for ease of use. It concatenates existing > source files into the target file using FileSystem.concat(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #2102: HADOOP-13327. Specify Output Stream and Syncable
hadoop-yetus removed a comment on pull request #2102: URL: https://github.com/apache/hadoop/pull/2102#issuecomment-696402075

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 31s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 32s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 29m 9s | trunk passed |
| +1 :green_heart: | compile | 23m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 19m 29s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 47s | trunk passed |
| +1 :green_heart: | mvnsite | 4m 40s | trunk passed |
| +1 :green_heart: | shadedclient | 23m 32s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 29s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 4m 5s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 51s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 7m 43s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 3m 1s | the patch passed |
| +1 :green_heart: | compile | 21m 10s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 21m 10s | the patch passed |
| +1 :green_heart: | compile | 18m 10s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 18m 10s | the patch passed |
| -0 :warning: | checkstyle | 2m 47s | root: The patch generated 3 new + 105 unchanged - 4 fixed = 108 total (was 109) |
| +1 :green_heart: | mvnsite | 4m 57s | the patch passed |
| -1 :x: | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 :green_heart: | xml | 0m 6s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 15m 51s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 44s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 4m 21s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 8m 20s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 10m 1s | hadoop-common in the patch passed. |
| -1 :x: | unit | 101m 4s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | unit | 1m 58s | hadoop-azure in the patch passed. |
| +1 :green_heart: | unit | 1m 19s | hadoop-azure-datalake in the patch passed. |
| +1 :green_heart: | asflicense | 1m 5s | The patch does not generate ASF License warnings. |
| | | | 316m 36s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.fs.contract.rawlocal.TestRawlocalContractCreate |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
| | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.TestSnapshotCommands |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2102/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2102 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint xml |
| uname | Linux 0d2edf1efca2 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 83c7c2b4c48 |
| Default Java | Private
[jira] [Work logged] (HADOOP-17021) Add concat fs command
[ https://issues.apache.org/jira/browse/HADOOP-17021?focusedWorklogId=493500&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493500 ]

ASF GitHub Bot logged work on HADOOP-17021:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 13:56
Start Date: 01/Oct/20 13:56
Worklog Time Spent: 10m

Work Description: steveloughran commented on a change in pull request #1993: URL: https://github.com/apache/hadoop/pull/1993#discussion_r498263927

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Concat.java ##

@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.util.LinkedList;
+
+import com.google.common.annotations.VisibleForTesting;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.PathIOException;
+
+/**
+ * Concat the given files.
+ */
+@InterfaceAudience.Private
+@InterfaceStability.Unstable
+public class Concat extends FsCommand {
+  public static void registerCommands(CommandFactory factory) {
+    factory.addClass(Concat.class, "-concat");
+  }
+
+  public static final String NAME = "concat";
+  public static final String USAGE = "<target path> <src path> <src path> ...";
+  public static final String DESCRIPTION = "Concatenate existing source files"
+      + " into the target file. Target file and source files should be in the"
+      + " same directory.";
+  private static FileSystem testFs; // test only.
+
+  @Override
+  protected void processArguments(LinkedList<PathData> args)
+      throws IOException {
+    if (args.size() < 1) {
+      throw new IOException("Target path not specified. " + USAGE);
+    }
+    if (args.size() < 3) {
+      throw new IOException(
+          "The number of source paths is less than 2. " + USAGE);
+    }
+    PathData target = args.removeFirst();
+    LinkedList<PathData> srcList = args;
+    if (!target.exists || !target.stat.isFile()) {
+      throw new FileNotFoundException(String
+          .format("Target path %s does not exist or is" + " not file.",
+              target.path));
+    }
+    Path[] srcArray = new Path[srcList.size()];
+    for (int i = 0; i < args.size(); i++) {
+      PathData src = srcList.get(i);
+      if (!src.exists || !src.stat.isFile()) {
+        throw new FileNotFoundException(
+            String.format("%s does not exist or is not file.", src.path));
+      }
+      srcArray[i] = src.path;
+    }
+    FileSystem fs = target.fs;
+    if (testFs != null) {
+      fs = testFs;
+    }
+    try {
+      fs.concat(target.path, srcArray);
+    } catch (UnsupportedOperationException exception) {
+      throw new PathIOException("Dest filesystem '" + fs.getUri().getScheme()

Review comment: change this to the PathIOE which takes the target.path.toString as the first param. The command line tools aren't great for reporting failures - anything we can do to improve the reporting is worth trying

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493500) Time Spent: 4h (was: 3h 50m) > Add concat fs command > - > > Key: HADOOP-17021 > URL: https://issues.apache.org/jira/browse/HADOOP-17021 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Labels: pull-request-available > Attachments: HADOOP-17021.001.patch > > Time Spent: 4h > Remaining Estimate: 0h > > We should add one concat fs command for ease of use. It
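The Concat review above centres on how the shell command validates its arguments: the first entry is the target, and at least two source paths must follow. The stand-alone sketch below reproduces that check order with plain JDK strings; the class and method names are illustrative only, and Hadoop's real implementation works on `PathData` entries and throws `IOException` rather than `IllegalArgumentException`.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of the -concat argument checks discussed above:
// the first argument is the target, and at least two sources must follow.
// Plain strings and IllegalArgumentException keep the sketch self-contained;
// the actual Hadoop command validates PathData and throws IOException.
public class ConcatArgs {
    public static String validate(LinkedList<String> args) {
        if (args.size() < 1) {
            throw new IllegalArgumentException("Target path not specified.");
        }
        if (args.size() < 3) {
            // one target plus at least two sources
            throw new IllegalArgumentException(
                "The number of source paths is less than 2.");
        }
        String target = args.removeFirst();   // target comes first
        List<String> sources = args;          // the remaining entries are sources
        return target + " <- " + String.join(",", sources);
    }

    public static void main(String[] unused) {
        System.out.println(validate(
            new LinkedList<>(Arrays.asList("out.bin", "part-0", "part-1"))));
    }
}
```

Note the order of checks mirrors the patch: missing target is reported before an insufficient source count, so the error message always names the first thing the user got wrong.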
[GitHub] [hadoop] steveloughran commented on pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#issuecomment-702137474

@jbrennan333 what do you think we should say instead of deprecated? "not recommended"? I was thinking of adding a link to the JIRA and changing the issue text there to clarify:

* safe if names and content of generated output files are consistent across all task attempts
* unsafe if different TAs generate bad files (biggest risk, as partial failure of 1st attempt may leave)
* unsafe if different TAs generate different content in the same files (only an issue on a network partition, and TA #1 generates output as/after TA #2 does its work)

Cleanup of the job will delete the whole job attempt dir, so that's the maximum time that a partitioned TA may commit work. There's no risk of some VM pausing for 3 hours, restarting, and an in-progress TA completing its work and overwriting the final output. This is good.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on a change in pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on a change in pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#discussion_r498242126

## File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java ##

@@ -348,6 +348,16 @@ public Path getWorkPath() throws IOException {
    * @param context the job's context
    */
   public void setupJob(JobContext context) throws IOException {
+    // Downgrade v2 to v1 with a warning.
+    if (algorithmVersion == 2) {
+      Logger log = LoggerFactory.getLogger(
+          "org.apache.hadoop.mapreduce.lib.output."
+          + "FileOutputCommitter.Algorithm");
+
+      log.warn("The v2 commit algorithm is deprecated;"
+          + " please switch to the v1 algorithm");

Review comment: what do you suggest?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on a change in pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on a change in pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#discussion_r498242247

## File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml ##

@@ -1562,10 +1562,35 @@
   <name>mapreduce.fileoutputcommitter.algorithm.version</name>
-  <value>2</value>
-  <description>The file output committer algorithm version
-  valid algorithm version number: 1 or 2
-  default to 2, which is the original algorithm</description>
+  <value>1</value>
+  <description>The file output committer algorithm version.
+
+  There are two algorithm versions in Hadoop, "1" and "2".
+
+  The version 2 algorithm is deprecated and no longer the default
+  as task commits were not atomic.
+  If a first task attempt fails part-way
+  through its task commit, the output directory could end up
+  with data from that failed commit, alongside the data
+  from any subsequent attempts.
+
+  See https://issues.apache.org/jira/browse/MAPREDUCE-7282
+
+  Although no longer the default, this algorithm is safe to use if
+  all task attempts for a single task meet the following requirements:
+  - they generate exactly the same set of files
+  - the contents of each file are exactly the same in each task attempt
+
+  That is:
+  1. If a second attempt commits work, there will be no leftover files from
+  a first attempt which failed during its task commit.
+  2. If a network partition causes the first task attempt to overwrite
+  some/all of the output of a second attempt, the result will be
+  exactly the same as if it had not done so.
+
+  To avoid the warning message on job setup, set the log level of the log
+  org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.Algorithm
+  to ERROR.</description>

Review comment: ok

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
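The description text above tells users to raise the log level of `org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.Algorithm` to ERROR to silence the setup-time warning. With a stock log4j.properties, that would look something like the following sketch; the exact configuration file and location depend on the deployment.

```properties
# Suppress the v2-commit-algorithm deprecation warning emitted from
# FileOutputCommitter.setupJob() by raising this logger to ERROR.
log4j.logger.org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.Algorithm=ERROR
```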
[GitHub] [hadoop] bshashikant opened a new pull request #2355: HDFS-15611. Add list Snapshot command in WebHDFS.
bshashikant opened a new pull request #2355: URL: https://github.com/apache/hadoop/pull/2355 please check https://issues.apache.org/jira/browse/HDFS-15611 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493458 ]

ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 12:04
Start Date: 01/Oct/20 12:04
Worklog Time Spent: 10m

Work Description: aajisaka commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498191019

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java ##

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.

Review comment:
> , etc.

The illegal separators are only `\t`, `\n`, and `=`. "etc" is not needed.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 493458)
Time Spent: 7.5h (was: 7h 20m)

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Hui Fei
> Assignee: Hui Fei
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7.5h
> Remaining Estimate: 0h
>
> Now context is string. We need to extend the CallerContext because context
> may contain many items.
> Items include
> * router ip
> * MR or CLI
> * etc

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493456 ]

ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 12:02
Start Date: 01/Oct/20 12:02
Worklog Time Spent: 10m

Work Description: aajisaka commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498189901

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java ##

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.
+     * User should not set illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");

Review comment: We should use `Collections.unmodifiableList` to provide an unmodifiable view.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 493456)
Time Spent: 7h 20m (was: 7h 10m)

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Hui Fei
> Assignee: Hui Fei
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 20m
> Remaining Estimate: 0h
>
> Now context is string. We need to extend the CallerContext because context
> may contain many items.
> Items include > * router ip > * MR or CLI > * etc -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
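The review thread above asks for `ILLEGAL_SEPARATORS` to be wrapped in `Collections.unmodifiableList` so callers only ever get a read-only view. A minimal, self-contained sketch of that pattern follows; the class and method names are illustrative, not the actual CallerContext code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch of the reviewer's suggestion: expose the illegal-separator list
// through Collections.unmodifiableList so it cannot be mutated by callers.
// SeparatorCheck and isLegalSeparator are hypothetical names.
public class SeparatorCheck {
    private static final List<String> ILLEGAL_SEPARATORS =
        Collections.unmodifiableList(Arrays.asList("\t", "\n", "="));

    public static boolean isLegalSeparator(String separator) {
        return !ILLEGAL_SEPARATORS.contains(separator);
    }

    // Expose the view so the read-only behaviour can be demonstrated:
    // any add/remove on it throws UnsupportedOperationException.
    public static List<String> illegalSeparators() {
        return ILLEGAL_SEPARATORS;
    }
}
```

`Collections.unmodifiableList` wraps the backing list rather than copying it, so the guard costs nothing per lookup while still preventing accidental mutation through the published reference.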
[GitHub] [hadoop] aajisaka closed pull request #2346: [DO NOT MERGE] Avoid YETUS-994 for testing
aajisaka closed pull request #2346: URL: https://github.com/apache/hadoop/pull/2346 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] aajisaka commented on pull request #2346: [DO NOT MERGE] Avoid YETUS-994 for testing
aajisaka commented on pull request #2346: URL: https://github.com/apache/hadoop/pull/2346#issuecomment-702078340 The new token worked as expected, so we don't need to revert YETUS-994 in our setting. Closing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API
[ https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=493435=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493435 ] ASF GitHub Bot logged work on HADOOP-16830: --- Author: ASF GitHub Bot Created on: 01/Oct/20 11:31 Start Date: 01/Oct/20 11:31 Worklog Time Spent: 10m Work Description: mehakmeet commented on pull request #2323: URL: https://github.com/apache/hadoop/pull/2323#issuecomment-702072245 In IOStatisticsBinding class we have methods for tracking duration but, I am not able to wrap it around a normal function. There are 3 methods for tracking durations which are for Callable, CallableRaisingIOE, and FunctionRaisingIOE. We should add 1 more for a normal function too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493435) Time Spent: 6.5h (was: 6h 20m) > Add public IOStatistics API > --- > > Key: HADOOP-16830 > URL: https://issues.apache.org/jira/browse/HADOOP-16830 > Project: Hadoop Common > Issue Type: New Feature > Components: fs, fs/s3 >Affects Versions: 3.3.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Time Spent: 6.5h > Remaining Estimate: 0h > > Applications like to collect the statistics which specific operations take, > by collecting exactly those operations done during the execution of FS API > calls by their individual worker threads, and returning these to their job > driver > * S3A has a statistics API for some streams, but it's a non-standard one; > Impala can't use it > * FileSystem storage statistics are public, but as they aren't cross-thread, > they don't aggregate properly > Proposed > # A new IOStatistics interface to serve up statistics > # S3A to implement > # other stores to follow > # 
Pass-through from the usual wrapper classes (FS data input/output streams) > It's hard to think about how best to offer an API for operation context > stats, and how to actually implement. > ThreadLocal isn't enough because the helper threads need to update on the > thread local value of the instigator > My Initial PoC doesn't address that issue, but it shows what I'm thinking of -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
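The comment above notes that IOStatisticsBinding has duration-tracking helpers for Callable, CallableRaisingIOE, and FunctionRaisingIOE, but nothing for a plain `java.util.function.Function`. A hedged sketch of what such a fourth variant could look like, with a `LongConsumer` standing in for the real statistics sink — the names here are illustrative and the actual IOStatisticsBinding API may differ.

```java
import java.util.function.Function;
import java.util.function.LongConsumer;

// Illustrative only: wrap a plain Function so every apply() reports its
// elapsed time (nanoseconds) to a recorder, even when the inner function
// throws. This mirrors the shape of a duration-tracking decorator; it is
// not the actual IOStatisticsBinding implementation.
public class DurationTracking {
    public static <A, B> Function<A, B> trackDuration(
            LongConsumer recorder, Function<A, B> inner) {
        return input -> {
            long start = System.nanoTime();
            try {
                return inner.apply(input);
            } finally {
                // finally ensures the duration is recorded on failure too
                recorder.accept(System.nanoTime() - start);
            }
        };
    }
}
```

Because the wrapper returns another `Function`, it composes with `andThen`/`compose` like any other function, which is what makes it convenient to thread through existing call chains.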
[jira] [Work logged] (HADOOP-17272) ABFS Streams to support IOStatistics API
[ https://issues.apache.org/jira/browse/HADOOP-17272?focusedWorklogId=493363=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493363 ] ASF GitHub Bot logged work on HADOOP-17272: --- Author: ASF GitHub Bot Created on: 01/Oct/20 09:29 Start Date: 01/Oct/20 09:29 Worklog Time Spent: 10m Work Description: mehakmeet commented on pull request #2353: URL: https://github.com/apache/hadoop/pull/2353#issuecomment-702011782 Have to force push since am rebasing Steve's branch on my commits. Also, Don't know why Yetus isn't running. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493363) Time Spent: 40m (was: 0.5h) > ABFS Streams to support IOStatistics API > - > > Key: HADOOP-17272 > URL: https://issues.apache.org/jira/browse/HADOOP-17272 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.3.1 >Reporter: Steve Loughran >Assignee: Mehakmeet Singh >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > ABFS input/output streams to support IOStatistics API -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205340#comment-17205340 ]

Ayush Saxena edited comment on HADOOP-17288 at 10/1/20, 7:49 AM:
-

By transitive dependencies I mean the dependencies which hadoop pulls in and depends on; {{mockserver-core}} is one. If you look at {{hadoop-project/pom.xml}}, whichever dependency has excluded {{Guava}} requires guava.

Ideally, if the downstream still has to upgrade guava, then this patch has no meaning; the basic requirement is that the downstream should not need to upgrade guava. If a transitive dependency, say {{x}}, is a dependency of {{hadoop-common}} and is compatible with the guava version of the downstream project, they can exclude it, and I think things should work fine?

If there are dependencies with a higher version of Guava (need to analyse) which aren't compatible with the guava version of downstream projects, then we might need to shade them as well. Maybe {{curator}} can be one of those.

Let me know if you have any idea/suggestion or different approach which may make things better.

> Use shaded guava from thirdparty
> --
>
> Key: HADOOP-17288
> URL: https://issues.apache.org/jira/browse/HADOOP-17288
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Use the shaded version of guava in hadoop-thirdparty

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
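The exclusion mechanism discussed above — a downstream project dropping the Guava that a Hadoop artifact pulls in transitively so its own Guava wins — looks roughly like this in a downstream pom.xml. This is a sketch only: the artifact and version shown are placeholders, not a recommendation from the thread.

```xml
<!-- Illustrative only: a downstream project excluding the Guava that a
     Hadoop artifact would otherwise pull in transitively, so the project's
     own Guava version is the one on the classpath. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>3.4.0</version>
  <exclusions>
    <exclusion>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

As the comment notes, exclusion is only safe when the downstream Guava version is compatible with what the excluded dependency actually needs; otherwise shading (as HADOOP-17288 does via hadoop-thirdparty) is the robust option.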
[jira] [Commented] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205340#comment-17205340 ] Ayush Saxena commented on HADOOP-17288: --- Transitive dependency as the dependencies which hadoop pulls up, on which hadoop depends. Like {{mockserer-core}} is one. If you see the {{hadoop-project/pom.xml}}, whichever dependency has excluded {{Guava}}, brings up guava in hadoop. Ideally if the downstream has to upgrade guava, then this patch has no meaning. The basic requirement itself is that the downstream should not need guava. If the transitive dependency, say {{x}} is a dependency of {{hadoop-common}}, which is compatible with the guava version of the downstream project they can exclude it, and I think things should work fine? If there are dependencies with higher version of Guava, (need to analyse), which aren't compatible with guava version of downstream project. Then we might need to shade them as well? May be {{curator}} can be one of those. Let me know, if you have any idea/suggestion or different approach which may make things better > Use shaded guava from thirdparty > > > Key: HADOOP-17288 > URL: https://issues.apache.org/jira/browse/HADOOP-17288 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Use the shaded version of guava in hadoop-thirdparty -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205340#comment-17205340 ]

Ayush Saxena edited comment on HADOOP-17288 at 10/1/20, 7:47 AM:
-----------------------------------------------------------------

By "transitive dependency" I mean the dependencies which hadoop pulls in and on which hadoop depends; {{mockserver-core}} is one such. If you look at {{hadoop-project/pom.xml}}, whichever dependency has had {{Guava}} excluded still requires guava. Ideally, if the downstream still has to upgrade guava, then this patch has no meaning; the basic requirement is that the downstream should not need guava.

If a transitive dependency, say {{x}}, of {{hadoop-common}} is compatible with the downstream project's guava version, they can exclude it, and I think things should work fine. If there are dependencies with a higher version of Guava (need to analyse) which aren't compatible with the downstream project's guava version, then we might need to shade them as well; maybe {{curator}} is one of those.
[jira] [Updated] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Stack updated HADOOP-17288:
-----------------------------------
    Fix Version/s: 3.4.0
[jira] [Commented] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205329#comment-17205329 ]

Michael Stack commented on HADOOP-17288:
----------------------------------------

{quote}but guava as of now would be still packaged as it is part of several transitive dependencies.
{quote}
Can you say more on the above? Is guava transitively included because Hadoop's own dependencies pull it in, or are you talking about downstreamers that expect Hadoop to provide guava to them (transitively)?

I'm wondering about the downstreamers whose apps use guava 11 because that's what hadoop used until 3.3.0/3.2.1. They want to upgrade to 3.4. They'll have to do the work to upgrade to guava 27 because that is what 3.2.1/3.3.0 have, even though you've done all this work here. Seems a shame?

(I set fix version as 3.4.0 – thanks).
[jira] [Commented] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205309#comment-17205309 ]

Ayush Saxena commented on HADOOP-17288:
---------------------------------------

[~stack] yes, the intention is exactly that. The hadoop jars would be using the shaded guava from the thirdparty, but guava as of now would still be packaged, as it is part of several transitive dependencies. So we need to see whether we can go ahead like this, or try shading the ones which bring in guava (there are many).

Yes, I am targeting this for 3.4.0 as of now.
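As a rough sketch of the exclusion approach Ayush suggests above (a downstream project dropping hadoop's transitive guava and supplying its own), a downstream pom.xml could look something like the following. The version numbers and artifact choices here are illustrative assumptions, not taken from the thread:

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.4.0</version>
    <exclusions>
      <!-- Drop hadoop's own transitive guava. The shaded copy that hadoop
           uses internally (relocated under org.apache.hadoop.thirdparty)
           is bundled separately and is unaffected by this exclusion. -->
      <exclusion>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!-- The downstream project's own guava version of choice. -->
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>27.0-jre</version>
  </dependency>
</dependencies>
```

This only works cleanly if no other transitive dependency of hadoop is incompatible with the guava version the downstream picks, which is exactly the open question in the discussion above.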
[jira] [Updated] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HADOOP-17281:
------------------------------------
    Labels: pull-request-available  (was: )

> Implement FileSystem.listStatusIterator() in S3AFileSystem
> ----------------------------------------------------------
>
>                 Key: HADOOP-17281
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17281
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Mukund Thakur
>            Assignee: Mukund Thakur
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements the listStatus() api, which returns an array. Once we implement listStatusIterator(), clients can benefit from the async listing done recently in https://issues.apache.org/jira/browse/HADOOP-17074 by performing some tasks on files while iterating over them.
>
> CC [~stevel]
[jira] [Work logged] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493316 ]

ASF GitHub Bot logged work on HADOOP-17281:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 01/Oct/20 06:56
Start Date: 01/Oct/20 06:56
Worklog Time Spent: 10m

Work Description: mukund-thakur opened a new pull request #2354:
URL: https://github.com/apache/hadoop/pull/2354

Ran the new test using an ap-south-1 bucket. Output:

    (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listFiles() api with batch size of 10 including 10ms of processing time for each file: 12,223,848,028 nS
    2020-10-01 12:19:28,811 [JUnit-testMultiPagesListingPerformanceAndCorrectness] INFO contract.ContractTestUtils (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listStatus() api with batch size of 10 including 10ms of processing time for each file: 15,988,037,357 nS
    2020-10-01 12:19:41,050 [JUnit-testMultiPagesListingPerformanceAndCorrectness] INFO contract.ContractTestUtils (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listStatusIterator() api with batch size of 10 including 10ms of processing time for each file: 12,214,813,052 nS

From the logs we can see that the time taken using listStatusIterator() matches listFiles() and is less than listStatus().

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 493316)
    Remaining Estimate: 0h
    Time Spent: 10m
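The PR above adds listStatusIterator() for S3A, which pulls listing pages incrementally instead of materializing the whole listing as a listStatus()-style array. The paging-iterator idea behind it can be sketched in plain, self-contained Java; note this is an illustrative sketch, not Hadoop code, and all class and method names below (PagedListingSketch, PageSource, listAll) are made up for the example:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Sketch of a paged listing iterator in the spirit of
 * FileSystem.listStatusIterator(): pages are fetched lazily, so the
 * caller can start processing entries from the first page before the
 * later pages have been requested.
 */
public class PagedListingSketch {

  /** Stand-in for one remote "list objects" call returning a single page. */
  interface PageSource {
    List<String> fetchPage(int pageIndex, int pageSize);
  }

  /** Iterator that pulls pages from the source only as they are needed. */
  static class PagedIterator implements Iterator<String> {
    private final PageSource source;
    private final int pageSize;
    private int pageIndex = 0;
    private List<String> page = new ArrayList<>();
    private int posInPage = 0;
    private boolean exhausted = false;

    PagedIterator(PageSource source, int pageSize) {
      this.source = source;
      this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
      // Fetch the next page only when the current one is used up.
      while (posInPage >= page.size() && !exhausted) {
        page = source.fetchPage(pageIndex++, pageSize);
        posInPage = 0;
        if (page.size() < pageSize) {
          exhausted = true; // a short page marks the end of the listing
        }
      }
      return posInPage < page.size();
    }

    @Override
    public String next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return page.get(posInPage++);
    }
  }

  /** Lists {@code total} fake file names through the paged iterator. */
  static List<String> listAll(int total, int pageSize) {
    PageSource source = (pageIndex, size) -> {
      List<String> p = new ArrayList<>();
      int start = pageIndex * size;
      for (int i = start; i < Math.min(start + size, total); i++) {
        p.add("file-" + i);
      }
      return p;
    };
    List<String> out = new ArrayList<>();
    PagedIterator it = new PagedIterator(source, pageSize);
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> all = listAll(25, 10); // 25 entries fetched in pages of 10
    System.out.println(all.size() + " entries, first=" + all.get(0)
        + ", last=" + all.get(all.size() - 1));
  }
}
```

In the real S3A implementation the next page can additionally be fetched asynchronously while the caller processes the current one, which is where the overlap measured in the benchmark numbers above comes from.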