[GitHub] [hadoop] bshashikant commented on pull request #2355: HDFS-15611. Add list Snapshot command in WebHDFS.
bshashikant commented on pull request #2355:
URL: https://github.com/apache/hadoop/pull/2355#issuecomment-702536750

> Thanks @bshashikant . Should we add `SnapshotStatus[] getSnapshotListing(Path snapshotRoot)` to FileSystem? If yes, we should move SnapshotStatus to org.apache.hadoop.fs and declare dirStatus using FileStatus.

Thanks @szetszwo . I checked the code and saw that "getSnapshottableDirectoryList()" is also not moved to the FileSystem class yet. I would prefer not to move any of these to the FileSystem class here.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493820 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:19
Start Date: 02/Oct/20 05:19
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498622029

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/io/compress/lz4/Lz4Compressor.java
## @@ -236,7 +237,7 @@ public synchronized int compress(byte[] b, int off, int len)
 }
 // Compress data
-n = useLz4HC ? compressBytesDirectHC() : compressBytesDirect();
+n = compressBytesDirect();

Review comment: fixed.

Issue Time Tracking
-------------------
Worklog Id: (was: 493820)
Time Spent: 1h 10m (was: 1h)

> Using lz4-java in Lz4Codec
> --------------------------
>
>                 Key: HADOOP-17292
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17292
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: common
>    Affects Versions: 3.3.0
>            Reporter: L. C. Hsieh
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the lz4 codec, which has several disadvantages. It requires native libhadoop to be installed in the system LD_LIBRARY_PATH, and it has to be installed separately on each node of the clusters, container images, or local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from source, which is non-trivial. Also, this approach is platform dependent; the binary may not work on a different platform, so it requires recompilation.
> It requires extra configuration of java.library.path to load the natives, and it results in higher application deployment and maintenance cost for users.
> Projects such as Spark use [lz4-java|https://github.com/lz4/lz4-java], which is a JNI-based implementation. It contains native binaries in the jar file, and it can automatically load the native binaries into the JVM from the jar without any setup. If a native implementation cannot be found for a platform, it can fall back to a pure-Java implementation of lz4.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
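The load-native-or-fall-back behaviour described in the issue can be sketched as follows. This is a minimal illustration of the pattern, not lz4-java's actual internals: the library name "lz4demo" and the identity "compressor" are placeholders.

```java
// Sketch of the fallback-loading pattern the issue describes: try to load a
// native library, and fall back to a pure-Java implementation when it is
// absent. Names here are illustrative; real lz4-java ships native binaries
// inside its jar and a real pure-Java LZ4 implementation.
public class NativeFallbackDemo {

    interface Compressor {
        byte[] compress(byte[] src);
    }

    // Pure-Java stand-in used when no native library is available.
    static class JavaCompressor implements Compressor {
        @Override
        public byte[] compress(byte[] src) {
            return src.clone(); // placeholder: a real impl would emit LZ4 blocks
        }
    }

    static Compressor load() {
        try {
            // Native path first; throws UnsatisfiedLinkError when the library
            // cannot be found for this platform.
            System.loadLibrary("lz4demo");
            throw new IllegalStateException("native path not wired in this sketch");
        } catch (UnsatisfiedLinkError e) {
            // Graceful fallback: no java.library.path configuration needed.
            return new JavaCompressor();
        }
    }

    public static void main(String[] args) {
        Compressor c = load();
        System.out.println(c.getClass().getSimpleName());
    }
}
```

Because the jar carries both paths, deployments need neither libhadoop on LD_LIBRARY_PATH nor per-node recompilation, which is the gain the issue is after.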
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493819&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493819 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:15
Start Date: 02/Oct/20 05:15
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498621489

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/TestCodec.java
## @@ -143,22 +143,16 @@ public void testSnappyCodec() throws IOException {
 @Test
 public void testLz4Codec() throws IOException {
-if (NativeCodeLoader.isNativeCodeLoaded()) {
-  if (Lz4Codec.isNativeCodeLoaded()) {
-    conf.setBoolean(
+conf.setBoolean(
     CommonConfigurationKeys.IO_COMPRESSION_CODEC_LZ4_USELZ4HC_KEY,

Review comment: fixed.

Issue Time Tracking
-------------------
Worklog Id: (was: 493819)
Time Spent: 1h (was: 50m)
[jira] [Work logged] (HADOOP-17292) Using lz4-java in Lz4Codec
[ https://issues.apache.org/jira/browse/HADOOP-17292?focusedWorklogId=493818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493818 ]

ASF GitHub Bot logged work on HADOOP-17292:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 02/Oct/20 05:13
Start Date: 02/Oct/20 05:13
Worklog Time Spent: 10m
Work Description: viirya commented on a change in pull request #2350:
URL: https://github.com/apache/hadoop/pull/2350#discussion_r498621151

## File path: hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java
## @@ -494,8 +494,7 @@ public String getName() {
 private static boolean isAvailable(TesterPair pair) {
 Compressor compressor = pair.compressor;
-if (compressor.getClass().isAssignableFrom(Lz4Compressor.class)
-    && (NativeCodeLoader.isNativeCodeLoaded()))
+if (compressor.getClass().isAssignableFrom(Lz4Compressor.class))

Review comment: Added compatibility test.

Issue Time Tracking
-------------------
Worklog Id: (was: 493818)
Time Spent: 50m (was: 40m)
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205970#comment-17205970 ]

Attila Doroszlai commented on HADOOP-16990:
-------------------------------------------

Sure, I ran these tests before uploading the patch:
{code}
[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.web.oauth2.TestRefreshTokenTimeBasedTokenRefresher
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.006 s - in org.apache.hadoop.hdfs.web.oauth2.TestRefreshTokenTimeBasedTokenRefresher
[INFO] Running org.apache.hadoop.hdfs.web.oauth2.TestClientCredentialTimeBasedTokenRefresher
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 4.908 s - in org.apache.hadoop.hdfs.web.oauth2.TestClientCredentialTimeBasedTokenRefresher
[INFO] Running org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 6.2 s - in org.apache.hadoop.hdfs.web.TestWebHDFSOAuth2
[INFO]
[INFO] Results:
[INFO]
[INFO] Tests run: 3, Failures: 0, Errors: 0, Skipped: 0
{code}
But it looks like Yetus also runs these tests with the patch:
https://ci-hadoop.apache.org/job/PreCommit-HADOOP-Build/85/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs-client.txt

> Update Mockserver
> -----------------
>
>                 Key: HADOOP-16990
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16990
>             Project: Hadoop Common
>          Issue Type: Task
>            Reporter: Wei-Chiu Chuang
>            Assignee: Attila Doroszlai
>            Priority: Major
>         Attachments: HADOOP-16990.001.patch
>
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[GitHub] [hadoop] szetszwo commented on a change in pull request #2355: HDFS-15611. Add list Snapshot command in WebHDFS.
szetszwo commented on a change in pull request #2355:
URL: https://github.com/apache/hadoop/pull/2355#discussion_r498605120

## File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java
## @@ -1459,6 +1460,19 @@ SnapshotDiffReport decodeResponse(Map json) {
 }.run();
 }
+  public SnapshotStatus[] getSnapshotList(final Path snapshotDir)

Review comment: In DistributedFileSystem, it is called `public SnapshotStatus[] getSnapshotListing(Path snapshotRoot)`. Let's use the same name?

## File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/protocol/SnapshotStatus.java
## @@ -25,6 +25,7 @@
 import org.apache.hadoop.fs.Path;
 import org.apache.hadoop.fs.permission.FsPermission;
 import org.apache.hadoop.hdfs.DFSUtilClient;
+import org.apache.hadoop.util.StringUtils;

Review comment: Unused import.

## File path: hadoop-hdfs-project/hadoop-hdfs-httpfs/src/main/java/org/apache/hadoop/fs/http/client/HttpFSFileSystem.java
## @@ -1582,6 +1584,18 @@ public SnapshotDiffReport getSnapshotDiffReport(Path path,
     return JsonUtilClient.toSnapshottableDirectoryList(json);
 }
+  public SnapshotStatus[] getSnapshotList(Path snapshotRoot)

Review comment: It should call getSnapshotListing.

## File path: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/web/JsonUtilClient.java
## @@ -872,4 +873,39 @@ private static SnapshottableDirectoryStatus toSnapshottableDirectoryStatus(
     snapshotQuota, parentFullPath);
     return snapshottableDirectoryStatus;
 }
+
+  public static SnapshotStatus[] toSnapshotList(final Map json) {
+    if (json == null) {
+      return null;
+    }
+    List list = (List) json.get("SnapshotList");
+    if (list == null) {
+      return null;
+    }
+    SnapshotStatus[] statuses = new SnapshotStatus[list.size()];
+    for (int i = 0; i < list.size(); i++) {
+      statuses[i] = toSnapshotStatus((Map) list.get(i));
+    }
+    return statuses;
+  }
+
+  private static SnapshotStatus toSnapshotStatus(Map json) {
+    if (json == null) {
+      return null;
+    }
+    int snapshotID = getInt(json, "snapshotID", 0);
+    boolean isDeleted = ((String) json.get("deletionStatus")).contentEquals("DELETED");

Review comment: Use `"DELETED".equals(..)` in order to avoid an NPE:
`final boolean isDeleted = "DELETED".equals(json.get("deletionStatus"));`
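The reviewer's last point is the classic null-safe comparison idiom: calling `.equals` on the string literal simply returns false when the JSON key is absent, while calling a method on the looked-up value throws a NullPointerException. A standalone illustration (the helper names are ours, not Hadoop's):

```java
// Demonstrates why "DELETED".equals(value) is preferred over
// value.equals("DELETED") / value.contentEquals("DELETED") when the value
// may be null (e.g. a missing "deletionStatus" key in parsed JSON).
public class NullSafeEqualsDemo {

    // Constant-first comparison: null-tolerant, returns false for null.
    static boolean isDeleted(Object deletionStatus) {
        return "DELETED".equals(deletionStatus);
    }

    // Value-first comparison: throws NullPointerException when the key is missing.
    static boolean isDeletedUnsafe(Object deletionStatus) {
        return ((String) deletionStatus).contentEquals("DELETED");
    }

    public static void main(String[] args) {
        System.out.println(isDeleted(null));      // false, no exception
        System.out.println(isDeleted("DELETED")); // true
        try {
            isDeletedUnsafe(null);
        } catch (NullPointerException e) {
            System.out.println("NPE from value-first comparison");
        }
    }
}
```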
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205919#comment-17205919 ]

Wei-Chiu Chuang commented on HADOOP-16990:
------------------------------------------

Looks good... Attila, can you also manually verify that the HDFS tests that use Mockserver pass? The Hadoop precommit bypasses HDFS tests, so you'll need an additional manual check.
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205903#comment-17205903 ]

Hadoop QA commented on HADOOP-16990:
------------------------------------

| (/) *{color:green}+1 overall{color}* |
\\ \\
|| Vote || Subsystem || Runtime || Logfile || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 40m 49s{color} | | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} || ||
| {color:green}+1{color} | {color:green} dupname {color} | {color:green} 0m 0s{color} | | {color:green} No case conflicting files found. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | | {color:green} The patch appears to include 3 new or modified test files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 28s{color} | | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 27m 30s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 29m 27s{color} | | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 23m 46s{color} | | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 3m 28s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 49s{color} | | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 23m 9s{color} | | {color:green} branch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 29s{color} | | {color:green} trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 23s{color} | | {color:green} trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:blue}0{color} | {color:blue} spotbugs {color} | {color:blue} 3m 21s{color} | | {color:blue} Used deprecated FindBugs config; considering switching to SpotBugs. {color} |
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue} 0m 32s{color} | | {color:blue} branch/hadoop-project no findbugs output file (findbugsXml.xml) {color} |
|| || || || {color:brown} Patch Compile Tests {color} || ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 1m 15s{color} | | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 18s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 28m 16s{color} | | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 28m 16s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 24m 4s{color} | | {color:green} the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 24m 4s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 4m 5s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 1m 55s{color} | | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s{color} | | {color:green} The patch has no ill-formed XML file. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 17m 32s{color} | | {color:green} patch has no errors when building and testing our client artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 25s{color} | | {color:green} the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 17s{color}
[jira] [Work logged] (HADOOP-17265) ABFS: Support for Client Correlation ID
[ https://issues.apache.org/jira/browse/HADOOP-17265?focusedWorklogId=493703&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493703 ]

ASF GitHub Bot logged work on HADOOP-17265:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 01/Oct/20 21:42
Start Date: 01/Oct/20 21:42
Worklog Time Spent: 10m
Work Description: sumangala-patki commented on pull request #2344:
URL: https://github.com/apache/hadoop/pull/2344#issuecomment-702413487

Test Results

HNS-Enabled Account (Location: East US 2)
```
Authentication type: SharedKey
[INFO] Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 458, Failures: 0, Errors: 0, Skipped: 42
[WARNING] Tests run: 207, Failures: 0, Errors: 0, Skipped: 24

Authentication type: OAuth
[INFO] Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
[WARNING] Tests run: 458, Failures: 0, Errors: 0, Skipped: 75
[WARNING] Tests run: 207, Failures: 0, Errors: 0, Skipped: 140
```

HNS-Disabled Account (Location: East US 2, Central US)
```
Authentication type: SharedKey
[INFO] Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testSkipBounds:196->Assert.assertTrue:41->Assert.fail:88 There should not be any network I/O (elapsedTimeMs=53).
[ERROR] Tests run: 458, Failures: 1, Errors: 0, Skipped: 246
[WARNING] Tests run: 207, Failures: 0, Errors: 0, Skipped: 24

Authentication type: OAuth
[INFO] Tests run: 87, Failures: 0, Errors: 0, Skipped: 0
[ERROR] Failures:
[ERROR] ITestAzureBlobFileSystemRandomRead.testValidateSeekBounds:245->Assert.assertTrue:41->Assert.fail:88 There should not be any network I/O (elapsedTimeMs=22).
[ERROR] Tests run: 458, Failures: 1, Errors: 0, Skipped: 246
[WARNING] Tests run: 207, Failures: 0, Errors: 0, Skipped: 24
```

Issue Time Tracking
-------------------
Worklog Id: (was: 493703)
Time Spent: 50m (was: 40m)

> ABFS: Support for Client Correlation ID
> ---------------------------------------
>
>                 Key: HADOOP-17265
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17265
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/azure
>    Affects Versions: 3.3.0
>            Reporter: Sumangala Patki
>            Priority: Major
>              Labels: abfsactive, pull-request-available
>          Time Spent: 50m
>  Remaining Estimate: 0h
>
> Introducing a client correlation ID that appears in the Azure diagnostic logs

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
[jira] [Work logged] (HADOOP-17265) ABFS: Support for Client Correlation ID
[ https://issues.apache.org/jira/browse/HADOOP-17265?focusedWorklogId=493698&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493698 ]

ASF GitHub Bot logged work on HADOOP-17265:
-------------------------------------------

Author: ASF GitHub Bot
Created on: 01/Oct/20 21:33
Start Date: 01/Oct/20 21:33
Worklog Time Spent: 10m
Work Description: sumangala-patki commented on a change in pull request #2344:
URL: https://github.com/apache/hadoop/pull/2344#discussion_r498524223

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java
## @@ -0,0 +1,61 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import java.util.UUID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN;
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID;
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH;
+
+public class TrackingContext {
+  private String clientCorrelationID;
+  private String clientRequestID;
+  private static final Logger LOG = LoggerFactory.getLogger(
+      org.apache.hadoop.fs.azurebfs.services.AbfsClient.class);
+
+  public TrackingContext(String clientCorrelationID) {
+    // validation
+    if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) ||
+        (!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) {
+      this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID;
+      LOG.debug("Invalid config provided; correlation id not included in header.");
+    } else if (clientCorrelationID.length() > 0) {
+      this.clientCorrelationID = clientCorrelationID + ":";
+      LOG.debug("Client correlation id has been validated and set successfully.");

Review comment: Success log omitted

Issue Time Tracking
-------------------
Worklog Id: (was: 493698)
Time Spent: 40m (was: 0.5h)
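The validation step in the TrackingContext constructor above can be sketched standalone. Note the constants here are assumptions for illustration only: `MAX_LEN` and `ID_PATTERN` stand in for Hadoop's `MAX_CLIENT_CORRELATION_ID_LENGTH` and `CLIENT_CORRELATION_ID_PATTERN`, whose actual values are defined in FileSystemConfigurations.

```java
// Sketch of correlation-id validation: an id that is too long or contains
// disallowed characters is dropped; a valid non-empty id becomes a "<id>:"
// prefix for the per-request id. MAX_LEN and ID_PATTERN are assumed values,
// not the real Hadoop constants.
public class CorrelationIdValidation {
    static final int MAX_LEN = 72;                    // assumed length limit
    static final String ID_PATTERN = "[a-zA-Z0-9-]*"; // assumed allowed charset

    // Returns the header prefix: "<id>:" when valid and non-empty, "" otherwise.
    static String correlationPrefix(String id) {
        if (id.length() > MAX_LEN || !id.matches(ID_PATTERN)) {
            return ""; // invalid config: omit the correlation id from the header
        }
        return id.isEmpty() ? "" : id + ":";
    }

    public static void main(String[] args) {
        System.out.println(correlationPrefix("client-01")); // "client-01:"
        System.out.println(correlationPrefix("bad id!"));   // "" (space and '!' rejected)
    }
}
```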
[GitHub] [hadoop] sumangala-patki commented on a change in pull request #2344: HADOOP-17265. ABFS: Support for Client Correlation ID
sumangala-patki commented on a change in pull request #2344: URL: https://github.com/apache/hadoop/pull/2344#discussion_r498524223 ## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java ## @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.utils; + +import java.util.UUID; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH; + +public class TrackingContext { + private String clientCorrelationID; + private String clientRequestID; + private static final Logger LOG = LoggerFactory.getLogger( + org.apache.hadoop.fs.azurebfs.services.AbfsClient.class); + + public TrackingContext(String clientCorrelationID) { +//validation +if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) || +(!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) { + this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; + LOG.debug("Invalid config provided; correlation id not included in header."); +} +else if (clientCorrelationID.length() > 0) { + this.clientCorrelationID = clientCorrelationID + ":"; + LOG.debug("Client correlation id has been validated and set successfully."); Review comment: Success log omitted This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
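The validation under review above checks both the length and the character pattern of a client-supplied correlation ID before including it in the request header. A minimal standalone sketch of that check follows; the constant values and the class name are assumptions for illustration only, since the real limits live in `FileSystemConfigurations`, which is not shown in this diff.

```java
// Hypothetical sketch of the correlation-id validation discussed in the
// review above. MAX_CLIENT_CORRELATION_ID_LENGTH and
// CLIENT_CORRELATION_ID_PATTERN are assumed values, not the real
// FileSystemConfigurations constants.
public class CorrelationIdCheck {

    // Assumed values; the real ones come from FileSystemConfigurations.
    static final int MAX_CLIENT_CORRELATION_ID_LENGTH = 72;
    static final String CLIENT_CORRELATION_ID_PATTERN = "[a-zA-Z0-9-]*";

    /** Mirrors the validation branch: reject over-long or ill-formed ids. */
    static boolean isValid(String id) {
        return id.length() <= MAX_CLIENT_CORRELATION_ID_LENGTH
            && id.matches(CLIENT_CORRELATION_ID_PATTERN);
    }

    /** An invalid id falls back to the default; a valid one gets ":" appended. */
    static String toHeaderPrefix(String id, String defaultId) {
        if (!isValid(id)) {
            return defaultId; // invalid config: correlation id not included
        }
        return id.isEmpty() ? id : id + ":";
    }
}
```

The empty-valid case (returning the id unchanged) is a guess here; the diff only shows the invalid branch and the `length() > 0` branch.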
[jira] [Work logged] (HADOOP-17183) ABFS: Enable checkaccess API
[ https://issues.apache.org/jira/browse/HADOOP-17183?focusedWorklogId=493676&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493676 ]
ASF GitHub Bot logged work on HADOOP-17183:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 20:31
Start Date: 01/Oct/20 20:31
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2331: URL: https://github.com/apache/hadoop/pull/2331#issuecomment-702380473
+1, merged to 3.3 and trunk
Issue Time Tracking
---
Worklog Id: (was: 493676)
Time Spent: 40m (was: 0.5h)
> ABFS: Enable checkaccess API
>
> Key: HADOOP-17183
> URL: https://issues.apache.org/jira/browse/HADOOP-17183
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Bilahari T H
> Assignee: Bilahari T H
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.1
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Enable check access on ABFS. It is currently disabled by default.
[GitHub] [hadoop] steveloughran commented on pull request #2331: HADOOP-17183. ABFS: Enabling checkaccess on ABFS
steveloughran commented on pull request #2331: URL: https://github.com/apache/hadoop/pull/2331#issuecomment-702380473 +1, merged to 3.3 and trunk
[jira] [Updated] (HADOOP-17183) ABFS: Enable checkaccess API
[ https://issues.apache.org/jira/browse/HADOOP-17183?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Steve Loughran updated HADOOP-17183:
---
Fix Version/s: 3.3.1
Resolution: Fixed
Status: Resolved (was: Patch Available)
> ABFS: Enable checkaccess API
>
> Key: HADOOP-17183
> URL: https://issues.apache.org/jira/browse/HADOOP-17183
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Bilahari T H
> Assignee: Bilahari T H
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.3.1
>
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Enable check access on ABFS. It is currently disabled by default.
[jira] [Work logged] (HADOOP-17183) ABFS: Enable checkaccess API
[ https://issues.apache.org/jira/browse/HADOOP-17183?focusedWorklogId=493675&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493675 ]
ASF GitHub Bot logged work on HADOOP-17183:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 20:29
Start Date: 01/Oct/20 20:29
Worklog Time Spent: 10m
Work Description: steveloughran merged pull request #2331: URL: https://github.com/apache/hadoop/pull/2331
Issue Time Tracking
---
Worklog Id: (was: 493675)
Time Spent: 0.5h (was: 20m)
> ABFS: Enable checkaccess API
>
> Key: HADOOP-17183
> URL: https://issues.apache.org/jira/browse/HADOOP-17183
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/azure
> Affects Versions: 3.3.0
> Reporter: Bilahari T H
> Assignee: Bilahari T H
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Enable check access on ABFS. It is currently disabled by default.
[GitHub] [hadoop] steveloughran merged pull request #2331: HADOOP-17183. ABFS: Enabling checkaccess on ABFS
steveloughran merged pull request #2331: URL: https://github.com/apache/hadoop/pull/2331
[jira] [Comment Edited] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205803#comment-17205803 ]
Attila Doroszlai edited comment on HADOOP-16990 at 10/1/20, 8:25 PM:
---
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
With [^HADOOP-16990.001.patch]:
{code}
[INFO] +- org.mock-server:mockserver-netty:jar:5.11.1:test
[INFO] | +- org.mock-server:mockserver-client-java:jar:5.11.1:test
[INFO] | \- org.mock-server:mockserver-core:jar:5.11.1:test
[INFO] | +- com.lmax:disruptor:jar:3.4.2:test
[INFO] | +- io.netty:netty-codec-socks:jar:4.1.50.Final:test
{code}
BTW, MockServer is only used in HDFS Client:
{code}
hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFSOAuth2.java
hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/web/oauth2/TestClientCredentialTimeBasedTokenRefresher.java
hadoop-hdfs-project/hadoop-hdfs-client/src/test/java/org/apache/hadoop/hdfs/web/oauth2/TestRefreshTokenTimeBasedTokenRefresher.java
{code}
was (Author: adoroszlai):
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
With [^HADOOP-16990.001.patch]:
{code}
[INFO] +- org.mock-server:mockserver-netty:jar:5.11.1:test
[INFO] | +- org.mock-server:mockserver-client-java:jar:5.11.1:test
[INFO] | \- org.mock-server:mockserver-core:jar:5.11.1:test
[INFO] | +- com.lmax:disruptor:jar:3.4.2:test
[INFO] | +- io.netty:netty-codec-socks:jar:4.1.50.Final:test
{code}
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
> Attachments: HADOOP-16990.001.patch
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Updated] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HADOOP-16990: -- Status: Patch Available (was: In Progress) > Update Mockserver > - > > Key: HADOOP-16990 > URL: https://issues.apache.org/jira/browse/HADOOP-16990 > Project: Hadoop Common > Issue Type: Task >Reporter: Wei-Chiu Chuang >Assignee: Attila Doroszlai >Priority: Major > Attachments: HADOOP-16990.001.patch > > > We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Comment Edited] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205803#comment-17205803 ]
Attila Doroszlai edited comment on HADOOP-16990 at 10/1/20, 8:22 PM:
---
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
With [^HADOOP-16990.001.patch]:
{code}
[INFO] +- org.mock-server:mockserver-netty:jar:5.11.1:test
[INFO] | +- org.mock-server:mockserver-client-java:jar:5.11.1:test
[INFO] | \- org.mock-server:mockserver-core:jar:5.11.1:test
[INFO] | +- com.lmax:disruptor:jar:3.4.2:test
[INFO] | +- io.netty:netty-codec-socks:jar:4.1.50.Final:test
{code}
was (Author: adoroszlai):
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
> Attachments: HADOOP-16990.001.patch
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[GitHub] [hadoop] steveloughran commented on a change in pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on a change in pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#discussion_r498491655 ## File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java ## @@ -348,6 +348,16 @@ public Path getWorkPath() throws IOException { * @param context the job's context */ public void setupJob(JobContext context) throws IOException { +// Downgrade v2 to v1 with a warning. +if (algorithmVersion == 2) { + Logger log = LoggerFactory.getLogger( + "org.apache.hadoop.mapreduce.lib.output." + + "FileOutputCommitter.Algorithm"); + + log.warn("The v2 commit algorithm is deprecated;" + + " please switch to the v1 algorithm"); Review comment: switching to your text
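The diff in the review above emits a deprecation warning at job setup when the v2 commit algorithm is configured. The guard can be modelled in isolation like this; the class, the list-based "log", and the method name are inventions for the sketch (the real code uses an SLF4J logger inside `FileOutputCommitter`), and only the warning path shown in the diff is modelled.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the setupJob() guard from the diff above. Names are
// illustrative; warnings are collected in a list instead of SLF4J so the
// sketch is self-contained.
public class CommitterGuard {
    static final List<String> WARNINGS = new ArrayList<>();

    /** Warn when the deprecated v2 algorithm is requested. */
    static int algorithmWithWarning(int requested) {
        if (requested == 2) {
            WARNINGS.add("The v2 commit algorithm is deprecated;"
                + " please switch to the v1 algorithm");
        }
        return requested;
    }
}
```

Whether the job then actually runs v1 or v2 is decided elsewhere in the patch; this sketch deliberately leaves the requested value unchanged.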
[jira] [Updated] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai updated HADOOP-16990: -- Attachment: HADOOP-16990.001.patch > Update Mockserver > - > > Key: HADOOP-16990 > URL: https://issues.apache.org/jira/browse/HADOOP-16990 > Project: Hadoop Common > Issue Type: Task >Reporter: Wei-Chiu Chuang >Assignee: Attila Doroszlai >Priority: Major > Attachments: HADOOP-16990.001.patch > > > We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205803#comment-17205803 ]
Attila Doroszlai commented on HADOOP-16990:
---
MockServer 5.11.0 and 5.11.1 (latest release) depend on Netty 4.1.50 (same as Hadoop trunk), but also on Guava 28.2-android, which is ahead of Hadoop (27.0-jre). There is no previous release of MockServer with a Guava 27.0 dependency; they upgraded from 20 to 28.1.
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
> Attachments: HADOOP-16990.001.patch
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Work logged] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493672&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493672 ]
ASF GitHub Bot logged work on HADOOP-17281:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 20:13
Start Date: 01/Oct/20 20:13
Worklog Time Spent: 10m
Work Description: steveloughran removed a comment on pull request #2354: URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522
Looks good. Annoying about the return types which force you to do that wrapping/casting. Can't you just forcibly cast the return type of the inner iterator? After all, type erasure means all type info will be lost in the actual compiled binary. I'd prefer that, as it will give you automatic passthrough of the IOStatistics stuff.
Add text to filesystem.md, something which:
* specifies the result is exactly the same as listStatus, provided no other caller updates the directory during the list
* declares that it's not atomic, and that high-performance implementations will page
* and that if a path isn't there, that fact may not surface until next/hasNext... that is, we do lazy eval for all file IO
We need similar new contract tests in AbstractContractGetFileStatusTest for all to use:
* that in a dir with files and subdirectories, you get both returned in the listing
* that you can iterate through with next() to failure as well as hasNext/next, and get the same results
* listStatusIterator(file) returns the file
* listStatusIterator("/") gives you a listing of root (put that in AbstractContractRootDirectoryTest)
And two for changes partway through the iteration:
* change the directory during a list to add/delete files
* delete the actual path
These tests can't assert on what will happen, and with paged IO aren't likely to pick up on changes... they're just there to show it can be done and to pick up on any major issues with implementations.
Issue Time Tracking
---
Worklog Id: (was: 493672)
Time Spent: 0.5h (was: 20m)
> Implement FileSystem.listStatusIterator() in S3AFileSystem
>
> Key: HADOOP-17281
> URL: https://issues.apache.org/jira/browse/HADOOP-17281
> Project: Hadoop Common
> Issue Type: Task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Mukund Thakur
> Assignee: Mukund Thakur
> Priority: Major
> Labels: pull-request-available
> Time Spent: 0.5h
> Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements the listStatus() API, which returns an array. Once we implement listStatusIterator(), clients can benefit from the async listing done recently in https://issues.apache.org/jira/browse/HADOOP-17074 by performing some tasks on files while iterating over them.
>
> CC [~stevel]
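The lazy-evaluation point in the comment above (a missing path may only surface at `hasNext`/`next`) can be sketched with a toy paged iterator. This is an illustrative model under stated assumptions, not the actual S3A listing code: the class name and the page-fetching function are invented, and a single page stands in for real paging.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

/**
 * Toy model of a lazy listing iterator: no IO happens at construction,
 * so failures (e.g. path not found) only surface on hasNext()/next().
 * Single page only, for brevity; real implementations page through results.
 */
public class LazyListingIterator<T> implements Iterator<T> {
    private final Supplier<List<T>> fetchPage; // may throw on first use
    private Iterator<T> page;                  // null until the first fetch

    public LazyListingIterator(Supplier<List<T>> fetchPage) {
        this.fetchPage = fetchPage;            // deliberately no fetch here
    }

    @Override
    public boolean hasNext() {
        if (page == null) {
            page = fetchPage.get().iterator(); // lazy eval: IO happens here
        }
        return page.hasNext();
    }

    @Override
    public T next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        return page.next();
    }
}
```

Constructing the iterator over a nonexistent path succeeds in this model; the supplier's exception is only thrown once iteration starts, which is the behaviour the comment asks filesystem.md and the contract tests to pin down.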
[GitHub] [hadoop] steveloughran removed a comment on pull request #2354: HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem
steveloughran removed a comment on pull request #2354: URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522 (comment text identical to the worklog record above)
[jira] [Work logged] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493671&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493671 ]
ASF GitHub Bot logged work on HADOOP-17281:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 20:12
Start Date: 01/Oct/20 20:12
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2354: URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522
Looks good. Annoying about the return types which force you to do that wrapping/casting. Can't you just forcibly cast the return type of the inner iterator? After all, type erasure means all type info will be lost in the actual compiled binary. I'd prefer that, as it will give you automatic passthrough of the IOStatistics stuff.
Add text to filesystem.md, something which:
* specifies the result is exactly the same as listStatus, provided no other caller updates the directory during the list
* declares that it's not atomic, and that high-performance implementations will page
* and that if a path isn't there, that fact may not surface until next/hasNext... that is, we do lazy eval for all file IO
We need similar new contract tests in AbstractContractGetFileStatusTest for all to use:
* that in a dir with files and subdirectories, you get both returned in the listing
* that you can iterate through with next() to failure as well as hasNext/next, and get the same results
* listStatusIterator(file) returns the file
* listStatusIterator("/") gives you a listing of root (put that in AbstractContractRootDirectoryTest)
And two for changes partway through the iteration:
* change the directory during a list to add/delete files
* delete the actual path
These tests can't assert on what will happen, and with paged IO aren't likely to pick up on changes... they're just there to show it can be done and to pick up on any major issues with implementations.
Issue Time Tracking
---
Worklog Id: (was: 493671)
Time Spent: 20m (was: 10m)
> Implement FileSystem.listStatusIterator() in S3AFileSystem
>
> Key: HADOOP-17281
> URL: https://issues.apache.org/jira/browse/HADOOP-17281
> Project: Hadoop Common
> Issue Type: Task
> Components: fs/s3
> Affects Versions: 3.3.0
> Reporter: Mukund Thakur
> Assignee: Mukund Thakur
> Priority: Major
> Labels: pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements the listStatus() API, which returns an array. Once we implement listStatusIterator(), clients can benefit from the async listing done recently in https://issues.apache.org/jira/browse/HADOOP-17074 by performing some tasks on files while iterating over them.
>
> CC [~stevel]
[GitHub] [hadoop] steveloughran commented on pull request #2354: HADOOP-17281 Implement FileSystem.listStatusIterator() in S3AFileSystem
steveloughran commented on pull request #2354: URL: https://github.com/apache/hadoop/pull/2354#issuecomment-702371522 (comment text identical to the worklog record above)
[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API
[ https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=493666&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493666 ]
ASF GitHub Bot logged work on HADOOP-16830:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 19:58
Start Date: 01/Oct/20 19:58
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #2323: URL: https://github.com/apache/hadoop/pull/2323#issuecomment-702364693
@mehakmeet
* duration tracking for classic functions. Issue: are the names of the `trackDuration` calls correct now?
* class DurationStatisticSummary to store and extract duration stats from a statistic
* which is used in the tests to verify that the new functions all work
Issue Time Tracking
---
Worklog Id: (was: 493666)
Time Spent: 6h 40m (was: 6.5h)
> Add public IOStatistics API
>
> Key: HADOOP-16830
> URL: https://issues.apache.org/jira/browse/HADOOP-16830
> Project: Hadoop Common
> Issue Type: New Feature
> Components: fs, fs/s3
> Affects Versions: 3.3.0
> Reporter: Steve Loughran
> Assignee: Steve Loughran
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> Applications like to collect the statistics which specific operations take, by collecting exactly those operations done during the execution of FS API calls by their individual worker threads, and returning these to their job driver.
> * S3A has a statistics API for some streams, but it's a non-standard one; Impala can't use it
> * FileSystem storage statistics are public, but as they aren't cross-thread, they don't aggregate properly
> Proposed:
> # A new IOStatistics interface to serve up statistics
> # S3A to implement
> # other stores to follow
> # Pass-through from the usual wrapper classes (FS data input/output streams)
> It's hard to think about how best to offer an API for operation context stats, and how to actually implement it.
> ThreadLocal isn't enough because the helper threads need to update on the thread-local value of the instigator.
> My initial PoC doesn't address that issue, but it shows what I'm thinking of.
[GitHub] [hadoop] steveloughran commented on pull request #2323: HADOOP-16830. Add public IOStatistics API.
steveloughran commented on pull request #2323: URL: https://github.com/apache/hadoop/pull/2323#issuecomment-702364693 @mehakmeet * duration tracking for classic function. Issue: are the names of the `trackDuration` calls correct now? * class DurationStatisticSummary to store and extract duration stats from a statistic. * which is used in the tests to verify that the new functions all work
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205783#comment-17205783 ]
Wei-Chiu Chuang commented on HADOOP-16990:
---
{noformat}
[INFO] +- org.mock-server:mockserver-netty:jar:3.9.2:test
[INFO] | +- org.mock-server:mockserver-client-java:jar:3.9.2:test
[INFO] | +- org.mock-server:mockserver-core:jar:3.9.2:test
[INFO] | | +- io.netty:netty-codec-socks:jar:4.0.24.Final:test
{noformat}
I don't think we are really affected, especially since it's in test scope. But it is very annoying when users complain about a new CVE found in the classpath.
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205779#comment-17205779 ]
Ayush Saxena commented on HADOOP-16990:
---
What advantages do we get with upgrading MockServer? Is there a CVE, or some serious performance improvement?
{{MockServer-3.9.2}} uses guava-18. If the above holds true, what version do you intend to move to?
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Commented] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205778#comment-17205778 ]
Wei-Chiu Chuang commented on HADOOP-16990:
---
Thanks for picking this up, [~adoroszlai]!
> Update Mockserver
>
> Key: HADOOP-16990
> URL: https://issues.apache.org/jira/browse/HADOOP-16990
> Project: Hadoop Common
> Issue Type: Task
> Reporter: Wei-Chiu Chuang
> Assignee: Attila Doroszlai
> Priority: Major
>
> We are on Mockserver 3.9.2 which is more than 5 years old. Time to update.
[jira] [Work started] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on HADOOP-16990 started by Attila Doroszlai.
[jira] [Assigned] (HADOOP-16990) Update Mockserver
[ https://issues.apache.org/jira/browse/HADOOP-16990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Attila Doroszlai reassigned HADOOP-16990:
-
Assignee: Attila Doroszlai
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493632=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493632 ] ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 18:26
Start Date: 01/Oct/20 18:26
Worklog Time Spent: 10m
Work Description: ferhui commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498437024

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.
+     * User should not set an illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");

Review comment: Ok. Done

Worklog Id: (was: 493632) Time Spent: 8h 10m (was: 8h)

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Hui Fei
> Assignee: Hui Fei
> Priority: Major
> Labels: pull-request-available
> Time Spent: 8h 10m
> Remaining Estimate: 0h
>
> Now the context is a plain string. We need to extend CallerContext because the context may contain many items, including:
> * router ip
> * MR or CLI
> * etc
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493630=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493630 ] ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 18:19
Start Date: 01/Oct/20 18:19
Worklog Time Spent: 10m
Work Description: ferhui commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498433031

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -109,11 +114,53 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
-    private final String context;
+    private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '='.
+     * User should not set an illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");
+    private final String fieldSeparator;
+    private final StringBuilder sb = new StringBuilder();
     private byte[] signature;

     public Builder(String context) {
-      this.context = context;
+      this(context, new Configuration());
+    }
+
+    public Builder(String context, Configuration conf) {
+      if (isValid(context)) {
+        sb.append(context);
+      }
+      fieldSeparator = conf.get(HADOOP_CALLER_CONTEXT_SEPARATOR_KEY,
+          HADOOP_CALLER_CONTEXT_SEPARATOR_DEFAULT);
+      checkFieldSeparator(fieldSeparator);
+    }
+
+    /**
+     * Check whether the separator is legal.
+     * The illegal separators include '\t', '\n', '='.
+     * Throws IllegalArgumentException if the separator is illegal.
+     * @param separator the separator of fields.
+     */
+    private void checkFieldSeparator(String separator) {
+      if (ILLEGAL_SEPARATORS.stream()

Review comment: Ok. It's done

Worklog Id: (was: 493630) Time Spent: 8h (was: 7h 50m)
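The Builder changes reviewed above join context fields with a configurable separator while rejecting '\t', '\n', and '='. A standalone sketch of that idea — class and method names here are illustrative, not the real org.apache.hadoop.ipc.CallerContext API, and the separator is passed in directly instead of being read from a Hadoop Configuration:

```java
import java.util.Arrays;
import java.util.List;

// Illustrative sketch of the builder pattern in the diff above.
public class ContextBuilder {
    private static final String KEY_VALUE_SEPARATOR = ":";
    // '\t' and '\n' would corrupt audit-log lines; '=' clashes with key=value parsing.
    private static final List<String> ILLEGAL_SEPARATORS =
        Arrays.asList("\t", "\n", "=");

    private final String fieldSeparator;
    private final StringBuilder sb = new StringBuilder();

    public ContextBuilder(String context, String fieldSeparator) {
        if (ILLEGAL_SEPARATORS.contains(fieldSeparator)) {
            throw new IllegalArgumentException(
                "Illegal field separator: " + fieldSeparator);
        }
        this.fieldSeparator = fieldSeparator;
        if (context != null && !context.isEmpty()) {
            sb.append(context);
        }
    }

    // Append one key:value field, prefixed by the separator when needed.
    public ContextBuilder append(String key, String value) {
        if (sb.length() > 0) {
            sb.append(fieldSeparator);
        }
        sb.append(key).append(KEY_VALUE_SEPARATOR).append(value);
        return this;
    }

    public String build() {
        return sb.toString();
    }
}
```

Usage: `new ContextBuilder("clientIp:10.0.0.2", ",").append("from", "CLI").build()` yields `clientIp:10.0.0.2,from:CLI`, while passing `"\t"` as the separator throws at construction time.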
[GitHub] [hadoop] smengcl edited a comment on pull request #2352: HDFS-15607. Create trash dir when allowing snapshottable dir
smengcl edited a comment on pull request #2352: URL: https://github.com/apache/hadoop/pull/2352#issuecomment-701586546
> Do we need to add a provisionTrash command for WebHdfs as well?

Good point. I think so, yes.

Update: As I was attempting to add `PROVISIONSNAPSHOTTRASH` to WebHDFS, I realized `NamenodeWebHdfsMethods` is already server code, while the whole provision-trash logic is on the **client** side (same as provisioning EZ trash). This would also imply that WebHDFS `ALLOWSNAPSHOT` won't trigger provisioning snapshot trash at the moment. Note that WebHDFS doesn't support encryption zone commands (create, list, etc.). I have opened another jira, [HDFS-15612](https://issues.apache.org/jira/browse/HDFS-15612), for discussion on WebHDFS support for provisioning snapshot trash. Let's rule WebHDFS out of this jira for now. @bshashikant
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493609=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493609 ] ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 17:43
Start Date: 01/Oct/20 17:43
Worklog Time Spent: 10m
Work Description: goiri commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498413903

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -109,11 +114,53 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
-    private final String context;
+    private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '='.
+     * User should not set an illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");
+    private final String fieldSeparator;
+    private final StringBuilder sb = new StringBuilder();
     private byte[] signature;

     public Builder(String context) {
-      this.context = context;
+      this(context, new Configuration());
+    }
+
+    public Builder(String context, Configuration conf) {
+      if (isValid(context)) {
+        sb.append(context);
+      }
+      fieldSeparator = conf.get(HADOOP_CALLER_CONTEXT_SEPARATOR_KEY,
+          HADOOP_CALLER_CONTEXT_SEPARATOR_DEFAULT);
+      checkFieldSeparator(fieldSeparator);
+    }
+
+    /**
+     * Check whether the separator is legal.
+     * The illegal separators include '\t', '\n', '='.
+     * Throws IllegalArgumentException if the separator is illegal.
+     * @param separator the separator of fields.
+     */
+    private void checkFieldSeparator(String separator) {
+      if (ILLEGAL_SEPARATORS.stream()

Review comment: Not that it is wrong, but we could just do ILLEGAL_SEPARATORS.contains(separator), and if we made it a HashSet it would be faster.

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.
+     * User should not set an illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");

Review comment: I would use a set: Collections.unmodifiableSet()

Worklog Id: (was: 493609) Time Spent: 7h 50m (was: 7h 40m)
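The reviewer's suggestion above — an unmodifiable HashSet instead of a List, giving O(1) membership checks and an immutable constant — might look like the following sketch (the class name is hypothetical, not the actual patch):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

public class IllegalSeparators {
    // Unmodifiable HashSet: constant-time contains(), and callers
    // cannot accidentally mutate the shared constant.
    private static final Set<String> ILLEGAL_SEPARATORS =
        Collections.unmodifiableSet(
            new HashSet<>(Arrays.asList("\t", "\n", "=")));

    public static boolean isIllegal(String separator) {
        return ILLEGAL_SEPARATORS.contains(separator);
    }
}
```

Compared with `ILLEGAL_SEPARATORS.stream().anyMatch(...)` over a List, `contains` on a HashSet avoids both the stream setup cost and the linear scan, which matters little for three entries but is the more idiomatic shape for a membership check.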
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493595=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493595 ] ASF GitHub Bot logged work on HADOOP-17125:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 17:07
Start Date: 01/Oct/20 17:07
Worklog Time Spent: 10m
Work Description: viirya commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702273856

Hmm, for CompressDecompressTester.java, it seems to me that it is from the original code?

```java
 else if (compressor.getClass().isAssignableFrom(ZlibCompressor.class)) {
   return ZlibFactory.isNativeZlibLoaded(new Configuration());
-}
-else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)
-&& isNativeSnappyLoadable())
+}
+else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class))
```

Anyway, I can fix it here if you think it is ok.

Worklog Id: (was: 493595) Time Spent: 22h 40m (was: 22.5h)

> Using snappy-java in SnappyCodec
> 
>
> Key: HADOOP-17125
> URL: https://issues.apache.org/jira/browse/HADOOP-17125
> Project: Hadoop Common
> Issue Type: New Feature
> Components: common
> Affects Versions: 3.3.0
> Reporter: DB Tsai
> Priority: Major
> Labels: pull-request-available
> Time Spent: 22h 40m
> Remaining Estimate: 0h
>
> In Hadoop, we use native libs for the snappy codec, which has several disadvantages:
> * It requires native *libhadoop* and *libsnappy* to be installed in the system *LD_LIBRARY_PATH*, and they have to be installed separately on each node of the clusters, container images, or local test environments, which adds huge complexity from a deployment point of view. In some environments, it requires compiling the natives from sources, which is non-trivial. Also, this approach is platform dependent; the binary may not work on a different platform, so it requires recompilation.
> * It requires extra configuration of *java.library.path* to load the natives, and it results in higher application deployment and maintenance cost for users.
> Projects such as *Spark* and *Parquet* use [snappy-java|https://github.com/xerial/snappy-java], which is a JNI-based implementation. It contains native binaries for Linux, Mac, and IBM in the jar file, and it can automatically load the native binaries into the JVM from the jar without any setup. If a native implementation cannot be found for a platform, it can fall back to a pure-Java implementation of snappy based on [aircompressor|https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy].
[jira] [Commented] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205644#comment-17205644 ] Michael Stack commented on HADOOP-17288:

{quote}Ideally if the downstream has to upgrade guava, then this patch has no meaning.
{quote}
+1
{quote}Then we might need to shade them as well? May be {{curator}} can be one of those.
{quote}
Yes. Unfortunately the tangles start to compound fast when a dependency's dependency is also a hadoop dependency (and the versions don't align). One thing to consider is removing problem dependencies (like curator) if they are not heavily used. Thanks [~ayushtkn]

> Use shaded guava from thirdparty
> 
>
> Key: HADOOP-17288
> URL: https://issues.apache.org/jira/browse/HADOOP-17288
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Use the shaded version of guava in hadoop-thirdparty
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493544=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493544 ] ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 15:34
Start Date: 01/Oct/20 15:34
Worklog Time Spent: 10m
Work Description: ferhui commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498337876

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.

Review comment: It's ok. Removed "etc" from here and the other places.

Worklog Id: (was: 493544) Time Spent: 7h 40m (was: 7.5h)
[GitHub] [hadoop] jbrennan333 commented on pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
jbrennan333 commented on pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#issuecomment-702176681
@steveloughran It's hard to think of a terse warning for this. I think your comment above gets close. Maybe something like "The v2 commit algorithm assumes that the content of generated output files is consistent across all task attempts - if this is not true for this job, the v1 commit algorithm is strongly recommended."
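For context on the discussion above: which commit algorithm a MapReduce job uses is selected through the `mapreduce.fileoutputcommitter.algorithm.version` job property, so a job whose task attempts may produce differing output can pin v1 explicitly. A sketch of doing so in mapred-site.xml (verify the default value for your Hadoop release before relying on it):

```xml
<!-- mapred-site.xml: prefer the v1 commit algorithm when task-attempt
     output is not guaranteed identical across attempts. -->
<property>
  <name>mapreduce.fileoutputcommitter.algorithm.version</name>
  <value>1</value>
</property>
```

The trade-off is that v1 renames task output twice (task commit, then job commit) and so is slower on stores where rename is expensive, which is why v2 was attractive in the first place.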
[GitHub] [hadoop] steveloughran commented on pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#issuecomment-702161929
(Yetus failure is from no new tests)
[jira] [Work logged] (HADOOP-17265) ABFS: Support for Client Correlation ID
[ https://issues.apache.org/jira/browse/HADOOP-17265?focusedWorklogId=493507=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493507 ] ASF GitHub Bot logged work on HADOOP-17265:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 14:08
Start Date: 01/Oct/20 14:08
Worklog Time Spent: 10m
Work Description: snvijaya commented on a change in pull request #2344: URL: https://github.com/apache/hadoop/pull/2344#discussion_r498266654

## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java

@@ -0,0 +1,61 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.azurebfs.utils;
+
+import java.util.UUID;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN;
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID;
+import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH;
+
+public class TrackingContext {
+  private String clientCorrelationID;
+  private String clientRequestID;
+  private static final Logger LOG = LoggerFactory.getLogger(
+      org.apache.hadoop.fs.azurebfs.services.AbfsClient.class);
+
+  public TrackingContext(String clientCorrelationID) {
+    // validation
+    if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) ||
+        (!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) {
+      this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID;
+      LOG.debug("Invalid config provided; correlation id not included in header.");
+    }
+    else if (clientCorrelationID.length() > 0) {
+      this.clientCorrelationID = clientCorrelationID + ":";
+      LOG.debug("Client correlation id has been validated and set successfully.");

Review comment: Logs usually incur a perf cost, so log only for the failure cases. The success-case log can be omitted.
[jira] [Work logged] (HADOOP-17021) Add concat fs command
[ https://issues.apache.org/jira/browse/HADOOP-17021?focusedWorklogId=493506=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493506 ] ASF GitHub Bot logged work on HADOOP-17021:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 14:08
Start Date: 01/Oct/20 14:08
Worklog Time Spent: 10m
Work Description: steveloughran commented on pull request #1993: URL: https://github.com/apache/hadoop/pull/1993#issuecomment-702161263

OK, yetus is happy, it's just changed where it reports. I'm going to merge: what do you want to have in your "Contributed by:" credits as your full name? I think we need to stay with ASCII to avoid breaking things.

Worklog Id: (was: 493506) Time Spent: 4h 20m (was: 4h 10m)

> Add concat fs command
> -
>
> Key: HADOOP-17021
> URL: https://issues.apache.org/jira/browse/HADOOP-17021
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Jinglun
> Assignee: Jinglun
> Priority: Minor
> Labels: pull-request-available
> Attachments: HADOOP-17021.001.patch
>
> Time Spent: 4h 20m
> Remaining Estimate: 0h
>
> We should add a concat fs command for ease of use. It concatenates existing source files into the target file using FileSystem.concat().
[GitHub] [hadoop] snvijaya commented on a change in pull request #2344: HADOOP-17265. ABFS: Support for Client Correlation ID
snvijaya commented on a change in pull request #2344: URL: https://github.com/apache/hadoop/pull/2344#discussion_r498266654 ## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java ## @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.azurebfs.utils; + +import java.util.UUID; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH; + +public class TrackingContext { + private String clientCorrelationID; + private String clientRequestID; + private static final Logger LOG = LoggerFactory.getLogger( + org.apache.hadoop.fs.azurebfs.services.AbfsClient.class); + + public TrackingContext(String clientCorrelationID) { +//validation +if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) || +(!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) { + this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; + LOG.debug("Invalid config provided; correlation id not included in header."); +} +else if (clientCorrelationID.length() > 0) { + this.clientCorrelationID = clientCorrelationID + ":"; + LOG.debug("Client correlation id has been validated and set successfully."); Review comment: Log usually incur a perf cost so log for failure cases. Success case log can be omitted. ## File path: hadoop-tools/hadoop-azure/src/main/java/org/apache/hadoop/fs/azurebfs/utils/TrackingContext.java ## @@ -0,0 +1,61 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. 
You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.azurebfs.utils; + +import java.util.UUID; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.CLIENT_CORRELATION_ID_PATTERN; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; +import static org.apache.hadoop.fs.azurebfs.constants.FileSystemConfigurations.MAX_CLIENT_CORRELATION_ID_LENGTH; + +public class TrackingContext { + private String clientCorrelationID; + private String clientRequestID; + private static final Logger LOG = LoggerFactory.getLogger( + org.apache.hadoop.fs.azurebfs.services.AbfsClient.class); + + public TrackingContext(String clientCorrelationID) { +//validation +if ((clientCorrelationID.length() > MAX_CLIENT_CORRELATION_ID_LENGTH) || +(!clientCorrelationID.matches(CLIENT_CORRELATION_ID_PATTERN))) { + this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; + LOG.debug("Invalid config provided; correlation id not included in header."); +} +else if (clientCorrelationID.length() > 0) { + this.clientCorrelationID = clientCorrelationID + ":"; + LOG.debug("Client correlation id has been validated and set successfully."); +} +else { + this.clientCorrelationID = DEFAULT_FS_AZURE_CLIENT_CORRELATION_ID; + LOG.debug( Review comment: This config will be not set for most cases. For
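Per the review feedback above (logging has a per-request cost, so debug-log only the failure path), a trimmed-down version of the validation might look like the following. The length limit, pattern, and default value here are illustrative stand-ins; the real constants live in `FileSystemConfigurations` and may differ.

```java
public class CorrelationIdValidator {
    // Stand-in values; the real limits come from FileSystemConfigurations.
    static final int MAX_LEN = 72;
    static final String PATTERN = "[a-zA-Z0-9-]*";
    static final String DEFAULT_ID = "";

    // Returns the validated correlation id with a ":" separator appended,
    // or the default when the configured value is invalid or empty.
    static String validate(String id) {
        if (id.length() > MAX_LEN || !id.matches(PATTERN)) {
            // Only the failure path is logged; the success-case log is
            // omitted, per the review comment above.
            System.err.println(
                "Invalid config provided; correlation id not included in header.");
            return DEFAULT_ID;
        }
        return id.isEmpty() ? DEFAULT_ID : id + ":";
    }

    public static void main(String[] args) {
        System.out.println(validate("client-01"));          // valid id, separator appended
        System.out.println(validate("bad id with spaces")); // rejected, falls back to default
    }
}
```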
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493505=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493505 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 01/Oct/20 14:04 Start Date: 01/Oct/20 14:04 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-702159088 ok, yetus is running, it's just reporting isn't quite there...if you follow the link you see the results. Test failures in hdfs: unrelated. ASF licence warning: unrelated. Checkstyles are, sadly, related. ``` ./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:491: }:5: '}' at column 5 should be on the same line as the next part of a multi-block statement (one that directly contains multiple blocks: if/else-if/else, do/while or try/catch/finally). [RightCurly] ./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/CompressDecompressTester.java:492: else if (compressor.getClass().isAssignableFrom(SnappyCompressor.class)): 'if' construct must use '{}'s. [NeedBraces] ./hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/io/compress/snappy/TestSnappyCompressorDecompressor.java:356: int[] size = { 4 * 1024, 64 * 1024, 128 * 1024, 1024 * 1024 };:18: '{' is followed by whitespace. [NoWhitespaceAfter] ``` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493505) Time Spent: 22.5h (was: 22h 20m) > Using snappy-java in SnappyCodec > > > Key: HADOOP-17125 > URL: https://issues.apache.org/jira/browse/HADOOP-17125 > Project: Hadoop Common > Issue Type: New Feature > Components: common >Affects Versions: 3.3.0 >Reporter: DB Tsai >Priority: Major > Labels: pull-request-available > Time Spent: 22.5h > Remaining Estimate: 0h > > In Hadoop, we use native libs for snappy codec which has several > disadvantages: > * It requires native *libhadoop* and *libsnappy* to be installed in system > *LD_LIBRARY_PATH*, and they have to be installed separately on each node of > the clusters, container images, or local test environments which adds huge > complexities from deployment point of view. In some environments, it requires > compiling the natives from sources which is non-trivial. Also, this approach > is platform dependent; the binary may not work in different platform, so it > requires recompilation. > * It requires extra configuration of *java.library.path* to load the > natives, and it results higher application deployment and maintenance cost > for users. > Projects such as *Spark* and *Parquet* use > [snappy-java|[https://github.com/xerial/snappy-java]] which is JNI-based > implementation. It contains native binaries for Linux, Mac, and IBM in jar > file, and it can automatically load the native binaries into JVM from jar > without any setup. If a native implementation can not be found for a > platform, it can fallback to pure-java implementation of snappy based on > [aircompressor|[https://github.com/airlift/aircompressor/tree/master/src/main/java/io/airlift/compress/snappy]]. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
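The three checkstyle complaints quoted in the report above are mechanical style fixes. A compliant rewrite of the flagged patterns looks like this; the method and variable names are illustrative, not the actual test code from the patch:

```java
public class CheckstyleFixes {
    static String classify(Object value) {
        String kind;
        if (value instanceof String) {
            kind = "string";
        } else if (value instanceof Integer) { // RightCurly: "} else if" stays on one line
            kind = "integer";                  // NeedBraces: braces even for one statement
        } else {
            kind = "other";
        }
        return kind;
    }

    public static void main(String[] args) {
        // NoWhitespaceAfter: no space after the opening brace of the initializer
        int[] size = {4 * 1024, 64 * 1024, 128 * 1024, 1024 * 1024};
        System.out.println(classify("x") + " " + size.length);
    }
}
```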
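The appeal of snappy-java described in HADOOP-17125 is that the native binaries ship inside the jar (with a pure-Java fallback), so no `libhadoop`/`libsnappy` or `LD_LIBRARY_PATH` setup is needed. Assuming the `org.xerial.snappy` dependency is on the classpath, a round trip through its byte-array API is a couple of lines; this is a usage sketch, not the codec integration from the PR:

```java
import java.io.IOException;
import org.xerial.snappy.Snappy;

public class SnappyRoundTrip {
    public static void main(String[] args) throws IOException {
        byte[] input =
            "hello snappy-java, no libhadoop or LD_LIBRARY_PATH needed".getBytes();
        // Loads the bundled native library for the platform, or falls back
        // to the pure-Java implementation when none matches.
        byte[] compressed = Snappy.compress(input);
        byte[] restored = Snappy.uncompress(compressed);
        System.out.println(new String(restored));
    }
}
```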
[jira] [Work logged] (HADOOP-17125) Using snappy-java in SnappyCodec
[ https://issues.apache.org/jira/browse/HADOOP-17125?focusedWorklogId=493504=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493504 ] ASF GitHub Bot logged work on HADOOP-17125: --- Author: ASF GitHub Bot Created on: 01/Oct/20 14:03 Start Date: 01/Oct/20 14:03 Worklog Time Spent: 10m Work Description: hadoop-yetus removed a comment on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690909475 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 30s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 1s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 35s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 27m 4s | trunk passed | | +1 :green_heart: | compile | 21m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 19m 19s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 3m 2s | trunk passed | | +1 :green_heart: | mvnsite | 2m 7s | trunk passed | | +1 :green_heart: | shadedclient | 21m 6s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 13s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 2m 7s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 2m 12s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +0 :ok: | findbugs | 0m 38s | branch/hadoop-project no findbugs output file (findbugsXml.xml) | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 2s | the patch passed | | +1 :green_heart: | compile | 18m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | -1 :x: | cc | 18m 38s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 127 unchanged - 36 fixed = 163 total (was 163) | | +1 :green_heart: | golang | 18m 38s | the patch passed | | +1 :green_heart: | javac | 18m 38s | the patch passed | | +1 :green_heart: | compile | 16m 51s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | -1 :x: | cc | 16m 51s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 29 new + 134 unchanged - 29 fixed = 163 total (was 163) | | +1 :green_heart: | golang | 16m 51s | the patch passed | | +1 :green_heart: | javac | 16m 51s | the patch passed | | -0 :warning: | checkstyle | 2m 41s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) | | +1 :green_heart: | mvnsite | 2m 1s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 14m 25s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 12s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 2m 9s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | findbugs | 0m 32s | hadoop-project has no data from findbugs | ||| _ Other Tests _ | | +1 :green_heart: | unit | 0m 32s | hadoop-project in the patch passed. 
| | -1 :x: | unit | 9m 32s | hadoop-common in the patch passed. | | +1 :green_heart: | asflicense | 0m 55s | The patch does not generate ASF License warnings. | | | | 177m 51s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.io.compress.snappy.TestSnappyCompressorDecompressor | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2297 | | Optional Tests | dupname asflicense compile
[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #2297: HADOOP-17125. Using snappy-java in SnappyCodec
hadoop-yetus removed a comment on pull request #2297: URL: https://github.com/apache/hadoop/pull/2297#issuecomment-690909475 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 30s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | No case conflicting files found. | | +1 :green_heart: | @author | 0m 1s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 5 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 35s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 27m 4s | trunk passed | | +1 :green_heart: | compile | 21m 25s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 19m 19s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 3m 2s | trunk passed | | +1 :green_heart: | mvnsite | 2m 7s | trunk passed | | +1 :green_heart: | shadedclient | 21m 6s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 13s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 2m 7s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 2m 12s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +0 :ok: | findbugs | 0m 38s | branch/hadoop-project no findbugs output file (findbugsXml.xml) | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 2s | the patch passed | | +1 :green_heart: | compile | 18m 38s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | -1 :x: | cc | 18m 38s | root-jdkUbuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 generated 36 new + 127 unchanged - 36 fixed = 163 total (was 163) | | +1 :green_heart: | golang | 18m 38s | the patch passed | | +1 :green_heart: | javac | 18m 38s | the patch passed | | +1 :green_heart: | compile | 16m 51s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | -1 :x: | cc | 16m 51s | root-jdkPrivateBuild-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 generated 29 new + 134 unchanged - 29 fixed = 163 total (was 163) | | +1 :green_heart: | golang | 16m 51s | the patch passed | | +1 :green_heart: | javac | 16m 51s | the patch passed | | -0 :warning: | checkstyle | 2m 41s | root: The patch generated 1 new + 151 unchanged - 5 fixed = 152 total (was 156) | | +1 :green_heart: | mvnsite | 2m 1s | the patch passed | | +1 :green_heart: | whitespace | 0m 0s | The patch has no whitespace issues. | | +1 :green_heart: | xml | 0m 2s | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 14m 25s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 1m 12s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 2m 9s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | findbugs | 0m 32s | hadoop-project has no data from findbugs | ||| _ Other Tests _ | | +1 :green_heart: | unit | 0m 32s | hadoop-project in the patch passed. 
| | -1 :x: | unit | 9m 32s | hadoop-common in the patch passed. | | +1 :green_heart: | asflicense | 0m 55s | The patch does not generate ASF License warnings. | | | | 177m 51s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.io.compress.snappy.TestSnappyCompressorDecompressor | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2297/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2297 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient xml cc findbugs checkstyle golang | | uname | Linux c9432386914d 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 9960c01a25c | | Default Java | Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | Multi-JDK versions |
[jira] [Work logged] (HADOOP-17124) Support LZO using aircompressor
[ https://issues.apache.org/jira/browse/HADOOP-17124?focusedWorklogId=493503=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493503 ] ASF GitHub Bot logged work on HADOOP-17124: --- Author: ASF GitHub Bot Created on: 01/Oct/20 14:00 Start Date: 01/Oct/20 14:00 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #2159: URL: https://github.com/apache/hadoop/pull/2159#issuecomment-702155803 @dbtsai yes, lets do snappy first This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493503) Time Spent: 1h (was: 50m) > Support LZO using aircompressor > --- > > Key: HADOOP-17124 > URL: https://issues.apache.org/jira/browse/HADOOP-17124 > Project: Hadoop Common > Issue Type: New Feature > Components: common >Affects Versions: 3.3.0 >Reporter: DB Tsai >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > LZO codec was removed in HADOOP-4874 because the original LZO binding is GPL > which is problematic. However, many legacy data is still compressed by LZO > codec, and companies often use vendor's GPL LZO codec in the classpath which > might cause GPL contamination. > Presro and ORC-77 use [aircompressor| > [https://github.com/airlift/aircompressor]] (Apache V2 licensed) to compress > and decompress LZO data. Hadoop can add back LZO support using aircompressor > without GPL violation. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
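aircompressor's API is allocation-explicit: the caller sizes the output buffer via `maxCompressedLength` and gets back the number of bytes written. Assuming the `io.airlift:aircompressor` dependency is available, an LZO round trip looks roughly like the following sketch (buffer sizing here is simplified for illustration):

```java
import io.airlift.compress.lzo.LzoCompressor;
import io.airlift.compress.lzo.LzoDecompressor;

public class LzoRoundTrip {
    public static void main(String[] args) {
        byte[] input = "pure-Java LZO via aircompressor, no GPL native lib".getBytes();

        LzoCompressor compressor = new LzoCompressor();
        byte[] compressed = new byte[compressor.maxCompressedLength(input.length)];
        int compressedLen = compressor.compress(
                input, 0, input.length, compressed, 0, compressed.length);

        LzoDecompressor decompressor = new LzoDecompressor();
        byte[] restored = new byte[input.length];
        int restoredLen = decompressor.decompress(
                compressed, 0, compressedLen, restored, 0, restored.length);

        System.out.println(new String(restored, 0, restoredLen));
    }
}
```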
[jira] [Work logged] (HADOOP-13327) Add OutputStream + Syncable to the Filesystem Specification
[ https://issues.apache.org/jira/browse/HADOOP-13327?focusedWorklogId=493502=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493502 ] ASF GitHub Bot logged work on HADOOP-13327: --- Author: ASF GitHub Bot Created on: 01/Oct/20 13:58 Start Date: 01/Oct/20 13:58 Worklog Time Spent: 10m Work Description: hadoop-yetus removed a comment on pull request #2102: URL: https://github.com/apache/hadoop/pull/2102#issuecomment-696402075 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | +0 :ok: | reexec | 0m 31s | Docker mode activated. | ||| _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. | | +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. | | +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. | ||| _ trunk Compile Tests _ | | +0 :ok: | mvndep | 3m 32s | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 29m 9s | trunk passed | | +1 :green_heart: | compile | 23m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | compile | 19m 29s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | checkstyle | 2m 47s | trunk passed | | +1 :green_heart: | mvnsite | 4m 40s | trunk passed | | +1 :green_heart: | shadedclient | 23m 32s | branch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 2m 29s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 4m 5s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +0 :ok: | spotbugs | 0m 51s | Used deprecated FindBugs config; considering switching to SpotBugs. 
| | +1 :green_heart: | findbugs | 7m 43s | trunk passed | ||| _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 3m 1s | the patch passed | | +1 :green_heart: | compile | 21m 10s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javac | 21m 10s | the patch passed | | +1 :green_heart: | compile | 18m 10s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | javac | 18m 10s | the patch passed | | -0 :warning: | checkstyle | 2m 47s | root: The patch generated 3 new + 105 unchanged - 4 fixed = 108 total (was 109) | | +1 :green_heart: | mvnsite | 4m 57s | the patch passed | | -1 :x: | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply | | +1 :green_heart: | xml | 0m 6s | The patch has no ill-formed XML file. | | +1 :green_heart: | shadedclient | 15m 51s | patch has no errors when building and testing our client artifacts. | | +1 :green_heart: | javadoc | 2m 44s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 | | +1 :green_heart: | javadoc | 4m 21s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 | | +1 :green_heart: | findbugs | 8m 20s | the patch passed | ||| _ Other Tests _ | | -1 :x: | unit | 10m 1s | hadoop-common in the patch passed. | | -1 :x: | unit | 101m 4s | hadoop-hdfs in the patch passed. | | +1 :green_heart: | unit | 1m 58s | hadoop-azure in the patch passed. | | +1 :green_heart: | unit | 1m 19s | hadoop-azure-datalake in the patch passed. | | +1 :green_heart: | asflicense | 1m 5s | The patch does not generate ASF License warnings. 
| | | | 316m 36s | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.fs.contract.rawlocal.TestRawlocalContractCreate | | | hadoop.hdfs.TestFileChecksum | | | hadoop.hdfs.TestFileChecksumCompositeCrc | | | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks | | | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped | | | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics | | | hadoop.hdfs.TestSnapshotCommands | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2102/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/2102 | | Optional
[jira] [Work logged] (HADOOP-17021) Add concat fs command
[ https://issues.apache.org/jira/browse/HADOOP-17021?focusedWorklogId=493501=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493501 ] ASF GitHub Bot logged work on HADOOP-17021: --- Author: ASF GitHub Bot Created on: 01/Oct/20 13:58 Start Date: 01/Oct/20 13:58 Worklog Time Spent: 10m Work Description: steveloughran commented on pull request #1993: URL: https://github.com/apache/hadoop/pull/1993#issuecomment-702154756 there's been ongoing work with a yetus update this week and its been playing up. I've suggested some minor change, either do that or just a rebase and forced push to see if we can trigger it again. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493501) Time Spent: 4h 10m (was: 4h) > Add concat fs command > - > > Key: HADOOP-17021 > URL: https://issues.apache.org/jira/browse/HADOOP-17021 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Labels: pull-request-available > Attachments: HADOOP-17021.001.patch > > Time Spent: 4h 10m > Remaining Estimate: 0h > > We should add one concat fs command for ease of use. It concatenates existing > source files into the target file using FileSystem.concat(). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus removed a comment on pull request #2102: HADOOP-13327. Specify Output Stream and Syncable
hadoop-yetus removed a comment on pull request #2102: URL: https://github.com/apache/hadoop/pull/2102#issuecomment-696402075

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|--------:|:--------|
| +0 :ok: | reexec | 0m 31s | Docker mode activated. |
||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | No case conflicting files found. |
| +0 :ok: | markdownlint | 0m 0s | markdownlint was not available. |
| +1 :green_heart: | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | The patch appears to include 8 new or modified test files. |
||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 3m 32s | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 29m 9s | trunk passed |
| +1 :green_heart: | compile | 23m 14s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | compile | 19m 29s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | checkstyle | 2m 47s | trunk passed |
| +1 :green_heart: | mvnsite | 4m 40s | trunk passed |
| +1 :green_heart: | shadedclient | 23m 32s | branch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 29s | trunk passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 4m 5s | trunk passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +0 :ok: | spotbugs | 0m 51s | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 :green_heart: | findbugs | 7m 43s | trunk passed |
||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 28s | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 3m 1s | the patch passed |
| +1 :green_heart: | compile | 21m 10s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javac | 21m 10s | the patch passed |
| +1 :green_heart: | compile | 18m 10s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | javac | 18m 10s | the patch passed |
| -0 :warning: | checkstyle | 2m 47s | root: The patch generated 3 new + 105 unchanged - 4 fixed = 108 total (was 109) |
| +1 :green_heart: | mvnsite | 4m 57s | the patch passed |
| -1 :x: | whitespace | 0m 0s | The patch has 3 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 :green_heart: | xml | 0m 6s | The patch has no ill-formed XML file. |
| +1 :green_heart: | shadedclient | 15m 51s | patch has no errors when building and testing our client artifacts. |
| +1 :green_heart: | javadoc | 2m 44s | the patch passed with JDK Ubuntu-11.0.8+10-post-Ubuntu-0ubuntu118.04.1 |
| +1 :green_heart: | javadoc | 4m 21s | the patch passed with JDK Private Build-1.8.0_265-8u265-b01-0ubuntu2~18.04-b01 |
| +1 :green_heart: | findbugs | 8m 20s | the patch passed |
||| _ Other Tests _ |
| -1 :x: | unit | 10m 1s | hadoop-common in the patch passed. |
| -1 :x: | unit | 101m 4s | hadoop-hdfs in the patch passed. |
| +1 :green_heart: | unit | 1m 58s | hadoop-azure in the patch passed. |
| +1 :green_heart: | unit | 1m 19s | hadoop-azure-datalake in the patch passed. |
| +1 :green_heart: | asflicense | 1m 5s | The patch does not generate ASF License warnings. |
| | | | 316m 36s | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.fs.contract.rawlocal.TestRawlocalContractCreate |
| | hadoop.hdfs.TestFileChecksum |
| | hadoop.hdfs.TestFileChecksumCompositeCrc |
| | hadoop.hdfs.server.namenode.TestAddOverReplicatedStripedBlocks |
| | hadoop.hdfs.server.blockmanagement.TestBlockTokenWithDFSStriped |
| | hadoop.hdfs.server.namenode.TestNameNodeRetryCacheMetrics |
| | hadoop.hdfs.TestSnapshotCommands |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.40 ServerAPI=1.40 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2102/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/2102 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle markdownlint xml |
| uname | Linux 0d2edf1efca2 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/bin/hadoop.sh |
| git revision | trunk / 83c7c2b4c48 |
| Default Java | Private
[jira] [Work logged] (HADOOP-17021) Add concat fs command
[ https://issues.apache.org/jira/browse/HADOOP-17021?focusedWorklogId=493500&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493500 ]

ASF GitHub Bot logged work on HADOOP-17021:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 13:56
Start Date: 01/Oct/20 13:56
Worklog Time Spent: 10m

Work Description: steveloughran commented on a change in pull request #1993: URL: https://github.com/apache/hadoop/pull/1993#discussion_r498263927

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/shell/Concat.java ##

@@ -0,0 +1,91 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.hadoop.fs.shell;
+
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.util.LinkedList;
+
+import com.google.common.annotations.VisibleForTesting;
+
+import org.apache.hadoop.classification.InterfaceAudience;
+import org.apache.hadoop.classification.InterfaceStability;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.PathIOException;
+
+/**
+ * Concat the given files.
+ */
+@InterfaceAudience.Private
+@InterfaceStability.Unstable
+public class Concat extends FsCommand {
+  public static void registerCommands(CommandFactory factory) {
+    factory.addClass(Concat.class, "-concat");
+  }
+
+  public static final String NAME = "concat";
+  public static final String USAGE = "<target path> <src path> <src path> ...";
+  public static final String DESCRIPTION = "Concatenate existing source files"
+      + " into the target file. Target file and source files should be in the"
+      + " same directory.";
+  private static FileSystem testFs; // test only.
+
+  @Override
+  protected void processArguments(LinkedList<PathData> args)
+      throws IOException {
+    if (args.size() < 1) {
+      throw new IOException("Target path not specified. " + USAGE);
+    }
+    if (args.size() < 3) {
+      throw new IOException(
+          "The number of source paths is less than 2. " + USAGE);
+    }
+    PathData target = args.removeFirst();
+    LinkedList<PathData> srcList = args;
+    if (!target.exists || !target.stat.isFile()) {
+      throw new FileNotFoundException(String
+          .format("Target path %s does not exist or is" + " not file.",
+              target.path));
+    }
+    Path[] srcArray = new Path[srcList.size()];
+    for (int i = 0; i < args.size(); i++) {
+      PathData src = srcList.get(i);
+      if (!src.exists || !src.stat.isFile()) {
+        throw new FileNotFoundException(
+            String.format("%s does not exist or is not file.", src.path));
+      }
+      srcArray[i] = src.path;
+    }
+    FileSystem fs = target.fs;
+    if (testFs != null) {
+      fs = testFs;
+    }
+    try {
+      fs.concat(target.path, srcArray);
+    } catch (UnsupportedOperationException exception) {
+      throw new PathIOException("Dest filesystem '" + fs.getUri().getScheme()

Review comment: change this to the PathIOE which takes the target.path.toString as the first param. The command line tools aren't great for reporting failures - anything we can do to improve the reporting is worth trying

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493500) Time Spent: 4h (was: 3h 50m) > Add concat fs command > - > > Key: HADOOP-17021 > URL: https://issues.apache.org/jira/browse/HADOOP-17021 > Project: Hadoop Common > Issue Type: Improvement >Reporter: Jinglun >Assignee: Jinglun >Priority: Minor > Labels: pull-request-available > Attachments: HADOOP-17021.001.patch > > Time Spent: 4h > Remaining Estimate: 0h > > We should add one concat fs command for ease of use. It
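The Concat review above centres on how the shell command validates its arguments: the first entry is the target, and at least two source paths must follow. The stand-alone sketch below reproduces that check order with plain JDK strings; the class and method names are illustrative only, and Hadoop's real implementation works on `PathData` entries and throws `IOException` rather than `IllegalArgumentException`.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Hypothetical sketch of the -concat argument checks discussed above:
// the first argument is the target, and at least two sources must follow.
// Plain strings and IllegalArgumentException keep the sketch self-contained;
// the actual Hadoop command validates PathData and throws IOException.
public class ConcatArgs {
    public static String validate(LinkedList<String> args) {
        if (args.size() < 1) {
            throw new IllegalArgumentException("Target path not specified.");
        }
        if (args.size() < 3) {
            // one target plus at least two sources
            throw new IllegalArgumentException(
                "The number of source paths is less than 2.");
        }
        String target = args.removeFirst();   // target comes first
        List<String> sources = args;          // the remaining entries are sources
        return target + " <- " + String.join(",", sources);
    }

    public static void main(String[] unused) {
        System.out.println(validate(
            new LinkedList<>(Arrays.asList("out.bin", "part-0", "part-1"))));
    }
}
```

Note the order of checks mirrors the patch: missing target is reported before an insufficient source count, so the error message always names the first thing the user got wrong.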
[GitHub] [hadoop] steveloughran commented on pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#issuecomment-702137474

@jbrennan333 what do you think we should say instead of deprecated? "not recommended"? I was thinking of adding a link to the JIRA and changing the issue text there to clarify:

* safe if names and content of generated output files are consistent across all task attempts
* unsafe if different TAs generate bad files (biggest risk, as partial failure of 1st attempt may leave)
* unsafe if different TAs generate different content in the same files (only an issue on a network partition, and TA #1 generates output as/after TA #2 does its work)

Cleanup of the job will delete the whole job attempt dir, so that's the maximum time that a partitioned TA may commit work. There's no risk of some VM pausing for 3 hours, restarting, and an in-progress TA completing its work and overwriting the final output. This is good.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on a change in pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on a change in pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#discussion_r498242126

## File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/java/org/apache/hadoop/mapreduce/lib/output/FileOutputCommitter.java ##

@@ -348,6 +348,16 @@ public Path getWorkPath() throws IOException {
    * @param context the job's context
    */
   public void setupJob(JobContext context) throws IOException {
+    // Downgrade v2 to v1 with a warning.
+    if (algorithmVersion == 2) {
+      Logger log = LoggerFactory.getLogger(
+          "org.apache.hadoop.mapreduce.lib.output."
+          + "FileOutputCommitter.Algorithm");
+
+      log.warn("The v2 commit algorithm is deprecated;"
+          + " please switch to the v1 algorithm");

Review comment: what do you suggest?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on a change in pull request #2349: MAPREDUCE-7282. Move away from V2 commit algorithm
steveloughran commented on a change in pull request #2349: URL: https://github.com/apache/hadoop/pull/2349#discussion_r498242247

## File path: hadoop-mapreduce-project/hadoop-mapreduce-client/hadoop-mapreduce-client-core/src/main/resources/mapred-default.xml ##

@@ -1562,10 +1562,35 @@
   <name>mapreduce.fileoutputcommitter.algorithm.version</name>
-  <value>2</value>
-  <description>The file output committer algorithm version
-  valid algorithm version number: 1 or 2
-  default to 2, which is the original algorithm</description>
+  <value>1</value>
+  <description>The file output committer algorithm version.
+
+  There are two algorithm versions in Hadoop, "1" and "2".
+
+  The version 2 algorithm is deprecated and no longer the default
+  as task commits were not atomic.
+  If a first task attempt fails part-way
+  through its task commit, the output directory could end up
+  with data from that failed commit, alongside the data
+  from any subsequent attempts.
+
+  See https://issues.apache.org/jira/browse/MAPREDUCE-7282
+
+  Although no longer the default, this algorithm is safe to use if
+  all task attempts for a single task meet the following requirements:
+  - they generate exactly the same set of files
+  - the contents of each file are exactly the same in each task attempt
+
+  That is:
+  1. If a second attempt commits work, there will be no leftover files from
+  a first attempt which failed during its task commit.
+  2. If a network partition causes the first task attempt to overwrite
+  some/all of the output of a second attempt, the result will be
+  exactly the same as if it had not done so.
+
+  To avoid the warning message on job setup, set the log level of the log
+  org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.Algorithm
+  to ERROR.</description>

Review comment: ok

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
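The description text above tells users to raise the log level of `org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.Algorithm` to ERROR to silence the setup-time warning. With a stock log4j.properties, that would look something like the following sketch; the exact configuration file and location depend on the deployment.

```properties
# Suppress the v2-commit-algorithm deprecation warning emitted from
# FileOutputCommitter.setupJob() by raising this logger to ERROR.
log4j.logger.org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.Algorithm=ERROR
```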
[GitHub] [hadoop] bshashikant opened a new pull request #2355: HDFS-15611. Add list Snapshot command in WebHDFS.
bshashikant opened a new pull request #2355: URL: https://github.com/apache/hadoop/pull/2355 please check https://issues.apache.org/jira/browse/HDFS-15611 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493458&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493458 ]

ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 12:04
Start Date: 01/Oct/20 12:04
Worklog Time Spent: 10m

Work Description: aajisaka commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498191019

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java ##

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.

Review comment:
> , etc.

The illegal separators are only `\t`, `\n`, and `=`. "etc" is not needed.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 493458)
Time Spent: 7.5h (was: 7h 20m)

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Hui Fei
> Assignee: Hui Fei
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7.5h
> Remaining Estimate: 0h
>
> Now context is string. We need to extend the CallerContext because context
> may contain many items.
> Items include
> * router ip
> * MR or CLI
> * etc

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-17276) Extend CallerContext to make it include many items
[ https://issues.apache.org/jira/browse/HADOOP-17276?focusedWorklogId=493456&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493456 ]

ASF GitHub Bot logged work on HADOOP-17276:
---
Author: ASF GitHub Bot
Created on: 01/Oct/20 12:02
Start Date: 01/Oct/20 12:02
Worklog Time Spent: 10m

Work Description: aajisaka commented on a change in pull request #2327: URL: https://github.com/apache/hadoop/pull/2327#discussion_r498189901

## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/CallerContext.java ##

@@ -114,6 +115,12 @@ public String toString() {
   /** The caller context builder. */
   public static final class Builder {
     private static final String KEY_VALUE_SEPARATOR = ":";
+    /**
+     * The illegal separators include '\t', '\n', '=', etc.
+     * User should not set illegal separator.
+     */
+    private static final List<String> ILLEGAL_SEPARATORS =
+        Arrays.asList("\t", "\n", "=");

Review comment: We should use `Collections.unmodifiableList` to provide an unmodifiable view.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
---
Worklog Id: (was: 493456)
Time Spent: 7h 20m (was: 7h 10m)

> Extend CallerContext to make it include many items
> --
>
> Key: HADOOP-17276
> URL: https://issues.apache.org/jira/browse/HADOOP-17276
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Hui Fei
> Assignee: Hui Fei
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h 20m
> Remaining Estimate: 0h
>
> Now context is string. We need to extend the CallerContext because context
> may contain many items.
> Items include > * router ip > * MR or CLI > * etc -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
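The review thread above asks for `ILLEGAL_SEPARATORS` to be wrapped in `Collections.unmodifiableList` so callers only ever get a read-only view. A minimal, self-contained sketch of that pattern follows; the class and method names are illustrative, not the actual CallerContext code.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Sketch of the reviewer's suggestion: expose the illegal-separator list
// through Collections.unmodifiableList so it cannot be mutated by callers.
// SeparatorCheck and isLegalSeparator are hypothetical names.
public class SeparatorCheck {
    private static final List<String> ILLEGAL_SEPARATORS =
        Collections.unmodifiableList(Arrays.asList("\t", "\n", "="));

    public static boolean isLegalSeparator(String separator) {
        return !ILLEGAL_SEPARATORS.contains(separator);
    }

    // Expose the view so the read-only behaviour can be demonstrated:
    // any add/remove on it throws UnsupportedOperationException.
    public static List<String> illegalSeparators() {
        return ILLEGAL_SEPARATORS;
    }
}
```

`Collections.unmodifiableList` wraps the backing list rather than copying it, so the guard costs nothing per lookup while still preventing accidental mutation through the published reference.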
[GitHub] [hadoop] aajisaka closed pull request #2346: [DO NOT MERGE] Avoid YETUS-994 for testing
aajisaka closed pull request #2346: URL: https://github.com/apache/hadoop/pull/2346 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] aajisaka commented on pull request #2346: [DO NOT MERGE] Avoid YETUS-994 for testing
aajisaka commented on pull request #2346: URL: https://github.com/apache/hadoop/pull/2346#issuecomment-702078340 The new token worked as expected, so we don't need to revert YETUS-994 in our setting. Closing. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-16830) Add public IOStatistics API
[ https://issues.apache.org/jira/browse/HADOOP-16830?focusedWorklogId=493435=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493435 ] ASF GitHub Bot logged work on HADOOP-16830: --- Author: ASF GitHub Bot Created on: 01/Oct/20 11:31 Start Date: 01/Oct/20 11:31 Worklog Time Spent: 10m Work Description: mehakmeet commented on pull request #2323: URL: https://github.com/apache/hadoop/pull/2323#issuecomment-702072245 In IOStatisticsBinding class we have methods for tracking duration but, I am not able to wrap it around a normal function. There are 3 methods for tracking durations which are for Callable, CallableRaisingIOE, and FunctionRaisingIOE. We should add 1 more for a normal function too. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493435) Time Spent: 6.5h (was: 6h 20m) > Add public IOStatistics API > --- > > Key: HADOOP-16830 > URL: https://issues.apache.org/jira/browse/HADOOP-16830 > Project: Hadoop Common > Issue Type: New Feature > Components: fs, fs/s3 >Affects Versions: 3.3.0 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Time Spent: 6.5h > Remaining Estimate: 0h > > Applications like to collect the statistics which specific operations take, > by collecting exactly those operations done during the execution of FS API > calls by their individual worker threads, and returning these to their job > driver > * S3A has a statistics API for some streams, but it's a non-standard one; > Impala can't use it > * FileSystem storage statistics are public, but as they aren't cross-thread, > they don't aggregate properly > Proposed > # A new IOStatistics interface to serve up statistics > # S3A to implement > # other stores to follow > # 
Pass-through from the usual wrapper classes (FS data input/output streams) > It's hard to think about how best to offer an API for operation context > stats, and how to actually implement. > ThreadLocal isn't enough because the helper threads need to update on the > thread local value of the instigator > My Initial PoC doesn't address that issue, but it shows what I'm thinking of -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
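The comment above notes that IOStatisticsBinding has duration-tracking helpers for Callable, CallableRaisingIOE, and FunctionRaisingIOE, but nothing for a plain `java.util.function.Function`. A hedged sketch of what such a fourth variant could look like, with a `LongConsumer` standing in for the real statistics sink — the names here are illustrative and the actual IOStatisticsBinding API may differ.

```java
import java.util.function.Function;
import java.util.function.LongConsumer;

// Illustrative only: wrap a plain Function so every apply() reports its
// elapsed time (nanoseconds) to a recorder, even when the inner function
// throws. This mirrors the shape of a duration-tracking decorator; it is
// not the actual IOStatisticsBinding implementation.
public class DurationTracking {
    public static <A, B> Function<A, B> trackDuration(
            LongConsumer recorder, Function<A, B> inner) {
        return input -> {
            long start = System.nanoTime();
            try {
                return inner.apply(input);
            } finally {
                // finally ensures the duration is recorded on failure too
                recorder.accept(System.nanoTime() - start);
            }
        };
    }
}
```

Because the wrapper returns another `Function`, it composes with `andThen`/`compose` like any other function, which is what makes it convenient to thread through existing call chains.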
[jira] [Work logged] (HADOOP-17272) ABFS Streams to support IOStatistics API
[ https://issues.apache.org/jira/browse/HADOOP-17272?focusedWorklogId=493363=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493363 ] ASF GitHub Bot logged work on HADOOP-17272: --- Author: ASF GitHub Bot Created on: 01/Oct/20 09:29 Start Date: 01/Oct/20 09:29 Worklog Time Spent: 10m Work Description: mehakmeet commented on pull request #2353: URL: https://github.com/apache/hadoop/pull/2353#issuecomment-702011782 Have to force push since am rebasing Steve's branch on my commits. Also, Don't know why Yetus isn't running. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org Issue Time Tracking --- Worklog Id: (was: 493363) Time Spent: 40m (was: 0.5h) > ABFS Streams to support IOStatistics API > - > > Key: HADOOP-17272 > URL: https://issues.apache.org/jira/browse/HADOOP-17272 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.3.1 >Reporter: Steve Loughran >Assignee: Mehakmeet Singh >Priority: Major > Labels: pull-request-available > Time Spent: 40m > Remaining Estimate: 0h > > ABFS input/output streams to support IOStatistics API -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205340#comment-17205340 ]

Ayush Saxena edited comment on HADOOP-17288 at 10/1/20, 7:49 AM:
-

By transitive dependencies I mean the dependencies which hadoop pulls in and depends on; {{mockserver-core}} is one. If you look at {{hadoop-project/pom.xml}}, whichever dependency has excluded {{Guava}} requires guava.

Ideally, if the downstream still has to upgrade guava, then this patch has no meaning; the basic requirement is that the downstream should not need to upgrade guava. If a transitive dependency, say {{x}}, is a dependency of {{hadoop-common}} and is compatible with the guava version of the downstream project, they can exclude it, and I think things should work fine?

If there are dependencies with a higher version of Guava (need to analyse) which aren't compatible with the guava version of downstream projects, then we might need to shade them as well. Maybe {{curator}} can be one of those.

Let me know if you have any idea/suggestion or different approach which may make things better.

> Use shaded guava from thirdparty
> --
>
> Key: HADOOP-17288
> URL: https://issues.apache.org/jira/browse/HADOOP-17288
> Project: Hadoop Common
> Issue Type: Sub-task
> Reporter: Ayush Saxena
> Assignee: Ayush Saxena
> Priority: Major
> Labels: pull-request-available
> Fix For: 3.4.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Use the shaded version of guava in hadoop-thirdparty

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
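The exclusion mechanism discussed above — a downstream project dropping the Guava that a Hadoop artifact pulls in transitively so its own Guava wins — looks roughly like this in a downstream pom.xml. This is a sketch only: the artifact and version shown are placeholders, not a recommendation from the thread.

```xml
<!-- Illustrative only: a downstream project excluding the Guava that a
     Hadoop artifact would otherwise pull in transitively, so the project's
     own Guava version is the one on the classpath. -->
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>3.4.0</version>
  <exclusions>
    <exclusion>
      <groupId>com.google.guava</groupId>
      <artifactId>guava</artifactId>
    </exclusion>
  </exclusions>
</dependency>
```

As the comment notes, exclusion is only safe when the downstream Guava version is compatible with what the excluded dependency actually needs; otherwise shading (as HADOOP-17288 does via hadoop-thirdparty) is the robust option.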
[jira] [Commented] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17205340#comment-17205340 ] Ayush Saxena commented on HADOOP-17288: --- Transitive dependency as the dependencies which hadoop pulls up, on which hadoop depends. Like {{mockserer-core}} is one. If you see the {{hadoop-project/pom.xml}}, whichever dependency has excluded {{Guava}}, brings up guava in hadoop. Ideally if the downstream has to upgrade guava, then this patch has no meaning. The basic requirement itself is that the downstream should not need guava. If the transitive dependency, say {{x}} is a dependency of {{hadoop-common}}, which is compatible with the guava version of the downstream project they can exclude it, and I think things should work fine? If there are dependencies with higher version of Guava, (need to analyse), which aren't compatible with guava version of downstream project. Then we might need to shade them as well? May be {{curator}} can be one of those. Let me know, if you have any idea/suggestion or different approach which may make things better > Use shaded guava from thirdparty > > > Key: HADOOP-17288 > URL: https://issues.apache.org/jira/browse/HADOOP-17288 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Use the shaded version of guava in hadoop-thirdparty -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205340#comment-17205340 ]

Ayush Saxena edited comment on HADOOP-17288 at 10/1/20, 7:47 AM:
-----------------------------------------------------------------

By "transitive dependency" I mean the dependencies which hadoop pulls in and on which hadoop depends; {{mockserver-core}} is one such. If you look at {{hadoop-project/pom.xml}}, whichever dependency has had {{Guava}} excluded still requires guava. Ideally, if the downstream still has to upgrade guava, then this patch has no meaning; the basic requirement is that the downstream should not need guava.

If a transitive dependency, say {{x}}, of {{hadoop-common}} is compatible with the downstream project's guava version, they can exclude it, and I think things should work fine. If there are dependencies with a higher version of Guava (need to analyse) which aren't compatible with the downstream project's guava version, then we might need to shade them as well; maybe {{curator}} is one of those.
[jira] [Updated] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael Stack updated HADOOP-17288:
-----------------------------------
    Fix Version/s: 3.4.0
[jira] [Commented] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205329#comment-17205329 ]

Michael Stack commented on HADOOP-17288:
----------------------------------------

{quote}but guava as of now would be still packaged as it is part of several transitive dependencies.
{quote}
Can you say more on the above? Is guava transitively included because Hadoop's own dependencies pull it in, or are you talking about downstreamers that expect Hadoop to provide guava to them (transitively)?

I'm wondering about the downstreamers whose apps use guava 11 because that's what hadoop used until 3.3.0/3.2.1. They want to upgrade to 3.4. They'll have to do the work to upgrade to guava 27 because that is what 3.2.1/3.3.0 have, even though you've done all this work here. Seems a shame?

(I set fix version as 3.4.0 – thanks).
[jira] [Commented] (HADOOP-17288) Use shaded guava from thirdparty
[ https://issues.apache.org/jira/browse/HADOOP-17288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17205309#comment-17205309 ]

Ayush Saxena commented on HADOOP-17288:
---------------------------------------

[~stack] yes, the intention is exactly that. The hadoop jars would be using the shaded guava from the thirdparty, but guava as of now would still be packaged, as it is part of several transitive dependencies. So we need to see whether we can go ahead like this, or try shading the ones which bring in guava (there are many).

Yes, I am targeting this for 3.4.0 as of now.
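As a rough sketch of the exclusion approach Ayush suggests above (a downstream project dropping hadoop's transitive guava and supplying its own), a downstream pom.xml could look something like the following. The version numbers and artifact choices here are illustrative assumptions, not taken from the thread:

```xml
<dependencies>
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.4.0</version>
    <exclusions>
      <!-- Drop hadoop's own transitive guava. The shaded copy that hadoop
           uses internally (relocated under org.apache.hadoop.thirdparty)
           is bundled separately and is unaffected by this exclusion. -->
      <exclusion>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!-- The downstream project's own guava version of choice. -->
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>27.0-jre</version>
  </dependency>
</dependencies>
```

This only works cleanly if no other transitive dependency of hadoop is incompatible with the guava version the downstream picks, which is exactly the open question in the discussion above.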
[jira] [Updated] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-17281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated HADOOP-17281:
------------------------------------
    Labels: pull-request-available  (was: )

> Implement FileSystem.listStatusIterator() in S3AFileSystem
> ----------------------------------------------------------
>
>                 Key: HADOOP-17281
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17281
>             Project: Hadoop Common
>          Issue Type: Task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Mukund Thakur
>            Assignee: Mukund Thakur
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Currently S3AFileSystem only implements the listStatus() api, which returns an array. Once we implement listStatusIterator(), clients can benefit from the async listing done recently in https://issues.apache.org/jira/browse/HADOOP-17074 by performing some tasks on files while iterating over them.
>
> CC [~stevel]
[jira] [Work logged] (HADOOP-17281) Implement FileSystem.listStatusIterator() in S3AFileSystem
[ https://issues.apache.org/jira/browse/HADOOP-17281?focusedWorklogId=493316&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-493316 ]

ASF GitHub Bot logged work on HADOOP-17281:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 01/Oct/20 06:56
Start Date: 01/Oct/20 06:56
Worklog Time Spent: 10m

Work Description: mukund-thakur opened a new pull request #2354:
URL: https://github.com/apache/hadoop/pull/2354

Ran the new test using an ap-south-1 bucket. Output:

    (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listFiles() api with batch size of 10 including 10ms of processing time for each file: 12,223,848,028 nS
    2020-10-01 12:19:28,811 [JUnit-testMultiPagesListingPerformanceAndCorrectness] INFO contract.ContractTestUtils (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listStatus() api with batch size of 10 including 10ms of processing time for each file: 15,988,037,357 nS
    2020-10-01 12:19:41,050 [JUnit-testMultiPagesListingPerformanceAndCorrectness] INFO contract.ContractTestUtils (ContractTestUtils.java:end(1847)) - Duration of listing 1000 files using listStatusIterator() api with batch size of 10 including 10ms of processing time for each file: 12,214,813,052 nS

From the logs we can see that the time taken using listStatusIterator() matches listFiles() and is less than listStatus().

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at: us...@infra.apache.org

Issue Time Tracking
-------------------
    Worklog Id: (was: 493316)
    Remaining Estimate: 0h
    Time Spent: 10m
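The PR above adds listStatusIterator() for S3A, which pulls listing pages incrementally instead of materializing the whole listing as a listStatus()-style array. The paging-iterator idea behind it can be sketched in plain, self-contained Java; note this is an illustrative sketch, not Hadoop code, and all class and method names below (PagedListingSketch, PageSource, listAll) are made up for the example:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

/**
 * Sketch of a paged listing iterator in the spirit of
 * FileSystem.listStatusIterator(): pages are fetched lazily, so the
 * caller can start processing entries from the first page before the
 * later pages have been requested.
 */
public class PagedListingSketch {

  /** Stand-in for one remote "list objects" call returning a single page. */
  interface PageSource {
    List<String> fetchPage(int pageIndex, int pageSize);
  }

  /** Iterator that pulls pages from the source only as they are needed. */
  static class PagedIterator implements Iterator<String> {
    private final PageSource source;
    private final int pageSize;
    private int pageIndex = 0;
    private List<String> page = new ArrayList<>();
    private int posInPage = 0;
    private boolean exhausted = false;

    PagedIterator(PageSource source, int pageSize) {
      this.source = source;
      this.pageSize = pageSize;
    }

    @Override
    public boolean hasNext() {
      // Fetch the next page only when the current one is used up.
      while (posInPage >= page.size() && !exhausted) {
        page = source.fetchPage(pageIndex++, pageSize);
        posInPage = 0;
        if (page.size() < pageSize) {
          exhausted = true; // a short page marks the end of the listing
        }
      }
      return posInPage < page.size();
    }

    @Override
    public String next() {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      return page.get(posInPage++);
    }
  }

  /** Lists {@code total} fake file names through the paged iterator. */
  static List<String> listAll(int total, int pageSize) {
    PageSource source = (pageIndex, size) -> {
      List<String> p = new ArrayList<>();
      int start = pageIndex * size;
      for (int i = start; i < Math.min(start + size, total); i++) {
        p.add("file-" + i);
      }
      return p;
    };
    List<String> out = new ArrayList<>();
    PagedIterator it = new PagedIterator(source, pageSize);
    while (it.hasNext()) {
      out.add(it.next());
    }
    return out;
  }

  public static void main(String[] args) {
    List<String> all = listAll(25, 10); // 25 entries fetched in pages of 10
    System.out.println(all.size() + " entries, first=" + all.get(0)
        + ", last=" + all.get(all.size() - 1));
  }
}
```

In the real S3A implementation the next page can additionally be fetched asynchronously while the caller processes the current one, which is where the overlap measured in the benchmark numbers above comes from.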