[jira] [Commented] (HADOOP-16531) Log more detail for slow RPC
[ https://issues.apache.org/jira/browse/HADOOP-16531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924716#comment-16924716 ]

Chen Zhang commented on HADOOP-16531:
-------------------------------------

Thanks [~xkrogen] for the review and the commit.

> Log more detail for slow RPC
> ----------------------------
>
>                 Key: HADOOP-16531
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16531
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Chen Zhang
>            Assignee: Chen Zhang
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-16531.001.patch
>
> The current implementation only logs the processing time:
> {code:java}
> if ((rpcMetrics.getProcessingSampleCount() > minSampleSize) &&
>     (processingTime > threeSigma)) {
>   LOG.warn("Slow RPC : {} took {} {} to process from client {}",
>       methodName, processingTime, RpcMetrics.TIMEUNIT, call);
>   rpcMetrics.incrSlowRpc();
> }
> {code}
> We need to log more details to help us locate the problem (e.g. how long it
> takes to request a lock, hold a lock, or do other things).

--
This message was sent by Atlassian Jira
(v8.3.2#803003)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
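The issue description asks for a per-phase breakdown of slow-RPC time. As a rough, self-contained sketch of what such a log line could carry: the phase names (lockWait, lockHold, response) and the format below are illustrative assumptions, not the actual HADOOP-16531 patch.

```java
// Hypothetical sketch only: phase names and message format are illustrative,
// not taken from the HADOOP-16531 patch.
public class SlowRpcDetail {

  /** Formats a per-phase breakdown for a slow-RPC warning. */
  static String detail(String method, long lockWaitMs, long lockHoldMs,
      long responseMs) {
    long total = lockWaitMs + lockHoldMs + responseMs;
    return String.format(
        "Slow RPC: %s took %d ms (lockWait=%d ms, lockHold=%d ms, response=%d ms)",
        method, total, lockWaitMs, lockHoldMs, responseMs);
  }

  public static void main(String[] args) {
    // A call that spent most of its time waiting for a lock:
    System.out.println(detail("getBlockLocations", 120, 30, 5));
  }
}
```

A breakdown like this makes it immediately visible whether the time went to lock contention rather than to the actual request processing.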
[GitHub] [hadoop] xiaoyuyao commented on issue #1361: HDDS-1553. Add metrics in rack aware container placement policy.
xiaoyuyao commented on issue #1361: HDDS-1553. Add metrics in rack aware container placement policy. URL: https://github.com/apache/hadoop/pull/1361#issuecomment-529052232 Thanks @ChenSammi for the contribution. +1 for the latest change, I merged the change to trunk. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] xiaoyuyao merged pull request #1361: HDDS-1553. Add metrics in rack aware container placement policy.
xiaoyuyao merged pull request #1361: HDDS-1553. Add metrics in rack aware container placement policy.
URL: https://github.com/apache/hadoop/pull/1361
[GitHub] [hadoop] hadoop-yetus commented on issue #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
hadoop-yetus commented on issue #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
URL: https://github.com/apache/hadoop/pull/1411#issuecomment-529039404

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| 0 | reexec | 71 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| 0 | shelldocs | 0 | Shelldocs was not available. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| | | | _ trunk Compile Tests _ |
| +1 | mvninstall | 605 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 880 | branch has no errors when building and testing our client artifacts. |
| | | | _ Patch Compile Tests _ |
| +1 | mvninstall | 565 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| +1 | shellcheck | 32 | There were no new shellcheck issues. |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 768 | patch has no errors when building and testing our client artifacts. |
| | | | _ Other Tests _ |
| +1 | unit | 108 | hadoop-hdds in the patch passed. |
| +1 | unit | 290 | hadoop-ozone in the patch passed. |
| +1 | asflicense | 44 | The patch does not generate ASF License warnings. |
| | | 3560 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1411/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1411 |
| Optional Tests | dupname asflicense mvnsite unit shellcheck shelldocs |
| uname | Linux 075bdf979e4e 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / bb0b922 |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1411/2/testReport/ |
| Max. process+thread count | 310 (vs. ulimit of 5500) |
| modules | C: hadoop-ozone/common U: hadoop-ozone/common |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1411/2/console |
| versions | git=2.7.4 maven=3.3.9 shellcheck=0.4.6 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] vivekratnavel commented on issue #1409: HDDS-2087. Remove the hard coded config key in ChunkManager
vivekratnavel commented on issue #1409: HDDS-2087. Remove the hard coded config key in ChunkManager
URL: https://github.com/apache/hadoop/pull/1409#issuecomment-529034433

The unit and integration test failures are not related to the patch. @bharatviswa504 @anuengineer Thanks for your reviews!
[GitHub] [hadoop] hadoop-yetus commented on issue #1409: HDDS-2087. Remove the hard coded config key in ChunkManager
hadoop-yetus commented on issue #1409: HDDS-2087. Remove the hard coded config key in ChunkManager
URL: https://github.com/apache/hadoop/pull/1409#issuecomment-529031436

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| 0 | reexec | 44 | Docker mode activated. |
| | | | _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
| | | | _ trunk Compile Tests _ |
| 0 | mvndep | 25 | Maven dependency ordering for branch |
| +1 | mvninstall | 631 | trunk passed |
| +1 | compile | 395 | trunk passed |
| +1 | checkstyle | 82 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 874 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 182 | trunk passed |
| 0 | spotbugs | 468 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 681 | trunk passed |
| | | | _ Patch Compile Tests _ |
| 0 | mvndep | 36 | Maven dependency ordering for patch |
| +1 | mvninstall | 564 | the patch passed |
| +1 | compile | 390 | the patch passed |
| +1 | javac | 390 | the patch passed |
| +1 | checkstyle | 83 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | xml | 2 | The patch has no ill-formed XML file. |
| +1 | shadedclient | 707 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 188 | the patch passed |
| +1 | findbugs | 663 | the patch passed |
| | | | _ Other Tests _ |
| +1 | unit | 287 | hadoop-hdds in the patch passed. |
| -1 | unit | 186 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 43 | The patch does not generate ASF License warnings. |
| | | 6246 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1409/3/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1409 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 7260e9bf8f51 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / b15c116 |
| Default Java | 1.8.0_222 |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1409/3/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1409/3/testReport/ |
| Max. process+thread count | 1167 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-test U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1409/3/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |
[GitHub] [hadoop] smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
URL: https://github.com/apache/hadoop/pull/1360#discussion_r321927084

File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java

@@ -137,12 +137,22 @@
   private Text dtService;
   private final boolean topologyAwareReadEnabled;

+  /**
+   * Creates RpcClient instance with the given configuration.
+   * @param conf Configuration
+   * @throws IOException
+   */
+  public RpcClient(Configuration conf) throws IOException {

Review comment: Sure, let's go with removing this old constructor then.
[GitHub] [hadoop] bharatviswa504 commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
bharatviswa504 commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
URL: https://github.com/apache/hadoop/pull/1360#discussion_r321926066

File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/OzoneManagerProtocolClientSideTranslatorPB.java

@@ -214,6 +216,11 @@
     this.clientID = clientId;
   }

+  public OzoneManagerProtocolClientSideTranslatorPB(OzoneConfiguration conf,

Review comment: Yes, if possible.
[GitHub] [hadoop] bharatviswa504 commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
bharatviswa504 commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
URL: https://github.com/apache/hadoop/pull/1360#discussion_r321925879

File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java

@@ -137,12 +137,22 @@
   private Text dtService;
   private final boolean topologyAwareReadEnabled;

+  /**
+   * Creates RpcClient instance with the given configuration.
+   * @param conf Configuration
+   * @throws IOException
+   */
+  public RpcClient(Configuration conf) throws IOException {

Review comment: My comment was to add a @VisibleForTesting annotation, but I think this constructor can also be removed completely, as it is used only for testing, and callers can use the new constructor instead.
[GitHub] [hadoop] HeartSaVioR commented on issue #1413: [BRANCH-2] HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is c
HeartSaVioR commented on issue #1413: [BRANCH-2] HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
URL: https://github.com/apache/hadoop/pull/1413#issuecomment-529028474

cc @steveloughran
HadoopTestBase doesn't seem to be available in branch-2, so I removed it. Other than that, it was a clean cherry-pick. Let's see the build result.
[GitHub] [hadoop] HeartSaVioR opened a new pull request #1413: [BRANCH-2] HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, option
HeartSaVioR opened a new pull request #1413: [BRANCH-2] HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
URL: https://github.com/apache/hadoop/pull/1413

Please refer to https://issues.apache.org/jira/browse/HADOOP-16255 for more details. FYI, FileContext.rename(path, path, options) leaks the crc file for the source of the rename when ChecksumFs or one of its descendants is used as the underlying filesystem. https://issues.apache.org/jira/browse/SPARK-28025 took a workaround by removing the crc file manually, and we hope to get rid of that workaround eventually. This PR is a ported version of #1388 for branch-2.
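The leak described above follows from ChecksumFs's layout: each data file has a hidden checksum sibling, so a rename that moves only the data file strands the old crc. A minimal sketch of the sibling-path convention, as a simplified stand-in using java.nio paths rather than the actual ChecksumFs code:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Simplified stand-in for ChecksumFs's naming convention, using java.nio
// paths instead of Hadoop's Path: the checksum for "f" lives at ".f.crc".
public class CrcSibling {

  /** Hidden checksum sibling for a data file (assumes the file has a parent). */
  static Path checksumFile(Path file) {
    return file.getParent().resolve("." + file.getFileName() + ".crc");
  }

  public static void main(String[] args) {
    // Renaming dir/part-0000 without also renaming this sibling leaves
    // the old dir/.part-0000.crc behind, which is the leak the PR fixes.
    System.out.println(checksumFile(Paths.get("dir/part-0000")));
  }
}
```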
[GitHub] [hadoop] HeartSaVioR commented on issue #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
HeartSaVioR commented on issue #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
URL: https://github.com/apache/hadoop/pull/1388#issuecomment-529027196

Thanks for reviewing and merging! I'll create a separate PR for branch-2. Thanks for the guide.
[jira] [Commented] (HADOOP-13363) Upgrade protobuf from 2.5.0 to something newer
[ https://issues.apache.org/jira/browse/HADOOP-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924622#comment-16924622 ]

Vinayakumar B commented on HADOOP-13363:
----------------------------------------

Thanks [~stack] and [~anu]. I will take a look at the above pom later tomorrow. Thanks.

> Upgrade protobuf from 2.5.0 to something newer
> ----------------------------------------------
>
>                 Key: HADOOP-13363
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13363
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: build
>    Affects Versions: 3.0.0-alpha1, 3.0.0-alpha2
>            Reporter: Allen Wittenauer
>            Assignee: Vinayakumar B
>            Priority: Major
>              Labels: security
>         Attachments: HADOOP-13363.001.patch, HADOOP-13363.002.patch, HADOOP-13363.003.patch, HADOOP-13363.004.patch, HADOOP-13363.005.patch
>
> Standard protobuf 2.5.0 does not work properly on many platforms. (See, for example, https://gist.github.com/BennettSmith/7111094 ). In order for us to avoid crazy workarounds in the build environment, and given that 2.5.0 is slowly disappearing as a standard installable package even for Linux/x86, we need to either upgrade, self-bundle, or do something else.
[GitHub] [hadoop] smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
URL: https://github.com/apache/hadoop/pull/1360#discussion_r321917093

File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/protocolPB/OzoneManagerProtocolClientSideTranslatorPB.java

@@ -214,6 +216,11 @@
     this.clientID = clientId;
   }

+  public OzoneManagerProtocolClientSideTranslatorPB(OzoneConfiguration conf,

Review comment: @bharatviswa504 It turns out there is one caller here: https://github.com/apache/hadoop/blob/d69a1a0aa49614c084fa4b9546ace65aebe4/hadoop-ozone/ozone-recon/src/main/java/org/apache/hadoop/ozone/recon/ReconControllerModule.java#L102 But we can easily change it to use the new constructor. Shall we do that?
[GitHub] [hadoop] smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
URL: https://github.com/apache/hadoop/pull/1360#discussion_r321916578

File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/ha/OMFailoverProxyProvider.java

@@ -70,26 +71,46 @@
   private final UserGroupInformation ugi;
   private final Text delegationTokenService;

+  // TODO: Do we want this to be final?
+  private String omServiceId;
+
   public OMFailoverProxyProvider(OzoneConfiguration configuration,
-      UserGroupInformation ugi) throws IOException {
+      UserGroupInformation ugi, String omServiceId) throws IOException {
     this.conf = configuration;
     this.omVersion = RPC.getProtocolVersion(OzoneManagerProtocolPB.class);
     this.ugi = ugi;
-    loadOMClientConfigs(conf);
+    this.omServiceId = omServiceId;
+    loadOMClientConfigs(conf, this.omServiceId);
     this.delegationTokenService = computeDelegationTokenService();
     currentProxyIndex = 0;
     currentProxyOMNodeId = omNodeIDList.get(currentProxyIndex);
   }

-  private void loadOMClientConfigs(Configuration config) throws IOException {
+  public OMFailoverProxyProvider(OzoneConfiguration configuration,
+      UserGroupInformation ugi) throws IOException {
+    this(configuration, ugi, null);
+  }
+
+  private void loadOMClientConfigs(Configuration config, String omSvcId)
+      throws IOException {
     this.omProxies = new HashMap<>();
     this.omProxyInfos = new HashMap<>();
     this.omNodeIDList = new ArrayList<>();

-    Collection<String> omServiceIds = config.getTrimmedStringCollection(
-        OZONE_OM_SERVICE_IDS_KEY);
+    Collection<String> omServiceIds;
+    if (omSvcId == null) {
+      // When no OM service id is passed in
+      // Note: this branch will only be followed when omSvcId is null,
+      // meaning the host name/service id provided by user doesn't match any
+      // ozone.om.service.ids on the client side. Therefore, in this case
+      // just treat it as non-HA by assigning an empty list to omServiceIds
+      omServiceIds = new ArrayList<>();
+    } else {
+      omServiceIds = Collections.singletonList(omSvcId);
+    }
+    // TODO: Remove this warning? Or change the message?

Review comment: Done.
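The branch under review reduces to a small decision: a null service id means non-HA (an empty id list), anything else becomes a singleton list. A stand-alone sketch with illustrative names, not the actual OMFailoverProxyProvider code:

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;

// Illustrative names only; not the actual OMFailoverProxyProvider code.
public class ServiceIdResolver {

  /** Null service id: treat as non-HA (empty list); otherwise a singleton. */
  static Collection<String> resolve(String omSvcId) {
    return omSvcId == null
        ? new ArrayList<>()
        : Collections.singletonList(omSvcId);
  }

  public static void main(String[] args) {
    System.out.println(resolve(null).isEmpty());   // non-HA path
    System.out.println(resolve("om-ha-1").size()); // HA path
  }
}
```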
[GitHub] [hadoop] smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
URL: https://github.com/apache/hadoop/pull/1360#discussion_r321916457

File path: hadoop-ozone/common/src/main/java/org/apache/hadoop/ozone/om/ha/OMFailoverProxyProvider.java

@@ -70,26 +71,46 @@
   private final UserGroupInformation ugi;
   private final Text delegationTokenService;

+  // TODO: Do we want this to be final?
+  private String omServiceId;
+
   public OMFailoverProxyProvider(OzoneConfiguration configuration,
-      UserGroupInformation ugi) throws IOException {
+      UserGroupInformation ugi, String omServiceId) throws IOException {
     this.conf = configuration;
     this.omVersion = RPC.getProtocolVersion(OzoneManagerProtocolPB.class);
     this.ugi = ugi;
-    loadOMClientConfigs(conf);
+    this.omServiceId = omServiceId;
+    loadOMClientConfigs(conf, this.omServiceId);
     this.delegationTokenService = computeDelegationTokenService();
     currentProxyIndex = 0;
     currentProxyOMNodeId = omNodeIDList.get(currentProxyIndex);
   }

-  private void loadOMClientConfigs(Configuration config) throws IOException {
+  public OMFailoverProxyProvider(OzoneConfiguration configuration,
+      UserGroupInformation ugi) throws IOException {
+    this(configuration, ugi, null);
+  }
+
+  private void loadOMClientConfigs(Configuration config, String omSvcId)
+      throws IOException {
     this.omProxies = new HashMap<>();
     this.omProxyInfos = new HashMap<>();
     this.omNodeIDList = new ArrayList<>();

-    Collection<String> omServiceIds = config.getTrimmedStringCollection(
-        OZONE_OM_SERVICE_IDS_KEY);
+    Collection<String> omServiceIds;
+    if (omSvcId == null) {

Review comment: @bharatviswa504 You are right. We can remove the condition here completely as it makes no difference. Will do the refactoring in a new jira.
[GitHub] [hadoop] smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
URL: https://github.com/apache/hadoop/pull/1360#discussion_r321915663

File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/rpc/RpcClient.java

@@ -137,12 +137,22 @@
   private Text dtService;
   private final boolean topologyAwareReadEnabled;

+  /**
+   * Creates RpcClient instance with the given configuration.
+   * @param conf Configuration
+   * @throws IOException
+   */
+  public RpcClient(Configuration conf) throws IOException {

Review comment: @bharatviswa504 I believe the `VisibleForTesting` annotation doesn't hide the constructor from other code.
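Worth noting for the thread above: Guava's @VisibleForTesting is documentation only and does not restrict access, which is why deleting the legacy overload is the stronger option. A minimal sketch of the delegation pattern being discussed; class and field names are illustrative, not the actual RpcClient code:

```java
// Illustrative names only; not the actual RpcClient code.
public class HaClient {
  private final String conf;
  private final String omServiceId;

  /**
   * Legacy overload: delegates with a null service id so that it can be
   * deleted once all callers migrate to the two-argument form.
   */
  HaClient(String conf) {
    this(conf, null);
  }

  HaClient(String conf, String omServiceId) {
    this.conf = conf;
    this.omServiceId = omServiceId;
  }

  String serviceId() {
    return omServiceId;
  }

  public static void main(String[] args) {
    System.out.println(new HaClient("cfg").serviceId());        // null
    System.out.println(new HaClient("cfg", "om1").serviceId()); // om1
  }
}
```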
[GitHub] [hadoop] dineshchitlangia commented on issue #1386: HDDS-2015. Encrypt/decrypt key using symmetric key while writing/reading
dineshchitlangia commented on issue #1386: HDDS-2015. Encrypt/decrypt key using symmetric key while writing/reading
URL: https://github.com/apache/hadoop/pull/1386#issuecomment-529017454

Thanks @bharatviswa504, @ajayydv, @anuengineer for a great review! Thanks @anuengineer for the commit.
[jira] [Commented] (HADOOP-16255) ChecksumFS.Make FileSystem.rename(path, path, options) doesn't rename checksum
[ https://issues.apache.org/jira/browse/HADOOP-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924588#comment-16924588 ]

Hudson commented on HADOOP-16255:
---------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17247 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17247/])
HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) (stevel: rev bb0b922a71cba9ceaf00588e9f3e3b2a3c2e3eab)
* (add) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestChecksumFs.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/ChecksumFs.java

> ChecksumFS.Make FileSystem.rename(path, path, options) doesn't rename checksum
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-16255
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16255
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.8.5, 3.1.2
>            Reporter: Steve Loughran
>            Assignee: Jungtaek Lim
>            Priority: Major
>
> ChecksumFS doesn't override FilterFS rename/3, so doesn't rename the checksum
> with the file.
> As a result, if a file is renamed over an existing file using rename(src,
> dest, OVERWRITE) the renamed file will be considered to have an invalid
> checksum -the old one is picked up instead.
[jira] [Resolved] (HADOOP-16255) ChecksumFS.Make FileSystem.rename(path, path, options) doesn't rename checksum
[ https://issues.apache.org/jira/browse/HADOOP-16255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-16255.
-------------------------------------
     Fix Version/s: 3.2.1
        Resolution: Fixed

Committed to branch-3.2; once you've tested a backport to branch-2 we can put it in there too. Thanks!

> ChecksumFS.Make FileSystem.rename(path, path, options) doesn't rename checksum
> ------------------------------------------------------------------------------
>
>                 Key: HADOOP-16255
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16255
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.8.5, 3.1.2
>            Reporter: Steve Loughran
>            Assignee: Jungtaek Lim
>            Priority: Major
>             Fix For: 3.2.1
>
> ChecksumFS doesn't override FilterFS rename/3, so doesn't rename the checksum
> with the file.
> As a result, if a file is renamed over an existing file using rename(src,
> dest, OVERWRITE) the renamed file will be considered to have an invalid
> checksum -the old one is picked up instead.
[GitHub] [hadoop] steveloughran commented on issue #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
steveloughran commented on issue #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
URL: https://github.com/apache/hadoop/pull/1388#issuecomment-529011324

Ok, committed to branch-3.2 and trunk.
[GitHub] [hadoop] steveloughran closed pull request #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
steveloughran closed pull request #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
URL: https://github.com/apache/hadoop/pull/1388
[GitHub] [hadoop] smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids URL: https://github.com/apache/hadoop/pull/1360#discussion_r321904661 ## File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneClientFactory.java ## @@ -136,6 +136,31 @@ public static OzoneClient getRpcClient(String omHost, Integer omRpcPort, return getRpcClient(config); } + /** + * Returns an OzoneClient which will use RPC protocol. + * + * @param omServiceId + *Service ID of OzoneManager HA cluster. + * + * @param config + *Configuration to be used for OzoneClient creation + * + * @return OzoneClient + * + * @throws IOException + */ + public static OzoneClient getRpcClient(String omServiceId, + Configuration config) + throws IOException { +Preconditions.checkNotNull(omServiceId); +Preconditions.checkNotNull(config); +// Override ozone.om.address just in case it is used later. +// Because if this is not overridden, the (incorrect) value from xml +// will be used? +config.set(OZONE_OM_ADDRESS_KEY, omServiceId); Review comment: As discussed, I will remove this one. Thanks for pointing out!
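The `Preconditions.checkNotNull` guards in the diff above follow a fail-fast pattern: reject bad arguments at the factory boundary rather than deep inside client setup. A self-contained sketch of the same idea, using `java.util.Objects` instead of Guava and a hypothetical `describeClient` stand-in for the real factory method:

```java
import java.util.Objects;

public class RpcClientArgsSketch {

    // Hypothetical stand-in for OzoneClientFactory.getRpcClient: fail fast on
    // null arguments with a message naming the offending parameter.
    static String describeClient(String omServiceId, String config) {
        Objects.requireNonNull(omServiceId, "omServiceId == null");
        Objects.requireNonNull(config, "config == null");
        return "RpcClient(serviceId=" + omServiceId + ")";
    }

    public static void main(String[] args) {
        System.out.println(describeClient("omServiceA", "ozone-site.xml"));
        try {
            describeClient(null, "ozone-site.xml");
        } catch (NullPointerException expected) {
            // The null is caught at the entry point, not later in setup.
            System.out.println("rejected: " + expected.getMessage());
        }
    }
}
```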
[GitHub] [hadoop] steveloughran commented on issue #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
steveloughran commented on issue #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called. URL: https://github.com/apache/hadoop/pull/1388#issuecomment-529007530 all is good, +1 for trunk and I'll actually pull into 3.2 as well. For branch-2 I'll give you the homework of the cherrypick and retest before it goes in.
[GitHub] [hadoop] smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids
smengcl commented on a change in pull request #1360: HDDS-2007. Make ozone fs shell command work with OM HA service ids URL: https://github.com/apache/hadoop/pull/1360#discussion_r321902382 ## File path: hadoop-ozone/client/src/main/java/org/apache/hadoop/ozone/client/OzoneClientFactory.java ## @@ -136,6 +136,31 @@ public static OzoneClient getRpcClient(String omHost, Integer omRpcPort, return getRpcClient(config); } + /** + * Returns an OzoneClient which will use RPC protocol. + * + * @param omServiceId + *Service ID of OzoneManager HA cluster. + * + * @param config + *Configuration to be used for OzoneClient creation + * + * @return OzoneClient + * + * @throws IOException + */ + public static OzoneClient getRpcClient(String omServiceId, Review comment: @bharatviswa504 Yeah, I believe we discussed that and agreed to do this. But merging all those `getRpcClient()` calls would change something out of the scope of this jira. We should probably open another jira to merge those functions.
[GitHub] [hadoop] steveloughran commented on a change in pull request #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called.
steveloughran commented on a change in pull request #1388: HADOOP-16255. Add ChecksumFs.rename(path, path, boolean) to rename crc file as well when FileContext.rename(path, path, options) is called. URL: https://github.com/apache/hadoop/pull/1388#discussion_r321902124 ## File path: hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/CheckedBiFunction.java ## @@ -0,0 +1,29 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.util; + +import java.io.IOException; + +/** + * Defines a functional interface having two inputs which throws IOException. + */ +@FunctionalInterface +public interface CheckedBiFunction { Review comment: yeah, they should. I'm doing it in new code that I know isn't going to be backportable into jdk7, now that we have to worry about jdk11
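The `CheckedBiFunction` interface under review lets lambdas that throw `IOException` be passed as arguments without wrapping the checked exception. A minimal self-contained sketch of the pattern (the generic parameters and the `apply`/`call` names here are illustrative, not necessarily the exact signature in the patch):

```java
import java.io.IOException;

public class CheckedBiFunctionSketch {

    // Two-input function whose body may throw IOException, so I/O-bearing
    // lambdas don't have to tunnel checked exceptions through unchecked ones.
    @FunctionalInterface
    interface CheckedBiFunction<T, U, R> {
        R apply(T t, U u) throws IOException;
    }

    // Generic helper that invokes the function, propagating IOException.
    static <T, U, R> R call(CheckedBiFunction<T, U, R> fn, T t, U u)
            throws IOException {
        return fn.apply(t, u);
    }

    public static void main(String[] args) throws IOException {
        // The lambda compiles against the interface even though apply()
        // declares a checked exception.
        CheckedBiFunction<String, String, String> join = (a, b) -> a + "/" + b;
        System.out.println(call(join, "dir", "file.crc"));
    }
}
```

The standard `java.util.function.BiFunction` cannot be used here because its `apply` declares no checked exceptions, which is exactly the gap this interface fills.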
[GitHub] [hadoop] anaeimian opened a new pull request #1412: Avoiding logging Sasl message
anaeimian opened a new pull request #1412: Avoiding logging Sasl message URL: https://github.com/apache/hadoop/pull/1412 Per a previous issue (HADOOP-11962), the SASL message should not be logged. All other instances of logging the SASL message have already been removed; this one remains.
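For context on the fix: the concern in HADOOP-11962 is that a raw SASL token is security-sensitive, so log statements should report only non-sensitive metadata about it. A hedged illustration of that principle (`safeLogLine` is an invented helper for this sketch, not Hadoop code):

```java
public class SaslLogSketch {

    // Invented helper: describe a SASL token for logging without ever
    // putting its bytes (or a String decoding of them) into the message.
    static String safeLogLine(byte[] saslToken) {
        return "Received SASL message of " + saslToken.length + " bytes";
    }

    public static void main(String[] args) {
        byte[] token = {0x01, 0x02, 0x03};
        // The log line reveals the size, never the token contents.
        System.out.println(safeLogLine(token));
    }
}
```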
[GitHub] [hadoop] steveloughran commented on issue #1302: HADOOP-16138. hadoop fs mkdir / of nonexistent abfs container raises NPE
steveloughran commented on issue #1302: HADOOP-16138. hadoop fs mkdir / of nonexistent abfs container raises NPE URL: https://github.com/apache/hadoop/pull/1302#issuecomment-529004456 OK, so this is the test. What about the underlying NPE?
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1302: HADOOP-16138. hadoop fs mkdir / of nonexistent abfs container raises NPE
hadoop-yetus removed a comment on issue #1302: HADOOP-16138. hadoop fs mkdir / of nonexistent abfs container raises NPE URL: https://github.com/apache/hadoop/pull/1302#issuecomment-525199814 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 52 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. | ||| _ trunk Compile Tests _ | | +1 | mvninstall | 1247 | trunk passed | | +1 | compile | 32 | trunk passed | | +1 | checkstyle | 25 | trunk passed | | +1 | mvnsite | 35 | trunk passed | | +1 | shadedclient | 783 | branch has no errors when building and testing our client artifacts. | | +1 | javadoc | 27 | trunk passed | | 0 | spotbugs | 57 | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 | findbugs | 55 | trunk passed | ||| _ Patch Compile Tests _ | | +1 | mvninstall | 30 | the patch passed | | +1 | compile | 26 | the patch passed | | +1 | javac | 26 | the patch passed | | +1 | checkstyle | 19 | the patch passed | | +1 | mvnsite | 28 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 775 | patch has no errors when building and testing our client artifacts. | | +1 | javadoc | 23 | the patch passed | | +1 | findbugs | 55 | the patch passed | ||| _ Other Tests _ | | +1 | unit | 74 | hadoop-azure in the patch passed. | | +1 | asflicense | 31 | The patch does not generate ASF License warnings. 
| | | | 3425 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1302/6/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1302 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 6816f89b6687 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 3329257 | | Default Java | 1.8.0_222 | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1302/6/testReport/ | | Max. process+thread count | 446 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-azure U: hadoop-tools/hadoop-azure | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1302/6/console | | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated.
[GitHub] [hadoop] steveloughran commented on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
steveloughran commented on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving URL: https://github.com/apache/hadoop/pull/1041#issuecomment-529003436 @mackrorysd do you want to revisit this? It's to tell us developers working on the class that this is sometimes used directly by specific tools and they need to take care when changing things.
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving URL: https://github.com/apache/hadoop/pull/1041#issuecomment-515459809 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 72 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ trunk Compile Tests _ | | +1 | mvninstall | 1215 | trunk passed | | +1 | compile | 34 | trunk passed | | +1 | checkstyle | 24 | trunk passed | | +1 | mvnsite | 40 | trunk passed | | +1 | shadedclient | 757 | branch has no errors when building and testing our client artifacts. | | +1 | javadoc | 27 | trunk passed | | 0 | spotbugs | 69 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 66 | hadoop-tools/hadoop-aws in trunk has 1 extant findbugs warnings. | ||| _ Patch Compile Tests _ | | +1 | mvninstall | 42 | the patch passed | | +1 | compile | 37 | the patch passed | | +1 | javac | 37 | the patch passed | | +1 | checkstyle | 20 | the patch passed | | +1 | mvnsite | 42 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 840 | patch has no errors when building and testing our client artifacts. | | +1 | javadoc | 25 | the patch passed | | +1 | findbugs | 79 | the patch passed | ||| _ Other Tests _ | | +1 | unit | 295 | hadoop-aws in the patch passed. | | +1 | asflicense | 28 | The patch does not generate ASF License warnings. 
| | | | 3723 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/3/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1041 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 047465ae3480 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / c0a0c35 | | Default Java | 1.8.0_212 | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/3/artifact/out/branch-findbugs-hadoop-tools_hadoop-aws-warnings.html | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/3/testReport/ | | Max. process+thread count | 412 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/3/console | | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated.
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving URL: https://github.com/apache/hadoop/pull/1041#issuecomment-513202452 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 92 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 1 | The patch does not contain any @author tags. | | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ trunk Compile Tests _ | | +1 | mvninstall | 1365 | trunk passed | | +1 | compile | 44 | trunk passed | | +1 | checkstyle | 27 | trunk passed | | +1 | mvnsite | 48 | trunk passed | | +1 | shadedclient | 908 | branch has no errors when building and testing our client artifacts. | | +1 | javadoc | 32 | trunk passed | | 0 | spotbugs | 73 | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 | findbugs | 71 | trunk passed | ||| _ Patch Compile Tests _ | | +1 | mvninstall | 40 | the patch passed | | +1 | compile | 44 | the patch passed | | +1 | javac | 44 | the patch passed | | +1 | checkstyle | 22 | the patch passed | | +1 | mvnsite | 42 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 919 | patch has no errors when building and testing our client artifacts. | | +1 | javadoc | 25 | the patch passed | | +1 | findbugs | 69 | the patch passed | ||| _ Other Tests _ | | +1 | unit | 299 | hadoop-aws in the patch passed. | | +1 | asflicense | 30 | The patch does not generate ASF License warnings. 
| | | | 4153 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=18.09.8 Server=18.09.8 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/2/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1041 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 738153e5a963 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 4e66cb9 | | Default Java | 1.8.0_212 | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/2/testReport/ | | Max. process+thread count | 308 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/2/console | | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving URL: https://github.com/apache/hadoop/pull/1041#issuecomment-519347472 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 71 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ trunk Compile Tests _ | | +1 | mvninstall | 1213 | trunk passed | | +1 | compile | 34 | trunk passed | | +1 | checkstyle | 26 | trunk passed | | +1 | mvnsite | 41 | trunk passed | | +1 | shadedclient | 832 | branch has no errors when building and testing our client artifacts. | | +1 | javadoc | 28 | trunk passed | | 0 | spotbugs | 69 | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 | findbugs | 67 | trunk passed | ||| _ Patch Compile Tests _ | | +1 | mvninstall | 37 | the patch passed | | +1 | compile | 29 | the patch passed | | +1 | javac | 29 | the patch passed | | +1 | checkstyle | 20 | the patch passed | | +1 | mvnsite | 35 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 852 | patch has no errors when building and testing our client artifacts. | | +1 | javadoc | 25 | the patch passed | | +1 | findbugs | 82 | the patch passed | ||| _ Other Tests _ | | +1 | unit | 309 | hadoop-aws in the patch passed. | | +1 | asflicense | 30 | The patch does not generate ASF License warnings. 
| | | | 3807 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/5/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1041 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux be6422046785 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 70b4617 | | Default Java | 1.8.0_222 | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/5/testReport/ | | Max. process+thread count | 306 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/5/console | | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving URL: https://github.com/apache/hadoop/pull/1041#issuecomment-519476615 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 45 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ trunk Compile Tests _ | | +1 | mvninstall | 1026 | trunk passed | | +1 | compile | 27 | trunk passed | | +1 | checkstyle | 21 | trunk passed | | +1 | mvnsite | 31 | trunk passed | | +1 | shadedclient | 660 | branch has no errors when building and testing our client artifacts. | | +1 | javadoc | 21 | trunk passed | | 0 | spotbugs | 55 | Used deprecated FindBugs config; considering switching to SpotBugs. | | +1 | findbugs | 53 | trunk passed | ||| _ Patch Compile Tests _ | | +1 | mvninstall | 33 | the patch passed | | +1 | compile | 24 | the patch passed | | +1 | javac | 24 | the patch passed | | +1 | checkstyle | 16 | the patch passed | | +1 | mvnsite | 29 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 723 | patch has no errors when building and testing our client artifacts. | | +1 | javadoc | 21 | the patch passed | | +1 | findbugs | 60 | the patch passed | ||| _ Other Tests _ | | -1 | unit | 266 | hadoop-aws in the patch failed. | | +1 | asflicense | 26 | The patch does not generate ASF License warnings. 
| | | | 3151 | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.fs.s3a.commit.staging.TestStagingCommitter | | | hadoop.fs.s3a.s3guard.TestNullMetadataStore | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/6/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1041 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 59c4ea1b2e0a 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 397a563 | | Default Java | 1.8.0_222 | | unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/6/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/6/testReport/ | | Max. process+thread count | 447 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/6/console | | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving URL: https://github.com/apache/hadoop/pull/1041#issuecomment-517513363 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 77 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 0 | No case conflicting files found. | | +1 | @author | 1 | The patch does not contain any @author tags. | | -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. | ||| _ trunk Compile Tests _ | | +1 | mvninstall | 1405 | trunk passed | | +1 | compile | 46 | trunk passed | | +1 | checkstyle | 27 | trunk passed | | +1 | mvnsite | 48 | trunk passed | | +1 | shadedclient | 858 | branch has no errors when building and testing our client artifacts. | | +1 | javadoc | 32 | trunk passed | | 0 | spotbugs | 75 | Used deprecated FindBugs config; considering switching to SpotBugs. | | -1 | findbugs | 73 | hadoop-tools/hadoop-aws in trunk has 1 extant findbugs warnings. | ||| _ Patch Compile Tests _ | | +1 | mvninstall | 37 | the patch passed | | +1 | compile | 35 | the patch passed | | +1 | javac | 35 | the patch passed | | +1 | checkstyle | 20 | the patch passed | | +1 | mvnsite | 43 | the patch passed | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | shadedclient | 855 | patch has no errors when building and testing our client artifacts. | | +1 | javadoc | 22 | the patch passed | | +1 | findbugs | 67 | the patch passed | ||| _ Other Tests _ | | +1 | unit | 307 | hadoop-aws in the patch passed. | | +1 | asflicense | 31 | The patch does not generate ASF License warnings. 
| | | | 4044 | | | Subsystem | Report/Notes | |--:|:-| | Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/4/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/1041 | | Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle | | uname | Linux 86d0c519a07f 4.15.0-54-generic #58-Ubuntu SMP Mon Jun 24 10:55:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | personality/hadoop.sh | | git revision | trunk / 17e8cf5 | | Default Java | 1.8.0_222 | | findbugs | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/4/artifact/out/branch-findbugs-hadoop-tools_hadoop-aws-warnings.html | | Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/4/testReport/ | | Max. process+thread count | 326 (vs. ulimit of 5500) | | modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws | | Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/4/console | | versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 | | Powered by | Apache Yetus 0.10.0 http://yetus.apache.org | This message was automatically generated. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
URL: https://github.com/apache/hadoop/pull/1041#issuecomment-522081102

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| 0 | reexec | 44 | Docker mode activated. |
||| _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ trunk Compile Tests _ |
| +1 | mvninstall | 1035 | trunk passed |
| +1 | compile | 29 | trunk passed |
| +1 | checkstyle | 21 | trunk passed |
| +1 | mvnsite | 33 | trunk passed |
| +1 | shadedclient | 668 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 22 | trunk passed |
| 0 | spotbugs | 56 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 55 | trunk passed |
||| _ Patch Compile Tests _ |
| +1 | mvninstall | 29 | the patch passed |
| +1 | compile | 24 | the patch passed |
| +1 | javac | 24 | the patch passed |
| +1 | checkstyle | 15 | the patch passed |
| +1 | mvnsite | 30 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 698 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 23 | the patch passed |
| +1 | findbugs | 63 | the patch passed |
||| _ Other Tests _ |
| +1 | unit | 70 | hadoop-aws in the patch passed. |
| +1 | asflicense | 26 | The patch does not generate ASF License warnings. |
| | | 2950 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/7/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1041 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 44b12f47119a 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / e356e4f |
| Default Java | 1.8.0_222 |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/7/testReport/ |
| Max. process+thread count | 446 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/7/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
URL: https://github.com/apache/hadoop/pull/1041#issuecomment-523044381

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| 0 | reexec | 45 | Docker mode activated. |
||| _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ trunk Compile Tests _ |
| +1 | mvninstall | 1088 | trunk passed |
| +1 | compile | 26 | trunk passed |
| +1 | checkstyle | 18 | trunk passed |
| +1 | mvnsite | 31 | trunk passed |
| +1 | shadedclient | 678 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 23 | trunk passed |
| 0 | spotbugs | 54 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 53 | trunk passed |
||| _ Patch Compile Tests _ |
| +1 | mvninstall | 36 | the patch passed |
| +1 | compile | 24 | the patch passed |
| +1 | javac | 24 | the patch passed |
| +1 | checkstyle | 16 | the patch passed |
| +1 | mvnsite | 29 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 728 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 22 | the patch passed |
| +1 | findbugs | 57 | the patch passed |
||| _ Other Tests _ |
| +1 | unit | 74 | hadoop-aws in the patch passed. |
| +1 | asflicense | 25 | The patch does not generate ASF License warnings. |
| | | 3057 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/8/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1041 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux e1cbb6b18a0e 4.4.0-138-generic #164-Ubuntu SMP Tue Oct 2 17:16:02 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 094d736 |
| Default Java | 1.8.0_222 |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/8/testReport/ |
| Max. process+thread count | 447 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/8/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
URL: https://github.com/apache/hadoop/pull/1041#issuecomment-525244667

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| 0 | reexec | 42 | Docker mode activated. |
||| _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ trunk Compile Tests _ |
| +1 | mvninstall | 1142 | trunk passed |
| +1 | compile | 32 | trunk passed |
| +1 | checkstyle | 26 | trunk passed |
| +1 | mvnsite | 36 | trunk passed |
| +1 | shadedclient | 709 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 25 | trunk passed |
| 0 | spotbugs | 65 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 63 | trunk passed |
||| _ Patch Compile Tests _ |
| +1 | mvninstall | 31 | the patch passed |
| +1 | compile | 27 | the patch passed |
| +1 | javac | 27 | the patch passed |
| +1 | checkstyle | 19 | the patch passed |
| +1 | mvnsite | 32 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 744 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 24 | the patch passed |
| +1 | findbugs | 64 | the patch passed |
||| _ Other Tests _ |
| +1 | unit | 80 | hadoop-aws in the patch passed. |
| +1 | asflicense | 28 | The patch does not generate ASF License warnings. |
| | | 3208 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/10/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1041 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux acaf590bf412 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 3329257 |
| Default Java | 1.8.0_222 |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/10/testReport/ |
| Max. process+thread count | 441 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/10/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
hadoop-yetus removed a comment on issue #1041: HADOOP-15844. Tag S3GuardTool entry points as LimitedPrivate/Evolving
URL: https://github.com/apache/hadoop/pull/1041#issuecomment-523912741

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| 0 | reexec | 42 | Docker mode activated. |
||| _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ trunk Compile Tests _ |
| +1 | mvninstall | 1071 | trunk passed |
| +1 | compile | 29 | trunk passed |
| +1 | checkstyle | 19 | trunk passed |
| +1 | mvnsite | 30 | trunk passed |
| +1 | shadedclient | 687 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 22 | trunk passed |
| 0 | spotbugs | 55 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 54 | trunk passed |
||| _ Patch Compile Tests _ |
| +1 | mvninstall | 31 | the patch passed |
| +1 | compile | 24 | the patch passed |
| +1 | javac | 24 | the patch passed |
| +1 | checkstyle | 15 | the patch passed |
| +1 | mvnsite | 31 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 703 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 21 | the patch passed |
| +1 | findbugs | 58 | the patch passed |
||| _ Other Tests _ |
| +1 | unit | 68 | hadoop-aws in the patch passed. |
| +1 | asflicense | 25 | The patch does not generate ASF License warnings. |
| | | 3013 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/9/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1041 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle |
| uname | Linux 9f6a5e25e563 4.4.0-139-generic #165-Ubuntu SMP Wed Oct 24 10:58:50 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / 69ddb36 |
| Default Java | 1.8.0_222 |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/9/testReport/ |
| Max. process+thread count | 413 (vs. ulimit of 5500) |
| modules | C: hadoop-tools/hadoop-aws U: hadoop-tools/hadoop-aws |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1041/9/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] hadoop-yetus commented on issue #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
hadoop-yetus commented on issue #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
URL: https://github.com/apache/hadoop/pull/1411#issuecomment-528994454

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| 0 | reexec | 108 | Docker mode activated. |
||| _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| 0 | shelldocs | 0 | Shelldocs was not available. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| -1 | test4tests | 0 | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
||| _ trunk Compile Tests _ |
| +1 | mvninstall | 724 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 896 | branch has no errors when building and testing our client artifacts. |
||| _ Patch Compile Tests _ |
| +1 | mvninstall | 616 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| -1 | shellcheck | 33 | The patch generated 1 new + 3 unchanged - 0 fixed = 4 total (was 3) |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | shadedclient | 770 | patch has no errors when building and testing our client artifacts. |
||| _ Other Tests _ |
| +1 | unit | 112 | hadoop-hdds in the patch passed. |
| +1 | unit | 297 | hadoop-ozone in the patch passed. |
| +1 | asflicense | 45 | The patch does not generate ASF License warnings. |
| | | 3806 | |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.2 Server=19.03.2 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1411/1/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1411 |
| Optional Tests | dupname asflicense mvnsite unit shellcheck shelldocs |
| uname | Linux 5ce4281a3c08 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / b15c116 |
| shellcheck | https://builds.apache.org/job/hadoop-multibranch/job/PR-1411/1/artifact/out/diff-patch-shellcheck.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1411/1/testReport/ |
| Max. process+thread count | 306 (vs. ulimit of 5500) |
| modules | C: hadoop-ozone/common U: hadoop-ozone/common |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1411/1/console |
| versions | git=2.7.4 maven=3.3.9 shellcheck=0.4.6 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] hadoop-yetus commented on a change in pull request #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
hadoop-yetus commented on a change in pull request #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
URL: https://github.com/apache/hadoop/pull/1411#discussion_r321889337

## File path: hadoop-ozone/common/src/main/bin/ozone

```
@@ -69,6 +69,12 @@ function ozonecmd_case
   subcmd=$1
   shift

+  ozone_default_log4j="${HADOOP_CONF_DIR}/log4j.properties"
+  ozone_shell_log4j="${HADOOP_CONF_DIR}/ozone-shell-log4j.properties"
+  if [ ! -f ${ozone_shell_log4j} ]; then
```

Review comment:
shellcheck:13: note: Double quote to prevent globbing and word splitting. [SC2086]
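The SC2086 warning above can be reproduced in a standalone sketch. The variable names mirror the patch under review, but the `HADOOP_CONF_DIR` value here is made up (and deliberately contains a space, the case that word splitting breaks):

```shell
#!/usr/bin/env bash
# Minimal sketch of the shellcheck SC2086 issue flagged in the review.
# HADOOP_CONF_DIR is a hypothetical value with a space in it.
HADOOP_CONF_DIR="/tmp/hadoop conf"
ozone_shell_log4j="${HADOOP_CONF_DIR}/ozone-shell-log4j.properties"

# Unquoted, as in the patch:  [ ! -f ${ozone_shell_log4j} ]
# expands to  [ ! -f /tmp/hadoop conf/ozone-shell-log4j.properties ]
# which hands the test two words instead of one path.
# Double quoting keeps the path a single word:
if [ ! -f "${ozone_shell_log4j}" ]; then
  echo "ozone-shell-log4j.properties not found, falling back to default log4j"
fi
```

With the quotes, the `-f` test sees one argument regardless of spaces in `HADOOP_CONF_DIR`, which is exactly what the shellcheck note asks for.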
[GitHub] [hadoop] hadoop-yetus commented on issue #1409: HDDS-2087. Remove the hard coded config key in ChunkManager
hadoop-yetus commented on issue #1409: HDDS-2087. Remove the hard coded config key in ChunkManager
URL: https://github.com/apache/hadoop/pull/1409#issuecomment-528985415

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Comment |
|:----:|----------:|:--------|:--------|
| 0 | reexec | 41 | Docker mode activated. |
||| _ Prechecks _ |
| +1 | dupname | 0 | No case conflicting files found. |
| +1 | @author | 0 | The patch does not contain any @author tags. |
| +1 | test4tests | 0 | The patch appears to include 1 new or modified test files. |
||| _ trunk Compile Tests _ |
| 0 | mvndep | 24 | Maven dependency ordering for branch |
| +1 | mvninstall | 592 | trunk passed |
| +1 | compile | 378 | trunk passed |
| +1 | checkstyle | 80 | trunk passed |
| +1 | mvnsite | 0 | trunk passed |
| +1 | shadedclient | 878 | branch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 175 | trunk passed |
| 0 | spotbugs | 420 | Used deprecated FindBugs config; considering switching to SpotBugs. |
| +1 | findbugs | 615 | trunk passed |
||| _ Patch Compile Tests _ |
| 0 | mvndep | 39 | Maven dependency ordering for patch |
| +1 | mvninstall | 537 | the patch passed |
| +1 | compile | 385 | the patch passed |
| +1 | javac | 385 | the patch passed |
| +1 | checkstyle | 87 | the patch passed |
| +1 | mvnsite | 0 | the patch passed |
| +1 | whitespace | 0 | The patch has no whitespace issues. |
| +1 | xml | 1 | The patch has no ill-formed XML file. |
| +1 | shadedclient | 683 | patch has no errors when building and testing our client artifacts. |
| +1 | javadoc | 175 | the patch passed |
| +1 | findbugs | 630 | the patch passed |
||| _ Other Tests _ |
| +1 | unit | 283 | hadoop-hdds in the patch passed. |
| -1 | unit | 184 | hadoop-ozone in the patch failed. |
| +1 | asflicense | 45 | The patch does not generate ASF License warnings. |
| | | 6025 | |

| Reason | Tests |
|-------:|:------|
| Failed junit tests | hadoop.ozone.om.ratis.TestOzoneManagerRatisServer |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | Client=19.03.1 Server=19.03.1 base: https://builds.apache.org/job/hadoop-multibranch/job/PR-1409/2/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/1409 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite unit shadedclient findbugs checkstyle xml |
| uname | Linux 494335476d80 4.15.0-58-generic #64-Ubuntu SMP Tue Aug 6 11:12:41 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | personality/hadoop.sh |
| git revision | trunk / a234175 |
| Default Java | 1.8.0_222 |
| unit | https://builds.apache.org/job/hadoop-multibranch/job/PR-1409/2/artifact/out/patch-unit-hadoop-ozone.txt |
| Test Results | https://builds.apache.org/job/hadoop-multibranch/job/PR-1409/2/testReport/ |
| Max. process+thread count | 1286 (vs. ulimit of 5500) |
| modules | C: hadoop-hdds/common hadoop-hdds/container-service hadoop-ozone/integration-test U: . |
| Console output | https://builds.apache.org/job/hadoop-multibranch/job/PR-1409/2/console |
| versions | git=2.7.4 maven=3.3.9 findbugs=3.1.0-RC1 |
| Powered by | Apache Yetus 0.10.0 http://yetus.apache.org |

This message was automatically generated.
[GitHub] [hadoop] anuengineer commented on issue #1409: HDDS-2087. Remove the hard coded config key in ChunkManager
anuengineer commented on issue #1409: HDDS-2087. Remove the hard coded config key in ChunkManager
URL: https://github.com/apache/hadoop/pull/1409#issuecomment-528978496

+1, I will commit this after the test run. Thx
[jira] [Commented] (HADOOP-13363) Upgrade protobuf from 2.5.0 to something newer
[ https://issues.apache.org/jira/browse/HADOOP-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924523#comment-16924523 ]

Anu Engineer commented on HADOOP-13363:
---------------------------------------

+1 on the protoc maven approach. Ozone also uses it and we have never had a problem in Jenkins or other build systems (we build and test using Argo/K8s internally).

> Upgrade protobuf from 2.5.0 to something newer
> ----------------------------------------------
>
> Key: HADOOP-13363
> URL: https://issues.apache.org/jira/browse/HADOOP-13363
> Project: Hadoop Common
> Issue Type: Improvement
> Components: build
> Affects Versions: 3.0.0-alpha1, 3.0.0-alpha2
> Reporter: Allen Wittenauer
> Assignee: Vinayakumar B
> Priority: Major
> Labels: security
> Attachments: HADOOP-13363.001.patch, HADOOP-13363.002.patch, HADOOP-13363.003.patch, HADOOP-13363.004.patch, HADOOP-13363.005.patch
>
> Standard protobuf 2.5.0 does not work properly on many platforms. (See, for example, https://gist.github.com/BennettSmith/7111094 ). To avoid crazy workarounds in the build environment, and because 2.5.0 is slowly disappearing as a standard installable package even for Linux/x86, we need to either upgrade, self-bundle, or do something else.

--
This message was sent by Atlassian Jira
(v8.3.2#803003)
[GitHub] [hadoop] avijayanhwx commented on issue #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
avijayanhwx commented on issue #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
URL: https://github.com/apache/hadoop/pull/1411#issuecomment-528973963

/label ozone
[GitHub] [hadoop] avijayanhwx opened a new pull request #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
avijayanhwx opened a new pull request #1411: HDDS-2098 : Ozone shell command prints out ERROR when the log4j file …
URL: https://github.com/apache/hadoop/pull/1411

…is not present. Manually tested change on cluster.
[jira] [Commented] (HADOOP-13363) Upgrade protobuf from 2.5.0 to something newer
[ https://issues.apache.org/jira/browse/HADOOP-13363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924511#comment-16924511 ]

stack commented on HADOOP-13363:
--------------------------------

Perhaps this pom helps? https://github.com/apache/hbase/blob/master/hbase-protocol-shaded/pom.xml

It's the hackery from hbase for generating a shaded pb module inline with the main build. PB files are generated on the fly using the godsend protobuf-maven-plugin, which pulls the appropriate protoc at build time (so there is no need to set up a protoc or a protoc path). The replacer plugin rewrites the generated pbs so they are shaded and in place for downstream modules at build time. I remember getting the ordering and shading correct was a pain, but I have unfortunately forgotten the details.
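For readers unfamiliar with the approach stack describes, a hedged sketch of the plugin configuration follows. The plugin (`org.xolstice.maven.plugins:protobuf-maven-plugin`) is real; the version numbers and protoc coordinates below are illustrative, not the ones Hadoop or HBase actually use, and `${os.detected.classifier}` additionally requires the `os-maven-plugin` build extension:

```xml
<!-- Illustrative sketch: resolve a platform-matched protoc from Maven
     Central at build time, so no protoc install is needed on the host. -->
<plugin>
  <groupId>org.xolstice.maven.plugins</groupId>
  <artifactId>protobuf-maven-plugin</artifactId>
  <version>0.6.1</version>
  <configuration>
    <!-- protoc binary is downloaded for the detected OS/arch -->
    <protocArtifact>com.google.protobuf:protoc:3.7.1:exe:${os.detected.classifier}</protocArtifact>
  </configuration>
  <executions>
    <execution>
      <goals><goal>compile</goal></goals>
    </execution>
  </executions>
</plugin>
```

This is the property that makes the approach robust on Jenkins and other build systems: the protoc version is pinned in the pom rather than depending on whatever is installed on the build host.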
[GitHub] [hadoop] anuengineer commented on issue #1386: HDDS-2015. Encrypt/decrypt key using symmetric key while writing/reading
anuengineer commented on issue #1386: HDDS-2015. Encrypt/decrypt key using symmetric key while writing/reading
URL: https://github.com/apache/hadoop/pull/1386#issuecomment-528969668

@ajayydv @bharatviswa504 Thanks for the comments. @dineshchitlangia Thanks for the contribution. I have committed this patch to the trunk branch.
[GitHub] [hadoop] anuengineer closed pull request #1386: HDDS-2015. Encrypt/decrypt key using symmetric key while writing/reading
anuengineer closed pull request #1386: HDDS-2015. Encrypt/decrypt key using symmetric key while writing/reading
URL: https://github.com/apache/hadoop/pull/1386
[jira] [Commented] (HADOOP-16531) Log more detail for slow RPC
[ https://issues.apache.org/jira/browse/HADOOP-16531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924457#comment-16924457 ]

Hudson commented on HADOOP-16531:
---------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17244 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17244/])
HADOOP-16531. Log more timing information for slow RPCs. Contributed by (xkrogen: rev a23417533e1ee052893baf207ec636c4993c5994)
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/ipc/Server.java

> Log more detail for slow RPC
> ----------------------------
>
> Key: HADOOP-16531
> URL: https://issues.apache.org/jira/browse/HADOOP-16531
> Project: Hadoop Common
> Issue Type: Improvement
> Reporter: Chen Zhang
> Assignee: Chen Zhang
> Priority: Major
> Fix For: 3.3.0
> Attachments: HADOOP-16531.001.patch
>
> The current implementation only logs the processing time:
> {code:java}
> if ((rpcMetrics.getProcessingSampleCount() > minSampleSize) &&
>     (processingTime > threeSigma)) {
>   LOG.warn("Slow RPC : {} took {} {} to process from client {}",
>       methodName, processingTime, RpcMetrics.TIMEUNIT, call);
>   rpcMetrics.incrSlowRpc();
> }
> {code}
> We need to log more detail to help locate the problem (e.g. how long it takes to acquire the lock, hold the lock, or do other things).
[jira] [Updated] (HADOOP-16531) Log more detail for slow RPC
[ https://issues.apache.org/jira/browse/HADOOP-16531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen updated HADOOP-16531:
---------------------------------
Fix Version/s: 3.3.0
Resolution: Fixed
Status: Resolved (was: Patch Available)

I just committed this to trunk. Thanks a lot for the contribution [~zhangchen]!
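The issue asks for a per-phase breakdown (lock wait, lock hold, etc.) instead of a single processing time. A minimal sketch of that idea follows; it is NOT the committed HADOOP-16531 patch, and the `SlowRpcDetail` class and phase names are illustrative:

```java
// Hedged sketch: accumulate per-phase durations for one RPC so that the
// slow-RPC warning can say where the time actually went. Illustrative
// only; the real patch instruments org.apache.hadoop.ipc.Server directly.
import java.util.LinkedHashMap;
import java.util.Map;

public class SlowRpcDetail {
    // insertion order preserved so the log reads in execution order
    private final Map<String, Long> phaseMillis = new LinkedHashMap<>();

    void record(String phase, long millis) {
        phaseMillis.merge(phase, millis, Long::sum);
    }

    long totalMillis() {
        return phaseMillis.values().stream().mapToLong(Long::longValue).sum();
    }

    // builds a message in the spirit of the existing "Slow RPC" warning,
    // extended with the per-phase breakdown
    String format(String methodName) {
        StringBuilder sb = new StringBuilder("Slow RPC : ").append(methodName)
                .append(" took ").append(totalMillis()).append(" ms (");
        phaseMillis.forEach((p, ms) -> sb.append(p).append('=').append(ms).append("ms "));
        return sb.toString().trim() + ")";
    }

    public static void main(String[] args) {
        SlowRpcDetail d = new SlowRpcDetail();
        d.record("lockWait", 120);
        d.record("lockHeld", 900);
        d.record("response", 30);
        System.out.println(d.format("getBlockLocations"));
    }
}
```

Running `main` prints one warning-style line with the total and the breakdown, which is the kind of detail the issue asks for when hunting lock contention.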
[jira] [Commented] (HADOOP-15565) ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak.
[ https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924443#comment-16924443 ]

Hudson commented on HADOOP-15565:
---------------------------------

FAILURE: Integrated in Jenkins build Hadoop-trunk-Commit #17242 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/17242/])
HADOOP-15565. Add an inner FS cache to ViewFileSystem, separate from the (xkrogen: rev c92a3e94d80c86199e65735ee5aec4a6f02f50a3)
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/TestChRootedFileSystem.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/ViewFileSystemBaseTest.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ChRootedFileSystem.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/ViewFileSystem.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/TestFileUtil.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFsDefaultValue.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemDelegation.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java
* (edit) hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/viewfs/Constants.java
* (edit) hadoop-common-project/hadoop-common/src/test/java/org/apache/hadoop/fs/viewfs/TestViewFileSystemDelegationTokenSupport.java

> ViewFileSystem.close doesn't close child filesystems and causes FileSystem
> objects leak.
> ------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-15565
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15565
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Jinglun
>            Assignee: Jinglun
>            Priority: Major
>             Fix For: 3.3.0
>
>         Attachments: HADOOP-15565.0001.patch, HADOOP-15565.0002.patch, HADOOP-15565.0003.patch, HADOOP-15565.0004.patch, HADOOP-15565.0005.patch, HADOOP-15565.0006.bak, HADOOP-15565.0006.patch, HADOOP-15565.0007.patch, HADOOP-15565.0008.patch
>
>
> ViewFileSystem.close() does nothing but remove itself from FileSystem.CACHE.
> Its child filesystems are cached in FileSystem.CACHE and shared by all the
> ViewFileSystem instances. We couldn't simply close all the child filesystems
> because it would break the semantics of FileSystem.newInstance().
> We might add an inner cache to ViewFileSystem and let it cache all the child
> filesystems. The child filesystems are then no longer shared. When a
> ViewFileSystem is closed, we close all the child filesystems in the inner
> cache. The ViewFileSystem is still cached by FileSystem.CACHE, so there won't
> be too many FileSystem instances.
> FileSystem.CACHE caches the ViewFileSystem instance, and the other
> instances (the child filesystems) are cached in the inner cache.
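The inner-cache idea described in the issue can be sketched in a few lines: the view keeps a private map of the children it created, so close() can safely close all of them without touching the shared FileSystem.CACHE. All names below are hypothetical stand-ins, not the actual ViewFileSystem implementation:

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/**
 * Illustrative sketch of the "inner cache" idea from HADOOP-15565:
 * the view filesystem keeps its own private cache of child filesystems
 * (instead of sharing a global cache) so that close() can safely close
 * every child it created. Hypothetical types, not Hadoop's API.
 */
public class InnerFsCache implements Closeable {
    private final Map<String, Closeable> children = new HashMap<>();

    /** Returns the cached child for the URI, creating it on first use. */
    public Closeable get(String uri, Function<String, Closeable> factory) {
        return children.computeIfAbsent(uri, factory);
    }

    /** Closing the view closes every child this instance created. */
    @Override
    public void close() throws IOException {
        for (Closeable child : children.values()) {
            child.close();
        }
        children.clear();
    }

    public int size() {
        return children.size();
    }
}
```

Because the children live only in this instance's map, closing one view cannot invalidate filesystems handed out to other callers — which is exactly the `FileSystem.newInstance()` semantics the issue says a naive "close everything in the global cache" would break.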
[jira] [Updated] (HADOOP-15565) ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak.
[ https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Krogen updated HADOOP-15565:
---------------------------------
    Fix Version/s: 3.3.0
       Resolution: Fixed
           Status: Resolved  (was: Patch Available)
[jira] [Commented] (HADOOP-15565) ViewFileSystem.close doesn't close child filesystems and causes FileSystem objects leak.
[ https://issues.apache.org/jira/browse/HADOOP-15565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924441#comment-16924441 ]

Erik Krogen commented on HADOOP-15565:
--------------------------------------

The v8 patch LGTM, thanks a lot [~LiJinglun]! I just committed this to trunk.
[jira] [Commented] (HADOOP-16547) s3guard prune command doesn't get AWS auth chain from FS
[ https://issues.apache.org/jira/browse/HADOOP-16547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16924403#comment-16924403 ]

Steve Loughran commented on HADOOP-16547:
-----------------------------------------

More: to get to this you have to have set the fs.s3a.s3guard.ddb.region property, or provide the -region option; otherwise the FS is instantiated to work out the region.

> s3guard prune command doesn't get AWS auth chain from FS
> --------------------------------------------------------
>
>                 Key: HADOOP-16547
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16547
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>    Affects Versions: 3.3.0
>            Reporter: Steve Loughran
>            Assignee: Steve Loughran
>            Priority: Major
>
> The s3guard prune command doesn't get the AWS auth chain from any FS, so it
> just drives the DDB store from the conf settings. If S3A is set up to use
> delegation tokens then the DTs/custom AWS auth sequence is not picked up, so
> you get an auth failure.
> Fix:
> # instantiate the FS before calling initMetadataStore
> # review other commands to make sure the problem isn't replicated
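The proposed fix — instantiate the filesystem first and let the metadata store inherit its auth chain, rather than building the store straight from raw configuration — can be illustrated with a toy model. Every name here is a hypothetical stand-in, not the real S3A or S3Guard API:

```java
import java.util.Map;

/** Toy model of the init-ordering bug: hypothetical types, not the S3A API. */
public class PruneInitSketch {
    /** Stand-in filesystem: resolves its full auth chain (e.g. delegation tokens) from config. */
    public static final class Fs {
        public final String credentialProvider;
        public Fs(Map<String, String> conf) {
            this.credentialProvider = conf.getOrDefault("fs.auth.provider", "default");
        }
    }

    /** Stand-in metadata store, carrying whichever credential provider it was given. */
    public static final class Store {
        public final String credentialProvider;
        public Store(String provider) {
            this.credentialProvider = provider;
        }
    }

    /** Buggy order: the store is built from raw conf and never sees the FS auth chain. */
    public static Store initFromRawConf(Map<String, String> conf) {
        return new Store("default");
    }

    /** Fixed order: instantiate the FS first, then hand its auth chain to the store. */
    public static Store initFromFs(Map<String, String> conf) {
        Fs fs = new Fs(conf);
        return new Store(fs.credentialProvider);
    }
}
```

In the buggy ordering, a deployment configured for delegation tokens ends up with a store using default credentials — the auth failure the issue describes.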
[GitHub] [hadoop] anuengineer commented on issue #1344: HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
anuengineer commented on issue #1344: HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#issuecomment-528927862

Just a note: originally DatanodeInfo was based on the HDFS code; then I think we copied it and created our own structure. At this point, I think diverging should not be a big deal.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] anuengineer commented on a change in pull request #1344: HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
anuengineer commented on a change in pull request #1344: HDDS-1982 Extend SCMNodeManager to support decommission and maintenance states
URL: https://github.com/apache/hadoop/pull/1344#discussion_r321819701

## File path: hadoop-hdds/server-scm/src/main/java/org/apache/hadoop/hdds/scm/node/states/NodeStateMap.java
## @@ -43,7 +45,7 @@
   /**
    * Represents the current state of node.
    */
-  private final ConcurrentHashMap> stateMap;
+  private final ConcurrentHashMap stateMap;

Review comment:
   Even if you have 15x the states, the number of nodes is small: with 100 nodes there are only 1,500 states, and with 1,000 nodes, 15,000 states. It is still trivial to keep these in memory. Here is the real kicker: just as we decided not to write all the cross products for the NodeState static functions, we will end up needing lists for only the frequently accessed patterns (in my mind that would be (in_service, healthy)). All other node queries can be retrieved by iterating the lists as needed.
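The suggestion above — keep one node-to-status map and derive the rarely used (operational state, health) combinations by iteration, instead of maintaining a list per cross product — can be sketched as follows. The types are hypothetical stand-ins, not the real NodeStateMap:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of deriving (opState, health) queries by scanning one map. Hypothetical types. */
public class NodeStateMapSketch {
    public enum OpState { IN_SERVICE, DECOMMISSIONING, IN_MAINTENANCE }
    public enum Health { HEALTHY, STALE, DEAD }

    public static final class NodeStatus {
        public final OpState op;
        public final Health health;
        public NodeStatus(OpState op, Health health) {
            this.op = op;
            this.health = health;
        }
    }

    // One map of node id -> status; no per-combination lists to keep in sync.
    private final Map<String, NodeStatus> stateMap = new LinkedHashMap<>();

    public void put(String nodeId, NodeStatus s) {
        stateMap.put(nodeId, s);
    }

    /** Derive any (opState, health) combination on demand with a single scan. */
    public List<String> nodesIn(OpState op, Health health) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, NodeStatus> e : stateMap.entrySet()) {
            if (e.getValue().op == op && e.getValue().health == health) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```

With node counts in the hundreds or thousands, a full scan per query is cheap, and only hot combinations (such as in-service + healthy) would ever justify a dedicated precomputed list.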
[GitHub] [hadoop] hadoop-yetus commented on issue #1408: HADOOP-13363. Upgrade protobuf from 2.5.0 to something newer
hadoop-yetus commented on issue #1408: HADOOP-13363. Upgrade protobuf from 2.5.0 to something newer URL: https://github.com/apache/hadoop/pull/1408#issuecomment-528924774 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Comment | |::|--:|:|:| | 0 | reexec | 49 | Docker mode activated. | ||| _ Prechecks _ | | +1 | dupname | 15 | No case conflicting files found. | | +1 | @author | 0 | The patch does not contain any @author tags. | | +1 | test4tests | 0 | The patch appears to include 31 new or modified test files. | ||| _ trunk Compile Tests _ | | 0 | mvndep | 25 | Maven dependency ordering for branch | | +1 | mvninstall | 1090 | trunk passed | | +1 | compile | 1044 | trunk passed | | +1 | checkstyle | 223 | trunk passed | | +1 | mvnsite | 1090 | trunk passed | | +1 | shadedclient | 2077 | branch has no errors when building and testing our client artifacts. | | +1 | javadoc | 884 | trunk passed | | 0 | spotbugs | 32 | Used deprecated FindBugs config; considering switching to SpotBugs. | | 0 | findbugs | 31 | branch/hadoop-project no findbugs output file (findbugsXml.xml) | | 0 | findbugs | 36 | branch/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-tests no findbugs output file (findbugsXml.xml) | | 0 | findbugs | 32 | branch/hadoop-client-modules/hadoop-client-api no findbugs output file (findbugsXml.xml) | ||| _ Patch Compile Tests _ | | 0 | mvndep | 19 | Maven dependency ordering for patch | | -1 | mvninstall | 31 | hadoop-common-project in the patch failed. | | -1 | mvninstall | 13 | hadoop-common in the patch failed. | | -1 | mvninstall | 25 | hadoop-hdfs in the patch failed. | | -1 | mvninstall | 22 | hadoop-hdfs-client in the patch failed. | | -1 | mvninstall | 19 | hadoop-hdfs-rbf in the patch failed. | | -1 | mvninstall | 19 | hadoop-mapreduce-client-common in the patch failed. | | -1 | mvninstall | 19 | hadoop-mapreduce-client-hs in the patch failed. 
| | -1 | mvninstall | 17 | hadoop-mapreduce-client-shuffle in the patch failed. | | -1 | mvninstall | 16 | hadoop-fs2img in the patch failed. | | -1 | mvninstall | 22 | hadoop-yarn-api in the patch failed. | | -1 | mvninstall | 20 | hadoop-yarn-services-core in the patch failed. | | -1 | mvninstall | 19 | hadoop-yarn-client in the patch failed. | | -1 | mvninstall | 25 | hadoop-yarn-common in the patch failed. | | -1 | mvninstall | 21 | hadoop-yarn-server-applicationhistoryservice in the patch failed. | | -1 | mvninstall | 19 | hadoop-yarn-server-common in the patch failed. | | -1 | mvninstall | 19 | hadoop-yarn-server-nodemanager in the patch failed. | | -1 | mvninstall | 23 | hadoop-yarn-server-resourcemanager in the patch failed. | | -1 | mvninstall | 18 | hadoop-yarn-server-tests in the patch failed. | | +1 | compile | 1213 | the patch passed | | -1 | javac | 1213 | root generated 409 new + 1471 unchanged - 0 fixed = 1880 total (was 1471) | | -0 | checkstyle | 244 | root: The patch generated 25 new + 3590 unchanged - 2 fixed = 3615 total (was 3592) | | -1 | mvnsite | 40 | hadoop-hdfs in the patch failed. | | -1 | mvnsite | 41 | hadoop-hdfs-client in the patch failed. | | -1 | mvnsite | 37 | hadoop-hdfs-rbf in the patch failed. | | -1 | mvnsite | 37 | hadoop-mapreduce-client-common in the patch failed. | | -1 | mvnsite | 35 | hadoop-mapreduce-client-hs in the patch failed. | | -1 | mvnsite | 35 | hadoop-mapreduce-client-shuffle in the patch failed. | | -1 | mvnsite | 33 | hadoop-fs2img in the patch failed. | | -1 | mvnsite | 40 | hadoop-yarn-api in the patch failed. | | -1 | mvnsite | 37 | hadoop-yarn-services-core in the patch failed. | | -1 | mvnsite | 34 | hadoop-yarn-client in the patch failed. | | -1 | mvnsite | 42 | hadoop-yarn-common in the patch failed. | | -1 | mvnsite | 36 | hadoop-yarn-server-applicationhistoryservice in the patch failed. | | -1 | mvnsite | 38 | hadoop-yarn-server-common in the patch failed. 
| | -1 | mvnsite | 37 | hadoop-yarn-server-nodemanager in the patch failed. | | -1 | mvnsite | 38 | hadoop-yarn-server-resourcemanager in the patch failed. | | -1 | mvnsite | 35 | hadoop-yarn-server-tests in the patch failed. | | +1 | whitespace | 0 | The patch has no whitespace issues. | | +1 | xml | 33 | The patch has no ill-formed XML file. | | -1 | shadedclient | 954 | patch has errors when building and testing our client artifacts. | | -1 | javadoc | 54 | hadoop-hdfs-project_hadoop-hdfs generated 100 new + 0 unchanged - 0 fixed = 100 total (was 0) | | -1 | javadoc | 47 | hadoop-hdfs-project_hadoop-hdfs-client generated 100 new + 0 unchanged - 0 fixed = 100 total (was 0) | | -1 | javadoc | 43 | hadoop-hdfs-project_hadoop-hdfs-rbf generated 100 new + 0 unchanged - 0 fixed = 100 total (was 0) | | -1 | javadoc | 37 | hadoop-mapred
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#discussion_r321800994

## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java
## @@ -0,0 +1,421 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.s3guard;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.AWSBadRequestException;
+import org.apache.hadoop.fs.s3a.S3AFileStatus;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+
+import com.google.common.base.Stopwatch;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.security.InvalidParameterException;
+import java.util.ArrayDeque;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Queue;
+import java.util.Set;
+import java.util.concurrent.TimeUnit;
+
+import static java.util.stream.Collectors.toList;
+import static java.util.stream.Collectors.toSet;
+
+/**
+ * Main class for the FSCK factored out from S3GuardTool.
+ * The implementation uses a fixed DynamoDBMetadataStore as the backing store
+ * for metadata.
+ *
+ * Functions:
+ * - Checking metadata consistency between S3 and metadatastore
+ */
+public class S3GuardFsck {
+  private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class);
+  public static final String ROOT_PATH_STRING = "/";
+
+  private S3AFileSystem rawFS;
+  private DynamoDBMetadataStore metadataStore;
+
+  /**
+   * Creates an S3GuardFsck.
+   * @param fs the filesystem to compare to
+   * @param ms the metadatastore to compare with (dynamo)
+   */
+  S3GuardFsck(S3AFileSystem fs, MetadataStore ms)
+      throws InvalidParameterException {
+    this.rawFS = fs;
+
+    if (ms == null) {
+      throw new InvalidParameterException("S3AFileSystem should be guarded by"
+          + " a " + DynamoDBMetadataStore.class.getCanonicalName());
+    }
+    this.metadataStore = (DynamoDBMetadataStore) ms;
+
+    if (rawFS.hasMetadataStore()) {
+      throw new InvalidParameterException("Raw fs should not have a "
+          + "metadatastore.");
+    }
+  }
+
+  /**
+   * Compares S3 to MS.
+   * Iterative breadth-first walk on the S3 structure from a given root.
+   * Creates a list of pairs (metadata in S3 and in the MetadataStore) where
+   * the consistency or any rule is violated.
+   * Uses {@link S3GuardFsckViolationHandler} to handle violations.
+   * The violations are listed in Enums: {@link Violation}
+   *
+   * @param p the root path to start the traversal
+   * @throws IOException
+   * @return a list of {@link ComparePair}
+   */
+  public List<ComparePair> compareS3ToMs(Path p) throws IOException {
+    Stopwatch stopwatch = Stopwatch.createStarted();
+    int scannedItems = 0;
+
+    final Path rootPath = rawFS.qualify(p);
+    S3AFileStatus root = null;
+    try {
+      root = (S3AFileStatus) rawFS.getFileStatus(rootPath);
+    } catch (AWSBadRequestException e) {
+      throw new IOException(e.getMessage());
+    }
+    final List<ComparePair> comparePairs = new ArrayList<>();
+    final Queue<S3AFileStatus> queue = new ArrayDeque<>();
+    queue.add(root);
+
+    while (!queue.isEmpty()) {
+      final S3AFileStatus currentDir = queue.poll();
+      scannedItems++;
+
+      final Path currentDirPath = currentDir.getPath();
+      List<FileStatus> s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath));
+
+      // DIRECTORIES
+      // Check directory authoritativeness consistency
+      compareAuthoritativeDirectoryFlag(comparePairs, currentDirPath, s3DirListing);
+      // Add all descendant directories to the queue
+      s3DirListing.stream().filter(pm -> pm.isDirectory())
+          .map(S3AFileStatus.class::cast)
+          .forEach(pm -> queue.add(pm));
+
+      // FILES
+      // check files for consistency
+      final List<S3AFileStatus> children = s3DirListing.stream()
+          .filter(status -> !status.isDirectory())
+          .map(S3AFileStatus.class::cast).collect(toList());
+
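The compareS3ToMs method in the diff above is an iterative breadth-first walk: child directories are queued for later, and files are checked as each directory is listed. The traversal pattern in isolation, over an in-memory tree (hypothetical structure, not the S3A listing API):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

/** Stand-alone sketch of the BFS directory walk used by the fsck. */
public class BfsWalkSketch {
    /**
     * Walks a tree mapping each directory to its children; names ending
     * in "/" are directories. Returns entries in visit order.
     */
    public static List<String> bfs(Map<String, List<String>> tree, String root) {
        List<String> visited = new ArrayList<>();
        Queue<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String dir = queue.poll();
            visited.add(dir);
            for (String child : tree.getOrDefault(dir, List.of())) {
                if (child.endsWith("/")) {
                    queue.add(child);   // descend into subdirectories later
                } else {
                    visited.add(child); // "check" files at the current level
                }
            }
        }
        return visited;
    }
}
```

Using an explicit queue instead of recursion keeps memory bounded by the widest directory level rather than the tree depth, which matters when walking large S3 prefixes.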
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#discussion_r321802607
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#discussion_r321806732
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321803184 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,315 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. + */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + // The rawFS and metadataStore are here to prepare when the ViolationHandlers + // will not just log, but fix the violations, so they will have access. 
+ private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); Review comment: do ("{}", this) so that the toString is only invoked at debug level log
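The review point above — pass the object as a `{}` parameter so its `toString()` only runs when debug logging is actually enabled — can be demonstrated with the JDK logger's lazy `Supplier` overload, which stands in here for SLF4J's parameterized logging (class and logger names are illustrative, not from the patch):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

public class LazyLogDemo {
    // Counts how often the expensive toString() actually runs.
    static final AtomicInteger CALLS = new AtomicInteger();

    static class Expensive {
        @Override
        public String toString() {
            CALLS.incrementAndGet();
            return "expensive";
        }
    }

    public static void main(String[] args) {
        Logger log = Logger.getLogger("demo");
        log.setLevel(Level.INFO); // FINE ("debug") is disabled

        Expensive e = new Expensive();

        // Eager form (what the review flags): the message string, and
        // therefore toString(), would be built even with debug disabled:
        //   log.fine("There is no violation in the compare pair: " + e);

        // Lazy form: the Supplier body only runs if FINE is enabled,
        // so toString() is never invoked here.
        log.fine(() -> "There is no violation in the compare pair: " + e);

        System.out.println("toString calls: " + CALLS.get());
    }
}
```

With SLF4J the same effect comes from `LOG.debug("...: {}", this)` — the argument is only rendered when the level is enabled.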
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321812144 ## File path: hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardFsck.java ## @@ -0,0 +1,707 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + + +import java.net.URI; +import java.util.List; +import java.util.UUID; + +import org.apache.hadoop.io.IOUtils; +import org.junit.Before; +import org.junit.Test; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AbstractS3ATestBase; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.assertj.core.api.Assertions; + +import static org.apache.hadoop.fs.contract.ContractTestUtils.touch; +import static org.apache.hadoop.fs.s3a.Constants.METADATASTORE_AUTHORITATIVE; +import static org.apache.hadoop.fs.s3a.Constants.S3_METADATA_STORE_IMPL; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.awaitFileStatus; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.metadataStorePersistsAuthoritativeBit; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides; +import static org.junit.Assume.assumeTrue; + +/** + * Integration tests for the S3Guard Fsck against a DynamoDB backed metadata + * store. 
+ */ +public class ITestS3GuardFsck extends AbstractS3ATestBase { + + private S3AFileSystem guardedFs; + private S3AFileSystem rawFS; + + private MetadataStore metadataStore; + + @Before + public void setup() throws Exception { +super.setup(); +S3AFileSystem fs = getFileSystem(); +// These test will fail if no ms +assertTrue("FS needs to have a metadatastore.", +fs.hasMetadataStore()); +assertTrue("Metadatastore should persist authoritative bit", +metadataStorePersistsAuthoritativeBit(fs.getMetadataStore())); + +guardedFs = fs; +metadataStore = fs.getMetadataStore(); + +// create raw fs without s3guard +rawFS = createUnguardedFS(); +assertFalse("Raw FS still has S3Guard " + rawFS, +rawFS.hasMetadataStore()); + } + + @Override + public void teardown() throws Exception { +if (guardedFs != null) { + IOUtils.cleanupWithLogger(LOG, guardedFs); +} +IOUtils.cleanupWithLogger(LOG, rawFS); +super.teardown(); + } + + /** + * Create a test filesystem which is always unguarded. + * This filesystem MUST be closed in test teardown. 
+ * @return the new FS + */ + private S3AFileSystem createUnguardedFS() throws Exception { +S3AFileSystem testFS = getFileSystem(); +Configuration config = new Configuration(testFS.getConf()); +URI uri = testFS.getUri(); + +removeBaseAndBucketOverrides(uri.getHost(), config, +S3_METADATA_STORE_IMPL); +removeBaseAndBucketOverrides(uri.getHost(), config, +METADATASTORE_AUTHORITATIVE); +S3AFileSystem fs2 = new S3AFileSystem(); +fs2.initialize(uri, config); +return fs2; + } + + @Test + public void testIDetectNoMetadataEntry() throws Exception { +final Path cwd = path("/" + getMethodName() + "-" + UUID.randomUUID()); +final Path file = new Path(cwd, "file"); +try { + touch(rawFS, file); + awaitFileStatus(rawFS, file); + + final S3GuardFsck s3GuardFsck = + new S3GuardFsck(rawFS, metadataStore); + + final List comparePairs = + s3GuardFsck.compareS3ToMs(cwd); + + assertEquals("Number of pairs should be two.", 2, + comparePairs.size()); + final S3GuardFsck.ComparePair pair = comparePairs.get(0); + assertTrue("The pair must contain a violation.", pair.containsViolation()); + assertEquals("The pair must contain only one violation", 1, + pair.getViolations().size()); + + final S3GuardFsck.Violation violation = + pair.getViolations().iterator().next(); + assertEquals("The violation should be that there is no violation entry.", + violation, S3GuardFsck.Vi
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321806500 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); +} +final List comparePairs = new ArrayList<>(); +final Queue queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + final S3AFileStatus currentDir = queue.poll(); + scannedItems++; + + final Path currentDirPath = currentDir.getPath(); + List s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath)); + + // DIRECTORIES + // Check directory authoritativeness consistency + compareAuthoritativeDirectoryFlag(comparePairs, currentDirPath, s3DirListing); + // Add all descendant directory to the queue + s3DirListing.stream().filter(pm -> pm.isDirectory()) + .map(S3AFileStatus.class::cast) + .forEach(pm -> queue.add(pm)); + + // FILES + // check files for consistency + final List children = s3DirListing.stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); +
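The iterative breadth-first walk that `compareS3ToMs` performs — poll a directory from an `ArrayDeque`, enqueue its subdirectories, and process its files — can be sketched with plain collections; the `TREE` map below is a hypothetical stand-in for the S3 listing (names ending in `/` mark directories):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Queue;

public class BfsWalkDemo {
    // Minimal stand-in for a directory listing: dir name -> child names.
    static final Map<String, List<String>> TREE = Map.of(
        "root/", List.of("a/", "b.txt"),
        "a/",    List.of("c.txt", "d.txt"));

    /** Breadth-first walk as in compareS3ToMs: directories go back on
     *  the queue, files are "compared" (here just collected) in place. */
    static List<String> walk(String root) {
        List<String> files = new ArrayList<>();
        Queue<String> queue = new ArrayDeque<>();
        queue.add(root);
        while (!queue.isEmpty()) {
            String dir = queue.poll();
            for (String child : TREE.getOrDefault(dir, List.of())) {
                if (child.endsWith("/")) {
                    queue.add(child);   // descend later, level by level
                } else {
                    files.add(child);   // per-file consistency check goes here
                }
            }
        }
        return files;
    }

    public static void main(String[] args) {
        System.out.println(walk("root/")); // files in breadth-first order
    }
}
```

The real walk additionally builds a `ComparePair` per entry and checks the directory's authoritative flag before descending.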
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316670881 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java ## @@ -1485,6 +1486,89 @@ private void vprintln(PrintStream out, String format, Object... } } + /** + * Compares S3 with the metadata store and reports any violations. + */ + static class Fsck extends S3GuardTool { +public static final String CHECK_FLAG = "check"; + +public static final String NAME = "fsck"; +public static final String PURPOSE = "Compares S3 with MetadataStore, and " ++ "returns a failure status if any rules or invariants are violated. " ++ "Only works with DynamoDbMetadataStore."; Review comment: how about "only works with DynamoDB metadata stores"
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321801869 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); +} +final List comparePairs = new ArrayList<>(); +final Queue queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + final S3AFileStatus currentDir = queue.poll(); + scannedItems++; + + final Path currentDirPath = currentDir.getPath(); + List s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath)); + + // DIRECTORIES + // Check directory authoritativeness consistency + compareAuthoritativeDirectoryFlag(comparePairs, currentDirPath, s3DirListing); + // Add all descendant directory to the queue + s3DirListing.stream().filter(pm -> pm.isDirectory()) + .map(S3AFileStatus.class::cast) + .forEach(pm -> queue.add(pm)); + + // FILES + // check files for consistency + final List children = s3DirListing.stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); +
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316660718 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,395 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. 
+ * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. + * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return + */ + public List compareS3RootToMs(Path p) throws IOException { +final Path rootPath = rawFS.qualify(p); +final S3AFileStatus root = +(S3AFileStatus) rawFS.getFileStatus(rootPath); +final List comparePairs = new ArrayList<>(); +final Queue queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + // pop front node from the queue + final S3AFileStatus currentDir = queue.poll(); + + // Get a listing of that dir from s3 and add just the files. + // (Each directory will be added as a root.) + // Files should be casted to S3AFileStatus instead of plain FileStatus + // to get the VersionID and Etag. 
+ final Path currentDirPath = currentDir.getPath(); + + final FileStatus[] s3DirListing = rawFS.listStatus(currentDirPath); + final List children = + Arrays.asList(s3DirListing).stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); + + // Compare the directory contents if the listing is authoritative + final DirListingMetadata msDirListing = + metadataStore.listChildren(currentDirPath); + if (msDirListing != null && msDirListing.isAuthoritative()) { +final ComparePair cP = +compareAuthDirListing(s3DirListing, msDirListing); +if (cP.containsViolation()) { + comparePairs.add(cP); +} + } + + // Compare directory and contents, but not the listing + final
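The authoritative-listing comparison sketched above boils down to a set difference between the S3 listing and the metadata store listing; a minimal illustration (method name and sample entries are hypothetical, not from the patch):

```java
import java.util.HashSet;
import java.util.Set;

public class ListingCompareDemo {
    /** Entries present in one listing but absent from the other —
     *  the core of an authoritative-directory consistency check. */
    static Set<String> missingFrom(Set<String> s3Listing, Set<String> msListing) {
        Set<String> missing = new HashSet<>(s3Listing);
        missing.removeAll(msListing); // whatever remains is unaccounted for
        return missing;
    }

    public static void main(String[] args) {
        Set<String> s3 = Set.of("a.txt", "b.txt");
        Set<String> ms = Set.of("a.txt");
        // b.txt exists in S3 but not in the authoritative MS listing,
        // which would be flagged as a violation.
        System.out.println(missingFrom(s3, ms));
    }
}
```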
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321801203 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); +} +final List comparePairs = new ArrayList<>(); +final Queue queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + final S3AFileStatus currentDir = queue.poll(); + scannedItems++; + + final Path currentDirPath = currentDir.getPath(); + List s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath)); + + // DIRECTORIES + // Check directory authoritativeness consistency + compareAuthoritativeDirectoryFlag(comparePairs, currentDirPath, s3DirListing); + // Add all descendant directory to the queue + s3DirListing.stream().filter(pm -> pm.isDirectory()) + .map(S3AFileStatus.class::cast) + .forEach(pm -> queue.add(pm)); + + // FILES + // check files for consistency + final List children = s3DirListing.stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); +
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321808553 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java ## @@ -1485,6 +1486,93 @@ private void vprintln(PrintStream out, String format, Object... } } + /** + * Prune metadata that has not been modified recently. + */ + static class Fsck extends S3GuardTool { +public static final String CHECK_FLAG = "check"; + +public static final String NAME = "fsck"; +public static final String PURPOSE = "Compares S3 with MetadataStore, and " ++ "returns a failure status if any rules or invariants are violated. " ++ "Only works with DynamoDbMetadataStore."; +private static final String USAGE = NAME + " [OPTIONS] [s3a://BUCKET]\n" + +"\t" + PURPOSE + "\n\n" + +"Common options:\n" + +" " + CHECK_FLAG + " Check the metadata store for errors, but do " ++ "not fix any issues.\n"; + +Fsck(Configuration conf) { + super(conf, CHECK_FLAG); +} + +@Override +public String getName() { + return NAME; +} + +@Override +public String getUsage() { + return USAGE; +} + +public int run(String[] args, PrintStream out) throws +InterruptedException, IOException { + List paths = parseArgs(args); + if (paths.isEmpty()) { +out.println(USAGE); +throw invalidArgs("no arguments"); + } + + String s3Path = paths.get(0); + try { +initS3AFileSystem(s3Path); + } catch (Exception e) { +errorln("Failed to initialize S3AFileSystem from path: " + s3Path); +throw e; + } + + URI uri = toUri(s3Path); + Path root; + if (uri.getPath().isEmpty()) { +root = new Path("/"); + } else { +root = new Path(uri.getPath()); + } + + final S3AFileSystem fs = getFilesystem(); + initMetadataStore(false); + final MetadataStore ms = getStore(); + + if (ms == null || + !(ms instanceof DynamoDBMetadataStore)) { +errorln(s3Path + " path uses MS: " + ms); +errorln(NAME + " can be 
only used with a DynamoDB backed s3a bucket."); +errorln(USAGE); +return ERROR; + } + + final CommandFormat commandFormat = getCommandFormat(); + if (commandFormat.getOpt(CHECK_FLAG)) { +// do the check +S3GuardFsck s3GuardFsck = new S3GuardFsck(fs, ms); +try { + s3GuardFsck.compareS3ToMs(fs.qualify(root)); +} catch (IOException e) { + errorln("Error while running the check: compareS3ToMs"); Review comment: Is this needed; the runner logs anyway?
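One nit on the quoted `run()` method: the `ms == null` test before `!(ms instanceof DynamoDBMetadataStore)` is redundant, since `instanceof` already evaluates to false for null. A small self-contained sketch (the interface and class here are local stand-ins, not the real Hadoop types):

```java
public class InstanceofGuardDemo {
    interface MetadataStore {}
    static class DynamoDBMetadataStore implements MetadataStore {}
    static class OtherStore implements MetadataStore {}

    /** instanceof is null-safe, so a single check covers both the
     *  null case and the wrong-implementation case. */
    static boolean isDynamoBacked(MetadataStore ms) {
        return ms instanceof DynamoDBMetadataStore;
    }

    public static void main(String[] args) {
        System.out.println(isDynamoBacked(null));                        // no NPE
        System.out.println(isDynamoBacked(new OtherStore()));
        System.out.println(isDynamoBacked(new DynamoDBMetadataStore()));
    }
}
```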
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321803937 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,315 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. + */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + // The rawFS and metadataStore are here to prepare when the ViolationHandlers + // will not just log, but fix the violations, so they will have access. 
+ private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); + return; +} + +StringBuilder sB = new StringBuilder(); +sB.append(newLine) +.append("On path: ").append(comparePair.getPath()).append(newLine); + +// Create a new instance of the handler and use it. +for (S3GuardFsck.Violation violation : comparePair.getViolations()) { + try { +ViolationHandler handler = violation.getHandler() +.getDeclaredConstructor(S3GuardFsck.ComparePair.class) +.newInstance(comparePair); +final String errorStr = handler.getError(); +sB.append(errorStr); + } catch (NoSuchMethodException e) { +LOG.error("Can not find declared constructor for handler: {}", +violation.getHandler()); + } catch (IllegalAccessException | InstantiationException | InvocationTargetException e) { +LOG.error("Can not instantiate handler: {}", +violation.getHandler()); + } + sB.append(newLine); +} +LOG.error(sB.toString()); Review comment: I think we should change the log level based on the severity. This matters for those of us who have their log4j settings set to log different levels in different colours, and it will help people interpret the output
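A sketch of the reviewer's suggestion — pick the log level from a violation's severity rather than always calling `LOG.error` — assuming a hypothetical `Severity` attribute on each violation (the patch does not define one); JDK logging levels stand in for the log4j levels:

```java
import java.util.logging.Level;

public class SeverityLevelDemo {
    // Hypothetical severity each Violation could carry.
    enum Severity { ERROR, WARNING, INFO }

    /** Map a violation's severity to a log level so the handler can emit
     *  at the matching level (and colour, in a colourised log4j setup). */
    static Level levelFor(Severity s) {
        switch (s) {
            case ERROR:   return Level.SEVERE;
            case WARNING: return Level.WARNING;
            default:      return Level.INFO;
        }
    }

    public static void main(String[] args) {
        System.out.println(levelFor(Severity.ERROR));
        System.out.println(levelFor(Severity.WARNING));
        System.out.println(levelFor(Severity.INFO));
    }
}
```

In the handler, `LOG.error(sB.toString())` would then become a dispatch on the highest severity found among the pair's violations.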
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316665014 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,312 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. 
+ */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); + return; +} + +StringBuilder sB = new StringBuilder(); +sB.append(newLine) +.append("On path: ").append(comparePair.getPath()).append(newLine); + +// Create a new instance of the handler and use it. +for (S3GuardFsck.Violation violation : comparePair.getViolations()) { + try { +ViolationHandler handler = violation.getHandler() +.getDeclaredConstructor(S3GuardFsck.ComparePair.class) +.newInstance(comparePair); +final String errorStr = handler.getError(); +sB.append(errorStr); + } catch (NoSuchMethodException e) { +LOG.error("Can not find declared constructor for handler: {}", +violation.getHandler()); + } catch (IllegalAccessException | InstantiationException | InvocationTargetException e) { +LOG.error("Can not instantiate handler: {}", +violation.getHandler()); + } + sB.append(newLine); +} +LOG.error(sB.toString()); + } + + /** + * Violation handler abstract class. + * This class should be extended for violation handlers. 
+ */ + public static abstract class ViolationHandler { +private final PathMetadata pathMetadata; +private final S3AFileStatus s3FileStatus; +private final S3AFileStatus msFileStatus; +private final List s3DirListing; +private final DirListingMetadata msDirListing; + +public ViolationHandler(S3GuardFsck.ComparePair comparePair) { + pathMetadata = comparePair.getMsPathMetadata(); + s3FileStatus = comparePair.getS3FileStatus(); + if (pathMetadata != null) { +msFileStatus = pathMetadata.getFileStatus(); + } else { +msFileStatus = null; + } + s3DirListing = comparePair.getS3DirListing(); + msDirListing = comparePair.getMsDirListing(); +} + +abstract String getError(); + +public PathMetadata getPathMetadata() { + return pathMetadata; +} + +public S3AFileStatus getS3FileStatus() { + return s3FileStatus; +} + +public S3AFileStatus getMsFileStatus() { + return msFileStatus; +} + +public List getS3DirListing() { + return s3DirListing; +} + +public DirListingMetadata getMsDirListing() { + return msDirListing; +} + } + + /** + * The violation handler when there's no matching metadata entry in the MS. + */ + public static class NoMetadataEntry extends ViolationHandler { + +public NoMetadataEntry(S3GuardFsck.ComparePair comparePair) { + super(comparePair); +} + +@Override +public String getError() { + return "No PathMetadata for this path in the MS."; +} + } + + /** + * The violation handler when there's no parent entry. + */ + public s
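The handler-per-violation pattern quoted above (each Violation enum carries a handler class, instantiated reflectively per compare pair) can be reduced to a small self-contained sketch. `Pair`, `Handler` and `NoEntry` are illustrative stand-ins for `ComparePair`, `ViolationHandler` and `NoMetadataEntry`.

```java
import java.lang.reflect.InvocationTargetException;

/** Minimal sketch of the reflective handler dispatch used in the patch. */
public class ReflectiveDispatchSketch {

    public static class Pair {
        final String path;
        public Pair(String path) { this.path = path; }
    }

    public abstract static class Handler {
        final Pair pair;
        public Handler(Pair pair) { this.pair = pair; }
        public abstract String getError();
    }

    /** Stand-in for one concrete violation handler. */
    public static class NoEntry extends Handler {
        public NoEntry(Pair pair) { super(pair); }
        @Override
        public String getError() { return "No entry for " + pair.path; }
    }

    /** Look up the one-arg constructor reflectively, as the patch does. */
    public static String describe(Class<? extends Handler> cls, Pair pair) {
        try {
            return cls.getDeclaredConstructor(Pair.class)
                .newInstance(pair).getError();
        } catch (NoSuchMethodException | IllegalAccessException
                | InstantiationException | InvocationTargetException e) {
            return "cannot instantiate " + cls.getSimpleName();
        }
    }
}
```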
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321798717 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); Review comment: This already is an IOE; no need to convert it to one (while losing the stack). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
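The stack-loss point in the comment above can be shown with a minimal sketch. `mayFail()` is a stand-in for the `rawFS.getFileStatus()` call; `AWSBadRequestException` already extends `IOException` in S3A, so rewrapping it discards both the concrete type and the original stack trace.

```java
import java.io.IOException;

/** Sketch: rewrapping an IOException vs. rethrowing it as-is. */
public class RethrowSketch {

    static void mayFail() throws IOException {
        throw new IOException("bad request");
    }

    /** Anti-pattern: the new IOException has no cause and a fresh stack. */
    public static IOException wrapped() {
        try {
            mayFail();
            return null;
        } catch (IOException e) {
            return new IOException(e.getMessage());
        }
    }

    /** Preferred: rethrow (or at least chain) so the original survives. */
    public static IOException rethrown() {
        try {
            mayFail();
            return null;
        } catch (IOException e) {
            return e; // same object: type, message and stack intact
        }
    }
}
```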
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321810743 ## File path: hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardToolDynamoDB.java ## @@ -289,4 +291,30 @@ public void testDestroyUnknownTable() throws Throwable { "-meta", "dynamodb://" + getTestTableName(DYNAMODB_TABLE)); } + @Test + public void testCLIFsckWithoutParam() throws Exception { +intercept(ExitUtil.ExitException.class, () -> run(Fsck.NAME)); + } + + @Test + public void testCLIFsckWithParam() throws Exception { +final int result = run(S3GuardTool.Fsck.NAME, "-check", +"s3a://" + getFileSystem().getBucket()); Review comment: this test failed for me during a parallel run. This parallelizable test should have a path which we know is there but is private to this test; we can have another one which invokes on a missing path. The full root scan should be run in the ITestS3GuardDDBRootOperations test, before any cleanup
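The isolation the reviewer asks for is typically achieved by giving each parallel test its own unique path, mirroring the `path("/" + getMethodName() + "-" + UUID.randomUUID())` idiom already used in ITestS3GuardFsck; a plain-Java sketch of that idiom:

```java
import java.util.UUID;

/** Sketch: per-test path isolation so parallel runs cannot collide. */
public class UniqueTestPath {

    /** Build a path only this test method instance knows about. */
    public static String uniquePathFor(String methodName) {
        return "/" + methodName + "-" + UUID.randomUUID();
    }
}
```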
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321804202 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,315 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. + */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + // The rawFS and metadataStore are here to prepare when the ViolationHandlers + // will not just log, but fix the violations, so they will have access. 
+ private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); + return; +} + +StringBuilder sB = new StringBuilder(); +sB.append(newLine) +.append("On path: ").append(comparePair.getPath()).append(newLine); + +// Create a new instance of the handler and use it. +for (S3GuardFsck.Violation violation : comparePair.getViolations()) { + try { +ViolationHandler handler = violation.getHandler() +.getDeclaredConstructor(S3GuardFsck.ComparePair.class) +.newInstance(comparePair); +final String errorStr = handler.getError(); +sB.append(errorStr); + } catch (NoSuchMethodException e) { +LOG.error("Can not find declared constructor for handler: {}", +violation.getHandler()); + } catch (IllegalAccessException | InstantiationException | InvocationTargetException e) { +LOG.error("Can not instantiate handler: {}", +violation.getHandler()); + } + sB.append(newLine); +} +LOG.error(sB.toString()); + } + + /** + * Violation handler abstract class. + * This class should be extended for violation handlers. 
+ */ + public static abstract class ViolationHandler { +private final PathMetadata pathMetadata; +private final S3AFileStatus s3FileStatus; +private final S3AFileStatus msFileStatus; +private final List s3DirListing; +private final DirListingMetadata msDirListing; + +public ViolationHandler(S3GuardFsck.ComparePair comparePair) { + pathMetadata = comparePair.getMsPathMetadata(); + s3FileStatus = comparePair.getS3FileStatus(); + if (pathMetadata != null) { +msFileStatus = pathMetadata.getFileStatus(); + } else { +msFileStatus = null; + } + s3DirListing = comparePair.getS3DirListing(); + msDirListing = comparePair.getMsDirListing(); +} + +abstract String getError(); Review comment: this should be public too, if the rest is This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316664767 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,312 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. 
+ */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); + return; +} + +StringBuilder sB = new StringBuilder(); +sB.append(newLine) +.append("On path: ").append(comparePair.getPath()).append(newLine); + +// Create a new instance of the handler and use it. +for (S3GuardFsck.Violation violation : comparePair.getViolations()) { + try { +ViolationHandler handler = violation.getHandler() +.getDeclaredConstructor(S3GuardFsck.ComparePair.class) +.newInstance(comparePair); +final String errorStr = handler.getError(); +sB.append(errorStr); + } catch (NoSuchMethodException e) { +LOG.error("Can not find declared constructor for handler: {}", +violation.getHandler()); + } catch (IllegalAccessException | InstantiationException | InvocationTargetException e) { +LOG.error("Can not instantiate handler: {}", +violation.getHandler()); + } + sB.append(newLine); +} +LOG.error(sB.toString()); + } + + /** + * Violation handler abstract class. + * This class should be extended for violation handlers. 
+ */ + public static abstract class ViolationHandler { +private final PathMetadata pathMetadata; +private final S3AFileStatus s3FileStatus; +private final S3AFileStatus msFileStatus; +private final List s3DirListing; +private final DirListingMetadata msDirListing; + +public ViolationHandler(S3GuardFsck.ComparePair comparePair) { + pathMetadata = comparePair.getMsPathMetadata(); + s3FileStatus = comparePair.getS3FileStatus(); + if (pathMetadata != null) { +msFileStatus = pathMetadata.getFileStatus(); + } else { +msFileStatus = null; + } + s3DirListing = comparePair.getS3DirListing(); + msDirListing = comparePair.getMsDirListing(); +} + +abstract String getError(); + +public PathMetadata getPathMetadata() { + return pathMetadata; +} + +public S3AFileStatus getS3FileStatus() { + return s3FileStatus; +} + +public S3AFileStatus getMsFileStatus() { + return msFileStatus; +} + +public List getS3DirListing() { + return s3DirListing; +} + +public DirListingMetadata getMsDirListing() { + return msDirListing; +} + } + + /** + * The violation handler when there's no matching metadata entry in the MS. + */ + public static class NoMetadataEntry extends ViolationHandler { + +public NoMetadataEntry(S3GuardFsck.ComparePair comparePair) { + super(comparePair); +} + +@Override +public String getError() { + return "No PathMetadata for this path in the MS."; +} + } + + /** + * The violation handler when there's no parent entry. + */ + public s
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316667252 ## File path: hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardFsck.java ## @@ -0,0 +1,707 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + + +import java.net.URI; +import java.util.List; +import java.util.UUID; + +import org.apache.hadoop.io.IOUtils; +import org.junit.Before; +import org.junit.Test; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AbstractS3ATestBase; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.assertj.core.api.Assertions; + +import static org.apache.hadoop.fs.contract.ContractTestUtils.touch; +import static org.apache.hadoop.fs.s3a.Constants.METADATASTORE_AUTHORITATIVE; +import static org.apache.hadoop.fs.s3a.Constants.S3_METADATA_STORE_IMPL; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.awaitFileStatus; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.metadataStorePersistsAuthoritativeBit; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides; +import static org.junit.Assume.assumeTrue; + +/** + * Integration tests for the S3Guard Fsck against a dyamodb backed metadata + * store. 
+ */ +public class ITestS3GuardFsck extends AbstractS3ATestBase { + + private S3AFileSystem guardedFs; + private S3AFileSystem rawFS; + + private MetadataStore metadataStore; + + @Before + public void setup() throws Exception { +super.setup(); +S3AFileSystem fs = getFileSystem(); +// These test will fail if no ms +assertTrue("FS needs to have a metadatastore.", +fs.hasMetadataStore()); +assertTrue("Metadatastore should persist authoritative bit", +metadataStorePersistsAuthoritativeBit(fs.getMetadataStore())); + +guardedFs = fs; +metadataStore = fs.getMetadataStore(); + +// create raw fs without s3guard +rawFS = createUnguardedFS(); +assertFalse("Raw FS still has S3Guard " + rawFS, +rawFS.hasMetadataStore()); + } + + @Override + public void teardown() throws Exception { +if (guardedFs != null) { + IOUtils.cleanupWithLogger(LOG, guardedFs); +} +IOUtils.cleanupWithLogger(LOG, rawFS); +super.teardown(); + } + + /** + * Create a test filesystem which is always unguarded. + * This filesystem MUST be closed in test teardown. 
+ * @return the new FS + */ + private S3AFileSystem createUnguardedFS() throws Exception { +S3AFileSystem testFS = getFileSystem(); +Configuration config = new Configuration(testFS.getConf()); +URI uri = testFS.getUri(); + +removeBaseAndBucketOverrides(uri.getHost(), config, +S3_METADATA_STORE_IMPL); +removeBaseAndBucketOverrides(uri.getHost(), config, +METADATASTORE_AUTHORITATIVE); +S3AFileSystem fs2 = new S3AFileSystem(); +fs2.initialize(uri, config); +return fs2; + } + + @Test + public void testIDetectNoMetadataEntry() throws Exception { +final Path cwd = path("/" + getMethodName() + "-" + UUID.randomUUID()); +final Path file = new Path(cwd, "file"); +try { + touch(rawFS, file); + awaitFileStatus(rawFS, file); + + final S3GuardFsck s3GuardFsck = + new S3GuardFsck(rawFS, metadataStore); + + final List comparePairs = + s3GuardFsck.compareS3RootToMs(cwd); + + assertEquals("Number of pairs should be two.", 2, + comparePairs.size()); + final S3GuardFsck.ComparePair pair = comparePairs.get(0); + assertTrue("The pair must contain a violation.", pair.containsViolation()); + assertEquals("The pair must contain only one violation", 1, + pair.getViolations().size()); + + final S3GuardFsck.Violation violation = + pair.getViolations().iterator().next(); + assertEquals("The violation should be that there is no violation entry.", + violation, S3GuardFsc
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321804874 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,315 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. + */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + // The rawFS and metadataStore are here to prepare when the ViolationHandlers + // will not just log, but fix the violations, so they will have access. 
+ private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); + return; +} + +StringBuilder sB = new StringBuilder(); +sB.append(newLine) +.append("On path: ").append(comparePair.getPath()).append(newLine); + +// Create a new instance of the handler and use it. +for (S3GuardFsck.Violation violation : comparePair.getViolations()) { + try { +ViolationHandler handler = violation.getHandler() +.getDeclaredConstructor(S3GuardFsck.ComparePair.class) +.newInstance(comparePair); +final String errorStr = handler.getError(); +sB.append(errorStr); + } catch (NoSuchMethodException e) { +LOG.error("Can not find declared constructor for handler: {}", +violation.getHandler()); + } catch (IllegalAccessException | InstantiationException | InvocationTargetException e) { +LOG.error("Can not instantiate handler: {}", +violation.getHandler()); + } + sB.append(newLine); +} +LOG.error(sB.toString()); + } + + /** + * Violation handler abstract class. + * This class should be extended for violation handlers. 
+ */ + public static abstract class ViolationHandler { +private final PathMetadata pathMetadata; +private final S3AFileStatus s3FileStatus; +private final S3AFileStatus msFileStatus; +private final List s3DirListing; +private final DirListingMetadata msDirListing; + +public ViolationHandler(S3GuardFsck.ComparePair comparePair) { + pathMetadata = comparePair.getMsPathMetadata(); + s3FileStatus = comparePair.getS3FileStatus(); + if (pathMetadata != null) { +msFileStatus = pathMetadata.getFileStatus(); + } else { +msFileStatus = null; + } + s3DirListing = comparePair.getS3DirListing(); + msDirListing = comparePair.getMsDirListing(); +} + +abstract String getError(); + +public PathMetadata getPathMetadata() { + return pathMetadata; +} + +public S3AFileStatus getS3FileStatus() { + return s3FileStatus; +} + +public S3AFileStatus getMsFileStatus() { + return msFileStatus; +} + +public List getS3DirListing() { + return s3DirListing; +} + +public DirListingMetadata getMsDirListing() { + return msDirListing; +} + } + + /** + * The violation handler when there's no matching metadata entry in the MS. + */ + public static class NoMetadataEntry extends ViolationHandler { + +public NoMetadataEntry(S3GuardFsck.ComparePair comparePair) { + super(comparePair); +} + +@Override +public String getError()
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321799619 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} Review comment: Flip the order of @return and @throws. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
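The review point above is the conventional javadoc block-tag order: @param, then @return, then @throws. A sketch of the reordered tag block for the quoted method (the @throws description text is an assumption, not from the patch):

```java
/**
 * Compares S3 to MS.
 * Iterative breadth first walk on the S3 structure from a given root.
 *
 * @param p the root path to start the traversal
 * @return a list of {@link ComparePair}
 * @throws IOException on failure to access S3 or the metadata store
 */
public List<ComparePair> compareS3ToMs(Path p) throws IOException { /* ... */ }
```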
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316668703 ## File path: hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardFsck.java ## @@ -0,0 +1,707 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + + +import java.net.URI; +import java.util.List; +import java.util.UUID; + +import org.apache.hadoop.io.IOUtils; +import org.junit.Before; +import org.junit.Test; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AbstractS3ATestBase; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.assertj.core.api.Assertions; + +import static org.apache.hadoop.fs.contract.ContractTestUtils.touch; +import static org.apache.hadoop.fs.s3a.Constants.METADATASTORE_AUTHORITATIVE; +import static org.apache.hadoop.fs.s3a.Constants.S3_METADATA_STORE_IMPL; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.awaitFileStatus; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.metadataStorePersistsAuthoritativeBit; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides; +import static org.junit.Assume.assumeTrue; + +/** + * Integration tests for the S3Guard Fsck against a dynamodb-backed metadata + * store. + */ +public class ITestS3GuardFsck extends AbstractS3ATestBase { Review comment: There's a lot of commonality in all these test cases; if possible we should factor that out so that most of the boilerplate code is reused.
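The factoring-out suggested above can be sketched as one shared helper that wraps the per-case body with the common setup and teardown steps. This is a minimal, self-contained illustration of the pattern; the names (CommonCaseSketch, runWithCommonBoilerplate) are hypothetical and not from the patch, and the logged strings stand in for the real FS setup, assertions, and cleanup.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Sketch: factor shared test boilerplate into one reusable helper. */
public class CommonCaseSketch {
  // Records the order of setup / body / teardown steps for demonstration.
  static final List<String> LOG = new ArrayList<>();

  /** Runs the common setup, the per-case body, then guaranteed teardown. */
  static void runWithCommonBoilerplate(String name, Consumer<List<String>> body) {
    LOG.add("setup:" + name);        // e.g. create guarded + raw filesystems
    try {
      body.accept(LOG);              // only this part differs per test case
    } finally {
      LOG.add("teardown:" + name);   // e.g. delete test paths, close FS
    }
  }

  public static void main(String[] args) {
    runWithCommonBoilerplate("noMetadataEntry", log -> log.add("assert:noMetadataEntry"));
    runWithCommonBoilerplate("authFlag", log -> log.add("assert:authFlag"));
    System.out.println(LOG);
  }
}
```

Each test case then shrinks to just its distinguishing body, and setup/teardown changes happen in one place.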
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321799219 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); +} +final List comparePairs = new ArrayList<>(); +final Queue queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + final S3AFileStatus currentDir = queue.poll(); + scannedItems++; + + final Path currentDirPath = currentDir.getPath(); + List s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath)); + + // DIRECTORIES + // Check directory authoritativeness consistency + compareAuthoritativeDirectoryFlag(comparePairs, currentDirPath, s3DirListing); + // Add all descendant directory to the queue + s3DirListing.stream().filter(pm -> pm.isDirectory()) + .map(S3AFileStatus.class::cast) + .forEach(pm -> queue.add(pm)); + + // FILES + // check files for consistency + final List children = s3DirListing.stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); +
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316663744 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,312 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; Review comment: Imports
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316664318 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,312 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. 
+ */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); + return; +} + +StringBuilder sB = new StringBuilder(); +sB.append(newLine) +.append("On path: ").append(comparePair.getPath()).append(newLine); + +// Create a new instance of the handler and use it. +for (S3GuardFsck.Violation violation : comparePair.getViolations()) { + try { +ViolationHandler handler = violation.getHandler() Review comment: If you pulled this out into a static method, it could be tested on its own.
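The extraction suggested above turns the report-building loop into a pure function that can be unit tested without an S3 filesystem or a DynamoDB table. A minimal sketch of the idea, assuming the per-violation error strings are computed first (the class and method names here are hypothetical, not from the patch):

```java
import java.util.List;

/** Sketch: pull the report-building loop out of handle() for testability. */
public class ReportSketch {
  private static final String NL = System.lineSeparator();

  /**
   * Pure function: a path plus its per-violation error strings in,
   * formatted report text out. No FS or metadata store needed to test it.
   */
  static String buildViolationReport(String path, List<String> errors) {
    StringBuilder sb = new StringBuilder();
    sb.append(NL).append("On path: ").append(path).append(NL);
    for (String error : errors) {
      sb.append(error).append(NL);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    System.out.print(buildViolationReport("/a/b", List.of("No metadata entry")));
  }
}
```

handle() would then only resolve the handlers and log the string this method returns, keeping the reflection and logging at the edges.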
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321806663 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); +} +final List comparePairs = new ArrayList<>(); +final Queue queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + final S3AFileStatus currentDir = queue.poll(); + scannedItems++; + + final Path currentDirPath = currentDir.getPath(); + List s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath)); + + // DIRECTORIES + // Check directory authoritativeness consistency + compareAuthoritativeDirectoryFlag(comparePairs, currentDirPath, s3DirListing); + // Add all descendant directory to the queue + s3DirListing.stream().filter(pm -> pm.isDirectory()) + .map(S3AFileStatus.class::cast) + .forEach(pm -> queue.add(pm)); + + // FILES + // check files for consistency + final List children = s3DirListing.stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); +
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321800562 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); +} +final List comparePairs = new ArrayList<>(); +final Queue queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + final S3AFileStatus currentDir = queue.poll(); + scannedItems++; + + final Path currentDirPath = currentDir.getPath(); + List s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath)); Review comment: I think this needs to handle the possibility of the raw FS raising an FNFE, saying the path has been deleted since it was queued. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: co
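The FNFE point in the review above is that a directory can be deleted between being enqueued and being listed, so the breadth-first walk should catch FileNotFoundException on listing and continue rather than abort the whole scan. A self-contained sketch of that shape (the Lister interface stands in for rawFS.listStatus; all names here are illustrative, not from the patch):

```java
import java.io.FileNotFoundException;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Stand-in for rawFS.listStatus(path): may throw if the path vanished. */
interface Lister {
  List<String> list(String dir) throws FileNotFoundException;
}

public class WalkSketch {
  /** BFS walk that tolerates paths deleted while they sat in the queue. */
  static List<String> walk(String root, Lister lister) {
    List<String> visited = new ArrayList<>();
    Queue<String> queue = new ArrayDeque<>();
    queue.add(root);
    while (!queue.isEmpty()) {
      String dir = queue.poll();
      List<String> children;
      try {
        children = lister.list(dir);
      } catch (FileNotFoundException e) {
        // Deleted since it was queued: skip this entry, keep scanning.
        continue;
      }
      visited.add(dir);
      queue.addAll(children);
    }
    return visited;
  }

  public static void main(String[] args) {
    Lister lister = dir -> {
      if (dir.equals("/gone")) throw new FileNotFoundException(dir);
      return dir.equals("/") ? List.of("/gone", "/ok") : List.of();
    };
    System.out.println(walk("/", lister)); // prints [/, /ok]
  }
}
```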
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r31881 ## File path: hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardFsck.java ## @@ -0,0 +1,707 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + + +import java.net.URI; +import java.util.List; +import java.util.UUID; + +import org.apache.hadoop.io.IOUtils; +import org.junit.Before; +import org.junit.Test; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AbstractS3ATestBase; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.assertj.core.api.Assertions; + +import static org.apache.hadoop.fs.contract.ContractTestUtils.touch; +import static org.apache.hadoop.fs.s3a.Constants.METADATASTORE_AUTHORITATIVE; +import static org.apache.hadoop.fs.s3a.Constants.S3_METADATA_STORE_IMPL; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.awaitFileStatus; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.metadataStorePersistsAuthoritativeBit; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides; +import static org.junit.Assume.assumeTrue; + +/** + * Integration tests for the S3Guard Fsck against a dynamodb-backed metadata + * store.
+ */ +public class ITestS3GuardFsck extends AbstractS3ATestBase { + + private S3AFileSystem guardedFs; + private S3AFileSystem rawFS; + + private MetadataStore metadataStore; + + @Before + public void setup() throws Exception { +super.setup(); +S3AFileSystem fs = getFileSystem(); +// These tests will fail if no ms +assertTrue("FS needs to have a metadatastore.", +fs.hasMetadataStore()); +assertTrue("Metadatastore should persist authoritative bit", +metadataStorePersistsAuthoritativeBit(fs.getMetadataStore())); + +guardedFs = fs; +metadataStore = fs.getMetadataStore(); + +// create raw fs without s3guard +rawFS = createUnguardedFS(); +assertFalse("Raw FS still has S3Guard " + rawFS, +rawFS.hasMetadataStore()); + } + + @Override + public void teardown() throws Exception { +if (guardedFs != null) { + IOUtils.cleanupWithLogger(LOG, guardedFs); +} +IOUtils.cleanupWithLogger(LOG, rawFS); +super.teardown(); + } + + /** + * Create a test filesystem which is always unguarded. + * This filesystem MUST be closed in test teardown. + * @return the new FS + */ + private S3AFileSystem createUnguardedFS() throws Exception { +S3AFileSystem testFS = getFileSystem(); +Configuration config = new Configuration(testFS.getConf()); +URI uri = testFS.getUri(); + +removeBaseAndBucketOverrides(uri.getHost(), config, +S3_METADATA_STORE_IMPL); +removeBaseAndBucketOverrides(uri.getHost(), config, Review comment: We need to remove authoritative paths too.
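The review point above is that building the unguarded filesystem config must clear every S3Guard-related override, including the authoritative-path setting, not only the metadata store class. The shape of the fix can be sketched with a plain key-value store; the varargs helper mirrors the quoted removeBaseAndBucketOverrides, and the key names are illustrative stand-ins for the real Hadoop constants.

```java
import java.util.Properties;

/** Sketch: clear all S3Guard-related overrides in one varargs call. */
public class ConfigSketch {
  /** Removes each named override from the configuration, if present. */
  static void removeOverrides(Properties conf, String... keys) {
    for (String key : keys) {
      conf.remove(key);
    }
  }

  public static void main(String[] args) {
    Properties conf = new Properties();
    conf.setProperty("fs.s3a.metadatastore.impl", "dynamo");
    conf.setProperty("fs.s3a.metadatastore.authoritative", "true");
    conf.setProperty("fs.s3a.authoritative.path", "/auth");
    // One call covers the store impl, the auth bit, and the auth paths,
    // instead of two calls that between them forget the path setting.
    removeOverrides(conf,
        "fs.s3a.metadatastore.impl",
        "fs.s3a.metadatastore.authoritative",
        "fs.s3a.authoritative.path");
    System.out.println(conf.isEmpty()); // prints true
  }
}
```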
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321805419 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,315 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. + */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + // The rawFS and metadataStore are here to prepare when the ViolationHandlers + // will not just log, but fix the violations, so they will have access. 
+ private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); + return; +} + +StringBuilder sB = new StringBuilder(); +sB.append(newLine) +.append("On path: ").append(comparePair.getPath()).append(newLine); + +// Create a new instance of the handler and use it. +for (S3GuardFsck.Violation violation : comparePair.getViolations()) { + try { +ViolationHandler handler = violation.getHandler() +.getDeclaredConstructor(S3GuardFsck.ComparePair.class) +.newInstance(comparePair); +final String errorStr = handler.getError(); +sB.append(errorStr); + } catch (NoSuchMethodException e) { +LOG.error("Can not find declared constructor for handler: {}", +violation.getHandler()); + } catch (IllegalAccessException | InstantiationException | InvocationTargetException e) { +LOG.error("Can not instantiate handler: {}", +violation.getHandler()); + } + sB.append(newLine); +} +LOG.error(sB.toString()); + } + + /** + * Violation handler abstract class. + * This class should be extended for violation handlers. 
+ */ + public static abstract class ViolationHandler { +private final PathMetadata pathMetadata; +private final S3AFileStatus s3FileStatus; +private final S3AFileStatus msFileStatus; +private final List s3DirListing; +private final DirListingMetadata msDirListing; + +public ViolationHandler(S3GuardFsck.ComparePair comparePair) { + pathMetadata = comparePair.getMsPathMetadata(); + s3FileStatus = comparePair.getS3FileStatus(); + if (pathMetadata != null) { +msFileStatus = pathMetadata.getFileStatus(); + } else { +msFileStatus = null; + } + s3DirListing = comparePair.getS3DirListing(); + msDirListing = comparePair.getMsDirListing(); +} + +abstract String getError(); + +public PathMetadata getPathMetadata() { + return pathMetadata; +} + +public S3AFileStatus getS3FileStatus() { + return s3FileStatus; +} + +public S3AFileStatus getMsFileStatus() { + return msFileStatus; +} + +public List getS3DirListing() { + return s3DirListing; +} + +public DirListingMetadata getMsDirListing() { + return msDirListing; +} + } + + /** + * The violation handler when there's no matching metadata entry in the MS. + */ + public static class NoMetadataEntry extends ViolationHandler { + +public NoMetadataEntry(S3GuardFsck.ComparePair comparePair) { + super(comparePair); +} + +@Override +public String getError()
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r31565 ## File path: hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardFsck.java ## @@ -0,0 +1,707 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + + +import java.net.URI; +import java.util.List; +import java.util.UUID; + +import org.apache.hadoop.io.IOUtils; +import org.junit.Before; +import org.junit.Test; + +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AbstractS3ATestBase; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.assertj.core.api.Assertions; + +import static org.apache.hadoop.fs.contract.ContractTestUtils.touch; +import static org.apache.hadoop.fs.s3a.Constants.METADATASTORE_AUTHORITATIVE; +import static org.apache.hadoop.fs.s3a.Constants.S3_METADATA_STORE_IMPL; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.awaitFileStatus; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.metadataStorePersistsAuthoritativeBit; +import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides; +import static org.junit.Assume.assumeTrue; + +/** + * Integration tests for the S3Guard Fsck against a dynamodb backed metadata + * store. + */ +public class ITestS3GuardFsck extends AbstractS3ATestBase { + + private S3AFileSystem guardedFs; + private S3AFileSystem rawFS; + + private MetadataStore metadataStore; + + @Before + public void setup() throws Exception { +super.setup(); +S3AFileSystem fs = getFileSystem(); +// These tests will fail if no ms +assertTrue("FS needs to have a metadatastore.", +fs.hasMetadataStore()); +assertTrue("Metadatastore should persist authoritative bit", Review comment: And here too This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. 
For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321815770 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List<ComparePair> compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); +} +final List<ComparePair> comparePairs = new ArrayList<>(); +final Queue<S3AFileStatus> queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + final S3AFileStatus currentDir = queue.poll(); + scannedItems++; + + final Path currentDirPath = currentDir.getPath(); + List<FileStatus> s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath)); + + // DIRECTORIES + // Check directory authoritativeness consistency + compareAuthoritativeDirectoryFlag(comparePairs, currentDirPath, s3DirListing); + // Add all descendant directory to the queue + s3DirListing.stream().filter(pm -> pm.isDirectory()) + .map(S3AFileStatus.class::cast) + .forEach(pm -> queue.add(pm)); + + // FILES + // check files for consistency + final List<S3AFileStatus> children = s3DirListing.stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); +
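The breadth-first walk quoted above can be reduced to a stand-alone sketch. This is a minimal analogue using a hypothetical `Node` type instead of the real `S3AFileStatus`/`S3AFileSystem` API — the queue discipline is the same, but the names and types are illustrative only:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Minimal sketch of the iterative breadth-first directory walk used by the fsck. */
public class BfsWalkSketch {

    /** Hypothetical stand-in for S3AFileStatus: a path, a directory flag, children. */
    public static class Node {
        public final String path;
        public final boolean dir;
        public final List<Node> children = new ArrayList<>();
        public Node(String path, boolean dir) { this.path = path; this.dir = dir; }
        public Node add(Node child) { children.add(child); return this; }
    }

    /** Visits every directory breadth-first; files would be compared to the MS here. */
    public static List<String> walk(Node root) {
        List<String> visited = new ArrayList<>();
        Queue<Node> queue = new ArrayDeque<>();   // same structure as the PR's queue
        queue.add(root);
        while (!queue.isEmpty()) {
            Node current = queue.poll();          // pop front node from the queue
            visited.add(current.path);
            for (Node child : current.children) {
                if (child.dir) {
                    queue.add(child);             // each sub-directory is walked later
                }
                // a file would be checked for consistency at this point
            }
        }
        return visited;
    }

    /** Builds a tiny tree and walks it, for demonstration. */
    public static List<String> demo() {
        Node root = new Node("/", true)
            .add(new Node("/a", true).add(new Node("/a/b", true)))
            .add(new Node("/c", true))
            .add(new Node("/f.txt", false));
        return walk(root);
    }

    public static void main(String[] args) {
        // BFS order: both children of "/" are visited before the grandchild "/a/b"
        System.out.println(demo());
    }
}
```

Note how the BFS ordering guarantees that a directory's authoritative-listing check always runs before any of its descendants are examined, which is what lets the real code report violations top-down.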
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321703265 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java ## @@ -1485,6 +1486,89 @@ private void vprintln(PrintStream out, String format, Object... } } + /** + * Prune metadata that has not been modified recently. + */ + static class Fsck extends S3GuardTool { +public static final String CHECK_FLAG = "check"; + +public static final String NAME = "fsck"; +public static final String PURPOSE = "Compares S3 with MetadataStore, and " ++ "returns a failure status if any rules or invariants are violated. " ++ "Only works with DynamoDbMetadataStore."; Review comment: + say "-check" in usage so its clear that you need the prefix
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316662769 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,395 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. 
+ * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. + * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return + */ + public List<ComparePair> compareS3RootToMs(Path p) throws IOException { +final Path rootPath = rawFS.qualify(p); +final S3AFileStatus root = +(S3AFileStatus) rawFS.getFileStatus(rootPath); +final List<ComparePair> comparePairs = new ArrayList<>(); +final Queue<S3AFileStatus> queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + // pop front node from the queue + final S3AFileStatus currentDir = queue.poll(); + + // Get a listing of that dir from s3 and add just the files. + // (Each directory will be added as a root.) + // Files should be casted to S3AFileStatus instead of plain FileStatus + // to get the VersionID and Etag. 
+ final Path currentDirPath = currentDir.getPath(); + + final FileStatus[] s3DirListing = rawFS.listStatus(currentDirPath); + final List<S3AFileStatus> children = + Arrays.asList(s3DirListing).stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); + + // Compare the directory contents if the listing is authoritative + final DirListingMetadata msDirListing = + metadataStore.listChildren(currentDirPath); + if (msDirListing != null && msDirListing.isAuthoritative()) { +final ComparePair cP = +compareAuthDirListing(s3DirListing, msDirListing); +if (cP.containsViolation()) { + comparePairs.add(cP); +} + } + + // Compare directory and contents, but not the listing + final
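The quoted loop streams over the same listing twice: once to collect the files to check, and (in the later revision) once to collect the sub-directories to enqueue. A single-pass alternative is `Collectors.partitioningBy`; the sketch below uses a hypothetical `Status` type in place of Hadoop's `FileStatus`, and is an illustration of the idiom, not the code the PR actually uses:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Sketch: split one directory listing into sub-directories and files in a single pass. */
public class ListingSplitSketch {

    /** Hypothetical stand-in for FileStatus: just a path and a directory flag. */
    public static class Status {
        public final String path;
        public final boolean dir;
        public Status(String path, boolean dir) { this.path = path; this.dir = dir; }
    }

    /** Key true -> directories (to enqueue), key false -> files (to compare). */
    public static Map<Boolean, List<Status>> split(List<Status> listing) {
        return listing.stream().collect(Collectors.partitioningBy(s -> s.dir));
    }

    public static void main(String[] args) {
        Map<Boolean, List<Status>> parts = split(Arrays.asList(
                new Status("/dir1", true),
                new Status("/file1", false),
                new Status("/file2", false)));
        System.out.println(parts.get(true).size() + " dirs, "
                + parts.get(false).size() + " files");
    }
}
```

`partitioningBy` always populates both keys (possibly with empty lists), so no null check is needed on either branch.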
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316665415 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java ## @@ -0,0 +1,312 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.reflect.InvocationTargetException; +import java.util.Arrays; +import java.util.List; + +/** + * Violation handler for the S3Guard's fsck. 
+ */ +public class S3GuardFsckViolationHandler { + private static final Logger LOG = LoggerFactory.getLogger( + S3GuardFsckViolationHandler.class); + + private S3AFileSystem rawFs; + private DynamoDBMetadataStore metadataStore; + private static String newLine = System.getProperty("line.separator"); + + public S3GuardFsckViolationHandler(S3AFileSystem fs, + DynamoDBMetadataStore ddbms) { + +this.metadataStore = ddbms; +this.rawFs = fs; + } + + public void handle(S3GuardFsck.ComparePair comparePair) { +if (!comparePair.containsViolation()) { + LOG.debug("There is no violation in the compare pair: " + toString()); + return; +} + +StringBuilder sB = new StringBuilder(); +sB.append(newLine) +.append("On path: ").append(comparePair.getPath()).append(newLine); + +// Create a new instance of the handler and use it. +for (S3GuardFsck.Violation violation : comparePair.getViolations()) { + try { +ViolationHandler handler = violation.getHandler() +.getDeclaredConstructor(S3GuardFsck.ComparePair.class) +.newInstance(comparePair); +final String errorStr = handler.getError(); +sB.append(errorStr); + } catch (NoSuchMethodException e) { +LOG.error("Can not find declared constructor for handler: {}", +violation.getHandler()); + } catch (IllegalAccessException | InstantiationException | InvocationTargetException e) { +LOG.error("Can not instantiate handler: {}", +violation.getHandler()); + } + sB.append(newLine); +} +LOG.error(sB.toString()); + } + + /** + * Violation handler abstract class. + * This class should be extended for violation handlers. 
+ */ + public static abstract class ViolationHandler { +private final PathMetadata pathMetadata; +private final S3AFileStatus s3FileStatus; +private final S3AFileStatus msFileStatus; +private final List<FileStatus> s3DirListing; +private final DirListingMetadata msDirListing; + +public ViolationHandler(S3GuardFsck.ComparePair comparePair) { + pathMetadata = comparePair.getMsPathMetadata(); + s3FileStatus = comparePair.getS3FileStatus(); + if (pathMetadata != null) { +msFileStatus = pathMetadata.getFileStatus(); + } else { +msFileStatus = null; + } + s3DirListing = comparePair.getS3DirListing(); + msDirListing = comparePair.getMsDirListing(); +} + +abstract String getError(); + +public PathMetadata getPathMetadata() { + return pathMetadata; +} + +public S3AFileStatus getS3FileStatus() { + return s3FileStatus; +} + +public S3AFileStatus getMsFileStatus() { + return msFileStatus; +} + +public List<FileStatus> getS3DirListing() { + return s3DirListing; +} + +public DirListingMetadata getMsDirListing() { + return msDirListing; +} + } + + /** + * The violation handler when there's no matching metadata entry in the MS. + */ + public static class NoMetadataEntry extends ViolationHandler { + +public NoMetadataEntry(S3GuardFsck.ComparePair comparePair) { + super(comparePair); +} + +@Override +public String getError() { + return "No PathMetadata for this path in the MS."; +} + } + + /** + * The violation handler when there's no parent entry. + */ + public s
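The dispatch pattern quoted above — one handler class per violation, instantiated reflectively through a single-argument constructor — can be reduced to a self-contained sketch. The `Pair`, `Handler`, and `NoEntry` names below are hypothetical stand-ins for `ComparePair`, `ViolationHandler`, and the concrete handlers, chosen only to keep the example compilable on its own:

```java
import java.lang.reflect.InvocationTargetException;

/** Sketch of the reflective "one handler class per violation" dispatch. */
public class ReflectiveDispatchSketch {

    /** Hypothetical stand-in for S3GuardFsck.ComparePair. */
    public static class Pair {
        public final String path;
        public Pair(String path) { this.path = path; }
    }

    /** Base handler: every subclass must expose a (Pair) constructor. */
    public abstract static class Handler {
        protected final Pair pair;
        public Handler(Pair pair) { this.pair = pair; }
        public abstract String getError();
    }

    /** One concrete violation handler. */
    public static class NoEntry extends Handler {
        public NoEntry(Pair pair) { super(pair); }
        @Override public String getError() { return "No entry for " + pair.path; }
    }

    /**
     * Looks up the (Pair) constructor and invokes it — the same shape as the PR's
     * getDeclaredConstructor(ComparePair.class).newInstance(comparePair) call.
     */
    public static String describe(Class<? extends Handler> handlerClass, Pair pair) {
        try {
            Handler handler = handlerClass
                    .getDeclaredConstructor(Pair.class)
                    .newInstance(pair);
            return handler.getError();
        } catch (NoSuchMethodException | IllegalAccessException
                | InstantiationException | InvocationTargetException e) {
            return "Cannot instantiate handler: " + handlerClass.getName();
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(NoEntry.class, new Pair("/missing")));
    }
}
```

The implied contract — every handler subclass must declare exactly that constructor — is enforced only at runtime (`NoSuchMethodException`), which is why the violation handler logs rather than throws when a handler class cannot be instantiated.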
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321807686 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java ## @@ -1485,6 +1486,93 @@ private void vprintln(PrintStream out, String format, Object... } } + /** + * Prune metadata that has not been modified recently. + */ + static class Fsck extends S3GuardTool { +public static final String CHECK_FLAG = "check"; + +public static final String NAME = "fsck"; +public static final String PURPOSE = "Compares S3 with MetadataStore, and " ++ "returns a failure status if any rules or invariants are violated. " ++ "Only works with DynamoDbMetadataStore."; +private static final String USAGE = NAME + " [OPTIONS] [s3a://BUCKET]\n" + +"\t" + PURPOSE + "\n\n" + +"Common options:\n" + +" " + CHECK_FLAG + " Check the metadata store for errors, but do " Review comment: add a - in front of the check flag
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r321799314 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,421 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.AWSBadRequestException; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; + +import com.google.common.base.Stopwatch; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; +import java.util.concurrent.TimeUnit; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. + * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; + private DynamoDBMetadataStore metadataStore; + + /** + * Creates an S3GuardFsck. + * @param fs the filesystem to compare to + * @param ms metadatastore the metadatastore to compare with (dynamo) + */ + S3GuardFsck(S3AFileSystem fs, MetadataStore ms) + throws InvalidParameterException { +this.rawFS = fs; + +if (ms == null) { + throw new InvalidParameterException("S3AFileSystem should be guarded by" + + " a " + DynamoDBMetadataStore.class.getCanonicalName()); +} +this.metadataStore = (DynamoDBMetadataStore) ms; + +if (rawFS.hasMetadataStore()) { + throw new InvalidParameterException("Raw fs should not have a " + + "metadatastore."); +} + } + + /** + * Compares S3 to MS. + * Iterative breadth first walk on the S3 structure from a given root. 
+ * Creates a list of pairs (metadata in S3 and in the MetadataStore) where + * the consistency or any rule is violated. + * Uses {@link S3GuardFsckViolationHandler} to handle violations. + * The violations are listed in Enums: {@link Violation} + * + * @param p the root path to start the traversal + * @throws IOException + * @return a list of {@link ComparePair} + */ + public List<ComparePair> compareS3ToMs(Path p) throws IOException { +Stopwatch stopwatch = Stopwatch.createStarted(); +int scannedItems = 0; + +final Path rootPath = rawFS.qualify(p); +S3AFileStatus root = null; +try { + root = (S3AFileStatus) rawFS.getFileStatus(rootPath); +} catch (AWSBadRequestException e) { + throw new IOException(e.getMessage()); +} +final List<ComparePair> comparePairs = new ArrayList<>(); +final Queue<S3AFileStatus> queue = new ArrayDeque<>(); +queue.add(root); + +while (!queue.isEmpty()) { + final S3AFileStatus currentDir = queue.poll(); + scannedItems++; + + final Path currentDirPath = currentDir.getPath(); + List<FileStatus> s3DirListing = Arrays.asList(rawFS.listStatus(currentDirPath)); + + // DIRECTORIES + // Check directory authoritativeness consistency + compareAuthoritativeDirectoryFlag(comparePairs, currentDirPath, s3DirListing); + // Add all descendant directory to the queue + s3DirListing.stream().filter(pm -> pm.isDirectory()) + .map(S3AFileStatus.class::cast) + .forEach(pm -> queue.add(pm)); + + // FILES + // check files for consistency + final List<S3AFileStatus> children = s3DirListing.stream() + .filter(status -> !status.isDirectory()) + .map(S3AFileStatus.class::cast).collect(toList()); +
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316659206 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,395 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. 
+ * + * Functions: + * + * Checking metadata consistency between S3 and metadatastore + * + */ +public class S3GuardFsck { + private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class); + public static final String ROOT_PATH_STRING = "/"; + + private S3AFileSystem rawFS; Review comment: should these be final?
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log) URL: https://github.com/apache/hadoop/pull/1208#discussion_r316663521 ## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java ## @@ -0,0 +1,395 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.s3guard; + +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.s3a.S3AFileStatus; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.IOException; +import java.security.InvalidParameterException; +import java.util.ArrayDeque; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.HashSet; +import java.util.List; +import java.util.Queue; +import java.util.Set; + +import static java.util.stream.Collectors.toList; +import static java.util.stream.Collectors.toSet; + +/** + * Main class for the FSCK factored out from S3GuardTool + * The implementation uses fixed DynamoDBMetadataStore as the backing store + * for metadata. 
+ *
+ * Functions:
+ *
+ * Checking metadata consistency between S3 and metadatastore
+ *
+ */
+public class S3GuardFsck {
+  private static final Logger LOG = LoggerFactory.getLogger(S3GuardFsck.class);
+  public static final String ROOT_PATH_STRING = "/";
+
+  private S3AFileSystem rawFS;
+  private DynamoDBMetadataStore metadataStore;
+
+  /**
+   * Creates an S3GuardFsck.
+   * @param fs the filesystem to compare to
+   * @param ms the metadatastore to compare with (dynamo)
+   */
+  S3GuardFsck(S3AFileSystem fs, MetadataStore ms)
+      throws InvalidParameterException {
+    this.rawFS = fs;
+
+    if (ms == null) {
+      throw new InvalidParameterException("S3AFileSystem should be guarded by"
+          + " a " + DynamoDBMetadataStore.class.getCanonicalName());
+    }
+    this.metadataStore = (DynamoDBMetadataStore) ms;
+
+    if (rawFS.hasMetadataStore()) {
+      throw new InvalidParameterException("Raw fs should not have a "
+          + "metadatastore.");
+    }
+  }
+
+  /**
+   * Compares S3 to MS.
+   * Iterative breadth-first walk on the S3 structure from a given root.
+   * Creates a list of pairs (metadata in S3 and in the MetadataStore) where
+   * the consistency or any rule is violated.
+   * Uses {@link S3GuardFsckViolationHandler} to handle violations.
+   * The violations are listed in Enums: {@link Violation}
+   *
+   * @param p the root path to start the traversal
+   * @throws IOException if listing S3 or the metadata store fails
+   * @return the list of compare pairs that contain violations
+   */
+  public List<ComparePair> compareS3RootToMs(Path p) throws IOException {
+    final Path rootPath = rawFS.qualify(p);
+    final S3AFileStatus root =
+        (S3AFileStatus) rawFS.getFileStatus(rootPath);
+    final List<ComparePair> comparePairs = new ArrayList<>();
+    final Queue<S3AFileStatus> queue = new ArrayDeque<>();
+    queue.add(root);
+
+    while (!queue.isEmpty()) {
+      // pop front node from the queue
+      final S3AFileStatus currentDir = queue.poll();
+
+      // Get a listing of that dir from s3 and add just the files.
+      // (Each directory will be added as a root.)
+      // Files should be cast to S3AFileStatus instead of plain FileStatus
+      // to get the VersionID and Etag.
+      final Path currentDirPath = currentDir.getPath();
+
+      final FileStatus[] s3DirListing = rawFS.listStatus(currentDirPath);
+      final List<S3AFileStatus> children =
+          Arrays.asList(s3DirListing).stream()
+              .filter(status -> !status.isDirectory())
+              .map(S3AFileStatus.class::cast).collect(toList());
+
+      // Compare the directory contents if the listing is authoritative
+      final DirListingMetadata msDirListing =
+          metadataStore.listChildren(currentDirPath);
+      if (msDirListing != null && msDirListing.isAuthoritative()) {
+        final ComparePair cP =
+            compareAuthDirListing(s3DirListing, msDirListing);
+        if (cP.containsViolation()) {
+          comparePairs.add(cP);
+        }
+      }
+
+      // Compare directory and contents, but not the listing
+      final
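The walk above boils down to a plain breadth-first traversal that flags paths which have no entry in the metadata store. A minimal, self-contained sketch of that idea, where a `Map` of directory listings stands in for S3 and a `Set` of known paths stands in for the DynamoDB store (`findMissing` and all names here are hypothetical, not the PR's API):

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

public class BfsCompareSketch {

  /** Returns the paths present in the "S3" tree but missing from the "MS" set. */
  static List<String> findMissing(Map<String, List<String>> s3Tree,
                                  Set<String> msEntries,
                                  String root) {
    List<String> violations = new ArrayList<>();
    Queue<String> queue = new ArrayDeque<>();
    queue.add(root);
    while (!queue.isEmpty()) {
      // pop front node from the queue, as in compareS3RootToMs
      String dir = queue.poll();
      if (!msEntries.contains(dir)) {
        violations.add(dir);                 // directory has no metadata entry
      }
      for (String child : s3Tree.getOrDefault(dir, Collections.emptyList())) {
        if (s3Tree.containsKey(child)) {
          queue.add(child);                  // each subdirectory is walked as a new root
        } else if (!msEntries.contains(child)) {
          violations.add(child);             // file missing from the metadata store
        }
      }
    }
    return violations;
  }
}
```

The real implementation carries much more per-pair state (etags, version IDs, authoritative-listing flags); this only shows the traversal and the "no metadata entry" class of violation.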
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#discussion_r321812633

## File path: hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/s3guard/ITestS3GuardFsck.java
##
@@ -0,0 +1,707 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.s3guard;
+
+
+import java.net.URI;
+import java.util.List;
+import java.util.UUID;
+
+import org.apache.hadoop.io.IOUtils;
+import org.junit.Before;
+import org.junit.Test;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.AbstractS3ATestBase;
+import org.apache.hadoop.fs.s3a.S3AFileStatus;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.assertj.core.api.Assertions;
+
+import static org.apache.hadoop.fs.contract.ContractTestUtils.touch;
+import static org.apache.hadoop.fs.s3a.Constants.METADATASTORE_AUTHORITATIVE;
+import static org.apache.hadoop.fs.s3a.Constants.S3_METADATA_STORE_IMPL;
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.awaitFileStatus;
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.metadataStorePersistsAuthoritativeBit;
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides;
+import static org.junit.Assume.assumeTrue;
+
+/**
+ * Integration tests for the S3Guard Fsck against a dynamodb backed metadata
+ * store.
+ */
+public class ITestS3GuardFsck extends AbstractS3ATestBase {
+
+  private S3AFileSystem guardedFs;
+  private S3AFileSystem rawFS;
+
+  private MetadataStore metadataStore;
+
+  @Before
+  public void setup() throws Exception {
+    super.setup();
+    S3AFileSystem fs = getFileSystem();
+    // These tests will fail if there is no metadata store
+    assertTrue("FS needs to have a metadatastore.",
+        fs.hasMetadataStore());
+    assertTrue("Metadatastore should persist authoritative bit",
+        metadataStorePersistsAuthoritativeBit(fs.getMetadataStore()));
+
+    guardedFs = fs;
+    metadataStore = fs.getMetadataStore();
+
+    // create raw fs without s3guard
+    rawFS = createUnguardedFS();
+    assertFalse("Raw FS still has S3Guard " + rawFS,
+        rawFS.hasMetadataStore());
+  }
+
+  @Override
+  public void teardown() throws Exception {
+    if (guardedFs != null) {
+      IOUtils.cleanupWithLogger(LOG, guardedFs);
+    }
+    IOUtils.cleanupWithLogger(LOG, rawFS);
+    super.teardown();
+  }
+
+  /**
+   * Create a test filesystem which is always unguarded.
+   * This filesystem MUST be closed in test teardown.
+   * @return the new FS
+   */
+  private S3AFileSystem createUnguardedFS() throws Exception {
+    S3AFileSystem testFS = getFileSystem();
+    Configuration config = new Configuration(testFS.getConf());
+    URI uri = testFS.getUri();
+
+    removeBaseAndBucketOverrides(uri.getHost(), config,
+        S3_METADATA_STORE_IMPL);
+    removeBaseAndBucketOverrides(uri.getHost(), config,
+        METADATASTORE_AUTHORITATIVE);
+    S3AFileSystem fs2 = new S3AFileSystem();
+    fs2.initialize(uri, config);
+    return fs2;
+  }
+
+  @Test
+  public void testIDetectNoMetadataEntry() throws Exception {
+    final Path cwd = path("/" + getMethodName() + "-" + UUID.randomUUID());
+    final Path file = new Path(cwd, "file");
+    try {
+      touch(rawFS, file);
+      awaitFileStatus(rawFS, file);
+
+      final S3GuardFsck s3GuardFsck =
+          new S3GuardFsck(rawFS, metadataStore);
+
+      final List<S3GuardFsck.ComparePair> comparePairs =
+          s3GuardFsck.compareS3ToMs(cwd);
+
+      assertEquals("Number of pairs should be two.", 2,
+          comparePairs.size());
+      final S3GuardFsck.ComparePair pair = comparePairs.get(0);
+      assertTrue("The pair must contain a violation.", pair.containsViolation());
+      assertEquals("The pair must contain only one violation", 1,
+          pair.getViolations().size());
+
+      final S3GuardFsck.Violation violation =
+          pair.getViolations().iterator().next();
+      assertEquals("The violation should be that there is no metadata entry.",
+          violation, S3GuardFsck.Vi
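`createUnguardedFS` works by copying the guarded filesystem's configuration and stripping both the base and the per-bucket S3Guard overrides before re-initializing a fresh `S3AFileSystem`. A rough sketch of that override-removal step, using a plain `Map` in place of Hadoop's `Configuration` (the key names are the real `fs.s3a` options; `withoutS3Guard` itself is a hypothetical helper):

```java
import java.util.HashMap;
import java.util.Map;

public class UnguardedConfSketch {

  /** Copy the config and drop the S3Guard keys, base and per-bucket forms. */
  static Map<String, String> withoutS3Guard(Map<String, String> conf,
                                            String bucket) {
    Map<String, String> copy = new HashMap<>(conf);  // never mutate the shared conf
    for (String key : new String[]{
        "fs.s3a.metadatastore.impl",            // S3_METADATA_STORE_IMPL
        "fs.s3a.metadatastore.authoritative"}) { // METADATASTORE_AUTHORITATIVE
      copy.remove(key);                          // base override
      // per-bucket overrides replace the "fs.s3a." prefix with
      // "fs.s3a.bucket.<bucket>."
      copy.remove("fs.s3a.bucket." + bucket + "."
          + key.substring("fs.s3a.".length()));
    }
    return copy;
  }
}
```

Copying the configuration first matters: `removeBaseAndBucketOverrides` in the real test also operates on a fresh `Configuration(testFS.getConf())` so the guarded filesystem keeps its own settings.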
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#discussion_r321703047

## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardTool.java
##
@@ -1485,6 +1486,93 @@ private void vprintln(PrintStream out, String format, Object...
     }
   }

+  /**
+   * Prune metadata that has not been modified recently.

Review comment: javadoc needs updating

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

With regards,
Apache Git Services

-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
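The prune operation that this javadoc describes drops metadata entries whose modification time is older than a cutoff. A toy sketch of that selection step (all names hypothetical; the real prune runs against the DynamoDB table, not an in-memory map):

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

public class PruneSketch {

  /**
   * Returns the paths whose modification time is strictly older than the
   * cutoff, i.e. the entries a prune with that cutoff would delete.
   */
  static List<String> toPrune(Map<String, Long> modTimeMillis,
                              long cutoffMillis) {
    return modTimeMillis.entrySet().stream()
        .filter(e -> e.getValue() < cutoffMillis)  // "not modified recently"
        .map(Map.Entry::getKey)
        .sorted()                                  // deterministic output order
        .collect(Collectors.toList());
  }
}
```

In the real tool the cutoff is derived from the `-days`/`-hours`/`-minutes`/`-seconds` arguments relative to the current time.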
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#discussion_r316659790

## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsck.java
##
@@ -0,0 +1,395 @@
+/* Standard Apache License, Version 2.0 header, identical to the one quoted in full above. */
+
+package org.apache.hadoop.fs.s3a.s3guard;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.s3a.S3AFileStatus;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.security.InvalidParameterException;
+import java.util.ArrayDeque;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.HashSet;
+import java.util.List;
+import java.util.Queue;
+import java.util.Set;
+
+import static java.util.stream.Collectors.toList;
+import static java.util.stream.Collectors.toSet;
+
+/**
+ * Main class for the FSCK, factored out from S3GuardTool.
+ * The implementation uses a fixed DynamoDBMetadataStore as the backing store
+ * for metadata.
+ *
+ * Functions:
+ *
+ * Checking metadata consistency between S3 and metadatastore
+ *
+ */
+public class S3GuardFsck {

[field declarations and constructor javadoc identical to the first quote of this file omitted]

+  S3GuardFsck(S3AFileSystem fs, MetadataStore ms)
+      throws InvalidParameterException {
+    this.rawFS = fs;
+
+    if (ms == null) {
+      throw new InvalidParameterException("S3AFileSystem should be guarded by"
+          + " a " + DynamoDBMetadataStore.class.getCanonicalName());
+    }
+    this.metadataStore = (DynamoDBMetadataStore) ms;
+
+    if (rawFS.hasMetadataStore()) {
+      throw new InvalidParameterException("Raw fs should not have a "

Review comment: You can use google preconditions
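The reviewer's suggestion is to replace the hand-rolled null and state checks with Guava-style `Preconditions` guards. Since Guava is not available in this sketch, local stand-ins with the same shapes as Guava's `checkNotNull`/`checkArgument` are defined inline (`validate` and all other names here are hypothetical, not the PR's code):

```java
public class PreconditionsSketch {

  /** Same shape as Guava's Preconditions.checkArgument(boolean, Object). */
  static void checkArgument(boolean expression, String message) {
    if (!expression) {
      throw new IllegalArgumentException(message);
    }
  }

  /** Same shape as Guava's Preconditions.checkNotNull(T, Object). */
  static <T> T checkNotNull(T reference, String message) {
    if (reference == null) {
      throw new NullPointerException(message);
    }
    return reference;
  }

  /** The S3GuardFsck constructor checks, rewritten in this style. */
  static void validate(Object ms, boolean fsHasMetadataStore) {
    checkNotNull(ms,
        "S3AFileSystem should be guarded by a DynamoDBMetadataStore");
    checkArgument(!fsHasMetadataStore,
        "Raw fs should not have a metadatastore.");
  }
}
```

One design consequence worth noting: Guava's `checkNotNull` throws `NullPointerException` and `checkArgument` throws `IllegalArgumentException`, whereas the constructor above throws `java.security.InvalidParameterException`, so adopting the suggestion also changes the exception types callers see.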
[GitHub] [hadoop] steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
steveloughran commented on a change in pull request #1208: HADOOP-16423. S3Guard fsck: Check metadata consistency between S3 and metadatastore (log)
URL: https://github.com/apache/hadoop/pull/1208#discussion_r316663888

## File path: hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/s3guard/S3GuardFsckViolationHandler.java
##
@@ -0,0 +1,312 @@
+/* Standard Apache License, Version 2.0 header, identical to the one quoted in full above. */
+
+package org.apache.hadoop.fs.s3a.s3guard;
+
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.s3a.S3AFileStatus;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.lang.reflect.InvocationTargetException;
+import java.util.Arrays;
+import java.util.List;
+
+/**
+ * Violation handler for the S3Guard's fsck.
+ */
+public class S3GuardFsckViolationHandler {
+  private static final Logger LOG = LoggerFactory.getLogger(
+      S3GuardFsckViolationHandler.class);
+
+  private S3AFileSystem rawFs;

Review comment: Mark as final
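"Mark as final" is the standard advice for constructor-injected dependencies: a `final` field can be assigned exactly once, so the compiler guarantees the handler's filesystem reference never changes after construction. A tiny illustration of the pattern (hypothetical class; a `String` stands in for `S3AFileSystem`):

```java
public class ViolationHandlerSketch {

  // was: private S3AFileSystem rawFs; -- final makes reassignment a compile error
  private final String rawFsName;

  ViolationHandlerSketch(String rawFsName) {
    this.rawFsName = rawFsName;  // the only assignment the compiler permits
  }

  String getRawFsName() {
    return rawFsName;
  }
}
```

Any later `this.rawFsName = ...` elsewhere in the class would now fail to compile, which is exactly the safety the review comment is after.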