[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=791718&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791718 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 17/Jul/22 06:01
            Start Date: 17/Jul/22 06:01
    Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r922774936


## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java:

## @@ -0,0 +1,546 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.File;
+import java.io.IOException;
+import java.io.StringReader;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+import java.util.regex.Matcher;
+import java.util.regex.Pattern;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.avro.file.DataFileWriter;
+import org.apache.avro.io.DatumWriter;
+import org.apache.avro.specific.SpecificDatumWriter;
+import org.apache.commons.io.IOUtils;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+
+/**
+ * Merge all the audit logs present in a directory of
+ * multiple audit log files into a single audit log file.
+ */
+public class S3AAuditLogMergerAndParser {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(S3AAuditLogMergerAndParser.class);
+
+  /**
+   * Simple entry: anything up to a space.
+   * {@value}.
+   */
+  private static final String SIMPLE = "[^ ]*";
+
+  /**
+   * Date/Time. Everything within square braces.
+   * {@value}.
+   */
+  private static final String DATETIME = "\\[(.*?)\\]";
+
+  /**
+   * A natural number or "-".
+   * {@value}.
+   */
+  private static final String NUMBER = "(-|[0-9]*)";
+
+  /**
+   * A Quoted field or "-".
+   * {@value}.
+   */
+  private static final String QUOTED = "(-|\"[^\"]*\")";
+
+  /**
+   * An entry in the regexp.
+   *
+   * @param name    name of the group
+   * @param pattern pattern to use in the regexp
+   * @return the pattern for the regexp
+   */
+  private static String e(String name, String pattern) {
+    return String.format("(?<%s>%s) ", name, pattern);
+  }
+
+  /**
+   * An entry in the regexp.
+   *
+   * @param name    name of the group
+   * @param pattern pattern to use in the regexp
+   * @return the pattern for the regexp
+   */
+  private static String eNoTrailing(String name, String pattern) {
+    return String.format("(?<%s>%s)", name, pattern);
+  }
+
+  /**
+   * Simple entry using the {@link #SIMPLE} pattern.
+   *
+   * @param name name of the element (for code clarity only)
+   * @return the pattern for the regexp
+   */
+  private static String e(String name) {
+    return e(name, SIMPLE);
+  }
+
+  /**
+   * Quoted entry using the {@link #QUOTED} pattern.
+   *
+   * @param name name of the element (for code clarity only)
+   * @return the pattern for the regexp
+   */
+  private static String q(String name) {
+    return e(name, QUOTED);
+  }
+
+  /**
+   * Log group {@value}.
+   */
+  public static final String OWNER_GROUP = "owner";
+
+  /**
+   * Log group {@value}.
+   */
+  public static final String BUCKET_GROUP = "bucket";
+
+  /**
+   * Log group {@value}.
+   */
+  public static final String TIMESTAMP_GROUP = "timestamp";
+
+  /**
+   * Log group {@value}.
+   */
+  public static final String REMOTEIP_GROUP = "remoteip";
+
+  /**
+   * Log group {@value}.
+   */
+  public static final String REQUESTER_GROUP = "requester";
+
+  /**
+   * Log group {@value}.
+   */
+  public static final String REQUESTID_GROUP = "requestid";
+
+  /**
+   * Log group {@value}.
+   */
+  public static final
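The `e()`/`q()` helpers quoted above compose a named-group regexp one space-terminated field at a time. A minimal stand-alone sketch of how such a pattern parses a log line with `java.util.regex` named groups; the class name, the reduced field list and the sample line are illustrative only, not the full pattern or fields used by `S3AAuditLogMergerAndParser`:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch: build a named-group regexp field by field and parse one line. */
public class AuditLogRegexDemo {

  // Same field patterns as in the quoted class.
  private static final String SIMPLE = "[^ ]*";
  private static final String DATETIME = "\\[(.*?)\\]";
  private static final String QUOTED = "(-|\"[^\"]*\")";

  /** Named group followed by the separating space. */
  private static String e(String name, String pattern) {
    return String.format("(?<%s>%s) ", name, pattern);
  }

  private static String e(String name) {
    return e(name, SIMPLE);
  }

  private static String q(String name) {
    return e(name, QUOTED);
  }

  public static void main(String[] args) {
    // A reduced subset of fields: owner bucket [timestamp] remoteip "verb"
    Pattern p = Pattern.compile(
        e("owner")
        + e("bucket")
        + e("timestamp", DATETIME)
        + e("remoteip")
        + q("verb"));

    String line =
        "abc123 mybucket [17/Jul/2022:06:01:00 +0000] 198.51.100.7 \"GET\" ";
    Matcher m = p.matcher(line);
    if (m.matches()) {
      // Each field is recovered by its group name, not its position.
      System.out.println("owner=" + m.group("owner"));
      System.out.println("bucket=" + m.group("bucket"));
      System.out.println("timestamp=" + m.group("timestamp"));
      System.out.println("remoteip=" + m.group("remoteip"));
      System.out.println("verb=" + m.group("verb"));
    }
  }
}
```

Because every `e()` entry ends with the separating space, concatenating entries keeps the pattern aligned with the space-delimited log format, and `eNoTrailing()` exists for the final field of a line.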
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=791349&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791349 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 15/Jul/22 10:24
            Start Date: 15/Jul/22 10:24
    Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r922034066

## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java:
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=791328&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791328 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 15/Jul/22 09:04
            Start Date: 15/Jul/22 09:04
    Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r921968636

## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java:
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=791329&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-791329 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 15/Jul/22 09:04
            Start Date: 15/Jul/22 09:04
    Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r921968917

## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java:
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=790766&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790766 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 14/Jul/22 06:53
            Start Date: 14/Jul/22 06:53
    Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r920814131

## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java:
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=790764&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790764 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 14/Jul/22 06:50
            Start Date: 14/Jul/22 06:50
    Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r920812601


## hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/AbstractAuditToolTest.java:

## @@ -0,0 +1,62 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.fs.s3a.AbstractS3ATestBase;
+
+import static org.apache.hadoop.fs.s3a.S3ATestUtils.disableFilesystemCaching;
+import static org.apache.hadoop.fs.s3a.s3guard.S3GuardToolTestHelper.runS3GuardCommand;
+
+/**
+ * An extension of the contract test base set up for S3A tests.
+ */
+public class AbstractAuditToolTest extends AbstractS3ATestBase {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(AbstractAuditToolTest.class);
+
+  /**
+   * Take a configuration, copy it and disable FS Caching on
+   * the new one.
+   *
+   * @param conf source config
+   * @return a new, patched, config
+   */
+  protected Configuration uncachedFSConfig(final Configuration conf) {

Review Comment:
   removed

Issue Time Tracking
-------------------
    Worklog Id:     (was: 790764)
    Time Spent: 8.5h  (was: 8h 20m)

> Merging of S3A Audit Logs
> -------------------------
>
>                 Key: HADOOP-18258
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18258
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sravani Gadey
>            Assignee: Sravani Gadey
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 8.5h
>  Remaining Estimate: 0h
>
> Merging audit log files that contain a huge number of audit logs collected
> from a job, such as a Hive or Spark job, covering various S3 requests like
> list, head, get and put requests.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
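The issue description above is about one core step: collecting every per-request audit log file under a directory into a single file. A minimal local-filesystem sketch of that merge step; it uses `java.nio` as a stand-in for the Hadoop `FileSystem` API (`listFiles`/`open`/`create`) that the real `S3AAuditLogMergerAndParser` works against, and all file and class names here are illustrative:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Illustrative sketch: concatenate all log files in a directory into one file. */
public class AuditLogMergeSketch {

  /** Append the lines of every regular file under dir into dest, in sorted file order. */
  static void mergeLogs(Path dir, Path dest) throws IOException {
    try (Stream<Path> files = Files.list(dir)) {
      List<String> merged = files
          .filter(Files::isRegularFile)
          .sorted()                              // deterministic merge order
          .flatMap(f -> {
            try {
              return Files.lines(f);             // closed by flatMap after consumption
            } catch (IOException e) {
              throw new UncheckedIOException(e);
            }
          })
          .collect(Collectors.toList());
      Files.write(dest, merged);
    }
  }

  public static void main(String[] args) throws IOException {
    // Build a tiny directory of "audit log" files, then merge them.
    Path dir = Files.createTempDirectory("auditlogs");
    Files.write(dir.resolve("log1.txt"), List.of("entry-a", "entry-b"));
    Files.write(dir.resolve("log2.txt"), List.of("entry-c"));

    Path merged = Files.createTempFile("merged", ".log");
    mergeLogs(dir, merged);
    System.out.println(Files.readAllLines(merged));  // [entry-a, entry-b, entry-c]
  }
}
```

In the real tool the same shape applies, with a `RemoteIterator<LocatedFileStatus>` listing replacing `Files.list` and `FSDataInputStream`/`FSDataOutputStream` replacing the local reads and writes; the parsing stage then runs the merged file through the named-group regexp.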
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=790754&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790754 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 14/Jul/22 06:27
            Start Date: 14/Jul/22 06:27
    Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r920794339


## hadoop-tools/hadoop-aws/pom.xml:

## @@ -438,6 +438,23 @@
+      <plugin>
+        <groupId>org.apache.avro</groupId>
+        <artifactId>avro-maven-plugin</artifactId>
+        <executions>
+          <execution>
+            <id>generate-avro-sources</id>
+            <phase>generate-sources</phase>
+            <goals>
+              <goal>schema</goal>
+            </goals>
+            <configuration>
+              <sourceDirectory>../../hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/avro</sourceDirectory>
+            </configuration>
+          </execution>
+        </executions>
+      </plugin>

Review Comment:
   changed to src/main

Issue Time Tracking
-------------------
    Worklog Id:     (was: 790754)
    Time Spent: 8h 20m  (was: 8h 10m)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=790753&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790753 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
                Author: ASF GitHub Bot
            Created on: 14/Jul/22 06:26
            Start Date: 14/Jul/22 06:26
    Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r920793828

## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java:
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=790749&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790749 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 14/Jul/22 06:23 Start Date: 14/Jul/22 06:23 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r920791859 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java ##
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=790470&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790470 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 13/Jul/22 15:00 Start Date: 13/Jul/22 15:00 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r920187510 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java ##
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=790469&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-790469 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 13/Jul/22 14:59 Start Date: 13/Jul/22 14:59 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r920186659 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMergerAndParser.java ##
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=789137&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-789137 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 08/Jul/22 19:11 Start Date: 08/Jul/22 19:11 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4383: URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1179288323

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 38s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 0s | | xmllint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 3 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 39m 14s | | trunk passed |
| +1 :green_heart: | compile | 0m 54s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 0m 49s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 0m 52s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 1s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 48s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 0m 51s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 1m 34s | | trunk passed |
| +1 :green_heart: | shadedclient | 20m 45s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 21m 12s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 40s | | the patch passed |
| +1 :green_heart: | compile | 0m 44s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 0m 44s | | the patch passed |
| +1 :green_heart: | compile | 0m 37s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 0m 37s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 27s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 42s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 24s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 0m 31s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| -1 :x: | spotbugs | 1m 18s | [/new-spotbugs-hadoop-tools_hadoop-aws.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/13/artifact/out/new-spotbugs-hadoop-tools_hadoop-aws.html) | hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 20m 33s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 44s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 0m 51s | | The patch does not generate ASF License warnings. |
| | | | 97m 54s | | |

| Reason | Tests |
|-------:|:------|
| SpotBugs | module:hadoop-tools/hadoop-aws |
| | A known null value is checked to see if it is an instance of org.apache.avro.util.Utf8 in org.apache.hadoop.fs.s3a.audit.AvroDataRecord.customDecode(ResolvingDecoder) At AvroDataRecord.java:checked to see if it is an instance of org.apache.avro.util.Utf8 in org.apache.hadoop.fs.s3a.audit.AvroDataRecord.customDecode(ResolvingDecoder) At AvroDataRecord.java:[line 2392] |

| Subsystem | Report/Notes |
|----------:|:-------------|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/13/artifact/out/Dockerfile |
| GITHUB PR | https://github.com/apache/hadoop/pull/4383 |
| Optional Tests | dupname asflicense compile javac javadoc mvninstall mvnsite
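The SpotBugs -1 above flags a "known null value checked with instanceof" in the generated Avro decode path. In Java, `x instanceof T` is always false when `x` is null, so such a branch is dead code; the usual fix is to drop the check or test a reference that can actually be non-null. A minimal illustration (CharSequence stands in for `org.apache.avro.util.Utf8`, which is not on the classpath here):

```java
// Minimal illustration of the SpotBugs pattern reported above:
// `null instanceof T` can never be true, so the guarded branch is dead.
public class NullInstanceofDemo {
  public static void main(String[] args) {
    Object value = null;
    // Always false for a null reference: a branch guarded by this never runs.
    System.out.println(value instanceof CharSequence);  // false
    value = "some decoded string";
    // Only a non-null reference can satisfy instanceof.
    System.out.println(value instanceof CharSequence);  // true
  }
}
```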
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=788981&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-788981 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 08/Jul/22 13:20 Start Date: 08/Jul/22 13:20 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4383: URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1178982191

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|:--------|:-------:|:-------:|
| +0 :ok: | reexec | 0m 43s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | xmllint | 0m 1s | | xmllint was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 3 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +1 :green_heart: | mvninstall | 64m 35s | | trunk passed |
| +1 :green_heart: | compile | 1m 0s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 0m 55s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 0m 52s | | trunk passed |
| +1 :green_heart: | mvnsite | 1m 1s | | trunk passed |
| +1 :green_heart: | javadoc | 0m 48s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 0m 50s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 1m 34s | | trunk passed |
| +1 :green_heart: | shadedclient | 20m 41s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 21m 8s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +1 :green_heart: | mvninstall | 0m 39s | | the patch passed |
| +1 :green_heart: | compile | 0m 44s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 0m 44s | | the patch passed |
| +1 :green_heart: | compile | 0m 38s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 0m 38s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 0m 27s | | the patch passed |
| +1 :green_heart: | mvnsite | 0m 42s | | the patch passed |
| +1 :green_heart: | javadoc | 0m 25s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| -1 :x: | javadoc | 0m 32s | [/patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/12/artifact/out/patch-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | hadoop-aws in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| -1 :x: | spotbugs | 1m 17s | [/new-spotbugs-hadoop-tools_hadoop-aws.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/12/artifact/out/new-spotbugs-hadoop-tools_hadoop-aws.html) | hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 20m 15s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 2m 41s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 0m 51s | | The patch does not generate ASF License warnings. |
| | | | 123m 32s | | |

| Reason | Tests |
|-------:|:------|
| SpotBugs | module:hadoop-tools/hadoop-aws |
| | A known null value is checked to see if it is an instance of org.apache.avro.util.Utf8 in org.apache.hadoop.fs.s3a.audit.AvroDataRecord.customDecode(ResolvingDecoder) At AvroDataRecord.java:checked to see if it is an instance of org.apache.avro.util.Utf8 in org.apache.hadoop.fs.s3a.audit.AvroDataRecord.customDecode(ResolvingDecoder) At AvroDataRecord.java:[line 2392] |

| Subsystem | Report/Notes |
|----------:|:-------------|
|
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=788939&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-788939 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 08/Jul/22 11:16 Start Date: 08/Jul/22 11:16 Worklog Time Spent: 10m Work Description: sravanigadey commented on PR #4383: URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1178868414 pushed the changes using -f to resolve conflicts

Issue Time Tracking
---
Worklog Id: (was: 788939) Time Spent: 7h (was: 6h 50m)

> Merging of S3A Audit Logs
> -------------------------
>
> Key: HADOOP-18258
> URL: https://issues.apache.org/jira/browse/HADOOP-18258
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sravani Gadey
> Assignee: Sravani Gadey
> Priority: Major
> Labels: pull-request-available
> Time Spent: 7h
> Remaining Estimate: 0h
>
> Merging audit log files containing a huge number of audit logs collected from
> a job like a Hive or Spark job, containing various S3 requests like list,
> head, get and put requests.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=784156=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784156 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 23/Jun/22 11:35 Start Date: 23/Jun/22 11:35 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r904912305 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java: ## @@ -0,0 +1,308 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**
+ * AuditTool is a command-line interface whose function is to parse
+ * the merged audit log file and generate an Avro file.
+ */
+public class AuditTool extends Configured implements Tool, Closeable {

Review Comment: done

Issue Time Tracking
---
Worklog Id: (was: 784156) Time Spent: 6h 50m (was: 6h 40m)

> Merging of S3A Audit Logs
> -------------------------
>
> Key: HADOOP-18258
> URL: https://issues.apache.org/jira/browse/HADOOP-18258
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sravani Gadey
> Assignee: Sravani Gadey
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 50m
> Remaining Estimate: 0h
>
> Merging audit log files containing a huge number of audit logs collected from
> a job like a Hive or Spark job, containing various S3 requests like list,
> head, get and put requests.
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=784040&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-784040 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 23/Jun/22 05:59 Start Date: 23/Jun/22 05:59 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r904609878 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java: ## @@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.io.InputStreamReader;
+import java.io.PrintWriter;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Merge all the audit log files present in a directory
+ * into a single audit log file.
+ */
+public class S3AAuditLogMerger {
+
+  private static final Logger LOG =
+      LoggerFactory.getLogger(S3AAuditLogMerger.class);
+
+  /**
+   * Merge all the audit log files from a directory into a single audit log file.
+   * @param auditLogsDirectoryPath path where audit log files are present.
+   * @throws IOException on any failure.
+   */
+  public void mergeFiles(String auditLogsDirectoryPath) throws IOException {
+    File auditLogFilesDirectory = new File(auditLogsDirectoryPath);
+    String[] auditLogFileNames = auditLogFilesDirectory.list();
+
+    // Merging of audit log files present in a directory into a single audit log file
+    if (auditLogFileNames != null && auditLogFileNames.length != 0) {
+      File auditLogFile = new File("AuditLogFile");

Review Comment: Do you mean listing of files in the s3 path and iterating over that? Can you please brief it

Issue Time Tracking
---
Worklog Id: (was: 784040) Time Spent: 6h 40m (was: 6.5h)

> Merging of S3A Audit Logs
> -------------------------
>
> Key: HADOOP-18258
> URL: https://issues.apache.org/jira/browse/HADOOP-18258
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sravani Gadey
> Assignee: Sravani Gadey
> Priority: Major
> Labels: pull-request-available
> Time Spent: 6h 40m
> Remaining Estimate: 0h
>
> Merging audit log files containing a huge number of audit logs collected from
> a job like a Hive or Spark job, containing various S3 requests like list,
> head, get and put requests.
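The `mergeFiles()` flow quoted above (list every file in a directory, then append each one into a single output file) can be sketched with plain `java.nio`, independent of the Hadoop and S3 types under review; class and file names here are illustrative only:

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Illustrative local-filesystem analogue of the merge step: names are
// hypothetical, and the merged file is written outside the input directory
// so the listing never picks up its own output.
public class MergeFilesDemo {

  // List every regular file under dir (sorted for a deterministic order)
  // and append their lines, in order, into the single merged file.
  public static Path mergeFiles(Path dir, Path merged) throws IOException {
    List<Path> logs;
    try (Stream<Path> files = Files.list(dir)) {
      logs = files.filter(Files::isRegularFile)
                  .sorted()
                  .collect(Collectors.toList());
    }
    try (BufferedWriter out = Files.newBufferedWriter(merged)) {
      for (Path log : logs) {
        for (String line : Files.readAllLines(log)) {
          out.write(line);
          out.newLine();
        }
      }
    }
    return merged;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("audit-logs");
    Files.write(dir.resolve("a.log"), Arrays.asList("entry-1"));
    Files.write(dir.resolve("b.log"), Arrays.asList("entry-2"));
    Path merged = mergeFiles(dir, Files.createTempFile("merged", ".log"));
    System.out.println(Files.readAllLines(merged));  // [entry-1, entry-2]
  }
}
```

The reviewer's point about "listing of files in the s3 path" corresponds to replacing the local directory listing with a remote one, so files never need to exist on the local disk before merging.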
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783992&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783992 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 22:59
            Start Date: 22/Jun/22 22:59
    Worklog Time Spent: 10m
      Work Description: mukund-thakur commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r904350867


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########

@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**
+ * AuditTool is a Command Line Interface.
+ * i.e, it's functionality is to parse the merged audit log file.
+ * and generate avro file.
+ */
+public class AuditTool extends Configured implements Tool, Closeable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class);
+
+  private final String entryPoint = "s3audit";
+
+  private PrintWriter out;
+
+  // Exit codes
+  private static final int SUCCESS = EXIT_SUCCESS;
+  private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR;
+
+  /**
+   * Error String when the wrong FS is used for binding: {@value}.
+   **/
+  @VisibleForTesting
+  public static final String WRONG_FILESYSTEM = "Wrong filesystem for ";
+
+  private final String usage = entryPoint + " s3a://BUCKET\n";
+
+  public AuditTool() {
+  }
+
+  /**
+   * Tells us the usage of the AuditTool by commands.
+   *
+   * @return the string USAGE
+   */
+  public String getUsage() {
+    return usage;
+  }
+
+  /**
+   * This run method in AuditTool takes S3 bucket path.
+   * which contains audit log files from command line arguments.
+   * and merge the audit log files present in that path into single file in.
+   * local system.
+   *
+   * @param args command specific arguments.
+   * @return SUCCESS i.e, '0', which is an exit code.
+   * @throws Exception on any failure.
+   */
+  @Override
+  public int run(String[] args) throws Exception {
+    List<String> argv = new ArrayList<>(Arrays.asList(args));
+    if (argv.isEmpty()) {
+      errorln(getUsage());
+      throw invalidArgs("No bucket specified");
+    }
+    //Path of audit log files in s3 bucket
+    Path s3LogsPath = new Path(argv.get(0));
+
+    //Setting the file system
+    URI fsURI = toUri(String.valueOf(s3LogsPath));
+    S3AFileSystem s3AFileSystem =
+        bindFilesystem(FileSystem.newInstance(fsURI, getConf()));
+    RemoteIterator<LocatedFileStatus> listOfS3LogFiles =
+        s3AFileSystem.listFiles(s3LogsPath, true);
+
+    //Merging local audit files into a single file
+    File s3aLogsDirectory = new File(s3LogsPath.getName());
+    boolean s3aLogsDirectoryCreation = false;
+    if (!s3aLogsDirectory.exists()) {
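The companion merger/parser class in this PR builds its log-entry regexp from small named-group fragments (the `SIMPLE`, `DATETIME`, and `QUOTED` patterns and the `e(name, pattern)` helper shown earlier in the thread). A self-contained sketch of how those fragments combine and match — the three-field log line here is illustrative, not the full S3 server access log format:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AuditLogEntryParser {

  // Field patterns in the style of the parser above:
  // anything up to a space, [timestamp], quoted-or-"-".
  private static final String SIMPLE = "[^ ]*";
  private static final String DATETIME = "\\[(.*?)\\]";
  private static final String QUOTED = "(-|\"[^\"]*\")";

  /** Build one named-group entry of the regexp, as e(name, pattern) does. */
  private static String e(String name, String pattern) {
    return String.format("(?<%s>%s) ", name, pattern);
  }

  public static void main(String[] args) {
    // A hypothetical three-field log line: owner, bucket, timestamp.
    Pattern entryPattern = Pattern.compile(
        e("owner", SIMPLE) + e("bucket", SIMPLE) + e("timestamp", DATETIME));
    Matcher m = entryPattern.matcher(
        "79a5 example-bucket [17/Jul/2022:06:01:00 +0000] ");
    if (m.matches()) {
      // each field is retrieved by the group name given to e()
      System.out.println(m.group("owner"));
      System.out.println(m.group("bucket"));
      System.out.println(m.group("timestamp"));
    }
  }
}
```

Note that each `e()` entry appends a trailing space, so the concatenated pattern expects space-separated fields.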
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783822&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783822 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 11:40
            Start Date: 22/Jun/22 11:40
    Worklog Time Spent: 10m
      Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r903633874


##########
hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
##########

@@ -0,0 +1,182 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.nio.file.Files;
+import java.nio.file.Paths;
+
+import org.junit.After;
+import org.junit.Test;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import static org.junit.Assert.assertFalse;
+import static org.junit.Assert.assertTrue;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {

Review Comment:
   extended AbstractHadoopTestBase.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 783822)
    Time Spent: 6h 20m  (was: 6h 10m)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783821&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783821 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 11:38
            Start Date: 22/Jun/22 11:38
    Worklog Time Spent: 10m
      Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r903630927
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783820&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783820 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 11:37
            Start Date: 22/Jun/22 11:37
    Worklog Time Spent: 10m
      Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r903629695
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783819&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783819 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 11:35
            Start Date: 22/Jun/22 11:35
    Worklog Time Spent: 10m
      Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r903626816


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########

+    S3AFileSystem s3AFileSystem =
+        bindFilesystem(FileSystem.newInstance(fsURI, getConf()));

Review Comment:
   modified to ```FileSystem.get(fsURI, getConf())```

Issue Time Tracking
-------------------

    Worklog Id:     (was: 783819)
    Time Spent: 5h 50m  (was: 5h 40m)
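The review above swaps `FileSystem.newInstance()` for `FileSystem.get()`. The practical difference: `get()` returns a shared instance cached per filesystem URI (and user), so callers must not close it, while `newInstance()` always constructs a fresh filesystem that the caller owns and must close. A plain-Java sketch of that caching idea — the real cache lives inside Hadoop's `FileSystem` class, so the class and method names here are illustrative only:

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class FsCacheSketch {

  /** Stand-in for a filesystem client instance. */
  static class Fs {
    final URI uri;
    Fs(URI uri) { this.uri = uri; }
  }

  private static final Map<String, Fs> CACHE = new ConcurrentHashMap<>();

  /** Like FileSystem.get(): one shared instance per scheme + authority. */
  static Fs get(URI uri) {
    String key = uri.getScheme() + "://" + uri.getAuthority();
    return CACHE.computeIfAbsent(key, k -> new Fs(uri));
  }

  /** Like FileSystem.newInstance(): always a fresh, caller-owned instance. */
  static Fs newInstance(URI uri) {
    return new Fs(uri);
  }

  public static void main(String[] args) {
    URI bucket = URI.create("s3a://example-bucket/logs");
    // cached lookups return the same object; fresh instances never do
    System.out.println(get(bucket) == get(bucket));
    System.out.println(newInstance(bucket) == newInstance(bucket));
  }
}
```

This is why closing a `FileSystem.get()` result is dangerous: every other holder of the cached instance sees it closed too.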
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783818 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 11:32
            Start Date: 22/Jun/22 11:32
    Worklog Time Spent: 10m
      Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r903623994


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########

+    //Path of audit log files in s3 bucket

Review Comment:
   added space

Issue Time Tracking
-------------------

    Worklog Id:     (was: 783818)
    Time Spent: 5h 40m  (was: 5.5h)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783806&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783806 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 10:15
            Start Date: 22/Jun/22 10:15
    Worklog Time Spent: 10m
      Work Description: steveloughran commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r903522555


##########
hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
##########

+public class AuditTool extends Configured implements Tool, Closeable {

Review Comment:
   extend S3GuardTool and make a subcommand of it.
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783720&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783720 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 06:24
            Start Date: 22/Jun/22 06:24
    Worklog Time Spent: 10m
      Work Description: sravanigadey commented on PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1162694818

   Yeah, changed the javadocs and access modifiers.
   Yes, the s3guard tool's command is defined in hadoop-aws, in the shell script below:
   `hadoop-tools/hadoop-aws/src/main/shellprofile.d/hadoop-s3guard.sh`
   I think we can add the AuditTool in that shell script.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 783720)
    Time Spent: 5h 20m  (was: 5h 10m)
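Commands in that shellprofile are registered through the Hadoop shell framework's `hadoop_add_subcommand` plus `hadoop_subcommand_<name>` convention, which is how `hadoop-s3guard.sh` wires up `s3guard`. A sketch of what an audit entry could look like — the subcommand name, description, and the stub fallbacks (which only exist so the sketch runs outside the real Hadoop shell environment) are assumptions:

```shell
# Stubs standing in for functions normally provided by hadoop-functions.sh.
if ! declare -f hadoop_add_subcommand >/dev/null; then
  hadoop_add_subcommand() { echo "registered subcommand: $1"; }
fi
if ! declare -f hadoop_add_to_classpath_tools >/dev/null; then
  hadoop_add_to_classpath_tools() { :; }
fi

# Hypothetical addition to
# hadoop-tools/hadoop-aws/src/main/shellprofile.d/hadoop-s3guard.sh:
# register the command so it shows up in "hadoop" usage output.
hadoop_add_subcommand "s3audit" client "merge and parse S3A audit log files"

# map the subcommand to the tool's main class
function hadoop_subcommand_s3audit
{
  HADOOP_CLASSNAME=org.apache.hadoop.fs.s3a.audit.AuditTool
  hadoop_add_to_classpath_tools hadoop-aws
}

# demo invocation of the handler, as the hadoop launcher would do
hadoop_subcommand_s3audit
echo "${HADOOP_CLASSNAME}"
```

Keeping this in hadoop-aws, as suggested above, means the subcommand only exists when the hadoop-aws shellprofile is loaded.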
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783692&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783692 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jun/22 04:21
            Start Date: 22/Jun/22 04:21
    Worklog Time Spent: 10m
      Work Description: mehakmeet commented on PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1162621569

   A few small nits here:
   - You have added "." at the end of every line in the javadocs; it should only be at the end of a sentence or the end of the javadoc, not each line.
   - Some of the methods in AuditTool have the wrong access modifiers; if you're not using a method outside this class, keep it private.
   - I have one doubt about putting the AuditTool in the Hadoop shell script. Since, at the moment, this is a hadoop-aws-specific tool, I'm not sure defining the command in hadoop-common is the right move. Can you check where the s3guard tool's command is defined in hadoop-aws? That might be a more appropriate place for hadoop-aws-specific shell commands.

   After these, this LGTM.
   CC: @mukund-thakur @steveloughran to review and merge.

Issue Time Tracking
-------------------

    Worklog Id:     (was: 783692)
    Time Spent: 5h 10m  (was: 5h)
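On the javadoc nit above: a period belongs at the end of a sentence, not at the end of every source line, and helpers used only within a class should stay `private`. A small sketch applying both points (the method and its body are illustrative, not code from the PR):

```java
public class JavadocStyleExample {

  /**
   * Parse the merged audit log file and count its entries.
   * Note that the first sentence ends at its period, while line
   * breaks inside a sentence carry no punctuation of their own.
   *
   * @param mergedLog contents of the merged log
   * @return number of log lines
   */
  private static int countEntries(String mergedLog) {
    // used only inside this class, so it is private
    return mergedLog.isEmpty() ? 0 : mergedLog.split("\n").length;
  }

  public static void main(String[] args) {
    System.out.println(countEntries("get /a\nput /b\n"));
  }
}
```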
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783531=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783531 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 21/Jun/22 19:03 Start Date: 21/Jun/22 19:03 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4383: URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1162204153 :confetti_ball: **+1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 52s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 0s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. | | +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 27s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 27m 31s | | trunk passed | | +1 :green_heart: | compile | 25m 8s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 | | +1 :green_heart: | compile | 21m 45s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 4m 29s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 10s | | trunk passed | | +1 :green_heart: | javadoc | 2m 25s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 | | +1 :green_heart: | javadoc | 2m 5s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 4m 36s | | trunk passed | | +1 :green_heart: | shadedclient | 24m 36s | | branch has no errors when building and testing our client artifacts. 
| | -0 :warning: | patch | 25m 4s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 24s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 42s | | the patch passed | | +1 :green_heart: | compile | 24m 16s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 | | +1 :green_heart: | javac | 24m 16s | | the patch passed | | +1 :green_heart: | compile | 21m 50s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 21m 50s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 4m 23s | | the patch passed | | +1 :green_heart: | mvnsite | 3m 9s | | the patch passed | | +1 :green_heart: | shellcheck | 0m 8s | | No new issues. | | +1 :green_heart: | javadoc | 2m 16s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 | | +1 :green_heart: | javadoc | 2m 5s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 4m 48s | | the patch passed | | +1 :green_heart: | shadedclient | 25m 51s | | patch has no errors when building and testing our client artifacts. | _ Other Tests _ | | +1 :green_heart: | unit | 19m 39s | | hadoop-common in the patch passed. | | +1 :green_heart: | unit | 3m 1s | | hadoop-aws in the patch passed. | | +1 :green_heart: | asflicense | 1m 18s | | The patch does not generate ASF License warnings. 
| | | | 250m 15s | | | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/10/artifact/out/Dockerfile | | GITHUB PR | https://github.com/apache/hadoop/pull/4383 | | Optional Tests | dupname asflicense mvnsite unit codespell detsecrets shellcheck shelldocs compile javac javadoc mvninstall shadedclient spotbugs checkstyle | | uname | Linux bf36a51f2952 4.15.0-175-generic #184-Ubuntu SMP Thu Mar 24 17:48:36 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | dev-support/bin/hadoop.sh | | git revision | trunk / 305a5800b81a056c2431f9e5c606419e29cbac92 | | Default Java | Private
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783261=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783261 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 21/Jun/22 08:43 Start Date: 21/Jun/22 08:43 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r902335968 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java: ## @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.BufferedReader; +import java.io.File; +import java.io.FileReader; +import java.io.IOException; +import java.io.PrintWriter; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Merger class will merge all the audit logs present in a directory of + * multiple audit log files into a single audit log file. 
+ */ +public class S3AAuditLogMerger { + + private final Logger logger = + LoggerFactory.getLogger(S3AAuditLogMerger.class); + + public void mergeFiles(String auditLogsDirectoryPath) throws IOException { +File auditLogFilesDirectory = new File(auditLogsDirectoryPath); +String[] auditLogFileNames = auditLogFilesDirectory.list(); + +//Read each audit log file present in directory and writes each and every audit log in it +//into a single audit log file +if (auditLogFileNames != null && auditLogFileNames.length != 0) { + File auditLogFile = new File("AuditLogFile"); Review Comment: If the file already exists, its contents will be overwritten. Issue Time Tracking --- Worklog Id: (was: 783261) Time Spent: 4h 50m (was: 4h 40m)
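The merge step under review — read every audit log file in a directory and write their lines into a single output file, overwriting any existing target — can be sketched in plain Java. This is an illustrative stand-in, not the patch's actual API: the class name `AuditLogMergeSketch`, the method `mergeLogs`, and the use of `java.nio` instead of Hadoop's `FileSystem` are all assumptions of this sketch.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/**
 * Minimal sketch of the merge loop under review: read each audit log
 * file in a directory and append its lines to a single output file.
 */
public class AuditLogMergeSketch {

  static Path mergeLogs(Path logDir, Path target) throws IOException {
    // CREATE + TRUNCATE_EXISTING mirrors the behaviour discussed in
    // the review thread: an existing target file is overwritten.
    try (BufferedWriter out = Files.newBufferedWriter(target,
        StandardOpenOption.CREATE, StandardOpenOption.TRUNCATE_EXISTING);
        Stream<Path> files = Files.list(logDir)) {
      // Sort for a deterministic merge order.
      List<Path> logs = files.sorted().collect(Collectors.toList());
      for (Path log : logs) {
        for (String line : Files.readAllLines(log)) {
          out.write(line);
          out.newLine();
        }
      }
    }
    return target;
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("audit-logs");
    Files.writeString(dir.resolve("a.log"), "GET /bucket/key1\n");
    Files.writeString(dir.resolve("b.log"), "PUT /bucket/key2\n");
    // Keep the target outside logDir so the merger never reads its own output.
    Path merged = mergeLogs(dir, Files.createTempFile("merged", ".log"));
    System.out.println(Files.readAllLines(merged).size()); // prints 2
  }
}
```

Note the target file lives outside the source directory; if it were created inside `logDir`, the directory listing would pick it up and the merger would read its own (partial) output.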
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783260=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783260 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 21/Jun/22 08:42 Start Date: 21/Jun/22 08:42 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r902335062 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java: ## @@ -0,0 +1,334 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.Closeable; +import java.io.EOFException; +import java.io.File; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.io.FileUtils; +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.conf.Configured; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FilterFileSystem; +import org.apache.hadoop.fs.LocatedFileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.RemoteIterator; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.apache.hadoop.util.ExitUtil; +import org.apache.hadoop.util.Tool; +import org.apache.hadoop.util.ToolRunner; + +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS; + +/**. + * AuditTool is a Command Line Interface to manage S3 Auditing. + * i.e, it is a functionality which directly takes s3 path of audit log files + * and merge all those into single audit log file + */ +public class AuditTool extends Configured implements Tool, Closeable { + + private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class); + + private final String entryPoint = "s3audit"; + + private PrintWriter out; + + // Exit codes + private static final int SUCCESS = EXIT_SUCCESS; + private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR; + + /** + * Error String when the wrong FS is used for binding: {@value}. 
+ **/ + @VisibleForTesting + public static final String WRONG_FILESYSTEM = "Wrong filesystem for "; + + private final String usage = entryPoint + " s3a://BUCKET\n"; + + private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory"); + + public AuditTool() { + } + + /** + * tells us the usage of the AuditTool by commands. + * + * @return the string USAGE + */ + public String getUsage() { +return usage; + } + + /** + * this run method in AuditTool takes S3 bucket path. + * which contains audit log files from command line arguments + * and merge the audit log files present in that path into single file in local system + * + * @param args command specific arguments. + * @return SUCCESS i.e, '0', which is an exit code + * @throws Exception on any failure + */ + @Override + public int run(String[] args) throws Exception { +List<String> argv = new ArrayList<>(Arrays.asList(args)); +println("argv: %s", argv); +if (argv.isEmpty()) { + errorln(getUsage()); + throw invalidArgs("No bucket specified"); +} +//path of audit log files in s3 bucket +Path s3LogsPath = new Path(argv.get(0)); + +//setting the file system +URI fsURI = toUri(String.valueOf(s3LogsPath)); +S3AFileSystem s3AFileSystem = +bindFilesystem(FileSystem.newInstance(fsURI, getConf())); +RemoteIterator<LocatedFileStatus> listOfS3LogFiles = +s3AFileSystem.listFiles(s3LogsPath, true); + +//creating local audit log files directory and
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783206=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783206 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 21/Jun/22 05:19 Start Date: 21/Jun/22 05:19 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r902171306 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java: ## @@ -0,0 +1,334 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.Closeable; +import java.io.EOFException; +import java.io.File; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.io.FileUtils; +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.conf.Configured; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FilterFileSystem; +import org.apache.hadoop.fs.LocatedFileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.RemoteIterator; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.apache.hadoop.util.ExitUtil; +import org.apache.hadoop.util.Tool; +import org.apache.hadoop.util.ToolRunner; + +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS; + +/**. + * AuditTool is a Command Line Interface to manage S3 Auditing. + * i.e, it is a functionality which directly takes s3 path of audit log files + * and merge all those into single audit log file + */ +public class AuditTool extends Configured implements Tool, Closeable { + + private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class); + + private final String entryPoint = "s3audit"; + + private PrintWriter out; + + // Exit codes + private static final int SUCCESS = EXIT_SUCCESS; + private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR; + + /** + * Error String when the wrong FS is used for binding: {@value}. 
+ **/ + @VisibleForTesting + public static final String WRONG_FILESYSTEM = "Wrong filesystem for "; + + private final String usage = entryPoint + " s3a://BUCKET\n"; + + private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory"); + + public AuditTool() { + } + + /** + * tells us the usage of the AuditTool by commands. + * + * @return the string USAGE + */ + public String getUsage() { +return usage; + } + + /** + * this run method in AuditTool takes S3 bucket path. + * which contains audit log files from command line arguments + * and merge the audit log files present in that path into single file in local system + * + * @param args command specific arguments. + * @return SUCCESS i.e, '0', which is an exit code + * @throws Exception on any failure + */ + @Override + public int run(String[] args) throws Exception { +List<String> argv = new ArrayList<>(Arrays.asList(args)); +println("argv: {}" , argv); Review Comment: removed Issue Time Tracking --- Worklog Id: (was: 783206) Time Spent: 4.5h (was: 4h 20m)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783205=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783205 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 21/Jun/22 05:18 Start Date: 21/Jun/22 05:18 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r902171219 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java: ## @@ -0,0 +1,334 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.Closeable; +import java.io.EOFException; +import java.io.File; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.io.FileUtils; +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.conf.Configured; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FilterFileSystem; +import org.apache.hadoop.fs.LocatedFileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.RemoteIterator; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.apache.hadoop.util.ExitUtil; +import org.apache.hadoop.util.Tool; +import org.apache.hadoop.util.ToolRunner; + +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS; + +/**. + * AuditTool is a Command Line Interface to manage S3 Auditing. 
+ * i.e, it is a functionality which directly takes s3 path of audit log files + * and merge all those into single audit log file Review Comment: modified javadocs Issue Time Tracking --- Worklog Id: (was: 783205) Time Spent: 4h 20m (was: 4h 10m)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783136=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783136 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 20/Jun/22 22:15 Start Date: 20/Jun/22 22:15 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4383: URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1160871955 :broken_heart: **-1 overall** | Vote | Subsystem | Runtime | Logfile | Comment | |::|--:|:|::|:---:| | +0 :ok: | reexec | 0m 38s | | Docker mode activated. | _ Prechecks _ | | +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. | | +0 :ok: | codespell | 0m 1s | | codespell was not available. | | +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. | | +0 :ok: | shelldocs | 0m 1s | | Shelldocs was not available. | | +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. | | +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. | _ trunk Compile Tests _ | | +0 :ok: | mvndep | 14m 51s | | Maven dependency ordering for branch | | +1 :green_heart: | mvninstall | 24m 43s | | trunk passed | | +1 :green_heart: | compile | 22m 47s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 | | +1 :green_heart: | compile | 20m 26s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | checkstyle | 4m 17s | | trunk passed | | +1 :green_heart: | mvnsite | 3m 42s | | trunk passed | | +1 :green_heart: | javadoc | 2m 56s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 | | +1 :green_heart: | javadoc | 2m 40s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 5m 2s | | trunk passed | | +1 :green_heart: | shadedclient | 21m 59s | | branch has no errors when building and testing our client artifacts. 
| | -0 :warning: | patch | 22m 29s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. | _ Patch Compile Tests _ | | +0 :ok: | mvndep | 0m 32s | | Maven dependency ordering for patch | | +1 :green_heart: | mvninstall | 1m 48s | | the patch passed | | +1 :green_heart: | compile | 23m 37s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 | | +1 :green_heart: | javac | 23m 37s | | the patch passed | | +1 :green_heart: | compile | 21m 34s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | javac | 21m 34s | | the patch passed | | +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. | | +1 :green_heart: | checkstyle | 4m 3s | | the patch passed | | +1 :green_heart: | mvnsite | 3m 44s | | the patch passed | | +1 :green_heart: | shellcheck | 0m 8s | | No new issues. | | +1 :green_heart: | javadoc | 2m 39s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 | | +1 :green_heart: | javadoc | 2m 36s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 | | +1 :green_heart: | spotbugs | 6m 27s | | the patch passed | | -1 :x: | shadedclient | 30m 46s | | patch has errors when building and testing our client artifacts. | _ Other Tests _ | | -1 :x: | unit | 28m 50s | [/patch-unit-hadoop-common-project_hadoop-common.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/9/artifact/out/patch-unit-hadoop-common-project_hadoop-common.txt) | hadoop-common in the patch failed. | | -1 :x: | unit | 1m 14s | [/patch-unit-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/9/artifact/out/patch-unit-hadoop-tools_hadoop-aws.txt) | hadoop-aws in the patch failed. | | +0 :ok: | asflicense | 1m 12s | | ASF License check generated no output? 
| | | | 257m 24s | | | | Reason | Tests | |---:|:--| | Failed junit tests | hadoop.ipc.TestIPCServerResponder | | | hadoop.ipc.TestIPC | | | hadoop.metrics2.source.TestJvmMetrics | | Subsystem | Report/Notes | |--:|:-| | Docker | ClientAPI=1.41 ServerAPI=1.41 base: https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/9/artifact/out/Dockerfile | | GITHUB PR |
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783082=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783082 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 20/Jun/22 17:00 Start Date: 20/Jun/22 17:00 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r901861689 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java: ## @@ -0,0 +1,334 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.Closeable; +import java.io.EOFException; +import java.io.File; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.io.FileUtils; +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.conf.Configured; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FilterFileSystem; +import org.apache.hadoop.fs.LocatedFileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.RemoteIterator; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.apache.hadoop.util.ExitUtil; +import org.apache.hadoop.util.Tool; +import org.apache.hadoop.util.ToolRunner; + +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS; + +/**. + * AuditTool is a Command Line Interface to manage S3 Auditing. + * i.e, it is a functionality which directly takes s3 path of audit log files + * and merge all those into single audit log file + */ +public class AuditTool extends Configured implements Tool, Closeable { + + private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class); + + private final String entryPoint = "s3audit"; + + private PrintWriter out; + + // Exit codes + private static final int SUCCESS = EXIT_SUCCESS; + private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR; + + /** + * Error String when the wrong FS is used for binding: {@value}. 
+ **/ + @VisibleForTesting + public static final String WRONG_FILESYSTEM = "Wrong filesystem for "; + + private final String usage = entryPoint + " s3a://BUCKET\n"; + + private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory"); + + public AuditTool() { + } + + /** + * tells us the usage of the AuditTool by commands. + * + * @return the string USAGE + */ + public String getUsage() { +return usage; + } + + /** + * this run method in AuditTool takes S3 bucket path. + * which contains audit log files from command line arguments + * and merge the audit log files present in that path into single file in local system + * + * @param args command specific arguments. + * @return SUCCESS i.e, '0', which is an exit code + * @throws Exception on any failure + */ + @Override + public int run(String[] args) throws Exception { +List<String> argv = new ArrayList<>(Arrays.asList(args)); +println("argv: %s", argv); +if (argv.isEmpty()) { + errorln(getUsage()); + throw invalidArgs("No bucket specified"); +} +//path of audit log files in s3 bucket +Path s3LogsPath = new Path(argv.get(0)); + +//setting the file system +URI fsURI = toUri(String.valueOf(s3LogsPath)); +S3AFileSystem s3AFileSystem = +bindFilesystem(FileSystem.newInstance(fsURI, getConf())); +RemoteIterator<LocatedFileStatus> listOfS3LogFiles = +s3AFileSystem.listFiles(s3LogsPath, true); + +//creating local audit log files directory and
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783081=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783081 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 20/Jun/22 16:50 Start Date: 20/Jun/22 16:50 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r901855926 ## hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java: ## @@ -0,0 +1,131 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.File; +import java.io.FileWriter; +import java.io.IOException; +import java.nio.file.Files; +import java.nio.file.Paths; + +import org.junit.After; +import org.junit.Before; +import org.junit.Test; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import static org.junit.Assert.assertFalse; +import static org.junit.Assert.assertTrue; + +/** + * MergerTest will implement different tests on Merger class methods. 
+ */
+public class TestS3AAuditLogMerger {
+
+  private final Logger logger = LoggerFactory.getLogger(TestS3AAuditLogMerger.class);
+
+  private final S3AAuditLogMerger s3AAuditLogMerger = new S3AAuditLogMerger();
+
+  /**
+   * Sample directories and files to test.
+   */
+  private final File auditLogFile = new File("AuditLogFile");
+  private final File sampleDirectory = new File("sampleFilesDirectory");
+  private final File emptyDirectory = new File("emptyFilesDirectory");
+  private final File firstSampleFile =
+      new File("sampleFilesDirectory", "sampleFile1.txt");
+  private final File secondSampleFile =
+      new File("sampleFilesDirectory", "sampleFile2.txt");
+  private final File thirdSampleFile =
+      new File("sampleFilesDirectory", "sampleFile3.txt");

Review Comment: yeah, I'll try it that way then.

Issue Time Tracking
---
Worklog Id: (was: 783081)
Time Spent: 3h 50m (was: 3h 40m)

> Merging of S3A Audit Logs
> -
>
> Key: HADOOP-18258
> URL: https://issues.apache.org/jira/browse/HADOOP-18258
> Project: Hadoop Common
> Issue Type: Sub-task
> Components: fs/s3
> Reporter: Sravani Gadey
> Assignee: Sravani Gadey
> Priority: Major
> Labels: pull-request-available
> Time Spent: 3h 50m
> Remaining Estimate: 0h
>
> Merging audit log files containing a huge number of audit logs collected
> from a job, such as a Hive or Spark job, containing various S3 requests
> like list, head, get and put requests.

--
This message was sent by Atlassian Jira (v8.20.7#820007)
-
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
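The fixture style is the subject of this thread: the quoted test hard-codes `sampleFilesDirectory` and friends in the working directory, and the reviewer's suggestion (which the "I'll try it that way" reply accepts) is to move to throwaway directories. A minimal sketch of that approach using `java.nio.file` temporary directories — class and method names here are hypothetical, not the PR's code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

public class TempDirFixtureExample {

  /** Create a scratch directory holding three sample "audit log" files. */
  static Path createSampleDir() throws IOException {
    Path dir = Files.createTempDirectory("sampleFilesDirectory");
    Files.write(dir.resolve("sampleFile1.txt"), "abcd".getBytes());
    Files.write(dir.resolve("sampleFile2.txt"), "efgh".getBytes());
    Files.write(dir.resolve("sampleFile3.txt"), "ijkl".getBytes());
    return dir;
  }

  /** Recursively delete the scratch directory, deepest entries first. */
  static void deleteDir(Path dir) throws IOException {
    try (Stream<Path> walk = Files.walk(dir)) {
      walk.sorted(Comparator.reverseOrder())
          .forEach(p -> p.toFile().delete());
    }
  }

  public static void main(String[] args) throws IOException {
    Path dir = createSampleDir();
    try (Stream<Path> files = Files.list(dir)) {
      System.out.println(files.count()); // 3
    }
    deleteDir(dir);
    System.out.println(Files.exists(dir)); // false
  }
}
```

Because every run gets a fresh directory, tests neither collide with each other nor leave artifacts behind, which is the point of the review comment. JUnit 4's `TemporaryFolder` rule wraps the same idea with automatic cleanup.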
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783073&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783073 ]

ASF GitHub Bot logged work on HADOOP-18258:
---
Author: ASF GitHub Bot
Created on: 20/Jun/22 16:40
Start Date: 20/Jun/22 16:40
Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r901850160

## hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
## @@ -0,0 +1,131 @@

+package org.apache.hadoop.fs.s3a.audit;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  /**
+   * Creates the sample directories and files before each test.
+   *
+   * @throws IOException on failure
+   */
+  @Before
+  public void setUp() throws IOException {
+    boolean sampleDirCreation = sampleDirectory.mkdir();
+    boolean emptyDirCreation = emptyDirectory.mkdir();
+    if (sampleDirCreation && emptyDirCreation) {
+      try (FileWriter fw = new FileWriter(firstSampleFile);
+           FileWriter fw1 = new FileWriter(secondSampleFile);
+           FileWriter fw2 = new FileWriter(thirdSampleFile)) {
+        fw.write("abcd");
+        fw1.write("efgh");
+        fw2.write("ijkl");
+      }
+    }
+  }
+
+  /**
+   * mergeFilesTest() will test the mergeFiles() method in Merger class
+   * by passing a sample directory which contains files with some content
+   * and checks whether the files in the directory are merged into a single file.
+   *
+   * @throws IOException on any failure
+   */
+  @Test
+  public void mergeFilesTest() throws IOException {
+    s3AAuditLogMerger.mergeFiles(sampleDirectory.getPath());
+    String str =
+        new String(Files.readAllBytes(Paths.get(auditLogFile.getPath())));
+    String fileText = str.replace("\n", "");

Review Comment: added comment

Issue Time Tracking
---
Worklog Id: (was: 783073)
Time Spent: 3h 40m (was: 3.5h)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783071&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783071 ]

ASF GitHub Bot logged work on HADOOP-18258:
---
Author: ASF GitHub Bot
Created on: 20/Jun/22 16:34
Start Date: 20/Jun/22 16:34
Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r901846695

## hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
## @@ -0,0 +1,131 @@

+package org.apache.hadoop.fs.s3a.audit;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  /**
+   * mergeFilesTest() will test the mergeFiles() method in Merger class
+   * by passing a sample directory which contains files with some content
+   * and checks whether the files in the directory are merged into a single file.
+   *
+   * @throws IOException on any failure
+   */
+  @Test
+  public void mergeFilesTest() throws IOException {
+    s3AAuditLogMerger.mergeFiles(sampleDirectory.getPath());
+    String str =
+        new String(Files.readAllBytes(Paths.get(auditLogFile.getPath())));
+    String fileText = str.replace("\n", "");
+    assertTrue("the string 'abcd' should be in the merged file",
+        fileText.contains("abcd"));
+    assertTrue("the string 'efgh' should be in the merged file",
+        fileText.contains("efgh"));
+    assertTrue("the string 'ijkl' should be in the merged file",
+        fileText.contains("ijkl"));
+  }
+
+  /**
+   * mergeFilesTestEmpty() will test mergeFiles()
+   * by passing an empty directory and checks whether a merged file is created.
+   *
+   * @throws IOException on any failure
+   */
+  @Test
+  public void mergeFilesTestEmpty() throws IOException {

Review Comment: changed the name of the test

Issue Time Tracking
---
Worklog Id: (was: 783071)
Time Spent: 3.5h (was: 3h 20m)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783070&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783070 ]

ASF GitHub Bot logged work on HADOOP-18258:
---
Author: ASF GitHub Bot
Created on: 20/Jun/22 16:33
Start Date: 20/Jun/22 16:33
Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r901846062

## hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
## @@ -0,0 +1,131 @@

+package org.apache.hadoop.fs.s3a.audit;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  /**
+   * mergeFilesTest() will test the mergeFiles() method in Merger class
+   * by passing a sample directory which contains files with some content
+   * and checks whether the files in the directory are merged into a single file.
+   *
+   * @throws IOException on any failure
+   */
+  @Test
+  public void mergeFilesTest() throws IOException {
+    s3AAuditLogMerger.mergeFiles(sampleDirectory.getPath());
+    String str =
+        new String(Files.readAllBytes(Paths.get(auditLogFile.getPath())));
+    String fileText = str.replace("\n", "");
+    assertTrue("the string 'abcd' should be in the merged file",
+        fileText.contains("abcd"));
+    assertTrue("the string 'efgh' should be in the merged file",
+        fileText.contains("efgh"));
+    assertTrue("the string 'ijkl' should be in the merged file",
+        fileText.contains("ijkl"));
+  }
+
+  /**
+   * mergeFilesTestEmpty() will test the mergeFiles().

Review Comment: removed the test name

Issue Time Tracking
---
Worklog Id: (was: 783070)
Time Spent: 3h 20m (was: 3h 10m)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783052&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783052 ]

ASF GitHub Bot logged work on HADOOP-18258:
---
Author: ASF GitHub Bot
Created on: 20/Jun/22 16:08
Start Date: 20/Jun/22 16:08
Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r901828806

## hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
## @@ -0,0 +1,131 @@

+package org.apache.hadoop.fs.s3a.audit;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  /**
+   * mergeFilesTest() will test the mergeFiles() method in Merger class
+   * by passing a sample directory which contains files with some content
+   * and checks whether the files in the directory are merged into a single file.
+   *
+   * @throws IOException on any failure

Review Comment: removed @throws

## hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
## @@ -0,0 +1,131 @@

+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.
The ASF licenses this file
+ * to you under the Apache License, Version 2.0.
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783051&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783051 ]

ASF GitHub Bot logged work on HADOOP-18258:
---
Author: ASF GitHub Bot
Created on: 20/Jun/22 16:07
Start Date: 20/Jun/22 16:07
Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r901827942

## hadoop-tools/hadoop-aws/src/test/java/org/apache/hadoop/fs/s3a/audit/TestS3AAuditLogMerger.java:
## @@ -0,0 +1,131 @@

+package org.apache.hadoop.fs.s3a.audit;
+
+/**
+ * MergerTest will implement different tests on Merger class methods.
+ */
+public class TestS3AAuditLogMerger {
+
+  private final Logger logger = LoggerFactory.getLogger(TestS3AAuditLogMerger.class);

Review Comment: done

Issue Time Tracking
---
Worklog Id: (was: 783051)
Time Spent: 3h (was: 2h 50m)
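Across these messages the quoted mergeFilesTest follows one flow: merge the files in a directory, read the merged file back, strip newlines, and assert that each source string survived the merge. That flow can be sketched independently of the PR's classes — every name below is hypothetical, and the merge step is a plain concatenation standing in for the real merger:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class MergeAssertionSketch {

  /** Concatenate the contents of every file in dir into target, in name order. */
  static void mergeFiles(Path dir, Path target) throws IOException {
    StringBuilder merged = new StringBuilder();
    try (Stream<Path> files = Files.list(dir)) {
      for (Path p : files.sorted().collect(Collectors.toList())) {
        merged.append(Files.readString(p)).append('\n');
      }
    }
    Files.writeString(target, merged.toString());
  }

  public static void main(String[] args) throws IOException {
    Path dir = Files.createTempDirectory("sampleFilesDirectory");
    Files.writeString(dir.resolve("sampleFile1.txt"), "abcd");
    Files.writeString(dir.resolve("sampleFile2.txt"), "efgh");
    // Keep the merged file outside dir so it is not swept up by the merge.
    Path merged = Files.createTempFile("AuditLogFile", ".log");
    mergeFiles(dir, merged);
    // Strip newlines before checking, as the quoted test does.
    String text = Files.readString(merged).replace("\n", "");
    System.out.println(text.contains("abcd") && text.contains("efgh")); // true
  }
}
```

Checking for substrings rather than exact equality keeps the assertion independent of the order in which the directory listing returns the files.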
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783048&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783048 ]

ASF GitHub Bot logged work on HADOOP-18258:
---
Author: ASF GitHub Bot
Created on: 20/Jun/22 16:05
Start Date: 20/Jun/22 16:05
Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r901825553

## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
## @@ -0,0 +1,334 @@

+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.
+ */
+
+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.Closeable;
+import java.io.EOFException;
+import java.io.File;
+import java.io.IOException;
+import java.io.PrintWriter;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.hadoop.classification.VisibleForTesting;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.conf.Configured;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.FileSystem;
+import org.apache.hadoop.fs.FilterFileSystem;
+import org.apache.hadoop.fs.LocatedFileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.RemoteIterator;
+import org.apache.hadoop.fs.s3a.S3AFileSystem;
+import org.apache.hadoop.util.ExitUtil;
+import org.apache.hadoop.util.Tool;
+import org.apache.hadoop.util.ToolRunner;
+
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE;
+import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS;
+
+/**
+ * AuditTool is a command-line interface to manage S3 auditing:
+ * it takes the S3 path of the audit log files
+ * and merges them all into a single audit log file.
+ */
+public class AuditTool extends Configured implements Tool, Closeable {
+
+  private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class);
+
+  private final String entryPoint = "s3audit";
+
+  private PrintWriter out;
+
+  // Exit codes
+  private static final int SUCCESS = EXIT_SUCCESS;
+  private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR;
+
+  /**
+   * Error String when the wrong FS is used for binding: {@value}.
+   */
+  @VisibleForTesting
+  public static final String WRONG_FILESYSTEM = "Wrong filesystem for ";
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783045&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783045 ]

ASF GitHub Bot logged work on HADOOP-18258:
---
Author: ASF GitHub Bot
Created on: 20/Jun/22 16:01
Start Date: 20/Jun/22 16:01
Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r901821623

## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java:
## @@ -0,0 +1,77 @@

+package org.apache.hadoop.fs.s3a.audit;
+
+import java.io.BufferedReader;
+import java.io.File;
+import java.io.FileReader;
+import java.io.IOException;
+import java.io.PrintWriter;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+/**
+ * Merger class will merge all the audit logs present in a directory of
+ * multiple audit log files into a single audit log file.
+ */
+public class S3AAuditLogMerger {
+
+  private final Logger logger =
+      LoggerFactory.getLogger(S3AAuditLogMerger.class);
+
+  public void mergeFiles(String auditLogsDirectoryPath) throws IOException {
+    File auditLogFilesDirectory = new File(auditLogsDirectoryPath);
+    String[] auditLogFileNames = auditLogFilesDirectory.list();
+
+    // Read each audit log file present in the directory and write every
+    // audit log in it into a single audit log file

Review Comment: changed the comment

Issue Time Tracking
---
Worklog Id: (was: 783045)
Time Spent: 2.5h (was: 2h 20m)

> Merging of S3A Audit Logs
> -
>
> Key: HADOOP-18258
> URL: https://issues.apache.org/jira/browse/HADOOP-18258
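The quoted mergeFiles body is cut off after listing the directory, but the file's imports (BufferedReader, FileReader, PrintWriter) point at a line-by-line copy loop. The following is a hedged completion under that assumption — a sketch with a hypothetical class name, not the PR's actual implementation:

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.PrintWriter;

public class AuditLogMergerSketch {

  /** Append every line of every file directly under dirPath into mergedFile. */
  static void mergeFiles(String dirPath, File mergedFile) throws IOException {
    File dir = new File(dirPath);
    String[] names = dir.list();
    if (names == null) {
      return; // path is not a directory, nothing to merge
    }
    try (PrintWriter out = new PrintWriter(new FileWriter(mergedFile))) {
      for (String name : names) {
        try (BufferedReader in =
                 new BufferedReader(new FileReader(new File(dir, name)))) {
          String line;
          while ((line = in.readLine()) != null) {
            out.println(line); // one audit log entry per line
          }
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    File dir = new File(System.getProperty("java.io.tmpdir"),
        "auditDemo" + System.nanoTime());
    dir.mkdir();
    try (PrintWriter w = new PrintWriter(new FileWriter(new File(dir, "log1.txt")))) {
      w.println("entry-one");
    }
    File merged = File.createTempFile("merged", ".log");
    mergeFiles(dir.getPath(), merged);
    System.out.println(merged.length() > 0); // true
  }
}
```

Reading line by line keeps memory flat regardless of how many log files the directory holds, which matters for the "huge number of audit logs" case the Jira issue describes.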
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783046&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783046 ]

ASF GitHub Bot logged work on HADOOP-18258:
---
Author: ASF GitHub Bot
Created on: 20/Jun/22 16:01
Start Date: 20/Jun/22 16:01
Worklog Time Spent: 10m

Work Description: sravanigadey commented on code in PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#discussion_r901821972

## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java:
## @@ -0,0 +1,334 @@

+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.
+ */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.Closeable; +import java.io.EOFException; +import java.io.File; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.io.FileUtils; +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.conf.Configured; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FilterFileSystem; +import org.apache.hadoop.fs.LocatedFileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.RemoteIterator; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.apache.hadoop.util.ExitUtil; +import org.apache.hadoop.util.Tool; +import org.apache.hadoop.util.ToolRunner; + +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS; + +/**. + * AuditTool is a Command Line Interface to manage S3 Auditing. + * i.e, it is a functionality which directly takes s3 path of audit log files + * and merge all those into single audit log file + */ +public class AuditTool extends Configured implements Tool, Closeable { + + private static final Logger LOG = LoggerFactory.getLogger(AuditTool.class); + + private final String entryPoint = "s3audit"; + + private PrintWriter out; + + // Exit codes + private static final int SUCCESS = EXIT_SUCCESS; + private static final int INVALID_ARGUMENT = EXIT_COMMAND_ARGUMENT_ERROR; + + /** + * Error String when the wrong FS is used for binding: {@value}. 
+ **/ + @VisibleForTesting + public static final String WRONG_FILESYSTEM = "Wrong filesystem for "; + + private final String usage = entryPoint + " s3a://BUCKET\n"; + + private final File s3aLogsDirectory = new File("S3AAuditLogsDirectory"); + + public AuditTool() { + } + + /** + * Returns the usage string of the AuditTool command. + * + * @return the string USAGE + */ + public String getUsage() { +return usage; + } + + /** + * The run method of AuditTool takes the S3 bucket path + * containing audit log files from the command line arguments + * and merges the audit log files present under that path into a single file on the local system. + * + * @param args command specific arguments. + * @return SUCCESS, i.e. the exit code 0 + * @throws Exception on any failure + */ + @Override + public int run(String[] args) throws Exception { +List<String> argv = new ArrayList<>(Arrays.asList(args)); +println("argv: %s", argv); +if (argv.isEmpty()) { + errorln(getUsage()); + throw invalidArgs("No bucket specified"); +} +//path of audit log files in s3 bucket +Path s3LogsPath = new Path(argv.get(0)); + +//setting the file system +URI fsURI = toUri(String.valueOf(s3LogsPath)); +S3AFileSystem s3AFileSystem = +bindFilesystem(FileSystem.newInstance(fsURI, getConf())); +RemoteIterator<LocatedFileStatus> listOfS3LogFiles = +s3AFileSystem.listFiles(s3LogsPath, true); + +//creating local audit log files directory and
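The run() method above rejects an empty argument list before binding the filesystem. A toy sketch of that argument check in isolation (the constants and class name here are illustrative; the real tool uses LauncherExitCodes and throws via invalidArgs rather than returning a code):

```java
import java.net.URI;
import java.net.URISyntaxException;

/**
 * Hypothetical sketch of the CLI argument validation in AuditTool.run():
 * require exactly one usable s3a:// URI.
 */
public class AuditToolArgsSketch {

  static final int SUCCESS = 0;
  // Stand-in for EXIT_COMMAND_ARGUMENT_ERROR in the real tool.
  static final int INVALID_ARGUMENT = 42;

  /** Return SUCCESS for a usable s3a:// argument, else INVALID_ARGUMENT. */
  static int validate(String[] args) {
    if (args.length == 0) {
      return INVALID_ARGUMENT; // mirrors "No bucket specified"
    }
    try {
      URI uri = new URI(args[0]);
      if (!"s3a".equals(uri.getScheme())) {
        return INVALID_ARGUMENT; // wrong filesystem scheme
      }
    } catch (URISyntaxException e) {
      return INVALID_ARGUMENT; // unparseable path
    }
    return SUCCESS;
  }

  public static void main(String[] args) {
    System.out.println(validate(new String[] {"s3a://BUCKET"}));
    // prints 0
  }
}
```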
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783044=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783044 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 20/Jun/22 15:59 Start Date: 20/Jun/22 15:59 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r901819850 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java: ## @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.BufferedReader; +import java.io.File; +import java.io.FileReader; +import java.io.IOException; +import java.io.PrintWriter; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Merger class will merge all the audit logs present in a directory of + * multiple audit log files into a single audit log file. 
+ */ +public class S3AAuditLogMerger { + + private final Logger logger = + LoggerFactory.getLogger(S3AAuditLogMerger.class); + + public void mergeFiles(String auditLogsDirectoryPath) throws IOException { Review Comment: added javadocs Issue Time Tracking --- Worklog Id: (was: 783044) Time Spent: 2h 20m (was: 2h 10m)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=783025=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-783025 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 20/Jun/22 15:26 Start Date: 20/Jun/22 15:26 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r901792603 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java: ## @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.BufferedReader; +import java.io.File; +import java.io.FileReader; +import java.io.IOException; +import java.io.PrintWriter; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Merger class will merge all the audit logs present in a directory of + * multiple audit log files into a single audit log file. 
+ */ +public class S3AAuditLogMerger { + + private final Logger logger = Review Comment: done Issue Time Tracking --- Worklog Id: (was: 783025) Time Spent: 2h 10m (was: 2h)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=782945=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782945 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 20/Jun/22 11:09 Start Date: 20/Jun/22 11:09 Worklog Time Spent: 10m Work Description: sravanigadey commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r901544959 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java: ## @@ -0,0 +1,334 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.Closeable; +import java.io.EOFException; +import java.io.File; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.io.FileUtils; +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.conf.Configured; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FilterFileSystem; +import org.apache.hadoop.fs.LocatedFileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.RemoteIterator; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.apache.hadoop.util.ExitUtil; +import org.apache.hadoop.util.Tool; +import org.apache.hadoop.util.ToolRunner; + +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS; + +/**. Review Comment: removed Issue Time Tracking --- Worklog Id: (was: 782945) Time Spent: 2h (was: 1h 50m)
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=782924=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782924 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 20/Jun/22 10:03 Start Date: 20/Jun/22 10:03 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4383: URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1160239434 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 0m 42s | | Docker mode activated. |

_ Prechecks _
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |

_ trunk Compile Tests _
| +0 :ok: | mvndep | 17m 38s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 24m 53s | | trunk passed |
| +1 :green_heart: | compile | 22m 54s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 20m 36s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 25s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 49s | | trunk passed |
| +1 :green_heart: | javadoc | 3m 1s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 34s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 4m 57s | | trunk passed |
| +1 :green_heart: | shadedclient | 23m 19s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 23m 50s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |

_ Patch Compile Tests _
| +0 :ok: | mvndep | 0m 31s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 46s | | the patch passed |
| +1 :green_heart: | compile | 22m 7s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 22m 7s | | the patch passed |
| +1 :green_heart: | compile | 20m 29s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 20m 29s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 18s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/8/artifact/out/results-checkstyle-root.txt) | root: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | mvnsite | 3m 39s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 8s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 50s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 43s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| -1 :x: | spotbugs | 2m 5s | [/new-spotbugs-hadoop-tools_hadoop-aws.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/8/artifact/out/new-spotbugs-hadoop-tools_hadoop-aws.html) | hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 22m 24s | | patch has no errors when building and testing our client artifacts. |

_ Other Tests _
| +1 :green_heart: | unit | 19m 15s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 4m 5s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 2m 6s | | The patch does not generate ASF License warnings. |
| | | 245m 47s | | |

| Reason | Tests |
|---:|:--|
| SpotBugs | module:hadoop-tools/hadoop-aws |
| | Found reliance on default encoding in org.apache.hadoop.fs.s3a.audit.S3AAuditLogMerger.mergeFiles(String):in org.apache.hadoop.fs.s3a.audit.S3AAuditLogMerger.mergeFiles(String): new java.io.FileReader(String) At
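The SpotBugs finding above is the `DM_DEFAULT_ENCODING` pattern: `new FileReader(...)` decodes bytes with whatever charset the JVM's platform default happens to be, so the same log file can parse differently on different hosts. A minimal sketch of the usual remedy (the class and method names here are illustrative, not the patch's code):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Sketch of the fix the SpotBugs finding asks for: replace
 * new FileReader(file), which uses the platform default encoding,
 * with a reader that names its charset explicitly.
 */
public class ExplicitCharsetReaderSketch {

  /** Count lines in a file, decoding as UTF-8 rather than the default. */
  static long countLines(Path file) throws IOException {
    try (BufferedReader reader =
             Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      return reader.lines().count();
    }
  }

  public static void main(String[] args) throws IOException {
    Path file = Files.createTempFile("audit", ".log");
    Files.write(file, java.util.List.of("one", "two"), StandardCharsets.UTF_8);
    System.out.println(countLines(file)); // prints 2
  }
}
```

On Java 8 and later, `Files.newBufferedReader(path, charset)` is the idiomatic one-liner replacement; `new InputStreamReader(new FileInputStream(file), StandardCharsets.UTF_8)` works the same way where a `File` is already in hand.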
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=782825=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782825 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 20/Jun/22 06:32 Start Date: 20/Jun/22 06:32 Worklog Time Spent: 10m Work Description: mehakmeet commented on code in PR #4383: URL: https://github.com/apache/hadoop/pull/4383#discussion_r899217349 ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/AuditTool.java: ## @@ -0,0 +1,334 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.Closeable; +import java.io.EOFException; +import java.io.File; +import java.io.IOException; +import java.io.PrintWriter; +import java.net.URI; +import java.net.URISyntaxException; +import java.util.ArrayList; +import java.util.Arrays; +import java.util.List; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import org.apache.commons.io.FileUtils; +import org.apache.hadoop.classification.VisibleForTesting; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.conf.Configured; +import org.apache.hadoop.fs.FSDataInputStream; +import org.apache.hadoop.fs.FileStatus; +import org.apache.hadoop.fs.FileSystem; +import org.apache.hadoop.fs.FilterFileSystem; +import org.apache.hadoop.fs.LocatedFileStatus; +import org.apache.hadoop.fs.Path; +import org.apache.hadoop.fs.RemoteIterator; +import org.apache.hadoop.fs.s3a.S3AFileSystem; +import org.apache.hadoop.util.ExitUtil; +import org.apache.hadoop.util.Tool; +import org.apache.hadoop.util.ToolRunner; + +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_COMMAND_ARGUMENT_ERROR; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SERVICE_UNAVAILABLE; +import static org.apache.hadoop.service.launcher.LauncherExitCodes.EXIT_SUCCESS; + +/**. + * AuditTool is a Command Line Interface to manage S3 Auditing. + * i.e, it is a functionality which directly takes s3 path of audit log files + * and merge all those into single audit log file Review Comment: This isn't the correct functionality, we only support merging in this patch, but our end goal is to parse the audit log into an avro file. ## hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/audit/S3AAuditLogMerger.java: ## @@ -0,0 +1,77 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. 
See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.hadoop.fs.s3a.audit; + +import java.io.BufferedReader; +import java.io.File; +import java.io.FileReader; +import java.io.IOException; +import java.io.PrintWriter; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +/** + * Merger class will merge all the audit logs present in a directory of + * multiple audit log files into a single audit log file. + */ +public class S3AAuditLogMerger { + + private final Logger logger = + LoggerFactory.getLogger(S3AAuditLogMerger.class); + + public void mergeFiles(String auditLogsDirectoryPath) throws IOException { +File auditLogFilesDirectory = new File(auditLogsDirectoryPath); +String[] auditLogFileNames = auditLogFilesDirectory.list(); + +//Read each audit log file present in directory and writes each and every audit log in it +
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=782740=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782740 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 19/Jun/22 22:24 Start Date: 19/Jun/22 22:24 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4383: URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1159821219 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 0m 49s | | Docker mode activated. |

_ Prechecks _
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 1s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |

_ trunk Compile Tests _
| +0 :ok: | mvndep | 21m 41s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 30s | | trunk passed |
| +1 :green_heart: | compile | 25m 3s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 21m 37s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 29s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 11s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 23s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 3s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 4m 37s | | trunk passed |
| +1 :green_heart: | shadedclient | 24m 38s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 25m 5s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |

_ Patch Compile Tests _
| +0 :ok: | mvndep | 0m 24s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 42s | | the patch passed |
| +1 :green_heart: | compile | 24m 17s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 24m 17s | | the patch passed |
| +1 :green_heart: | compile | 21m 34s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 21m 34s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 4m 17s | | the patch passed |
| +1 :green_heart: | mvnsite | 3m 8s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 9s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 16s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 4s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| -1 :x: | spotbugs | 1m 47s | [/new-spotbugs-hadoop-tools_hadoop-aws.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/7/artifact/out/new-spotbugs-hadoop-tools_hadoop-aws.html) | hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 24m 51s | | patch has no errors when building and testing our client artifacts. |

_ Other Tests _
| +1 :green_heart: | unit | 18m 38s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 54s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 1m 17s | | The patch does not generate ASF License warnings. |
| | | 254m 22s | | |

| Reason | Tests |
|---:|:--|
| SpotBugs | module:hadoop-tools/hadoop-aws |
| | Found reliance on default encoding in org.apache.hadoop.fs.s3a.audit.S3AAuditLogMerger.mergeFiles(String):in org.apache.hadoop.fs.s3a.audit.S3AAuditLogMerger.mergeFiles(String): new java.io.FileReader(File) At S3AAuditLogMerger.java:[line 54] |

| Subsystem | Report/Notes |
|--:|:-|
| Docker | ClientAPI=1.41 ServerAPI=1.41 base:
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=782379=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782379 ] ASF GitHub Bot logged work on HADOOP-18258: --- Author: ASF GitHub Bot Created on: 17/Jun/22 12:21 Start Date: 17/Jun/22 12:21 Worklog Time Spent: 10m Work Description: hadoop-yetus commented on PR #4383: URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1158817356 :broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|::|--:|:|::|:---:|
| +0 :ok: | reexec | 0m 48s | | Docker mode activated. |

_ Prechecks _
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |

_ trunk Compile Tests _
| +0 :ok: | mvndep | 15m 24s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 59s | | trunk passed |
| +1 :green_heart: | compile | 29m 29s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 21m 29s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 36s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 14s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 24s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 6s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 4m 42s | | trunk passed |
| +1 :green_heart: | shadedclient | 24m 38s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 25m 5s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |

_ Patch Compile Tests _
| +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 44s | | the patch passed |
| +1 :green_heart: | compile | 24m 14s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 24m 14s | | the patch passed |
| +1 :green_heart: | compile | 22m 32s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 22m 32s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 42s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/6/artifact/out/results-checkstyle-root.txt) | root: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | mvnsite | 3m 12s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 8s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 15s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 1m 59s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| -1 :x: | spotbugs | 1m 45s | [/new-spotbugs-hadoop-tools_hadoop-aws.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/6/artifact/out/new-spotbugs-hadoop-tools_hadoop-aws.html) | hadoop-tools/hadoop-aws generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 :green_heart: | shadedclient | 24m 45s | | patch has no errors when building and testing our client artifacts. |

_ Other Tests _
| +1 :green_heart: | unit | 19m 20s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 3m 8s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 1m 16s | | The patch does not generate ASF License warnings. |
| | | 255m 17s | | |

| Reason | Tests |
|---:|:--|
| SpotBugs | module:hadoop-tools/hadoop-aws |
| | Found reliance on default encoding in org.apache.hadoop.fs.s3a.audit.S3AAuditLogMerger.mergeFiles(String):in org.apache.hadoop.fs.s3a.audit.S3AAuditLogMerger.mergeFiles(String): new java.io.FileReader(File) At
[jira] [Work logged] (HADOOP-18258) Merging of S3A Audit Logs
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=782178&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782178 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 16/Jun/22 22:15
Start Date: 16/Jun/22 22:15
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1158185507

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 47s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 1s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 1s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 1s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 1s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 33s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 25m 2s | | trunk passed |
| +1 :green_heart: | compile | 23m 0s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 20m 35s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 25s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 46s | | trunk passed |
| +1 :green_heart: | javadoc | 3m 1s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 43s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 5m 3s | | trunk passed |
| +1 :green_heart: | shadedclient | 22m 8s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 22m 41s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch |
| -1 :x: | mvninstall | 0m 40s | [/patch-mvninstall-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/5/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt) | hadoop-aws in the patch failed. |
| +1 :green_heart: | compile | 22m 7s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 22m 7s | | the patch passed |
| -1 :x: | compile | 19m 36s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/5/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| -1 :x: | javac | 19m 36s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/5/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 :green_heart: | checkstyle | 4m 15s | | the patch passed |
| -1 :x: | mvnsite | 1m 22s | [/patch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/5/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt) | hadoop-aws in the patch failed. |
| +1 :green_heart: | shellcheck | 0m 8s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 54s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 43s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| -1 :x: | spotbugs | 1m 27s | [/patch-spotbugs-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/5/artifact/out/patch-spotbugs-hadoop-tools_hadoop-aws.txt) | hadoop-aws in the patch failed. |
| +1 :green_heart: | shadedclient | 23m 29s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=782102&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-782102 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 16/Jun/22 16:19
Start Date: 16/Jun/22 16:19
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1157867827

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 52s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 11s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 38s | | trunk passed |
| +1 :green_heart: | compile | 24m 57s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 21m 38s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 27s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 14s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 24s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 6s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 4m 39s | | trunk passed |
| +1 :green_heart: | shadedclient | 24m 20s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 24m 46s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 27s | | Maven dependency ordering for patch |
| -1 :x: | mvninstall | 0m 27s | [/patch-mvninstall-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/4/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt) | hadoop-aws in the patch failed. |
| +1 :green_heart: | compile | 24m 14s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 24m 14s | | the patch passed |
| -1 :x: | compile | 20m 35s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/4/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| -1 :x: | javac | 20m 35s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/4/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 20s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/4/artifact/out/results-checkstyle-root.txt) | root: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 :x: | mvnsite | 1m 2s | [/patch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/4/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt) | hadoop-aws in the patch failed. |
| +1 :green_heart: | shellcheck | 0m 8s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 17s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 5s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| -1 :x: | spotbugs | 0m 59s |
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=781818&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781818 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 15/Jun/22 20:00
Start Date: 15/Jun/22 20:00
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1156875753

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 57s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 0s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 16s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 50s | | trunk passed |
| +1 :green_heart: | compile | 25m 0s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 21m 40s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 32s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 17s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 25s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 7s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 4m 39s | | trunk passed |
| +1 :green_heart: | shadedclient | 24m 40s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 25m 7s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 26s | | Maven dependency ordering for patch |
| -1 :x: | mvninstall | 0m 25s | [/patch-mvninstall-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/3/artifact/out/patch-mvninstall-hadoop-tools_hadoop-aws.txt) | hadoop-aws in the patch failed. |
| +1 :green_heart: | compile | 24m 25s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 24m 25s | | the patch passed |
| -1 :x: | compile | 20m 36s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/3/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| -1 :x: | javac | 20m 36s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/3/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 24s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/3/artifact/out/results-checkstyle-root.txt) | root: The patch generated 229 new + 0 unchanged - 0 fixed = 229 total (was 0) |
| -1 :x: | mvnsite | 1m 2s | [/patch-mvnsite-hadoop-tools_hadoop-aws.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/3/artifact/out/patch-mvnsite-hadoop-tools_hadoop-aws.txt) | hadoop-aws in the patch failed. |
| +1 :green_heart: | shellcheck | 0m 9s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 17s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 7s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| -1 :x: | spotbugs | 1m 0s |
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=781366&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-781366 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 14/Jun/22 19:42
Start Date: 14/Jun/22 19:42
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1155640788

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 47s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +0 :ok: | shelldocs | 0m 1s | | Shelldocs was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 44s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 27m 38s | | trunk passed |
| +1 :green_heart: | compile | 24m 55s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | compile | 21m 40s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | checkstyle | 4m 28s | | trunk passed |
| +1 :green_heart: | mvnsite | 3m 14s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 24s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 2m 6s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 4m 36s | | trunk passed |
| +1 :green_heart: | shadedclient | 24m 28s | | branch has no errors when building and testing our client artifacts. |
| -0 :warning: | patch | 24m 55s | | Used diff version of patch file. Binary files and potentially other changes not applied. Please rebase and squash commits if necessary. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 26s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 43s | | the patch passed |
| +1 :green_heart: | compile | 24m 4s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javac | 24m 4s | | the patch passed |
| +1 :green_heart: | compile | 21m 33s | | the patch passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | javac | 21m 33s | | the patch passed |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 4m 18s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/2/artifact/out/results-checkstyle-root.txt) | root: The patch generated 219 new + 0 unchanged - 0 fixed = 219 total (was 0) |
| +1 :green_heart: | mvnsite | 3m 11s | | the patch passed |
| +1 :green_heart: | shellcheck | 0m 8s | | No new issues. |
| +1 :green_heart: | javadoc | 2m 17s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| -1 :x: | javadoc | 1m 1s | [/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/2/artifact/out/results-javadoc-javadoc-hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | hadoop-tools_hadoop-aws-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 generated 2 new + 38 unchanged - 0 fixed = 40 total (was 38) |
| -1 :x: | spotbugs | 1m 49s | [/new-spotbugs-hadoop-tools_hadoop-aws.html](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/2/artifact/out/new-spotbugs-hadoop-tools_hadoop-aws.html) | hadoop-tools/hadoop-aws generated 5 new + 0 unchanged - 0 fixed = 5 total (was 0) |
| +1 :green_heart: | shadedclient | 24m 43s | | patch has no errors when building and testing our client artifacts. |
|||| _ Other Tests _ |
| +1 :green_heart: | unit | 18m 46s | | hadoop-common in the patch passed. |
| +1 :green_heart: | unit | 2m 50s | | hadoop-aws in the patch passed. |
| +1 :green_heart: | asflicense | 1m 17s | |
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=776238&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776238 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 31/May/22 10:54
Start Date: 31/May/22 10:54
Worklog Time Spent: 10m

Work Description: hadoop-yetus commented on PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1141979806

:broken_heart: **-1 overall**

| Vote | Subsystem | Runtime | Logfile | Comment |
|:----:|----------:|--------:|:-------:|:-------:|
| +0 :ok: | reexec | 0m 41s | | Docker mode activated. |
|||| _ Prechecks _ |
| +1 :green_heart: | dupname | 0m 0s | | No case conflicting files found. |
| +0 :ok: | codespell | 0m 0s | | codespell was not available. |
| +0 :ok: | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 :green_heart: | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 :green_heart: | test4tests | 0m 0s | | The patch appears to include 1 new or modified test files. |
|||| _ trunk Compile Tests _ |
| +0 :ok: | mvndep | 14m 24s | | Maven dependency ordering for branch |
| +1 :green_heart: | mvninstall | 24m 56s | | trunk passed |
| -1 :x: | compile | 4m 8s | [/branch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/1/artifact/out/branch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) | root in trunk failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1. |
| -1 :x: | compile | 3m 38s | [/branch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/1/artifact/out/branch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in trunk failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| +1 :green_heart: | checkstyle | 3m 57s | | trunk passed |
| +1 :green_heart: | mvnsite | 2m 53s | | trunk passed |
| +1 :green_heart: | javadoc | 2m 3s | | trunk passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: | javadoc | 1m 45s | | trunk passed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07 |
| +1 :green_heart: | spotbugs | 4m 3s | | trunk passed |
| +1 :green_heart: | shadedclient | 20m 37s | | branch has no errors when building and testing our client artifacts. |
|||| _ Patch Compile Tests _ |
| +0 :ok: | mvndep | 0m 29s | | Maven dependency ordering for patch |
| +1 :green_heart: | mvninstall | 1m 48s | | the patch passed |
| -1 :x: | compile | 3m 51s | [/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/1/artifact/out/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) | root in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1. |
| -1 :x: | javac | 3m 51s | [/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/1/artifact/out/patch-compile-root-jdkPrivateBuild-11.0.15+10-Ubuntu-0ubuntu0.20.04.1.txt) | root in the patch failed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1. |
| -1 :x: | compile | 3m 21s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/1/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| -1 :x: | javac | 3m 21s | [/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/1/artifact/out/patch-compile-root-jdkPrivateBuild-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07.txt) | root in the patch failed with JDK Private Build-1.8.0_312-8u312-b07-0ubuntu1~20.04-b07. |
| +1 :green_heart: | blanks | 0m 0s | | The patch has no blanks issues. |
| -0 :warning: | checkstyle | 3m 34s | [/results-checkstyle-root.txt](https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-4383/1/artifact/out/results-checkstyle-root.txt) | root: The patch generated 77 new + 0 unchanged - 0 fixed = 77 total (was 0) |
| +1 :green_heart: | mvnsite | 2m 18s | | the patch passed |
| +1 :green_heart: | javadoc | 1m 28s | | the patch passed with JDK Private Build-11.0.15+10-Ubuntu-0ubuntu0.20.04.1 |
| +1 :green_heart: |
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=776231&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776231 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 31/May/22 10:48
Start Date: 31/May/22 10:48
Worklog Time Spent: 10m

Work Description: sravanigadey commented on PR #4383:
URL: https://github.com/apache/hadoop/pull/4383#issuecomment-1141974711

When I run the tests on trunk with the same command as above, i.e. `mvn clean verify -Dparallel-tests -DtestsThreadCount=4 -Dscale`, I get errors almost identical to the ones I got when running the tests on the HADOOP-18258 branch.

```
[ERROR] testUnbufferBeforeRead(org.apache.hadoop.fs.contract.s3a.ITestS3AContractUnbuffer)  Time elapsed: 3.623 s  <<< FAILURE!
java.lang.AssertionError: failed to read expected number of bytes from stream. This may be transient expected:<1024> but was:<93>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.apache.hadoop.fs.contract.AbstractContractUnbufferTest.validateFileContents(AbstractContractUnbufferTest.java:139)
	at org.apache.hadoop.fs.contract.AbstractContractUnbufferTest.validateFullFileContents(AbstractContractUnbufferTest.java:132)
	at org.apache.hadoop.fs.contract.AbstractContractUnbufferTest.testUnbufferBeforeRead(AbstractContractUnbufferTest.java:63)
```

```
[ERROR] testSTS(org.apache.hadoop.fs.s3a.ITestS3ATemporaryCredentials)  Time elapsed: 12.976 s  <<< ERROR!
com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
	at com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462)
	at com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424)
	at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46)
	at org.apache.hadoop.fs.s3a.ITestS3ATemporaryCredentials.testSTS(ITestS3ATemporaryCredentials.java:130)
```

```
[ERROR] testDTUtilShell(org.apache.hadoop.fs.s3a.auth.delegation.ITestSessionDelegationInFileystem)  Time elapsed: 8.354 s  <<< FAILURE!
java.lang.AssertionError: expected:<0> but was:<1>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:633)
	at org.apache.hadoop.fs.s3a.auth.delegation.ITestSessionDelegationInFileystem.dtutil(ITestSessionDelegationInFileystem.java:739)
	at org.apache.hadoop.fs.s3a.auth.delegation.ITestSessionDelegationInFileystem.testDTUtilShell(ITestSessionDelegationInFileystem.java:750)
```

Issue Time Tracking
-------------------
Worklog Id: (was: 776231)
Time Spent: 20m (was: 10m)

> Merging of S3A Audit Logs
> -------------------------
>
>                 Key: HADOOP-18258
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18258
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sravani Gadey
>            Assignee: Sravani Gadey
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> Merging audit log files containing huge number of audit logs collected from a
> job like Hive or Spark job containing various S3 requests like list, head,
> get and put requests.

--
This message was sent by Atlassian Jira (v8.20.7#820007)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
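The `testSTS` failure above ("Unable to find a region via the region provider chain") usually means the test setup has no region configured for the STS client. A minimal sketch of the relevant entries in the tester's private `auth-keys.xml`, assuming the property names documented for the hadoop-aws test suite (`fs.s3a.endpoint.region`, `fs.s3a.assumed.role.sts.endpoint`, `fs.s3a.assumed.role.sts.endpoint.region`); the values are illustrative for the AP-South-1 setup mentioned in the PR description, not a definitive fix:

```xml
<configuration>
  <!-- Region used by the S3A connector's AWS SDK clients. -->
  <property>
    <name>fs.s3a.endpoint.region</name>
    <value>ap-south-1</value>
  </property>
  <!-- STS endpoint and region picked up by ITestS3ATemporaryCredentials. -->
  <property>
    <name>fs.s3a.assumed.role.sts.endpoint</name>
    <value>sts.ap-south-1.amazonaws.com</value>
  </property>
  <property>
    <name>fs.s3a.assumed.role.sts.endpoint.region</name>
    <value>ap-south-1</value>
  </property>
</configuration>
```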
[ https://issues.apache.org/jira/browse/HADOOP-18258?focusedWorklogId=776126&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-776126 ]

ASF GitHub Bot logged work on HADOOP-18258:
-------------------------------------------
Author: ASF GitHub Bot
Created on: 31/May/22 08:23
Start Date: 31/May/22 08:23
Worklog Time Spent: 10m

Work Description: sravanigadey opened a new pull request, #4383:
URL: https://github.com/apache/hadoop/pull/4383

### Description of PR

Merge audit log files containing a huge number of audit logs collected from a job, such as a Hive or Spark job, that issues various S3 requests (list, head, get and put).

### How was this patch tested?

Region: AP-South-1
Command used: `mvn clean verify -Dparallel-tests -DtestsThreadCount=4 -Dscale`

Two errors occur while testing:

```
[ERROR] testSTS(org.apache.hadoop.fs.s3a.ITestS3ATemporaryCredentials)  Time elapsed: 7.931 s  <<< ERROR!
com.amazonaws.SdkClientException: Unable to find a region via the region provider chain. Must provide an explicit region in the builder or setup environment to supply a region.
	at com.amazonaws.client.builder.AwsClientBuilder.setRegion(AwsClientBuilder.java:462)
	at com.amazonaws.client.builder.AwsClientBuilder.configureMutableProperties(AwsClientBuilder.java:424)
	at com.amazonaws.client.builder.AwsSyncClientBuilder.build(AwsSyncClientBuilder.java:46)
	at org.apache.hadoop.fs.s3a.ITestS3ATemporaryCredentials.testSTS(ITestS3ATemporaryCredentials.java:130)
```

```
[ERROR] testDTUtilShell(org.apache.hadoop.fs.s3a.auth.delegation.ITestSessionDelegationInFileystem)  Time elapsed: 1.861 s  <<< FAILURE!
java.lang.AssertionError: expected:<0> but was:<1>
	at org.junit.Assert.fail(Assert.java:89)
	at org.junit.Assert.failNotEquals(Assert.java:835)
	at org.junit.Assert.assertEquals(Assert.java:647)
	at org.junit.Assert.assertEquals(Assert.java:633)
	at org.apache.hadoop.fs.s3a.auth.delegation.ITestSessionDelegationInFileystem.dtutil(ITestSessionDelegationInFileystem.java:739)
	at org.apache.hadoop.fs.s3a.auth.delegation.ITestSessionDelegationInFileystem.testDTUtilShell(ITestSessionDelegationInFileystem.java:750)
```

### For code changes:

- [ ] Does the title of this PR start with the corresponding JIRA issue id (e.g. 'HADOOP-17799. Your PR title ...')?
- [ ] Object storage: have the integration tests been executed and the endpoint declared according to the connector-specific documentation?
- [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
- [ ] If applicable, have you updated the `LICENSE`, `LICENSE-binary`, `NOTICE-binary` files?

Issue Time Tracking
-------------------
Worklog Id: (was: 776126)
Remaining Estimate: 0h
Time Spent: 10m

> Merging of S3A Audit Logs
> -------------------------
>
>                 Key: HADOOP-18258
>                 URL: https://issues.apache.org/jira/browse/HADOOP-18258
>             Project: Hadoop Common
>          Issue Type: Sub-task
>          Components: fs/s3
>            Reporter: Sravani Gadey
>            Assignee: Sravani Gadey
>            Priority: Major
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Merging audit log files containing huge number of audit logs collected from a
> job like Hive or Spark job containing various S3 requests like list, head,
> get and put requests.
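The merge step the PR describes — gathering many per-run audit log files and appending their entries into one file — can be sketched against the local filesystem as below. This is an illustrative sketch only, not the patch's code: the real `S3AAuditLogMergerAndParser` reads through an `S3AFileSystem` with `FSDataInputStream`/`FSDataOutputStream`, and the class and method names here are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Local-filesystem sketch of the audit log merge step: collect the
 * lines of every regular file in a directory and write them into a
 * single merged log file.
 */
public class AuditLogMergeSketch {

  /**
   * Merge all regular files under {@code logDir} into {@code mergedFile}.
   * @return number of log lines written to the merged file
   */
  public static int merge(Path logDir, Path mergedFile) throws IOException {
    List<String> allLines = new ArrayList<>();
    try (DirectoryStream<Path> files = Files.newDirectoryStream(logDir)) {
      for (Path file : files) {
        if (Files.isRegularFile(file)) {
          // Each source file contributes its lines to the merged output.
          allLines.addAll(Files.readAllLines(file, StandardCharsets.UTF_8));
        }
      }
    }
    Files.write(mergedFile, allLines, StandardCharsets.UTF_8);
    return allLines.size();
  }

  public static void main(String[] args) throws IOException {
    // Build two tiny sample "audit log" files and merge them.
    Path dir = Files.createTempDirectory("audit");
    Files.write(dir.resolve("log1"), Arrays.asList("entry-a", "entry-b"));
    Files.write(dir.resolve("log2"), Arrays.asList("entry-c"));
    int merged = merge(dir, Files.createTempFile("merged", ".log"));
    System.out.println("merged " + merged + " entries");  // prints "merged 3 entries"
  }
}
```

A real implementation would additionally order entries, e.g. by the `[datetime]` field that the patch's regexp extracts, rather than relying on directory iteration order.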