[jira] [Resolved] (HADOOP-18962) Upgrade kafka to 3.4.0
[ https://issues.apache.org/jira/browse/HADOOP-18962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-18962.
-------------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

> Upgrade kafka to 3.4.0
> -----------------------
>
>          Key: HADOOP-18962
>          URL: https://issues.apache.org/jira/browse/HADOOP-18962
>      Project: Hadoop Common
>   Issue Type: Bug
>     Reporter: D M Murali Krishna Reddy
>     Assignee: D M Murali Krishna Reddy
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 3.5.0
>
> Upgrade kafka-clients to 3.4.0 to fix
> https://nvd.nist.gov/vuln/detail/CVE-2023-25194
[jira] [Created] (HADOOP-19186) Change loglevel to ERROR/WARNING so that it is easy to identify the problem without ignoring it
Srinivasu Majeti created HADOOP-19186:
-----------------------------------------

Summary: Change loglevel to ERROR/WARNING so that it is easy to identify the problem without ignoring it
Key: HADOOP-19186
URL: https://issues.apache.org/jira/browse/HADOOP-19186
Project: Hadoop Common
Issue Type: Improvement
Components: security
Reporter: Srinivasu Majeti

On a new host with Java 11, the DataNode was not able to communicate with the NameNode. We enabled DEBUG logging for the DataNode and the message below was logged at DEBUG level:

{code}
DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedActionException as:hdfs/av3l704p.bigdata.it.internal@PRODUCTION.LOCAL (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Receive timed out)]
{code}

Without DEBUG level logging, this showed up only as the WARNING below:

{code}
WARN org.apache.hadoop.ipc.Client: Couldn't setup connection for hdfs/av3l704p.bigdata.it.internal@PRODUCTION.LOCAL to avl2785p.bigdata.it.internal/172.24.178.32:8022 javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Receive timed out)]
{code}

A considerable amount of time was spent troubleshooting this issue because the exception had been moved to DEBUG level, which made it difficult to track in the logs. Can we have such critical warnings surface at WARN/ERROR level, so that they are not missed without having to enable DEBUG level logging for the datanodes?
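The ask is essentially to move that log line up a level. A minimal illustrative sketch (class and method names are hypothetical, not the actual UserGroupInformation code) of surfacing the failure at WARN instead of DEBUG:

{code:java}
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class AuthFailureLogging {
  private static final Logger LOG =
      LoggerFactory.getLogger(AuthFailureLogging.class);

  /** Hypothetical helper: report a Kerberos/SASL negotiation failure. */
  static void reportAuthFailure(String principal, Exception cause) {
    // At DEBUG this is invisible in a default deployment; logging at WARN
    // makes the root cause visible without enabling DEBUG on the DataNode.
    LOG.warn("PrivilegedActionException as: {}", principal, cause);
  }
}
{code}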
[jira] [Resolved] (HADOOP-19168) Upgrade Kafka Clients due to CVEs
[ https://issues.apache.org/jira/browse/HADOOP-19168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-19168.
-------------------------------------
    Resolution: Duplicate

Rohit: dupe of HADOOP-18962, let's focus on that.

> Upgrade Kafka Clients due to CVEs
> ---------------------------------
>
>          Key: HADOOP-19168
>          URL: https://issues.apache.org/jira/browse/HADOOP-19168
>      Project: Hadoop Common
>   Issue Type: Task
>     Reporter: Rohit Kumar
>     Priority: Major
>       Labels: pull-request-available
>
> Upgrade Kafka clients due to CVEs.
> CVE-2023-25194: Affected versions of this package are vulnerable to Deserialization of Untrusted Data when there are gadgets in the classpath. The server will connect to the attacker's LDAP server and deserialize the LDAP response, which the attacker can use to execute Java deserialization gadget chains on the Kafka Connect server. CVSS score: 8.8 (High).
> https://nvd.nist.gov/vuln/detail/CVE-2023-25194
> CVE-2021-38153
> CVE-2018-17196
> Insufficient Entropy
> https://security.snyk.io/package/maven/org.apache.kafka:kafka-clients
> Upgrade kafka-clients to 3.4.0 or higher.
[jira] [Resolved] (HADOOP-19182) Upgrade kafka to 3.4.0
[ https://issues.apache.org/jira/browse/HADOOP-19182?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-19182.
-------------------------------------
    Resolution: Duplicate

> Upgrade kafka to 3.4.0
> -----------------------
>
>          Key: HADOOP-19182
>          URL: https://issues.apache.org/jira/browse/HADOOP-19182
>      Project: Hadoop Common
>   Issue Type: Bug
>   Components: build
>     Reporter: fuchaohong
>     Priority: Major
>       Labels: pull-request-available
>
> Upgrade kafka to 3.4.0 to resolve CVE-2023-25194.
[jira] [Created] (HADOOP-19185) Improve ABFS metric integration with IOStatistics
Steve Loughran created HADOOP-19185:
---------------------------------------

Summary: Improve ABFS metric integration with IOStatistics
Key: HADOOP-19185
URL: https://issues.apache.org/jira/browse/HADOOP-19185
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/azure
Reporter: Steve Loughran

Followup to HADOOP-18325, covering the outstanding comments of https://github.com/apache/hadoop/pull/6314/files
[jira] [Resolved] (HADOOP-18325) ABFS: Add correlated metric support for ABFS operations
[ https://issues.apache.org/jira/browse/HADOOP-18325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-18325.
-------------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

> ABFS: Add correlated metric support for ABFS operations
> --------------------------------------------------------
>
>          Key: HADOOP-18325
>          URL: https://issues.apache.org/jira/browse/HADOOP-18325
>      Project: Hadoop Common
>   Issue Type: Sub-task
>   Components: fs/azure
> Affects Versions: 3.3.3
>     Reporter: Anmol Asrani
>     Assignee: Anmol Asrani
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 3.5.0
>
> Add metrics related to a particular job, specific to the number of total requests, retried requests, retry count and others.
[jira] [Created] (HADOOP-19184) TestStagingCommitter.testJobCommitFailure failing
Mukund Thakur created HADOOP-19184:
--------------------------------------

Summary: TestStagingCommitter.testJobCommitFailure failing
Key: HADOOP-19184
URL: https://issues.apache.org/jira/browse/HADOOP-19184
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Reporter: Mukund Thakur
Assignee: Mukund Thakur

{code}
[INFO]
[ERROR] Failures:
[ERROR] TestStagingCommitter.testJobCommitFailure:662 [Committed objects compared to deleted paths org.apache.hadoop.fs.s3a.commit.staging.StagingTestBase$ClientResults@2de1acf4{requests=12, uploads=12, parts=12, tagsByUpload=12, commits=5, aborts=7, deletes=0}]
Expecting:
 <["s3a://bucket-name/output/path/r_0_0_c055250c-58c7-47ea-8b14-215cb5462e89",
   "s3a://bucket-name/output/path/r_1_1_9111aa65-96c2-465c-b278-696aff7707e3",
   "s3a://bucket-name/output/path/r_0_0_dec7f398-ee4e-4a53-a783-6b72cead569a",
   "s3a://bucket-name/output/path/r_1_1_39ad0eba-1053-4217-aa63-ddc8edfa7c64",
   "s3a://bucket-name/output/path/r_0_0_6c0518f6-7c1b-418f-a3e4-7db568880e6a"]>
to contain exactly in any order:
 <[]>
but the following elements were unexpected:
 <["s3a://bucket-name/output/path/r_0_0_c055250c-58c7-47ea-8b14-215cb5462e89",
   "s3a://bucket-name/output/path/r_1_1_9111aa65-96c2-465c-b278-696aff7707e3",
   "s3a://bucket-name/output/path/r_0_0_dec7f398-ee4e-4a53-a783-6b72cead569a",
   "s3a://bucket-name/output/path/r_1_1_39ad0eba-1053-4217-aa63-ddc8edfa7c64",
   "s3a://bucket-name/output/path/r_0_0_6c0518f6-7c1b-418f-a3e4-7db568880e6a"]>
{code}
[jira] [Created] (HADOOP-19183) RBF: Support leader follower mode for multiple subclusters
Yuanbo Liu created HADOOP-19183:
-----------------------------------

Summary: RBF: Support leader follower mode for multiple subclusters
Key: HADOOP-19183
URL: https://issues.apache.org/jira/browse/HADOOP-19183
Project: Hadoop Common
Issue Type: Improvement
Components: RBF
Reporter: Yuanbo Liu

Currently there are five modes for mount points spanning multiple subclusters: HASH, LOCAL, RANDOM, HASH_ALL and SPACE. This proposes a new mode called leader/follower mode: routers try to write to the leader subcluster as much as possible, and when routers read data, they rank the leader subcluster first. A sketch of the ordering follows.
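A tiny sketch of the proposed ordering (names hypothetical; a real change would plug into the RBF order resolver):

{code:java}
import java.util.ArrayList;
import java.util.List;

public class LeaderFollowerOrder {
  /** Rank the leader subcluster first; followers keep their relative order. */
  static List<String> rank(String leader, List<String> subclusters) {
    List<String> order = new ArrayList<>(subclusters.size());
    order.add(leader);                 // writes target the leader when possible
    for (String ns : subclusters) {
      if (!ns.equals(leader)) {
        order.add(ns);                 // followers serve as fallback for reads
      }
    }
    return order;
  }
}
{code}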
[jira] [Created] (HADOOP-19182) Upgrade kafka to 3.4.0
fuchaohong created HADOOP-19182:
-----------------------------------

Summary: Upgrade kafka to 3.4.0
Key: HADOOP-19182
URL: https://issues.apache.org/jira/browse/HADOOP-19182
Project: Hadoop Common
Issue Type: Bug
Components: build
Reporter: fuchaohong
[jira] [Resolved] (HADOOP-19163) Upgrade protobuf version to 3.25.3
[ https://issues.apache.org/jira/browse/HADOOP-19163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-19163.
-------------------------------------
    Resolution: Fixed

Done. Not sure what version to tag this with. Proposed: we cut a new release of this.

> Upgrade protobuf version to 3.25.3
> -----------------------------------
>
>          Key: HADOOP-19163
>          URL: https://issues.apache.org/jira/browse/HADOOP-19163
>      Project: Hadoop Common
>   Issue Type: Bug
>   Components: hadoop-thirdparty
>     Reporter: Bilwa S T
>     Assignee: Bilwa S T
>     Priority: Major
>       Labels: pull-request-available
>   Time Spent: 20m
> Remaining Estimate: 0h
[jira] [Resolved] (HADOOP-13147) Constructors must not call overrideable methods in PureJavaCrc32C
[ https://issues.apache.org/jira/browse/HADOOP-13147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ayush Saxena resolved HADOOP-13147.
-----------------------------------
    Fix Version/s: 3.5.0
     Hadoop Flags: Reviewed
       Resolution: Fixed

> Constructors must not call overrideable methods in PureJavaCrc32C
> ------------------------------------------------------------------
>
>          Key: HADOOP-13147
>          URL: https://issues.apache.org/jira/browse/HADOOP-13147
>      Project: Hadoop Common
>   Issue Type: Bug
> Affects Versions: 2.0.6-alpha
>  Environment: http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/PureJavaCrc32C.java
>     Reporter: Sebb
>     Assignee: Sebb
>     Priority: Blocker
>       Labels: pull-request-available
>      Fix For: 3.5.0
>
> Constructors must not call overrideable methods. An object is not guaranteed fully constructed until the constructor exits, so the subclass override may not see the fully created parent object.
> This applies to: PureJavaCrc32
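For context, the pitfall being fixed is the classic one below (illustrative only, not the actual PureJavaCrc32C code): the subclass override runs before the subclass's own fields are initialised.

{code:java}
class Base {
  Base() {
    init();                     // constructor calls an overridable method
  }
  void init() { }
}

class Sub extends Base {
  private int table = 42;       // initialiser runs only after Base() returns

  @Override
  void init() {
    System.out.println(table);  // prints 0, not 42: Sub isn't initialised yet
  }

  public static void main(String[] args) {
    new Sub();
  }
}
{code}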
[jira] [Created] (HADOOP-19181) IAMCredentialsProvider throttle failures
Steve Loughran created HADOOP-19181:
---------------------------------------

Summary: IAMCredentialsProvider throttle failures
Key: HADOOP-19181
URL: https://issues.apache.org/jira/browse/HADOOP-19181
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran

Tests report throttling errors in IAM being remapped to noauth and failure.

Again, impala tests, but with multiple processes on the same host. This means that HADOOP-18945 isn't sufficient: even if it ensures a singleton instance for a process,
* it doesn't if there are many test buckets (fixable)
* it doesn't work across processes (not fixable)

We may be able to:
* use a singleton across all filesystem instances (see the sketch after this entry)
* once we know how throttling is reported, handle it through retries + error/stats collection

{code}
2024-02-17T18:02:10,175 WARN [TThreadPoolServer WorkerProcess-22] fs.FileSystem: Failed to initialize fileystem s3a://impala-test-uswest2-1/test-warehouse/test_num_values_def_levels_mismatch_15b31ddb.db/too_many_def_levels: java.nio.file.AccessDeniedException: impala-test-uswest2-1: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : software.amazon.awssdk.core.exception.SdkClientException: Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
2024-02-17T18:02:10,175 ERROR [TThreadPoolServer WorkerProcess-22] utils.MetaStoreUtils: Got exception: java.nio.file.AccessDeniedException impala-test-uswest2-1: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : software.amazon.awssdk.core.exception.SdkClientException: Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
java.nio.file.AccessDeniedException: impala-test-uswest2-1: org.apache.hadoop.fs.s3a.auth.NoAuthWithAWSException: No AWS Credentials provided by TemporaryAWSCredentialsProvider SimpleAWSCredentialsProvider EnvironmentVariableCredentialsProvider IAMInstanceCredentialsProvider : software.amazon.awssdk.core.exception.SdkClientException: Unable to load credentials from system settings. Access key must be specified either via environment variable (AWS_ACCESS_KEY_ID) or system property (aws.accessKeyId).
at org.apache.hadoop.fs.s3a.AWSCredentialProviderList.maybeTranslateCredentialException(AWSCredentialProviderList.java:351) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:201) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.once(Invoker.java:124) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.lambda$retry$4(Invoker.java:376) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retryUntranslated(Invoker.java:468) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:372) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.Invoker.retry(Invoker.java:347) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$verifyBucketExists$2(S3AFileSystem.java:972) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:543) ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:524) ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:445) ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2748) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.verifyBucketExists(S3AFileSystem.java:970) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.doBucketProbing(S3AFileSystem.java:859) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.s3a.S3AFileSystem.initialize(S3AFileSystem.java:715) ~[hadoop-aws-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3452) ~[hadoop-common-3.1.1.7.2.18.0-620.jar:?]
at org.apache.hadoop.fs.FileSystem.access
{code}
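A sketch of the first idea, a process-wide singleton (class name hypothetical; the v2 SDK's InstanceProfileCredentialsProvider stands in for whatever provider the fix settles on):

{code:java}
import software.amazon.awssdk.auth.credentials.AwsCredentialsProvider;
import software.amazon.awssdk.auth.credentials.InstanceProfileCredentialsProvider;

public final class ProcessWideIamCredentials {
  private static volatile AwsCredentialsProvider instance;

  private ProcessWideIamCredentials() { }

  /** One provider per process, shared by every S3A instance and bucket. */
  public static AwsCredentialsProvider get() {
    if (instance == null) {
      synchronized (ProcessWideIamCredentials.class) {
        if (instance == null) {
          instance = InstanceProfileCredentialsProvider.create();
        }
      }
    }
    return instance;
  }
}
{code}

This addresses the many-buckets case within one process; as the description notes, it cannot help across processes.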
[jira] [Resolved] (HADOOP-19167) Change of Codec configuration does not work
[ https://issues.apache.org/jira/browse/HADOOP-19167?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ZanderXu resolved HADOOP-19167.
-------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

> Change of Codec configuration does not work
> --------------------------------------------
>
>          Key: HADOOP-19167
>          URL: https://issues.apache.org/jira/browse/HADOOP-19167
>      Project: Hadoop Common
>   Issue Type: Bug
>   Components: compress
>     Reporter: Zhikai Hu
>     Priority: Minor
>       Labels: pull-request-available
>      Fix For: 3.5.0
>
> In one of my projects, I need to dynamically adjust the compression level for different files. However, I found that in most cases the new compression level does not take effect as expected; the old compression level continues to be used. Here is the relevant code snippet:
>
> ZStandardCodec zStandardCodec = new ZStandardCodec();
> zStandardCodec.setConf(conf);
> conf.set("io.compression.codec.zstd.level", "5"); // level may change dynamically
> conf.set("io.compression.codec.zstd", zStandardCodec.getClass().getName());
> writer = SequenceFile.createWriter(conf,
>     SequenceFile.Writer.file(sequenceFilePath),
>     SequenceFile.Writer.keyClass(LongWritable.class),
>     SequenceFile.Writer.valueClass(BytesWritable.class),
>     SequenceFile.Writer.compression(CompressionType.BLOCK));
>
> The reason is that the SequenceFile.Writer.init() method calls CodecPool.getCompressor(codec, null) to get a compressor. If the compressor is a reused instance, the conf is not applied because it is passed as null:
>
> public static Compressor getCompressor(CompressionCodec codec, Configuration conf) {
>   Compressor compressor = borrow(compressorPool, codec.getCompressorType());
>   if (compressor == null) {
>     compressor = codec.createCompressor();
>     LOG.info("Got brand-new compressor ["+codec.getDefaultExtension()+"]");
>   } else {
>     compressor.reinit(conf); // conf is null here
>     ...
>
> Please also refer to my unit test to reproduce the bug. To address this bug, I modified the code to ensure that the configuration is read back from the codec when a compressor is reused.
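A sketch of the direction the description suggests (helper name hypothetical; the merged patch may differ): when reusing a pooled compressor and no conf was passed, read the configuration back from the codec so dynamically changed settings such as io.compression.codec.zstd.level are re-applied.

{code:java}
import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.Compressor;

public final class CompressorReuse {
  /**
   * Reinitialize a reused compressor. SequenceFile.Writer.init() passes a
   * null conf to CodecPool.getCompressor(); fall back to the codec's own
   * configuration so its current compression level takes effect on reuse.
   */
  static void reinitForReuse(Compressor compressor, CompressionCodec codec,
      Configuration conf) {
    Configuration effective = conf;
    if (effective == null && codec instanceof Configurable) {
      effective = ((Configurable) codec).getConf();
    }
    compressor.reinit(effective);
  }
}
{code}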
[jira] [Resolved] (HADOOP-18759) [ABFS][Backoff-Optimization] Have a Static retry policy for connection timeout failures
[ https://issues.apache.org/jira/browse/HADOOP-18759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anuj Modi resolved HADOOP-18759.
--------------------------------
    Fix Version/s: 3.4.1 (was: 3.5.0)
     Release Note: https://github.com/apache/hadoop/pull/5881
 Target Version/s: 3.4.0 (was: 3.3.4)
       Resolution: Fixed

[Hadoop-18759: [ABFS][Backoff-Optimization] Have a Static retry policy for connection timeout. by anujmodi2021 · Pull Request #5881 · apache/hadoop (github.com)|https://github.com/apache/hadoop/pull/5881]

> [ABFS][Backoff-Optimization] Have a Static retry policy for connection timeout failures
> ----------------------------------------------------------------------------------------
>
>          Key: HADOOP-18759
>          URL: https://issues.apache.org/jira/browse/HADOOP-18759
>      Project: Hadoop Common
>   Issue Type: Sub-task
>   Components: fs/azure
> Affects Versions: 3.3.4
>     Reporter: Anuj Modi
>     Assignee: Anuj Modi
>     Priority: Major
>      Fix For: 3.4.1
>
> Today, when a request fails with a connection timeout, it falls back into the loop for exponential retry. Unlike for Azure Storage failures, there are no guarantees of success on an exponentially retried request, nor recommendations for ideal retry policies, for Azure network or other general failures. Faster failure and retry might be more beneficial for such generic connection timeout failures.
> This PR introduces a new Static Retry Policy which will currently be used only for connection timeout failures. It means all requests failing with connection timeout errors will be retried after a constant retry (sleep) interval, independent of how many times that request has failed. The max retry count check will still be in place.
> The following configurations are introduced in the change (see the usage sketch after this entry):
> # "fs.azure.static.retry.for.connection.timeout.enabled" - default: true. true: static retry will be used for CT; false: exponential retry will be used.
> # "fs.azure.static.retry.interval" - default: 1000 ms.
> This also introduces a new field in x-ms-client-request-id, only for requests that are being retried after a connection timeout failure. The new field tells which retry policy was used to get the sleep interval before making this request.
> The header x-ms-client-request-id right now carries only the retryCount and retryReason for this particular API call. For example:
> :eb06d8f6-5693-461b-b63c-5858fa7655e6:29cb0d19-2b68-4409-bc35-cb7160b90dd8:::CF:1_CT.
> Going forward, for retryReason "CT" it will carry the retry policy abbreviation as well. For example:
> :eb06d8f6-5693-461b-b63c-5858fa7655e6:29cb0d19-2b68-4409-bc35-cb7160b90dd8:::CF:1_CT_E.
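The two settings can be exercised like this (values illustrative; the config names and defaults are taken from the description above):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class AbfsStaticRetrySettings {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // true (default): constant-interval retry for connection-timeout failures;
    // false: fall back to the exponential retry policy.
    conf.setBoolean("fs.azure.static.retry.for.connection.timeout.enabled", true);
    // Constant sleep between such retries, in milliseconds (default 1000).
    conf.setInt("fs.azure.static.retry.interval", 1000);
  }
}
{code}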
[jira] [Resolved] (HADOOP-18011) ABFS: Enable config control for default connection timeout
[ https://issues.apache.org/jira/browse/HADOOP-18011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anuj Modi resolved HADOOP-18011.
--------------------------------
    Fix Version/s: 3.4.1
     Hadoop Flags: Reviewed
     Release Note: https://github.com/apache/hadoop/pull/5881
       Resolution: Fixed

PR checked in: [Hadoop-18759: [ABFS][Backoff-Optimization] Have a Static retry policy for connection timeout. by anujmodi2021 · Pull Request #5881 · apache/hadoop (github.com)|https://github.com/apache/hadoop/pull/5881]

> ABFS: Enable config control for default connection timeout
> ------------------------------------------------------------
>
>          Key: HADOOP-18011
>          URL: https://issues.apache.org/jira/browse/HADOOP-18011
>      Project: Hadoop Common
>   Issue Type: Sub-task
>   Components: fs/azure
> Affects Versions: 3.3.1
>     Reporter: Sneha Vijayarajan
>     Assignee: Sneha Vijayarajan
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 3.4.1
>   Time Spent: 50m
> Remaining Estimate: 0h
>
> The ABFS driver has a default connection timeout and read timeout value of 30 secs. For jobs that are time sensitive, the preference would be quick failure, with shorter HTTP connection and read timeouts.
> This Jira is created to enable config control over the default connection and read timeout (see the usage sketch after this entry).
> New config names:
> fs.azure.http.connection.timeout
> fs.azure.http.read.timeout
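Usage sketch (the values and their assumed millisecond units are illustrative; check the ABFS documentation for the actual defaults and units):

{code:java}
import org.apache.hadoop.conf.Configuration;

public class AbfsTimeoutSettings {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Shorter timeouts for latency-sensitive jobs; assumed milliseconds.
    conf.setInt("fs.azure.http.connection.timeout", 5_000);
    conf.setInt("fs.azure.http.read.timeout", 10_000);
  }
}
{code}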
[jira] [Resolved] (HADOOP-19172) Upgrade aws-java-sdk to 1.12.720
[ https://issues.apache.org/jira/browse/HADOOP-19172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-19172.
-------------------------------------
    Fix Version/s: 3.3.9
                   3.5.0
                   3.4.1
       Resolution: Fixed

> Upgrade aws-java-sdk to 1.12.720
> ---------------------------------
>
>          Key: HADOOP-19172
>          URL: https://issues.apache.org/jira/browse/HADOOP-19172
>      Project: Hadoop Common
>   Issue Type: Improvement
>   Components: build, fs/s3
> Affects Versions: 3.4.0, 3.3.6
>     Reporter: Steve Loughran
>     Assignee: Steve Loughran
>     Priority: Minor
>       Labels: pull-request-available
>      Fix For: 3.3.9, 3.5.0, 3.4.1
>
> Update to the latest AWS SDK, to stop anyone worrying about the ion library CVE https://nvd.nist.gov/vuln/detail/CVE-2024-21634. This isn't exposed in the s3a client, but may be used downstream.
> On v2 SDK releases, the v1 SDK is only used during builds; on 3.3.x it is shipped.
[jira] [Resolved] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager
[ https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sammi Chen resolved HADOOP-18851.
---------------------------------
    Resolution: Fixed

> Performance improvement for DelegationTokenSecretManager
> ----------------------------------------------------------
>
>          Key: HADOOP-18851
>          URL: https://issues.apache.org/jira/browse/HADOOP-18851
>      Project: Hadoop Common
>   Issue Type: Task
>   Components: common
> Affects Versions: 3.4.0
>     Reporter: Vikas Kumar
>     Assignee: Vikas Kumar
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 3.5.0, 3.4.0
>  Attachments: 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot 2023-08-16 at 5.36.57 PM.png
>
> *Context:*
> KMS depends on hadoop-common for DT management. Recently we were analysing one performance issue and the following are our findings:
> # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at the following:
> ## *AbstractDelegationTokenSecretManager.verifyToken()*
> ## *AbstractDelegationTokenSecretManager.createPassword()*
> # And then the process crashed.
>
> {code:java}
> http-nio-9292-exec-200 PRIORITY : 5 THREAD ID : 0X7F075C157800 NATIVE ID : 0X2C87F NATIVE ID (DECIMAL) : 182399 STATE : BLOCKED
> stackTrace:
> java.lang.Thread.State: BLOCKED (on object monitor)
> at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474)
> - waiting to lock <0x0005f2f545e8> (a org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213)
> at org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396)
> at {code}
> 199 out of the 200 threads were blocked at the above point, and the lock they are waiting for is held by a thread that was trying to createPassword and publish the same on ZK.
>
> {code:java}
> stackTrace:
> java.lang.Thread.State: WAITING (on object monitor)
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598)
> - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570)
> at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235)
> at org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398)
> at org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385)
> at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93)
> at org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382)
> at org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358)
> at org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36)
> at org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201)
> at org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116)
> at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586)
> at org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601)
> at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402)
> - locked <0x0005f2f545e8> (a org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager)
> at org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48)
> at org.apache.hadoop.security.token.Token.<init>(Token.java:67)
> at org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183)
> {code}
> We can say that this thread is slow and has blocked all the rest. But the following is my observation:
>
> # verifyToken() and createPassword() have been synchronized because one is reading the tokenMap and another is updating the map. If it's only to protect t
[jira] [Created] (HADOOP-19179) ABFS: Support FNS Accounts over BlobEndpoint
Sneha Vijayarajan created HADOOP-19179:
------------------------------------------

Summary: ABFS: Support FNS Accounts over BlobEndpoint
Key: HADOOP-19179
URL: https://issues.apache.org/jira/browse/HADOOP-19179
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/azure
Affects Versions: 3.4.0
Reporter: Sneha Vijayarajan
Assignee: Sneha Vijayarajan
Fix For: 3.5.0, 3.4.1

As a prerequisite to deprecating the WASB driver, the ABFS driver will need to match the FNS account support provided by the WASB driver. This will give customers still using the legacy driver an official migration path to the ABFS driver.

Parent Jira for WASB deprecation: [HADOOP-19178] WASB Driver Deprecation and eventual removal - ASF JIRA (apache.org)
[jira] [Created] (HADOOP-19178) WASB Driver Deprecation and eventual removal
Sneha Vijayarajan created HADOOP-19178:
------------------------------------------

Summary: WASB Driver Deprecation and eventual removal
Key: HADOOP-19178
URL: https://issues.apache.org/jira/browse/HADOOP-19178
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/azure
Affects Versions: 3.4.0
Reporter: Sneha Vijayarajan
Assignee: Sneha Vijayarajan
Fix For: 3.4.1

*WASB Driver*

The WASB driver was developed to support FNS (Flat Namespace) Azure Storage accounts. FNS accounts do not honor file-folder syntax. HDFS folder operations are hence mimicked on the client side by the WASB driver, and certain folder operations like rename and delete can lead to a lot of IOPS, with client-side enumeration and orchestration of the rename/delete operation blob by blob. It was not ideal for other APIs either, as the initial checks for whether a path is a file or a folder need to be done over multiple metadata calls. These led to degraded performance.

To provide better service to analytics customers, Microsoft released ADLS Gen2, which is HNS (Hierarchical Namespace), i.e. file-folder aware, storage. The ABFS driver was designed to overcome the inherent deficiencies of WASB, and customers were informed to migrate to the ABFS driver.

*Customers who still use the legacy WASB driver and the challenges they face*

Some of our customers have not migrated to the ABFS driver yet and continue to use the legacy WASB driver with FNS accounts. These customers face the following challenges:
* They cannot leverage the optimizations and benefits of the ABFS driver.
* They need to deal with compatibility issues if files and folders are modified with the legacy WASB driver and the ABFS driver concurrently in a phased transition situation.
* There are differences in supported features between FNS and HNS over the ABFS driver.
* In certain cases, they must perform a significant amount of re-work on their workloads to migrate to the ABFS driver, which is available only on HNS-enabled accounts in a fully tested and supported scenario.

*Deprecation plans for WASB*

We are introducing a new feature that will enable the ABFS driver to support FNS accounts (over BlobEndpoint) using the ABFS scheme. This feature will enable customers to use the ABFS driver to interact with data stored in GPv2 (General Purpose v2) storage accounts.

With this feature, the customers who still use the legacy WASB driver will be able to migrate to the ABFS driver without much re-work on their workloads. They will, however, need to change the URIs from the WASB scheme to the ABFS scheme (see the illustrative example after this entry).

Once the ABFS driver has built the FNS support capability to migrate WASB customers, the WASB driver will be declared deprecated in OSS documentation and marked for removal in the next major release. This will remove any ambiguity for new customer onboards, as there will be only one Microsoft driver for Azure Storage, and migrating customers will get SLA-bound support for driver and service, which was not guaranteed over WASB.

We anticipate that this feature will serve as a stepping stone for customers to move to HNS-enabled accounts with the ABFS driver, which is our recommended stack for big data analytics on ADLS Gen2.

*Any impact for existing customers who are using ADLS Gen2 (HNS-enabled account) with the ABFS driver?*

This feature does not impact the existing customers who are using ADLS Gen2 (HNS-enabled account) with the ABFS driver. They do not need to make any changes to their workloads or configurations. They will still enjoy the benefits of HNS, such as atomic operations, fine-grained access control, scalability, and performance.

*Official recommendation*

Microsoft continues to recommend that all big data and analytics customers use Azure Data Lake Gen2 (ADLS Gen2) with the ABFS driver, and will continue to optimize this scenario in future. We believe that this new option will help all those customers transition to a supported scenario immediately, while they plan to ultimately move to ADLS Gen2 (HNS-enabled account).

*New authentication options that a WASB to ABFS driver migrating customer will get*

The auth types below that WASB provides will continue to work on the new FNS-over-ABFS driver, over configuration that accepts these SAS types (similar to WASB):
* SharedKey
* Account SAS
* Service/Container SAS

The authentication types below that were not supported by the WASB driver but are supported by the ABFS driver will continue to be available for the new FNS-over-ABFS driver:
* OAuth 2.0 Client Credentials
* OAuth 2.0 Refresh Token
* Azure Managed Identity
* Custom OAuth 2.0 Token Provider

The ABFS driver SAS Token Provider plugin present today for User Delegation SAS and Directory SAS will continue to work only for HNS accounts.
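Illustrative URI change for a migrating workload (account and container names are made up; the exact endpoint and configuration for FNS-over-Blob support are defined by HADOOP-19179):

{code:java}
import org.apache.hadoop.fs.Path;

public class WasbToAbfsUri {
  public static void main(String[] args) {
    // Legacy WASB URI:
    Path wasb = new Path("wasb://container@account.blob.core.windows.net/data/part-0000");
    // Same data addressed through the ABFS driver after migration; only the
    // URI scheme (and host suffix) changes, not the workload logic:
    Path abfs = new Path("abfs://container@account.dfs.core.windows.net/data/part-0000");
    System.out.println(wasb + " -> " + abfs);
  }
}
{code}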
[jira] [Resolved] (HADOOP-19013) fs.getXattrs(path) for S3FS doesn't have x-amz-server-side-encryption-aws-kms-key-id header.
[ https://issues.apache.org/jira/browse/HADOOP-19013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mukund Thakur resolved HADOOP-19013.
------------------------------------
    Resolution: Fixed

> fs.getXattrs(path) for S3FS doesn't have x-amz-server-side-encryption-aws-kms-key-id header.
> ----------------------------------------------------------------------------------------------
>
>          Key: HADOOP-19013
>          URL: https://issues.apache.org/jira/browse/HADOOP-19013
>      Project: Hadoop Common
>   Issue Type: Sub-task
>   Components: fs/s3
> Affects Versions: 3.3.6
>     Reporter: Mukund Thakur
>     Assignee: Mukund Thakur
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 3.4.1
>
> Once a path has been encrypted with SSE-KMS with a key id while uploading, reading the attributes of the same file later doesn't include the key id information as an attribute. Should we add it?
[jira] [Created] (HADOOP-19177) TestS3ACachingBlockManager fails intermittently in Yetus
Mukund Thakur created HADOOP-19177:
--------------------------------------

Summary: TestS3ACachingBlockManager fails intermittently in Yetus
Key: HADOOP-19177
URL: https://issues.apache.org/jira/browse/HADOOP-19177
Project: Hadoop Common
Issue Type: Test
Components: fs/s3
Affects Versions: 3.4.0
Reporter: Mukund Thakur

{code:java}
[ERROR] org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGet -- Time elapsed: 60.45 s <<< ERROR!
java.lang.IllegalStateException: waitForCaching: expected: 1, actual: 0, read errors: 0, caching errors: 1
at org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.waitForCaching(TestS3ACachingBlockManager.java:465)
at org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGetHelper(TestS3ACachingBlockManager.java:435)
at org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingOfGet(TestS3ACachingBlockManager.java:398)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.lang.Thread.run(Thread.java:750)
[INFO]
[INFO] Results:
[INFO]
[ERROR] Errors:
[ERROR] org.apache.hadoop.fs.s3a.prefetch.TestS3ACachingBlockManager.testCachingFailureOfGet
[ERROR] Run 1: TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465 IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching errors: 1
[ERROR] Run 2: TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465 IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching errors: 1
[ERROR] Run 3: TestS3ACachingBlockManager.testCachingFailureOfGet:405->testCachingOfGetHelper:435->waitForCaching:465 IllegalState waitForCaching: expected: 1, actual: 0, read errors: 0, caching errors: 1
{code}

Discovered in [https://github.com/apache/hadoop/pull/6646#issuecomment-2111558054]
[jira] [Resolved] (HADOOP-19073) WASB: Fix connection leak in FolderRenamePending
[ https://issues.apache.org/jira/browse/HADOOP-19073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-19073.
-------------------------------------
    Resolution: Fixed

> WASB: Fix connection leak in FolderRenamePending
> -------------------------------------------------
>
>          Key: HADOOP-19073
>          URL: https://issues.apache.org/jira/browse/HADOOP-19073
>      Project: Hadoop Common
>   Issue Type: Bug
>   Components: fs/azure
> Affects Versions: 3.3.6
>     Reporter: xy
>     Assignee: xy
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 3.5.0
>
> Fix connection leak in FolderRenamePending in getting bytes.
[jira] [Created] (HADOOP-19176) S3A Xattr headers need hdfs-compatible prefix
Steve Loughran created HADOOP-19176:
---------------------------------------

Summary: S3A Xattr headers need hdfs-compatible prefix
Key: HADOOP-19176
URL: https://issues.apache.org/jira/browse/HADOOP-19176
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran

The s3a xattr list needs a prefix compatible with HDFS, or existing code which tries to copy attributes between stores can break: we need a prefix of {user/trusted/security/system/raw}.

Now, a problem: currently the xattrs are used by the magic committer to propagate file size progress, so renaming the prefix will break existing code. But as it's read-only, we could modify Spark to look for both old and new values.

{code}
org.apache.hadoop.HadoopIllegalArgumentException: An XAttr name must be prefixed with user/trusted/security/system/raw, followed by a '.'
at org.apache.hadoop.hdfs.XAttrHelper.buildXAttr(XAttrHelper.java:77)
at org.apache.hadoop.hdfs.DFSClient.setXAttr(DFSClient.java:2835)
at org.apache.hadoop.hdfs.DistributedFileSystem$59.doCall(DistributedFileSystem.java:3106)
at org.apache.hadoop.hdfs.DistributedFileSystem$59.doCall(DistributedFileSystem.java:3102)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.setXAttr(DistributedFileSystem.java:3115)
at org.apache.hadoop.fs.FileSystem.setXAttr(FileSystem.java:3097)
{code}
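A sketch of the breakage and a possible client-side remap (the "user." choice here is hypothetical, not the project's decision):

{code:java}
import java.net.URI;
import java.util.Map;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class CopyXAttrsToHdfs {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem s3a = FileSystem.get(new URI("s3a://bucket/"), conf);
    FileSystem hdfs = FileSystem.get(new URI("hdfs://namenode:8020/"), conf);
    Path src = new Path("s3a://bucket/data/file");
    Path dst = new Path("/data/file");
    for (Map.Entry<String, byte[]> e : s3a.getXAttrs(src).entrySet()) {
      // HDFS rejects xattr names outside user/trusted/security/system/raw,
      // so copying the raw S3A names fails with the exception quoted above;
      // remapping them under "user." is one workaround.
      hdfs.setXAttr(dst, "user." + e.getKey(), e.getValue());
    }
  }
}
{code}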
[jira] [Resolved] (HADOOP-18958) Improve UserGroupInformation debug log
[ https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran resolved HADOOP-18958.
-------------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

> Improve UserGroupInformation debug log
> ----------------------------------------
>
>          Key: HADOOP-18958
>          URL: https://issues.apache.org/jira/browse/HADOOP-18958
>      Project: Hadoop Common
>   Issue Type: Improvement
>   Components: common
> Affects Versions: 3.3.0, 3.3.5
>     Reporter: wangzhihui
>     Assignee: wangzhihui
>     Priority: Minor
>       Labels: pull-request-available
>      Fix For: 3.5.0
>  Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, image-2023-10-30-14-35-11-161.png
>  Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Using "new Exception()" to print the call stack of the doAs method in the UserGroupInformation class prints meaningless exception information and too many call stacks; this is not conducive to troubleshooting.
> *Example:*
> !20231029-122825.jpeg|width=991,height=548!
> *Improved result:*
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!
[jira] [Reopened] (HADOOP-18958) UserGroupInformation debug log improve
[ https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran reopened HADOOP-18958:
-------------------------------------

> UserGroupInformation debug log improve
> ----------------------------------------
>
>          Key: HADOOP-18958
>          URL: https://issues.apache.org/jira/browse/HADOOP-18958
>      Project: Hadoop Common
>   Issue Type: Improvement
>   Components: common
> Affects Versions: 3.3.0, 3.3.5
>     Reporter: wangzhihui
>     Priority: Minor
>       Labels: pull-request-available
>  Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, image-2023-10-30-14-35-11-161.png
>  Original Estimate: 1h
>  Remaining Estimate: 1h
>
> Using "new Exception()" to print the call stack of the doAs method in the UserGroupInformation class prints meaningless exception information and too many call stacks; this is not conducive to troubleshooting.
> *Example:*
> !20231029-122825.jpeg|width=991,height=548!
> *Improved result:*
> !image-2023-10-29-09-47-56-489.png|width=1099,height=156!
> !20231030-143525.jpeg|width=572,height=674!
[jira] [Resolved] (HADOOP-19152) Do not hard code security providers.
[ https://issues.apache.org/jira/browse/HADOOP-19152?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz-wo Sze resolved HADOOP-19152.
---------------------------------
    Fix Version/s: 3.5.0
     Hadoop Flags: Reviewed
     Release Note: Added a new conf "hadoop.security.crypto.jce.provider.auto-add" (default: true) to enable/disable auto-adding BouncyCastleProvider. This change also avoids statically loading the BouncyCastleProvider class.
       Resolution: Fixed

The pull request is now merged.

> Do not hard code security providers.
> --------------------------------------
>
>          Key: HADOOP-19152
>          URL: https://issues.apache.org/jira/browse/HADOOP-19152
>      Project: Hadoop Common
>   Issue Type: Improvement
>   Components: security
>     Reporter: Tsz-wo Sze
>     Assignee: Tsz-wo Sze
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 3.5.0
>
> In order to support different security providers in different clusters, we should not hard code a provider in our code.
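Usage sketch of the new switch from the release note above:

{code:java}
import org.apache.hadoop.conf.Configuration;

public class JceProviderAutoAdd {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Default is true. Set to false so the JVM's own provider list
    // (java.security) is used instead of auto-adding BouncyCastleProvider.
    conf.setBoolean("hadoop.security.crypto.jce.provider.auto-add", false);
  }
}
{code}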
[jira] [Created] (HADOOP-19175) update s3a committer docs
Steve Loughran created HADOOP-19175:
---------------------------------------

Summary: update s3a committer docs
Key: HADOOP-19175
URL: https://issues.apache.org/jira/browse/HADOOP-19175
Project: Hadoop Common
Issue Type: Improvement
Components: documentation, fs/s3
Affects Versions: 3.4.0
Reporter: Steve Loughran

Update the s3a committer docs:
* declare that the magic committer is stable and make it the recommended one
* show how to use the new command "mapred successfile" to print the success file
[jira] [Created] (HADOOP-19174) Tez and hive jobs fail due to google's protobuf 2.5.0 in classpath
Bilwa S T created HADOOP-19174:
----------------------------------

Summary: Tez and hive jobs fail due to google's protobuf 2.5.0 in classpath
Key: HADOOP-19174
URL: https://issues.apache.org/jira/browse/HADOOP-19174
Project: Hadoop Common
Issue Type: Bug
Reporter: Bilwa S T
Assignee: Bilwa S T

There are two issues here:

1. We are running Tez 0.10.3, which uses the Hadoop 3.3.6 version. Tez has protobuf version 3.21.1. Below is the exception we get; this is due to protobuf-2.5.0 in our Hadoop classpath.

{code}
java.lang.IllegalAccessError: class org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto tried to access private field com.google.protobuf.AbstractMessage.memoizedSize (org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto and com.google.protobuf.AbstractMessage are in unnamed module of loader 'app')
at org.apache.tez.dag.api.records.DAGProtos$ConfigurationProto.getSerializedSize(DAGProtos.java:21636)
at com.google.protobuf.AbstractMessageLite.writeTo(AbstractMessageLite.java:75)
at org.apache.tez.common.TezUtils.writeConfInPB(TezUtils.java:170)
at org.apache.tez.common.TezUtils.createByteStringFromConf(TezUtils.java:83)
at org.apache.tez.common.TezUtils.createUserPayloadFromConf(TezUtils.java:101)
at org.apache.tez.dag.app.DAGAppMaster.serviceInit(DAGAppMaster.java:436)
at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
at org.apache.tez.dag.app.DAGAppMaster$9.run(DAGAppMaster.java:2600)
at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1899)
at org.apache.tez.dag.app.DAGAppMaster.initAndStartAppMaster(DAGAppMaster.java:2597)
at org.apache.tez.dag.app.DAGAppMaster.main(DAGAppMaster.java:2384)
2024-04-18 16:27:54,741 [INFO] [shutdown-hook-0] |app.DAGAppMaster|: DAGAppMasterShutdownHook invoked
2024-04-18 16:27:54,743 [INFO] [shutdown-hook-0] |service.AbstractService|: Service org.apache.tez.dag.app.DAGAppMaster failed in state STOPPED
java.lang.NullPointerException: Cannot invoke "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because "this.taskSchedulerManager" is null
at org.apache.tez.dag.app.DAGAppMaster.initiateStop(DAGAppMaster.java:2111)
at org.apache.tez.dag.app.DAGAppMaster.serviceStop(DAGAppMaster.java:2126)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterShutdownHook.run(DAGAppMaster.java:2432)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
2024-04-18 16:27:54,744 [WARN] [Thread-2] |util.ShutdownHookManager|: ShutdownHook 'DAGAppMasterShutdownHook' failed, java.util.concurrent.ExecutionException: java.lang.NullPointerException: Cannot invoke "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because "this.taskSchedulerManager" is null
java.util.concurrent.ExecutionException: java.lang.NullPointerException: Cannot invoke "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because "this.taskSchedulerManager" is null
at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:205)
at org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
Caused by: java.lang.NullPointerException: Cannot invoke "org.apache.tez.dag.app.rm.TaskSchedulerManager.initiateStop()" because "this.taskSchedulerManager" is null
at org.apache.tez.dag.app.DAGAppMaster.initiateStop(DAGAppMaster.java:2111)
at org.apache.tez.dag.app.DAGAppMaster.serviceStop(DAGAppMaster.java:2126)
at org.apache.hadoop.service.AbstractService.stop(AbstractService.java:220)
at org.apache.tez.dag.app.DAGAppMaster$DAGAppMasterShutdownHook.run(DAGAppMaster.java:2432)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.ja
{code}
[jira] [Resolved] (HADOOP-19170) Fixes compilation issues on Mac
[ https://issues.apache.org/jira/browse/HADOOP-19170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang resolved HADOOP-19170.
--------------------------------------
    Fix Version/s: 3.5.0
       Resolution: Fixed

> Fixes compilation issues on Mac
> ---------------------------------
>
>          Key: HADOOP-19170
>          URL: https://issues.apache.org/jira/browse/HADOOP-19170
>      Project: Hadoop Common
>   Issue Type: Bug
>  Environment: OS: macOS Catalina 10.15.7
>               compiler: clang 12.0.0
>               cmake: 3.24.0
>     Reporter: Chenyu Zheng
>     Assignee: Chenyu Zheng
>     Priority: Major
>       Labels: pull-request-available
>      Fix For: 3.5.0
>
> When I build the hadoop-common native code on macOS, I get this error:
> {code:java}
> /x/hadoop/hadoop-common-project/hadoop-common/src/main/native/src/exception.c:114:50: error: function-like macro '__GLIBC_PREREQ' is not defined
> #if defined(__sun) || defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 32) {code}
> The reason is that macOS does not provide glibc, and C conditional compilation requires validation of all expressions, so the undefined function-like macro is still an error.
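The usual shape of the fix (sketch only; the guarded macro name here is hypothetical and the merged patch may differ) is to never expand __GLIBC_PREREQ in the same #if that tests whether it exists:

{code:c}
/* Broken on macOS/clang: the preprocessor parses the whole expression,
 * so the function-like macro is expanded even when defined() is false:
 *   #if defined(__sun) || defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 32)
 */

/* Portable guard: test for the macro first, expand it only when present. */
#if defined(__sun)
  #define HAS_NEW_STRERROR_BEHAVIOUR 1   /* hypothetical macro name */
#elif defined(__GLIBC_PREREQ)
  #if __GLIBC_PREREQ(2, 32)
    #define HAS_NEW_STRERROR_BEHAVIOUR 1
  #endif
#endif
{code}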
[jira] [Created] (HADOOP-19173) Upgrade org.apache.derby:derby to 10.17.1.0
Shilun Fan created HADOOP-19173:
-----------------------------------

Summary: Upgrade org.apache.derby:derby to 10.17.1.0
Key: HADOOP-19173
URL: https://issues.apache.org/jira/browse/HADOOP-19173
Project: Hadoop Common
Issue Type: Improvement
Components: build, common
Affects Versions: 3.5.0, 3.4.1
Reporter: Shilun Fan
Assignee: Shilun Fan

Upgrade org.apache.derby:derby to 10.17.1.0.
[jira] [Created] (HADOOP-19172) Upgrade aws-java-sdk to 1.12.720
Steve Loughran created HADOOP-19172:
---------------------------------------

Summary: Upgrade aws-java-sdk to 1.12.720
Key: HADOOP-19172
URL: https://issues.apache.org/jira/browse/HADOOP-19172
Project: Hadoop Common
Issue Type: Improvement
Components: build, fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran

Update to the latest AWS SDK, to stop anyone worrying about the ion library CVE https://nvd.nist.gov/vuln/detail/CVE-2024-21634. This isn't exposed in the s3a client, but may be used downstream.

On v2 SDK releases, the v1 SDK is only used during builds; on 3.3.x it is shipped.
[jira] [Created] (HADOOP-19171) AWS v2: handle alternative forms of connection failure
Steve Loughran created HADOOP-19171:
---------------------------------------

Summary: AWS v2: handle alternative forms of connection failure
Key: HADOOP-19171
URL: https://issues.apache.org/jira/browse/HADOOP-19171
Project: Hadoop Common
Issue Type: Sub-task
Components: fs/s3
Affects Versions: 3.3.6, 3.4.0
Reporter: Steve Loughran

We've had reports of network connection failures surfacing deeper in the stack, where we don't convert them to AWSApiCallTimeoutException, so they aren't retried properly (retire connection and repeat):

{code}
Unable to execute HTTP request: Broken pipe (Write failed)
{code}

{code}
Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed. (Service: Amazon S3; Status Code: 400; Error Code: RequestTimeout
{code}
[jira] [Created] (HADOOP-19170) Fixes compilation issues on non-Linux systems
Chenyu Zheng created HADOOP-19170:
-------------------------------------

Summary: Fixes compilation issues on non-Linux systems
Key: HADOOP-19170
URL: https://issues.apache.org/jira/browse/HADOOP-19170
Project: Hadoop Common
Issue Type: Bug
Reporter: Chenyu Zheng
Assignee: Chenyu Zheng

When I build the hadoop-common native code on macOS, I get this error:

{code:java}
/x/hadoop/hadoop-common-project/hadoop-common/src/main/native/src/exception.c:114:50: error: function-like macro '__GLIBC_PREREQ' is not defined
#if defined(__sun) || defined(__GLIBC_PREREQ) && __GLIBC_PREREQ(2, 32)
{code}

The reason is that macOS does not provide glibc, and C conditional compilation requires validation of all expressions, so the undefined function-like macro is still an error.
[jira] [Created] (HADOOP-19169) Hadoop: Upgrade @shore/bootstrap 3.3.5-shore.76
Sandeep Kumar created HADOOP-19169:
--------------------------------------

Summary: Hadoop: Upgrade @shore/bootstrap 3.3.5-shore.76
Key: HADOOP-19169
URL: https://issues.apache.org/jira/browse/HADOOP-19169
Project: Hadoop Common
Issue Type: Bug
Reporter: Sandeep Kumar

Upgrade @shore/bootstrap 3.3.5-shore.76 to a stable version.
[jira] [Created] (HADOOP-19168) Upgrade Kafka Clients due to CVEs
Rohit Kumar created HADOOP-19168:
------------------------------------

Summary: Upgrade Kafka Clients due to CVEs
Key: HADOOP-19168
URL: https://issues.apache.org/jira/browse/HADOOP-19168
Project: Hadoop Common
Issue Type: Task
Reporter: Rohit Kumar

Upgrade Kafka clients due to CVEs.

CVE-2023-25194: Affected versions of this package are vulnerable to Deserialization of Untrusted Data when there are gadgets in the classpath. The server will connect to the attacker's LDAP server and deserialize the LDAP response, which the attacker can use to execute Java deserialization gadget chains on the Kafka Connect server. CVSS score: 8.8 (High).
https://nvd.nist.gov/vuln/detail/CVE-2023-25194

CVE-2021-38153
CVE-2018-17196
Insufficient Entropy
https://security.snyk.io/package/maven/org.apache.kafka:kafka-clients

Upgrade kafka-clients to 3.4.0 or higher.
[jira] [Created] (HADOOP-19166) [DOC] Drop Migrating from Apache Hadoop 1.x to Apache Hadoop 2.x
Ayush Saxena created HADOOP-19166:
-------------------------------------

Summary: [DOC] Drop Migrating from Apache Hadoop 1.x to Apache Hadoop 2.x
Key: HADOOP-19166
URL: https://issues.apache.org/jira/browse/HADOOP-19166
Project: Hadoop Common
Issue Type: Improvement
Reporter: Ayush Saxena

Reading the docs, I found this page, which is pretty irrelevant in the current context and for upcoming 3.x releases; we can explore dropping it:
https://apache.github.io/hadoop/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduce_Compatibility_Hadoop1_Hadoop2.html
[jira] [Created] (HADOOP-19165) Explore dropping protobuf 2.5.0 from the distro
Ayush Saxena created HADOOP-19165: - Summary: Explore dropping protobuf 2.5.0 from the distro Key: HADOOP-19165 URL: https://issues.apache.org/jira/browse/HADOOP-19165 Project: Hadoop Common Issue Type: Improvement Reporter: Ayush Saxena Explore whether protobuf-2.5.0 can be dropped from the distro. It is a transitive dependency from HBase, but HBase doesn't use it in the code. Check whether HBase is the only thing pulling it into the distro; if things break we can exclude it there, and if nothing pulls it in, let's get rid of it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Reopened] (HADOOP-18851) Performance improvement for DelegationTokenSecretManager.
[ https://issues.apache.org/jira/browse/HADOOP-18851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sammi Chen reopened HADOOP-18851: - Revert the previous commit which removes the synchronized keywords. Will have a new implementation using ReentrantReadWriteLock. > Performance improvement for DelegationTokenSecretManager. > - > > Key: HADOOP-18851 > URL: https://issues.apache.org/jira/browse/HADOOP-18851 > Project: Hadoop Common > Issue Type: Task > Components: common >Affects Versions: 3.4.0 >Reporter: Vikas Kumar >Assignee: Vikas Kumar >Priority: Major > Labels: pull-request-available > Fix For: 3.4.0 > > Attachments: > 0001-HADOOP-18851-Perfm-improvement-for-ZKDT-management.patch, Screenshot > 2023-08-16 at 5.36.57 PM.png > > > *Context:* > KMS depends on hadoop-common for DT management. Recently we were analysing > one performance issue and the following are our findings: > # Around 96% (196 out of 200) KMS container threads were in BLOCKED state at the > following: > ## *AbstractDelegationTokenSecretManager.verifyToken()* > ## *AbstractDelegationTokenSecretManager.createPassword()* > # And then the process crashed. > > {code:java} > http-nio-9292-exec-200 PRIORITY : 5 THREAD ID : 0X7F075C157800 NATIVE ID : > 0X2C87F NATIVE ID (DECIMAL) : 182399 STATE : BLOCKED > stackTrace: > java.lang.Thread.State: BLOCKED (on object monitor) > at > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.verifyToken(AbstractDelegationTokenSecretManager.java:474) > - waiting to lock <0x0005f2f545e8> (a > org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.verifyToken(DelegationTokenManager.java:213) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationHandler.authenticate(DelegationTokenAuthenticationHandler.java:396) > at {code} > All the 199 out of 200 were blocked at the above point. > And the lock they are waiting for is acquired by a thread that was trying to > createPassword and publish the same on ZK. 
> > {code:java} > stackTrace: > java.lang.Thread.State: WAITING (on object monitor) > at java.lang.Object.wait(Native Method) > at java.lang.Object.wait(Object.java:502) > at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1598) > - locked <0x000749263ec0> (a org.apache.zookeeper.ClientCnxn$Packet) > at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1570) > at org.apache.zookeeper.ZooKeeper.setData(ZooKeeper.java:2235) > at > org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:398) > at > org.apache.curator.framework.imps.SetDataBuilderImpl$7.call(SetDataBuilderImpl.java:385) > at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:93) > at > org.apache.curator.framework.imps.SetDataBuilderImpl.pathInForeground(SetDataBuilderImpl.java:382) > at > org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:358) > at > org.apache.curator.framework.imps.SetDataBuilderImpl.forPath(SetDataBuilderImpl.java:36) > at > org.apache.curator.framework.recipes.shared.SharedValue.trySetValue(SharedValue.java:201) > at > org.apache.curator.framework.recipes.shared.SharedCount.trySetCount(SharedCount.java:116) > at > org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrSharedCount(ZKDelegationTokenSecretManager.java:586) > at > org.apache.hadoop.security.token.delegation.ZKDelegationTokenSecretManager.incrementDelegationTokenSeqNum(ZKDelegationTokenSecretManager.java:601) > at > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:402) > - locked <0x0005f2f545e8> (a > org.apache.hadoop.security.token.delegation.web.DelegationTokenManager$ZKSecretManager) > at > org.apache.hadoop.security.token.delegation.AbstractDelegationTokenSecretManager.createPassword(AbstractDelegationTokenSecretManager.java:48) > at org.apache.hadoop.security.token.Token.<init>(Token.java:67) > at > org.apache.hadoop.security.token.delegation.web.DelegationTokenManager.createToken(DelegationTokenManager.java:183) > {code} > We can say that this thread is slow and has blocked all the rest. But the > following is my observation: > > # verifyToken() and createPassword() have been synchronized because one is > reading the tokenMap
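A minimal sketch of the ReentrantReadWriteLock approach named in the reopen note, assuming verification is read-only and creation is the sole writer; the class and method shapes are illustrative, not the actual patch:
{code:java}
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class TokenStoreSketch {
  private final ReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<String, byte[]> passwords = new HashMap<>();

  // Verification is read-only: many callers can hold the read lock at
  // once, so they no longer queue behind one slow token creation.
  public byte[] verifyToken(String id) {
    lock.readLock().lock();
    try {
      return passwords.get(id);
    } finally {
      lock.readLock().unlock();
    }
  }

  // Creation mutates state (and, in the real class, publishes to
  // ZooKeeper), so it takes the exclusive write lock.
  public void createPassword(String id, byte[] password) {
    lock.writeLock().lock();
    try {
      passwords.put(id, password);
    } finally {
      lock.writeLock().unlock();
    }
  }
}
{code}
With this split, the blocked verifyToken() threads in the dump above would proceed under the shared read lock while only writers serialise.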
[jira] [Created] (HADOOP-19164) Hadoop CLI MiniCluster is broken
Ayush Saxena created HADOOP-19164: - Summary: Hadoop CLI MiniCluster is broken Key: HADOOP-19164 URL: https://issues.apache.org/jira/browse/HADOOP-19164 Project: Hadoop Common Issue Type: Bug Reporter: Ayush Saxena Documentation is also broken & it doesn't work either (https://apache.github.io/hadoop/hadoop-project-dist/hadoop-common/CLIMiniCluster.html) *Fails with:* {noformat} Exception in thread "main" java.lang.NoClassDefFoundError: org/mockito/stubbing/Answer at org.apache.hadoop.hdfs.MiniDFSCluster.isNameNodeUp(MiniDFSCluster.java:2666) at org.apache.hadoop.hdfs.MiniDFSCluster.isClusterUp(MiniDFSCluster.java:2680) at org.apache.hadoop.hdfs.MiniDFSCluster.waitClusterUp(MiniDFSCluster.java:1510) at org.apache.hadoop.hdfs.MiniDFSCluster.initMiniDFSCluster(MiniDFSCluster.java:989) at org.apache.hadoop.hdfs.MiniDFSCluster.<init>(MiniDFSCluster.java:588) at org.apache.hadoop.hdfs.MiniDFSCluster$Builder.build(MiniDFSCluster.java:530) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.start(MiniHadoopClusterManager.java:160) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.run(MiniHadoopClusterManager.java:132) at org.apache.hadoop.mapreduce.MiniHadoopClusterManager.main(MiniHadoopClusterManager.java:320) Caused by: java.lang.ClassNotFoundException: org.mockito.stubbing.Answer at java.net.URLClassLoader.findClass(URLClassLoader.java:387) at java.lang.ClassLoader.loadClass(ClassLoader.java:419) at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:352) at java.lang.ClassLoader.loadClass(ClassLoader.java:352) ... 9 more{noformat} *Command executed:* {noformat} bin/mapred minicluster -format{noformat} *Documentation Issues:* {noformat} bin/mapred minicluster -rmport RM_PORT -jhsport JHS_PORT{noformat} Without the -format option it doesn't work the first time, complaining that the NameNode isn't formatted, so this should be corrected. {noformat} 2024-05-04 00:35:52,933 WARN namenode.FSNamesystem: Encountered exception loading fsimage java.io.IOException: NameNode is not formatted. at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:253) {noformat} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19163) Upgrade protobuf version to 3.24.4
Bilwa S T created HADOOP-19163: -- Summary: Upgrade protobuf version to 3.24.4 Key: HADOOP-19163 URL: https://issues.apache.org/jira/browse/HADOOP-19163 Project: Hadoop Common Issue Type: Bug Components: hadoop-thirdparty Reporter: Bilwa S T Assignee: Bilwa S T -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19162) Add LzoCodec implementation based on aircompressor
L. C. Hsieh created HADOOP-19162: Summary: Add LzoCodec implementation based on aircompressor Key: HADOOP-19162 URL: https://issues.apache.org/jira/browse/HADOOP-19162 Project: Hadoop Common Issue Type: Improvement Reporter: L. C. Hsieh I remember that, due to licensing issues, Hadoop doesn't contain a built-in LzoCodec. Users can choose to build and install an LZO codec such as hadoop-lzo manually. Some implement LzoCodec based on other open source implementations like aircompressor, but it is somewhat inconvenient to maintain it separately. I'm wondering if we can add an LzoCodec implementation based on aircompressor into Hadoop as the default LzoCodec. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
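For reference, a sketch of the wiring this would enable, assuming aircompressor ships a Hadoop-compatible codec class; treat the io.airlift.compress.lzo.LzoCodec class name and the printed extension as assumptions:
{code:java}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.compress.CompressionCodec;
import org.apache.hadoop.io.compress.CompressionCodecFactory;

public class LzoWiringSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // Register the aircompressor-backed codec with the codec factory.
    conf.set("io.compression.codecs", "io.airlift.compress.lzo.LzoCodec");
    CompressionCodec codec = new CompressionCodecFactory(conf)
        .getCodecByName("io.airlift.compress.lzo.LzoCodec");
    System.out.println(codec.getDefaultExtension());
  }
}
{code}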
[jira] [Created] (HADOOP-19161) S3A: support a comma separated list of performance flags
Steve Loughran created HADOOP-19161: --- Summary: S3A: support a comma separated list of performance flags Key: HADOOP-19161 URL: https://issues.apache.org/jira/browse/HADOOP-19161 Project: Hadoop Common Issue Type: Improvement Components: fs/s3 Affects Versions: 3.4.1 Reporter: Steve Loughran Assignee: Steve Loughran HADOOP-19072 shows we want to add more optimisations than those of HADOOP-18930.
* Extending the new optimisations to the existing option is brittle
* Adding explicit options for each feature gets complex fast.
Proposed (see the sketch after this message)
* A new class S3APerformanceFlags keeps all the flags
* it builds this from a string[] of values, which can be extracted from getConf(),
* and it can also support a "*" option to mean "everything"
* this class can also be handed off to hasPathCapability() and do the right thing.
Proposed optimisations
* create file (we will hook up HADOOP-18930)
* mkdir (HADOOP-19072)
* delete (probe for parent path)
* rename (probe for source path)
We could think of more, with different names, later. The goal is to make it possible to strip out every HTTP request we do for safety/posix compliance, so applications have the option of turning off what they don't need. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
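A sketch of the proposed flag parsing, assuming a comma separated list plus "*" for everything; the names are illustrative, not the final HADOOP-19161 API:
{code:java}
import java.util.EnumSet;
import java.util.Locale;

public final class S3APerformanceFlagsSketch {
  public enum Flag { CREATE, MKDIR, DELETE, RENAME }

  private final EnumSet<Flag> flags;

  private S3APerformanceFlagsSketch(EnumSet<Flag> flags) {
    this.flags = flags;
  }

  // Parse "create,mkdir" or "*"; unknown names raise
  // IllegalArgumentException from Enum.valueOf().
  public static S3APerformanceFlagsSketch parse(String csv) {
    EnumSet<Flag> set = EnumSet.noneOf(Flag.class);
    for (String s : csv.split(",")) {
      String name = s.trim();
      if (name.isEmpty()) {
        continue;
      }
      if ("*".equals(name)) {
        return new S3APerformanceFlagsSketch(EnumSet.allOf(Flag.class));
      }
      set.add(Flag.valueOf(name.toUpperCase(Locale.ROOT)));
    }
    return new S3APerformanceFlagsSketch(set);
  }

  public boolean enabled(Flag f) {
    return flags.contains(f);
  }
}
{code}
hasPathCapability() could then map a capability probe straight onto enabled().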
[jira] [Resolved] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails
[ https://issues.apache.org/jira/browse/HADOOP-19146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19146. - Fix Version/s: 3.5.0 3.4.1 Resolution: Fixed > noaa-cors-pds bucket access with global endpoint fails > -- > > Key: HADOOP-19146 > URL: https://issues.apache.org/jira/browse/HADOOP-19146 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3, test >Affects Versions: 3.4.0 >Reporter: Viraj Jasani >Assignee: Viraj Jasani >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > > All tests accessing noaa-cors-pds use us-east-1 region, as configured at > bucket level. If global endpoint is configured (e.g. us-west-2), they fail to > access the bucket. > > Sample error: > {code:java} > org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect > response to region [us-east-1]. This likely indicates that the S3 region > configured in fs.s3a.endpoint.region does not match the AWS region containing > the bucket.: null (Service: S3, Status Code: 301, Request ID: > PMRWMQC9S91CNEJR, Extended Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253) > at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922) > at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115) > at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349) > at org.apache.hadoop.fs.Globber.glob(Globber.java:202) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) > at > org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) > at > org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281) > at > org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445) > at > org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311) > at > 
org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328) > at > org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) > at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) > {code} > {code:java} > Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null > (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended > Request ID: > 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) > at > software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) > at > software.amazon.awssdk.pro
[jira] [Resolved] (HADOOP-19151) Support configurable SASL mechanism
[ https://issues.apache.org/jira/browse/HADOOP-19151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz-wo Sze resolved HADOOP-19151. - Fix Version/s: 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed The pull request is now merged. > Support configurable SASL mechanism > --- > > Key: HADOOP-19151 > URL: https://issues.apache.org/jira/browse/HADOOP-19151 > Project: Hadoop Common > Issue Type: Improvement > Components: security >Reporter: Tsz-wo Sze >Assignee: Tsz-wo Sze >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > Currently, the SASL mechanism is hard coded to DIGEST-MD5. As mentioned in > HADOOP-14811, DIGEST-MD5 is known to be insecure; see > [rfc6331|https://datatracker.ietf.org/doc/html/rfc6331]. > In this JIRA, we will make the SASL mechanism configurable. The default > mechanism will still be DIGEST-MD5 in order to maintain compatibility. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19150) Test ITestAbfsRestOperationException#testAuthFailException is broken.
[ https://issues.apache.org/jira/browse/HADOOP-19150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mukund Thakur resolved HADOOP-19150. Fix Version/s: 3.4.1 Resolution: Fixed > Test ITestAbfsRestOperationException#testAuthFailException is broken. > -- > > Key: HADOOP-19150 > URL: https://issues.apache.org/jira/browse/HADOOP-19150 > Project: Hadoop Common > Issue Type: Sub-task >Reporter: Mukund Thakur >Assignee: Anuj Modi >Priority: Major > Labels: pull-request-available > Fix For: 3.4.1 > > > {code:java} > intercept(Exception.class, > () -> { > fs.getFileStatus(new Path("/")); > }); {code} > Intercept shouldn't be used as there are assertions in catch statements. > > CC [~ste...@apache.org] [~anujmodi2021] [~asrani_anmol] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19159) Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads
[ https://issues.apache.org/jira/browse/HADOOP-19159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19159. - Fix Version/s: 3.3.9 3.5.0 3.4.1 Resolution: Fixed > Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads > -- > > Key: HADOOP-19159 > URL: https://issues.apache.org/jira/browse/HADOOP-19159 > Project: Hadoop Common > Issue Type: Improvement > Components: documentation >Reporter: Xi Chen >Assignee: Xi Chen >Priority: Minor > Labels: pull-request-available > Fix For: 3.3.9, 3.5.0, 3.4.1 > > > The description about `fs.s3a.committer.abort.pending.uploads` in the > _Concurrent Jobs writing to the same destination_ section is not entirely correct. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19159) Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads
Xi Chen created HADOOP-19159: Summary: Fix hadoop-aws document for fs.s3a.committer.abort.pending.uploads Key: HADOOP-19159 URL: https://issues.apache.org/jira/browse/HADOOP-19159 Project: Hadoop Common Issue Type: Improvement Reporter: Xi Chen The description about `fs.s3a.committer.abort.pending.uploads` in the _Concurrent Jobs writing to the same destination_ section is not entirely correct. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19158) Support delegating ByteBufferPositionedReadable to vector reads
Steve Loughran created HADOOP-19158: --- Summary: Support delegating ByteBufferPositionedReadable to vector reads Key: HADOOP-19158 URL: https://issues.apache.org/jira/browse/HADOOP-19158 Project: Hadoop Common Issue Type: Sub-task Components: fs, fs/s3 Affects Versions: 3.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Make it easy for any stream with vector IO to support ByteBufferPositionedReadable. Specifically, ByteBufferPositionedReadable.readFully() is exactly a single range read, so it is easy to implement. The simpler read() call, which can return less than requested, isn't part of the vector API. Proposed: invoke readFully() but convert an EOFException to -1. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
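A sketch of the proposed adapter, under the assumption that the fallback wraps an existing ByteBufferPositionedReadable implementation; the helper shape is illustrative:
{code:java}
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.fs.ByteBufferPositionedReadable;

public final class ReadAdapterSketch {
  // Implement the simpler read() contract on top of readFully(),
  // mapping EOF to the conventional -1 return value.
  static int read(ByteBufferPositionedReadable in, long position,
      ByteBuffer buf) throws IOException {
    int len = buf.remaining();
    try {
      in.readFully(position, buf); // exactly one single-range read
      return len;
    } catch (EOFException e) {
      return -1; // read() reports EOF as -1 instead of throwing
    }
  }
}
{code}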
[jira] [Resolved] (HADOOP-17647) Release Hadoop 3.3.1
[ https://issues.apache.org/jira/browse/HADOOP-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wei-Chiu Chuang resolved HADOOP-17647. -- Assignee: Wei-Chiu Chuang Resolution: Done The release was published on June 15 2021. > Release Hadoop 3.3.1 > > > Key: HADOOP-17647 > URL: https://issues.apache.org/jira/browse/HADOOP-17647 > Project: Hadoop Common > Issue Type: Task >Reporter: Wei-Chiu Chuang >Assignee: Wei-Chiu Chuang >Priority: Major > > File this jira to track the release work of Hadoop 3.3.1 > Release dashboard: > https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12336122 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19107) Drop support for HBase v1
[ https://issues.apache.org/jira/browse/HADOOP-19107?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HADOOP-19107. --- Fix Version/s: 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed > Drop support for HBase v1 > - > > Key: HADOOP-19107 > URL: https://issues.apache.org/jira/browse/HADOOP-19107 > Project: Hadoop Common > Issue Type: Task >Reporter: Ayush Saxena >Assignee: Ayush Saxena >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > Drop support for HBase v1 and make building against HBase v2 the default. > Dev List: > [https://lists.apache.org/thread/vb2gh5ljwncbrmqnk0oflb8ftdz64hhs] > https://lists.apache.org/thread/o88hnm7q8n3b4bng81q14vsj3fbhfx5w -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19157) [ABFS] Filesystem contract tests to use methodPath for robust parallel test runs
Steve Loughran created HADOOP-19157: --- Summary: [ABFS] Filesystem contract tests to use methodPath for robust parallel test runs Key: HADOOP-19157 URL: https://issues.apache.org/jira/browse/HADOOP-19157 Project: Hadoop Common Issue Type: Sub-task Components: fs/azure, test Affects Versions: 3.4.0 Reporter: Steve Loughran Assignee: Steve Loughran hadoop-azure supports parallel test runs, but unlike hadoop-aws, the azure ones are parallelised across methods in the same test suites. This can fail badly where contract tests have hard-coded filenames and assume that they can be used across all test cases. It shows up when you are testing on a store with reduced IO capacity, where throttling triggers retries and makes some test cases slower. Fix: hadoop-common contract tests to use methodPath() names. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
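For illustration, the shape of the fix; real contract tests get methodPath() from the contract test base classes, so the JUnit TestName rule here just stands in for that helper:
{code:java}
import org.apache.hadoop.fs.Path;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.TestName;

public class MethodPathExampleTest {
  @Rule
  public TestName methodName = new TestName();

  // Embed the running method's name in the path so parallel methods in
  // one suite cannot collide on a shared hard-coded filename.
  private Path methodPath() {
    return new Path("/test", methodName.getMethodName());
  }

  @Test
  public void testOverwriteFile() {
    Path target = new Path(methodPath(), "testfile");
    // ... filesystem assertions against "target" would go here ...
  }
}
{code}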
[jira] [Resolved] (HADOOP-19102) [ABFS]: FooterReadBufferSize should not be greater than readBufferSize
[ https://issues.apache.org/jira/browse/HADOOP-19102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19102. - Fix Version/s: 3.5.0 3.4.1 Resolution: Fixed > [ABFS]: FooterReadBufferSize should not be greater than readBufferSize > -- > > Key: HADOOP-19102 > URL: https://issues.apache.org/jira/browse/HADOOP-19102 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.4.0 >Reporter: Pranav Saxena >Assignee: Pranav Saxena >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > > The method `optimisedRead` creates a buffer array of size `readBufferSize`. > If footerReadBufferSize is greater than readBufferSize, abfs will attempt to > read more data than the buffer array can hold, which causes an exception. > Change: To avoid this, we will keep footerBufferSize = > min(readBufferSizeConfig, footerBufferSizeConfig) > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19156) ZooKeeper based state stores use different ZK address configs
liu bin created HADOOP-19156: Summary: ZooKeeper based state stores use different ZK address configs Key: HADOOP-19156 URL: https://issues.apache.org/jira/browse/HADOOP-19156 Project: Hadoop Common Issue Type: Improvement Reporter: liu bin Currently, the Zookeeper-based state stores of RM, YARN Federation, and HDFS Federation use the same ZK address config {{hadoop.zk.address}}. But in our production environment, we hope that different services can use different ZKs to avoid mutual influence. This jira adds separate ZK address configs for each service. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
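A sketch of the fallback pattern such a change could use; the per-service key name below is hypothetical:
{code:java}
import org.apache.hadoop.conf.Configuration;

public final class ZkAddressSketch {
  // Prefer the service-specific key, fall back to the shared one.
  static String zkAddressFor(Configuration conf, String serviceKey) {
    return conf.get(serviceKey, conf.get("hadoop.zk.address"));
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("hadoop.zk.address", "zk-shared:2181");
    conf.set("yarn.resourcemanager.zk-address", "zk-rm:2181"); // hypothetical key
    System.out.println(zkAddressFor(conf, "yarn.resourcemanager.zk-address"));
  }
}
{code}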
[jira] [Created] (HADOOP-19155) Fix TestZKSignerSecretProvider failing unit test
kuper created HADOOP-19155: -- Summary: Fix TestZKSignerSecretProvider failing unit test Key: HADOOP-19155 URL: https://issues.apache.org/jira/browse/HADOOP-19155 Project: Hadoop Common Issue Type: Test Components: auth Affects Versions: 3.4.0 Reporter: kuper Attachments: 企业微信截图_4436de68-18c5-43bf-9382-4d9a853f7ef0.png, 企业微信截图_ab901a4a-c0d4-4a20-a595-057cf648c30c.png, 企业微信截图_fa5e7d54-b3a8-4ca3-8d4a-25fe493b4eb1.png * {{TestZKSignerSecretProvider}} and {{TestRandomSignerSecretProvider}} unit tests fail occasionally * The reason is that the MockZKSignerSecretProvider class's rollSecret() method is {{synchronized}} * Sometimes the verify(secretProvider, timeout(timeout).atLeastOnce()).rollSecret() call is beaten to the lock by the RolloverSignerSecretProvider scheduler thread, and this results in a timeout -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-18958) UserGroupInformation debug log improve
[ https://issues.apache.org/jira/browse/HADOOP-18958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] wangzhihui resolved HADOOP-18958. - Resolution: Not A Bug > UserGroupInformation debug log improve > -- > > Key: HADOOP-18958 > URL: https://issues.apache.org/jira/browse/HADOOP-18958 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Affects Versions: 3.3.0, 3.3.5 >Reporter: wangzhihui >Priority: Minor > Labels: pull-request-available > Attachments: 20231029-122825-1.jpeg, 20231029-122825.jpeg, > 20231030-143525.jpeg, image-2023-10-29-09-47-56-489.png, > image-2023-10-30-14-35-11-161.png > > Original Estimate: 1h > Remaining Estimate: 1h > > The UserGroupInformation class uses “new Exception()” to print the call > stack of the doAs method. This prints a meaningless Exception message and too > many call-stack frames, which is not conducive to troubleshooting. > *example:* > !20231029-122825.jpeg|width=991,height=548! > > *improved result*: > > !image-2023-10-29-09-47-56-489.png|width=1099,height=156! > !20231030-143525.jpeg|width=572,height=674! -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19154) upgrade bouncy castle to 1.78.1 due to CVEs
PJ Fanning created HADOOP-19154: --- Summary: upgrade bouncy castle to 1.78.1 due to CVEs Key: HADOOP-19154 URL: https://issues.apache.org/jira/browse/HADOOP-19154 Project: Hadoop Common Issue Type: Improvement Components: common Reporter: PJ Fanning [https://www.bouncycastle.org/releasenotes.html#r1rv78] There is a v1.78.1 release but no notes for it yet. For v1.78 h3. 2.1.5 Security Advisories. Release 1.78 deals with the following CVEs: * CVE-2024-29857 - Importing an EC certificate with specially crafted F2m parameters can cause high CPU usage during parameter evaluation. * CVE-2024-30171 - Possible timing based leakage in RSA based handshakes due to exception processing eliminated. * CVE-2024-30172 - Crafted signature and public key can be used to trigger an infinite loop in the Ed25519 verification code. * CVE-2024-301XX - When endpoint identification is enabled and an SSL socket is not created with an explicit hostname (as happens with HttpsURLConnection), hostname verification could be performed against a DNS-resolved IP address. This has been fixed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19130) FTPFileSystem rename with full qualified path broken
[ https://issues.apache.org/jira/browse/HADOOP-19130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HADOOP-19130. --- Fix Version/s: 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed > FTPFileSystem rename with full qualified path broken > > > Key: HADOOP-19130 > URL: https://issues.apache.org/jira/browse/HADOOP-19130 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 0.20.2, 3.3.3, 3.3.4, 3.3.6 >Reporter: shawn >Assignee: shawn >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > Attachments: image-2024-03-27-09-59-12-381.png, > image-2024-03-28-09-58-19-721.png > > Original Estimate: 2h > Remaining Estimate: 2h > > When using the fs shell to put/rename a file on an FTP server with a fully > qualified path (eg. [ftp://user:password@localhost/pathxxx]), it always gets > "Input/output error". The reason is that the underlying > changeWorkingDirectory command is being passed a string with a > [file://|file:///] URI prefix, which will not be understood by the FTP server > !image-2024-03-27-09-59-12-381.png|width=948,height=156! > > In our case, after > client.changeWorkingDirectory("ftp://mytest:myt...@10.5.xx.xx/files;) > executed, the workingDirectory of the FTP server is still "/", which is > incorrect (not understood by the FTP server) > !image-2024-03-28-09-58-19-721.png|width=745,height=431! > The solution should be to pass > absoluteSrc.getParent().toUri().getPath().toString() to avoid the > [file://|file:///] URI prefix, like this: > {code:java} > --- a/FTPFileSystem.java > +++ b/FTPFileSystem.java > @@ -549,15 +549,15 @@ public class FTPFileSystem extends FileSystem { > throw new IOException("Destination path " + dst > + " already exist, cannot rename!"); > } > - String parentSrc = absoluteSrc.getParent().toUri().toString(); > - String parentDst = absoluteDst.getParent().toUri().toString(); > + URI parentSrc = absoluteSrc.getParent().toUri(); > + URI parentDst = absoluteDst.getParent().toUri(); > String from = src.getName(); > String to = dst.getName(); > - if (!parentSrc.equals(parentDst)) { > + if (!parentSrc.toString().equals(parentDst.toString())) { > throw new IOException("Cannot rename parent(source): " + parentSrc > + ", parent(destination): " + parentDst); > } > - client.changeWorkingDirectory(parentSrc); > + client.changeWorkingDirectory(parentSrc.getPath().toString()); > boolean renamed = client.rename(from, to); > return renamed; > }{code} > There is already a related issue: > https://issues.apache.org/jira/browse/HADOOP-8653 > I created this issue and added a related unit test. > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19153) hadoop-common still exports logback as a transitive dependency
Steve Loughran created HADOOP-19153: --- Summary: hadoop-common still exports logback as a transitive dependency Key: HADOOP-19153 URL: https://issues.apache.org/jira/browse/HADOOP-19153 Project: Hadoop Common Issue Type: Bug Components: build, common Affects Versions: 3.4.0 Reporter: Steve Loughran Even though HADOOP-19084 set out to stop it, somehow ZK's declaration of a logback dependency is still contaminating the hadoop-common dependency graph, so causing problems downstream. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19152) Do not hard code security providers.
Tsz-wo Sze created HADOOP-19152: --- Summary: Do not hard code security providers. Key: HADOOP-19152 URL: https://issues.apache.org/jira/browse/HADOOP-19152 Project: Hadoop Common Issue Type: Improvement Components: security Reporter: Tsz-wo Sze Assignee: Tsz-wo Sze In order to support different security providers in different clusters, we should not hard code a provider in our code. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19151) Support configurable SASL mechanism
Tsz-wo Sze created HADOOP-19151: --- Summary: Support configurable SASL mechanism Key: HADOOP-19151 URL: https://issues.apache.org/jira/browse/HADOOP-19151 Project: Hadoop Common Issue Type: Improvement Components: security Reporter: Tsz-wo Sze Assignee: Tsz-wo Sze Currently, the SASL mechanism is hard coded to DIGEST-MD5. As mentioned in HADOOP-14811, DIGEST-MD5 is known to be insecure; see [rfc6331|https://datatracker.ietf.org/doc/html/rfc6331]. In this JIRA, we will make the SASL mechanism configurable. The default mechanism will still be DIGEST-MD5 in order to maintain compatibility. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
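A minimal sketch of what the configurable mechanism could look like; the config key and factory class are assumptions for illustration, not the final HADOOP-19151 API:
{code:java}
import java.util.Collections;
import javax.security.auth.callback.CallbackHandler;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;
import org.apache.hadoop.conf.Configuration;

public final class SaslClientSketch {
  // Hypothetical key; today the mechanism is a hard coded constant.
  static final String MECHANISM_KEY = "hadoop.security.sasl.mechanism";
  static final String MECHANISM_DEFAULT = "DIGEST-MD5";

  static SaslClient newClient(Configuration conf, String protocol,
      String serverName, CallbackHandler handler) throws SaslException {
    String mechanism = conf.get(MECHANISM_KEY, MECHANISM_DEFAULT);
    return Sasl.createSaslClient(new String[] {mechanism},
        null /* authorizationId */, protocol, serverName,
        Collections.emptyMap(), handler);
  }
}
{code}
Defaulting to DIGEST-MD5 keeps existing deployments working while letting clusters opt into a stronger mechanism.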
[jira] [Created] (HADOOP-19150) Test ITestAbfsRestOperationException#testAuthFailException is broken.
Mukund Thakur created HADOOP-19150: -- Summary: Test ITestAbfsRestOperationException#testAuthFailException is broken. Key: HADOOP-19150 URL: https://issues.apache.org/jira/browse/HADOOP-19150 Project: Hadoop Common Issue Type: Sub-task Reporter: Mukund Thakur {code:java} intercept(Exception.class, () -> { fs.getFileStatus(new Path("/")); }); {code} Intercept shouldn't be used as there are assertions in catch statements. CC [~ste...@apache.org] [~anujmodi2021] [~asrani_anmol] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19149) ABFS: Implement ThreadLocal for ObjectMapper in AzureHttpOperation via config option with static shared instance as an alternative.
Mukund Thakur created HADOOP-19149: -- Summary: ABFS: Implement ThreadLocal for ObjectMapper in AzureHttpOperation via config option with static shared instance as an alternative. Key: HADOOP-19149 URL: https://issues.apache.org/jira/browse/HADOOP-19149 Project: Hadoop Common Issue Type: Sub-task Components: fs/azure Affects Versions: 3.4.0 Reporter: Mukund Thakur Assignee: Mukund Thakur While doing internal tests on Hive TPCDS queries, we have seen many instances of ObjectMapper being created in an Application Master, so sharing a thread-local ObjectMapper instance will improve the performance. CC [~ste...@apache.org] [~harshit.gupta] -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
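A sketch of the thread-local pattern the title describes; the holder class is illustrative:
{code:java}
import com.fasterxml.jackson.databind.ObjectMapper;

public final class MapperHolderSketch {
  // One ObjectMapper per thread: construction is expensive, so reuse an
  // instance instead of building one per request, without making threads
  // contend on a single shared object.
  private static final ThreadLocal<ObjectMapper> MAPPER =
      ThreadLocal.withInitial(ObjectMapper::new);

  private MapperHolderSketch() {
  }

  public static ObjectMapper get() {
    return MAPPER.get();
  }
}
{code}
The config option mentioned in the title would simply switch between this holder and a single static shared instance.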
[jira] [Created] (HADOOP-19148) Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298
Brahma Reddy Battula created HADOOP-19148: - Summary: Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 Key: HADOOP-19148 URL: https://issues.apache.org/jira/browse/HADOOP-19148 Project: Hadoop Common Issue Type: Improvement Components: common Reporter: Brahma Reddy Battula Update solr from 8.11.2 to 8.11.3 to address CVE-2023-50298 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19106) [ABFS] All tests of. ITestAzureBlobFileSystemAuthorization fails with NPE
[ https://issues.apache.org/jira/browse/HADOOP-19106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anuj Modi resolved HADOOP-19106. Fix Version/s: 3.4.1 Hadoop Flags: Reviewed Release Note: https://github.com/apache/hadoop/pull/6676 Target Version/s: 3.4.1 Resolution: Fixed [HADOOP-19129: [ABFS] Test Fixes and Test Script Bug Fixes by anujmodi2021 · Pull Request #6676 · apache/hadoop (github.com)|https://github.com/apache/hadoop/pull/6676] > [ABFS] All tests of. ITestAzureBlobFileSystemAuthorization fails with NPE > - > > Key: HADOOP-19106 > URL: https://issues.apache.org/jira/browse/HADOOP-19106 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Affects Versions: 3.4.0 >Reporter: Mukund Thakur >Assignee: Anuj Modi >Priority: Major > Fix For: 3.4.1 > > > When the config below is set to true, all of the tests fail; otherwise they are > skipped. > > fs.azure.test.namespace.enabled > true > > > [*ERROR*] > testOpenFileAuthorized(org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization) > Time elapsed: 0.064 s <<< ERROR! > java.lang.NullPointerException > at > org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization.runTest(ITestAzureBlobFileSystemAuthorization.java:273) > at > org.apache.hadoop.fs.azurebfs.ITestAzureBlobFileSystemAuthorization.testOpenFileAuthorized(ITestAzureBlobFileSystemAuthorization.java:132) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > at > org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12) > at > org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56) > at > org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17) > at > org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26) > at > org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27) -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19129) ABFS: Fixing Test Script Bug and Some Known test Failures in ABFS Test Suite
[ https://issues.apache.org/jira/browse/HADOOP-19129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anuj Modi resolved HADOOP-19129. Fix Version/s: 3.4.1 Hadoop Flags: Reviewed Release Note: https://github.com/apache/hadoop/pull/6676 Resolution: Fixed [HADOOP-19129: [ABFS] Test Fixes and Test Script Bug Fixes by anujmodi2021 · Pull Request #6676 · apache/hadoop (github.com)|https://github.com/apache/hadoop/pull/6676] > ABFS: Fixing Test Script Bug and Some Known test Failures in ABFS Test Suite > > > Key: HADOOP-19129 > URL: https://issues.apache.org/jira/browse/HADOOP-19129 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.4.0, 3.4.1 >Reporter: Anuj Modi >Assignee: Anuj Modi >Priority: Major > Labels: pull-request-available > Fix For: 3.4.1 > > > The test script used by ABFS to validate changes has the following two issues: > # When there are a lot of test failures or when the error message of any failing > test becomes very large, the regex used today to filter test results does not > work as expected and fails to report all the failing tests. > To resolve this, we have come up with a new regex that will only target one-line > test names for reporting them into aggregated test results. > # While running the test suite for different combinations of Auth type and > account type, we add the combination specific configs first and then include > the account specific configs in the core-site.xml file. This will override the > combination specific configs like auth type if the same config is present in the > account specific config file. To avoid this, we will first include the > account specific configs and then add the combination specific configs. > Due to the above bug in the test script, some test failures in ABFS were not getting > our attention. This PR also targets to resolve them. Following are the tests > fixed: > # ITestAzureBlobFileSystemAppend.testCloseOfDataBlockOnAppendComplete(): It > was failing only when append blobs were enabled. In case of append blobs we > were not closing the active block on outputstream.close(), due to which > block.close() was not getting called and assertions around it were failing. > Fixed by updating the production code to close the active block on flush. > # ITestAzureBlobFileSystemAuthorization: Tests in this class work with an > existing remote filesystem instead of creating a new file system instance. > For this they require a file system configured in account settings using the > following config: "fs.contract.test.fs.abfs". Tests were failing with NPE > when this config was not present. Updated code to skip this test if the required > config is not present. > # ITestAbfsClient.testListPathWithValueGreaterThanServerMaximum(): Test was > failing intermittently only for HNS enabled accounts. Test wants to assert > that client.listPath() does not return more objects than what is configured > in maxListResults. Assertions should be that the number of objects returned could > be less than expected, as the server might end up returning even fewer due to > partition splits along with a continuation token. > # ITestGetNameSpaceEnabled.testGetIsNamespaceEnabledWhenConfigIsTrue(): Fails > when the "fs.azure.test.namespace.enabled" config is missing. Ignore the test if the > config is missing. > # ITestGetNameSpaceEnabled.testGetIsNamespaceEnabledWhenConfigIsFalse(): > Fails when the "fs.azure.test.namespace.enabled" config is missing. Ignore the > test if the config is missing. 
> # ITestGetNameSpaceEnabled.testNonXNSAccount(): Fail when > "fs.azure.test.namespace.enabled" config is missing. Ignore the test if > config is missing. > # ITestAbfsStreamStatistics.testAbfsStreamOps: Fails when > "fs.azure.test.appendblob.enabled" is set to true. Test wanted to assert that > number of read operations can be more in case of append blobs as compared to > normal blob because of automatic flush. It could be same as that of normal > blob as well. > # ITestAzureBlobFileSystemCheckAccess.testCheckAccessForAccountWithoutNS: > Fails for FNS Account only when following config is present: > fs.azure.account.hns.enabled". Failure is because test wants to assert that > when driver does not know if the account is HNS enabled or not it makes a > server call and fails. But above config is letting driver know the account > type and skipping the head call. Remove these configs from the test specific > configurations and no
[jira] [Resolved] (HADOOP-19110) ITestExponentialRetryPolicy failing in branch-3.4
[ https://issues.apache.org/jira/browse/HADOOP-19110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anuj Modi resolved HADOOP-19110. Fix Version/s: 3.4.1 Target Version/s: 3.4.1 Resolution: Fixed > ITestExponentialRetryPolicy failing in branch-3.4 > - > > Key: HADOOP-19110 > URL: https://issues.apache.org/jira/browse/HADOOP-19110 > Project: Hadoop Common > Issue Type: Bug > Components: fs/azure >Affects Versions: 3.4.0 >Reporter: Mukund Thakur >Assignee: Anuj Modi >Priority: Major > Fix For: 3.4.1 > > > {code:java} > [ERROR] Tests run: 6, Failures: 0, Errors: 1, Skipped: 2, Time elapsed: > 91.416 s <<< FAILURE! - in > org.apache.hadoop.fs.azurebfs.services.ITestExponentialRetryPolicy > [ERROR] > testThrottlingIntercept(org.apache.hadoop.fs.azurebfs.services.ITestExponentialRetryPolicy) > Time elapsed: 0.622 s <<< ERROR! > Failure to initialize configuration for dummy.dfs.core.windows.net key > ="null": Invalid configuration value detected for fs.azure.account.key > at > org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:53) > at > org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:646) > at > org.apache.hadoop.fs.azurebfs.services.ITestAbfsClient.createTestClientFromCurrentContext(ITestAbfsClient.java:339) > at > org.apache.hadoop.fs.azurebfs.services.ITestExponentialRetryPolicy.testThrottlingIntercept(ITestExponentialRetryPolicy.java:106) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:498) > at > org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59) > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19146) noaa-cors-pds bucket access with global endpoint fails
Viraj Jasani created HADOOP-19146: - Summary: noaa-cors-pds bucket access with global endpoint fails Key: HADOOP-19146 URL: https://issues.apache.org/jira/browse/HADOOP-19146 Project: Hadoop Common Issue Type: Improvement Components: fs/s3 Affects Versions: 3.4.0 Reporter: Viraj Jasani All tests accessing noaa-cors-pds use us-east-1 region, as configured at bucket level. If global endpoint is configured (e.g. us-west-2), they fail to access the bucket. Sample error: {code:java} org.apache.hadoop.fs.s3a.AWSRedirectException: Received permanent redirect response to region [us-east-1]. This likely indicates that the S3 region configured in fs.s3a.endpoint.region does not match the AWS region containing the bucket.: null (Service: S3, Status Code: 301, Request ID: PMRWMQC9S91CNEJR, Extended Request ID: 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:253) at org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:155) at org.apache.hadoop.fs.s3a.S3AFileSystem.s3GetFileStatus(S3AFileSystem.java:4041) at org.apache.hadoop.fs.s3a.S3AFileSystem.innerGetFileStatus(S3AFileSystem.java:3947) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$getFileStatus$26(S3AFileSystem.java:3924) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:3922) at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:115) at org.apache.hadoop.fs.Globber.doGlob(Globber.java:349) at org.apache.hadoop.fs.Globber.glob(Globber.java:202) at org.apache.hadoop.fs.s3a.S3AFileSystem.lambda$globStatus$35(S3AFileSystem.java:4956) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.invokeTrackingDuration(IOStatisticsBinding.java:547) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.lambda$trackDurationOfOperation$5(IOStatisticsBinding.java:528) at org.apache.hadoop.fs.statistics.impl.IOStatisticsBinding.trackDuration(IOStatisticsBinding.java:449) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2716) at org.apache.hadoop.fs.s3a.S3AFileSystem.trackDurationAndSpan(S3AFileSystem.java:2735) at org.apache.hadoop.fs.s3a.S3AFileSystem.globStatus(S3AFileSystem.java:4949) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:313) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:281) at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:445) at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:311) at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:328) at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:201) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1677) at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1674) {code} {code:java} Caused by: software.amazon.awssdk.services.s3.model.S3Exception: null (Service: S3, Status Code: 301, Request ID: 
PMRWMQC9S91CNEJR, Extended Request ID: 6Xrg9thLiZXffBM9rbSCRgBqwTxdLAzm6OzWk9qYJz1kGex3TVfdiMtqJ+G4vaYCyjkqL8cteKI/NuPBQu5A0Q==) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleErrorResponse(AwsXmlPredicatedResponseHandler.java:156) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handleResponse(AwsXmlPredicatedResponseHandler.java:108) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:85) at software.amazon.awssdk.protocols.xml.internal.unmarshall.AwsXmlPredicatedResponseHandler.handle(AwsXmlPredicatedResponseHandler.java:43) at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler$Crc32ValidationResponseHandler.handle(AwsSyncClientHandler.java:93) at software.amazon.awssdk.core.internal.handler.BaseClientHandler.lambda$successTransformationResponseHandler$7(BaseClientHandler.java:279) ... ... ... at software.amazon.awssdk.awscore.client.handler.AwsSyncClientHandler.execute(AwsSyncClientHandler.java:53
[jira] [Resolved] (HADOOP-19079) HttpExceptionUtils to check that loaded class is really an exception before instantiation
[ https://issues.apache.org/jira/browse/HADOOP-19079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19079. - Fix Version/s: 3.3.9 3.5.0 3.4.1 Resolution: Fixed > HttpExceptionUtils to check that loaded class is really an exception before > instantiation > - > > Key: HADOOP-19079 > URL: https://issues.apache.org/jira/browse/HADOOP-19079 > Project: Hadoop Common > Issue Type: Task > Components: common, security >Reporter: PJ Fanning >Assignee: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.3.9, 3.5.0, 3.4.1 > > > It can be dangerous taking class names as inputs from HTTP messages even if > we control the source. Issue is in HttpExceptionUtils in hadoop-common > (validateResponse method). > I can provide a PR that will highlight the issue. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19096) [ABFS] Enhancing Client-Side Throttling Metrics Updation Logic
[ https://issues.apache.org/jira/browse/HADOOP-19096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19096. - Fix Version/s: 3.5.0 Resolution: Fixed > [ABFS] Enhancing Client-Side Throttling Metrics Updation Logic > -- > > Key: HADOOP-19096 > URL: https://issues.apache.org/jira/browse/HADOOP-19096 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs/azure >Affects Versions: 3.4.1 >Reporter: Anuj Modi >Assignee: Anuj Modi >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > > ABFS has a client-side throttling mechanism which works on the metrics > collected from past requests made. If requests are getting failed due to > throttling at the server, we update our metrics and the client side backoff is > calculated based on those metrics. > This PR enhances the logic to decide which requests should be considered to > compute the client side backoff interval as follows: > For each request made by the ABFS driver, we will determine if it should > contribute to Client-Side Throttling based on the status code and result: > # Status code in 2xx range: Successful Operations should contribute. > # Status code in 3xx range: Redirection Operations should not contribute. > # Status code in 4xx range: User Errors should not contribute. > # Status code is 503: Throttling Errors should contribute only if they are > due to a client limits breach, as follows: > ## 503, Ingress Over Account Limit: Should Contribute > ## 503, Egress Over Account Limit: Should Contribute > ## 503, TPS Over Account Limit: Should Contribute > ## 503, Other Server Throttling: Should not Contribute. > # Status code in 5xx range other than 503: Should not Contribute. > # IOException and UnknownHostExceptions: Should not Contribute. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
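The contribution rules above reduce to a small predicate; a sketch in which the method shape and error-message matching are illustrative:
{code:java}
public final class ThrottlingContributionSketch {
  // 2xx and client-limit 503s feed the client-side throttling metrics;
  // 3xx, 4xx, other 5xx and network errors are ignored.
  static boolean contributes(int statusCode, String errorMessage) {
    if (statusCode >= 200 && statusCode < 300) {
      return true; // successful operations
    }
    if (statusCode == 503 && errorMessage != null) {
      return errorMessage.contains("Ingress Over Account Limit")
          || errorMessage.contains("Egress Over Account Limit")
          || errorMessage.contains("TPS Over Account Limit");
    }
    return false;
  }
}
{code}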
[jira] [Resolved] (HADOOP-19098) Vector IO: consistent specified rejection of overlapping ranges
[ https://issues.apache.org/jira/browse/HADOOP-19098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19098. - Resolution: Fixed > Vector IO: consistent specified rejection of overlapping ranges > --- > > Key: HADOOP-19098 > URL: https://issues.apache.org/jira/browse/HADOOP-19098 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs, fs/s3 >Affects Versions: 3.3.6 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Major > Labels: pull-request-available > Fix For: 3.3.9, 3.5.0, 3.4.1 > > > Related to PARQUET-2171 q: "how do you deal with overlapping ranges?" > I believe s3a rejects this, but the other impls may not. > Proposed > FS spec to say > * "overlap triggers IllegalArgumentException". > * special case: 0 byte ranges may be short circuited to return empty buffer > even without checking file length etc. > Contract tests to validate this > (+ common helper code to do this). > I'll copy the validation stuff into the parquet PR for consistency with older > releases -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
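A sketch of the validation the spec change implies, using the public FileRange type; the helper name is illustrative:
{code:java}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import org.apache.hadoop.fs.FileRange;

public final class RangeChecksSketch {
  // Sort by offset; any range starting before the previous one ends
  // overlaps it, which the proposed spec rejects.
  static void validateNonOverlapping(List<? extends FileRange> ranges) {
    List<FileRange> sorted = new ArrayList<>(ranges);
    sorted.sort(Comparator.comparingLong(FileRange::getOffset));
    for (int i = 1; i < sorted.size(); i++) {
      FileRange prev = sorted.get(i - 1);
      FileRange cur = sorted.get(i);
      if (prev.getOffset() + prev.getLength() > cur.getOffset()) {
        throw new IllegalArgumentException(
            "Overlapping ranges: " + prev + " and " + cur);
      }
    }
  }
}
{code}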
[jira] [Resolved] (HADOOP-19101) Vectored Read into off-heap buffer broken in fallback implementation
[ https://issues.apache.org/jira/browse/HADOOP-19101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19101. - Fix Version/s: 3.3.9 3.4.1 Resolution: Fixed > Vectored Read into off-heap buffer broken in fallback implementation > > > Key: HADOOP-19101 > URL: https://issues.apache.org/jira/browse/HADOOP-19101 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs, fs/azure >Affects Versions: 3.4.0, 3.3.6 >Reporter: Steve Loughran >Assignee: Steve Loughran >Priority: Blocker > Fix For: 3.3.9, 3.5.0, 3.4.1 > > > {{VectoredReadUtils.readInDirectBuffer()}} always starts off reading at > position zero even when the range is at a different offset. As a result: you > can get incorrect information. > The fix for this is straightforward: we pass in a FileRange and use its offset > as the starting position. > However, this does mean that all shipping releases 3.3.5-3.4.0 cannot safely > read vectorIO into direct buffers through HDFS, ABFS or GCS. Note that we > have never seen this in production because the parquet and ORC libraries both > read into on-heap storage. > Those libraries need to be audited to make sure that they never attempt to > read into off-heap DirectBuffers. This is a bit trickier than you would think > because an allocator is passed in. For PARQUET-2171 we will > * only invoke the API on streams which explicitly declare their support for > the API (so fallback in parquet itself) > * not invoke when direct buffer allocation is in use. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
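The shape of the bug and fix as described, in a hedged sketch that is not the actual patch:
{code:java}
import java.io.IOException;
import java.nio.ByteBuffer;
import org.apache.hadoop.fs.FileRange;
import org.apache.hadoop.fs.PositionedReadable;

public final class ReadRangeSketch {
  static void readRange(PositionedReadable in, FileRange range,
      ByteBuffer buffer) throws IOException {
    byte[] tmp = new byte[range.getLength()];
    // The bug: the read started at position 0. The fix: honour the
    // range's own offset before copying into the (direct) buffer.
    in.readFully(range.getOffset(), tmp, 0, tmp.length);
    buffer.put(tmp);
    buffer.flip();
  }
}
{code}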
[jira] [Resolved] (HADOOP-19109) Fix metrics description
[ https://issues.apache.org/jira/browse/HADOOP-19109?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaobao Wu resolved HADOOP-19109. - Resolution: Not A Problem > Fix metrics description > --- > > Key: HADOOP-19109 > URL: https://issues.apache.org/jira/browse/HADOOP-19109 > Project: Hadoop Common > Issue Type: Improvement >Affects Versions: 3.3.0, 3.3.4 >Reporter: Xiaobao Wu >Priority: Minor > Labels: pull-request-available > > The description of the RpcLockWaitTimeNumOps metric seems to be incorrect: > {code:java} > | `RpcQueueTimeNumOps` | Total number of RPC calls | > | `RpcQueueTimeAvgTime` | Average queue time in milliseconds | > | `RpcLockWaitTimeNumOps` | Total number of RPC calls (same as > RpcQueueTimeNumOps) |{code} > I think the description of this metric should be clearer: > {code:java} > | `RpcLockWaitTimeNumOps` | Total number of waiting for lock acquisition > |{code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-18135) Produce Windows binaries of Hadoop
[ https://issues.apache.org/jira/browse/HADOOP-18135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gautham Banasandra resolved HADOOP-18135. - Fix Version/s: 3.5.0 Resolution: Fixed Merged PR [https://github.com/apache/hadoop/pull/6673] to trunk. > Produce Windows binaries of Hadoop > -- > > Key: HADOOP-18135 > URL: https://issues.apache.org/jira/browse/HADOOP-18135 > Project: Hadoop Common > Issue Type: Improvement > Components: build >Affects Versions: 3.4.0 > Environment: Windows 10 >Reporter: Gautham Banasandra >Assignee: Gautham Banasandra >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > We currently only provide Linux libraries and binaries. We need to provide > the same for Windows. We need to port the [create-release > script|https://github.com/apache/hadoop/blob/5f9932acc4fa2b36a3005e587637c53f2da1618d/dev-support/bin/create-release] > to run on Windows and produce the Windows binaries. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19145) Software Architecture Document
Levon Khorasandzhian created HADOOP-19145: - Summary: Software Architecture Document Key: HADOOP-19145 URL: https://issues.apache.org/jira/browse/HADOOP-19145 Project: Hadoop Common Issue Type: Improvement Components: documentation Reporter: Levon Khorasandzhian Attachments: Apache_Hadoop_SAD.pdf We (@lkhorasandzhian & @vacherkasskiy) have prepared new documentation. The attached Software Architecture Document is very useful for new contributors and developers to get acquainted with this enormous system in a short time. Currently it's only in Russian, but if you're interested in such files we can translate it into English. There are no changes in code, only new documentation files. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19135) Remove Jcache 1.0-alpha
[ https://issues.apache.org/jira/browse/HADOOP-19135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shilun Fan resolved HADOOP-19135. - Fix Version/s: 3.5.0 3.4.1 Hadoop Flags: Reviewed Resolution: Fixed > Remove Jcache 1.0-alpha > --- > > Key: HADOOP-19135 > URL: https://issues.apache.org/jira/browse/HADOOP-19135 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Affects Versions: 3.5.0, 3.4.1 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > > In YARN Federation, we use JCache. The version of JCache has not been > maintained for a long time. We directly use Ehcache instead of JCache in > YARN-11663, so we can remove JCache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19144) S3A prefetching to support Vector IO
Steve Loughran created HADOOP-19144: --- Summary: S3A prefetching to support Vector IO Key: HADOOP-19144 URL: https://issues.apache.org/jira/browse/HADOOP-19144 Project: Hadoop Common Issue Type: Sub-task Components: fs/s3 Affects Versions: 3.4.0 Reporter: Steve Loughran Add explicit support for vector IO in the s3a prefetching stream. * if a range is in one or more cached blocks, it SHALL be read from cache and returned * if a range is not in cache: TBD * if a range is partially in cache: TBD These are the same decisions that abfs has to make: should the client fetch/cache blocks, or just do one or more GET requests? A big issue is: does caching of data fetched in a range request make any sense at all? Or more specifically: does fetching the blocks in which range requests are found make sense? Simply going to the store is a lot simpler. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19141) Update VectorIO default values consistently
[ https://issues.apache.org/jira/browse/HADOOP-19141?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun resolved HADOOP-19141. Fix Version/s: 3.3.7 3.4.1 Resolution: Fixed > Update VectorIO default values consistently > --- > > Key: HADOOP-19141 > URL: https://issues.apache.org/jira/browse/HADOOP-19141 > Project: Hadoop Common > Issue Type: Sub-task > Components: fs, fs/s3 >Affects Versions: 3.4.1 >Reporter: Dongjoon Hyun >Priority: Major > Labels: pull-request-available > Fix For: 3.3.7, 3.4.1 > > -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19143) Upgrade commons-cli to 1.6.0.
Shilun Fan created HADOOP-19143: --- Summary: Upgrade commons-cli to 1.6.0. Key: HADOOP-19143 URL: https://issues.apache.org/jira/browse/HADOOP-19143 Project: Hadoop Common Issue Type: Improvement Components: build, common Affects Versions: 3.5.0, 3.4.1 Reporter: Shilun Fan Assignee: Shilun Fan commons-cli can be upgraded to 1.6.0; I will try to upgrade it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19142) DfsRouterAdmin RefreshCallQueue fails when authorization is enabled
Ananya Singh created HADOOP-19142: - Summary: DfsRouterAdmin RefreshCallQueue fails when authorization is enabled Key: HADOOP-19142 URL: https://issues.apache.org/jira/browse/HADOOP-19142 Project: Hadoop Common Issue Type: Bug Components: ipc Affects Versions: 3.3.6 Reporter: Ananya Singh Assignee: Ananya Singh -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19141) Update VectorIO default values consistently
Dongjoon Hyun created HADOOP-19141: -- Summary: Update VectorIO default values consistently Key: HADOOP-19141 URL: https://issues.apache.org/jira/browse/HADOOP-19141 Project: Hadoop Common Issue Type: Sub-task Components: fs, fs/s3 Affects Versions: 3.4.1 Reporter: Dongjoon Hyun -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19140) [ABFS, S3A] Add IORateLimiter api to hadoop common
Steve Loughran created HADOOP-19140: --- Summary: [ABFS, S3A] Add IORateLimiter api to hadoop common Key: HADOOP-19140 URL: https://issues.apache.org/jira/browse/HADOOP-19140 Project: Hadoop Common Issue Type: Sub-task Components: fs, fs/azure, fs/s3 Affects Versions: 3.4.0 Reporter: Steve Loughran Assignee: Steve Loughran Create a rate limiter API in hadoop-common from which code (initially the manifest committer and bulk delete) can request IO capacity for a specific operation. This can be exported by filesystems to support shared rate limiting across all threads. Pulled from the HADOOP-19093 PR. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
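For illustration, such an API could be as small as a single method; this shape is an assumption, not the final HADOOP-19140 design:
{code:java}
import java.time.Duration;

/** Hypothetical sketch of a shared IO rate limiter exported by a filesystem. */
public interface IORateLimiter {

  /**
   * Block until the requested IO capacity is available for the named
   * operation (e.g. a bulk delete or a manifest-committer rename).
   * @param operation name of the operation requesting capacity
   * @param requests number of requests the caller is about to issue
   * @return how long the caller was blocked waiting for capacity
   */
  Duration acquireIOCapacity(String operation, int requests);
}
{code}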
[jira] [Created] (HADOOP-19139) [ABFS]: No GetPathStatus call for opening AbfsInputStream
Pranav Saxena created HADOOP-19139: -- Summary: [ABFS]: No GetPathStatus call for opening AbfsInputStream Key: HADOOP-19139 URL: https://issues.apache.org/jira/browse/HADOOP-19139 Project: Hadoop Common Issue Type: Sub-task Components: fs/azure Reporter: Pranav Saxena Assignee: Pranav Saxena The read API gives the contentLength and eTag of the path. This information would be used in future calls on that inputStream, so fetching the eTag in advance is of little importance. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19138) CSE-KMS S3A: Support for InstructionFile to store ECEK meta info
Vikas Kumar created HADOOP-19138: Summary: CSE-KMS S3A: Support for InstructionFile to store ECEK meta info Key: HADOOP-19138 URL: https://issues.apache.org/jira/browse/HADOOP-19138 Project: Hadoop Common Issue Type: New Feature Components: command, tools Reporter: Vikas Kumar
{*}Task{*}: Support for InstructionFile to store ECEK meta info
*Current implementation/Context:* Hadoop-aws supports CSE-KMS. During CSE, key encryption info needs to be kept somewhere. The AWS SDK supports two ways:
# *S3 Object's metadata*: The current integration in hadoop-aws only supports this approach.
## But S3 metadata has a 2 KB size limit.
## Also, metadata cannot be updated independently. It would be a complete object read/write operation even if we only need to change the metadata.
# *Instruction file approach:* It's a small file containing the meta-info, in the same bucket at the same location. This approach needs one extra S3 round trip per read/write operation but could be useful if the business needs frequent metadata changes.
*Use case:* to implement KMS RE-ENCRYPT, where only the CEK (DEK) needs to be encrypted with new key material. The instruction file approach could be useful here. Plus there could be many other use cases based on different business needs.
*My analysis:* I tried to enable this by setting *CryptoStorageMode.InstructionFile* in CryptoConfigurationV2 while building AmazonS3EncryptionClientV2Builder. Note: ObjectMetadata is the default value.
{*}Result{*}: The write operation worked but the read failed due to a missing instruction file.
*RCA:* On debugging, I found the following. On a put request, say myfile.txt:
* First, S3AFileSystem writes the file to S3 as *myfile.txt_COPYING_*
* Second, it writes the corresponding instruction file as *myfile.txt_COPYING_.instruction*
* Third, it calls rename.
** Rename here means copy the file bytes to *myfile.txt* and
** *delete the* *myfile.txt_COPYING_*
* Here the problem occurs:
** After deleting any object, the AmazonS3EncryptionClientV2 class looks for the corresponding instruction file and, if found, deletes it too. As a result, it deletes *myfile.txt_COPYING_.instruction* as well.
Related Code: com.amazonaws.services.s3.AmazonS3EncryptionClientV2.deleteObject() // part of aws sdk bundle
*Possible solution:* S3AFileSystem (part of hadoop-aws) needs to be updated to first rename the instruction file, then the original file. This way the deletion of the instruction file can be avoided. It also requires config changes to take ObjectMetadata/InstructionFile as a config parameter.
Let's discuss if we have any better solution that can be incorporated. Once we agree on one common solution, I can work on the implementation part. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
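The proposed ordering change can be sketched as follows (hypothetical sequence; the actual fix would live in the S3AFileSystem rename path):
{code:java}
// Hypothetical rename sequence for CSE with instruction files:
// 1. copy myfile.txt_COPYING_.instruction -> myfile.txt.instruction
// 2. copy myfile.txt_COPYING_             -> myfile.txt
// 3. delete myfile.txt_COPYING_
//    (the SDK then also deletes myfile.txt_COPYING_.instruction, which is
//    now harmless because the renamed instruction file already exists)
{code}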
[jira] [Resolved] (HADOOP-19123) Update commons-configuration2 to 2.10.1 due to CVE
[ https://issues.apache.org/jira/browse/HADOOP-19123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HADOOP-19123. --- Fix Version/s: 3.5.0 3.4.1 Hadoop Flags: Reviewed Resolution: Fixed > Update commons-configuration2 to 2.10.1 due to CVE > -- > > Key: HADOOP-19123 > URL: https://issues.apache.org/jira/browse/HADOOP-19123 > Project: Hadoop Common > Issue Type: Task >Reporter: PJ Fanning >Assignee: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > > https://github.com/advisories/GHSA-9w38-p64v-xpmv -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19115) upgrade to nimbus-jose-jwt 9.37.2 due to CVE
[ https://issues.apache.org/jira/browse/HADOOP-19115?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19115. - Fix Version/s: 3.3.9 3.5.0 3.4.1 Assignee: PJ Fanning Resolution: Fixed > upgrade to nimbus-jose-jwt 9.37.2 due to CVE > > > Key: HADOOP-19115 > URL: https://issues.apache.org/jira/browse/HADOOP-19115 > Project: Hadoop Common > Issue Type: Bug > Components: build, CVE >Affects Versions: 3.4.0, 3.5.0 >Reporter: PJ Fanning >Assignee: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.3.9, 3.5.0, 3.4.1 > > > https://github.com/advisories/GHSA-gvpg-vgmx-xg6w -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19137) [ABFS]:Extra getAcl call while calling first API of FileSystem
Pranav Saxena created HADOOP-19137: -- Summary: [ABFS]:Extra getAcl call while calling first API of FileSystem Key: HADOOP-19137 URL: https://issues.apache.org/jira/browse/HADOOP-19137 Project: Hadoop Common Issue Type: Sub-task Components: fs/azure Affects Versions: 3.4.0 Reporter: Pranav Saxena Assignee: Pranav Saxena The store doesn't flow the namespace information to the client. In https://github.com/apache/hadoop/pull/3440, getIsNamespaceEnabled was added to client methods; it checks whether the namespace information is present and, if not, makes a getAcl call and sets the field. Once the field is set, it is used in future getIsNamespaceEnabled calls for a given AbfsClient. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
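The caching pattern being described is essentially a lazily initialized, thread-safe flag; a minimal sketch, assuming a hypothetical probeViaGetAcl() helper for the getAcl call:
{code:java}
import java.io.IOException;

public abstract class NamespaceCheckSketch {

  /** Cached after the first probe; null means "not yet known". */
  private volatile Boolean namespaceEnabled;

  /** Hypothetical helper issuing the actual getAcl call. */
  protected abstract boolean probeViaGetAcl() throws IOException;

  public boolean getIsNamespaceEnabled() throws IOException {
    Boolean cached = namespaceEnabled;
    if (cached == null) {
      synchronized (this) {
        if (namespaceEnabled == null) {
          namespaceEnabled = probeViaGetAcl();
        }
        cached = namespaceEnabled;
      }
    }
    return cached;
  }
}
{code}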
[jira] [Created] (HADOOP-19136) Upgrade commons-io to 2.15.0
Shilun Fan created HADOOP-19136: --- Summary: Upgrade commons-io to 2.15.0 Key: HADOOP-19136 URL: https://issues.apache.org/jira/browse/HADOOP-19136 Project: Hadoop Common Issue Type: Improvement Components: common Affects Versions: 3.4.1 Reporter: Shilun Fan Assignee: Shilun Fan commons-io can be upgraded from 2.14.0 to 2.15.0; I will try to upgrade it. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19135) Remove Jcache 1.0-alpha
Shilun Fan created HADOOP-19135: --- Summary: Remove Jcache 1.0-alpha Key: HADOOP-19135 URL: https://issues.apache.org/jira/browse/HADOOP-19135 Project: Hadoop Common Issue Type: Improvement Components: common Affects Versions: 3.5.0, 3.4.1 Reporter: Shilun Fan Assignee: Shilun Fan In YARN Federation, we use JCache. The version of JCache has not been maintained for a long time. We directly use Ehcache instead of JCache in YARN-11663, so we can remove JCache. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19077) Remove use of javax.ws.rs.core.HttpHeaders
[ https://issues.apache.org/jira/browse/HADOOP-19077?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HADOOP-19077. --- Fix Version/s: 3.4.1 Hadoop Flags: Reviewed Resolution: Fixed > Remove use of javax.ws.rs.core.HttpHeaders > -- > > Key: HADOOP-19077 > URL: https://issues.apache.org/jira/browse/HADOOP-19077 > Project: Hadoop Common > Issue Type: Task > Components: io >Reporter: PJ Fanning >Assignee: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.4.1 > > > One step towards removing Hadoop's dependence on Jersey1 and jsr311-api. > We have other classes where we can get HTTP header names. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19134) use StringBuilder instead of StringBuffer
PJ Fanning created HADOOP-19134: --- Summary: use StringBuilder instead of StringBuffer Key: HADOOP-19134 URL: https://issues.apache.org/jira/browse/HADOOP-19134 Project: Hadoop Common Issue Type: Improvement Reporter: PJ Fanning StringBuilder is basically the same as StringBuffer but doesn't use synchronized. String appending rarely needs locking like this. There are some public and package private APIs that use StringBuffers as input or return types - I have left these alone for compatibility reasons. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
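A minimal before/after illustration of the change:
{code:java}
String[] parts = {"alpha", "beta", "gamma"};

// Before: StringBuffer synchronizes on every append, even though the
// buffer never escapes the current thread.
StringBuffer buffer = new StringBuffer();
for (String part : parts) {
  buffer.append(part).append(',');
}

// After: StringBuilder offers the same API without the lock overhead.
StringBuilder builder = new StringBuilder();
for (String part : parts) {
  builder.append(part).append(',');
}
{code}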
[jira] [Resolved] (HADOOP-19024) Use bouncycastle jdk18 1.77
[ https://issues.apache.org/jira/browse/HADOOP-19024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ayush Saxena resolved HADOOP-19024. --- Fix Version/s: 3.5.0 Hadoop Flags: Reviewed Resolution: Fixed > Use bouncycastle jdk18 1.77 > --- > > Key: HADOOP-19024 > URL: https://issues.apache.org/jira/browse/HADOOP-19024 > Project: Hadoop Common > Issue Type: Task >Reporter: PJ Fanning >Assignee: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > They have stopped patching the JDK 1.5 jars that Hadoop uses (see > https://issues.apache.org/jira/browse/HADOOP-18540). > The new artifacts have similar names - but the names are like bcprov-jdk18on > as opposed to bcprov-jdk15on. > CVE-2023-33201 is an example of a security issue that seems only to be fixed > in the JDK 1.8 artifacts (ie no JDK 1.5 jar has the fix). > https://www.bouncycastle.org/releasenotes.html#r1rv77 is the latest current release, > but the CVE was fixed in 1.74. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
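In practice the change is a Maven coordinate swap; for example (illustrative snippet, version per the release notes above):
{code:xml}
<!-- old artifact: org.bouncycastle:bcprov-jdk15on -->
<dependency>
  <groupId>org.bouncycastle</groupId>
  <artifactId>bcprov-jdk18on</artifactId>
  <version>1.77</version>
</dependency>
{code}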
[jira] [Created] (HADOOP-19133) "No test bucket" error in ITestS3AContractVectoredRead if provided via CLI property
Attila Doroszlai created HADOOP-19133: - Summary: "No test bucket" error in ITestS3AContractVectoredRead if provided via CLI property Key: HADOOP-19133 URL: https://issues.apache.org/jira/browse/HADOOP-19133 Project: Hadoop Common Issue Type: Bug Components: test, tools Reporter: Attila Doroszlai ITestS3AContractVectoredRead fails with {{NullPointerException: No test bucket}} if the test bucket is defined as {{-Dtest.fs.s3a.name=...}} via the CLI, not in {{auth-keys.xml}}. The same setup works for other S3A contract tests. Tested on 3.3.6.
{code:title=src/test/resources/auth-keys.xml}
<configuration>
  <property>
    <name>fs.s3a.endpoint</name>
    <value>${test.fs.s3a.endpoint}</value>
  </property>
  <property>
    <name>fs.contract.test.fs.s3a</name>
    <value>${test.fs.s3a.name}</value>
  </property>
</configuration>
{code}
{code}
export AWS_ACCESS_KEY_ID=''
export AWS_SECRET_KEY=''
mvn -Dtest=ITestS3AContractVectoredRead -Dtest.fs.s3a.name="s3a://mybucket" -Dtest.fs.s3a.endpoint="http://localhost:9878/" clean test
{code}
{code:title=test results}
Tests run: 46, Failures: 0, Errors: 8, Skipped: 0, Time elapsed: 7.879 s <<< FAILURE! - in org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead
testMinSeekAndMaxSizeDefaultValues[Buffer type : direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead) Time elapsed: 1.95 s <<< ERROR!
java.lang.NullPointerException: No test bucket
 at org.apache.hadoop.util.Preconditions.checkNotNull(Preconditions.java:88)
 at org.apache.hadoop.fs.s3a.S3ATestUtils.getTestBucketName(S3ATestUtils.java:714)
 at org.apache.hadoop.fs.s3a.S3ATestUtils.removeBaseAndBucketOverrides(S3ATestUtils.java:775)
 at org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead.testMinSeekAndMaxSizeDefaultValues(ITestS3AContractVectoredRead.java:104)
...
testMinSeekAndMaxSizeConfigsPropagation[Buffer type : direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead) Time elapsed: 0.176 s <<< ERROR!
testMultiVectoredReadStatsCollection[Buffer type : direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead) Time elapsed: 0.179 s <<< ERROR!
testNormalReadVsVectoredReadStatsCollection[Buffer type : direct](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead) Time elapsed: 0.155 s <<< ERROR!
testMinSeekAndMaxSizeDefaultValues[Buffer type : array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead) Time elapsed: 0.116 s <<< ERROR!
testMinSeekAndMaxSizeConfigsPropagation[Buffer type : array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead) Time elapsed: 0.102 s <<< ERROR!
testMultiVectoredReadStatsCollection[Buffer type : array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead) Time elapsed: 0.105 s <<< ERROR!
testNormalReadVsVectoredReadStatsCollection[Buffer type : array](org.apache.hadoop.fs.contract.s3a.ITestS3AContractVectoredRead) Time elapsed: 0.107 s <<< ERROR!
{code}
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19041) further use of StandardCharsets
[ https://issues.apache.org/jira/browse/HADOOP-19041?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia resolved HADOOP-19041. Fix Version/s: 3.5.0 Assignee: PJ Fanning Resolution: Fixed Thanks for the contribution [~fanningpj] > further use of StandardCharsets > --- > > Key: HADOOP-19041 > URL: https://issues.apache.org/jira/browse/HADOOP-19041 > Project: Hadoop Common > Issue Type: Task >Reporter: PJ Fanning >Assignee: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > builds on HADOOP-18957 -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19124) Update org.ehcache from 3.3.1 to 3.8.2.
[ https://issues.apache.org/jira/browse/HADOOP-19124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia resolved HADOOP-19124. Fix Version/s: 3.5.0 Resolution: Fixed Thanks [~slfan1989] for the contribution. > Update org.ehcache from 3.3.1 to 3.8.2. > --- > > Key: HADOOP-19124 > URL: https://issues.apache.org/jira/browse/HADOOP-19124 > Project: Hadoop Common > Issue Type: Improvement > Components: common >Affects Versions: 3.4.1 >Reporter: Shilun Fan >Assignee: Shilun Fan >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > We need to enhance the caching functionality in Yarn Federation by adding a > limit on the number of cached entries. I noticed that the version of > org.ehcache is relatively old and requires an upgrade. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Created] (HADOOP-19131) Assist reflection IO with WrappedOperations class
Steve Loughran created HADOOP-19131: --- Summary: Assist reflection IO with WrappedOperations class Key: HADOOP-19131 URL: https://issues.apache.org/jira/browse/HADOOP-19131 Project: Hadoop Common Issue Type: Sub-task Components: fs, fs/azure, fs/s3 Affects Versions: 3.4.0 Reporter: Steve Loughran parquet, avro etc are still stuck building with older hadoop releases. This makes using new APIs hard (PARQUET-2117) and means that APIs which are 5 years old (!) such as HADOOP-15229 just aren't picked up. This lack of openFile() adoption hurts working with files in cloud storage as:
* extra HEAD requests are made
* read policies can't be explicitly set
* split start/end can't be passed down
Proposed:
# create class org.apache.hadoop.io.WrappedOperations
# add methods to wrap the APIs
# test in contract tests via reflection loading - this verifies we have done it properly.
-- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
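A sketch of what one wrapper method might look like, assuming the openFile() option keys from HADOOP-15229; the actual class shape is still to be designed:
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.util.functional.FutureIO;

public final class WrappedOperationsSketch {
  private WrappedOperationsSketch() {
  }

  /**
   * Static method with simple parameter types so that libraries compiled
   * against older Hadoop releases can bind to it via reflection.
   */
  public static FSDataInputStream openFile(FileSystem fs, Path path,
      String readPolicy, long splitStart, long splitEnd) throws IOException {
    return FutureIO.awaitFuture(fs.openFile(path)
        .opt("fs.option.openfile.read.policy", readPolicy)
        .opt("fs.option.openfile.split.start", Long.toString(splitStart))
        .opt("fs.option.openfile.split.end", Long.toString(splitEnd))
        .build());
  }
}
{code}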
[jira] [Created] (HADOOP-19130) FTPFileSystem rename with fully qualified path broken
shawn created HADOOP-19130: -- Summary: FTPFileSystem rename with fully qualified path broken Key: HADOOP-19130 URL: https://issues.apache.org/jira/browse/HADOOP-19130 Project: Hadoop Common Issue Type: Bug Components: fs Affects Versions: 3.3.6, 3.3.4, 3.3.3, 0.20.2 Reporter: shawn Attachments: image-2024-03-27-09-59-12-381.png When using the fs shell to rename a file on an FTP server, it always gets "Input/output error" when a fully qualified path is passed to it (eg. ftp://user:password@localhost/pathxxx). The reason is that the changeWorkingDirectory command underneath is being passed a string with the file:// URI prefix, which will not be understood by the FTP server. !image-2024-03-27-09-59-12-381.png! The solution should be to pass absoluteSrc.getParent().toUri().getPath().toString() to avoid the file:// URI prefix, like this:
{code:java}
--- a/FTPFileSystem.java
+++ b/FTPFileSystem.java
@@ -549,15 +549,15 @@ public class FTPFileSystem extends FileSystem {
       throw new IOException("Destination path " + dst
           + " already exist, cannot rename!");
     }
-    String parentSrc = absoluteSrc.getParent().toUri().toString();
-    String parentDst = absoluteDst.getParent().toUri().toString();
+    URI parentSrc = absoluteSrc.getParent().toUri();
+    URI parentDst = absoluteDst.getParent().toUri();
     String from = src.getName();
     String to = dst.getName();
-    if (!parentSrc.equals(parentDst)) {
+    if (!parentSrc.toString().equals(parentDst.toString())) {
       throw new IOException("Cannot rename parent(source): " + parentSrc
           + ", parent(destination): " + parentDst);
     }
-    client.changeWorkingDirectory(parentSrc);
+    client.changeWorkingDirectory(parentSrc.getPath().toString());
     boolean renamed = client.rename(from, to);
     return renamed;
   }
{code}
A related issue already exists: https://issues.apache.org/jira/browse/HADOOP-8653. I wonder why this bug hasn't been fixed. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19047) Support InMemory Tracking Of S3A Magic Commits
[ https://issues.apache.org/jira/browse/HADOOP-19047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steve Loughran resolved HADOOP-19047. - Fix Version/s: 3.5.0 3.4.1 Resolution: Fixed > Support InMemory Tracking Of S3A Magic Commits > -- > > Key: HADOOP-19047 > URL: https://issues.apache.org/jira/browse/HADOOP-19047 > Project: Hadoop Common > Issue Type: Improvement > Components: fs/s3 >Reporter: Syed Shameerur Rahman >Assignee: Syed Shameerur Rahman >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0, 3.4.1 > > > The following are the operations which happen within a Task when it uses the S3A > Magic Committer. > *During closing of stream* > 1. A 0-byte file with the same name as the original file is uploaded to S3 > using a PUT operation. Refer > [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L152] > for more information. This is done so that downstream applications like > Spark can get the size of the file which is being written. > 2. MultiPartUpload(MPU) metadata is uploaded to S3. Refer > [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicCommitTracker.java#L176] > for more information. > *During TaskCommit* > 1. All the MPU metadata which the task wrote to S3 (there will be 'x' > metadata files in S3 if a single task writes to 'x' files) are read and > rewritten to S3 as a single metadata file. Refer > [here|https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/commit/magic/MagicS3GuardCommitter.java#L201] > for more information. > Since these operations happen within the Task JVM, we could optimize as well > as save cost by storing this information in memory when Task memory usage is > not a constraint. Hence the proposal here is to introduce a new MagicCommit > Tracker called "InMemoryMagicCommitTracker" which will > 1. store the MPU metadata in memory till the Task is committed > 2. store the size of the file, which can be used by the downstream application > to get the file size before it is committed/visible to the output path. > This optimization will save 2 PUT S3 calls, 1 LIST S3 call, and 1 GET S3 call > given a Task writes only 1 file. -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
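The proposed tracker state can be pictured as two task-local maps; this is a hypothetical sketch, not the actual InMemoryMagicCommitTracker fields:
{code:java}
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of task-local state replacing the S3 metadata writes. */
final class InMemoryTrackerSketch {

  /** Placeholder for the multipart-upload metadata of one file. */
  static final class UploadMetadata {
  }

  // Held in the task JVM until task commit, then written out as the
  // single per-task manifest.
  final Map<String, List<UploadMetadata>> uploadsByPath = new ConcurrentHashMap<>();

  // Lets downstream applications query a file's length before the commit
  // makes it visible at the output path.
  final Map<String, Long> lengthByPath = new ConcurrentHashMap<>();
}
{code}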
[jira] [Created] (HADOOP-19129) ABFS: Fixing Test Script Bug and Some Known test Failures in ABFS Test Suite
Anuj Modi created HADOOP-19129: -- Summary: ABFS: Fixing Test Script Bug and Some Known test Failures in ABFS Test Suite Key: HADOOP-19129 URL: https://issues.apache.org/jira/browse/HADOOP-19129 Project: Hadoop Common Issue Type: Sub-task Components: fs/azure Affects Versions: 3.4.0, 3.4.1 Reporter: Anuj Modi -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19060) Support hadoop client authentication through keytab configuration.
[ https://issues.apache.org/jira/browse/HADOOP-19060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiaoqiao He resolved HADOOP-19060. -- Resolution: Won't Fix > Support hadoop client authentication through keytab configuration. > -- > > Key: HADOOP-19060 > URL: https://issues.apache.org/jira/browse/HADOOP-19060 > Project: Hadoop Common > Issue Type: New Feature >Reporter: Zhaobo Huang >Assignee: Zhaobo Huang >Priority: Minor > Labels: pull-request-available > > *Shield references to {{UserGroupInformation}} Class.* > The current HDFS client keytab authentication code is as follows:
> {code:java}
> Configuration conf = new Configuration();
> conf.addResource(new Path("/usr/local/service/hadoop/etc/hadoop/hdfs-site.xml"));
> conf.addResource(new Path("/usr/local/service/hadoop/etc/hadoop/core-site.xml"));
> UserGroupInformation.setConfiguration(conf);
> UserGroupInformation.loginUserFromKeytab("foo", "/var/krb5kdc/foo.keytab");
> FileSystem fileSystem = FileSystem.get(conf);
> FileStatus[] fileStatus = fileSystem.listStatus(new Path("/"));
> for (FileStatus status : fileStatus) {
>   System.out.println(status.getPath());
> } {code}
> This feature supports configuring keytab information in core-site.xml or hdfs-site.xml. The authentication code is as follows:
> {code:java}
> Configuration conf = new Configuration();
> conf.addResource(new Path("/usr/local/service/hadoop/etc/hadoop/hdfs-site.xml"));
> conf.addResource(new Path("/usr/local/service/hadoop/etc/hadoop/core-site.xml"));
> FileSystem fileSystem = FileSystem.get(conf);
> FileStatus[] fileStatus = fileSystem.listStatus(new Path("/"));
> for (FileStatus status : fileStatus) {
>   System.out.println(status.getPath());
> } {code}
> The config of core-site.xml related to authentication is as follows:
> {code:java}
> <property>
>   <name>hadoop.security.authentication</name>
>   <value>kerberos</value>
> </property>
> <property>
>   <name>hadoop.client.keytab.principal</name>
>   <value>foo</value>
> </property>
> <property>
>   <name>hadoop.client.keytab.file.path</name>
>   <value>/var/krb5kdc/foo.keytab</value>
> </property>
> {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org
[jira] [Resolved] (HADOOP-19088) upgrade to jersey-json 1.22.0
[ https://issues.apache.org/jira/browse/HADOOP-19088?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dinesh Chitlangia resolved HADOOP-19088. Resolution: Fixed > upgrade to jersey-json 1.22.0 > - > > Key: HADOOP-19088 > URL: https://issues.apache.org/jira/browse/HADOOP-19088 > Project: Hadoop Common > Issue Type: Bug > Components: build >Affects Versions: 3.3.6 >Reporter: PJ Fanning >Assignee: PJ Fanning >Priority: Major > Labels: pull-request-available > Fix For: 3.5.0 > > > Tidies up support for Jettison and Jackson versions used by Hadoop -- This message was sent by Atlassian Jira (v8.20.10#820010) - To unsubscribe, e-mail: common-dev-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-dev-h...@hadoop.apache.org