Re: Flink S3 Presto Checkpointing Permission Forbidden
Hi Denis,

Were you able to use checkpointing on S3 with native Kubernetes? I am using Flink 1.13.1 and tried your solution of passing the WebIdentityTokenCredentialsProvider:

-Dfs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider

I am getting this error in the job-manager logs:

Caused by: com.amazonaws.SdkClientException: Unable to locate specified web identity token file: /var/run/secrets/eks.amazonaws.com/serviceaccount/token

Describing the pod shows that the volume is mounted. Is there anything specific that needs to be done? For testing, I ran a sample pod with the aws-cli image on the same EKS cluster, and it is able to run "ls" on the same S3 bucket.

Thanks,
Hemant

On Mon, Oct 11, 2021 at 1:56 PM Denis Nutiu wrote:

> Hi Rommel,
>
> Thanks for getting back to me and for your time.
>
> I switched to the Hadoop plugin and used the following authentication
> method, which worked:
>
> fs.s3a.aws.credentials.provider: "com.amazonaws.auth.WebIdentityTokenCredentialsProvider"
>
> It turns out that I was using the wrong credentials provider. Reading
> AWSCredentialsProvider [1] and seeing that the container has the
> AWS_WEB_IDENTITY_TOKEN_FILE variable allowed me to find the correct one.
>
> [1] https://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/auth/AWSCredentialsProvider.html
>
> Best,
> Denis
>
> From: Rommel Holmes
> Sent: Saturday, October 9, 2021 02:09
> To: Denis Nutiu
> Cc: user
> Subject: Re: Flink S3 Presto Checkpointing Permission Forbidden
>
> You already have the S3 request ID, so you can reach out to AWS tech
> support to find out which account was used to write to S3. I suspect that
> account doesn't have permission for the following actions:
>
> "s3:GetObject",
> "s3:PutObject",
> "s3:DeleteObject",
> "s3:ListBucket"
>
> Grant the account those permissions in k8s, and you should be good to go.
> On Fri, Oct 8, 2021 at 6:06 AM Denis Nutiu wrote:
>
> Hello,
>
> I'm trying to deploy my Flink cluster inside an AWS EKS cluster using
> Flink Native. I want to use S3 as a filesystem for checkpointing, and I am
> passing the following options related to flink-s3-fs-presto:
>
> "-Dhive.s3.endpoint": "https://s3.eu-central-1.amazonaws.com"
> "-Dhive.s3.iam-role": "arn:aws:iam::xxx:role/s3-flink"
> "-Dhive.s3.use-instance-credentials": "true"
> "-Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS": "flink-s3-fs-presto-1.13.2.jar"
> "-Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS": "flink-s3-fs-presto-1.13.2.jar"
> "-Dstate.backend": "rocksdb"
> "-Dstate.backend.incremental": "true"
> "-Dstate.checkpoints.dir": "s3://bucket/checkpoints/"
> "-Dstate.savepoints.dir": "s3://bucket/savepoints/"
>
> But my job fails with:
>
> 2021-10-08 11:38:49,771 WARN org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Could not properly dispose the private states in the pending checkpoint 45 of job 75bdd6fb6e689961ef4e096684e867bc.
> com.facebook.presto.hive.s3.PrestoS3FileSystem$UnrecoverableS3OperationException: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: JEZ3X8YPDZ2TF4T9; S3 Extended Request ID: u2RBcDpifTnzO4hIOGqgTOKDY+nw6iSeSepd4eYThITCPCpVddIUGMU7jY5DpJBg1LkPuYXiH9c=; Proxy: null), S3 Extended Request ID: u2RBcDpifTnzO4hIOGqgTOKDY+nw6iSeSepd4eYThITCPCpVddIUGMU7jY5DpJBg1LkPuYXiH9c= (Path: s3://bucket/checkpoints/75bdd6fb6e689961ef4e096684e867bc/chk-45)
> at com.facebook.presto.hive.s3.PrestoS3FileSystem.lambda$getS3ObjectMetadata$2(PrestoS3FileSystem.java:573) ~[?:?]
> at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem.getS3ObjectMetadata(PrestoS3FileSystem.java:560) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem.getFileStatus(PrestoS3FileSystem.java:311) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem.directory(PrestoS3FileSystem.java:450) ~[?:?]
> at com.facebook.presto.hive.s3.PrestoS3FileSystem.delete(PrestoS3FileSystem.java:427) ~[?:?]
> at org.apache.flink.fs.s3presto.common.HadoopFileSystem.delete(HadoopFileSystem.java:160) ~[?:?]
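[Editor's note: Denis's fix, combined with Hemant's setup, can be expressed as a full native-Kubernetes submission sketch. The service-account name, bucket, and job jar path below are placeholders, not values taken from this thread; the key pieces are loading the flink-s3-fs-hadoop plugin instead of the Presto one and pointing s3a at the web-identity provider, with `kubernetes.service-account` set to an IRSA-annotated service account so the token volume and AWS_* env vars are injected into the Flink pods.]

```
./bin/flink run-application \
  --target kubernetes-application \
  -Dkubernetes.cluster-id=my-flink-cluster \
  -Dkubernetes.service-account=flink-sa \
  -Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.13.2.jar \
  -Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS=flink-s3-fs-hadoop-1.13.2.jar \
  -Dfs.s3a.aws.credentials.provider=com.amazonaws.auth.WebIdentityTokenCredentialsProvider \
  -Dstate.backend=rocksdb \
  -Dstate.backend.incremental=true \
  -Dstate.checkpoints.dir=s3://bucket/checkpoints/ \
  -Dstate.savepoints.dir=s3://bucket/savepoints/ \
  local:///opt/flink/usrlib/my-job.jar
```

If the token-file error persists, it is worth confirming that the JobManager pod actually runs under the annotated service account (AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE should both appear in its environment), since a pod created with the default service account will have neither.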
Flink S3 Presto Checkpointing Permission Forbidden
Hello,

I'm trying to deploy my Flink cluster inside an AWS EKS cluster using Flink Native. I want to use S3 as a filesystem for checkpointing, and I am passing the following options related to flink-s3-fs-presto:

"-Dhive.s3.endpoint": "https://s3.eu-central-1.amazonaws.com"
"-Dhive.s3.iam-role": "arn:aws:iam::xxx:role/s3-flink"
"-Dhive.s3.use-instance-credentials": "true"
"-Dcontainerized.master.env.ENABLE_BUILT_IN_PLUGINS": "flink-s3-fs-presto-1.13.2.jar"
"-Dcontainerized.taskmanager.env.ENABLE_BUILT_IN_PLUGINS": "flink-s3-fs-presto-1.13.2.jar"
"-Dstate.backend": "rocksdb"
"-Dstate.backend.incremental": "true"
"-Dstate.checkpoints.dir": "s3://bucket/checkpoints/"
"-Dstate.savepoints.dir": "s3://bucket/savepoints/"

But my job fails with:

2021-10-08 11:38:49,771 WARN org.apache.flink.runtime.checkpoint.CheckpointCoordinator [] - Could not properly dispose the private states in the pending checkpoint 45 of job 75bdd6fb6e689961ef4e096684e867bc.
com.facebook.presto.hive.s3.PrestoS3FileSystem$UnrecoverableS3OperationException: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: JEZ3X8YPDZ2TF4T9; S3 Extended Request ID: u2RBcDpifTnzO4hIOGqgTOKDY+nw6iSeSepd4eYThITCPCpVddIUGMU7jY5DpJBg1LkPuYXiH9c=; Proxy: null), S3 Extended Request ID: u2RBcDpifTnzO4hIOGqgTOKDY+nw6iSeSepd4eYThITCPCpVddIUGMU7jY5DpJBg1LkPuYXiH9c= (Path: s3://bucket/checkpoints/75bdd6fb6e689961ef4e096684e867bc/chk-45)
    at com.facebook.presto.hive.s3.PrestoS3FileSystem.lambda$getS3ObjectMetadata$2(PrestoS3FileSystem.java:573) ~[?:?]
    at com.facebook.presto.hive.RetryDriver.run(RetryDriver.java:138) ~[?:?]
    at com.facebook.presto.hive.s3.PrestoS3FileSystem.getS3ObjectMetadata(PrestoS3FileSystem.java:560) ~[?:?]
    at com.facebook.presto.hive.s3.PrestoS3FileSystem.getFileStatus(PrestoS3FileSystem.java:311) ~[?:?]
    at com.facebook.presto.hive.s3.PrestoS3FileSystem.directory(PrestoS3FileSystem.java:450) ~[?:?]
    at com.facebook.presto.hive.s3.PrestoS3FileSystem.delete(PrestoS3FileSystem.java:427) ~[?:?]
    at org.apache.flink.fs.s3presto.common.HadoopFileSystem.delete(HadoopFileSystem.java:160) ~[?:?]
    at org.apache.flink.core.fs.PluginFileSystemFactory$ClassLoaderFixingFileSystem.delete(PluginFileSystemFactory.java:155) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
    at org.apache.flink.runtime.state.filesystem.FsCheckpointStorageLocation.disposeOnFailure(FsCheckpointStorageLocation.java:117) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
    at org.apache.flink.runtime.checkpoint.PendingCheckpoint.discard(PendingCheckpoint.java:588) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
    at org.apache.flink.runtime.checkpoint.CheckpointsCleaner.lambda$cleanCheckpoint$0(CheckpointsCleaner.java:60) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
    at org.apache.flink.runtime.checkpoint.CheckpointsCleaner.lambda$cleanup$2(CheckpointsCleaner.java:85) ~[flink-dist_2.11-1.13.2.jar:1.13.2]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [?:?]
    at java.util.concurrent.FutureTask.run(Unknown Source) [?:?]
    at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source) [?:?]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [?:?]
    at java.lang.Thread.run(Unknown Source) [?:?]
Caused by: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: JEZ3X8YPDZ2TF4T9; S3 Extended Request ID: u2RBcDpifTnzO4hIOGqgTOKDY+nw6iSeSepd4eYThITCPCpVddIUGMU7jY5DpJBg1LkPuYXiH9c=; Proxy: null)
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1811) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleServiceErrorResponse(AmazonHttpClient.java:1395) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeOneRequest(AmazonHttpClient.java:1371) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeHelper(AmazonHttpClient.java:1145) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.doExecute(AmazonHttpClient.java:802) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.executeWithTimer(AmazonHttpClient.java:770) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.execute(AmazonHttpClient.java:744) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutor.access$500(AmazonHttpClient.java:704) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient$RequestExecutionBuilderImpl.execute(AmazonHttpClient.java:686) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:550) ~[?:?]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:530) ~[?:?]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5062) ~[?:?]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:5008) ~[?:?]
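[Editor's note: the four S3 permissions suggested in this thread map onto an IAM policy like the following sketch; the bucket name is a placeholder. Note that s3:ListBucket is granted on the bucket ARN itself, while the object actions are granted on the objects under it; mixing the two resources up is a common cause of exactly this kind of 403.]

```
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
      "Resource": "arn:aws:s3:::bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": "s3:ListBucket",
      "Resource": "arn:aws:s3:::bucket"
    }
  ]
}
```

With IRSA, this policy would be attached to the IAM role whose ARN is set in the service account's eks.amazonaws.com/role-arn annotation, rather than to a node instance profile.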