Ferdinanddb opened a new issue, #14634:
URL: https://github.com/apache/iceberg/issues/14634

   ### Feature Request / Improvement
   
   I am using Spark (pyspark) 4.0.1 and Iceberg 1.10.0.
   
   Right now, when using the BigLake REST catalog with vended-credentials mode enabled and Spark to manipulate Iceberg tables, Spark jobs that run for more than an hour fail with the following error:
   ```
   INFO - [spark-kubernetes-driver] Caused by: 
com.google.cloud.storage.StorageException: OAuth2Credentials instance does not 
support refreshing the access token. An instance with a new access token should 
be used, or a derived type that supports refreshing.
   ```
   
   It works fine for jobs that take less than an hour to finish. It also works fine if I disable vended-credentials mode, but then I need to grant additional IAM roles (specifically a read role on the GCS bucket where the data is stored, as [described here](https://docs.cloud.google.com/biglake/docs/blms-rest-catalog#required_roles)).
   
   My understanding is that vended-credentials mode generates an access token that is valid for one hour, so a job fails if it is still interacting with Iceberg tables after that hour.
   
   I contacted BigLake support some time ago, and it seems that the method `OAuth2RefreshCredentialsHandler.refreshAccessToken()` ([link here](https://github.com/apache/iceberg/blob/cc38966a84775a369bb4e35e8158845a0e8a54a6/gcp/src/main/java/org/apache/iceberg/gcp/gcs/OAuth2RefreshCredentialsHandler.java#L66)) is not called anywhere.
   
   I would assume that this method should be invoked somewhere in the Iceberg project so that the token is refreshed and the error no longer occurs. Apologies in advance if this is not the case.
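   To illustrate what a refresh-handler hookup would change, here is a minimal sketch in plain Java (no GCP or Iceberg dependencies; all class and method names below are hypothetical stand-ins, not the actual Iceberg or google-auth API). A credential holding a single static token starts failing once the token expires, while one wired to a refresh callback — the role `refreshAccessToken()` would presumably play — simply fetches a new token:

   ```java
   import java.time.Instant;
   import java.util.function.Supplier;

   public class RefreshSketch {

       // Stand-in for a short-lived access token, like the one-hour vended credential.
       record Token(String value, Instant expiresAt) {
           boolean expired(Instant now) {
               return !now.isBefore(expiresAt);
           }
       }

       // Hypothetical analogue of a refresh-capable credential: when the current
       // token has expired, it asks the handler for a new one instead of failing.
       static class RefreshingCredential {
           private final Supplier<Token> refreshHandler;
           private Token current;

           RefreshingCredential(Token initial, Supplier<Token> refreshHandler) {
               this.current = initial;
               this.refreshHandler = refreshHandler;
           }

           String accessToken(Instant now) {
               if (current.expired(now)) {
                   // This is the step the issue suggests is missing today:
                   // nothing ever calls the refresh handler.
                   current = refreshHandler.get();
               }
               return current.value();
           }
       }

       public static void main(String[] args) {
           Instant start = Instant.now();
           Token first = new Token("token-1", start.plusSeconds(3600));

           // The Supplier stands in for a call that re-fetches vended credentials.
           RefreshingCredential cred = new RefreshingCredential(
               first, () -> new Token("token-2", start.plusSeconds(7200)));

           // Within the first hour the original token is served.
           System.out.println(cred.accessToken(start.plusSeconds(1800)));
           // After an hour, the handler supplies a fresh token instead of erroring.
           System.out.println(cred.accessToken(start.plusSeconds(3700)));
       }
   }
   ```

   The "OAuth2Credentials instance does not support refreshing" error suggests the credential handed to the GCS client is of the static kind sketched first; wiring the existing handler in as the refresh callback would match the second pattern.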
   
   To reproduce the error, create a BigLake REST catalog and spin up a SparkSession using [the following config from their documentation](https://docs.cloud.google.com/biglake/docs/blms-rest-catalog#iceberg-1.10-or-later). Arrange for the job to run for more than an hour, then try to read or write an Iceberg table. The job should fail with the same error I get.
   
   ### Query engine
   
   Spark
   
   ### Willingness to contribute
   
   - [ ] I can contribute this improvement/feature independently
   - [x] I would be willing to contribute this improvement/feature with 
guidance from the Iceberg community
   - [ ] I cannot contribute this improvement/feature at this time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
