schlichtanders opened a new issue, #7262:
URL: https://github.com/apache/hudi/issues/7262
**Describe the problem you faced**
This is the first time I am trying to read from a Hudi table in another AWS account; the cross-account access is set up as read-only.
The read fails with the error below. My interpretation of the error is
that Hudi tries to create a `hoodie.properties.backup` file, which fails
silently because of the read-only permissions; later, however, Hudi assumes
that this backup file exists and fails when it cannot find it.
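To illustrate my interpretation, here is a simplified Python sketch of the suspected fallback behavior. This is not Hudi's actual implementation (the real logic lives in `HoodieTableConfig.fetchConfigs`); the function name and file handling here are purely illustrative:

```python
import os


def read_table_config(base_path):
    """Illustrative model: read a properties file, falling back to a
    .backup copy if the main file cannot be read. If the backup was
    never written (e.g. because the writer lacked permissions), the
    fallback itself raises FileNotFoundError."""
    main = os.path.join(base_path, "hoodie.properties")
    backup = os.path.join(base_path, "hoodie.properties.backup")
    try:
        with open(main) as f:
            return f.read()
    except OSError:
        # The fallback assumes the backup exists -- with read-only
        # access it may never have been created, so this raises.
        with open(backup) as f:
            return f.read()
```

The failure mode would then be: any problem reading the main file routes the reader into the backup path, and with read-only credentials that backup may simply not be there.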
**To Reproduce**
Steps to reproduce the behavior:
1. setup cross account access to a hudi table on S3
2. read with hudi from that table
(There are, of course, a few more detailed setup steps needed to get the
above working.)
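For reference, the read itself is a plain Hudi datasource load; roughly what the Glue job runs (bucket and table path are placeholders, and the Spark session is the one Glue provides):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Cross-account read: the Glue job's role only has read permissions
# (s3:GetObject / s3:ListBucket) on the other account's bucket.
df = (
    spark.read.format("hudi")
    .load("s3://<other-account-bucket>/<table-path>")
)
df.show()
```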
**Expected behavior**
It would be great if there were an option (or a similar mechanism) that
allows reading from a Hudi table with read-only permissions.
**Environment Description**
* Hudi version : 0.10.1 (AWS Hudi connector)
* Spark version : 3.1.1 (AWS Glue 3.0 Spark version)
* Hive version : AWS Glue Catalog (not sure which Hive version this corresponds to)
* Hadoop version : no Hadoop involved
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : don't know - running inside an AWS Glue job
**Additional context**
I could not find any other issue or search result about this.
**Stacktrace**
```
py4j.protocol.Py4JJavaError: An error occurred while calling o95.load.
: org.apache.hudi.exception.HoodieIOException: Could not load Hoodie properties from s3://fielmann.mkt.prod.datalake-consolidated/kls_bipii_customer/.hoodie/hoodie.properties
	at org.apache.hudi.common.table.HoodieTableConfig.<init>(HoodieTableConfig.java:190)
	at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:114)
	at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:73)
	at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:614)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:107)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:354)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:326)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:308)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:308)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:240)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.FileNotFoundException: No such file or directory 's3://fielmann.mkt.prod.datalake-consolidated/kls_bipii_customer/.hoodie/hoodie.properties.backup'
	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:532)
	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:936)
	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:928)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:906)
	at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:201)
	at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:459)
	at org.apache.hudi.common.table.HoodieTableConfig.fetchConfigs(HoodieTableConfig.java:212)
	at org.apache.hudi.common.table.HoodieTableConfig.<init>(HoodieTableConfig.java:180)
	... 22 more
```