schlichtanders opened a new issue, #7262:
URL: https://github.com/apache/hudi/issues/7262
**Describe the problem you faced**
This is the first time I am trying to read from a Hudi table in another AWS account; the cross-account access is set up as read-only.
The read fails with the error below. My interpretation of the error is
that Hudi tries to create a `hoodie.properties.backup` file, which fails
silently because of the read-only permissions; later, however, Hudi assumes
that this backup file exists and fails when it cannot find it.
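To illustrate my interpretation, here is a simplified Python sketch of the suspected fallback behavior. This is not Hudi's actual implementation (the real logic lives in `HoodieTableConfig.fetchConfigs`); the function name and file handling here are purely illustrative:

```python
import os


def read_table_config(base_path):
    """Illustrative model: read a properties file, falling back to a
    .backup copy if the main file cannot be read. If the backup was
    never written (e.g. because the writer lacked permissions), the
    fallback itself raises FileNotFoundError."""
    main = os.path.join(base_path, "hoodie.properties")
    backup = os.path.join(base_path, "hoodie.properties.backup")
    try:
        with open(main) as f:
            return f.read()
    except OSError:
        # The fallback assumes the backup exists -- with read-only
        # access it may never have been created, so this raises.
        with open(backup) as f:
            return f.read()
```

The failure mode would then be: any problem reading the main file routes the reader into the backup path, and with read-only credentials that backup may simply not be there.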
**To Reproduce**
Steps to reproduce the behavior:
1. setup cross account access to a hudi table on S3
2. read with hudi from that table
(There are, of course, a few more detailed setup steps needed to get the
above working.)
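For reference, the read itself is a plain Hudi datasource load; roughly what the Glue job runs (bucket and table path are placeholders, and the Spark session is the one Glue provides):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Cross-account read: the Glue job's role only has read permissions
# (s3:GetObject / s3:ListBucket) on the other account's bucket.
df = (
    spark.read.format("hudi")
    .load("s3://<other-account-bucket>/<table-path>")
)
df.show()
```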
**Expected behavior**
It would be great if there were an option (or a similar mechanism) that
allows reading from a Hudi table with read-only permissions.
**Environment Description**
* Hudi version : 0.10.1 (AWS Hudi connector)
* Spark version : 3.1.1 (AWS Glue 3.0 Spark version)
* Hive version : AWS Glue Catalog (not sure which Hive version this corresponds to)
* Hadoop version : no Hadoop involved
* Storage (HDFS/S3/GCS..) : S3
* Running on Docker? (yes/no) : don't know - running inside an AWS Glue job
**Additional context**
I could not find any other issue or search result about this.
**Stacktrace**
```
py4j.protocol.Py4JJavaError: An error occurred while calling o95.load.
: org.apache.hudi.exception.HoodieIOException: Could not load Hoodie properties from s3://fielmann.mkt.prod.datalake-consolidated/kls_bipii_customer/.hoodie/hoodie.properties
	at org.apache.hudi.common.table.HoodieTableConfig.<init>(HoodieTableConfig.java:190)
	at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:114)
	at org.apache.hudi.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:73)
	at org.apache.hudi.common.table.HoodieTableMetaClient$Builder.build(HoodieTableMetaClient.java:614)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:107)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:69)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:354)
	at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:326)
	at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:308)
	at scala.Option.getOrElse(Option.scala:121)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:308)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:240)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
	at py4j.Gateway.invoke(Gateway.java:282)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:238)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.io.FileNotFoundException: No such file or directory 's3://fielmann.mkt.prod.datalake-consolidated/kls_bipii_customer/.hoodie/hoodie.properties.backup'
	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.getFileStatus(S3NativeFileSystem.java:532)
	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:936)
	at com.amazon.ws.emr.hadoop.fs.s3n.S3NativeFileSystem.open(S3NativeFileSystem.java:928)
	at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:906)
	at com.amazon.ws.emr.hadoop.fs.EmrFileSystem.open(EmrFileSystem.java:201)
	at org.apache.hudi.common.fs.HoodieWrapperFileSystem.open(HoodieWrapperFileSystem.java:459)
	at org.apache.hudi.common.table.HoodieTableConfig.fetchConfigs(HoodieTableConfig.java:212)
	at org.apache.hudi.common.table.HoodieTableConfig.<init>(HoodieTableConfig.java:180)
	... 22 more
```