pietro-agilelab opened a new issue, #8377:
URL: https://github.com/apache/iceberg/issues/8377

   ### Query engine
   
   PySpark
   
   ### Question
   
   **tl;dr** I'm having issues writing Iceberg tables to custom, path-based locations. What's the correct way to do it?
   ________
   
   
   The [official documentation](https://iceberg.apache.org/docs/latest/spark-writes/#creating-tables) states that path-based writes are supported:
   
   > The Iceberg table location can also be specified by the `location` table property:
   > ```scala
   > data.writeTo("prod.db.table")
   >     .tableProperty("location", "/path/to/location")
   >     .createOrReplace()
   > ```
   
   I followed it and tried to create a table at a custom path like this:
   
   ```python
   df.writeTo('my_table') \
     .using('iceberg') \
     .tableProperty('location', '/home/iceberg/warehouse/test/custom/location/my_table') \
     .createOrReplace()
   ```
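
   (Here `df` is just a small test DataFrame; any frame reproduces the error, e.g. something like:)

   ```python
   # Hypothetical minimal test frame -- any DataFrame triggers the same error.
   df = spark.createDataFrame([(1, 'a'), (2, 'b')], ['id', 'data'])
   ```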
   
   but got this error:
   
   ```
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
     File "/opt/spark/python/pyspark/sql/readwriter.py", line 1453, in createOrReplace
       self._jwriter.createOrReplace()
     File "/opt/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1321, in __call__
     File "/opt/spark/python/pyspark/sql/utils.py", line 196, in deco
       raise converted from None
   pyspark.sql.utils.IllegalArgumentException: Cannot set a custom location for a path-based table. Expected /home/iceberg/warehouse/default/my_table but got /home/iceberg/warehouse/test/custom/location/my_table
   ```
   
   My Spark session is configured as follows:
   ```
   spark.sql.extensions: org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
   spark.sql.catalog.spark_catalog: org.apache.iceberg.spark.SparkSessionCatalog
   spark.sql.catalog.spark_catalog.type: hadoop
   spark.sql.catalog.spark_catalog.warehouse: /home/iceberg/warehouse
   spark.sql.defaultCatalog: spark_catalog
   ```
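
   For reference, the equivalent programmatic setup in PySpark (a sketch, assuming the Iceberg Spark runtime jar is already on the classpath) is roughly:

   ```python
   from pyspark.sql import SparkSession

   # Same settings as the property list above, built programmatically.
   spark = (
       SparkSession.builder
       .config('spark.sql.extensions',
               'org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions')
       .config('spark.sql.catalog.spark_catalog',
               'org.apache.iceberg.spark.SparkSessionCatalog')
       .config('spark.sql.catalog.spark_catalog.type', 'hadoop')
       .config('spark.sql.catalog.spark_catalog.warehouse', '/home/iceberg/warehouse')
       .config('spark.sql.defaultCatalog', 'spark_catalog')
       .getOrCreate()
   )
   ```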
    
   I also tried different configurations (e.g., adding another catalog with an `org.apache.iceberg.spark.SparkCatalog` implementation) without success.
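
   One such variant looked roughly like this (the catalog name `my_catalog` is just a placeholder):

   ```
   spark.sql.catalog.my_catalog: org.apache.iceberg.spark.SparkCatalog
   spark.sql.catalog.my_catalog.type: hadoop
   spark.sql.catalog.my_catalog.warehouse: /home/iceberg/warehouse
   ```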
   
   Am I doing something wrong, or is the documentation out of date?
   
   Any support would be appreciated — thanks everyone!

