cobookman edited a comment on pull request #2963:
URL: https://github.com/apache/iceberg/pull/2963#issuecomment-898187675
> @cobookman the PR referenced is merged, could you update the documentation
with the correct path resolution strategy? Thank you!
Happy to. I just want to understand first what the expected write behaviour is for folder-storage on S3, since the following fails on my end:
```
CREATE TABLE my_catalog.my_ns.my_table (
    id bigint,
    data string,
    category string)
USING iceberg
OPTIONS (
    'write.object-storage.enabled'=true,
    'write.folder-storage.path'='s3://some-bucket/some-random-folder/')
PARTITIONED BY (category);

INSERT INTO my_catalog.my_ns.my_table VALUES (1, "some data", "some category");

java.lang.NullPointerException
	at org.apache.iceberg.LocationProviders.stripTrailingSlash(LocationProviders.java:135)
	at org.apache.iceberg.LocationProviders.access$000(LocationProviders.java:34)
	at org.apache.iceberg.LocationProviders$ObjectStoreLocationProvider.<init>(LocationProviders.java:99)
	at org.apache.iceberg.LocationProviders.locationsFor(LocationProviders.java:65)
	at org.apache.iceberg.BaseMetastoreTableOperations.locationProvider(BaseMetastoreTableOperations.java:200)
	at org.apache.iceberg.BaseTable.locationProvider(BaseTable.java:219)
	at org.apache.iceberg.spark.source.SparkWrite.createWriterFactory(SparkWrite.java:172)
	at org.apache.iceberg.spark.source.SparkWrite.access$600(SparkWrite.java:87)
	at org.apache.iceberg.spark.source.SparkWrite$BaseBatchWrite.createBatchWriterFactory(SparkWrite.java:226)
```
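From the trace, the failure appears to be `ObjectStoreLocationProvider` passing a null location into `stripTrailingSlash` when only `write.folder-storage.path` is set. Here is a minimal, self-contained sketch of that failure mode — the helper below is my own illustration of what the trace suggests, not Iceberg's actual code:

```java
// Hypothetical sketch: a stripTrailingSlash-style helper that dereferences
// its argument without a null check, matching the NPE in the stack trace.
public class StripTrailingSlashDemo {

    // Trims trailing slashes; throws NullPointerException when path is null.
    static String stripTrailingSlash(String path) {
        String result = path;
        while (result.endsWith("/")) {  // NPE here if result is null
            result = result.substring(0, result.length() - 1);
        }
        return result;
    }

    public static void main(String[] args) {
        // Normal case: trailing slash is removed.
        System.out.println(stripTrailingSlash("s3://some-bucket/some-random-folder/"));
        try {
            // Simulates the configured path resolving to null:
            stripTrailingSlash(null);
        } catch (NullPointerException e) {
            System.out.println("NullPointerException, as seen in the report");
        }
    }
}
```

If this is what is happening, either the provider should fall back to another configured path or the error message should name the missing property.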
Omitting `'write.object-storage.enabled'=true` avoids the NullPointerException, but the write then falls back to the Hive driver's default layout, placing data at:
`s3://some-bucket/some-random-folder/category=some+category/00000-2-13441dd2-137a-42d1-9c6b-9ccc29a2ebeb-00001.parquet`
```
spark-sql> CREATE TABLE my_catalog.my_ns.my_table (
         >     id bigint,
         >     data string,
         >     category string)
         > USING iceberg
         > OPTIONS (
         >     'write.folder-storage.path'='s3://some-bucket/some-random-folder/')
         > PARTITIONED BY (category);
Time taken: 2.021 seconds
spark-sql> INSERT INTO my_catalog.my_ns.my_table VALUES (1, "some data", "some category");
Time taken: 3.968 seconds
```
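For completeness, I would guess the object-storage provider wants its own path property set explicitly. I have not verified this, so treat the property name below as my assumption about what `ObjectStoreLocationProvider` reads:

```sql
-- Untested variant: supply an object-storage path directly instead of
-- relying on folder-storage ('write.object-storage.path' is my assumption).
CREATE TABLE my_catalog.my_ns.my_table (
    id bigint,
    data string,
    category string)
USING iceberg
OPTIONS (
    'write.object-storage.enabled'=true,
    'write.object-storage.path'='s3://some-bucket/some-random-folder/')
PARTITIONED BY (category);
```

If that is the intended configuration, the docs update for this PR should call it out, since the NPE gives no hint which property is missing.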
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]