davidshtian commented on issue #1449: URL: https://github.com/apache/iceberg-python/issues/1449#issuecomment-2899736528
> Hi [@davidshtian](https://github.com/davidshtian), I'm facing a similar issue, even tried via spark, but I'm getting a similar issue. Also, found this interesting post, you can use logging to debug https://dev.to/aws-builders/glue-iceberg-rest-api-and-pyiceberg-364g In my case the issue is
>
> ```
> warnings.warn(f"No preferred file implementation for scheme: {parsed_url.scheme}")
> 2025-05-21 10:52:28,157 - pyiceberg.io - INFO - Defaulting to PyArrow FileIO
> ```
>
> And I think it has something to do with nested catalogs.

Thanks for your advice. I've tried a direct API call using _awscurl_:

```
awscurl --service glue https://glue.us-east-1.amazonaws.com/iceberg/v1/catalogs/<account id>:<catalog name>/dev/namespaces/public/tables/<table name> | jq
```

and got a result like this:

```
{
  "config": {
    "aws.server-side-capabilities.scan-planning": "true",
    "aws.glue.staging.data-transfer-role-arn": "xxx",
    "aws.glue.staging.location": "s3://redshift-staging-bucket-xxx/xxx:xxx/write/xxx/",
    "aws.glue.staging.expiration-ms": "1747882835000",
    "aws.glue.staging.session-token": "xxx",
    "aws.glue.staging.access-key-id": "xxx",
    "aws.glue.staging.secret-access-key": "xxx",
    "aws.server-side-capabilities.data-commit": "true"
  },
  "metadata": {
    "current-schema-id": 0,
    "current-snapshot-id": 0,
    "default-sort-order-id": 0,
    "default-spec-id": 0,
    "format-version": 2,
    "last-column-id": 2,
    "last-partition-id": 1000,
    "last-sequence-number": 0,
    "last-updated-ms": 0,
    "location": "tbl",
    "metadata-log": [],
    "partition-specs": [
      {
        "fields": [],
        "spec-id": 0
      }
    ],
    "properties": {
      "aws.write.format": "RMS",
      "schema.name-mapping.default": "[ {\n \"field-id\" : 1,\n \"names\" : [ \"id\" ]\n}, {\n \"field-id\" : 2,\n \"names\" : [ \"name\" ]\n} ]"
    },
    "refs": {
      "main": {
        "snapshot-id": 0,
        "type": "branch"
      }
    },
    "schemas": [
      {
        "fields": [
          {
            "id": 1,
            "name": "id",
            "required": false,
            "type": "int"
          },
          {
            "id": 2,
            "name": "name",
            "required": false,
            "type": "string"
          }
        ],
        "schema-id": 0,
        "type": "struct"
      }
    ],
    "snapshot-log": [
      {
        "snapshot-id": 0,
        "timestamp-ms": 0
      }
    ],
    "snapshots": [
      {
        "manifest-list": "<table name>",
        "parent-snapshot-id": 0,
        "schema-id": 0,
        "sequence-number": 0,
        "snapshot-id": 0,
        "summary": {},
        "timestamp-ms": 0
      }
    ],
    "sort-orders": [
      {
        "fields": [],
        "order-id": 0
      }
    ],
    "statistics-files": [],
    "table-uuid": "xxx"
  }
}
```

The `"snapshots"` part of the metadata JSON looks weird: `manifest-list` is just the table name, and `schema-id` is always 0...