fischcheng opened a new issue, #1512: URL: https://github.com/apache/polaris/issues/1512
### Describe the bug

### Description:

When attempting to load a Polaris catalog using PyIceberg, the call to the GET /api/catalog/v1/config?warehouse=<warehouse_path> endpoint fails with an HTTP 404 error. Server logs indicate the reason is "Unable to find warehouse <warehouse_path>". However, querying the Management API (GET /api/management/v1/catalogs/{catalog_name}) confirms that a catalog does exist with the exact matching default-base-location property.

Why would the Catalog API endpoint GET /api/catalog/v1/config fail to find the warehouse configuration (s3://<warehouse_path>) when the Management API (GET /api/management/v1/catalogs/test_catalog) confirms that exact configuration exists? Is this a potential bug in the 0.9.0 version, or is there another configuration aspect or permission requirement for the /config endpoint that might be missing?

### To Reproduce

### Steps to Reproduce:

1. Set up Docker Compose: use a `docker-compose.yml` similar to the one below, based on getting-started/eclipselink/docker-compose-minimum.yaml. The key aspect is the polaris-setup service, which creates the initial catalog.

```
services:
  polaris:
    # IMPORTANT: the image MUST contain the Postgres JDBC driver and EclipseLink dependencies, see README for instructions
    image: apache/polaris:latest
    ports:
      # API port
      - "8181:8181"
      # Management port (metrics and health checks)
      - "8182:8182"
    environment:
      polaris.persistence.type: eclipse-link
      polaris.persistence.eclipselink.configuration-file: /deployments/config/eclipselink/persistence-minimum.xml
      polaris.realm-context.realms: POLARIS
      quarkus.otel.sdk.disabled: "true"
    volumes:
      - ../assets/eclipselink/:/deployments/config/eclipselink
    healthcheck:
      test: ["CMD", "curl", "http://localhost:8182/q/health"]
      interval: 2s
      timeout: 10s
      retries: 10
      start_period: 10s
```

2. After spinning up, use the Polaris CLI to create a catalog:

```
./polaris \
  --client-id root \
  --client-secret s3cr3t \
  catalogs create \
  --storage-type S3 \
  --default-base-location "s3://my-lakehouse" \
  --role-arn "arn:aws:iam::role" \
  test_catalog
```

3. docker-compose up

4. PyIceberg code

```
from pyiceberg.catalog import load_catalog
# from pyiceberg.exceptions import NoSuchCatalogError  # Or appropriate exception

polaris_host = "<YOUR_POLARIS_EC2_DNS_OR_IP>"  # Redacted host
polaris_api_uri = f"http://{polaris_host}:8181/api/catalog"  # Correct prefix found via logs
s3_warehouse_location = "s3://my-lakehouse-bucket"  # Must match STORAGE_LOCATION used above
polaris_client_id = "root"
polaris_client_secret = "<REDACTED_SECRET>"
s3_role_arn_to_assume = "arn:aws:iam::<AWS_ACCOUNT_ID>:role/docker-iam-role"  # Redacted account ID
aws_region = "us-east-1"

catalog_properties = {
    "type": "rest",
    "uri": polaris_api_uri,
    "credential": f"{polaris_client_id}:{polaris_client_secret}",
    "warehouse": s3_warehouse_location,
    "py-io-impl": "pyiceberg.io.pyarrow.PyArrowFileIO",
    "s3.assume-role.arn": s3_role_arn_to_assume,
    "s3.assume-role.session-name": "pyiceberg-polaris-session",
    "s3.region": aws_region,
}

catalog = load_catalog(
    name="polaris_catalog",  # Logical name for this instance
    **catalog_properties
)
```

### Actual Behavior

The load_catalog call fails. The underlying HTTP request to GET /api/catalog/v1/config?warehouse=s3%3A%2F%2Fmy-lakehouse-bucket returns an HTTP 404 error.

```
[EL Fine]: sql: ... --SELECT ... FROM ENTITIES_ACTIVE WHERE ... NAME = ?))
	bind => [..., s3://my-lakehouse-bucket]
INFO [org.apa.pol.ser.exc.IcebergExceptionMapper] ... Handling runtimeException Unable to find warehouse s3://my-lakehouse-bucket
INFO [io.qua.htt.access-log] ... "GET /api/catalog/v1/config?warehouse=s3%3A%2F%2Fmy-lakehouse-bucket HTTP/1.1" 404 111
```

### Expected Behavior

The load_catalog call should succeed, returning a valid catalog object.
The underlying call to GET /api/catalog/v1/config?warehouse=s3://my-lakehouse-bucket should return HTTP 200 OK with the catalog configuration.

### Additional context

Querying the Management API confirms the catalog configuration seems correct:

```
# Get Token (replace root:<REDACTED_SECRET>)
ACCESS_TOKEN=$(curl -s -X POST -u "root:<REDACTED_SECRET>" \
  -H "Content-Type: application/x-www-form-urlencoded" \
  -d "grant_type=client_credentials" \
  http://{HOST}:8181/api/catalog/v1/oauth/tokens | jq -r .access_token)

# Query Management API for specific catalog (replace {HOST})
curl -s -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H 'Accept: application/json' \
  http://{HOST}:8181/api/management/v1/catalogs/test_catalog | jq .
```

    {
      "type": "INTERNAL",
      "name": "test_catalog",
      "properties": {
        "default-base-location": "s3://my-lakehouse-bucket" // <-- EXACT MATCH!
      },
      "createTimestamp": 1746209705140,
      "lastUpdateTimestamp": 1746209705140,
      "entityVersion": 1,
      "storageConfigInfo": {
        "roleArn": "arn:aws:iam::<AWS_ACCOUNT_ID>:role/docker-iam-role", // Redacted
        "externalId": null,
        "userArn": null,
        "region": null,
        "storageType": "S3",
        "allowedLocations": [
          "s3://my-lakehouse-bucket"
        ]
      }
    }

This output clearly shows the default-base-location property is correctly set to s3://my-lakehouse-bucket for the test_catalog.

Troubleshooting Steps Taken:

- Verified API path prefix is /api/catalog/v1/ via server logs. Updated PyIceberg uri accordingly.
- Verified authentication works (can get token, requests get past 401 when token is valid).
- Verified server configuration using the Management API (output above confirms default-base-location matches).
- Attempted deleting and recreating the catalog using curl against the Management API, ensuring the correct default-base-location was specified in the payload. The issue persists.
- Verified Polaris basic startup is clean (no obvious errors in startup logs).
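To check the /config request in isolation from PyIceberg, the sketch below builds the URL the client sends for a given warehouse value. It is a minimal illustration only: the helper name `config_url` and the `localhost` host are hypothetical, and the script only constructs URLs without contacting the server. Since the SQL in the server log binds the warehouse value against a NAME column, it may also be worth comparing the same request with the catalog name (test_catalog); whether Polaris resolves the warehouse parameter by catalog name is an assumption here, not something confirmed in this report.

```python
from urllib.parse import urlencode


def config_url(base_uri: str, warehouse: str) -> str:
    """Return the Iceberg REST config URL for a warehouse value (hypothetical helper)."""
    # urlencode percent-encodes ':' and '/' in the value, as seen in the access log.
    query = urlencode({"warehouse": warehouse})
    return f"{base_uri.rstrip('/')}/v1/config?{query}"


if __name__ == "__main__":
    base = "http://localhost:8181/api/catalog"  # assumed host; replace with the real one
    # Value used in this report: the catalog's default-base-location.
    print(config_url(base, "s3://my-lakehouse-bucket"))
    # Assumption to test: the catalog name instead of the storage location.
    print(config_url(base, "test_catalog"))
```

Each printed URL can be requested with curl and the bearer token obtained above, so the two status codes can be compared directly.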
### System information

- Polaris Version: apache/polaris:latest (0.9.0)
- Deployment: Docker Compose using the getting-started/eclipselink/docker-compose-minimum.yaml structure.
- Database: Postgres (implied by the EclipseLink setup; uses an existing RDS Postgres instance)
- Client: PyIceberg (0.9.0) using pyiceberg.catalog.load_catalog
- Python Version: 3.11
- Host OS: Polaris running on Ubuntu EC2 via docker-compose, PyIceberg running on OSX.