vinishjail97 opened a new issue, #810: URL: https://github.com/apache/incubator-xtable/issues/810

## Summary

Add Microsoft Fabric's OneLake Catalog as a new catalog sync target in XTable's `RunCatalogSync`, alongside the existing Glue, HMS, and (documented) Unity Catalog integrations. This would allow XTable users to automatically register and refresh tables in Microsoft Fabric lakehouses after format conversions, enabling discovery through the OneLake Catalog UI (embedded in Teams, Excel, and Copilot Studio).

## Motivation

Microsoft Fabric is a widely adopted analytics platform, and many organizations store their lakehouse data in OneLake. Currently, XTable users targeting Fabric must manually register converted tables. A native `CatalogSyncClient` for OneLake would close this gap and bring XTable's catalog sync story to parity across the major cloud data platforms:

| Cloud | Catalog | XTable Support |
|-------|---------|----------------|
| AWS | Glue Data Catalog | Implemented |
| On-prem / multi-cloud | Hive Metastore (HMS) | Implemented |
| Databricks | Unity Catalog | Documented (manual SQL DDL) |
| **Microsoft Fabric** | **OneLake Catalog** | **Not yet supported** |

## Available APIs

Microsoft Fabric exposes several REST APIs that could be used to implement this:

1. **Fabric Lakehouse REST API** (`https://api.fabric.microsoft.com/v1/workspaces/{id}/lakehouses/{id}/tables`): supports listing tables, loading files into Delta tables, and table maintenance. This is the most promising path for write operations (creating/refreshing table registrations).
2. **OneLake Table APIs** (`https://onelake.table.fabric.microsoft.com`): provides read-only metadata operations compatible with the Iceberg REST Catalog (IRC) and Unity Catalog API standards. Write operations are not yet supported but are on the roadmap.
3. **OneLake filesystem APIs** (ADLS Gen2-compatible): for direct file operations on OneLake storage paths (`abfss://`).

Authentication uses Microsoft Entra ID (Azure AD) OAuth tokens with the `https://storage.azure.com/` audience.
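
As a rough illustration of the first API above, the sketch below builds (but does not send) a "list tables" request against the Lakehouse endpoint quoted in this issue. The class `OneLakeTablesRequest` and its method names are hypothetical and not part of XTable. It assumes the caller already holds a valid Entra ID bearer token and that the workspace and lakehouse IDs are GUID-style strings; token acquisition and the shape of the JSON response are deliberately left out.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.time.Duration;

/** Hypothetical helper (not part of XTable): builds a "list tables" request for a Fabric lakehouse. */
final class OneLakeTablesRequest {
    // Base URL taken from the endpoint quoted in this issue.
    static final String FABRIC_API_BASE = "https://api.fabric.microsoft.com/v1";

    private OneLakeTablesRequest() {}

    /** Fills the two {id} placeholders of the tables endpoint with the workspace and lakehouse IDs. */
    static URI tablesUri(String workspaceId, String lakehouseId) {
        return URI.create(FABRIC_API_BASE
            + "/workspaces/" + encode(workspaceId)
            + "/lakehouses/" + encode(lakehouseId)
            + "/tables");
    }

    /** Builds a GET request carrying a caller-supplied bearer token; nothing is sent here. */
    static HttpRequest listTablesRequest(String workspaceId, String lakehouseId, String bearerToken) {
        return HttpRequest.newBuilder(tablesUri(workspaceId, lakehouseId))
            .header("Authorization", "Bearer " + bearerToken)
            .header("Accept", "application/json")
            .timeout(Duration.ofSeconds(30))
            .GET()
            .build();
    }

    private static String encode(String pathSegment) {
        // URLEncoder targets form encoding; assumed adequate here because IDs are GUID-style strings.
        return URLEncoder.encode(pathSegment, StandardCharsets.UTF_8);
    }
}
```

A sync client could pass the resulting request to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` and parse the body. Keeping request construction separate from sending makes the URL and header logic testable without network access.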

## Proposed Implementation

Following the existing Glue/HMS patterns:

1. **New module**: `xtable-azure` (or add to an existing module)
2. **`OneLakeCatalogSyncClient`** implementing `CatalogSyncClient<T>` with `getCatalogType()` returning `"ONELAKE"`
3. **`OneLakeCatalogConfig`** for Fabric-specific config (tenant ID, workspace ID, lakehouse ID, auth settings)
4. **Per-format `CatalogTableBuilder` implementations** (Delta, Iceberg, Hudi) using the Fabric Lakehouse REST API
5. **`CatalogType.ONELAKE`** constant
6. **ServiceLoader registration** in `META-INF/services/`
7. **Optional**: `CatalogConversionSource` implementation using the read-only OneLake Table APIs

## References

- [OneLake Catalog Overview](https://learn.microsoft.com/en-us/fabric/governance/onelake-catalog-overview)
- [OneLake Table APIs Overview](https://learn.microsoft.com/en-us/fabric/onelake/table-apis/table-apis-overview)
- [Fabric Lakehouse REST API](https://learn.microsoft.com/en-us/fabric/data-engineering/lakehouse-api)
- [OneLake Table APIs for Delta (Unity Catalog compatible)](https://learn.microsoft.com/en-us/fabric/onelake/table-apis/delta-table-apis-overview)
- [OneLake Table APIs for Iceberg (IRC compatible)](https://learn.microsoft.com/en-us/fabric/onelake/table-apis/iceberg-table-apis-overview)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
