mengxianwen1 opened a new pull request, #10980:
URL: https://github.com/apache/gravitino/pull/10980

   ### What changes were proposed in this pull request?
   
   This PR adds a three-level property inheritance mechanism for 
`lance.storage.*` properties (AK/SK, endpoint, etc.) so that Lance tables can 
inherit storage credentials from catalog and schema levels instead of requiring 
them on every        
     `createTable` call. The inheritance precedence is: **table > schema > 
catalog** (child overrides parent).
                                                                                
                                                                                
                                                                                
      
     Changes:                                                                   
                                                                                
                                                                                
      
      
     1. **`GenericCatalogPropertiesMetadata` / 
`GenericSchemaPropertiesMetadata`**: Added `lance.storage.*` prefix property 
entry with `hidden=true` so credentials are accepted but not exposed in API 
responses.                                    
                                                               
     2. **`GenericCatalogOperations`**:                                         
                                                                                
                                                                                
      
        - Store `conf` (catalog properties) in `initialize()` following the 
`FilesetCatalogOperations` pattern.
        - Added `mergeLanceStorageProperties()` method that merges 
`lance.storage.*` from catalog → schema → table with child-overrides-parent 
precedence.                                                                     
                       
        - `createTable()` now calls `mergeLanceStorageProperties()` and puts 
merged properties into table properties before delegating to format-specific 
operations.                                                                     
            
                                                                                
                                                                                
                                                                                
      
     3. **`GravitinoLanceTableOperations`** (Lance REST Server path):           
                                                                                
                                                                                
      
        - Added `mergeCatalogStorageProperties()` to merge catalog-level 
`lance.storage.*` into table properties using `putIfAbsent` (table-level takes 
precedence).                                                                    
              
        - Called in `createTable()` and `registerTable()`. 
(`createEmptyTable()` delegates to `createTable()` so it's covered 
automatically.)                                                                 
                                        
                                                                                
                                                                                
                                                                                
      
     4. **`LanceTableOperations.openDataset()`**: Fixed a bug where 
`openDataset()` did not pass storage options to `Dataset.open()`, causing 
`alterTable` and other read operations to fail on cloud-based Lance datasets. 
Added `openDataset(String,
      Map)` overload and updated `handleLanceTableChange()` to pass storage 
options. 
   
   ### Why are the changes needed?
                                                                                
                                                                                
    
   Currently, only the `location` property has a three-level fallback chain 
(table → schema → catalog). The `lance.storage.*` properties (AK/SK, endpoint, 
region) must be explicitly specified on every `createTable` call, even though 
these      
     credentials are typically shared across all tables under the same catalog 
or schema. This is redundant and error-prone.
                                                                                
                                                                                
                                                                                
      
     Additionally, `LanceTableOperations.openDataset()` lacks storage options, 
which means `alterTable` operations fail on cloud-based Lance datasets that 
require credentials.                                                            
               
   
   Fix: #10979 
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes:                                                                         
                                                                                
                                                                                
    
     1. **New property keys**: Users can now set `lance.storage.*` properties 
at catalog and schema levels (e.g., `lance.storage.s3.access-key-id`, 
`lance.storage.s3.endpoint`). These properties are marked as `hidden` and will 
not be returned in
     API responses.                                                             
                                                                                
                                                                                
      
     2. **Behavior change**: When creating a Lance table without 
`lance.storage.*` properties, the table will now automatically inherit them 
from the schema and catalog levels. Previously, these properties would be 
absent, causing creation to
     fail for cloud storage backends.                                           
                                                                                
                                                                                
      
     3. **Backward compatible**: Tables that already explicitly set 
`lance.storage.*` continue to work unchanged — table-level values always take 
precedence.
   
   ### How was this patch tested?
   
   (Please test your changes, and provide instructions on how to test it:
      - **`TestLanceStoragePropertyInheritance`** (12 tests): Covers the 
three-level merge logic including precedence overrides, null handling, and 
non-lance-storage property filtering.                                           
                   
     - **`TestGravitinoLanceTableOperationsStorageMerge`** (5 tests): Covers 
the REST Server path catalog-to-table merge logic.
     - **`TestLanceTableOperations`**: Updated mock for `openDataset` signature 
change.                                                                         
                                                                                
      
     - All tests pass: `./gradlew :catalogs:catalog-lakehouse-generic:test 
:lance:lance-common:test -PskipITs`
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to