TimothyDing opened a new pull request, #9741:
URL: https://github.com/apache/gravitino/pull/9741

   What changes were proposed in this pull request?
   
     This PR adds support for Hologres Catalog in Apache Gravitino, enabling 
users to connect to and manage Alibaba Cloud Hologres databases through 
Gravitino's unified metadata interface.
   
     Key changes include:
   
     1. New Hologres Catalog Module: Created catalog-jdbc-hologres module with 
complete implementation
       - HologresCatalog - Main catalog class extending JdbcCatalog
       - HologresCatalogOperations - Catalog operations with driver management
       - HologresSchemaOperations - Schema operations with Hologres-specific 
system schemas filtering
       - HologresTableOperations - Table operations supporting TABLE, VIEW, and 
FOREIGN TABLE types
       - HologresTypeConverter - Type converter for 
Hologres/PostgreSQL-compatible types
       - HologresExceptionConverter - Exception converter mapping SQLSTATE codes
       - HologresColumnDefaultValueConverter - Column default value converter
     2. Hologres-Specific Metadata Support: Added comprehensive metadata 
extraction from hologres.hg_table_properties
       - Distribution key
       - Clustering key
       - Primary key
       - Storage format (ORC/SST)
       - Orientation (row/column/mixed)
       - Table group
       - Lifecycle settings
       - Create/DDL timestamps
     3. System Schema Filtering: Configured to hide Hologres system schemas 
from users:
       - hologres_streaming_mv
       - hologres_sample
       - hg_internal
       - hologres_object_table
       - hg_recyclebin
     4. Frontend Integration: Added Hologres provider configuration to web UI 
with column type mappings and table properties
     5. JDBC Driver Dependencies: Added PostgreSQL JDBC driver to runtime 
dependencies for both Hologres and PostgreSQL catalogs
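
The metadata extraction in item 2 can be pictured as a key/value mapping. The following is a minimal Python sketch, not the PR's Java implementation: it assumes the rows read from hologres.hg_table_properties arrive as (property_key, property_value) pairs whose keys already match the property names listed in this description. The function name and that key matching are hypothetical.

```python
# Illustrative sketch only; the PR implements this logic in Java.
HOLOGRES_PREFIX = "hologres."

# Property names as listed in this PR description (without the prefix).
EXPOSED_KEYS = {
    "table_id", "storage_format", "orientation", "distribution_key",
    "clustering_key", "primary_key", "table_group", "lifecycle_in_days",
    "create_time", "last_ddl_time",
}


def to_gravitino_properties(rows):
    """Turn (property_key, property_value) rows into prefixed properties.

    Assumption: raw keys already match the names in EXPOSED_KEYS. Any key
    renaming done by the real implementation is not shown in the PR text.
    """
    properties = {}
    for key, value in rows:
        # Skip keys that are not exposed and properties with no value.
        if key in EXPOSED_KEYS and value is not None:
            properties[HOLOGRES_PREFIX + key] = str(value)
    return properties
```

For example, a row ("orientation", "column") would surface as the table property hologres.orientation with value "column".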
   
     Why are the changes needed?
   
     Hologres is Alibaba Cloud's real-time OLAP database service that is 
compatible with the PostgreSQL protocol. Many enterprises use Hologres as their 
primary data warehousing solution, and Gravitino users need the ability to:
   
     1. Unified Metadata Management: Manage Hologres metadata alongside other 
data sources (Hive, Iceberg, MySQL, etc.) through a single interface
     2. Data Governance: Apply consistent access control, auditing, and 
discovery policies across Hologres data
     3. Multi-Source Integration: Query Hologres data in conjunction with other 
catalog sources through engines like Trino and Spark
     4. Schema Discovery: Browse and search Hologres schemas, tables, and views 
through Gravitino's web UI
   
     The Hologres Catalog implementation:
     - Leverages Hologres' PostgreSQL protocol compatibility
     - Extends Gravitino's JDBC catalog framework
     - Preserves Hologres-specific metadata (distribution keys, clustering 
keys, storage formats)
     - Filters out system schemas to provide a clean user experience
     - Supports all Hologres table types: regular tables, views, and foreign 
tables (MaxCompute external tables)
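
As a rough illustration of the table-type support described above, the Python sketch below groups listed tables by the three type names mentioned in this description. It is not the PR's Java code; the function name and the input shape, a list of (table_name, table_type) pairs, are assumptions made for the example.

```python
# Table type names as given in this PR description.
SUPPORTED_TABLE_TYPES = ("TABLE", "VIEW", "FOREIGN TABLE")


def group_by_table_type(entries):
    """Group (table_name, table_type) pairs, skipping unsupported types."""
    grouped = {table_type: [] for table_type in SUPPORTED_TABLE_TYPES}
    for name, table_type in entries:
        if table_type in grouped:
            grouped[table_type].append(name)
    return grouped
```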
   
     Fix: #N/A (new feature)
   
     Does this PR introduce any user-facing change?
   
     Yes, this PR introduces several user-facing changes:
   
     1. New Catalog Provider: Users can now create Hologres catalogs via REST 
API or Web UI with provider name jdbc-hologres
     2. Required Properties: Hologres catalog requires the following 
configuration properties:
       - jdbc-driver: org.postgresql.Driver (pre-filled default)
       - jdbc-url: Hologres connection URL (e.g., 
jdbc:postgresql://{endpoint}:{port}/{database})
       - jdbc-user: Database username
       - jdbc-password: Database password
       - jdbc-database: Hologres database name
     3. Supported Column Types: Hologres catalog supports the following column 
types:
       - binary, boolean, char, date, decimal, double, float, integer, long, 
short
       - string, time, timestamp, timestamp_tz, varchar
     4. Table Properties: Tables loaded from Hologres include Hologres-specific 
properties with the hologres. prefix:
       - hologres.table_id - Unique table identifier
       - hologres.storage_format - Storage format (orc/sst)
       - hologres.orientation - Storage mode (row/column/mixed)
       - hologres.distribution_key - Distribution column(s)
       - hologres.clustering_key - Clustering key column(s)
       - hologres.primary_key - Primary key column(s)
       - hologres.table_group - Table group name
       - hologres.lifecycle_in_days - TTL setting
       - hologres.create_time - Creation timestamp
       - hologres.last_ddl_time - Last DDL timestamp
     5. Schema Filtering: The following Hologres system schemas are 
automatically filtered out:
       - PostgreSQL system schemas: pg_toast, pg_catalog, information_schema
       - Hologres system schemas: holo, hologres_streaming_mv, hologres_sample, 
hg_internal, hologres_object_table, hg_recyclebin
     6. Table Type Support: All three Hologres table types are discoverable:
       - Regular tables
       - Views
       - Foreign tables (for MaxCompute and other external sources)
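
The schema filtering in item 5 amounts to excluding a fixed set of names. Here is a minimal Python sketch of that idea, using the schema names listed above; the function name is hypothetical and the PR's actual Java implementation may differ.

```python
# System schema names taken from the lists in this PR description.
SYSTEM_SCHEMAS = frozenset({
    # PostgreSQL system schemas
    "pg_toast", "pg_catalog", "information_schema",
    # Hologres system schemas
    "holo", "hologres_streaming_mv", "hologres_sample",
    "hg_internal", "hologres_object_table", "hg_recyclebin",
})


def list_user_schemas(all_schemas):
    """Return schema names in their original order, minus system schemas."""
    return [name for name in all_schemas if name not in SYSTEM_SCHEMAS]
```

With an input of public, pg_catalog, hg_internal, and foreign_holo, only public and foreign_holo remain.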
   
     How was this patch tested?
   
     The implementation was tested with a real Hologres instance:
   
     1. Catalog Creation: Verified catalog creation via REST API
     curl -X POST "http://localhost:8090/api/metalakes/{metalake}/catalogs" \
       -H "Content-Type: application/json" \
       -d '{
         "name": "hologres",
         "type": "relational",
         "provider": "jdbc-hologres",
         "properties": {
           "jdbc-url": "jdbc:postgresql://{endpoint}:{port}/{database}",
           "jdbc-user": "{username}",
           "jdbc-password": "{password}",
           "jdbc-database": "{database}",
           "jdbc-driver": "org.postgresql.Driver"
         }
       }'
     2. Schema Listing: Verified that only user schemas are displayed (system 
schemas filtered):
       - Found: public, foreign_holo (user schemas)
       - Hidden: hologres_streaming_mv, hologres_sample, hg_internal, 
hologres_object_table, hg_recyclebin (system schemas)
     3. Table Listing: Verified listing of all table types:
       - Regular tables: tbl_1, holo_test
       - Views: holo_view
       - Foreign tables: adv_ad_feature, user_profile, etc.
     4. Metadata Loading: Verified Hologres-specific properties are correctly 
extracted:
       - Distribution key: hologres.distribution_key
       - Clustering key: hologres.clustering_key
       - Storage format: hologres.storage_format
       - All properties prefixed with hologres. for clarity
     5. Web UI Integration: Verified Hologres appears as a provider option in 
the frontend with proper configuration fields
     6. JDBC Driver Loading: Fixed driver loading and verified that the 
PostgreSQL JDBC driver is available at runtime for both Hologres and PostgreSQL 
catalogs
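
For scripted testing, the request body from step 1 can be assembled programmatically. The Python sketch below builds the same JSON structure as the curl example and checks for the connection properties before sending. The helper name and the validation step are assumptions for illustration and are not part of the PR.

```python
import json

# Connection properties from this PR description that have no default.
REQUIRED_PROPERTIES = ("jdbc-url", "jdbc-user", "jdbc-password", "jdbc-database")
DEFAULT_DRIVER = "org.postgresql.Driver"


def build_catalog_request(name, properties):
    """Build the catalog-creation body used in the curl example above."""
    missing = [key for key in REQUIRED_PROPERTIES if not properties.get(key)]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    # jdbc-driver falls back to the default unless the caller overrides it.
    merged = {"jdbc-driver": DEFAULT_DRIVER, **properties}
    return {
        "name": name,
        "type": "relational",
        "provider": "jdbc-hologres",
        "properties": merged,
    }


if __name__ == "__main__":
    body = build_catalog_request("hologres", {
        "jdbc-url": "jdbc:postgresql://{endpoint}:{port}/{database}",
        "jdbc-user": "{username}",
        "jdbc-password": "{password}",
        "jdbc-database": "{database}",
    })
    # The printed JSON can be passed to curl via -d.
    print(json.dumps(body, indent=2))
```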
   
     Testing Instructions:
   
     To test this PR:
     1. Build the project: ./gradlew build -PpythonVersion=3.11
     2. Create distribution: ./gradlew compileDistribution
     3. Start Gravitino server: ./distribution/package/bin/gravitino.sh start
     4. Create a Hologres catalog via REST API or Web UI
     5. List schemas and verify system schemas are hidden
     6. List tables and verify all table types are shown
     7. Load a table and verify Hologres metadata properties are present with 
the hologres. prefix
   
     ---
     References:
     - https://help.aliyun.com/zh/hologres
     - https://help.aliyun.com/zh/hologres/developer-reference/system-tables
     - https://help.aliyun.com/zh/hologres/user-guide/distribution-key
     - https://help.aliyun.com/zh/hologres/user-guide/clustering-key


