yuqi1129 commented on code in PR #9779:
URL: https://github.com/apache/gravitino/pull/9779#discussion_r2736524840
##########
scripts/h2/schema-1.2.0-h2.sql:
##########
@@ -480,3 +480,19 @@ CREATE TABLE IF NOT EXISTS `function_version_info` (
KEY `idx_funvcid` (`catalog_id`),
KEY `idx_funvsid` (`schema_id`)
) ENGINE=InnoDB;
+
+-- This schema extends version 1.1.0 with partition statistics storage support
+-- The partition_statistic_meta table stores partition-level statistics for tables
+
+CREATE TABLE IF NOT EXISTS partition_statistic_meta (
Review Comment:
This PR is a feature that will not be cherry-picked to branch-1.1, so I
believe we do not need to add the file `upgrade-1.0.0-to-1.1.1-xxx`.
Also, why do the changes in the MySQL scripts differ from those in the H2
and PostgreSQL scripts?
##########
docs/manage-statistics-in-gravitino.md:
##########
@@ -234,11 +234,73 @@ table.dropPartitionStatistics(statisticsToDrop);
### Server configuration
-| Configuration item | Description | Default value | Required | Since version |
-|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------|-----------|---------------|
-| `gravitino.stats.partition.storageFactoryClass` | The storage factory class for partition statistics, which is used to store partition statistics in the different storage. The `org.apache.gravitino.stats.storage.MemoryPartitionStatsStorageFactory` can only be used for testing. | `org.apache.gravitino.stats.storage.LancePartitionStatisticStorageFactory` | No | 1.0.0 |
+| Configuration item | Description | Default value | Required | Since version |
+|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|-----------|---------------|
+| `gravitino.stats.partition.storageFactoryClass` | The storage factory class for partition statistics, which is used to store partition statistics in the different storage. The `org.apache.gravitino.stats.storage.MemoryPartitionStatsStorageFactory` can only be used for testing. | `org.apache.gravitino.stats.storage.JdbcPartitionStatisticStorageFactory` | No | 1.0.0 |
+#### JDBC Storage (Default)
+
+Starting from version 1.2.0, Gravitino uses JDBC-based storage as the default partition statistics storage backend.
+This provides a reliable, production-ready solution that supports multiple database backends:
+
+- **MySQL** (recommended for production)
+- **PostgreSQL**
+- **H2** (suitable for testing and development)
+
+To use JDBC storage, configure the following options by adding the prefix `gravitino.stats.partition.storageOption.`:
+
+| Configuration item | Description | Default value | Required | Since version |
+|-----------------------------------------------------|----------------------------------------------------------------------|--------------------------------|----------|---------------|
Review Comment:
Can you try to align the table?
<img width="1110" height="425" alt="Image" src="https://github.com/user-attachments/assets/6d623f01-a874-46a2-8902-32fb74967af0" />
##########
docs/manage-statistics-in-gravitino.md:
##########
@@ -234,11 +234,73 @@ table.dropPartitionStatistics(statisticsToDrop);
### Server configuration
-| Configuration item | Description | Default value | Required | Since version |
-|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------|-----------|---------------|
-| `gravitino.stats.partition.storageFactoryClass` | The storage factory class for partition statistics, which is used to store partition statistics in the different storage. The `org.apache.gravitino.stats.storage.MemoryPartitionStatsStorageFactory` can only be used for testing. | `org.apache.gravitino.stats.storage.LancePartitionStatisticStorageFactory` | No | 1.0.0 |
+| Configuration item | Description | Default value | Required | Since version |
+|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|-----------|---------------|
+| `gravitino.stats.partition.storageFactoryClass` | The storage factory class for partition statistics, which is used to store partition statistics in the different storage. The `org.apache.gravitino.stats.storage.MemoryPartitionStatsStorageFactory` can only be used for testing. | `org.apache.gravitino.stats.storage.JdbcPartitionStatisticStorageFactory` | No | 1.0.0 |
+#### JDBC Storage (Default)
+
+Starting from version 1.2.0, Gravitino uses JDBC-based storage as the default partition statistics storage backend.
+This provides a reliable, production-ready solution that supports multiple database backends:
+
+- **MySQL** (recommended for production)
+- **PostgreSQL**
+- **H2** (suitable for testing and development)
+
+To use JDBC storage, configure the following options by adding the prefix `gravitino.stats.partition.storageOption.`:
+
+| Configuration item | Description | Default value | Required | Since version |
+|-----------------------------------------------------|----------------------------------------------------------------------|--------------------------------|----------|---------------|
+| `gravitino.stats.partition.storageOption.jdbc-url` | JDBC connection URL (e.g., jdbc:mysql://localhost:3306/gravitino) | None | Yes | 1.2.0 |
+| `gravitino.stats.partition.storageOption.jdbc-user` | Database username | None | Yes | 1.2.0 |
+| `gravitino.stats.partition.storageOption.jdbc-password` | Database password | None | Yes | 1.2.0 |
+| `gravitino.stats.partition.storageOption.jdbc-driver` | JDBC driver class name | `com.mysql.cj.jdbc.Driver` | No | 1.2.0 |
+| `gravitino.stats.partition.storageOption.pool-max-size` | Maximum connection pool size | `10` | No | 1.2.0 |
+| `gravitino.stats.partition.storageOption.pool-min-idle` | Minimum idle connections in pool | `2` | No | 1.2.0 |
+| `gravitino.stats.partition.storageOption.connection-timeout-ms` | Connection timeout in milliseconds | `30000` | No | 1.2.0 |
+| `gravitino.stats.partition.storageOption.test-on-borrow` | Test connections before use | `true` | No | 1.2.0 |
+
+**Example MySQL Configuration:**
+
+```properties
+gravitino.stats.partition.storageFactoryClass=org.apache.gravitino.stats.storage.JdbcPartitionStatisticStorageFactory
+gravitino.stats.partition.storageOption.jdbc-url=jdbc:mysql://localhost:3306/gravitino
+gravitino.stats.partition.storageOption.jdbc-user=gravitino
+gravitino.stats.partition.storageOption.jdbc-password=gravitino123
+gravitino.stats.partition.storageOption.pool-max-size=20
+```
+
+**Example PostgreSQL Configuration:**
+
+```properties
+gravitino.stats.partition.storageFactoryClass=org.apache.gravitino.stats.storage.JdbcPartitionStatisticStorageFactory
+gravitino.stats.partition.storageOption.jdbc-url=jdbc:postgresql://localhost:5432/gravitino
+gravitino.stats.partition.storageOption.jdbc-user=gravitino
+gravitino.stats.partition.storageOption.jdbc-password=gravitino123
+gravitino.stats.partition.storageOption.jdbc-driver=org.postgresql.Driver
+```
+
+**Database Schema Setup:**
+
+Before using JDBC storage, you need to create the database schema. Schema files are provided for all supported databases:
+
+- MySQL: `scripts/mysql/schema-1.2.0-mysql.sql`
Review Comment:
We'd better avoid hard-coding a concrete version number here;
`${GRAVITINO_VERSION}` may be a better choice.
##########
docs/manage-statistics-in-gravitino.md:
##########
@@ -234,11 +234,73 @@ table.dropPartitionStatistics(statisticsToDrop);
### Server configuration
-| Configuration item | Description | Default value | Required | Since version |
-|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------|-----------|---------------|
-| `gravitino.stats.partition.storageFactoryClass` | The storage factory class for partition statistics, which is used to store partition statistics in the different storage. The `org.apache.gravitino.stats.storage.MemoryPartitionStatsStorageFactory` can only be used for testing. | `org.apache.gravitino.stats.storage.LancePartitionStatisticStorageFactory` | No | 1.0.0 |
+| Configuration item | Description | Default value | Required | Since version |
+|-------------------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|--------------------------------------------------------------------------|-----------|---------------|
+| `gravitino.stats.partition.storageFactoryClass` | The storage factory class for partition statistics, which is used to store partition statistics in the different storage. The `org.apache.gravitino.stats.storage.MemoryPartitionStatsStorageFactory` can only be used for testing. | `org.apache.gravitino.stats.storage.JdbcPartitionStatisticStorageFactory` | No | 1.0.0 |
+#### JDBC Storage (Default)
+
+Starting from version 1.2.0, Gravitino uses JDBC-based storage as the default partition statistics storage backend.
+This provides a reliable, production-ready solution that supports multiple database backends:
+
+- **MySQL** (recommended for production)
+- **PostgreSQL**
+- **H2** (suitable for testing and development)
+
+To use JDBC storage, configure the following options by adding the prefix `gravitino.stats.partition.storageOption.`:
+
+| Configuration item | Description | Default value | Required | Since version |
+|-----------------------------------------------------|----------------------------------------------------------------------|--------------------------------|----------|---------------|
+| `gravitino.stats.partition.storageOption.jdbc-url` | JDBC connection URL (e.g., jdbc:mysql://localhost:3306/gravitino) | None | Yes | 1.2.0 |
+| `gravitino.stats.partition.storageOption.jdbc-user` | Database username | None | Yes | 1.2.0 |
+| `gravitino.stats.partition.storageOption.jdbc-password` | Database password | None | Yes | 1.2.0 |
+| `gravitino.stats.partition.storageOption.jdbc-driver` | JDBC driver class name | `com.mysql.cj.jdbc.Driver` | No | 1.2.0 |
+| `gravitino.stats.partition.storageOption.pool-max-size` | Maximum connection pool size | `10` | No | 1.2.0 |
+| `gravitino.stats.partition.storageOption.pool-min-idle` | Minimum idle connections in pool | `2` | No | 1.2.0 |
+| `gravitino.stats.partition.storageOption.connection-timeout-ms` | Connection timeout in milliseconds | `30000` | No | 1.2.0 |
+| `gravitino.stats.partition.storageOption.test-on-borrow` | Test connections before use | `true` | No | 1.2.0 |
+
+**Example MySQL Configuration:**
+
+```properties
+gravitino.stats.partition.storageFactoryClass=org.apache.gravitino.stats.storage.JdbcPartitionStatisticStorageFactory
Review Comment:
I would suggest adding a space before and after `=`.
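As a rough illustration of that formatting (a sketch only, reusing the keys and values from the MySQL example in the diff above), the first lines might read:

```properties
gravitino.stats.partition.storageFactoryClass = org.apache.gravitino.stats.storage.JdbcPartitionStatisticStorageFactory
gravitino.stats.partition.storageOption.jdbc-url = jdbc:mysql://localhost:3306/gravitino
gravitino.stats.partition.storageOption.jdbc-user = gravitino
```

In the standard Java `.properties` format, whitespace around `=` is ignored, so this should be a purely cosmetic change.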