Copilot commented on code in PR #3898:
URL: https://github.com/apache/hertzbeat/pull/3898#discussion_r2605470498
##########
hertzbeat-warehouse/src/main/java/org/apache/hertzbeat/warehouse/store/history/tsdb/duckdb/DuckdbDatabaseDataStorage.java:
##########
@@ -200,12 +224,7 @@ public void saveData(CollectRep.MetricsData metricsData) {
}
case CommonConstants.TYPE_TIME -> {
preparedStatement.setShort(5, (short)
CommonConstants.TYPE_TIME);
- try {
- preparedStatement.setInt(6,
Integer.parseInt(columnValue));
- } catch (NumberFormatException nfe) {
- log.warn("Failed to parse columnValue
'{}' as integer for metric '{}'. Setting value to null.", columnValue, metric,
nfe);
- preparedStatement.setObject(6, null);
- }
+ preparedStatement.setInt(6,
Integer.parseInt(columnValue));
Review Comment:
The removal of the try-catch block around `Integer.parseInt(columnValue)`
means that if `columnValue` cannot be parsed as an integer, a
`NumberFormatException` will be thrown but only caught by the outer
SQLException handler at line 243, which logs "error setting prepared statement"
without indicating the actual parsing issue or which metric/value caused the
problem.
The previous code provided better error handling with a more specific error
message that included the columnValue and metric name. Consider re-adding
specific error handling or ensuring the outer catch block provides sufficient
diagnostic information for debugging parsing failures.
##########
hertzbeat-warehouse/src/main/java/org/apache/hertzbeat/warehouse/store/history/tsdb/duckdb/DuckdbDatabaseDataStorage.java:
##########
@@ -17,6 +17,8 @@
package org.apache.hertzbeat.warehouse.store.history.tsdb.duckdb;
+import com.zaxxer.hikari.HikariConfig;
+import com.zaxxer.hikari.HikariDataSource;
Review Comment:
The code imports and uses HikariCP (`com.zaxxer.hikari.HikariConfig` and
`com.zaxxer.hikari.HikariDataSource`), but there's no explicit HikariCP
dependency in the `hertzbeat-warehouse/pom.xml`.
While HikariCP may be transitively available through Spring Boot
dependencies (if spring-boot-starter-jdbc is present), this is not guaranteed
since the module only has `spring-boot-starter-web` with `provided` scope.
Without an explicit dependency, the code may fail at runtime with a
`ClassNotFoundException`.
Recommendation: Add an explicit dependency in `hertzbeat-warehouse/pom.xml`:
```xml
<dependency>
<groupId>com.zaxxer</groupId>
<artifactId>HikariCP</artifactId>
</dependency>
```
##########
hertzbeat-warehouse/src/main/java/org/apache/hertzbeat/warehouse/store/history/tsdb/duckdb/DuckdbDatabaseDataStorage.java:
##########
@@ -80,9 +80,24 @@ public DuckdbDatabaseDataStorage() {
private boolean initDuckDb() {
try {
Class.forName(DRIVER_NAME);
- try (Connection connection =
DriverManager.getConnection(URL_PREFIX + dbPath);
+
+ // Initialize HikariCP
+ HikariConfig config = new HikariConfig();
+ config.setJdbcUrl(URL_PREFIX + dbPath);
+ config.setDriverClassName(DRIVER_NAME);
+ config.setPoolName("DuckDB-Pool");
+ // Important: Maintain at least one connection to keep the DB file
lock held
+ // and avoid frequent open/close of the embedded DB file which
causes lock errors.
+ config.setMinimumIdle(1);
+ config.setMaximumPoolSize(10);
+ config.setConnectionTimeout(30000);
+ config.setConnectionTestQuery("SELECT 1");
Review Comment:
The HikariCP configuration values (minimumIdle=1, maximumPoolSize=10,
connectionTimeout=30000) are hardcoded. Consider making these configurable
through `DuckdbProperties` to allow users to tune the connection pool based on
their workload and environment.
For example:
- `minimumIdle`: Could be adjusted based on expected concurrent query load
- `maximumPoolSize`: Might need to be higher for high-traffic environments
- `connectionTimeout`: Could be adjusted for slower storage systems
##########
hertzbeat-warehouse/src/main/java/org/apache/hertzbeat/warehouse/store/history/tsdb/duckdb/DuckdbDatabaseDataStorage.java:
##########
@@ -96,16 +111,13 @@ CREATE TABLE IF NOT EXISTS hzb_history (
record_time BIGINT,
labels VARCHAR)""";
statement.execute(createTableSql);
Review Comment:
The removal of index creation (previously `idx_hzb_history_composite` on
(instance, app, metrics, metric, record_time) and `idx_hzb_history_record_time`
on (record_time)) could negatively impact query performance for both the data
retrieval methods and the cleanup operations.
These indexes were specifically designed to optimize:
1. The composite index for query methods like `getHistoryMetricData()` and
`getHistoryIntervalMetricData()` which filter by instance, app, metrics, and
metric
2. The record_time index for the cleanup operation that deletes old records
Without justification in the PR description or code comments explaining why
indexes are no longer needed, this change risks performance degradation,
especially as the dataset grows. Consider re-adding the indexes or documenting
why they're not necessary.
```suggestion
statement.execute(createTableSql);
// Re-add indexes for performance on queries and cleanup
statement.execute("CREATE INDEX IF NOT EXISTS
idx_hzb_history_composite ON hzb_history (instance, app, metrics, metric,
record_time)");
statement.execute("CREATE INDEX IF NOT EXISTS
idx_hzb_history_record_time ON hzb_history (record_time)");
```
##########
hertzbeat-warehouse/src/main/java/org/apache/hertzbeat/warehouse/store/history/tsdb/duckdb/DuckdbProperties.java:
##########
@@ -26,14 +26,15 @@
/**
* Duckdb configuration information
* @param enabled use duckdb store metrics history data
- * @param expireTime save data expire time(ms)
+ * @param expireTime save data expire
Review Comment:
The JavaDoc comment for `expireTime` parameter is incomplete. The old
documentation stated "save data expire time(ms)" but the new version just says
"save data expire", which lacks clarity about the format and unit.
Recommendation: Complete the documentation to clarify that this parameter
accepts time duration strings like "30d" or "90d", not just milliseconds. For
example: "@param expireTime save data expire time (supports formats like '30d',
'90d', or milliseconds)"
```suggestion
* @param expireTime save data expire time (supports formats like '30d',
'90d', or milliseconds)
```
##########
hertzbeat-warehouse/src/main/java/org/apache/hertzbeat/warehouse/store/history/tsdb/duckdb/DuckdbProperties.java:
##########
@@ -26,14 +26,15 @@
/**
* Duckdb configuration information
* @param enabled use duckdb store metrics history data
- * @param expireTime save data expire time(ms)
+ * @param expireTime save data expire
+ * @param storePath duckdb database file path
*/
-
@ConfigurationProperties(prefix =
ConfigConstants.FunctionModuleConstants.WAREHOUSE
- + SignConstants.DOT
- + WarehouseConstants.STORE
- + SignConstants.DOT
- + WarehouseConstants.HistoryName.DUCKDB)
+ + SignConstants.DOT
+ + WarehouseConstants.STORE
+ + SignConstants.DOT
+ + WarehouseConstants.HistoryName.DUCKDB)
public record DuckdbProperties(@DefaultValue("true") boolean enabled,
- @DefaultValue("90d") String expireTime) {
+ @DefaultValue("30d") String expireTime,
Review Comment:
The default value for `expireTime` is inconsistent across configuration
files and code. The `DuckdbProperties` class defines the default as "30d", but
all the YAML configuration files (application.yml, application-test.yml)
specify "90d". This inconsistency could lead to unexpected behavior where users
expect a 90-day retention based on the config files, but get 30 days if they
don't explicitly set the value.
Recommendation: Align the default value to be consistent. Consider using
"90d" in the `DuckdbProperties` class to match the configuration files, or
update all configuration files to use "30d".
```suggestion
@DefaultValue("90d") String expireTime,
```
##########
hertzbeat-warehouse/src/main/java/org/apache/hertzbeat/warehouse/store/history/tsdb/duckdb/DuckdbProperties.java:
##########
@@ -6,7 +6,7 @@
* (the "License"); you may not use this file except in compliance with
* the License. You may obtain a copy of the License at
*
- * http://www.apache.org/licenses/LICENSE-2.0
+ * http://www.apache.org/licenses/LICENSE-2.0
Review Comment:
[nitpick] The URL formatting change from indented to aligned is inconsistent
with the rest of the license header. While this is a minor formatting change,
it's better to keep license headers consistent throughout the codebase.
```suggestion
* http://www.apache.org/licenses/LICENSE-2.0
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]