github-actions[bot] commented on code in PR #59065:
URL: https://github.com/apache/doris/pull/59065#discussion_r2942716463
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/FileQueryScanNode.java:
##########
@@ -98,6 +104,12 @@ public abstract class FileQueryScanNode extends
FileScanNode {
protected FileSplitter fileSplitter;
+ // The data cache function only works for queries on Hive, Iceberg, Hudi,
and Paimon tables.
+ // See: https://doris.incubator.apache.org/docs/dev/lakehouse/data-cache
+ private static final Set<String> CACHEABLE_CATALOGS = new HashSet<>(
+ Arrays.asList("hms", "iceberg", "hudi", "paimon")
+ );
Review Comment:
**Info (LOW): `"hudi"` entry is unreachable**
There is no catalog with type `"hudi"` in Doris. Hudi tables are accessed
through HMS catalogs (catalog type = `"hms"`). Looking at `CatalogFactory` and
`InitCatalogLog.CatalogType`, there's no `HUDI` catalog type — Hudi is a table
format, not a catalog type.
This entry is dead code. Not harmful, but potentially misleading for future
maintainers who might think there's a separate Hudi catalog type to handle.
##########
fe/fe-core/src/main/java/org/apache/doris/datasource/FileQueryScanNode.java:
##########
@@ -651,6 +673,51 @@ public TableScanParams getScanParams() {
return this.scanParams;
}
+ protected boolean fileCacheAdmissionCheck() throws UserException {
+ boolean admissionResultAtTableLevel = true;
+ TableIf tableIf = getTargetTable();
+ String database = tableIf.getDatabase().getFullName();
+ String table = tableIf.getName();
Review Comment:
**Bug (HIGH): Potential NullPointerException**
`tableIf.getDatabase()` is called *before* the `instanceof ExternalTable`
check on line 682. For non-external tables like `FunctionGenTable` (used by
`TVFScanNode`), `getDatabase()` delegates to `getDbNullable()` which can return
`null`.
The `database` variable is then used in all code paths including the debug
log on line 713 (`"Skip file cache admission control for non-external table:
{}.{}", database, table`), which would throw NPE.
Suggested fix: move the `getDatabase().getFullName()` call inside the `if
(tableIf instanceof ExternalTable)` block:
```java
protected boolean fileCacheAdmissionCheck() throws UserException {
boolean admissionResultAtTableLevel = true;
TableIf tableIf = getTargetTable();
String table = tableIf.getName();
if (tableIf instanceof ExternalTable) {
ExternalTable externalTableIf = (ExternalTable) tableIf;
String database = tableIf.getDatabase().getFullName();
String catalog = externalTableIf.getCatalog().getName();
// ... rest of logic
}
return admissionResultAtTableLevel;
}
```
##########
fe/fe-common/src/main/java/org/apache/doris/common/Config.java:
##########
@@ -3871,6 +3871,18 @@ public static int metaServiceRpcRetryTimes() {
})
public static int first_error_msg_max_length = 256;
+ @ConfField(mutable = false, description = {
+ "是否启用文件缓存准入控制(黑白名单)",
+ "Whether to enable file cache admission control(Blocklist and
Allowlist)"
+ })
+ public static boolean enable_file_cache_admission_control = false;
+
+ @ConfField(mutable = true, description = {
+ "存储准入规则的JSON文件目录",
Review Comment:
**Issue (MEDIUM): Config mutability inconsistency**
`enable_file_cache_admission_control` is `mutable = false` (requires FE
restart), but `file_cache_admission_control_json_dir` below is `mutable = true`
(changeable at runtime).
However, the `ConfigWatcher` that monitors the JSON directory is only
started in `FileCacheAdmissionManager.loadOnStartup()`, which runs once at FE
startup. If `file_cache_admission_control_json_dir` is changed at runtime via
`SET GLOBAL`, the watcher still monitors the old directory path — the new path
is never picked up until FE restart.
Two options:
1. Make `file_cache_admission_control_json_dir` also `mutable = false`
(simpler, consistent)
2. Add a callback/listener that restarts the `ConfigWatcher` when the dir
config changes (more complex but truly mutable)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]