loserwang1024 commented on code in PR #2651:
URL: https://github.com/apache/fluss/pull/2651#discussion_r2796617518
##########
fluss-rpc/src/main/proto/FlussApi.proto:
##########
@@ -1184,4 +1199,34 @@ message PbKvSnapshotLeaseForBucket {
optional int64 partition_id = 1;
required int32 bucket_id = 2;
required int64 snapshot_id = 3;
+}
+
+message PbTableStatsReqForBucket {
+ optional int64 partition_id = 1;
+ required int32 bucket_id = 2;
+}
+
+message PbTableStatsRespForBucket {
+ optional int32 error_code = 1;
+ optional string error_message = 2;
+ optional int64 partition_id = 3;
+ required int32 bucket_id = 4;
+
+ // --- Table-level stats ---
+ // The number of rows in this bucket.
+ // For KV tables: the number of unique keys (live rows).
+ // For Log tables: total number of log records (highWatermark -
logStartOffset).
+ // Absent if row count is not available (e.g., WAL changelog mode or legacy
tables).
+ optional int64 row_count = 5;
+
+ // The data size in bytes of this bucket.
+ // For KV tables: the size of the KV store.
+ // For Log tables: the size of the log segments.
+ // Reserved for future use.
+ // optional int64 data_size_bytes = 6;
+
+ // --- Column-level stats (future) ---
+ // Per-column statistics, keyed by column index.
+ // Only populated when target_columns is specified in the request.
+ // repeated PbColumnStats column_stats = 10;
Review Comment:
is this code in use?
##########
fluss-rpc/src/main/proto/FlussApi.proto:
##########
@@ -1184,4 +1199,34 @@ message PbKvSnapshotLeaseForBucket {
optional int64 partition_id = 1;
required int32 bucket_id = 2;
required int64 snapshot_id = 3;
+}
+
+message PbTableStatsReqForBucket {
+ optional int64 partition_id = 1;
+ required int32 bucket_id = 2;
+}
+
+message PbTableStatsRespForBucket {
+ optional int32 error_code = 1;
+ optional string error_message = 2;
+ optional int64 partition_id = 3;
+ required int32 bucket_id = 4;
+
+ // --- Table-level stats ---
+ // The number of rows in this bucket.
+ // For KV tables: the number of unique keys (live rows).
+ // For Log tables: total number of log records (highWatermark -
logStartOffset).
+ // Absent if row count is not available (e.g., WAL changelog mode or legacy
tables).
+ optional int64 row_count = 5;
+
+ // The data size in bytes of this bucket.
+ // For KV tables: the size of the KV store.
+ // For Log tables: the size of the log segments.
+ // Reserved for future use.
+ // optional int64 data_size_bytes = 6;
Review Comment:
is this code in use?
##########
fluss-flink/fluss-flink-common/src/main/java/org/apache/fluss/flink/utils/PushdownUtils.java:
##########
@@ -365,76 +360,16 @@ public static Collection<RowData> limitScan(
}
}
- public static long countLogTable(TablePath tablePath, Configuration
flussConfig) {
+ public static long countTable(TablePath tablePath, Configuration
flussConfig) {
try (Connection connection =
ConnectionFactory.createConnection(flussConfig);
Admin flussAdmin = connection.getAdmin()) {
- TableInfo tableInfo = flussAdmin.getTableInfo(tablePath).get();
- int bucketCount = tableInfo.getNumBuckets();
- Collection<Integer> buckets =
- IntStream.range(0,
bucketCount).boxed().collect(Collectors.toList());
- List<PartitionInfo> partitionInfos;
- if (tableInfo.isPartitioned()) {
- partitionInfos =
flussAdmin.listPartitionInfos(tablePath).get();
- } else {
- partitionInfos = Collections.singletonList(null);
- }
-
- List<CompletableFuture<Long>> countFutureList =
- offsetLengthes(flussAdmin, tablePath, partitionInfos,
buckets);
- // wait for all the response
- CompletableFuture.allOf(countFutureList.toArray(new
CompletableFuture[0])).join();
- long count = 0;
- for (CompletableFuture<Long> countFuture : countFutureList) {
- count += countFuture.get();
- }
- return count;
+ TableStats tableStats = flussAdmin.getTableStats(tablePath).get();
Review Comment:
What if the fluss server not support getTableStats? Can flink job still work
here?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]