Copilot commented on code in PR #60543:
URL: https://github.com/apache/doris/pull/60543#discussion_r2768410662
##########
fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java:
##########
@@ -4367,16 +4371,34 @@ public TStatus
reportCommitTxnResult(TReportCommitTxnResultRequest request) thro
return new TStatus(TStatusCode.NOT_MASTER);
}
- LOG.info("receive load stats report request: {}, backend: {}, dbId:
{}, txnId: {}, label: {}",
- request, clientAddr, request.getDbId(), request.getTxnId(),
request.getLabel());
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("receive load stats report from backend: {}, dbId: {},
txnId: {}, label: {}, tabletIds: {}",
+ clientAddr, request.getDbId(), request.getTxnId(),
request.getLabel(), request.getTabletIds());
+ }
try {
- byte[] receivedProtobufBytes = request.getPayload();
- if (receivedProtobufBytes == null || receivedProtobufBytes.length
<= 0) {
- return new TStatus(TStatusCode.INVALID_ARGUMENT);
+ if (request.isSetTxnId() && request.getTxnId() != -1) {
+ byte[] receivedProtobufBytes = request.getPayload();
+ if (receivedProtobufBytes == null ||
receivedProtobufBytes.length <= 0) {
+ return new TStatus(TStatusCode.INVALID_ARGUMENT);
+ }
+ CommitTxnResponse commitTxnResponse =
CommitTxnResponse.parseFrom(receivedProtobufBytes);
+
Env.getCurrentGlobalTransactionMgr().afterCommitTxnResp(commitTxnResponse,
request.getTabletIds());
Review Comment:
Potential NullPointerException: `request.getTabletIds()` can be null if the
optional field is not set in the Thrift request. Should add a null check using
`request.isSetTabletIds()` before accessing the list, or provide a default
empty list value.
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CloudTabletStatMgr.java:
##########
@@ -365,4 +510,70 @@ private GetTabletStatsResponse
getTabletStats(GetTabletStatsRequest request)
public List<OlapTable.Statistics> getCloudTableStats() {
return this.cloudTableStatsList;
}
+
+ public void addActiveTablets(List<Long> tabletIds) {
+ synchronized (activeTablets) {
+ activeTablets.addAll(tabletIds);
+ }
+ }
Review Comment:
Potential NullPointerException: The `tabletIds` parameter is not checked for
null before calling `addAll`. If a null list is passed in, this will throw a
NullPointerException. Should add a null check before synchronizing and adding
to the set.
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/transaction/CloudGlobalTransactionMgr.java:
##########
@@ -541,6 +543,12 @@ public void afterCommitTxnResp(CommitTxnResponse
commitTxnResponse) {
if (sb.length() > 0) {
LOG.info("notify partition first load. {}", sb);
}
+ // 4. notify update tablet stats
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("force sync tablet stats for txnId: {}, tabletNum: {},
tabletIds: {}", txnId,
+ tabletIds.size(), tabletIds);
+ }
+ ((CloudTabletStatMgr)
(env.getTabletStatMgr())).addActiveTablets(tabletIds);
Review Comment:
Potential ClassCastException: There's no check to verify that
`env.getTabletStatMgr()` is actually an instance of `CloudTabletStatMgr` before
casting. Although this is in CloudGlobalTransactionMgr (which should only run
in cloud mode), adding an instanceof check would make the code more robust
against configuration errors.
```suggestion
if (env.getTabletStatMgr() instanceof CloudTabletStatMgr) {
((CloudTabletStatMgr)
env.getTabletStatMgr()).addActiveTablets(tabletIds);
} else {
LOG.warn("tabletStatMgr is not instance of CloudTabletStatMgr,
skip updating active tablets for txnId: {}",
txnId);
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/transaction/CloudGlobalTransactionMgr.java:
##########
@@ -655,11 +663,12 @@ private void commitTransactionWithoutLock(long dbId,
List<Table> tableList, long
}
final CommitTxnRequest commitTxnRequest = builder.build();
- executeCommitTxnRequest(commitTxnRequest, transactionId, is2PC,
txnCommitAttachment);
+ executeCommitTxnRequest(commitTxnRequest, transactionId, is2PC,
txnCommitAttachment,
+ tabletCommitInfos.stream().map(t ->
t.getTabletId()).collect(Collectors.toList()));
Review Comment:
Potential NullPointerException: `tabletCommitInfos` can be null (as seen in
line 1740 where it's passed as null). Calling `tabletCommitInfos.stream()` on
line 667 will throw a NullPointerException. Should add a null check and handle
the case where tabletCommitInfos is null, potentially passing an empty list to
the method.
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CloudTabletStatMgr.java:
##########
@@ -365,4 +510,70 @@ private GetTabletStatsResponse
getTabletStats(GetTabletStatsRequest request)
public List<OlapTable.Statistics> getCloudTableStats() {
return this.cloudTableStatsList;
}
+
+ public void addActiveTablets(List<Long> tabletIds) {
+ synchronized (activeTablets) {
+ activeTablets.addAll(tabletIds);
+ }
+ }
+
+ // master FE send update tablet stats rpc to other FEs
+ private void pushTabletStats(GetTabletStatsResponse response) {
+ List<Frontend> frontends = getFrontends();
+ if (frontends == null || frontends.isEmpty()) {
+ return;
+ }
+ TSyncCloudTabletStatsRequest request = new
TSyncCloudTabletStatsRequest();
+ request.setTabletStatsPb(ByteBuffer.wrap(response.toByteArray()));
+ for (Frontend fe : frontends) {
+ SYNC_TABLET_STATS_THREAD_POOL.submit(() -> {
+ try {
+ pushTabletStatsToFe(request, fe);
+ } catch (Exception e) {
+ LOG.warn("push tablet stats to fe: {} error",
fe.getHost(), e);
+ }
+ });
+ }
+ }
+
+ private void pushTabletStatsToFe(TSyncCloudTabletStatsRequest request,
Frontend fe) {
+ FrontendService.Client client = null;
+ TNetworkAddress addr = new TNetworkAddress(fe.getHost(),
fe.getRpcPort());
+ boolean ok = false;
+ try {
+ client = ClientPool.frontendStatsPool.borrowObject(addr);
+ TSyncCloudTabletStatsResult result =
client.syncCloudTabletStats(request);
+ ok = true;
+ if (result.getStatus().getStatusCode() != TStatusCode.OK) {
+ LOG.warn("failed to update cloud tablet stats to frontend
{}:{}, err: {}", fe.getHost(),
+ fe.getRpcPort(), result.getStatus().getErrorMsgs());
+ }
+ } catch (Exception e) {
+ LOG.warn("failed to update update cloud tablet stats to frontend
{}:{}", fe.getHost(), fe.getRpcPort(), e);
+ } finally {
+ if (ok) {
+ ClientPool.frontendStatsPool.returnObject(addr, client);
+ } else {
+ ClientPool.frontendStatsPool.invalidateObject(addr, client);
Review Comment:
Potential resource leak: If
`ClientPool.frontendStatsPool.borrowObject(addr)` throws an exception, the
client will be null, but `invalidateObject` will still be called with a null
client. While this may be handled internally by the pool, it would be cleaner
to check if client is not null before calling invalidateObject.
```suggestion
if (client != null) {
if (ok) {
ClientPool.frontendStatsPool.returnObject(addr, client);
} else {
ClientPool.frontendStatsPool.invalidateObject(addr,
client);
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/common/proc/TabletsProcDir.java:
##########
@@ -194,6 +207,11 @@ public List<List<Comparable>> fetchComparableResult(long
version, long backendId
} finally {
table.readUnlock();
}
+ if (tabletIds != null && !tabletIds.isEmpty()) {
+ LOG.info("force sync tablet stats for table: {}, tabletNum: {},
tabletIds: {}", table,
+ tabletIds.size(), tabletIds);
+ ((CloudTabletStatMgr)
(Env.getCurrentEnv().getTabletStatMgr())).addActiveTablets(tabletIds);
Review Comment:
Potential ClassCastException: There's no check to verify that
`Env.getCurrentEnv().getTabletStatMgr()` is actually an instance of
`CloudTabletStatMgr` before casting. Although this code path checks
`Config.isCloudMode()`, adding an instanceof check would make the code more
defensive. Additionally, the result of the cast is not checked for null before
calling methods on it.
```suggestion
Object tabletStatMgr = Env.getCurrentEnv().getTabletStatMgr();
if (tabletStatMgr instanceof CloudTabletStatMgr) {
((CloudTabletStatMgr)
tabletStatMgr).addActiveTablets(tabletIds);
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java:
##########
@@ -4707,6 +4729,37 @@ public TGetOlapTableMetaResult
getOlapTableMeta(TGetOlapTableMetaRequest request
}
}
+ @Override
+ public TSyncCloudTabletStatsResult
syncCloudTabletStats(TSyncCloudTabletStatsRequest request)
+ throws TException {
+ TSyncCloudTabletStatsResult response = new
TSyncCloudTabletStatsResult();
+ TStatus status = new TStatus(TStatusCode.OK);
+ response.setStatus(status);
+
+ if (Env.getCurrentEnv().isMaster()) {
+ LOG.warn("syncCloudTabletStats called on master, ignoring");
+ return response;
+ }
+
+ byte[] receivedProtobufBytes = request.getTabletStatsPb();
+ if (receivedProtobufBytes == null || receivedProtobufBytes.length <=
0) {
+ status.setStatusCode(TStatusCode.INVALID_ARGUMENT);
+ status.addToErrorMsgs("TabletStatsPb is null or empty");
+ return response;
+ }
+ GetTabletStatsResponse getTabletStatsResponse;
+ try {
+ getTabletStatsResponse =
GetTabletStatsResponse.parseFrom(receivedProtobufBytes);
+ } catch (Exception e) {
+ status.setStatusCode(TStatusCode.INVALID_ARGUMENT);
+ status.addToErrorMsgs("parse GetTabletStatsResponse error: " +
e.getMessage());
+ return response;
+ }
+ CloudTabletStatMgr cloudTabletStatMgr = (CloudTabletStatMgr)
(Env.getCurrentEnv().getTabletStatMgr());
+ cloudTabletStatMgr.syncTabletStats(getTabletStatsResponse);
Review Comment:
Missing null check: After casting, there's no null check for
`cloudTabletStatMgr` before calling `syncTabletStats`. If `getTabletStatMgr()`
returns null, this will throw a NullPointerException. Should verify that
`cloudTabletStatMgr` is not null before invoking methods on it.
```suggestion
if (cloudTabletStatMgr != null) {
cloudTabletStatMgr.syncTabletStats(getTabletStatsResponse);
} else {
LOG.warn("TabletStatMgr is null in syncCloudTabletStats,
skipping sync");
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CloudTabletStatMgr.java:
##########
@@ -333,24 +458,44 @@ private void updateStatInfo(List<Long> dbIds) {
(System.currentTimeMillis() - start));
}
- private void updateTabletStat(GetTabletStatsResponse response) {
+ private void updateTabletStat(GetTabletStatsResponse response, boolean
activeUpdate) {
TabletInvertedIndex invertedIndex = Env.getCurrentInvertedIndex();
for (TabletStatsPB stat : response.getTabletStatsList()) {
List<Replica> replicas =
invertedIndex.getReplicasByTabletId(stat.getIdx().getTabletId());
if (replicas == null || replicas.isEmpty() || replicas.get(0) ==
null) {
continue;
}
Replica replica = replicas.get(0);
+ boolean statsChanged = replica.getDataSize() != stat.getDataSize()
+ || replica.getRowsetCount() != stat.getNumRowsets()
+ || replica.getSegmentCount() != stat.getNumSegments()
+ || replica.getRowCount() != stat.getNumRows()
+ || replica.getLocalInvertedIndexSize() !=
stat.getIndexSize()
+ || replica.getLocalSegmentSize() != stat.getSegmentSize();
replica.setDataSize(stat.getDataSize());
replica.setRowsetCount(stat.getNumRowsets());
replica.setSegmentCount(stat.getNumSegments());
replica.setRowCount(stat.getNumRows());
replica.setLocalInvertedIndexSize(stat.getIndexSize());
replica.setLocalSegmentSize(stat.getSegmentSize());
+
Review Comment:
Potential ClassCastException: The code casts `replica` to `CloudReplica`
without verifying it's actually an instance of CloudReplica. While this method
should only be called in cloud mode, adding an instanceof check would make the
code more defensive and prevent unexpected ClassCastExceptions.
```suggestion
if (!(replica instanceof CloudReplica)) {
continue;
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/alter/CloudSchemaChangeJobV2.java:
##########
@@ -91,6 +92,15 @@ protected void commitShadowIndex() throws
AlterCancelException {
}
LOG.info("commitShadowIndex finished, dbId:{}, tableId:{}, jobId:{},
shadowIdxList:{}",
dbId, tableId, jobId, shadowIdxList);
+
+ List<Long> tabletIds = partitionIndexMap.cellSet().stream()
+ .flatMap(cell ->
cell.getValue().getTablets().stream().map(Tablet::getId))
+ .collect(Collectors.toList());
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("force sync tablet stats for table: {}, tabletNum: {},
tabletIds: {}", tableId,
+ tabletIds.size(), tabletIds);
+ }
+ ((CloudTabletStatMgr)
(Env.getCurrentEnv().getTabletStatMgr())).addActiveTablets(tabletIds);
Review Comment:
Potential ClassCastException: There's no check to verify that
`Env.getCurrentEnv().getTabletStatMgr()` is actually an instance of
`CloudTabletStatMgr` before casting. Although this is in CloudSchemaChangeJobV2
(which should only run in cloud mode), adding an instanceof check would make
the code more robust.
```suggestion
Object tabletStatMgr = Env.getCurrentEnv().getTabletStatMgr();
if (tabletStatMgr instanceof CloudTabletStatMgr) {
((CloudTabletStatMgr) tabletStatMgr).addActiveTablets(tabletIds);
} else {
LOG.warn("tabletStatMgr is not instance of CloudTabletStatMgr
when committing shadow index for "
+ "tableId: {}, jobId: {}. Actual type: {}", tableId,
jobId,
(tabletStatMgr != null ?
tabletStatMgr.getClass().getName() : "null"));
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CloudTabletStatMgr.java:
##########
@@ -60,11 +101,42 @@ public CloudTabletStatMgr() {
@Override
protected void runAfterCatalogReady() {
LOG.info("cloud tablet stat begin");
- List<Long> dbIds = getAllTabletStats();
+ // get stats for active tablets
+ Set<Long> copiedTablets;
+ synchronized (activeTablets) {
+ copiedTablets = new HashSet<>(activeTablets);
+ activeTablets.clear();
+ }
+ getActiveTabletStats(copiedTablets);
+
+ // get stats by interval
+ List<Long> dbIds = getAllTabletStats(cloudTablet -> {
+ if (copiedTablets.contains(cloudTablet.getId())) {
+ return false;
+ }
+ List<Replica> replicas =
Env.getCurrentInvertedIndex().getReplicas(cloudTablet.getId());
+ if (replicas == null || replicas.isEmpty()) {
+ return false;
+ }
+ CloudReplica cloudReplica = (CloudReplica) replicas.get(0);
Review Comment:
Potential ClassCastException: The code casts `replicas.get(0)` to
`CloudReplica` without verifying it's actually an instance of CloudReplica.
While this should be safe in cloud mode, adding an instanceof check would make
the code more defensive against unexpected states.
```suggestion
Replica firstReplica = replicas.get(0);
if (!(firstReplica instanceof CloudReplica)) {
LOG.warn("unexpected replica type for tabletId {}: {}",
cloudTablet.getId(),
firstReplica != null ?
firstReplica.getClass().getName() : "null");
return false;
}
CloudReplica cloudReplica = (CloudReplica) firstReplica;
```
##########
fe/fe-core/src/main/java/org/apache/doris/cloud/transaction/CloudGlobalTransactionMgr.java:
##########
@@ -541,6 +543,12 @@ public void afterCommitTxnResp(CommitTxnResponse
commitTxnResponse) {
if (sb.length() > 0) {
LOG.info("notify partition first load. {}", sb);
}
+ // 4. notify update tablet stats
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("force sync tablet stats for txnId: {}, tabletNum: {},
tabletIds: {}", txnId,
+ tabletIds.size(), tabletIds);
+ }
+ ((CloudTabletStatMgr)
(env.getTabletStatMgr())).addActiveTablets(tabletIds);
Review Comment:
Potential NullPointerException: `tabletIds` parameter is not checked for
null before calling `.size()` in the debug log. If a caller passes null (which
appears possible from the interface signature), this will throw an NPE. Should
add a null check before accessing the list.
```suggestion
tabletIds == null ? 0 : tabletIds.size(), tabletIds);
}
if (tabletIds != null && !tabletIds.isEmpty()) {
((CloudTabletStatMgr)
(env.getTabletStatMgr())).addActiveTablets(tabletIds);
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/catalog/CloudTabletStatMgr.java:
##########
@@ -87,17 +159,23 @@ private List<Long> getAllTabletStats() {
OlapTable tbl = (OlapTable) table;
for (Partition partition : tbl.getAllPartitions()) {
for (MaterializedIndex index :
partition.getMaterializedIndices(IndexExtState.VISIBLE)) {
- for (Long tabletId : index.getTabletIdsInOrder()) {
+ for (Tablet tablet : index.getTablets()) {
+ if (predicate != null) {
Review Comment:
Potential ClassCastException: The code casts `tablet` to `CloudTablet`
without verifying it's actually an instance of CloudTablet. While this should
be safe in cloud mode, adding an instanceof check would make the code more
defensive against unexpected states.
```suggestion
if (predicate != null) {
if (!(tablet instanceof CloudTablet)) {
continue;
}
```
##########
fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java:
##########
@@ -4367,16 +4371,34 @@ public TStatus
reportCommitTxnResult(TReportCommitTxnResultRequest request) thro
return new TStatus(TStatusCode.NOT_MASTER);
}
- LOG.info("receive load stats report request: {}, backend: {}, dbId:
{}, txnId: {}, label: {}",
- request, clientAddr, request.getDbId(), request.getTxnId(),
request.getLabel());
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("receive load stats report from backend: {}, dbId: {},
txnId: {}, label: {}, tabletIds: {}",
+ clientAddr, request.getDbId(), request.getTxnId(),
request.getLabel(), request.getTabletIds());
+ }
try {
- byte[] receivedProtobufBytes = request.getPayload();
- if (receivedProtobufBytes == null || receivedProtobufBytes.length
<= 0) {
- return new TStatus(TStatusCode.INVALID_ARGUMENT);
+ if (request.isSetTxnId() && request.getTxnId() != -1) {
+ byte[] receivedProtobufBytes = request.getPayload();
+ if (receivedProtobufBytes == null ||
receivedProtobufBytes.length <= 0) {
+ return new TStatus(TStatusCode.INVALID_ARGUMENT);
+ }
+ CommitTxnResponse commitTxnResponse =
CommitTxnResponse.parseFrom(receivedProtobufBytes);
+
Env.getCurrentGlobalTransactionMgr().afterCommitTxnResp(commitTxnResponse,
request.getTabletIds());
+ } else {
+ // compaction notify update tablet stats
+ CloudTabletStatMgr cloudTabletStatMgr = (CloudTabletStatMgr)
(Env.getCurrentEnv().getTabletStatMgr());
+ if (cloudTabletStatMgr == null) {
+ LOG.warn("CloudTabletStatMgr is null");
+ TStatus status = new TStatus(TStatusCode.INTERNAL_ERROR);
+ status.addToErrorMsgs("CloudTabletStatMgr is null");
+ return status;
+ }
+ if (LOG.isDebugEnabled()) {
+ LOG.debug("force sync tablet stats for txnId: {},
tabletNum: {}, tabletIds: {}", request.txnId,
+ request.getTabletIds().size(),
request.getTabletIds());
+ }
+ cloudTabletStatMgr.addActiveTablets(request.getTabletIds());
Review Comment:
Potential NullPointerException: `request.getTabletIds()` can be null if the
field is not set in the Thrift request. This will cause an NPE when calling
`.size()` on line 4398 and when passing it to `addActiveTablets`. Should check
if `request.isSetTabletIds()` and handle the null case before accessing the
list.
##########
fe/fe-core/src/main/java/org/apache/doris/service/FrontendServiceImpl.java:
##########
@@ -4707,6 +4729,37 @@ public TGetOlapTableMetaResult
getOlapTableMeta(TGetOlapTableMetaRequest request
}
}
+ @Override
+ public TSyncCloudTabletStatsResult
syncCloudTabletStats(TSyncCloudTabletStatsRequest request)
+ throws TException {
+ TSyncCloudTabletStatsResult response = new
TSyncCloudTabletStatsResult();
+ TStatus status = new TStatus(TStatusCode.OK);
+ response.setStatus(status);
+
+ if (Env.getCurrentEnv().isMaster()) {
+ LOG.warn("syncCloudTabletStats called on master, ignoring");
+ return response;
+ }
+
+ byte[] receivedProtobufBytes = request.getTabletStatsPb();
+ if (receivedProtobufBytes == null || receivedProtobufBytes.length <=
0) {
+ status.setStatusCode(TStatusCode.INVALID_ARGUMENT);
+ status.addToErrorMsgs("TabletStatsPb is null or empty");
+ return response;
+ }
+ GetTabletStatsResponse getTabletStatsResponse;
+ try {
+ getTabletStatsResponse =
GetTabletStatsResponse.parseFrom(receivedProtobufBytes);
+ } catch (Exception e) {
+ status.setStatusCode(TStatusCode.INVALID_ARGUMENT);
+ status.addToErrorMsgs("parse GetTabletStatsResponse error: " +
e.getMessage());
+ return response;
+ }
+ CloudTabletStatMgr cloudTabletStatMgr = (CloudTabletStatMgr)
(Env.getCurrentEnv().getTabletStatMgr());
Review Comment:
Potential ClassCastException: There's no check to verify that
`Env.getCurrentEnv().getTabletStatMgr()` is actually an instance of
`CloudTabletStatMgr` before casting. In cloud mode, this is likely safe, but in
non-cloud mode or during transitions, this could throw a ClassCastException.
Consider adding an `instanceof` check or wrapping in a try-catch for safety.
```suggestion
if (!(Env.getCurrentEnv().getTabletStatMgr() instanceof
CloudTabletStatMgr)) {
status.setStatusCode(TStatusCode.INTERNAL_ERROR);
status.addToErrorMsgs("TabletStatMgr is not an instance of
CloudTabletStatMgr");
return response;
}
CloudTabletStatMgr cloudTabletStatMgr = (CloudTabletStatMgr)
Env.getCurrentEnv().getTabletStatMgr();
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]