Copilot commented on code in PR #61284:
URL: https://github.com/apache/doris/pull/61284#discussion_r2924233068
##########
fe/fe-core/src/main/java/org/apache/doris/job/offset/s3/S3SourceOffsetProvider.java:
##########
@@ -162,11 +169,19 @@ public void fetchRemoteMeta(Map<String, String>
properties) throws Exception {
String filePath = storageProperties.validateAndNormalizeUri(uri);
List<RemoteFile> objects = new ArrayList<>();
GlobListResult globListResult =
fileSystem.globListWithLimit(filePath, objects, startFile, 1, 1);
- if (globListResult != null && !objects.isEmpty() &&
StringUtils.isNotEmpty(globListResult.getMaxFile())) {
+ // debug point: simulate globListWithLimit returning a failed
status (e.g. S3 auth error)
+ if
(DebugPointUtil.isEnable("S3SourceOffsetProvider.fetchRemoteMeta.error")) {
+ globListResult = new GlobListResult(new
Status(Status.ErrCode.COMMON_ERROR,
+ "debug point: simulated S3 auth error"));
+ }
Review Comment:
The debug point is evaluated after making the real `globListWithLimit(...)`
call, which still performs an S3 list and can introduce latency/flakiness in
the regression test (and defeats the purpose of simulating the failure
deterministically). Consider checking the debug point before calling
`globListWithLimit(...)` and short-circuiting to the failed `GlobListResult` to
avoid the remote call entirely when the debug point is enabled.
##########
fe/fe-core/src/main/java/org/apache/doris/job/offset/s3/S3SourceOffsetProvider.java:
##########
@@ -69,6 +71,11 @@ public S3Offset getNextOffset(StreamingJobProperties
jobProps, Map<String, Strin
filePath = storageProperties.validateAndNormalizeUri(uri);
GlobListResult globListResult =
fileSystem.globListWithLimit(filePath, rfiles, startFile,
jobProps.getS3BatchBytes(), jobProps.getS3BatchFiles());
+ if (globListResult == null || !globListResult.getStatus().ok()) {
+ String errMsg = globListResult != null
+ ? globListResult.getStatus().getErrMsg() : "null
result";
+ throw new RuntimeException("Failed to list S3 files: " +
errMsg);
Review Comment:
When `globListWithLimit(...)` returns null or non-ok, the thrown message
lacks key context (e.g., which `uri`/`filePath` was being listed and possibly
the Status error code). Including `filePath` (and
`globListResult.getStatus().getErrCode()` if available) will make production
debugging and operator triage significantly easier than `null result`/errMsg
alone.
```suggestion
String errCode = (globListResult != null &&
globListResult.getStatus() != null)
?
String.valueOf(globListResult.getStatus().getErrCode())
: "unknown";
throw new RuntimeException("Failed to list S3 files for
path: " + filePath
+ ", errCode=" + errCode + ", errMsg=" + errMsg);
```
##########
regression-test/suites/job_p0/streaming_job/test_streaming_insert_job_fetch_meta_error.groovy:
##########
@@ -0,0 +1,96 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements. See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership. The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License. You may obtain a copy of the License at
+//
+// http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied. See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+
+import org.awaitility.Awaitility
+
+import static java.util.concurrent.TimeUnit.SECONDS
+
+// Verify that when fetchRemoteMeta receives a failed GlobListResult (e.g. S3
auth error),
+// the streaming job is correctly PAUSED rather than hanging indefinitely.
+suite("test_streaming_insert_job_fetch_meta_error", "nonConcurrent") {
+ def tableName = "test_streaming_insert_job_fetch_meta_error_tbl"
+ def jobName = "test_streaming_insert_job_fetch_meta_error_job"
+
+ sql """drop table if exists `${tableName}` force"""
+ sql """
+ DROP JOB IF EXISTS where jobname = '${jobName}'
+ """
+
+ sql """
+ CREATE TABLE IF NOT EXISTS ${tableName} (
+ `c1` int NULL,
+ `c2` string NULL,
+ `c3` int NULL,
Review Comment:
The trailing comma after the last column definition (`c3`) will cause a SQL
syntax error in many SQL parsers. Remove the trailing comma after `c3` so the
table creation is valid.
```suggestion
`c3` int NULL
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]