Copilot commented on code in PR #61844:
URL: https://github.com/apache/doris/pull/61844#discussion_r3014814959


##########
be/src/storage/segment/lazy_init_segment_iterator.cpp:
##########
@@ -40,8 +41,16 @@ Status LazyInitSegmentIterator::init(const 
StorageReadOptions& opts) {
     std::shared_ptr<Segment> segment;
     {
         SegmentCacheHandle segment_cache_handle;
-        RETURN_IF_ERROR(SegmentLoader::instance()->load_segment(
-                _rowset, _segment_id, &segment_cache_handle, 
_should_use_cache, false, opts.stats));
+        auto st = SegmentLoader::instance()->load_segment(
+                _rowset, _segment_id, &segment_cache_handle, 
_should_use_cache, false, opts.stats);
+        if ((st.is<ErrorCode::NOT_FOUND>() || st.is<ErrorCode::IO_ERROR>()) &&
+            config::ignore_not_found_segment) {
+            LOG(WARNING) << "segment io error, skip it. rowset_id=" << 
_rowset->rowset_id()
+                         << ", seg_id=" << _segment_id << ", status=" << st;
+            // _inner_iterator remains nullptr, next_batch() will return EOF
+            return Status::OK();
+        }
+        RETURN_IF_ERROR(st);

Review Comment:
   The ignore+log logic for segment load failures is now duplicated across 
`SegmentLoader::load_segments`, `LazyInitSegmentIterator::init`, and 
`BetaRowset::load_segments`. To avoid future drift (e.g., one path handling a 
new error code differently), consider extracting a small helper (e.g., 
`should_ignore_segment_load_error(Status)`) or a shared utility for the 
condition + standardized log message.



##########
be/src/common/config.h:
##########
@@ -1704,6 +1704,11 @@ DECLARE_mBool(enable_parquet_page_index);
 // Default is true, if set to false, the not found file will result in query 
failure.
 DECLARE_mBool(ignore_not_found_file_in_external_table);
 
+// Whether to ignore IO errors (NOT_FOUND, EIO) when loading segment files in 
native olap tables.
+// Default is true. When a segment file is missing or has IO errors,
+// the query/load will skip the failing segment instead of reporting error to 
users.
+DECLARE_mBool(ignore_not_found_segment);

Review Comment:
   The new config name `ignore_not_found_segment` is misleading because the 
behavior (and comment) also includes `IO_ERROR`, not just missing segments. 
Since this is a newly introduced config, consider renaming it to something like 
`ignore_segment_io_errors` (or otherwise aligning the name strictly to 
NOT_FOUND-only behavior) to avoid operator confusion when toggling at runtime.
   ```suggestion
   // Whether to ignore IO errors (NOT_FOUND, EIO) when loading segment files 
in native OLAP tables.
   // Default is true. When a segment file is missing or has IO errors,
   // the query/load will skip the failing segment instead of reporting error 
to users.
   DECLARE_mBool(ignore_segment_io_errors);
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to