github-actions[bot] commented on code in PR #64755:
URL: https://github.com/apache/doris/pull/64755#discussion_r3461136951


##########
be/src/exec/scan/olap_scanner.cpp:
##########
@@ -301,9 +301,21 @@ Status OlapScanner::_open_impl(RuntimeState* state) {
 
     auto res = _tablet_reader->init(_tablet_reader_params);
     if (!res.ok()) {
-        res.append("failed to initialize storage reader. tablet=" +
-                   std::to_string(_tablet_reader_params.tablet->tablet_id()) +
-                   ", backend=" + BackendOptions::get_localhost());
+        const auto tablet_id = 
std::to_string(_tablet_reader_params.tablet->tablet_id());
+        const auto& backend = BackendOptions::get_localhost();
+        // The reader eagerly reads the first block of each rowset during 
init() to seed
+        // the merge heap, so errors from evaluating pushed-down expressions 
(e.g. a strict
+        // cast on an illegal string -> INVALID_ARGUMENT) can surface here. 
Only label
+        // genuine storage-level failures as "failed to initialize storage 
reader"; for
+        // data/expression errors keep the message neutral so users are not 
misled into
+        // suspecting a corrupted tablet/segment.
+        if (res.is<ErrorCode::INVALID_ARGUMENT>()) {

Review Comment:
   This still misses the same class of misleading message for 
non-`INVALID_ARGUMENT` expression errors. The first-row read during 
`BlockReader::init()` can evaluate any pushed common expression from 
`_common_expr_ctxs_push_down`, not just strict casts. For example, a slot-based 
predicate containing decimal arithmetic can be pushed as a common expr, 
`SegmentIterator::_execute_common_expr()` returns the expression status, and 
with `check_overflow_for_decimal` enabled the arithmetic functions throw 
`ARITHMETIC_OVERFLOW_ERRROR`; datelike functions can similarly throw 
`OUT_OF_BOUND`. Those are also user data/expression failures, but this branch 
falls through to `". failed to initialize storage reader..."`, preserving the 
misleading storage diagnosis the PR is trying to remove. Please classify 
init-time expression/data statuses more broadly, or otherwise identify 
expression-evaluation failures explicitly, and add a regression test for this 
scanner-open path so the wording does not regress.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to