jangjangji commented on code in PR #1426:
URL: https://github.com/apache/cloudberry/pull/1426#discussion_r2567536755


##########
gpMgmt/bin/gppylib/logfilter.py:
##########
@@ -278,13 +278,17 @@ def __next__(self):
         item = next(self.source)
         #we need to make a minor format change to the log level field so that
         # our single regex will match both.
-        item[16] = item[16] + ": "
+        # Check if item has enough columns before accessing index 16
+        if isinstance(item, list) and len(item) > 16:
+            item[16] = item[16] + ": "

Review Comment:
   ### When is the condition false?

   The condition is false when `len(item) <= 16` (or when `item` is not a
   list). The short-row case occurs when the CSV reader incorrectly parses
   lines containing multi-line strings (e.g., SQL statements with newlines).

   While our CSV files follow the standard 23-column PostgreSQL csvlog
   format, some log entries contain multi-line quoted strings that cause the
   CSV reader to return incomplete rows with fewer than 17 columns.
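
A minimal sketch of one way such short rows can arise, assuming the log is split into physical lines before each line is handed to the CSV parser (I have not confirmed that this is exactly what happens in `logfilter.py`). The 23-column row and the field values are made up for illustration:

```python
import csv
import io

# Hypothetical 23-column log row whose 14th field (index 13) holds a
# two-line SQL statement, so the writer has to quote it.
fields = [f"c{i}" for i in range(23)]
fields[13] = "SELECT 1\nFROM t"

buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(fields)
raw = buf.getvalue()

# Parsing the whole stream lets the reader keep the quoted newline
# inside a single field, so the record stays intact.
whole_rows = list(csv.reader(io.StringIO(raw)))

# Parsing each physical line on its own (the assumed failure mode)
# splits the same record into two short fragments.
fragment_rows = [next(csv.reader([line])) for line in raw.splitlines()]

print("whole stream:", [len(r) for r in whole_rows])
print("line by line:", [len(r) for r in fragment_rows])
```

In this sketch both fragments have fewer than 17 fields, so `item[16]` would raise `IndexError` on either of them.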
   
   
   
   ### Is it correct to ignore the else branch?

   Yes. The `item[16] + ": "` transformation is optional formatting that
   helps the regex pattern match; it is not a required operation.

   Without this check, a single malformed line causes an `IndexError`, which
   terminates the entire iterator silently (treated as `StopIteration`),
   resulting in no output despite reading thousands of lines.

   By skipping the transformation for malformed lines:
   - Valid lines (23 columns) → processed normally with the transformation
   - Malformed lines (< 17 columns) → passed through unchanged
   - Downstream filters naturally handle/discard non-matching entries

   This allows processing to continue instead of failing silently on the
   first malformed line.
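
The pass-through behaviour described above can be sketched as follows. `GuardedLevelFormatter` is a hypothetical stand-in written for this comment, not the actual class in `logfilter.py`; only the guard itself mirrors the diff:

```python
class GuardedLevelFormatter:
    """Hypothetical iterator that mirrors the guarded __next__ in the diff."""

    LEVEL_INDEX = 16  # position of the log level field, taken from the diff

    def __init__(self, source):
        self.source = iter(source)

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self.source)
        # Only rows long enough to have a log level field are reformatted;
        # anything shorter is returned untouched.
        if isinstance(item, list) and len(item) > self.LEVEL_INDEX:
            item[self.LEVEL_INDEX] = item[self.LEVEL_INDEX] + ": "
        return item


# Made-up input: two full 23-column rows around one truncated row.
rows = [
    [f"a{i}" for i in range(23)],
    ["truncated", "row"],
    [f"b{i}" for i in range(23)],
]
out = list(GuardedLevelFormatter(rows))

for row in out:
    print(len(row), row[16] if len(row) > 16 else "(unchanged)")
```

All three rows come out the other end: the full rows carry the `": "` suffix on field 16, and the short row is left for downstream filters to match or discard.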



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

