nnguyen168 commented on PR #55466:
URL: https://github.com/apache/spark/pull/55466#issuecomment-4553918031

   Thanks for addressing the line comment case @yadavay-amzn. However, the 
nested block comment bug I mentioned earlier is still present.
   
     Failing case:
   ```
   "SELECT 1; /* outer /* inner */ */"
     Expected: ["SELECT 1"]
     Actual:   ["SELECT 1", " /* outer /* inner */ */"]
   ```
   
     The issue is that `hasPrecedingNonCommentString` is still being set for 
every `/*`, including nested ones. When the second `/*` is encountered in 
`/* outer /* inner */ */`, the substring `" /* outer "` contains non-whitespace 
characters (`/`, `*`, `o`, etc.), so `hasPrecedingNonCommentString` incorrectly 
becomes true and the comment-only remainder is returned as a second statement.
   
     The fix: Only set hasPrecedingNonCommentString when entering the outermost 
comment (i.e., when bracketedCommentLevel == 0 before incrementing):
   
   ```
     } else if (hasNext && line.charAt(index + 1) == '*') {
       // Only set hasPrecedingNonCommentString when entering the outermost comment
       if (bracketedCommentLevel == 0) {
         hasPrecedingNonCommentString = beginIndex != index &&
           line.substring(beginIndex, index).replaceAll("--[^\n]*", "")
             .exists(!_.isWhitespace)
       }
       bracketedCommentLevel += 1
     }
   ```
   
     The same fix is needed in both `SparkSQLCLIDriver.splitSemiColon` and 
`StringUtils.splitSemiColonWithIndex`.
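
For reference, here is a minimal standalone sketch of the flag logic, which can be used to reproduce the difference outside Spark. It is not the actual Spark code: the object `NestedCommentFlagSketch`, the method `precedingNonCommentFlag`, and the `guardOutermost` switch are hypothetical names introduced only for illustration, and the sketch ignores quotes and statement splitting, which the real methods also handle. It scans one segment (the text after the last `;`) and returns the final value of the flag, with and without the `bracketedCommentLevel == 0` guard.

```scala
object NestedCommentFlagSketch {

  /**
   * Hypothetical helper (not part of Spark): returns the final value of the
   * flag after scanning `segment`.
   * With guardOutermost = true the flag is only updated when entering the
   * outermost comment; with false it is updated on every "/*", as described
   * for the failing case.
   */
  def precedingNonCommentFlag(segment: String, guardOutermost: Boolean): Boolean = {
    var bracketedCommentLevel = 0
    var hasPrecedingNonCommentString = false
    var index = 0
    while (index < segment.length) {
      val c = segment.charAt(index)
      val hasNext = index + 1 < segment.length
      if (c == '/' && hasNext && segment.charAt(index + 1) == '*') {
        // Entering a bracketed comment, possibly a nested one.
        if (!guardOutermost || bracketedCommentLevel == 0) {
          hasPrecedingNonCommentString =
            segment.substring(0, index).replaceAll("--[^\n]*", "")
              .exists(!_.isWhitespace)
        }
        bracketedCommentLevel += 1
        index += 2
      } else if (c == '*' && hasNext && segment.charAt(index + 1) == '/' &&
          bracketedCommentLevel > 0) {
        // Leaving one level of bracketed comment.
        bracketedCommentLevel -= 1
        index += 2
      } else {
        index += 1
      }
    }
    hasPrecedingNonCommentString
  }
}
```

On the segment `" /* outer /* inner */ */"` the guarded version leaves the flag false, while the unguarded version flips it to true at the nested `/*`. A segment such as `"SELECT 1 /* c */"` yields true in both modes, so the guard should not change behavior for real statements followed by a comment.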


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

