Copilot commented on code in PR #2606:
URL: https://github.com/apache/groovy/pull/2606#discussion_r3407028451


##########
src/main/java/org/apache/groovy/parser/antlr4/util/PositionConfigureUtils.java:
##########
@@ -61,7 +61,10 @@ public static Tuple2<Integer, Integer> endPosition(Token 
token) {
         }
 
         if (0 == newLineCnt) {
-            return tuple(token.getLine(), token.getCharPositionInLine() + 1 + 
token.getText().length());
+            // column numbers are code-point based (the lexer reads a 
CodePointCharStream), so
+            // measure the token text in code points too -- otherwise 
supplementary characters
+            // (e.g. emoji) make lastColumnNumber over-count by one UTF-16 
unit each (GROOVY-12085)
+            return tuple(token.getLine(), token.getCharPositionInLine() + 1 + 
stopText.codePointCount(0, stopTextLength));
         } else { // e.g. GStringEnd contains newlines, we should fix the 
location info
             return tuple(token.getLine() + newLineCnt, stopTextLength - 
stopText.lastIndexOf('\n'));
         }

Review Comment:
   endPosition() still assumes token.getText() is non-null (despite the earlier 
null guard) and also computes the last column on the final line using UTF-16 
indices. This can throw a NullPointerException when stopText is null, and it 
will over-count columns when the final line contains supplementary characters 
after a newline.



##########
src/main/java/org/apache/groovy/parser/antlr4/util/PositionConfigureUtils.java:
##########
@@ -75,7 +78,9 @@ public static <T extends ASTNode> T configureAST(T astNode, 
Token token) {
         astNode.setLineNumber(token.getLine());
         astNode.setColumnNumber(token.getCharPositionInLine() + 1);
         astNode.setLastLineNumber(token.getLine());
-        astNode.setLastColumnNumber(token.getCharPositionInLine() + 1 + 
token.getText().length());
+        // measure the token text in code points to keep lastColumnNumber 
code-point based (GROOVY-12085)
+        String text = token.getText();
+        astNode.setLastColumnNumber(token.getCharPositionInLine() + 1 + 
text.codePointCount(0, text.length()));
 

Review Comment:
   configureAST(Token) now measures token text in code points, but it still 
dereferences token.getText() without a null check. This can throw a 
NullPointerException for tokens that don't have backing text (the method 
previously had the same issue).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to