[GitHub] [netbeans] matthiasblaesing commented on a diff in pull request #5299: [NETBEANS-4123] Initial implementation of handling large strings

via GitHub Sun, 29 Jan 2023 14:42:11 -0800


matthiasblaesing commented on code in PR #5299:
URL: https://github.com/apache/netbeans/pull/5299#discussion_r1081855724



##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -130,7 +132,12 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
         try {
             ReferenceType st = ObjectReferenceWrapper.referenceType(sr);
             ArrayReference sa = null;
+            //only applicable if the string implementation uses a byte[] instead
+            //of a char[]
+            boolean isUTF16 = false;
+            boolean isCompressedImpl = false;

Review Comment:
   This is JEP 254, "Compact Strings". No compression is involved, only two
different character encodings (ISO-8859-1 vs. UTF-16).
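
To illustrate the point about encodings, here is a minimal standalone sketch (not the PR's code). The class and method names are hypothetical; the constants `LATIN1 = 0` and `UTF16 = 1` mirror the coder values used by `java.lang.String` since JDK 9, and little-endian byte order is assumed for the UTF-16 case.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration class, not part of NetBeans or the JDK.
class CompactStringsSketch {
    static final int LATIN1 = 0; // one byte per char
    static final int UTF16 = 1;  // two bytes per char

    /** Picks Latin-1 only if every char fits into a single byte. */
    static int chooseCoder(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    /** Encodes the string into the byte layout matching the chosen coder. */
    static byte[] encode(String s) {
        // Assumption: little-endian layout for the UTF-16 case.
        return chooseCoder(s) == LATIN1
                ? s.getBytes(StandardCharsets.ISO_8859_1)
                : s.getBytes(StandardCharsets.UTF_16LE);
    }
}
```

Both layouts store every character in full; the only difference is whether each one occupies one or two bytes, which is why the `coder` field has to be read before interpreting the `byte[]`.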



##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -141,25 +148,59 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
                             continue;
                         }
                         Type type = f.type();
-                        if (type instanceof ArrayType &&
-                            "char".equals(((ArrayType) type).componentTypeName())) {
-                            valuesField = f;
+                        if (type instanceof ArrayType) {
+                            String componentType = ((ArrayType)type).componentTypeName();
+                            if ("byte".equals(componentType)){
+                                isCompressedImpl = true;
+                                valuesField = f;
+                            }
+                            else if ("char".equals(componentType)){
+                                valuesField = f;
+                            }
+                            else{
+                                continue;
+                            }
                             break;
                         }
                     }
                 }
+                else if (valuesField.type() instanceof ArrayType &&
+                        "byte".equals(((ArrayType)valuesField.type()).
+                                componentTypeName())){
+                    isCompressedImpl = true;
+                }
                 if (valuesField == null) {
                     isShort = true; // We did not find the values field.
                 } else {
+                    if (isCompressedImpl){
+                        //is it compressed?
+                        final int LATIN1 = 0;
+                        Field coderField = ReferenceTypeWrapper.fieldByName(st,
+                                "coder");
+                        Value coderValue;
+                        if (coderField != null){
+                            coderValue = ObjectReferenceWrapper.getValue(sr,
+                                    coderField);
+                            if (coderValue instanceof IntegerValue &&
+                                    ((IntegerValue)coderValue).value() != LATIN1){

Review Comment:
   This check failed on JDK 17 for me: I got a `ByteValue` at this point.
Instead of being too narrow, just switch to `PrimitiveValue`, which has helpful
accessors.
   
   ```suggestion
                               if (coderValue instanceof PrimitiveValue &&
                                       ((PrimitiveValue)coderValue).intValue() != LATIN1){
   ```



##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -171,14 +212,43 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
             } else {
                 assert sa != null;
                 int l = AbstractObjectVariable.MAX_STRING_LENGTH;
+                if (isCompressedImpl && isUTF16){
+                    l *= 2;
+                }

Review Comment:
   See below.
   
   ```suggestion
                   List<Value> values = ArrayReferenceWrapper.getValues(sa, 0, isUTF16 ? (l * 2) : l);
   ```



##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -171,14 +212,43 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
             } else {
                 assert sa != null;
                 int l = AbstractObjectVariable.MAX_STRING_LENGTH;
+                if (isCompressedImpl && isUTF16){
+                    l *= 2;
+                }
                 List<Value> values = ArrayReferenceWrapper.getValues(sa, 0, l);
                 char[] characters = new char[l + 3];
-                for (int i = 0; i < l; i++) {
-                    Value v = values.get(i);
-                    if (!(v instanceof CharValue)) {
-                        return "<Unreadable>";
+                if (isCompressedImpl) {
+                    //java compressed string
+                    if (!isUTF16) {
+                        //we can just cast to char
+                        for (int i = 0; i < l; i++) {
+                            Value v = values.get(i);
+                            if (!(v instanceof ByteValue)) {
+                                return ERROR_RESULT;
+                            }
+                            char c = (char)((ByteValue) v).byteValue();
+                            //remove the extended sign
+                            c &= 0xFF;
+                            characters[i] = c;
+                        }
+                    }
+                    else {
+                        //life is pain
+                        //We can't just inline code for this since... native
+                        //jazz and big/little endian stuff... so...
+                        //see StringUTF16.java
+                        //implement later!
+                        return "<Not Implemented>";

Review Comment:
   To decode a string correctly we'd need to know the byte order of the target
VM, but the following implementation should work correctly on little endian
architectures (x86, arm, riscv), which should at least cover the mainline ones.
   
   ```suggestion
                           // This assumes little endian encoding. This should work
                           // for most architectures (x86, arm, riscv), but will
                           // result in bogus data on big endian architectures
                           for (int i = 0; i < l; i++) {
                               int index = i * 2;
                               Value v = values.get(index);
                               if (!(v instanceof ByteValue)) {
                                   return ERROR_RESULT;
                               }
                               Value v2 = values.get(index + 1);
                               if (!(v2 instanceof ByteValue)) {
                                   return ERROR_RESULT;
                               }
                               char c1 = (char) ((ByteValue) v).byteValue();
                               char c2 = (char) ((ByteValue) v2).byteValue();
                               //remove the extended sign
                               c1 = (char) (0xFF & c1);
                               c2 = (char) (0xFF & c2);
                               // char bigEndianChar = (char) ((c1 << 8) | c2);
                               char littleEndianChar = (char) ((c2 << 8) | c1);
                               characters[i] = littleEndianChar;
                           }
   ```
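
The byte-pair combination above can also be tried outside the debugger. The following is a minimal standalone sketch, assuming the raw bytes are already available as a plain `byte[]`; the class name, the method name and the `order` parameter are hypothetical and only illustrate how a known target byte order could be passed in.

```java
import java.nio.ByteOrder;

// Hypothetical helper, not part of NetBeans or JDI.
class Utf16ByteDecoder {
    /**
     * Combines consecutive byte pairs into chars.
     * 'order' is the byte order of the target VM, supplied by the caller.
     */
    static char[] decode(byte[] bytes, int charCount, ByteOrder order) {
        char[] out = new char[charCount];
        for (int i = 0; i < charCount; i++) {
            // Mask with 0xFF to undo the sign extension of byte -> int.
            int b0 = bytes[2 * i] & 0xFF;
            int b1 = bytes[2 * i + 1] & 0xFF;
            out[i] = (char) (order == ByteOrder.LITTLE_ENDIAN
                    ? (b1 << 8) | b0   // low byte comes first
                    : (b0 << 8) | b1); // high byte comes first
        }
        return out;
    }
}
```

Masking while the value is still an `int` avoids the intermediate cast to `char`, and the reading loop needs `2 * charCount` bytes, which matches fetching `l * 2` values in the UTF-16 case.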



