matthiasblaesing commented on code in PR #5299:
URL: https://github.com/apache/netbeans/pull/5299#discussion_r1081855724
##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -130,7 +132,12 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
try {
ReferenceType st = ObjectReferenceWrapper.referenceType(sr);
ArrayReference sa = null;
+ //only applicable if the string implementation uses a byte[] instead
+ //of a char[]
+ boolean isUTF16 = false;
+ boolean isCompressedImpl = false;
Review Comment:
This is JEP 254, "Compact Strings". There is no compression involved, just two
different character encodings (ISO-8859-1 vs. UTF-16).
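To illustrate the distinction, here is a minimal sketch of how the encoding choice could be modeled. It is not the JDK's actual code: the class `CompactStringsSketch` and both helper methods are hypothetical names, and the constant `UTF16 = 1` is taken from `java.lang.String` rather than from this PR (only `LATIN1 = 0` appears in the diff).

```java
// Hypothetical illustration of the two encodings used by Compact Strings.
final class CompactStringsSketch {
    static final int LATIN1 = 0; // one byte per char (ISO-8859-1)
    static final int UTF16 = 1;  // two bytes per char

    // LATIN1 only if every char fits into a single byte.
    static int coderFor(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0xFF) {
                return UTF16;
            }
        }
        return LATIN1;
    }

    // Number of bytes the backing byte[] would need for this string.
    static int storedLength(String s) {
        return coderFor(s) == LATIN1 ? s.length() : s.length() * 2;
    }

    public static void main(String[] args) {
        System.out.println(coderFor("abc") + " " + storedLength("abc"));
        System.out.println(coderFor("a\u20ac") + " " + storedLength("a\u20ac"));
    }
}
```

Under this model a name such as `isCompactImpl` would probably describe the byte[] case more accurately than `isCompressedImpl`.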
##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -141,25 +148,59 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
continue;
}
Type type = f.type();
- if (type instanceof ArrayType &&
- "char".equals(((ArrayType) type).componentTypeName())) {
- valuesField = f;
+ if (type instanceof ArrayType) {
+ String componentType = ((ArrayType)type).componentTypeName();
+ if ("byte".equals(componentType)){
+ isCompressedImpl = true;
+ valuesField = f;
+ }
+ else if ("char".equals(componentType)){
+ valuesField = f;
+ }
+ else{
+ continue;
+ }
break;
}
}
}
+ else if (valuesField.type() instanceof ArrayType &&
+ "byte".equals(((ArrayType)valuesField.type()).
+ componentTypeName())){
+ isCompressedImpl = true;
+ }
if (valuesField == null) {
isShort = true; // We did not find the values field.
} else {
+ if (isCompressedImpl){
+ //is it compressed?
+ final int LATIN1 = 0;
+ Field coderField = ReferenceTypeWrapper.fieldByName(st,
+ "coder");
+ Value coderValue;
+ if (coderField != null){
+ coderValue = ObjectReferenceWrapper.getValue(sr,
+ coderField);
+ if (coderValue instanceof IntegerValue &&
+ ((IntegerValue)coderValue).value() != LATIN1){
Review Comment:
This check failed on JDK 17 for me. I got a `ByteValue` at this point.
Instead of checking for a type that is too narrow, just switch to
`PrimitiveValue`, which has helpful accessors.
```suggestion
if (coderValue instanceof PrimitiveValue &&
((PrimitiveValue)coderValue).intValue() != LATIN1){
```
##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -171,14 +212,43 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
} else {
assert sa != null;
int l = AbstractObjectVariable.MAX_STRING_LENGTH;
+ if (isCompressedImpl && isUTF16){
+ l *= 2;
+ }
Review Comment:
See below.
```suggestion
List<Value> values = ArrayReferenceWrapper.getValues(sa, 0,
isUTF16 ? (l * 2) : l);
```
##########
java/debugger.jpda/src/org/netbeans/modules/debugger/jpda/models/ShortenedStrings.java:
##########
@@ -171,14 +212,43 @@ static String getStringWithLengthControl(StringReference sr) throws InternalExce
} else {
assert sa != null;
int l = AbstractObjectVariable.MAX_STRING_LENGTH;
+ if (isCompressedImpl && isUTF16){
+ l *= 2;
+ }
List<Value> values = ArrayReferenceWrapper.getValues(sa, 0, l);
char[] characters = new char[l + 3];
- for (int i = 0; i < l; i++) {
- Value v = values.get(i);
- if (!(v instanceof CharValue)) {
- return "<Unreadable>";
+ if (isCompressedImpl) {
+ //java compressed string
+ if (!isUTF16) {
+ //we can just cast to char
+ for (int i = 0; i < l; i++) {
+ Value v = values.get(i);
+ if (!(v instanceof ByteValue)) {
+ return ERROR_RESULT;
+ }
+ char c = (char)((ByteValue) v).byteValue();
+ //remove the extended sign
+ c &= 0xFF;
+ characters[i] = c;
+ }
+ }
+ else {
+ //life is pain
+ //We can't just inline code for this since... native
+ //jazz and big/little endian stuff... so...
+ //see StringUTF16.java
+ //implement later!
+ return "<Not Implemented>";
Review Comment:
To correctly decode a string we'd need to know the byte order of the target
VM, but the following implementation should work correctly on little endian
architectures (x86, arm, riscv), which should at least cover the mainline ones.
```suggestion
// This assumes little endian encoding. This should work
// for most architectures (x86, arm, riscv), but will
// result in bogus data on big endian architectures
for (int i = 0; i < l; i++) {
int index = i * 2;
Value v = values.get(index);
if (!(v instanceof ByteValue)) {
return ERROR_RESULT;
}
Value v2 = values.get(index + 1);
if (!(v2 instanceof ByteValue)) {
return ERROR_RESULT;
}
char c1 = (char) ((ByteValue) v).byteValue();
char c2 = (char) ((ByteValue) v2).byteValue();
//remove the extended sign
c1 = (char) (0xFF & c1);
c2 = (char) (0xFF & c2);
// char bigEndianChar = (char) ((c1 << 8) | c2);
char littleEndianChar = (char) ((c2 << 8) | c1);
characters[i] = littleEndianChar;
}
```
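If the byte order of the target VM could be determined at some point, the decoding could be parameterized instead of hard-coding little endian. The following is only a standalone sketch of that idea on a plain `byte[]`: the class `Utf16ByteDecoder` and its `decode` method are hypothetical and not part of the debugger code, and how to obtain the target VM's byte order is left open.

```java
import java.nio.ByteOrder;

// Hypothetical helper: turns pairs of bytes into chars for a given byte order.
final class Utf16ByteDecoder {
    static String decode(byte[] value, int charCount, ByteOrder order) {
        char[] chars = new char[charCount];
        for (int i = 0; i < charCount; i++) {
            // Mask with 0xFF to undo the sign extension of byte -> int.
            int b0 = value[2 * i] & 0xFF;
            int b1 = value[2 * i + 1] & 0xFF;
            chars[i] = order == ByteOrder.LITTLE_ENDIAN
                    ? (char) ((b1 << 8) | b0)   // low byte stored first
                    : (char) ((b0 << 8) | b1);  // high byte stored first
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        byte[] littleEndian = {0x41, 0x00, (byte) 0xAC, 0x20};
        System.out.println(decode(littleEndian, 2, ByteOrder.LITTLE_ENDIAN));
    }
}
```

Masking each byte before shifting avoids the intermediate cast to `char` and the separate sign-removal step used in the suggestion.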