Github user chenghao-intel commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6762#discussion_r34214489
  
    --- Diff: unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
    @@ -206,6 +207,198 @@ public UTF8String toLowerCase() {
         return fromString(toString().toLowerCase());
       }
     
    +  /**
    +   * Copy the bytes from the current UTF8String, and make a new UTF8String.
    +   * @param start the start position of the current UTF8String in bytes.
    +   * @param end the end position of the current UTF8String in bytes.
    +   * @return a new UTF8String in the position of [start, end] of current UTF8String bytes.
    +   */
    +  private UTF8String copyUTF8String(byte[] bytes, int start, int end) {
    +    int len = end - start + 1;
    +    byte[] newBytes = new byte[len];
    +    System.arraycopy(bytes, start, newBytes, 0, len);
    +    return UTF8String.fromBytes(newBytes);
    +  }
    +
    +  public UTF8String trim() {
    +    byte[] bytes = getBytes();
    --- End diff --
    
    I am not familiar with the internals of UNSAFE; can we avoid the call to `getBytes`?
    I assume we have to call `getBytes` anyway, so we simply call `getBytes` and use the result internally (passing it to the `copyUTF8String` method).
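
A minimal sketch of that idea, outside Spark: fetch the byte array once, find the trim bounds on it, and hand the same array to an inclusive-range copy helper. `TrimSketch`, `copyRange`, and the plain `byte[]` signatures are hypothetical stand-ins for illustration, not the actual `UTF8String` API, and the sketch assumes `trim` only strips ASCII spaces (0x20). Scanning raw bytes is safe for that case because 0x20 never appears inside a multi-byte UTF-8 sequence.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the trim logic; not the real UTF8String class.
class TrimSketch {
  // Copies bytes[start..end], both ends inclusive, like the helper in the diff.
  static byte[] copyRange(byte[] bytes, int start, int end) {
    int len = end - start + 1;
    byte[] out = new byte[len];
    System.arraycopy(bytes, start, out, 0, len);
    return out;
  }

  // Assumption: trimming removes only leading and trailing 0x20 bytes.
  // The caller obtains `bytes` once; it is reused for scanning and copying.
  static byte[] trim(byte[] bytes) {
    int s = 0;
    int e = bytes.length - 1;
    while (s <= e && bytes[s] == 0x20) s++;
    while (e >= s && bytes[e] == 0x20) e--;
    if (s > e) {
      return new byte[0]; // empty or all-space input
    }
    return copyRange(bytes, s, e);
  }

  public static void main(String[] args) {
    byte[] in = "  hello ".getBytes(StandardCharsets.UTF_8);
    System.out.println(new String(trim(in), StandardCharsets.UTF_8)); // prints "hello"
  }
}
```

The empty-result branch matters because the inclusive `[start, end]` convention would otherwise compute a length of zero or less when the input contains nothing but spaces.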

