GitHub user wzhfy commented on a diff in the pull request:
https://github.com/apache/spark/pull/12646#discussion_r118476637
--- Diff: common/unsafe/src/main/java/org/apache/spark/unsafe/types/UTF8String.java ---
@@ -522,6 +533,37 @@ public UTF8String trimLeft() {
}
}
+ /**
+ * Removes the given trim string from the beginning of a string
+ * This method searches each character in the source string starting from the left end, removes the character if it
+ * is in the trim string, stops at the first character which is not in the trim string, returns the new string.
+ * @param trimString the trim character string
+ */
+ public UTF8String trimLeft(UTF8String trimString) {
+ int srchIdx = 0; // the searching byte position of the input string
+ int trimIdx = 0; // the first beginning byte position of a non-matching character
+
+ while (srchIdx < numBytes) {
+ UTF8String searchChar = copyUTF8String(srchIdx, srchIdx + numBytesForFirstByte(this.getByte(srchIdx)) - 1);
+ int searchCharBytes = searchChar.numBytes;
+ // try to find the matching for the searchChar in the trimString set
+ if (trimString.find(searchChar, 0) >= 0) {
+ trimIdx += searchCharBytes;
+ } else {
+ // no matching, exit the search
+ break;
+ }
+ srchIdx += searchCharBytes;
+ }
+
+ if (trimIdx >= numBytes) {
+ // empty string
+ return EMPTY_UTF8;
+ } else {
+ return copyUTF8String(trimIdx, numBytes -1);
--- End diff --
nit: `numBytes - 1`
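
For readers following the thread, the behaviour described in the Javadoc above can be sketched on a plain `java.lang.String`. This is only an illustration under stated assumptions: the class `TrimLeftSketch` and its `trimLeft` helper are hypothetical names, they work on Unicode code points rather than on UTF-8 bytes, and they are not the `UTF8String` code from the pull request.

```java
// Hypothetical illustration, not Spark's UTF8String implementation.
class TrimLeftSketch {
    // Strips leading code points of `source` that occur anywhere in `trimChars`,
    // stopping at the first code point that is not in the trim set.
    static String trimLeft(String source, String trimChars) {
        int idx = 0; // position (in UTF-16 units) of the first code point to keep
        while (idx < source.length()) {
            int cp = source.codePointAt(idx);
            if (trimChars.indexOf(cp) < 0) {
                break; // not in the trim set, stop searching
            }
            idx += Character.charCount(cp); // skip one or two UTF-16 units
        }
        // substring(idx) yields "" when every code point was trimmed
        return source.substring(idx);
    }

    public static void main(String[] args) {
        System.out.println(trimLeft("xxyhello", "xy")); // prints "hello"
    }
}
```

Note that `String.substring` takes an exclusive end index, whereas the `copyUTF8String(trimIdx, numBytes - 1)` call in the diff appears to take an inclusive one, which is why the `- 1` shows up there.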