[
https://issues.apache.org/jira/browse/LANG-1574?focusedWorklogId=453727&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-453727
]
ASF GitHub Bot logged work on LANG-1574:
----------------------------------------
Author: ASF GitHub Bot
Created on: 02/Jul/20 00:28
Start Date: 02/Jul/20 00:28
Worklog Time Spent: 10m
Work Description: XenoAmess edited a comment on pull request #562:
URL: https://github.com/apache/commons-lang/pull/562#issuecomment-652707356
> I sync'd this method's impl with the one in Commons Text; Text will
eventually reuse this method. WRT this patch, I do not see any explanation for
the use of the magic number 32.
@garydgregory
Hi.
1. I added a null-check for this function, so parameter null will not cause
trouble now.( by StringUtils.length() function.)
2. I added a length-check for not create new array when length == 0( by
StringUtils.length() function too.)
I see your latest commit, and seems you are agreed with thest two changes.
Maybe I should tell more about how I get the number 32.
You can see in the jmh test I added, there actually exists two functions.
```
public static char[] toCharArrayOld(final CharSequence cs) {
if (cs instanceof String) {
return ((String) cs).toCharArray();
}
final int sz = cs.length();
final char[] array = new char[cs.length()];
for (int i = 0; i < sz; i++) {
array[i] = cs.charAt(i);
}
return array;
}
/**
* Green implementation of toCharArray.
*
* @param cs the {@code CharSequence} to be processed
* @return the resulting char array
* @since 3.11
*/
public static char[] toCharArrayNew(final CharSequence cs) {
return cs.toString().toCharArray();
}
```
They actually do the same thing, the only difference is speed.
In theory, if the CharSequence is short enough, then toCharArrayOld should
be faster, as it does not cast a toString.
If CharSequence is long otherwise, then toCharArrayNew should be faster, as
usually when we implement a CharSequence, in toString we can have some more
efficient way than get all chars using index, one by one.
In other world, there exist a magic number m, when length <= m, we should
use toCharArrayOld. Otherwise we should use toCharArrayNew.
And we get the function in the main codes, who use toCharArrayOld when
length <= m, use toCharArrayNew when length > m
For why I chose the number 32:
In short, that is gotten from performance tests in jmh.
For CharBuffer and StringBuffer, it is always faster to use toCharArrayNew,
means you can see n as -1.
But for StringBuilder, it is around 32. At least 32 is better than 16, and
the n is between 16 and 32 according to my jmh tests.
I just think 32 is a elegant number to use here.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
Issue Time Tracking
-------------------
Worklog Id: (was: 453727)
Time Spent: 1h 20m (was: 1h 10m)
> refine CharSequenceUtils.toCharArray
> ------------------------------------
>
> Key: LANG-1574
> URL: https://issues.apache.org/jira/browse/LANG-1574
> Project: Commons Lang
> Issue Type: Sub-task
> Reporter: Jin Xu
> Priority: Minor
> Time Spent: 1h 20m
> Remaining Estimate: 0h
>
> [https://github.com/apache/commons-lang/pull/562]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)