[jira] [Work logged] (LANG-1574) refine CharSequenceUtils.toCharArray

ASF GitHub Bot (Jira) Wed, 01 Jul 2020 17:28:08 -0700


     [ 
https://issues.apache.org/jira/browse/LANG-1574?focusedWorklogId=453725&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-453725
 ]


ASF GitHub Bot logged work on LANG-1574:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 02/Jul/20 00:27
            Start Date: 02/Jul/20 00:27
    Worklog Time Spent: 10m 
      Work Description: XenoAmess edited a comment on pull request #562:
URL: https://github.com/apache/commons-lang/pull/562#issuecomment-652707356


   > I sync'd this method's impl with the one in Commons Text; Text will 
eventually reuse this method. WRT this patch, I do not see any explanation for 
the use of the magic number 32.
   
   @garydgregory 
   Hi.
   1. I added a null-check for this function, so parameter null will not cause 
trouble now.( by StringUtils.length() function.)
   2. I added a length-check for not create new array when length == 0( by 
StringUtils.length() function too.)
   
   I see your latest commit, and seems you are agreed with thest two changes.
   
   Maybe I should tell more about how I get the number 32.
   
   You can see in the jmh test I added, there actually exists two functions.
   ```
       public static char[] toCharArrayOld(final CharSequence cs) {
           if (cs instanceof String) {
               return ((String) cs).toCharArray();
           }
           final int sz = cs.length();
           final char[] array = new char[cs.length()];
           for (int i = 0; i < sz; i++) {
               array[i] = cs.charAt(i);
           }
           return array;
       }
   
       /**
        * Green implementation of toCharArray.
        *
        * @param cs the {@code CharSequence} to be processed
        * @return the resulting char array
        * @since 3.11
        */
       public static char[] toCharArrayNew(final CharSequence cs) {
           return cs.toString().toCharArray();
       }
   
   ```
   
   They actually do the same thing, the only difference is speed.
   In theory, if the CharSequence is short enough, then toCharArrayOld should 
be faster, as it does not cast a toString.
   If CharSequence is long otherwise, then toCharArrayNew should be faster, as 
usually when we implement a CharSequence, in toString we can have some more 
efficient way than get all chars using index, one by one.
   In other world, there exist a magic number m, when length < m, we should use 
toCharArrayOld. Otherwise we should use toCharArrayNew.
   And we get the function in the main codes, who use toCharArrayOld when 
length < m, use toCharArrayNew when length >= m
   
   For why I chose the number 32:
   In short, that is gotten from performance tests in jmh.
   For CharBuffer and StringBuffer, it is always faster to use toCharArrayNew, 
means you can see n as -1.
   But for StringBuilder, it is around 32. At least 32 is better than 16, and 
the n is between 16 and 32 according to my jmh tests.
   I just think 32 is a elegant number to use here.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


Issue Time Tracking
-------------------

    Worklog Id:     (was: 453725)
    Time Spent: 1h 10m  (was: 1h)

> refine CharSequenceUtils.toCharArray
> ------------------------------------
>
>                 Key: LANG-1574
>                 URL: https://issues.apache.org/jira/browse/LANG-1574
>             Project: Commons Lang
>          Issue Type: Sub-task
>            Reporter: Jin Xu
>            Priority: Minor
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> [https://github.com/apache/commons-lang/pull/562]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Work logged] (LANG-1574) refine CharSequenceUtils.toCharArray

Reply via email to