btashton opened a new issue #2425:
URL: https://github.com/apache/incubator-nuttx/issues/2425


   # Why is the Build Larger
   ## Intro
   There was some discussion of the growth in code size between 9.1.0 and 
10.0.0-RC0.  We have some tools to help us look into this, so I did a little 
digging.
   
   To be clear I am not making any statement here about if we should do 
something about it or not, but I wanted us to at least be aware of what caused 
these increases.
   
   ## Discovery
   For reference I am using the `stm32f4discovery:nsh` as the reference target 
because it seems to be a common target for people that has been around for a 
long time.
   
   So first off here is the top level size information without with the 
standard configuration out of the box build which includes optimisation.
   ```
   ❯ arm-none-eabi-size ../testdata/opt/nuttx-*
      text         data     bss     dec     hex filename
     69947          104    2104   72155   119db ../testdata/opt/nuttx-10.0.0-RC0
     69447          104    2476   72027   1195b ../testdata/opt/nuttx-9.1.0
   ```
   .text sees an increase of 500 bytes and .data for 372.  These are not huge, 
but are also not insignificant and but we cannot keep increasing by this much 
every release.
   
   There are a lot of changes between these two releases, but we can first 
analyse the debug information to get a better idea what is going on.  I added 
with kconfig-tweak the debug symbols and the  used Bloaty to generate these 
reports
   
   Here is the quick script I used to generate the builds I wanted to look at:
   ```bash
   set -e +x
   NUTTX_TAG=$1
   APPS=apps
   NUTTX=incubator-nuttx
   OUTDIR=testdata
   for REPO in $NUTTX $APPS
   do
     pushd $REPO
     git clean -xdf
     git checkout $NUTTX_TAG
     popd
   done
   pushd $NUTTX
   ./tools/configure.sh stm32f4discovery:nsh
   kconfig-tweak --enable DEBUG_SYMBOLS
   make oldconfig
   make -j8
   popd
   mkdir -p testdata
   cp $NUTTX/nuttx $OUTDIR/$NUTTX_TAG
   ```
   
   This is the comparison between 10.0.0-RC0 and 9.1.0.
   ```
   ❯ bloaty -n 20 -d sections,symbols ../testdata/nuttx-10.0.0-RC0 -- 
../testdata/nuttx-9.1.0
       FILE SIZE        VM SIZE    
    --------------  -------------- 
     +1.8% +16.2Ki  [ = ]       0    .debug_info
     +0.7% +1.64Ki  [ = ]       0    .debug_line
     +3.8% +1.13Ki  [ = ]       0    .debug_str
     +0.6%    +636  +0.6%    +636    .text
       +1.9%    +400  +1.9%    +400    [103 Others]
       [NEW]    +284  [NEW]    +284    mm_memalign
       [NEW]    +232  [NEW]    +232    nxtask_init
       [NEW]    +212  [NEW]    +212    nxsem_clockwait
        +83%    +158   +83%    +158    fclose
        +61%    +132   +61%    +132    nsh_session
       [NEW]     +96  [NEW]     +96    cxx_initialize
        +44%     +76   +44%     +76    nx_vioctl
        +95%     +72   +95%     +72    proc_findnode
        +19%     +64   +19%     +64    nx_vopen
       +238%     +62  +238%     +62    inode_search
       [NEW]     +62  [NEW]     +62    sem_clockwait
      -72.0%     -72 -72.0%     -72    wd_initialize
       [DEL]     -76  [DEL]     -76    up_cxxinitialize
      -46.2%     -96 -46.2%     -96    nxthread_create
      -66.7%    -120 -66.7%    -120    isspace
      -76.7%    -138 -76.7%    -138    nsh_system
       [DEL]    -148  [DEL]    -148    wd_create
       [DEL]    -152  [DEL]    -152    wd_delete
      -40.4%    -194 -40.4%    -194    opendir
      -87.9%    -218 -87.9%    -218    nxsem_timedwait
     +0.9%    +432  [ = ]       0    .symtab
       +1.9%    +384  [ = ]       0    [section .symtab]
       [NEW]     +48  [ = ]       0    _inode_getcwd
       [NEW]     +48  [ = ]       0    cxx_initialize
       [NEW]     +48  [ = ]       0    inode_root_reserve
       [NEW]     +48  [ = ]       0    memalign
       [NEW]     +48  [ = ]       0    nxsem_clockwait
       [NEW]     +48  [ = ]       0    nxtask_init
       [NEW]     +48  [ = ]       0    nxtask_uninit
       [NEW]     +32  [ = ]       0    _assert
       [NEW]     +32  [ = ]       0    inited.5681
       +200%     +32  [ = ]       0    inode_search
       [NEW]     +32  [ = ]       0    mm_memalign
       [DEL]     -32  [ = ]       0    g_wdpool
      -66.7%     -32  [ = ]       0    inode_insert
      -66.7%     -32  [ = ]       0    inode_unlink
      -40.0%     -32  [ = ]       0    nx_start
      -66.7%     -32  [ = ]       0    nx_stat
       [DEL]     -48  [ = ]       0    up_cxxinitialize
       [DEL]     -48  [ = ]       0    wd_create
       [DEL]     -48  [ = ]       0    wd_delete
       -9.5%    -112  [ = ]       0    [56 Others]
     +0.3%    +374  [ = ]       0    .debug_abbrev
     +1.9%    +224  [ = ]       0    .debug_aranges
     +0.5%    +188  [ = ]       0    .debug_frame
     +0.9%    +170  [ = ]       0    .strtab
       +1.3%     +80  [ = ]       0    [section .strtab]
       [NEW]     +32  [ = ]       0    nxsem_clockwait_uninterruptible
       [NEW]     +19  [ = ]       0    inode_root_reserve
       [NEW]     +18  [ = ]       0    __FUNCTION__.5716
       [NEW]     +18  [ = ]       0    __FUNCTION__.6102
       [NEW]     +18  [ = ]       0    __FUNCTION__.6110
       [NEW]     +18  [ = ]       0    __FUNCTION__.6118
       [NEW]     +18  [ = ]       0    __FUNCTION__.6128
       [NEW]     +16  [ = ]       0    nxsem_clockwait
       [NEW]     +15  [ = ]       0    cxx_initialize
       [NEW]     +15  [ = ]       0    nxtask_startup
       [NEW]     +14  [ = ]       0    _inode_getcwd
       [NEW]     +14  [ = ]       0    nxtask_uninit
       -7.2%      -5  [ = ]       0    [13 Others]
       [DEL]     -13  [ = ]       0    g_wdfreelist
       [DEL]     -17  [ = ]       0    up_cxxinitialize
       [DEL]     -18  [ = ]       0    __FUNCTION__.5606
       [DEL]     -18  [ = ]       0    __FUNCTION__.5991
       [DEL]     -18  [ = ]       0    __FUNCTION__.5999
       [DEL]     -18  [ = ]       0    __FUNCTION__.6007
       [DEL]     -18  [ = ]       0    __FUNCTION__.6019
     +0.4%     +36  +0.4%     +36    .ARM.extab
     +0.4%     +24  +0.4%     +24    .ARM.exidx
    -29.4%      -5  [ = ]       0    [Unmapped]
     -5.6%     -24  [ = ]       0    .debug_ranges
     [ = ]       0 -14.8%    -372    .bss
       [ = ]       0  +8.9%     +20    g_idletcb
       [ = ]       0  [NEW]      +4    inited.5681
       [ = ]       0 -13.3%      -2    [section .bss]
       [ = ]       0  [DEL]      -2    g_wdnfree
       [ = ]       0  [DEL]      -8    g_wdfreelist
       [ = ]       0  [DEL]    -384    g_wdpool
     +1.3% +20.9Ki  +0.3%    +324    TOTAL
   ```
   
   ## Analysis
   From this we can see some wins around the watchdog, although that is likely 
at the cost of some dynamic allocations.
   
   The new symbols are good places for us to look for the growth.  Some of 
these are places were logic has been re-factored an example may be where 
`cxx_initialized` is new and `up_cxxinitialize` is old.
   
   Other places we see large features being brought in such as `mm_memalign`.  
Lets take a look at this
   Here are some that jumped out to me:
    * `mm_memalign` (new)
    * `nxtask_init` (new)
    * `nxsem_clockwait` (new)
    * `fclose`
    * `nsh_session`
    * `proc_findnode`
   
   To dig into what these builds actually look like I am using Cutter 
https://cutter.re/ which is mostly just a frontend for Radare2.
   
   ### `mm_memalign` (+284)
   This is a new symbol included in the build, but a function that has existed 
for a long time.  We can look at the global call graph to see what calls it:
   
![image](https://user-images.githubusercontent.com/173245/100486239-f4414a00-30b7-11eb-9d2b-dd16b80b3fe0.png)
   It is only called by `up_create_stack`
   
   We can then look at the local call graph between the two releases for 
`up_create_stack`
   __10.0.0-RC0__
   
![image](https://user-images.githubusercontent.com/173245/100486310-323e6e00-30b8-11eb-8c4d-17af74d7a5fe.png)
   __9.1.0__
   
![image](https://user-images.githubusercontent.com/173245/100486329-44b8a780-30b8-11eb-9d52-6ba92cbd6112.png)
   
   We have made a change between the releases that is not calling memalign 
instead of malloc.
   Taking a look a the diff between the releases we see:
   ```diff
    #else /* CONFIG_TLS_ALIGNED */
   @@ -163,14 +172,16 @@ int up_create_stack(FAR struct tcb_s *tcb, size_t 
stack_size, uint8_t ttype)
    
          if (ttype == TCB_FLAG_TTYPE_KERNEL)
            {
   -          tcb->stack_alloc_ptr = (uint32_t *)kmm_malloc(stack_size);
   +          tcb->stack_alloc_ptr =
   +              (uint32_t *)kmm_memalign(CONFIG_STACK_ALIGNMENT, alloc_size);
            }
          else
    #endif
            {
              /* Use the user-space allocator if this is a task or pthread */
    
   -          tcb->stack_alloc_ptr = (uint32_t *)kumm_malloc(stack_size);
   +          tcb->stack_alloc_ptr =
   +              (uint32_t *)kumm_memalign(CONFIG_STACK_ALIGNMENT, alloc_size);
            }
    #endif /* CONFIG_TLS_ALIGNED */
    
   ```
   This change came in as part of this PR 
https://github.com/apache/incubator-nuttx/pull/1562
   
   ## `proc_findnode` (+72)
   This increase seems to have come in as part of this 
https://github.com/apache/incubator-nuttx/pull/1490
   This is a minor code change to fix a bug, but the generated code is 
surprisingly large. 
   
![image](https://user-images.githubusercontent.com/173245/100487271-8c413280-30bc-11eb-9f2d-5aedc9e497e6.png)
   
   ## `fclose` (+158)
   This largely comes from code complexity change that came in as 
https://github.com/apache/incubator-nuttx/pull/1611
   Once again the change looks small, but ends up creating a fair bit of code.
   
![image](https://user-images.githubusercontent.com/173245/100487668-b85db300-30be-11eb-9289-24dc01f8534a.png)
   
   
   ## `nsh_session` (+132)
   Used the same technique as before looking at the call graph.  We have a 
quite a bit of complexity increase here that came in with
   https://github.com/apache/incubator-nuttx-apps/pull/367 which adds the 
ability to pass in arguments to `nsh`.  I suspect this functionality might have 
been best held behind a config flag.  The file stream change added some code 
size here, but very little.
   
   ## `nxtask_init`
   (TODO)
   ## `nxsem_clockwait`
   (TODO)
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to