Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Joe Perches
On Fri, 2017-09-22 at 23:39 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann  wrote:
> > On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
> > > On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> > > > On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
> > > >textdata bss dec hex filename
> > > >   18220 176   0   1839647dc 
> > > > build/tmp/lib/lz4/lz4_decompress-after.o
> > > >   22297   0   0   222975719 
> > > > build/tmp/lib/lz4/lz4_decompress-before.o
> > > 
> > > Perhaps not so much a gcc bug as an opportunity
> > > for gcc to add an additional optimization.
> > > 
> > > gcc would have to verify that the const array is
> > > not initialized with some variable or argument like:
> > > 
> > > int foo(int a)
> > > {
> > > const int array[] = {1, a};
> > > ...
> > > }
> > 
> > It depends. With a 10KB different in .text size, my guess is that this
> > is a case where gcc does the right optimization in principle, but
> > fails to do what was intended in some corner cases.
> 
> I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
> produces lots of expensive checks here with gcc-5 or higher.
> 
> Disabling it makes a big difference:
> 
> upstream:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
> gcc-7.0.0: 16535 bytes
> 
> patched:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
> gcc-7.0.0: 14490 bytes

I think you are looking at a different issue.

There seems still a difference in size between
current and Colin's patch in compiled object size.



>   Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Joe Perches
On Fri, 2017-09-22 at 21:17 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
> > On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> > > On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
> > >textdata bss dec hex filename
> > >   18220 176   0   1839647dc 
> > > build/tmp/lib/lz4/lz4_decompress-after.o
> > >   22297   0   0   222975719 
> > > build/tmp/lib/lz4/lz4_decompress-before.o
> > 
> > Perhaps not so much a gcc bug as an opportunity
> > for gcc to add an additional optimization.
> > 
> > gcc would have to verify that the const array is
> > not initialized with some variable or argument like:
> > 
> > int foo(int a)
> > {
> > const int array[] = {1, a};
> > ...
> > }
> 
> It depends. With a 10KB different in .text size, my guess is that this
> is a case where gcc does the right optimization in principle, but
> fails to do what was intended in some corner cases.

Maybe/maybe not.  
> I just cross-checked by building with clang, there the patch has
> no impact on code size, it is 24929 bytes with or without the patch.
> 
> Looking at other versions of (x86) gcc, I see .text sizes of
> 
>  afterbefore
> gcc-3.4.6 10855 12977
> gcc-4.0.4 11088 11088
> gcc-4.1.3 10973 10973
> gcc-4.2.5 11183 11183
> gcc-4.3.6 15501 17724

Interesting this was apparently deoptimized at version 4.3.

Glancing at the release notes doesn't seem to indicate
anything obvious.

https://gcc.gnu.org/gcc-4.3/changes.html

> gcc-4.4.7 13337 15693
> gcc-4.5.4 13162 15491
> gcc-4.6.4 14846 17302
> gcc-4.7.4 14187 16294
> gcc-4.8.5 16591 18730
> gcc-4.9.4 19582 21995
> gcc-5.4.1 18294 22510
> gcc-6.1.1 20487 25172
> gcc-6.3.1 20487 25172
> gcc-7.0.0 20351 31789
> gcc-7.0.1 20351 24966
> gcc-7.1.1 20383 24982
> gcc-8.0.0 20686 25065
> 
> It seems whatever happened in early versions of gcc-7 has since
> improved, and it probably was a bug since older and newer versions
> create similar code size (I have not looked at the actual object code).
> 
> The 5K difference in gcc-5 and higher still seems like a lot. It would
> also be interesting to look at the decompression performance of
> this code witth the different compilers to see if it got better or worse.

yup

> Most likely, gcc got better at inlining and unrolling parts of the
> algorithm, but sometimes an object file that doubles or triples in
> size is an indication that the compiler did something really bad.

yup[2]

cheers, Joe


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Joe Perches
On Fri, 2017-09-22 at 23:39 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann  wrote:
> > On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
> > > On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> > > > On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
> > > >textdata bss dec hex filename
> > > >   18220 176   0   1839647dc 
> > > > build/tmp/lib/lz4/lz4_decompress-after.o
> > > >   22297   0   0   222975719 
> > > > build/tmp/lib/lz4/lz4_decompress-before.o
> > > 
> > > Perhaps not so much a gcc bug as an opportunity
> > > for gcc to add an additional optimization.
> > > 
> > > gcc would have to verify that the const array is
> > > not initialized with some variable or argument like:
> > > 
> > > int foo(int a)
> > > {
> > > const int array[] = {1, a};
> > > ...
> > > }
> > 
> > It depends. With a 10KB different in .text size, my guess is that this
> > is a case where gcc does the right optimization in principle, but
> > fails to do what was intended in some corner cases.
> 
> I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
> produces lots of expensive checks here with gcc-5 or higher.
> 
> Disabling it makes a big difference:
> 
> upstream:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
> gcc-7.0.0: 16535 bytes
> 
> patched:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
> gcc-7.0.0: 14490 bytes

I think you are looking at a different issue.

There seems still a difference in size between
current and Colin's patch in compiled object size.



>   Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Joe Perches
On Fri, 2017-09-22 at 21:17 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
> > On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> > > On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
> > >textdata bss dec hex filename
> > >   18220 176   0   1839647dc 
> > > build/tmp/lib/lz4/lz4_decompress-after.o
> > >   22297   0   0   222975719 
> > > build/tmp/lib/lz4/lz4_decompress-before.o
> > 
> > Perhaps not so much a gcc bug as an opportunity
> > for gcc to add an additional optimization.
> > 
> > gcc would have to verify that the const array is
> > not initialized with some variable or argument like:
> > 
> > int foo(int a)
> > {
> > const int array[] = {1, a};
> > ...
> > }
> 
> It depends. With a 10KB different in .text size, my guess is that this
> is a case where gcc does the right optimization in principle, but
> fails to do what was intended in some corner cases.

Maybe/maybe not.  
> I just cross-checked by building with clang, there the patch has
> no impact on code size, it is 24929 bytes with or without the patch.
> 
> Looking at other versions of (x86) gcc, I see .text sizes of
> 
>  afterbefore
> gcc-3.4.6 10855 12977
> gcc-4.0.4 11088 11088
> gcc-4.1.3 10973 10973
> gcc-4.2.5 11183 11183
> gcc-4.3.6 15501 17724

Interesting this was apparently deoptimized at version 4.3.

Glancing at the release notes doesn't seem to indicate
anything obvious.

https://gcc.gnu.org/gcc-4.3/changes.html

> gcc-4.4.7 13337 15693
> gcc-4.5.4 13162 15491
> gcc-4.6.4 14846 17302
> gcc-4.7.4 14187 16294
> gcc-4.8.5 16591 18730
> gcc-4.9.4 19582 21995
> gcc-5.4.1 18294 22510
> gcc-6.1.1 20487 25172
> gcc-6.3.1 20487 25172
> gcc-7.0.0 20351 31789
> gcc-7.0.1 20351 24966
> gcc-7.1.1 20383 24982
> gcc-8.0.0 20686 25065
> 
> It seems whatever happened in early versions of gcc-7 has since
> improved, and it probably was a bug since older and newer versions
> create similar code size (I have not looked at the actual object code).
> 
> The 5K difference in gcc-5 and higher still seems like a lot. It would
> also be interesting to look at the decompression performance of
> this code witth the different compilers to see if it got better or worse.

yup

> Most likely, gcc got better at inlining and unrolling parts of the
> algorithm, but sometimes an object file that doubles or triples in
> size is an indication that the compiler did something really bad.

yup[2]

cheers, Joe


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Colin Ian King
On 22/09/17 22:39, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann  wrote:
>> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
>>> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
 On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>>
textdata bss dec hex filename
   18220 176   0   1839647dc 
 build/tmp/lib/lz4/lz4_decompress-after.o
   22297   0   0   222975719 
 build/tmp/lib/lz4/lz4_decompress-before.o
>>>
>>> Perhaps not so much a gcc bug as an opportunity
>>> for gcc to add an additional optimization.
>>>
>>> gcc would have to verify that the const array is
>>> not initialized with some variable or argument like:
>>>
>>> int foo(int a)
>>> {
>>> const int array[] = {1, a};
>>> ...
>>> }
>>
>> It depends. With a 10KB different in .text size, my guess is that this
>> is a case where gcc does the right optimization in principle, but
>> fails to do what was intended in some corner cases.
> 
> I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
> produces lots of expensive checks here with gcc-5 or higher.
> 
> Disabling it makes a big difference:
> 
> upstream:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
> gcc-7.0.0: 16535 bytes
> 
> patched:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
> gcc-7.0.0: 14490 bytes
> 
>   Arnd
> 

Nice catch!


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Colin Ian King
On 22/09/17 22:39, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann  wrote:
>> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
>>> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
 On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>>
textdata bss dec hex filename
   18220 176   0   1839647dc 
 build/tmp/lib/lz4/lz4_decompress-after.o
   22297   0   0   222975719 
 build/tmp/lib/lz4/lz4_decompress-before.o
>>>
>>> Perhaps not so much a gcc bug as an opportunity
>>> for gcc to add an additional optimization.
>>>
>>> gcc would have to verify that the const array is
>>> not initialized with some variable or argument like:
>>>
>>> int foo(int a)
>>> {
>>> const int array[] = {1, a};
>>> ...
>>> }
>>
>> It depends. With a 10KB different in .text size, my guess is that this
>> is a case where gcc does the right optimization in principle, but
>> fails to do what was intended in some corner cases.
> 
> I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
> produces lots of expensive checks here with gcc-5 or higher.
> 
> Disabling it makes a big difference:
> 
> upstream:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
> gcc-7.0.0: 16535 bytes
> 
> patched:
> gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
> gcc-7.0.0: 14490 bytes
> 
>   Arnd
> 

Nice catch!


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Arnd Bergmann
On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann  wrote:
> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
>> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
>>> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>
>>>textdata bss dec hex filename
>>>   18220 176   0   1839647dc 
>>> build/tmp/lib/lz4/lz4_decompress-after.o
>>>   22297   0   0   222975719 
>>> build/tmp/lib/lz4/lz4_decompress-before.o
>>
>> Perhaps not so much a gcc bug as an opportunity
>> for gcc to add an additional optimization.
>>
>> gcc would have to verify that the const array is
>> not initialized with some variable or argument like:
>>
>> int foo(int a)
>> {
>> const int array[] = {1, a};
>> ...
>> }
>
> It depends. With a 10KB different in .text size, my guess is that this
> is a case where gcc does the right optimization in principle, but
> fails to do what was intended in some corner cases.

I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
produces lots of expensive checks here with gcc-5 or higher.

Disabling it makes a big difference:

upstream:
gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
gcc-7.0.0: 16535 bytes

patched:
gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
gcc-7.0.0: 14490 bytes

  Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Arnd Bergmann
On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann  wrote:
> On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
>> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
>>> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>
>>>textdata bss dec hex filename
>>>   18220 176   0   1839647dc 
>>> build/tmp/lib/lz4/lz4_decompress-after.o
>>>   22297   0   0   222975719 
>>> build/tmp/lib/lz4/lz4_decompress-before.o
>>
>> Perhaps not so much a gcc bug as an opportunity
>> for gcc to add an additional optimization.
>>
>> gcc would have to verify that the const array is
>> not initialized with some variable or argument like:
>>
>> int foo(int a)
>> {
>> const int array[] = {1, a};
>> ...
>> }
>
> It depends. With a 10KB different in .text size, my guess is that this
> is a case where gcc does the right optimization in principle, but
> fails to do what was intended in some corner cases.

I found the problem: "gcc -fsanitze=kernel-address --param asan-stack=1"
produces lots of expensive checks here with gcc-5 or higher.

Disabling it makes a big difference:

upstream:
gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 31789 bytes
gcc-7.0.0: 16535 bytes

patched:
gcc-7.0.0 --fsanitze=kernel-address --param asan-stack=1: 20351 bytes
gcc-7.0.0: 14490 bytes

  Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Arnd Bergmann
On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann  wrote:

>  afterbefore
> gcc-3.4.6 10855 12977
> gcc-4.0.4 11088 11088
> gcc-4.1.3 10973 10973
> gcc-4.2.5 11183 11183
> gcc-4.3.6 15501 17724
> gcc-4.4.7 13337 15693
> gcc-4.5.4 13162 15491
> gcc-4.6.4 14846 17302
> gcc-4.7.4 14187 16294
> gcc-4.8.5 16591 18730
> gcc-4.9.4 19582 21995
> gcc-5.4.1 18294 22510
> gcc-6.1.1 20487 25172
> gcc-6.3.1 20487 25172
> gcc-7.0.0 20351 31789
> gcc-7.0.1 20351 24966
> gcc-7.1.1 20383 24982
> gcc-8.0.0 20686 25065

Here is the same table for 32-bit ARM, showing results that roughly
match my expectations of what we should see here

4.3.6 15318 16574
4.4.7 13364 14388
4.5.4 12856 13912
4.6.4 14596 15124
4.7.4 14942 15850
4.8.5 15454 16074
4.9.4 15994 16742
5.4.1 15930 16686
6.3.1 17302 18154
7.0.0 18042 18958
7.0.1 18278 19098
7.1.1 18302 19106
8.0.0 19710 20490

  Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Arnd Bergmann
On Fri, Sep 22, 2017 at 9:17 PM, Arnd Bergmann  wrote:

>  afterbefore
> gcc-3.4.6 10855 12977
> gcc-4.0.4 11088 11088
> gcc-4.1.3 10973 10973
> gcc-4.2.5 11183 11183
> gcc-4.3.6 15501 17724
> gcc-4.4.7 13337 15693
> gcc-4.5.4 13162 15491
> gcc-4.6.4 14846 17302
> gcc-4.7.4 14187 16294
> gcc-4.8.5 16591 18730
> gcc-4.9.4 19582 21995
> gcc-5.4.1 18294 22510
> gcc-6.1.1 20487 25172
> gcc-6.3.1 20487 25172
> gcc-7.0.0 20351 31789
> gcc-7.0.1 20351 24966
> gcc-7.1.1 20383 24982
> gcc-8.0.0 20686 25065

Here is the same table for 32-bit ARM, showing results that roughly
match my expectations of what we should see here

4.3.6 15318 16574
4.4.7 13364 14388
4.5.4 12856 13912
4.6.4 14596 15124
4.7.4 14942 15850
4.8.5 15454 16074
4.9.4 15994 16742
5.4.1 15930 16686
6.3.1 17302 18154
7.0.0 18042 18958
7.0.1 18278 19098
7.1.1 18302 19106
8.0.0 19710 20490

  Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Arnd Bergmann
On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
>> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King

>>textdata bss dec hex filename
>>   18220 176   0   1839647dc 
>> build/tmp/lib/lz4/lz4_decompress-after.o
>>   22297   0   0   222975719 
>> build/tmp/lib/lz4/lz4_decompress-before.o
>
> Perhaps not so much a gcc bug as an opportunity
> for gcc to add an additional optimization.
>
> gcc would have to verify that the const array is
> not initialized with some variable or argument like:
>
> int foo(int a)
> {
> const int array[] = {1, a};
> ...
> }

It depends. With a 10KB different in .text size, my guess is that this
is a case where gcc does the right optimization in principle, but
fails to do what was intended in some corner cases.

I just cross-checked by building with clang, there the patch has
no impact on code size, it is 24929 bytes with or without the patch.

Looking at other versions of (x86) gcc, I see .text sizes of

 afterbefore
gcc-3.4.6 10855 12977
gcc-4.0.4 11088 11088
gcc-4.1.3 10973 10973
gcc-4.2.5 11183 11183
gcc-4.3.6 15501 17724
gcc-4.4.7 13337 15693
gcc-4.5.4 13162 15491
gcc-4.6.4 14846 17302
gcc-4.7.4 14187 16294
gcc-4.8.5 16591 18730
gcc-4.9.4 19582 21995
gcc-5.4.1 18294 22510
gcc-6.1.1 20487 25172
gcc-6.3.1 20487 25172
gcc-7.0.0 20351 31789
gcc-7.0.1 20351 24966
gcc-7.1.1 20383 24982
gcc-8.0.0 20686 25065

It seems whatever happened in early versions of gcc-7 has since
improved, and it probably was a bug since older and newer versions
create similar code size (I have not looked at the actual object code).

The 5K difference in gcc-5 and higher still seems like a lot. It would
also be interesting to look at the decompression performance of
this code witth the different compilers to see if it got better or worse.

Most likely, gcc got better at inlining and unrolling parts of the
algorithm, but sometimes an object file that doubles or triples in
size is an indication that the compiler did something really bad.

   Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Arnd Bergmann
On Fri, Sep 22, 2017 at 7:21 PM, Joe Perches  wrote:
> On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
>> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King

>>textdata bss dec hex filename
>>   18220 176   0   1839647dc 
>> build/tmp/lib/lz4/lz4_decompress-after.o
>>   22297   0   0   222975719 
>> build/tmp/lib/lz4/lz4_decompress-before.o
>
> Perhaps not so much a gcc bug as an opportunity
> for gcc to add an additional optimization.
>
> gcc would have to verify that the const array is
> not initialized with some variable or argument like:
>
> int foo(int a)
> {
> const int array[] = {1, a};
> ...
> }

It depends. With a 10KB different in .text size, my guess is that this
is a case where gcc does the right optimization in principle, but
fails to do what was intended in some corner cases.

I just cross-checked by building with clang, there the patch has
no impact on code size, it is 24929 bytes with or without the patch.

Looking at other versions of (x86) gcc, I see .text sizes of

 afterbefore
gcc-3.4.6 10855 12977
gcc-4.0.4 11088 11088
gcc-4.1.3 10973 10973
gcc-4.2.5 11183 11183
gcc-4.3.6 15501 17724
gcc-4.4.7 13337 15693
gcc-4.5.4 13162 15491
gcc-4.6.4 14846 17302
gcc-4.7.4 14187 16294
gcc-4.8.5 16591 18730
gcc-4.9.4 19582 21995
gcc-5.4.1 18294 22510
gcc-6.1.1 20487 25172
gcc-6.3.1 20487 25172
gcc-7.0.0 20351 31789
gcc-7.0.1 20351 24966
gcc-7.1.1 20383 24982
gcc-8.0.0 20686 25065

It seems whatever happened in early versions of gcc-7 has since
improved, and it probably was a bug since older and newer versions
create similar code size (I have not looked at the actual object code).

The 5K difference in gcc-5 and higher still seems like a lot. It would
also be interesting to look at the decompression performance of
this code witth the different compilers to see if it got better or worse.

Most likely, gcc got better at inlining and unrolling parts of the
algorithm, but sometimes an object file that doubles or triples in
size is an indication that the compiler did something really bad.

   Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Joe Perches
On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>  wrote:
> > On 22/09/17 00:09, Christophe JAILLET wrote:
> > > Le 22/09/2017 à 00:19, Colin King a écrit :
> > > > From: Colin Ian King 
> > > > 
> > > > Don't populate the read-only arrays dec32table and dec64table on the
> > > > stack, instead make them both static const.  Makes the object code
> > > > smaller by over 10K bytes:
> > > 
> > > 10k? Wouaouh! This is way much more than what you usually win with such
> > > patches.
> > 
> > Yes, I had to triple check it because it was an unbelievable win.
> > 
> 
> I wonder whether this should be reported as a gcc bug. I tried reproducing
> it here with gcc-7.1.1 and gcc-8.0.0, but I only see a 4K difference:
> 
>textdata bss dec hex filename
>   18220 176   0   1839647dc 
> build/tmp/lib/lz4/lz4_decompress-after.o
>   22297   0   0   222975719 
> build/tmp/lib/lz4/lz4_decompress-before.o

Perhaps not so much a gcc bug as an opportunity
for gcc to add an additional optimization.

gcc would have to verify that the const array is
not initialized with some variable or argument like:

int foo(int a)
{
const int array[] = {1, a};
...
}



Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Joe Perches
On Fri, 2017-09-22 at 09:48 +0200, Arnd Bergmann wrote:
> On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
>  wrote:
> > On 22/09/17 00:09, Christophe JAILLET wrote:
> > > Le 22/09/2017 à 00:19, Colin King a écrit :
> > > > From: Colin Ian King 
> > > > 
> > > > Don't populate the read-only arrays dec32table and dec64table on the
> > > > stack, instead make them both static const.  Makes the object code
> > > > smaller by over 10K bytes:
> > > 
> > > 10k? Wouaouh! This is way much more than what you usually win with such
> > > patches.
> > 
> > Yes, I had to triple check it because it was an unbelievable win.
> > 
> 
> I wonder whether this should be reported as a gcc bug. I tried reproducing
> it here with gcc-7.1.1 and gcc-8.0.0, but I only see a 4K difference:
> 
>textdata bss dec hex filename
>   18220 176   0   1839647dc 
> build/tmp/lib/lz4/lz4_decompress-after.o
>   22297   0   0   222975719 
> build/tmp/lib/lz4/lz4_decompress-before.o

Perhaps not so much a gcc bug as an opportunity
for gcc to add an additional optimization.

gcc would have to verify that the const array is
not initialized with some variable or argument like:

int foo(int a)
{
const int array[] = {1, a};
...
}



Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Arnd Bergmann
On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
 wrote:
> On 22/09/17 00:09, Christophe JAILLET wrote:
>> Le 22/09/2017 à 00:19, Colin King a écrit :
>>> From: Colin Ian King 
>>>
>>> Don't populate the read-only arrays dec32table and dec64table on the
>>> stack, instead make them both static const.  Makes the object code
>>> smaller by over 10K bytes:
>> 10k? Wouaouh! This is way much more than what you usually win with such
>> patches.
>
> Yes, I had to triple check it because it was an unbelievable win.
>

I wonder whether this should be reported as a gcc bug. I tried reproducing
it here with gcc-7.1.1 and gcc-8.0.0, but I only see a 4K difference:

   textdata bss dec hex filename
  18220 176   0   1839647dc build/tmp/lib/lz4/lz4_decompress-after.o
  22297   0   0   222975719
build/tmp/lib/lz4/lz4_decompress-before.o

I tried x86 defconfig and allmodconfig on a slightly patched mainline kernel.

Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-22 Thread Arnd Bergmann
On Fri, Sep 22, 2017 at 1:11 AM, Colin Ian King
 wrote:
> On 22/09/17 00:09, Christophe JAILLET wrote:
>> Le 22/09/2017 à 00:19, Colin King a écrit :
>>> From: Colin Ian King 
>>>
>>> Don't populate the read-only arrays dec32table and dec64table on the
>>> stack, instead make them both static const.  Makes the object code
>>> smaller by over 10K bytes:
>> 10k? Wouaouh! This is way much more than what you usually win with such
>> patches.
>
> Yes, I had to triple check it because it was an unbelievable win.
>

I wonder whether this should be reported as a gcc bug. I tried reproducing
it here with gcc-7.1.1 and gcc-8.0.0, but I only see a 4K difference:

   textdata bss dec hex filename
  18220 176   0   1839647dc build/tmp/lib/lz4/lz4_decompress-after.o
  22297   0   0   222975719
build/tmp/lib/lz4/lz4_decompress-before.o

I tried x86 defconfig and allmodconfig on a slightly patched mainline kernel.

Arnd


Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-21 Thread Colin Ian King
On 22/09/17 00:09, Christophe JAILLET wrote:
> Le 22/09/2017 à 00:19, Colin King a écrit :
>> From: Colin Ian King 
>>
>> Don't populate the read-only arrays dec32table and dec64table on the
>> stack, instead make them both static const.  Makes the object code
>> smaller by over 10K bytes:
> 10k? Wouaouh! This is way much more than what you usually win with such
> patches.

Yes, I had to triple check it because it was an unbelievable win.

> I assume we must thanks FORCE_INLINE in this case.

Yep, I believe so.

> Nice catch!
> 
> CJ
> 
>> Before:
>> text   databssdechexfilename
>>31500  0  0  31500   7b0c   
>> lib/lz4/lz4_decompress.o
>>
>> After:
>> text   databssdechexfilename
>>20237176  0  20413   4fbd   
>> lib/lz4/lz4_decompress.o
>>
>> (gcc version 7.2.0 x86_64)
>>
>> Signed-off-by: Colin Ian King 
>> ---
>>   lib/lz4/lz4_decompress.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
>> index bd3574312b82..141734d255e4 100644
>> --- a/lib/lz4/lz4_decompress.c
>> +++ b/lib/lz4/lz4_decompress.c
>> @@ -85,8 +85,8 @@ static FORCE_INLINE int LZ4_decompress_generic(
>>   const BYTE * const lowLimit = lowPrefix - dictSize;
>> const BYTE * const dictEnd = (const BYTE *)dictStart + dictSize;
>> -const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
>> -const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
>> +static const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
>> +static const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
>> const int safeDecode = (endOnInput == endOnInputSize);
>>   const int checkOffset = ((safeDecode) && (dictSize < (int)(64 *
>> KB)));
> 
> 



Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-21 Thread Colin Ian King
On 22/09/17 00:09, Christophe JAILLET wrote:
> Le 22/09/2017 à 00:19, Colin King a écrit :
>> From: Colin Ian King 
>>
>> Don't populate the read-only arrays dec32table and dec64table on the
>> stack, instead make them both static const.  Makes the object code
>> smaller by over 10K bytes:
> 10k? Wouaouh! This is way much more than what you usually win with such
> patches.

Yes, I had to triple check it because it was an unbelievable win.

> I assume we must thanks FORCE_INLINE in this case.

Yep, I believe so.

> Nice catch!
> 
> CJ
> 
>> Before:
>> text   databssdechexfilename
>>31500  0  0  31500   7b0c   
>> lib/lz4/lz4_decompress.o
>>
>> After:
>> text   databssdechexfilename
>>20237176  0  20413   4fbd   
>> lib/lz4/lz4_decompress.o
>>
>> (gcc version 7.2.0 x86_64)
>>
>> Signed-off-by: Colin Ian King 
>> ---
>>   lib/lz4/lz4_decompress.c | 4 ++--
>>   1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
>> index bd3574312b82..141734d255e4 100644
>> --- a/lib/lz4/lz4_decompress.c
>> +++ b/lib/lz4/lz4_decompress.c
>> @@ -85,8 +85,8 @@ static FORCE_INLINE int LZ4_decompress_generic(
>>   const BYTE * const lowLimit = lowPrefix - dictSize;
>> const BYTE * const dictEnd = (const BYTE *)dictStart + dictSize;
>> -const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
>> -const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
>> +static const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
>> +static const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
>> const int safeDecode = (endOnInput == endOnInputSize);
>>   const int checkOffset = ((safeDecode) && (dictSize < (int)(64 *
>> KB)));
> 
> 



Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-21 Thread Christophe JAILLET

Le 22/09/2017 à 00:19, Colin King a écrit :

From: Colin Ian King 

Don't populate the read-only arrays dec32table and dec64table on the
stack, instead make them both static const.  Makes the object code
smaller by over 10K bytes:
10k? Wouaouh! This is way much more than what you usually win with such 
patches.

I assume we must thanks FORCE_INLINE in this case.
Nice catch!

CJ


Before:
text   data bss dec hex filename
   31500  0   0   315007b0c lib/lz4/lz4_decompress.o

After:
text   data bss dec hex filename
   20237176   0   204134fbd lib/lz4/lz4_decompress.o

(gcc version 7.2.0 x86_64)

Signed-off-by: Colin Ian King 
---
  lib/lz4/lz4_decompress.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
index bd3574312b82..141734d255e4 100644
--- a/lib/lz4/lz4_decompress.c
+++ b/lib/lz4/lz4_decompress.c
@@ -85,8 +85,8 @@ static FORCE_INLINE int LZ4_decompress_generic(
const BYTE * const lowLimit = lowPrefix - dictSize;
  
  	const BYTE * const dictEnd = (const BYTE *)dictStart + dictSize;

-   const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
-   const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
+   static const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
+   static const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
  
  	const int safeDecode = (endOnInput == endOnInputSize);

const int checkOffset = ((safeDecode) && (dictSize < (int)(64 * KB)));





Re: [PATCH] lib/lz4: make arrays static const, reduces object code size

2017-09-21 Thread Christophe JAILLET

Le 22/09/2017 à 00:19, Colin King a écrit :

From: Colin Ian King 

Don't populate the read-only arrays dec32table and dec64table on the
stack, instead make them both static const.  Makes the object code
smaller by over 10K bytes:
10k? Wouaouh! This is way much more than what you usually win with such 
patches.

I assume we must thanks FORCE_INLINE in this case.
Nice catch!

CJ


Before:
text   data bss dec hex filename
   31500  0   0   315007b0c lib/lz4/lz4_decompress.o

After:
text   data bss dec hex filename
   20237176   0   204134fbd lib/lz4/lz4_decompress.o

(gcc version 7.2.0 x86_64)

Signed-off-by: Colin Ian King 
---
  lib/lz4/lz4_decompress.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/lz4/lz4_decompress.c b/lib/lz4/lz4_decompress.c
index bd3574312b82..141734d255e4 100644
--- a/lib/lz4/lz4_decompress.c
+++ b/lib/lz4/lz4_decompress.c
@@ -85,8 +85,8 @@ static FORCE_INLINE int LZ4_decompress_generic(
const BYTE * const lowLimit = lowPrefix - dictSize;
  
  	const BYTE * const dictEnd = (const BYTE *)dictStart + dictSize;

-   const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
-   const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
+   static const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
+   static const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
  
  	const int safeDecode = (endOnInput == endOnInputSize);

const int checkOffset = ((safeDecode) && (dictSize < (int)(64 * KB)));