Re: Optimize memory allocation code

2020-10-03 Thread Tomas Vondra

On Fri, Sep 25, 2020 at 07:37:07PM -0500, Merlin Moncure wrote:
> On Fri, Sep 25, 2020 at 7:32 PM Li Japin  wrote:
> >
> > > On Sep 26, 2020, at 8:09 AM, Julien Rouhaud  wrote:
> > >
> > > Hi,
> > >
> > > On Sat, Sep 26, 2020 at 12:14 AM Li Japin  wrote:
> > >>
> > >> Hi, hackers!
> > >>
> > >> I find the palloc0() is similar to the palloc(), we can use palloc()
> > >> inside palloc0()
> > >> to allocate space, thereby I think we can reduce  duplication of code.
> > >
> > > The code is duplicated on purpose.  There's a comment at the beginning
> > > that mentions it:
> > >
> > >  /* duplicates MemoryContextAllocZero to avoid increased overhead */
> > >
> > > Same for MemoryContextAllocZero() itself.
> >
> > Thanks! How big is this overhead? Is there any way I can test it?
>
> Profiler.  For example, oprofile. In hot areas of the code (memory
> allocation is very hot), profiling is the first step.



Maybe a micro-benchmark would be better, e.g. a function with a loop
doing many palloc/palloc0 calls, or something similar.

FWIW I wonder what kind of overhead this is meant to avoid; the comment
unfortunately does not go into any details. I suppose it's to avoid extra
function calls, but maybe there's something else going on. And maybe the
overhead is much lower on modern CPUs (although this seems to come from
commit 8396447cdbd in 2013, so it's not that old).
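
To make the micro-benchmark idea concrete, here is a rough standalone
sketch. It is not PostgreSQL code: toy_alloc(), toy_alloc0_duplicated(),
toy_alloc0_nested() and time_loop() are hypothetical names, malloc() stands
in for the memory context machinery, and the noinline attribute is a
GCC/Clang extension used only to keep the extra call visible.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* GCC/Clang extension: keep every variant as a real function call so the
 * compiler cannot inline away the difference being measured. */
#define NOINLINE __attribute__((noinline))

typedef void *(*alloc_fn) (size_t size);

/* Hypothetical stand-in for palloc(); malloc() replaces the memory context. */
static NOINLINE void *
toy_alloc(size_t size)
{
    void *ptr = malloc(size);

    if (ptr == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    return ptr;
}

/* Variant A: repeats the allocation logic, like the duplicated palloc0(). */
static NOINLINE void *
toy_alloc0_duplicated(size_t size)
{
    void *ptr = malloc(size);

    if (ptr == NULL)
    {
        fprintf(stderr, "out of memory\n");
        exit(EXIT_FAILURE);
    }
    memset(ptr, 0, size);
    return ptr;
}

/* Variant B: the proposed shape, one extra call into toy_alloc(). */
static NOINLINE void *
toy_alloc0_nested(size_t size)
{
    void *ptr = toy_alloc(size);

    memset(ptr, 0, size);
    return ptr;
}

/* Run `iterations` allocate/free cycles and return the CPU seconds used. */
static double
time_loop(alloc_fn fn, size_t size, long iterations)
{
    volatile unsigned char sink = 0;    /* keeps the loads from being dropped */
    clock_t start;
    long i;

    assert(size > 0 && iterations > 0);
    start = clock();
    for (i = 0; i < iterations; i++)
    {
        unsigned char *ptr = fn(size);

        sink += ptr[0];
        free(ptr);
    }
    return (double) (clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_loop(toy_alloc0_duplicated, 64, 10000000) and the same for
toy_alloc0_nested, several runs each, gives two numbers to compare. The
compiler may treat malloc() followed by memset() differently in the two
variants, so this toy says little about palloc0() in a real backend; there
the equivalent would be a C function in an extension looping over
palloc()/palloc0().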


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Optimize memory allocation code

2020-09-29 Thread Li Japin


On Sep 29, 2020, at 9:30 PM, Alvaro Herrera <alvhe...@2ndquadrant.com> wrote:

> On 2020-Sep-26, Li Japin wrote:
>
> > Thanks! How big is this overhead? Is there any way I can test it?
>
> You could also have a look at the assembly code that your compiler
> generates -- particularly examine how it changes.

Thanks for your advice!

The original assembly code for palloc0 is:

00517690 <palloc0>:
  517690: 55                    push   %rbp
  517691: 53                    push   %rbx
  517692: 48 89 fb              mov    %rdi,%rbx
  517695: 48 83 ec 08           sub    $0x8,%rsp
  517699: 48 81 ff ff ff ff 3f  cmp    $0x3fffffff,%rdi
  5176a0: 48 8b 2d d9 0c 48 00  mov    0x480cd9(%rip),%rbp        # 998380
  5176a7: 0f 87 d5 00 00 00     ja     517782
  5176ad: 48 8b 45 10           mov    0x10(%rbp),%rax
  5176b1: 48 89 fe              mov    %rdi,%rsi
  5176b4: c6 45 04 00           movb   $0x0,0x4(%rbp)
  5176b8: 48 89 ef              mov    %rbp,%rdi
  5176bb: ff 10                 callq  *(%rax)
  5176bd: 48 85 c0              test   %rax,%rax
  5176c0: 48 89 c1              mov    %rax,%rcx
  5176c3: 74 5b                 je     517720
  5176c5: f6 c3 07              test   $0x7,%bl
  5176c8: 75 36                 jne    517700
  5176ca: 48 81 fb 00 04 00 00  cmp    $0x400,%rbx
  5176d1: 77 2d                 ja     517700
  5176d3: 48 01 c3              add    %rax,%rbx
  5176d6: 48 39 d8              cmp    %rbx,%rax
  5176d9: 73 35                 jae    517710
  5176db: 0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
  5176e0: 48 83 c0 08           add    $0x8,%rax
  5176e4: 48 c7 40 f8 00 00 00  movq   $0x0,-0x8(%rax)
  5176eb: 00
  5176ec: 48 39 c3              cmp    %rax,%rbx
  5176ef: 77 ef                 ja     5176e0
  5176f1: 48 83 c4 08           add    $0x8,%rsp
  5176f5: 48 89 c8              mov    %rcx,%rax
  5176f8: 5b                    pop    %rbx
  5176f9: 5d                    pop    %rbp
  5176fa: c3                    retq
  5176fb: 0f 1f 44 00 00        nopl   0x0(%rax,%rax,1)
  517700: 48 89 cf              mov    %rcx,%rdi
  517703: 48 89 da              mov    %rbx,%rdx
  517706: 31 f6                 xor    %esi,%esi
  517708: e8 e3 0e ba ff        callq  b85f0
  51770d: 48 89 c1              mov    %rax,%rcx
  517710: 48 83 c4 08           add    $0x8,%rsp
  517714: 48 89 c8              mov    %rcx,%rax
  517717: 5b                    pop    %rbx
  517718: 5d                    pop    %rbp
  517719: c3                    retq
  51771a: 66 0f 1f 44 00 00     nopw   0x0(%rax,%rax,1)
  517720: 48 8b 3d 51 0c 48 00  mov    0x480c51(%rip),%rdi        # 998378
  517727: be 64 00 00 00        mov    $0x64,%esi
  51772c: e8 1f f9 ff ff        callq  517050
  517731: 31 f6                 xor    %esi,%esi
  517733: bf 14 00 00 00        mov    $0x14,%edi
  517738: e8 53 6d fd ff        callq  4ee490
  51773d: bf c5 20 00 00        mov    $0x20c5,%edi
  517742: e8 99 9b fd ff        callq  4f12e0
  517747: 48 8d 3d 07 54 03 00  lea    0x35407(%rip),%rdi        # 54cb55 <__func__.7554+0x45>
  51774e: 31 c0                 xor    %eax,%eax
  517750: e8 ab 9d fd ff        callq  4f1500
  517755: 48 8b 55 38           mov    0x38(%rbp),%rdx
  517759: 48 8d 3d 80 11 16 00  lea    0x161180(%rip),%rdi        # 6788e0 <__func__.6248+0x150>
  517760: 48 89 de              mov    %rbx,%rsi
  517763: 31 c0                 xor    %eax,%eax
  517765: e8 56 a2 fd ff        callq  4f19c0
  51776a: 48 8d 15 ff 11 16 00  lea    0x1611ff(%rip),%rdx        # 678970 <__func__.7326>
  517771: 48 8d 3d 20 11 16 00  lea    0x161120(%rip),%rdi        # 678898 <__func__.6248+0x108>
  517778: be eb 03 00 00        mov    $0x3eb,%esi
  51777d: e8 0e 95 fd ff        callq  4f0c90
  517782: 31 f6                 xor    %esi,%esi
  517784: bf 14 00 00 00        mov    $0x14,%edi
  517789: e8 02 6d fd ff        callq  4ee490
  51778e: 48 8d 3d db 10 16 00  lea    0x1610db(%rip),%rdi        # 678870 <__func__.6248+0xe0>
  517795: 48 89 de              mov    %rbx,%rsi
  517798: 31 c0                 xor    %eax,%eax
  51779a: e8 91 98 fd ff        callq  4f1030
  51779f: 48 8d 15 ca 11 16 00  lea    0x1611ca(%rip),%rdx        # 678970 <__func__.7326>
  5177a6: 48 8d 3d eb 10 16 00  lea    0x1610eb(%rip),%rdi        # 678898 <__func__.6248+0x108>
  5177ad: be df 03 00 00        mov    $0x3df,%esi
  5177b2: e8 d9 94 fd ff        callq  4f0c90
  5177b7: 66 0f 1f 84 00 00 00  nopw   0x0(%rax,%rax,1)
  5177be: 00 00
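
For readers following the dump, the instructions from 5176c5 to 5176ef look
like an inlined zeroing fast path: when the size is a multiple of 8 and at
most 0x400 bytes, memory is cleared with 8-byte stores in a loop, otherwise
the code falls through to a library call at 517700. A rough C equivalent,
as I read it (zero_region() and ZERO_LOOP_LIMIT are hypothetical names, and
the pointer is assumed to be suitably aligned, as allocator results are):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The 0x400 that the dump compares the size against. */
#define ZERO_LOOP_LIMIT 1024

/* Hypothetical helper; `start` must be aligned for long-sized stores. */
static void
zero_region(void *start, size_t len)
{
    assert(start != NULL);

    if ((len & (sizeof(long) - 1)) == 0 && len <= ZERO_LOOP_LIMIT)
    {
        /* Small, word-multiple size: clear one word per iteration. */
        long *p = (long *) start;
        long *stop = (long *) ((char *) start + len);

        while (p < stop)
            *p++ = 0;
    }
    else
    {
        /* Odd or large size: leave it to memset(). */
        memset(start, 0, len);
    }
}
```

This is only meant to help map the assembly back to source; the real code
uses its own macro for this step.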

After the modification, the assembly code for palloc0 is:

00517690 <palloc0>:
  517690: 53                    push   %rbx
  517691: 48 89 fb              mov    %rdi,%rbx
  517694: e8 17 ff ff ff        callq  5175b0
  517699: f6 c3 07              test   $0x7,%bl
  51769c: 48 89 c1              mov    %rax,%rcx
  51769f: 75 2f                 jne    5176d0
  5176a1: 48 81 fb 00 04 00 00  cmp    $0x400,%rbx
  5176a8: 77 26                 ja     5176d0
  5176aa: 48 01 c3              add    %rax,%rbx
  5176ad: 48 39 d8              cmp    %rbx,%rax
  

Re: Optimize memory allocation code

2020-09-29 Thread Alvaro Herrera
On 2020-Sep-26, Li Japin wrote:

> Thanks! How big is this overhead? Is there any way I can test it?

You could also have a look at the assembly code that your compiler
generates -- particularly examine how it changes.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Optimize memory allocation code

2020-09-25 Thread Merlin Moncure
On Fri, Sep 25, 2020 at 7:32 PM Li Japin  wrote:
>
>
>
> > On Sep 26, 2020, at 8:09 AM, Julien Rouhaud  wrote:
> >
> > Hi,
> >
> > On Sat, Sep 26, 2020 at 12:14 AM Li Japin  wrote:
> >>
> >> Hi, hackers!
> >>
> >> I find the palloc0() is similar to the palloc(), we can use palloc() 
> >> inside palloc0()
> >> to allocate space, thereby I think we can reduce  duplication of code.
> >
> > The code is duplicated on purpose.  There's a comment at the beginning
> > that mentions it:
> >
> >  /* duplicates MemoryContextAllocZero to avoid increased overhead */
> >
> > Same for MemoryContextAllocZero() itself.
>
> Thanks! How big is this overhead? Is there any way I can test it?

Profiler.  For example, oprofile. In hot areas of the code (memory
allocation is very hot), profiling is the first step.

merlin




Re: Optimize memory allocation code

2020-09-25 Thread Li Japin


> On Sep 26, 2020, at 8:09 AM, Julien Rouhaud  wrote:
> 
> Hi,
> 
> On Sat, Sep 26, 2020 at 12:14 AM Li Japin  wrote:
>> 
>> Hi, hackers!
>> 
>> I find the palloc0() is similar to the palloc(), we can use palloc() inside 
>> palloc0()
>> to allocate space, thereby I think we can reduce  duplication of code.
> 
> The code is duplicated on purpose.  There's a comment at the beginning
> that mentions it:
> 
>  /* duplicates MemoryContextAllocZero to avoid increased overhead */
> 
> Same for MemoryContextAllocZero() itself.

Thanks! How big is this overhead? Is there any way I can test it?

Best regards!

--
Japin Li

Re: Optimize memory allocation code

2020-09-25 Thread Julien Rouhaud
Hi,

On Sat, Sep 26, 2020 at 12:14 AM Li Japin  wrote:
>
> Hi, hackers!
>
> I find the palloc0() is similar to the palloc(), we can use palloc() inside 
> palloc0()
> to allocate space, thereby I think we can reduce  duplication of code.

The code is duplicated on purpose.  There's a comment at the beginning
that mentions it:

  /* duplicates MemoryContextAllocZero to avoid increased overhead */

Same for MemoryContextAllocZero() itself.




Optimize memory allocation code

2020-09-25 Thread Li Japin
Hi, hackers!

I find that palloc0() is similar to palloc(). We can use palloc() inside
palloc0() to allocate space, and thereby, I think, reduce duplication of code.

Best regards!

--
Japin Li



0001-Optimize-memory-allocation-code.patch