Hi Thomas,


Any update on this?



Please let me know how I can proceed further.



Thanks & Best Regards,

-Mahi









---- On Fri, 16 Jun 2017 18:47:37 +0530 Mahi Gurram <teckym...@gmail.com> wrote ----




Hi Thomas,



Thanks for your response and suggestions to change the code.



I have now modified my code as per your suggestions: the dsa_area pointer is 
no longer in shared memory; it is a global variable. I also implemented all of 
your code suggestions, but unfortunately no luck. Still facing the same 
behaviour. Please refer to the attachment for the modified code.



I have some doubts about your response. Please clarify.



I didn't try your code but I see a few different problems here.  Every
backend is creating a new dsa area, and then storing the pointer to it
in shared memory instead of attaching from other backends using the
handle, and there are synchronisation problems.  That isn't going to
work.  Here's what I think you might want to try:

Actually, I'm not creating a dsa_area for every backend. I'm creating it only 
once (in BufferShmemHook).

* I put prints in my _PG_init and BufferShmemHook functions to confirm this.



As far as I know, _PG_init of a shared library/extension is called only once 
(during startup) by the postmaster process, and all the postgres backends are 
forked child processes of the postmaster.



Since the backends are the postmaster's child processes and are created after 
the shared memory (dsa_area) has been created and attached, each backend/child 
process receives the shared memory segment in its address space, and as a 
result no shared memory operations like dsa_attach should be required to 
access/use the DSA data.



Please correct me if I'm wrong.



3.  Whether you are the backend that created it or a backend that
attached to it, I think you'll need to store the dsa_area in a global
variable for your UDFs to access.  Note that the dsa_area object will
be different in each backend: there is no point in storing that
address itself in shared memory, as you have it, as you certainly
can't use it in any other backend. In other words, each backend that
attached has its own dsa_area object that it can use to access the
common dynamic shared memory area.

In the case of forked processes, the OS actually does share the pages 
initially, because fork implements copy-on-write semantics, which means that 
provided none of the processes modifies the pages, they both point to the same 
addresses and the same data.



Based on the above theory, assuming I have created the dsa_area object in the 
postmaster process (_PG_init) as a global variable, all the backends/forked 
processes should be able to access/share the same dsa_area object and its 
members.



Hence, theoretically, the code should work without any issues. But I'm not 
sure why it is not working as expected :(



I tried debugging by putting prints, and observed the following:

1. The dsa_area_control address differs between the postmaster process and the backends.

2. After restarting, they seem to be the same, and hence it works after that.



2017-06-16 18:08:50.798 IST [9195] LOG:  ++++ Inside Postmaster Process, after dsa_create() +++++
2017-06-16 18:08:50.798 IST [9195] LOG:  the address of dsa_area_control is 0x7f50ddaa6000
2017-06-16 18:08:50.798 IST [9195] LOG:  the dsa_area_handle is 1007561696
2017-06-16 18:11:01.904 IST [9224] LOG:  ++++ Inside UDF function in forked process +++++
2017-06-16 18:11:01.904 IST [9224] LOG:  the address of dsa_area_control is 0x1dac910
2017-06-16 18:11:01.904 IST [9224] LOG:  the dsa_area_handle is 0
2017-06-16 18:11:01.907 IST [9195] LOG:  server process (PID 9224) was terminated by signal 11: Segmentation fault
2017-06-16 18:11:01.907 IST [9195] DETAIL:  Failed process was running: select test_dsa_data_access(1);
2017-06-16 18:11:01.907 IST [9195] LOG:  terminating any other active server processes
2017-06-16 18:11:01.907 IST [9227] FATAL:  the database system is in recovery mode
2017-06-16 18:11:01.907 IST [9220] WARNING:  terminating connection because of crash of another server process
2017-06-16 18:11:01.907 IST [9220] DETAIL:  The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.
2017-06-16 18:11:01.907 IST [9220] HINT:  In a moment you should be able to reconnect to the database and repeat your command.
2017-06-16 18:11:01.907 IST [9195] LOG:  all server processes terminated; reinitialising
2017-06-16 18:08:50.798 IST [9195] LOG:  ++++ Inside Postmaster Process, after dsa_create() +++++
2017-06-16 18:11:01.937 IST [9195] LOG:  the address of dsa_area_control is 0x7f50ddaa6000
2017-06-16 18:11:01.937 IST [9195] LOG:  the dsa_area_handle is 1833840303
2017-06-16 18:11:01.904 IST [9224] LOG:  ++++ Inside UDF function in forked process +++++
2017-06-16 18:12:24.247 IST [9239] LOG:  the address of dsa_area_control is 0x7f50ddaa6000
2017-06-16 18:12:24.247 IST [9239] LOG:  the dsa_area_handle is 1833840303


I may be wrong in my understanding, and I might be missing something :(



Please help me sort this out. I really appreciate all your help :)



PS: On Mac, it works fine as expected. I'm facing this issue only on Linux 
systems. FYI, I'm working on Postgres 10.1 beta.



Thanks & Best Regards,

- Mahi








On Thu, Jun 15, 2017 at 5:00 PM, Thomas Munro 
<thomas.mu...@enterprisedb.com> wrote:

On Thu, Jun 15, 2017 at 6:32 PM, Mahi Gurram <teckym...@gmail.com> wrote:

 > Followed the same as per your suggestion. Refer the code snippet below:
 >
 >> void
 >> _PG_init(void){
 >>     RequestAddinShmemSpace(100000000);
 >>     PreviousShmemHook = shmem_startup_hook;
 >>     shmem_startup_hook = BufferShmemHook;
 >> }
 >>
 >> void BufferShmemHook(){
 >>     dsa_area *area;
 >>     dsa_pointer data_ptr;
 >>     char *mem;
 >>
 >>     area = dsa_create(my_tranche_id());
 >>     data_ptr = dsa_allocate(area, 42);
 >>     mem = (char *) dsa_get_address(area, data_ptr);
 >>     if (mem != NULL){
 >>         snprintf(mem, 42, "Hello world");
 >>     }
 >>
 >>     bool found;
 >>     shmemData = ShmemInitStruct("Mahi_Shared_Data",
 >>                                 sizeof(shared_data),
 >>                                 &found);
 >>     shmemData->shared_area = area;
 >>     shmemData->shared_area_handle = dsa_get_handle(area);
 >>     shmemData->shared_data_ptr = data_ptr;
 >>     shmemData->head = NULL;
 >> }

 >
 > Wrote one UDF function, which is called by one of the client connection and
 > that tries to use the same dsa. But unfortunately it is behaving strange.
 >
 > First call to my UDF function is throwing segmentation fault and postgres is
 > quitting and auto restarting. If i try calling the same UDF function again
 > in new connection (after postgres restart) it is working fine.
 >
 > Put some prints in postgres source code and found that dsa_allocate() is
 > trying to use area->control (dsa_area_control object) which is pointing to
 > wrong address but after restarting it is pointing to right address and hence
 > it is working fine after restart.
 >
 > I'm totally confused and stuck at this point. Please help me in solving
 > this.
 >
 > PS: It is working fine in Mac.. in only linux systems i'm facing this
 > behaviour.
 >
 > I have attached the zip of my extension code along with screenshot of the
 > pgclient and log file with debug prints for better understanding.
 > *logfile is edited for providing some comments for better understanding.
 >
 > Please help me in solving this.

 



Hi Mahi

 

I didn't try your code but I see a few different problems here.  Every
backend is creating a new dsa area, and then storing the pointer to it
in shared memory instead of attaching from other backends using the
handle, and there are synchronisation problems.  That isn't going to
work.  Here's what I think you might want to try:

1.  In BufferShmemHook, acquire and release AddinShmemInitLock while
initialising "Mahi_Shared_Data" (just like pgss_shmem_startup does),
because any number of backends could be starting up at the same time
and would step on each other's toes here.

2.  When ShmemInitStruct returns, check the value of 'found'.  If it's
false, then this backend is the very first one to attach to this bit
of (traditional) shmem.  So it should create the DSA area and store
the handle in the traditional shmem.  Because we hold
AddinShmemInitLock we know that no one else can be doing that at the
same time.  Before even trying to create the DSA area, you should
probably memset the whole thing to zero so that if you fail later, the
state isn't garbage.  If 'found' is true, then we know it's already
all set up (or zeroed out), so instead of creating the DSA area it
should attach to it using the published handle.

3.  Whether you are the backend that created it or a backend that
attached to it, I think you'll need to store the dsa_area in a global
variable for your UDFs to access.  Note that the dsa_area object will
be different in each backend: there is no point in storing that
address itself in shared memory, as you have it, as you certainly
can't use it in any other backend.  In other words, each backend that
attached has its own dsa_area object that it can use to access the
common dynamic shared memory area.

4.  After creating, in this case I think you should call
dsa_pin(area), so that it doesn't go away when there are no backends
attached (ie because there are no backends running) (if I understand
correctly that you want this DSA area to last as long as the whole
cluster).

By the way, in _PG_init() where you have
RequestAddinShmemSpace(100000000) I think you want
RequestAddinShmemSpace(sizeof(shared_data)).

The key point is: only one backend should use LWLockNewTrancheId() and
dsa_create(), and then make the handle available to others; all the
other backends should use dsa_attach().  Then they'll all be attached
to the same dynamic shared memory area and can share data.
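Putting steps 1-4 together, the hook might look roughly like this. This is an untested sketch, not a drop-in implementation: it reuses the shared_data/shmemData/BufferShmemHook names from the earlier snippet, and error handling is omitted.

```c
#include "postgres.h"
#include "fmgr.h"
#include "miscadmin.h"
#include "storage/ipc.h"
#include "storage/lwlock.h"
#include "storage/shmem.h"
#include "utils/dsa.h"

PG_MODULE_MAGIC;

typedef struct shared_data
{
	dsa_handle	shared_area_handle;	/* publish the handle, not a pointer */
} shared_data;

static shared_data *shmemData;
static dsa_area *area;			/* per-backend object, kept in a global */
static shmem_startup_hook_type PreviousShmemHook;

static void BufferShmemHook(void);

void
_PG_init(void)
{
	RequestAddinShmemSpace(sizeof(shared_data));	/* not 100000000 */
	PreviousShmemHook = shmem_startup_hook;
	shmem_startup_hook = BufferShmemHook;
}

static void
BufferShmemHook(void)
{
	bool		found;

	if (PreviousShmemHook)
		PreviousShmemHook();

	/* 1. serialise initialisation, as pgss_shmem_startup does */
	LWLockAcquire(AddinShmemInitLock, LW_EXCLUSIVE);
	shmemData = ShmemInitStruct("Mahi_Shared_Data",
								sizeof(shared_data), &found);
	if (!found)
	{
		/* 2. first to get here: zero the struct, create, publish handle */
		memset(shmemData, 0, sizeof(shared_data));
		area = dsa_create(LWLockNewTrancheId());
		dsa_pin(area);			/* 4. keep it alive with no backends attached */
		shmemData->shared_area_handle = dsa_get_handle(area);
	}
	else
	{
		/* everyone else attaches via the published handle */
		area = dsa_attach(shmemData->shared_area_handle);
	}
	LWLockRelease(AddinShmemInitLock);

	/* 3. 'area' stays in this backend-local global for the UDFs to use */
}
```

The UDFs would then use the global 'area' for dsa_allocate/dsa_get_address calls, never a dsa_area pointer fetched from shared memory.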

 



--

 Thomas Munro

 http://www.enterprisedb.com









-- 

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) 
