Re: Spawned Background Process Knows the Exit of Client Process?

2020-05-18 Thread Shichao Jin
Hi Ashutosh,

Thank you for your answer.

For the first point, as you suggested, we will migrate to table AM sooner
or later.

For the second point, your description is exactly correct (an independent
process to access the storage engine). We can have multiple threads to
overcome the performance issue.

The problem is that the storage engine is unaware of PG data types, so it
has to obtain PG's comparator function to compare two keys. Otherwise, the
storage engine falls back to "memcmp". To get the comparator function, we
have to make the independent process depend on a specific database so that
it can access the catalog (relcache). Unfortunately, the process can no
longer be independent once it has bound itself to a database by calling
BackgroundWorkerInitializeConnection. So our design evolved to spawn
multiple processes for accessing different tables created by the storage
engine. As a result, we have to release these spawned processes once the
backend process switches databases or terminates. Currently, we can set an
inactivity timer to release the resources. Is there a more elegant way to
achieve this?
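
To make the comparator issue concrete, here is a minimal standalone C
sketch. It is not PostgreSQL or storage engine code, and the function names
order_by_bytes and order_by_value are hypothetical. It only shows why a
byte-wise memcmp cannot stand in for a type-aware comparison: for signed
32-bit integers, -1 sorts after 1 when compared byte by byte, on either
endianness.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: order two keys the way a type-agnostic engine
 * would, by comparing their raw bytes with memcmp. */
int order_by_bytes(int32_t a, int32_t b)
{
    return memcmp(&a, &b, sizeof a);
}

/* Hypothetical helper: order two keys by their integer value, which is
 * the role a type-aware comparator plays. Returns <0, 0 or >0. */
int order_by_value(int32_t a, int32_t b)
{
    return (a > b) - (a < b);
}
```

With these definitions, order_by_bytes(-1, 1) is positive (every byte of -1
is 0xFF), while order_by_value(-1, 1) is negative, so an index ordered by
memcmp would disagree with the ordering the data type defines.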

Best,
Shichao


Re: Spawned Background Process Knows the Exit of Client Process?

2020-05-18 Thread Ashutosh Bapat
On Fri, May 15, 2020 at 11:53 PM Shichao Jin  wrote:
>
> Hi Postgres Hackers,
>
> I am wondering whether there is an elegant way for a self-spawned
> background process (forked by us) to get notified when the regular
> client-connected process leaves the current database (switches databases
> or even terminates)?
>
> For background, we are integrating a thread-model based storage engine
> into Postgres via a foreign data wrapper.

PostgreSQL now supports a pluggable storage API (table access methods).
Have you considered using that instead of an FDW?

> The engine does not allow multiple processes to access it, so we have to
> spawn a background process to access the engine, while the client process
> communicates with the spawned process via shared memory. For the engine to
> recognize Postgres data types, the spawned process has to access the
> catalog (e.g., the relcache), so it must connect to the target database
> via BackgroundWorkerInitializeConnection. Unfortunately, a background
> process cannot switch databases. It therefore has to be notified when the
> client process switches databases or terminates, so that we can close the
> spawned process accordingly. Please advise us if there are alternative
> approaches.

There can be multiple backends accessing different databases. But from
your description it looks like there is only one background process
that will access the storage engine, and it will be shared by multiple
backends which may be connected to different databases. If that's
correct, you will need to make that background process independent of
any database and have it just access storage. That looks less performant,
though. Maybe you can elaborate more on your use case.

-- 
Best Wishes,
Ashutosh Bapat




Spawned Background Process Knows the Exit of Client Process?

2020-05-15 Thread Shichao Jin
Hi Postgres Hackers,

I am wondering whether there is an elegant way for a self-spawned
background process (forked by us) to get notified when the regular
client-connected process leaves the current database (switches databases
or even terminates)?

For background, we are integrating a thread-model based storage engine
into Postgres via a foreign data wrapper. The engine does not allow
multiple processes to access it, so we have to spawn a background process
to access the engine, while the client process communicates with the
spawned process via shared memory. For the engine to recognize Postgres
data types, the spawned process has to access the catalog (e.g., the
relcache), so it must connect to the target database via
BackgroundWorkerInitializeConnection. Unfortunately, a background process
cannot switch databases. It therefore has to be notified when the client
process switches databases or terminates, so that we can close the spawned
process accordingly. Please advise us if there are alternative approaches.
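
To illustrate the kind of notification being asked about, here is a minimal
standalone POSIX C sketch of one possible pattern: the client registers an
exit hook that sets a flag in shared memory, and the other process observes
that flag. Everything in it is an assumption for illustration (the
SharedState struct, client_exit_hook, and demo_client_exit_notification are
hypothetical names, and plain atexit() plus an anonymous shared mapping
stand in for whatever the real design uses). PostgreSQL backends have
analogous exit callbacks (on_proc_exit, before_shmem_exit), but whether
they fit this FDW design is not verified here.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict -std modes */
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical shared state; in the design described above it would live
 * in the shared memory the client and the spawned process already use. */
typedef struct
{
    volatile int client_gone;
} SharedState;

static SharedState *shared;

/* Exit hook run in the client process: announce that we are leaving. */
static void client_exit_hook(void)
{
    shared->client_gone = 1;
}

/* Returns 1 if the "worker" saw the client's exit flag, 0 if it did not,
 * and -1 on a setup error. */
int demo_client_exit_notification(void)
{
    void *mem = mmap(NULL, sizeof(SharedState), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    shared = mem;
    shared->client_gone = 0;

    fflush(NULL); /* avoid duplicating buffered output in the child */
    pid_t pid = fork();
    if (pid < 0)
    {
        munmap(mem, sizeof(SharedState));
        return -1;
    }
    if (pid == 0)
    {
        /* "Client" process: register the hook, then exit normally. */
        if (atexit(client_exit_hook) != 0)
            _exit(1);
        exit(0); /* runs client_exit_hook */
    }

    /* "Worker" process. waitpid() is used only to synchronize this demo;
     * a real worker is not the client's parent and would instead watch
     * the flag or be woken up by the client. */
    int status;
    if (waitpid(pid, &status, 0) < 0)
    {
        munmap(mem, sizeof(SharedState));
        return -1;
    }
    int seen = shared->client_gone;
    munmap(mem, sizeof(SharedState));
    return seen;
}
```

A hook like this only runs on a normal exit; a client killed with SIGKILL
never sets the flag, so an inactivity timer is still useful as a fallback.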

Best,
Shichao