Re: [HACKERS] shared memory message queues

2014-01-15 Thread Robert Haas
On Tue, Jan 14, 2014 at 9:28 PM, Peter Eisentraut pete...@gmx.net wrote:
 Something is causing this new compiler warning:

 setup.c: In function ‘setup_dynamic_shared_memory’:
 setup.c:102:3: error: format ‘%llu’ expects argument of type ‘long long 
 unsigned int’, but argument 2 has type ‘Size’ [-Werror=format=]

Oops.  I've pushed a fix (at least, I hope it's a fix).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2014-01-15 Thread Robert Haas
On Tue, Jan 14, 2014 at 9:53 PM, Kevin Grittner kgri...@ymail.com wrote:
 I'm not seeing that one but I am now getting these:

 setup.c: In function ‘test_shm_mq_setup’:
 setup.c:65:25: warning: ‘outq’ may be used uninitialized in this function 
 [-Wmaybe-uninitialized]
 setup.c:66:24: warning: ‘inq’ may be used uninitialized in this function 
 [-Wmaybe-uninitialized]

That warning is bogus and I don't get it, but I'll add an
initialization to placate the compiler, which is being a bit too smart
for its own good.
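(Presumably something like this in test_shm_mq_setup(), a sketch of the
kind of initialization meant; the NULLs exist only to silence
-Wmaybe-uninitialized:)

	shm_mq	   *outq = NULL;	/* placate compiler */
	shm_mq	   *inq = NULL;		/* placate compiler */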

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2014-01-15 Thread Andres Freund
On 2014-01-15 10:19:32 -0500, Robert Haas wrote:
 On Tue, Jan 14, 2014 at 9:28 PM, Peter Eisentraut pete...@gmx.net wrote:
  Something is causing this new compiler warning:
 
  setup.c: In function ‘setup_dynamic_shared_memory’:
  setup.c:102:3: error: format ‘%llu’ expects argument of type ‘long long 
  unsigned int’, but argument 2 has type ‘Size’ [-Werror=format=]
 
 Oops.  I've pushed a fix (at least, I hope it's a fix).

I don't think that works on LLP64 platforms like Windows - long is 32
bits there.
That's basically why I proposed the z modifier to elog to encapsulate
this in some other thread...

Till then you'll have to cast to a known size.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] shared memory message queues

2014-01-15 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jan 14, 2014 at 9:28 PM, Peter Eisentraut pete...@gmx.net wrote:
 Something is causing this new compiler warning:
 
 setup.c: In function ‘setup_dynamic_shared_memory’:
 setup.c:102:3: error: format ‘%llu’ expects argument of type ‘long long 
 unsigned int’, but argument 2 has type ‘Size’ [-Werror=format=]

 Oops.  I've pushed a fix (at least, I hope it's a fix).

Unless I'm misreading what you did, that will just move the problem to
a different set of platforms.  size_t is not necessarily long (one
counterexample is Win64, IIRC).

In the past we've usually resorted to explicitly casting.  You could
silence the warning correctly with either of
   ereport(... "%lu" ..., (unsigned long) shm_mq_minimum_size);
   ereport(... " UINT64_FORMAT " ..., (uint64) shm_mq_minimum_size);

However, the problem with the second one (which also existed in your
original code) is that it breaks translatability of the string, because
any given translator is only going to see the version that gets compiled
on their hardware.  Unless there's a significant likelihood of
shm_mq_minimum_size exceeding 4GB, I'd go with the first one.

If you really want to be able to print values > 4GB in translatable
messages, what you need to do is to print into a local string variable
with UINT64_FORMAT, then use %s in the translatable string.  There are
examples of this in commands/sequence.c among other places.
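(A sketch of that pattern, modeled on the sequence.c usage; the message
text and error code here are illustrative:)

	char		buf[32];

	/* render the 64-bit value separately, then use %s in the message */
	snprintf(buf, sizeof(buf), UINT64_FORMAT, (uint64) shm_mq_minimum_size);
	ereport(ERROR,
			(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
			 errmsg("queue size must be at least %s bytes", buf)));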

regards, tom lane




Re: [HACKERS] shared memory message queues

2014-01-15 Thread Robert Haas
On Wed, Jan 15, 2014 at 12:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jan 14, 2014 at 9:28 PM, Peter Eisentraut pete...@gmx.net wrote:
 Something is causing this new compiler warning:

 setup.c: In function ‘setup_dynamic_shared_memory’:
 setup.c:102:3: error: format ‘%llu’ expects argument of type ‘long long 
 unsigned int’, but argument 2 has type ‘Size’ [-Werror=format=]

 Oops.  I've pushed a fix (at least, I hope it's a fix).

 Unless I'm misreading what you did, that will just move the problem to
 a different set of platforms.  size_t is not necessarily long (one
 counterexample is Win64, IIRC).

 In the past we've usually resorted to explicitly casting.  You could
 silence the warning correctly with either of
    ereport(... "%lu" ..., (unsigned long) shm_mq_minimum_size);
    ereport(... " UINT64_FORMAT " ..., (uint64) shm_mq_minimum_size);

 However, the problem with the second one (which also existed in your
 original code) is that it breaks translatability of the string, because
 any given translator is only going to see the version that gets compiled
 on their hardware.  Unless there's a significant likelihood of
 shm_mq_minimum_size exceeding 4GB, I'd go with the first one.

I think the current value is about 56, and while the header might
expand in the future, 4294967296 would be a heck of a lot of header
data.  So I've changed it as you suggest.  I think.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2014-01-14 Thread Robert Haas
On Mon, Dec 23, 2013 at 12:46 PM, Robert Haas robertmh...@gmail.com wrote:
 Oh, dear.  That's rather embarrassing.

 Incremental (incremental-shm-mq.patch) and full (shm-mq-v3.patch)
 patches attached.

OK, I have pushed the patches in this stack.  I'm not sure we quite
concluded the review back-and-forth but nobody really seems to have
had serious objections to this line of attack, other than wanting some
more comments which I have added.  I don't doubt that there will be
more things to tweak and tune here, and a whole lot more stuff that
needs to be built using this infrastructure, but I don't think the
code that's here is going to get better for remaining outside the tree
longer.

I decided not to change the dsm_toc patch to use functions instead of
macros as Andres suggested; the struct definition would still have
needed to be public, so any change would still need a recompile, at
least if the size of the struct changed.  It could be changed to work
with a palloc'd chunk, but then you need to worry about
context-lifespan memory leaks and it so didn't seem worth it.  I can't
imagine this having a lot of churn, anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2014-01-14 Thread Thom Brown
On 14 January 2014 17:29, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Dec 23, 2013 at 12:46 PM, Robert Haas robertmh...@gmail.com wrote:
 Oh, dear.  That's rather embarrassing.

 Incremental (incremental-shm-mq.patch) and full (shm-mq-v3.patch)
 patches attached.

 OK, I have pushed the patches in this stack.  I'm not sure we quite
 concluded the review back-and-forth but nobody really seems to have
 had serious objections to this line of attack, other than wanting some
 more comments which I have added.  I don't doubt that there will be
 more things to tweak and tune here, and a whole lot more stuff that
 needs to be built using this infrastructure, but I don't think the
 code that's here is going to get better for remaining outside the tree
 longer.

 I decided not to change the dsm_toc patch to use functions instead of
 macros as Andres suggested; the struct definition would still have
 needed to be public, so any change would still need a recompile, at
 least if the size of the struct changed.  It could be changed to work
 with a palloc'd chunk, but then you need to worry about
 context-lifespan memory leaks and it so didn't seem worth it.  I can't
 imagine this having a lot of churn, anyway.

postgres=# SELECT test_shm_mq(32768, (select
string_agg(chr(32+(random()*96)::int), '') from generate_series(1,3)),
1, 10);
ERROR:  could not register background process
HINT:  You may need to increase max_worker_processes.
STATEMENT:  SELECT test_shm_mq(32768, (select
string_agg(chr(32+(random()*96)::int), '') from generate_series(1,3)),
1, 10);
LOG:  registering background worker test_shm_mq
LOG:  starting background worker process test_shm_mq
ERROR:  could not register background process
HINT:  You may need to increase max_worker_processes.
ERROR:  unable to map dynamic shared memory segment
LOG:  worker process: test_shm_mq (PID 21939) exited with exit code 1
LOG:  unregistering background worker test_shm_mq


What's going on here?  This occurs when starting Postgres and running
this as the first statement.

I also noticed that everything exits with exit code 1:


postgres=# SELECT test_shm_mq(32768, (select
string_agg(chr(32+(random()*96)::int), '') from
generate_series(1,400)), 1, 1);
LOG:  registering background worker test_shm_mq
LOG:  starting background worker process test_shm_mq
 test_shm_mq
-------------

(1 row)

LOG:  worker process: test_shm_mq (PID 22041) exited with exit code 1
LOG:  unregistering background worker test_shm_mq

-- 
Thom




Re: [HACKERS] shared memory message queues

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 12:43 PM, Thom Brown t...@linux.com wrote:
 postgres=# SELECT test_shm_mq(32768, (select
 string_agg(chr(32+(random()*96)::int), '') from generate_series(1,3)),
 1, 10);
 ERROR:  could not register background process
 HINT:  You may need to increase max_worker_processes.
 STATEMENT:  SELECT test_shm_mq(32768, (select
 string_agg(chr(32+(random()*96)::int), '') from generate_series(1,3)),
 1, 10);
 LOG:  registering background worker test_shm_mq
 LOG:  starting background worker process test_shm_mq
 ERROR:  could not register background process
 HINT:  You may need to increase max_worker_processes.
 ERROR:  unable to map dynamic shared memory segment
 LOG:  worker process: test_shm_mq (PID 21939) exited with exit code 1
 LOG:  unregistering background worker test_shm_mq

 What's going on here?  This occurs when starting Postgres and running
 this as the first statement.

Well, you requested 10 processes, but the default value of
max_worker_processes is 8.  So, as the hint says, you may need to
increase max_worker_processes.  Alternatively, you can decrease the
number of processes in the ring.
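(For example, in postgresql.conf; this setting can only be changed at
server start:)

	max_worker_processes = 16		# default is 8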

 I also noticed that everything exits with exit code 1:

 postgres=# SELECT test_shm_mq(32768, (select
 string_agg(chr(32+(random()*96)::int), '') from
 generate_series(1,400)), 1, 1);
 LOG:  registering background worker test_shm_mq
 LOG:  starting background worker process test_shm_mq
  test_shm_mq
 -------------

 (1 row)

 LOG:  worker process: test_shm_mq (PID 22041) exited with exit code 1
 LOG:  unregistering background worker test_shm_mq

This is (perhaps unfortunately) required by the background-worker API.
When a process exits with code 0, it's immediately restarted
regardless of the restart-time setting.  To get the system to respect
the restart time (in this case, never) you have to make it exit with
code 1.  It's been like this since the beginning, and I wasn't in a
hurry to change it even though it seems odd to me.  Perhaps we should
revisit that decision.
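(In other words, a worker registered with bgw_restart_time =
BGW_NEVER_RESTART currently has to end like this; a sketch of the
convention described above:)

	/*
	 * Exit code 0 would mean "restart immediately", overriding even
	 * BGW_NEVER_RESTART; a nonzero code makes the postmaster honor the
	 * configured restart time instead.
	 */
	proc_exit(1);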

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2014-01-14 Thread Alvaro Herrera
Robert Haas wrote:
 On Tue, Jan 14, 2014 at 12:43 PM, Thom Brown t...@linux.com wrote:

  LOG:  worker process: test_shm_mq (PID 22041) exited with exit code 1
  LOG:  unregistering background worker test_shm_mq
 
 This is (perhaps unfortunately) required by the background-worker API.
  When a process exits with code 0, it's immediately restarted
 regardless of the restart-time setting.  To get the system to respect
 the restart time (in this case, never) you have to make it exit with
 code 1.  It's been like this since the beginning, and I wasn't in a
 hurry to change it even though it seems odd to me.  Perhaps we should
 revisit that decision.

Yeah, it's probably better to do it now rather than waiting.  When this
API was invented there wasn't any thought given to the idea of workers
that wouldn't always be up.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] shared memory message queues

2014-01-14 Thread Robert Haas
On Tue, Jan 14, 2014 at 1:54 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Robert Haas wrote:
 On Tue, Jan 14, 2014 at 12:43 PM, Thom Brown t...@linux.com wrote:
  LOG:  worker process: test_shm_mq (PID 22041) exited with exit code 1
  LOG:  unregistering background worker test_shm_mq

 This is (perhaps unfortunately) required by the background-worker API.
  When a process exits with code 0, it's immediately restarted
 regardless of the restart-time setting.  To get the system to respect
 the restart time (in this case, never) you have to make it exit with
 code 1.  It's been like this since the beginning, and I wasn't in a
 hurry to change it even though it seems odd to me.  Perhaps we should
 revisit that decision.

 Yeah, it's probably better to do it now rather than waiting.  When this
 API was invented there wasn't any thought given to the idea of workers
 that wouldn't always be up.

Well, what do we want the semantics to be, then?  Right now we have this:

0: restart immediately
1: restart based on the restart interval

What should we have instead?

I think it might be nice to have an exit code that means never
restart, regardless of the restart interval.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2014-01-14 Thread Peter Eisentraut
Something is causing this new compiler warning:

setup.c: In function ‘setup_dynamic_shared_memory’:
setup.c:102:3: error: format ‘%llu’ expects argument of type ‘long long 
unsigned int’, but argument 2 has type ‘Size’ [-Werror=format=]






Re: [HACKERS] shared memory message queues

2014-01-14 Thread Kevin Grittner
Peter Eisentraut pete...@gmx.net wrote:
 Something is causing this new compiler warning:

 setup.c: In function ‘setup_dynamic_shared_memory’:
 setup.c:102:3: error: format ‘%llu’ expects argument of type ‘long long 
 unsigned
 int’, but argument 2 has type ‘Size’ [-Werror=format=]

I'm not seeing that one but I am now getting these:

setup.c: In function ‘test_shm_mq_setup’:
setup.c:65:25: warning: ‘outq’ may be used uninitialized in this function 
[-Wmaybe-uninitialized]
setup.c:66:24: warning: ‘inq’ may be used uninitialized in this function 
[-Wmaybe-uninitialized]

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2014-01-14 Thread Michael Paquier
On Wed, Jan 15, 2014 at 4:32 AM, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 14, 2014 at 1:54 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Robert Haas wrote:
 On Tue, Jan 14, 2014 at 12:43 PM, Thom Brown t...@linux.com wrote:
  LOG:  worker process: test_shm_mq (PID 22041) exited with exit code 1
  LOG:  unregistering background worker test_shm_mq

 This is (perhaps unfortunately) required by the background-worker API.
  When a process exits with code 0, it's immediately restarted
 regardless of the restart-time setting.  To get the system to respect
 the restart time (in this case, never) you have to make it exit with
 code 1.  It's been like this since the beginning, and I wasn't in a
 hurry to change it even though it seems odd to me.  Perhaps we should
 revisit that decision.

 Yeah, it's probably better to do it now rather than waiting.  When this
 API was invented there wasn't any thought given to the idea of workers
 that wouldn't always be up.

 Well, what do we want the semantics to be, then?  Right now we have this:

 0: restart immediately
 1: restart based on the restart interval
With a 9.3 bgworker, the restart interval is respected if the exit code
is non-zero; the worker exits immediately in case of ERROR or FATAL with
exit code 1.

 What should we have instead?

 I think it might be nice to have an exit code that means never
 restart, regardless of the restart interval.
I imagine that would be useful, using 2 to avoid breaking currently
existing bgworkers. Perhaps it would be nicer to associate some exit
codes directly with the bgworker API, with something of the type:
#define BGW_EXIT_RESTART_NOW 0
#define BGW_EXIT_RESTART_INTERVAL 1
#define BGW_EXIT_STOP_FORCE 2
And all the other exit codes could default to 1.
-- 
Michael




Re: [HACKERS] shared memory message queues

2013-12-21 Thread Andres Freund
On 2013-12-20 22:04:05 +0100, Andres Freund wrote:
 Hi,
 
 On 2013-12-18 15:23:23 -0500, Robert Haas wrote:
  It sounds like most people who have looked at this stuff are broadly
  happy with it, so I'd like to push on toward commit soon, but it'd be
  helpful, Andres, if you could review the comment additions to
  shm-mq-v2.patch and see whether those address your concerns.  If so,
  I'll see about improving the overall comments for shm-toc-v1.patch as
  well to clarify the points that seem to have caused a bit of
  confusion; specific thoughts on what ought to be covered there, or any
  other review, is most welcome.
 
 Some things:

One more thing:
static uint64
shm_mq_get_bytes_written(volatile shm_mq *mq, bool *detached)
{
	uint64		v;

	SpinLockAcquire(&mq->mq_mutex);
	v = mq->mq_bytes_written;
	*detached = mq->mq_detached;
	SpinLockRelease(&mq->mq_mutex);

	return mq->mq_bytes_written;
}

Note how you're returning mq->mq_bytes_written instead of v.
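Presumably the intended final line is simply:

	return v;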

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] shared memory message queues

2013-12-21 Thread Peter Eisentraut
This patch didn't make it out of the 2013-11 commit fest.  You should
move it to the next commit fest (probably with an updated patch) before
January 15th.





Re: [HACKERS] shared memory message queues

2013-12-20 Thread Andres Freund
On 2013-10-31 12:21:31 -0400, Robert Haas wrote:
 Patch #2, shm-toc-v1.patch, provides a facility for sizing a dynamic
 shared memory segment before creation, and for dividing it up into
 chunks after it's been created.  It therefore serves a function quite
 similar to RequestAddinShmemSpace, except of course that there is only
 one main shared memory segment created at postmaster startup time,
 whereas new dynamic shared memory segments can come into existence on
 the fly; and it serves even more conspicuously the function of
 ShmemIndex, which enables backends to locate particular data
 structures within the shared memory segment.  It is however quite a
 bit simpler than the ShmemIndex mechanism: we don't need the same
 level of extensibility here that we do for the main shared memory
 segment, because a new extension need not piggyback on an existing
 dynamic shared memory segment, but can create a whole segment of its
 own.

So, without the argument of having per-extension dsm segments, I'd say
that a purely integer key sucks, because it's hard to manage and
debug. This way it's still not too nice, but I don't see an alternative
that's all that good.

Comments about shm-toc-v1.patch:

Since you're embedding spinlocks in struct shm_toc, this module will be
in conflict with platforms that do --disable-spinlocks, since the number
of spinlocks essentially needs to be predetermined there. I personally
still think the solution to that is getting rid of --disable-spinlocks.

I vote for removing all the shm_toc_estimator() knowledge from the
header; there seems little reason to expose it that way. That just
exposes unnecessary details and makes fixes after releases harder
(requiring extensions to recompile). Function call costs can hardly
matter, right?

Do you assume that lookup speed isn't that important? I have a bit of a
hard time imagining that linear search over the keys isn't going to bite
us. At the least there should be a comment somewhere documenting that
the intended use case is having relatively few keys.

Wouldn't it make sense to have a function that does the combined job of
shm_toc_insert() and shm_toc_allocate()?
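(For concreteness, a hypothetical combined wrapper might look like this;
the shm_toc_allocate()/shm_toc_insert() signatures are taken from the
patch, but the wrapper itself is not part of it:)

static void *
shm_toc_allocate_and_insert(shm_toc *toc, uint64 key, Size nbytes)
{
	void	   *ptr = shm_toc_allocate(toc, nbytes);

	shm_toc_insert(toc, key, ptr);
	return ptr;
}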

What's the assumption about the use of the magic in create/attach? Just
statically defined things in user code?

Ok, cooking now, then I'll have a look at the queue itself,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] shared memory message queues

2013-12-20 Thread Robert Haas
On Fri, Dec 20, 2013 at 1:11 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-10-31 12:21:31 -0400, Robert Haas wrote:
 Patch #2, shm-toc-v1.patch, provides a facility for sizing a dynamic
 shared memory segment before creation, and for dividing it up into
 chunks after it's been created.  It therefore serves a function quite
 similar to RequestAddinShmemSpace, except of course that there is only
 one main shared memory segment created at postmaster startup time,
 whereas new dynamic shared memory segments can come into existence on
 the fly; and it serves even more conspicuously the function of
 ShmemIndex, which enables backends to locate particular data
 structures within the shared memory segment.  It is however quite a
 bit simpler than the ShmemIndex mechanism: we don't need the same
 level of extensibility here that we do for the main shared memory
 segment, because a new extension need not piggyback on an existing
 dynamic shared memory segment, but can create a whole segment of its
 own.

 So, without the argument of having per-extension dsm segments, I'd say
 that a purely integer key sucks, because it's hard to manage and
 debug. This way it's still not too nice, but I don't see an alternative
 that's all that good.

 Comments about shm-toc-v1.patch:

 Since you're embedding spinlocks in struct shm_toc, this module will be
 in conflict with platforms that do --disable-spinlocks, since the number
 of spinlocks essentially needs to be predetermined there. I personally
 still think the solution to that is getting rid of --disable-spinlocks.

Yep.  We can either deprecate --disable-spinlocks, or we can add an
API that is the reverse of S_INIT_LOCK().

 I vote for removing all the shm_toc_estimator() knowledge from the
 header; there seems little reason to expose it that way. That just
 exposes unnecessary details and makes fixes after releases harder
 (requiring extensions to recompile). Function call costs can hardly
 matter, right?

Oh, you're saying to make it a function rather than a macro?  Sure, I
could do that.

 Do you assume that lookup speed isn't that important? I have a bit of a
 hard time imagining that linear search over the keys isn't going to bite
 us. At the least there should be a comment somewhere documenting that
 the intended use case is having relatively few keys.

Well, as previously noted, in the use cases I imagine for this, it'll
be nworkers + somesmallconstant.  I can add a comment to that effect.

 Wouldn't it make sense to have a function that does the combined job of
 shm_toc_insert() and shm_toc_allocate()?

We could, but I don't really feel compelled.  It's not hard to call
them individually if that's the behavior you happen to want.

 What's the assumption about the use of the magic in create/attach? Just
 statically defined things in user code?

Yeah.

 Ok, cooking now, then I'll have a look at the queue itself,

Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2013-12-20 Thread Andres Freund
On 2013-12-20 14:10:57 -0500, Robert Haas wrote:
  Since you're embedding spinlocks in struct shm_toc, this module will be
  in conflict with platforms that do --disable-spinlocks, since the number
  of spinlocks essentially needs to be predetermined there. I personally
  still think the solution to that is getting rid of --disable-spinlocks.
 
 Yep.  We can either deprecate --disable-spinlocks, or we can add an
 API that is the reverse of S_INIT_LOCK().

How is that going to help? Even if we'd deallocate unused spinlocks -
which sure is a good idea when they require persistent resources like
the PGSemaphore based one - the maximum is still going to be there.

  I vote for removing all the shm_toc_estimator() knowledge from the
  header; there seems little reason to expose it that way. That just
  exposes unnecessary details and makes fixes after releases harder
  (requiring extensions to recompile). Function call costs can hardly
  matter, right?
 
 Oh, you're saying to make it a function rather than a macro?  Sure, I
 could do that.

Yea, keeping it private, as a function, seems like a good idea.

  Do you assume that lookup speed isn't that important? I have a bit of a
  hard time imagining that linear search over the keys isn't going to bite
  us. At the least there should be a comment somewhere documenting that
  the intended use case is having relatively few keys.
 
 Well, as previously noted, in the use cases I imagine for this, it'll
 be nworkers + somesmallconstant.  I can add a comment to that effect.

I primarily was wondering because you have gone through a bit of effort
making it lockless. A comment would be good, yes.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] shared memory message queues

2013-12-20 Thread Robert Haas
On Fri, Dec 20, 2013 at 2:33 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-20 14:10:57 -0500, Robert Haas wrote:
  Since you're embedding spinlocks in struct shm_toc, this module will be
  in conflict with platforms that do --disable-spinlocks, since the number
  of spinlocks essentially needs to be predetermined there. I personally
  still think the solution to that is getting rid of --disable-spinlocks.

 Yep.  We can either deprecate --disable-spinlocks, or we can add an
 API that is the reverse of S_INIT_LOCK().

 How is that going to help? Even if we'd deallocate unused spinlocks -
 which sure is a good idea when they require persistent resources like
 the PGSemaphore based one - the maximum is still going to be there.

Well, we can set the maximum to a bigger number, if that's what we
want to do, but I'll admit it's unclear what value would be
sufficient.  We probably won't have more than one TOC per DSM, and the
maximum number of DSMs is capped based on MaxBackends, but it seems
likely that many applications of dynamic shared memory will require
spinlocks, and I don't think we can plausibly calculate an upper bound
there.  If we could it would be a big number, and that might just make
startup fail anyway.

Personally, I don't have a big problem with the idea that if you
compile with --disable-spinlocks, some things may just fail, and
parallel whatever might be one of those things.  We could just bump
the 30 in SpinlockSemas to 100, and if you happen to use more than
that, too bad for you: it'll fail.

But I also don't have a big problem with removing --disable-spinlocks
altogether.  What do other people think?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2013-12-20 Thread Andres Freund
Hi,

On 2013-12-18 15:23:23 -0500, Robert Haas wrote:
 It sounds like most people who have looked at this stuff are broadly
 happy with it, so I'd like to push on toward commit soon, but it'd be
 helpful, Andres, if you could review the comment additions to
 shm-mq-v2.patch and see whether those address your concerns.  If so,
 I'll see about improving the overall comments for shm-toc-v1.patch as
 well to clarify the points that seem to have caused a bit of
 confusion; specific thoughts on what ought to be covered there, or any
 other review, is most welcome.

Some things:

* shm_mq_minimum_size should be const

* the MAXALIGNing in shm_mq_create looks odd. We're computing
  data_offset using MAXALIGN, determining the ring size using it, but then
  setting mq_ring_offset to data_offset - offsetof(shm_mq, mq_ring)?

* same problem as the toc stuff with a dynamic number of spinlocks.

* you don't seem to initialize the spinlock anywhere. That's clearly
  not ok, independent of the spinlock implementation.

* The comments around shm_mq/shm_mq_handle are a clear improvement. I'd
  be pretty happy if you additionally add something akin to 'This struct
  describes the shared state of a queue' and 'Backend-local reference to
  a queue'.

* s/MQH_BUFSIZE/MQH_INITIAL_BUFSIZE/, I was already wondering whether
  there's a limit on the max number of bytes.

* I think shm_mq_attach()'s documentation should mention that we'll
  initially and later allocate memory in the current
  CurrentMemoryContext when attaching. That will likely require some
  care for some callers. Yes, it's documented in a bigger comment
  somewhere else, but still.

* I don't really understand the mqh_consume_pending stuff. If we just
  read a message spanning from the beginning to the end of the
  ringbuffer, without wrapping, we might not confirm the read in
  shm_mq_receive() until the next time it is called? Do I understand that
  correctly?
I am talking about the following and other similar pieces of code:
+		if (rb >= needed)
+		{
+			/*
+			 * Technically, we could consume the message length information
+			 * at this point, but the extra write to shared memory wouldn't
+			 * be free and in most cases we would reap no benefit.
+			 */
+			mqh->mqh_consume_pending = needed;
+			*nbytesp = nbytes;
+			*datap = ((char *) rawdata) + MAXALIGN64(sizeof(uint64));
+			return SHM_MQ_SUCCESS;
+		}

* ISTM the messages around needing to use the same arguments for
  send/receive again after SHM_MQ_WOULD_BLOCK could stand to be more
  forceful. "In this case, the caller should call this function again
  after the process latch has been set." doesn't make it all that clear
  that failure to do so will corrupt the next message. Maybe there also
  should be an assert checking that the size is the same when retrying
  as when starting a partial read?

* You have some CHECK_FOR_INTERRUPTS(); are you sure the code is safe to
  longjmp out from there in all the cases? E.g. I am not sure it is for
  shm_mq_send_bytes(), when we've first written a partial message, done
  a shm_mq_inc_bytes_written() and then go into the available == 0
  branch and jump out.

* Do I understand it correctly that mqh->mqh_counterparty_attached is
  just sort of a cache, and we'll still detect a detached counterparty
  when we're actually sending/receiving?

That's it for now.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] shared memory message queues

2013-12-18 Thread Andres Freund
On 2013-12-18 15:23:23 -0500, Robert Haas wrote:
 It sounds like most people who have looked at this stuff are broadly
 happy with it, so I'd like to push on toward commit soon, but it'd be
 helpful, Andres, if you could review the comment additions to
 shm-mq-v2.patch and see whether those address your concerns.

FWIW, I haven't looked carefully enough at the patch to consider myself
having reviewed it. I am not saying that it's not ready for committer,
just that I don't know whether it is.

I will have a look at the comment improvements, and if you don't beat me
to it, give the patch a read-over.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] shared memory message queues

2013-12-09 Thread Robert Haas
On Sun, Dec 8, 2013 at 5:52 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 2013/12/6 Kohei KaiGai kai...@kaigai.gr.jp:
 What will happen if the sender tries to send a large chunk that needs to
 be split into multiple sub-chunks and the receiver concurrently detaches
 itself from the queue during the writes by the sender?
 It seems to me the sender gets SHM_MQ_DETACHED and only the
 earlier half of the chunk still remains on the queue even though
 its total length was already in the message queue.
 It may eventually lead to an infinite loop on the receiver side when another
 receiver appears again later and reads the incomplete chunk.
 Is this a feasible scenario? If so, it might be a solution to prohibit
 enqueuing something without a receiver, and to reset the queue when a new
 receiver is attached.

 Isn't it an intended usage to attach a peer process to a message
 queue that had once been detached?
 If so, it may be a solution to put ereport() in shm_mq_set_receiver()
 and shm_mq_set_sender() to prohibit assigning a process to a
 message queue with mq_detached = true. It would make the situation
 simpler.

It's not intended that you should be able to attach a new reader or
writer in place of an old one that detached.  That would in fact be
pretty tricky to support, because if the detached process was in the
middle of reading or writing a message at the time it died, then
there's no way to recover protocol sync.  We could design some
mechanism for that, but in the case of background workers connected to
dynamic shared memory segments it isn't needed, because I assume that
when the background worker croaks, you're going to tear down the
dynamic shared memory segment and thus the whole queue will disappear;
if the user retries the query, we'll create a whole new segment
containing a whole new queue (or queues).

Now, if we wanted to use these queues in permanent shared memory, we'd
probably need to think a little bit harder about this.  It is not
impossible to make it work even as things stand, because you could
reuse the same chunk of shared memory and just overwrite it with a
newly-initialized queue.  You'd need some mechanism to figure out when
to do that, and it might be kind of ugly, but I think it'd be doable.
That wasn't the design center for this attempt, though, and if we want
to use it that way then we probably should spend some time figuring
out how to support both a clean detach, where the reader or writer
goes away at a message boundary, and possibly also a dirty detach,
where the reader or writer goes away in the middle of a message.  I
view those as problems for future patches, though.

 Regarding the test-shm-mq-v1.patch: setup_background_workers()
 tries to launch nworkers background worker processes; however, it
 may fail during launching if max_worker_processes is not large enough.
 Is this a situation for attaching the BGWORKER_EPHEMERAL flag once
 your patch gets committed?

I dropped the proposal for BGWORKER_EPHEMERAL; I no longer think we
need that.  If not all of the workers can be registered,
setup_background_workers() will throw an error when
RegisterDynamicBackgroundWorker returns false.  If the workers are
successfully registered but don't get as far as connecting to the
shm_mq, wait_for_workers_to_become_ready() will detect that condition
and throw an error.  If all of the workers start up and attached to
the shared memory message queues but then later one of them dies, the
fact that it got as far as connecting to the shm_mq means that the
message queue's on_dsm_detach callback will run, which will mark the
queues to which it is connected as detached.  That will cause the
workers on either side of it to exit also until eventually the failure
propagates back around to the user backend.  This is all a bit complex
but I don't see a simpler solution.

 Also, test_shm_mq_setup() waits for the background worker processes
 to complete their startup. I'm uncertain whether it is really
 needed, because this shared memory message queue allows one to
 send a byte stream without a receiver, and also blocks until a byte
 stream comes from a peer that attaches later.

That's actually a very important check.  Suppose we've got just 3
worker processes, so that the message queues are connected like this:

user backend -> worker 1 -> worker 2 -> worker 3 -> user backend

When test_shm_mq_setup connects to the queues linking it to worker 1
and worker 3, it passes a BackgroundWorkerHandle to shm_mq_attach. As
a result, if either worker 1 or worker 3 fails during startup, before
attaching to the queue, the user backend would notice that and error
out right away, even if it didn't do
wait_for_workers_to_become_ready().  However, if worker 2 fails during
startup, neither the user backend nor either of the other workers
would notice that without wait_for_workers_to_become_ready(): the user
backend isn't connected to worker 2 by a shm_mq at all, and workers 1
and 3 have no BackgroundWorkerHandle to pass to shm_mq_attach(),
because 

Re: [HACKERS] shared memory message queues

2013-12-08 Thread Kohei KaiGai
2013/12/6 Kohei KaiGai kai...@kaigai.gr.jp:
 What will happen if the sender tries to send a large chunk that needs to
 be split into multiple sub-chunks and the receiver concurrently detaches
 itself from the queue during the writes by the sender?
 It seems to me the sender gets SHM_MQ_DETACHED and only the
 earlier half of the chunk still remains on the queue even though
 its total length was already in the message queue.
 It may eventually lead to an infinite loop on the receiver side when another
 receiver appears again later and reads the incomplete chunk.
 Is this a feasible scenario? If so, it might be a solution to prohibit
 enqueuing something without a receiver, and to reset the queue when a new
 receiver is attached.

Isn't it an intended usage to attach a peer process to a message
queue that had once been detached?
If so, it may be a solution to put ereport() in shm_mq_set_receiver()
and shm_mq_set_sender() to prohibit assigning a process to a
message queue with mq_detached = true. It would make the situation
simpler.

Regarding the test-shm-mq-v1.patch: setup_background_workers()
tries to launch nworkers background worker processes; however, it
may fail during launching if max_worker_processes is not large enough.
Is this a situation for attaching the BGWORKER_EPHEMERAL flag once
your patch gets committed?
I think it is a waste of effort to add error handling here instead of
waiting for the core support to be added; however, it makes sense to
put in a source code comment so we don't forget to add this flag when
it arrives.

Also, test_shm_mq_setup() waits for the background worker processes
to complete their startup. I'm uncertain whether it is really
needed, because this shared memory message queue allows one to
send a byte stream without a receiver, and also blocks until a byte
stream comes from a peer that attaches later.
This module is designed for test purposes, so I think it makes more
sense if the test conditions cover more corner cases.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: [HACKERS] shared memory message queues

2013-12-07 Thread Robert Haas
On Fri, Dec 6, 2013 at 8:12 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-12-05 14:07:39 -0500, Robert Haas wrote:
 On Thu, Dec 5, 2013 at 8:29 AM, Andres Freund and...@2ndquadrant.com wrote:
  Hm. The API change of on_shmem_exit() is going to cause a fair bit of
  pain. There are a fair number of extensions out there using it and all
  would need to be littered with version-dependent #ifdefs. What about
  providing a version of on_shmem_exit() that allows specifying the exit
  phase, and making on_shmem_exit() a backward-compatible wrapper?

 Or, we could have on_user_exit() or so and leave on_shmem_exit() alone
 altogether, which would probably be transparent for nearly everyone.
 I kind of like that better, except that I'm not sure whether
 on_user_exit() is any good as a name.

 Leaving it as separate calls sounds good to me as well - but like you I
 don't like on_user_exit() that much. Not sure if I can come up with
 something really good...
 on_shmem_exit_phase() or at_shmem_exit_phase() if we go for a function
 allowing one to specify phases, and just before_shmem_exit() for the
 user-level things?

Hmm, before_shmem_exit() and on_shmem_exit() don't sound bad.
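(That would split the declarations roughly like this, a sketch:)

	/* user-level cleanup, run before shared-memory resources are torn down */
	extern void before_shmem_exit(pg_on_exit_callback function, Datum arg);
	/* low-level shared-memory cleanup, as today */
	extern void on_shmem_exit(pg_on_exit_callback function, Datum arg);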

  Hm. A couple of questions/comments:
  * How do you imagine keys to be used/built?
  * Won't the sequential search over toc entries become problematic?
  * Some high level overview of how it's supposed to be used wouldn't
hurt.
  * the estimator stuff doesn't seem to belong in the public header?

 The test-shm-mq patch is intended to illustrate these points.  In that
 case, for an N-worker configuration, there are N+1 keys; that is, N
 message queues and one fixed-size control area.  The estimator stuff
 is critically needed in the public header so that you can size your
 DSM to hold the stuff you intend to store in it; again, see
 test-shm-mq.

 I still think shm_mq.c needs to explain more of that. That's where I
 look for a brief high-level explanation, no?

That stuff mostly pertains to shm_toc.c, not shm_mq.c.  I can
certainly look at expanding the explanation in the former.
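
For reference, the estimator usage mentioned above looks roughly like
this in test-shm-mq; a sketch, following the N message queues plus one
fixed-size control area described earlier:

	shm_toc_estimator e;
	Size		segsize;
	int			i;

	shm_toc_initialize_estimator(&e);
	shm_toc_estimate_chunk(&e, sizeof(test_shm_mq_header));	/* control area */
	for (i = 0; i < nworkers; ++i)
		shm_toc_estimate_chunk(&e, (Size) queue_size);	/* one ring per queue */
	shm_toc_estimate_keys(&e, nworkers + 1);
	segsize = shm_toc_estimate(&e);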

 I had a very quick look at shm_mq_receive()/send() and
 shm_mq_receive_bytes()/shm_mq_send_bytes() - while the low level steps
 seem to be explained fairly well, the high level design of the queue
 doesn't seem to be explained much. I was first thinking that you were
 trying to implement a lockless queue (which is quite efficiently
 possible for 1:1 queues), not least because I didn't see any locking in them
 till I noticed it's just delegated to helper functions.

I did initially implement it as lockless, but I figured I needed a
fallback with spinlocks for machines that didn't have atomic 8-byte
writes and/or memory barriers, both of which are required for this to
work without locks.  When I implemented the fallback, I discovered
that it wasn't any slower than the lock-free implementation, so I
decided to simplify my life (and the code) by not bothering with it.

So the basic idea is that you have to take the spinlock to modify the
count of bytes read - if you're the reader - or written - if you're
the writer.  You also need to take the spinlock to read the counter
the other process might be modifying, but you can read the counter
that only you can modify without the lock.  You also don't need the
lock to read or write the data in the buffer, because you can only
write to the portion of the buffer that the reader isn't currently
allowed to read, and you can only read from the portion of the buffer
that the writer isn't currently allowed to write.
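(A sketch of the writer's side of that protocol; the field names follow
the patch, while the helper name here is illustrative:)

static uint64
shm_mq_writer_get_bytes_read(volatile shm_mq *mq)
{
	uint64		v;

	/*
	 * The reader owns and updates mq_bytes_read, so we must read it under
	 * the spinlock.  Our own mq_bytes_written, by contrast, can be read
	 * without the lock, because nobody else ever modifies it.
	 */
	SpinLockAcquire(&mq->mq_mutex);
	v = mq->mq_bytes_read;
	SpinLockRelease(&mq->mq_mutex);

	return v;
}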

 I wonder if we couldn't just generally store a generation number for
 each PGPROC which is increased every time the slot is getting
 reused. Then one could simply check whether mq->mq_sender->generation ==
 mq->mq_sender_generation.

I think you'd want to bump the generation every time the slot was
released rather than when it was reacquired, but it does sound
possibly sensible.  However, it wouldn't obviate the separate need to
watch a BackgroundWorkerHandle, because what this lets you do is (say)
write to a queue after you've registered the reader but before it's
actually been started by the postmaster, let alone acquired a PGPROC.
I won't object if someone implements such a system for their needs,
but it's not on my critical path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] shared memory message queues

2013-12-06 Thread Andres Freund
On 2013-12-05 14:07:39 -0500, Robert Haas wrote:
 On Thu, Dec 5, 2013 at 8:29 AM, Andres Freund and...@2ndquadrant.com wrote:
  Hm. The API change of on_shmem_exit() is going to cause a fair bit of
  pain. There are a fair number of extensions out there using it and all
  would need to be littered with version-dependent #ifdefs. What about
  providing a version of on_shmem_exit() that allows specifying the exit
  phase, and making on_shmem_exit() a backward-compatible wrapper?
 
 Or, we could have on_user_exit() or so and leave on_shmem_exit() alone
 altogether, which would probably be transparent for nearly everyone.
 I kind of like that better, except that I'm not sure whether
 on_user_exit() is any good as a name.

Leaving it as separate calls sounds good to me as well - but like you I
don't like on_user_exit() that much. Not sure if I can come up with
something really good...
on_shmem_exit_phase() or at_shmem_exit_phase() if we go for a function
allowing one to specify phases, and just before_shmem_exit() for the
user-level things?

  FWIW, I find on_dsm_detach_cancel() to be a confusing name. I thought it
  might be a variant that is called in different circumstances than
  on_dsm_detach(). Maybe go with cancel_on_shmem_detach(), like with
  cancel_shmem_exit()?
 
 It could be cancel_on_dsm_detach() if you like that better.  Not
 cancel_on_shmem_detach(), though.

Yes, I like that better. The shm instead of dsm was just a thinko ;)

  Hm. A couple of questions/comments:
  * How do you imagine keys to be used/built?
  * Won't the sequential search over toc entries become problematic?
  * Some high level overview of how it's supposed to be used wouldn't
hurt.
  * the estimator stuff doesn't seem to belong in the public header?
 
 The test-shm-mq patch is intended to illustrate these points.  In that
 case, for an N-worker configuration, there are N+1 keys; that is, N
 message queues and one fixed-size control area.  The estimator stuff
 is critically needed in the public header so that you can size your
 DSM to hold the stuff you intend to store in it; again, see
 test-shm-mq.

I still think shm_mq.c needs to explain more of that. That's where I
look for a brief high-level explanation, no?

  Patch #3, shm-mq-v1.patch, is the heart of this series.  It creates an
  infrastructure for sending and receiving messages of arbitrary length
  using ring buffers stored in shared memory (presumably dynamic shared
  memory, but hypothetically the main shared memory segment could be
  used).
 
  The API seems to assume it's in dsm tho?
 
 The header file makes no reference to dsm anywhere, so I'm not sure
 why you draw this conclusion.

The reason I was asking was this reference to dsm:
+shm_mq_handle *
+shm_mq_attach(shm_mq *mq, dsm_segment *seg, BackgroundWorkerHandle *handle)

I'd only looked at the header at that point, but looking at the
function's comment it's optional.


  Hm. Too bad, I'd hoped for single-reader, multiple-writer :P
 
 Sure, that might be useful, too, as might multiple-reader,
 multi-writer.  But those things would come with performance and
 complexity trade-offs of their own.  I think it's appropriate to leave
 the task of creating those things to the people that need them.  If it
 turns out that this can be enhanced to work like that without
 meaningful loss of performance, that's fine by me, too, but I think
 this has plenty of merit as-is.

Yea, it's perfectly fine not to implement what I wished for ;). I just
had hoped you would magically develop what I was dreaming about...

 It's symmetric.  The idea is that you try to read or write data;
 should that fail, you wait on your latch and try again when it's set.

Ok, good. That's what I thought.

  Couple of questions:
  * Some introductory comments about the concurrency approach would be
good.
 
 Not sure exactly what to write.

I had a very quick look at shm_mq_receive()/send() and
shm_mq_receive_bytes()/shm_mq_send_bytes() - while the low level steps
seem to be explained fairly well, the high level design of the queue
doesn't seem to be explained much. I was first thinking that you were
trying to implement a lockless queue (which is quite efficiently
possible for 1:1 queues), not least because I didn't see any locking in them
till I noticed it's just delegated to helper functions.

  * shm_mq_attach() is talking about BackgroundWorkerHandles - is there a
limitation that a bgworker has to be involved?

 [sensible stuff]

 Possibly there could be some similar mechanism to wait for a
 non-background-worker to stop, but I haven't thought much about what
 that would look like.

I wonder if we couldn't just generally store a generation number for
each PGPROC which is increased every time the slot is getting
reused. Then one could simply check whether mq->mq_sender->generation ==
mq->mq_sender_generation.
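(Sketched with hypothetical field names; nothing like this exists yet:)

	/* detect that the sender's PGPROC slot was recycled since we attached */
	if (mq->mq_sender->generation != mq->mq_sender_generation)
		return SHM_MQ_DETACHED;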

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

Re: [HACKERS] shared memory message queues

2013-12-05 Thread Andres Freund
Hi,

Planned to look at this for a while... Not a detailed review, just some
thoughts. I'll let what I read sink in and possibly comment later.

On 2013-10-31 12:21:31 -0400, Robert Haas wrote:
 The attached patches attempt to rectify some of these problems.

Well, I wouldn't call them problems. Just natural, incremental development
;)

 Patch #1, on-dsm-detach-v1.patch, adds the concept of on_dsm_detach
 hooks
 [snip]
 The part of this patch which I
 suppose will elicit some controversy is that I've had to rearrange
 on_shmem_exit a bit.  It turns out that during shmem_exit, we do
 user-level cleanup, like aborting the transaction, first.  We expect
 that will probably release all of our shared-memory resources.

Hm. The API change of on_shmem_exit() is going to cause a fair bit of
pain. There are a fair number of extensions out there using it and all
would need to be littered with version-dependent #ifdefs. What about
providing a version of on_shmem_exit() that allows specifying the exit
phase, and making on_shmem_exit() a backward-compatible wrapper?

FWIW, I find on_dsm_detach_cancel() to be a confusing name. I thought it
might be a variant that is called in different circumstances than
on_dsm_detach(). Maybe go with cancel_on_shmem_detach(), like with
cancel_shmem_exit()?

 Patch #2, shm-toc-v1.patch, provides a facility for sizing a dynamic
 shared memory segment before creation, and for dividing it up into
 chunks after it's been created.  It therefore serves a function quite
 similar to RequestAddinShmemSpace, except of course that there is only
 one main shared memory segment created at postmaster startup time,
 whereas new dynamic shared memory segments can come into existence on
 the fly; and it serves even more conspicuously the function of
 ShmemIndex, which enables backends to locate particular data
 structures within the shared memory segment.  It is however quite a
 bit simpler than the ShmemIndex mechanism: we don't need the same
 level of extensibility here that we do for the main shared memory
 segment, because a new extension need not piggyback on an existing
 dynamic shared memory segment, but can create a whole segment of its
 own.

Hm. A couple of questions/comments:
* How do you imagine keys to be used/built?
* Won't the sequential search over toc entries become problematic?
* Some high level overview of how it's supposed to be used wouldn't
  hurt.
* the estimator stuff doesn't seem to belong in the public header?

 Patch #3, shm-mq-v1.patch, is the heart of this series.  It creates an
 infrastructure for sending and receiving messages of arbitrary length
 using ring buffers stored in shared memory (presumably dynamic shared
 memory, but hypothetically the main shared memory segment could be
 used).

The API seems to assume it's in dsm tho?

 Queues are single-reader and single-writer; they use process
 latches to implement waiting for the queue to fill (in the case of the
 reader) or drain (in the case of the writer).

Hm. Too bad, I'd hoped for single-reader, multiple-writer :P

 A non-blocking mode is also available for situations where other
 options might lead to deadlock.

That's cool. I wonder if there's a way to discover, on the writing end,
when new data can be sent. For reading there's a comment about waiting
for the latch...

Couple of questions:
* Some introductory comments about the concurrency approach would be
  good.
* shm_mq_attach() is talking about BackgroundWorkerHandles - is there a
  limitation that a bgworker has to be involved?


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] shared memory message queues

2013-12-05 Thread Kohei KaiGai
Sorry for my late reply.

I checked the part-3 (shm-mq-v1.patch) portion as below.
Your comments towards part-1 and part-2 are reasonable to me,
so I have no argument on that portion.

shm_mq_create() expects the address to be aligned; however, there is
no mechanism to ensure it. How about putting an Assert() here?


shm_mq_send_bytes() has an Assert() to check the length to be
added to mq_bytes_written. Is MAXALIGN64() the right check on
a 32-bit architecture?
The sendnow value might be Min(ringsize - used, nbytes - sent),
and the ringsize comes from mq->mq_ring_size, which is aligned
using MAXALIGN(_DOWN), which has 32-bit width.

/*
 * Update count of bytes written, with alignment padding.  Note
 * that this will never actually insert any padding except at the
 * end of a run of bytes, because the buffer size is a multiple of
 * MAXIMUM_ALIGNOF, and each read is as well.
 */
Assert(sent == nbytes || sendnow == MAXALIGN64(sendnow));
shm_mq_inc_bytes_written(mq, MAXALIGN64(sendnow));


What will happen if a sender tries to send a large chunk that needs to
be split into multiple sub-chunks and the receiver concurrently detaches
itself from the queue during the sender's writes?
It seems to me the sender gets SHM_MQ_DETACHED and only the
earlier half of the chunk remains on the queue, even though
its total length was already written to the message queue.
It may eventually lead to an infinite loop on the receiver side if another
receiver appears again later and reads the incomplete chunk.
Is that a feasible scenario? If so, one solution might be to prohibit
enqueuing anything without a receiver, and to reset the queue when a new
receiver attaches.


The source code comment in shm_mq_wait_internal() is a little bit
misleading to me. It suggests that nothing shall be written until the
peer process has attached; however, we have no such check as long as
nsend is less than the queue size.

 * If handle != NULL, the queue can be read or written even before the
 * other process has attached.  We'll wait for it to do so if needed.  The
 * handle must be for a background worker initialized with bgw_notify_pid
 * equal to our PID.

Right now, that's all I can comment on. I'll do follow-up code reading
over the weekend.

Thanks,

2013/11/20 Robert Haas robertmh...@gmail.com:
 On Tue, Nov 19, 2013 at 12:33 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 * on-dsm-detach-v2.patch
 It reminded me of the hook registration/invocation mechanism in apache/httpd.
 It defines five levels for invocation order (REALLY_FIRST, FIRST, MIDDLE,
 LAST, REALLY_LAST), but these are aliases for integer values; in other words,
 it just kicks the hooks in order of the priority value associated with each
 function pointer.
 This flexibility may make sense. We may want to control the order in which
 resources are released in a more fine-grained way in the future. For example,
 isn't it a problem if we have only two levels when something stored in a
 queue needs to be released earlier than the queue itself?
 I'm not 100% certain about this suggestion, because on_shmem_exit is a hook
 that does not host so many callbacks; an extension could of course implement
 its own ordering within the SHMEM_EXIT_EARLY or SHMEM_EXIT_LATE stage.

 I don't really see much point in adding more flexibility here until we
 need it, but I can imagine that we might someday need it, for reasons
 that are not now obvious.

 * shm-toc-v1.patch

 From my experience, it makes sense to put a magic number at the tail of the
 toc segment to detect shared-memory overruns. It helps to find bugs that are
 otherwise hard to find because of concurrent jobs.
 Probably shm_toc_freespace() is the place to put the assertion.

 How many entries is shm_toc_lookup() assumed to chain through?
 It performs a linear search from the head of the shm_toc segment, and that
 is a prerequisite for lock-less reference, isn't it?
 Is it a problem if the shm_toc hosts many entries, like 100 or 1000?
 Or is that not an expected usage?

 It is not an expected usage.  In typical usage, I expect that the
 number of TOC entries will be about N+K, where K is a small constant
 (< 10) and N is the number of cooperating parallel workers.  It's
 possible that we'll someday be in a position to leverage 100 or 1000
 parallel workers on the same task, but I don't expect it to be soon.
 And, actually, I doubt that revising the data structure would pay off
 at N=100.  At N=1000, maybe.  At N=10000, probably.  But we are
 *definitely* not going to need that kind of scale any time soon, and I
 don't think it makes sense to design a complex data structure to
 handle that case when there are so many more basic problems that need
 to be solved before we can go there.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] shared memory message queues

2013-12-05 Thread Robert Haas
On Thu, Dec 5, 2013 at 8:29 AM, Andres Freund and...@2ndquadrant.com wrote:
 Patch #1, on-dsm-detach-v1.patch, adds the concept of on_dsm_detach
 hooks
 [snip]
 The part of this patch which I
 suppose will elicit some controversy is that I've had to rearrange
 on_shmem_exit a bit.  It turns out that during shmem_exit, we do
 user-level cleanup, like aborting the transaction, first.  We expect
 that will probably release all of our shared-memory resources.

 Hm. The API change of on_shmem_exit() is going to cause a fair bit of
 pain. There are a fair number of extensions out there using it and all
 would need to be littered by version dependent #ifdefs. What about
 providing a version of on_shmem_exit() that allows to specify the exit
 phase, and make on_shmem_exit() a backward compatible wrapper?

Or, we could have on_user_exit() or so and leave on_shmem_exit() alone
altogether, which would probably be transparent for nearly everyone.
I kind of like that better, except that I'm not sure whether
on_user_exit() is any good as a name.
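
To make the two options concrete, a rough sketch (every name and phase
constant below is invented for illustration; nothing is settled):

/* option A: phase-aware registration plus a compatible wrapper */
extern void on_shmem_exit_phase(int phase, pg_on_exit_callback function,
								Datum arg);

void
on_shmem_exit(pg_on_exit_callback function, Datum arg)
{
	on_shmem_exit_phase(SHMEM_EXIT_LATE, function, arg);
}

/* option B: a separate list just for user-level cleanup */
extern void on_user_exit(pg_on_exit_callback function, Datum arg);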

 FWIW, I find on_dsm_detach_cancel() to be a confusing name. I thought it
 might be a variant that is called in different circumstances than
 on_dsm_detach(). Maybe go with cancel_on_shmem_detach(), like with
 cancel_shmem_exit()?

It could be cancel_on_dsm_detach() if you like that better.  Not
cancel_on_shmem_detach(), though.
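
That is, the pairing would look about like this (signatures illustrative,
not final):

extern void on_dsm_detach(dsm_segment *seg,
						  on_dsm_detach_callback function, Datum arg);
extern void cancel_on_dsm_detach(dsm_segment *seg,
								 on_dsm_detach_callback function, Datum arg);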

 Patch #2, shm-toc-v1.patch, provides a facility for sizing a dynamic
 shared memory segment before creation, and for dividing it up into
 chunks after it's been created.  It therefore serves a function quite
 similar to RequestAddinShmemSpace, except of course that there is only
 one main shared memory segment created at postmaster startup time,
 whereas new dynamic shared memory segments can come into existence on
 the fly; and it serves even more conspicuously the function of
 ShmemIndex, which enables backends to locate particular data
 structures within the shared memory segment.  It is however quite a
 bit simpler than the ShmemIndex mechanism: we don't need the same
 level of extensibility here that we do for the main shared memory
 segment, because a new extension need not piggyback on an existing
 dynamic shared memory segment, but can create a whole segment of its
 own.

 Hm. A couple of questions/comments:
 * How do you imagine keys to be used/built?
 * Won't the sequential search over toc entries become problematic?
 * Some high level overview of how it's supposed to be used wouldn't
   hurt.
 * the estimator stuff doesn't seem to belong in the public header?

The test-shm-mq patch is intended to illustrate these points.  In that
case, for an N-worker configuration, there are N+1 keys; that is, N
message queues and one fixed-size control area.  The estimator stuff
is critically needed in the public header so that you can size your
DSM to hold the stuff you intend to store in it; again, see
test-shm-mq.
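
Condensed from test-shm-mq's setup code, the pattern is roughly as below
(simplified and from memory, so details may differ; hdr_size, queue_size,
and nworkers stand in for the real values):

shm_toc_estimator estimator;
Size		segsize;
dsm_segment *seg;
shm_toc    *toc;
void	   *hdr;
int			i;

/* step 1: size the segment before creating it */
shm_toc_initialize_estimator(&estimator);
shm_toc_estimate_chunk(&estimator, hdr_size);
for (i = 0; i < nworkers; ++i)
	shm_toc_estimate_chunk(&estimator, queue_size);
shm_toc_estimate_keys(&estimator, nworkers + 1);	/* N queues + 1 header */
segsize = shm_toc_estimate(&estimator);

/* step 2: create the segment and carve it up */
seg = dsm_create(segsize);
toc = shm_toc_create(PG_TEST_SHM_MQ_MAGIC, dsm_segment_address(seg),
					 segsize);
hdr = shm_toc_allocate(toc, hdr_size);
shm_toc_insert(toc, 0, hdr);	/* key 0 = the fixed-size control area */
/* ... then allocate and insert one queue per worker under keys 1..N ... */

/* any backend that maps the segment finds things again the same way */
toc = shm_toc_attach(PG_TEST_SHM_MQ_MAGIC, dsm_segment_address(seg));
hdr = shm_toc_lookup(toc, 0);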

 Patch #3, shm-mq-v1.patch, is the heart of this series.  It creates an
 infrastructure for sending and receiving messages of arbitrary length
 using ring buffers stored in shared memory (presumably dynamic shared
 memory, but hypothetically the main shared memory segment could be
 used).

 The API seems to assume it's in dsm tho?

The header file makes no reference to dsm anywhere, so I'm not sure
why you draw this conclusion.

 Queues are single-reader and single-writer; they use process
 latches to implement waiting for the queue to fill (in the case of the
 reader) or drain (in the case of the writer).

 Hm. Too bad, I'd hoped for single-reader, multiple-writer :P

Sure, that might be useful, too, as might multiple-reader,
multi-writer.  But those things would come with performance and
complexity trade-offs of their own.  I think it's appropriate to leave
the task of creating those things to the people that need them.  If it
turns out that this can be enhanced to work like that without
meaningful loss of performance, that's fine by me, too, but I think
this has plenty of merit as-is.

 A non-blocking mode is also available for situations where other
 options might lead to deadlock.

 That's cool. I wonder if there's a way to discover, on the writing end,
 when new data can be sent. For reading there's a comment about waiting
 for the latch...

It's symmetric.  The idea is that you try to read or write data;
should that fail, you wait on your latch and try again when it's set.
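
On the sending side, that comes out looking roughly like this (a sketch;
outqh is a shm_mq_handle the writer has already attached):

shm_mq_result res;

for (;;)
{
	res = shm_mq_send(outqh, nbytes, data, true);	/* nowait = true */
	if (res == SHM_MQ_SUCCESS)
		break;					/* message fully queued */
	if (res == SHM_MQ_DETACHED)
		ereport(ERROR,
				(errmsg("could not send message: receiver has detached")));
	/* SHM_MQ_WOULD_BLOCK: the reader sets our latch as it drains data */
	WaitLatch(&MyProc->procLatch, WL_LATCH_SET, 0);
	ResetLatch(&MyProc->procLatch);
	CHECK_FOR_INTERRUPTS();
}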

 Couple of questions:
 * Some introductory comments about the concurrency approach would be
   good.

Not sure exactly what to write.

 * shm_mq_attach() is talking about BackgroundWorkerHandles - is there a
   limitation that a bgworker has to be involved?

No.  The only point of that is that someone might want to start
writing into a message queue before the reader has connected, or
reading before the writer has connected.  If they do that and the
counterparty never attaches, they'll hang forever.  You can handle
that by writing your own code to make sure both processes attach, or you
can pass a BackgroundWorkerHandle and let the queue machinery return an
error if the worker never gets as far as attaching.
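
In code, the two cases look like this (sketch; mq, seg, and wh are the
queue, segment, and worker handle you already have):

shm_mq_handle *mqh;

/* with a handle: waits can fail cleanly if the worker never starts */
mqh = shm_mq_attach(mq, seg, wh);

/* without one: also fine, but a peer that never shows up means waiting
 * forever unless you arrange your own notification */
mqh = shm_mq_attach(mq, seg, NULL);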

Re: [HACKERS] shared memory message queues

2013-11-19 Thread Robert Haas
On Tue, Nov 19, 2013 at 12:33 AM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 * on-dsm-detach-v2.patch
 It reminded me of the hook registration/invocation mechanism in apache/httpd.
 It defines five levels for invocation order (REALLY_FIRST, FIRST, MIDDLE,
 LAST, REALLY_LAST), but these are aliases for integer values; in other words,
 it just kicks the hooks in order of the priority value associated with each
 function pointer.
 This flexibility may make sense. We may want to control the order in which
 resources are released in a more fine-grained way in the future. For example,
 isn't it a problem if we have only two levels when something stored in a
 queue needs to be released earlier than the queue itself?
 I'm not 100% certain about this suggestion, because on_shmem_exit is a hook
 that does not host so many callbacks; an extension could of course implement
 its own ordering within the SHMEM_EXIT_EARLY or SHMEM_EXIT_LATE stage.

I don't really see much point in adding more flexibility here until we
need it, but I can imagine that we might someday need it, for reasons
that are not now obvious.

 * shm-toc-v1.patch

 From my experience, it makes sense to put a magic number at the tail of the
 toc segment to detect shared-memory overruns. It helps to find bugs that are
 otherwise hard to find because of concurrent jobs.
 Probably shm_toc_freespace() is the place to put the assertion.

 How many entries is shm_toc_lookup() assumed to chain through?
 It performs a linear search from the head of the shm_toc segment, and that
 is a prerequisite for lock-less reference, isn't it?
 Is it a problem if the shm_toc hosts many entries, like 100 or 1000?
 Or is that not an expected usage?

It is not an expected usage.  In typical usage, I expect that the
number of TOC entries will be about N+K, where K is a small constant
(< 10) and N is the number of cooperating parallel workers.  It's
possible that we'll someday be in a position to leverage 100 or 1000
parallel workers on the same task, but I don't expect it to be soon.
And, actually, I doubt that revising the data structure would pay off
at N=100.  At N=1000, maybe.  At N=10000, probably.  But we are
*definitely* not going to need that kind of scale any time soon, and I
don't think it makes sense to design a complex data structure to
handle that case when there are so many more basic problems that need
to be solved before we can go there.
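
For context, the lookup really is just a linear scan over the entry array,
roughly like this (paraphrasing the patch, not quoting it):

uint64		nentry;
uint64		i;

nentry = toc->toc_nentry;
pg_read_barrier();			/* entries are fully published before nentry */
for (i = 0; i < nentry; ++i)
{
	if (toc->toc_entry[i].key == key)
		return ((char *) toc) + toc->toc_entry[i].offset;
}
return NULL;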

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] shared memory message queues

2013-11-18 Thread Kohei KaiGai
Hello,

I tried to look at patches #1 and #2 first; I shall review the rest
later.

* basic checks
All the patches (not only #1 and #2) can be applied without any problem to
the latest master branch. The build succeeded with -Werror.
The regression tests work fine on core and contrib/test_shm_mq.

* on-dsm-detach-v2.patch
It reminded me of the hook registration/invocation mechanism in apache/httpd.
It defines five levels for invocation order (REALLY_FIRST, FIRST, MIDDLE,
LAST, REALLY_LAST), but these are aliases for integer values; in other words,
it just kicks the hooks in order of the priority value associated with each
function pointer.
This flexibility may make sense. We may want to control the order in which
resources are released in a more fine-grained way in the future. For example,
isn't it a problem if we have only two levels when something stored in a
queue needs to be released earlier than the queue itself?
I'm not 100% certain about this suggestion, because on_shmem_exit is a hook
that does not host so many callbacks; an extension could of course implement
its own ordering within the SHMEM_EXIT_EARLY or SHMEM_EXIT_LATE stage.

* shm-toc-v1.patch

From my experience, it makes sense to put a magic number at the tail of the
toc segment to detect shared-memory overruns. It helps to find bugs that are
otherwise hard to find because of concurrent jobs.
Probably shm_toc_freespace() is the place to put the assertion.
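
For example, something like this (all of the names below are mine, just
for illustration):

/* illustration only; names are made up */
#define SHM_TOC_GUARD_MAGIC		0x544f4347

uint32	   *guard;

/* at create time, write the guard word into the last four bytes */
guard = (uint32 *) (((char *) toc) + total_bytes - sizeof(uint32));
*guard = SHM_TOC_GUARD_MAGIC;

/* in shm_toc_freespace() and friends, check that it is still intact */
Assert(*guard == SHM_TOC_GUARD_MAGIC);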

How many entries is shm_toc_lookup() assumed to chain through?
It performs a linear search from the head of the shm_toc segment, and that
is a prerequisite for lock-less reference, isn't it?
Is it a problem if the shm_toc hosts many entries, like 100 or 1000?
Or is that not an expected usage?

Patches #3 and #4 will be looked at later...

Thanks,

2013/11/8 Robert Haas robertmh...@gmail.com:
 On Wed, Nov 6, 2013 at 9:48 AM, Peter Eisentraut pete...@gmx.net wrote:
 This patch needs to be rebased.

 Thanks.  You didn't mention which of the four needed rebasing, but it
 seems that it's just the first one.  New version of just that patch
 attached; please use the prior versions of the remaining three.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] shared memory message queues

2013-11-06 Thread Peter Eisentraut
This patch needs to be rebased.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers