Re: [HACKERS] Portability issues in shm_mq

2014-03-20 Thread Robert Haas
On Tue, Mar 18, 2014 at 4:14 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 18, 2014 at 12:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Meh.  I think you're putting a bit too much faith in your ability to
 predict the locus of bugs that you think aren't there.

 Well, I'm open to suggestions.

 As a suggestion: it'd be worth explicitly testing zero-byte and one-byte
 messages, those being obvious edge cases.  Then, say, randomly chosen
 lengths in the range 100-1000; this would help ferret out odd-length
 issues.  And something with message sizes larger than the queue size.

All right, done.  Let's see if that tickles any edge cases we haven't
hit before.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Robert Haas
On Mon, Mar 17, 2014 at 11:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 OK, I tried this out.  The major complication that cropped up was
 that, if we make the length word always a Size but align the buffer to
 MAXIMUM_ALIGNOF, then the length word might get split if sizeof(Size)
  MAXIMUM_ALIGNOF.

 Hmm ... do we support any platforms where that's actually the case?
 It's possible I guess but I don't know of any offhand.  The reverse
 case is real, but I'm not sure about this one.

Man, I don't have a clue what exists.  I certainly wouldn't want to
discourage someone from making a 64-bit machine that only requires
32-bit alignment for 8-bit values, but color me clueless as to whether
such things are really out there.

 That doesn't look too bad, but required changing a
 couple of if statements into while loops, and changing around the
 structure of a shm_mq_handle a bit.  See attached.

 Would it get noticeably simpler or faster if you omitted support for
 the sizeof(Size)  MAXIMUM_ALIGNOF case?  It looks like perhaps not,
 but if we were paying anything much I'd be tempted to just put in
 a static assert to the contrary and see if anyone complains.

Not really.  I installed a fast path into the receive code for the
common case where the length word isn't split, which will always be
true on platforms where sizeof(Size) = MAXIMUM_ALIGNOF and usually
true otherwise.  We could ditch the slow path completely by ignoring
that case, but it's not all that much code.  On the sending side, the
logic is pretty trivial, so I definitely don't feel bad about carrying
that.

The thing I kind of like about this approach is that it makes the code
fully independent of the relationship between MAXIMUM_ALIGNOF and
sizeof(Size).  If the former is smaller, we'll write the length word
in chunks if needed; if the latter is smaller, we'll insert useless
padding bytes.  In the perhaps-common case where they're identical, it
all works as before, except for minor space savings on 32-bit
platforms.  I was a little annoyed by having to write the extra code
and thought about objecting to this whole line of attack yet again,
but I think it's actually likely for the best.  If we start persuading
ourselves that certain cases don't need to work, and rely on that
throughout the backend, and then such machines crop up and we want to
support them, we'll have a deep hole to climb out of.  With this
approach, there might be bugs, of course, but it's a lot easier to fix
a bug that only occurs on a new platform than it is to reconsider the
whole design in light of a new platform.

 BTW, can't we drop the MAXALIGN64 stuff again?

It's still used by xlog.c.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Mar 17, 2014 at 11:09 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Would it get noticeably simpler or faster if you omitted support for
 the sizeof(Size)  MAXIMUM_ALIGNOF case?  It looks like perhaps not,
 but if we were paying anything much I'd be tempted to just put in
 a static assert to the contrary and see if anyone complains.

 Not really.  I installed a fast path into the receive code for the
 common case where the length word isn't split, which will always be
 true on platforms where sizeof(Size) = MAXIMUM_ALIGNOF and usually
 true otherwise.  We could ditch the slow path completely by ignoring
 that case, but it's not all that much code.  On the sending side, the
 logic is pretty trivial, so I definitely don't feel bad about carrying
 that.

Works for me.

 The thing I kind of like about this approach is that it makes the code
 fully independent of the relationship between MAXIMUM_ALIGNOF and
 sizeof(Size).

Yeah.  If it's not costing us much to support both cases, let's do so.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Robert Haas
On Tue, Mar 18, 2014 at 10:44 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 The thing I kind of like about this approach is that it makes the code
 fully independent of the relationship between MAXIMUM_ALIGNOF and
 sizeof(Size).

 Yeah.  If it's not costing us much to support both cases, let's do so.

OK, committed as posted.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Robert Haas
All right.

On Fri, Mar 14, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Whilst setting up a buildfarm member on an old, now-spare Mac, I was
 somewhat astonished to discover that contrib/test_shm_mq crashes thus:
 TRAP: FailedAssertion(!(rb = sizeof(uint64)), File: shm_mq.c, Line: 429)
 but only in UTF8 locales, not in C locale.  You'd have bet your last
 dollar that that code was locale-independent, right?

First, can you retest this with the latest code?

 Recommendations:

 1. Reduce the random() multiplier from 96 to 95.  In multibyte encodings
 other than UTF8, chr() would flat out reject values of 128, so this test
 case is unportable.

 2. Why in the world is the test case testing exactly one message length
 that happens to be a multiple of 8?  Put some randomness into that,
 instead.

 3. Either you need to work a bit harder at forcing alignment, or you need
 to fix shm_mq_receive to cope with split message length words.

 4. The header comment for shm_mq_receive_bytes may once have described its
 API accurately, but that appears to have been a long long time ago in a
 galaxy far far away.  Please fix.

On these recommendations, I believe that #3 and #4 are now dealt with.
 That leaves #1 and #2.  #1 is of course easy, but I think we should
do them both together.

If we want to inject some randomness into the test, which parameters
do we want to randomize and over what ranges?  Also, if a buildfarm
critter falls over, how will we know what value triggered the failure?
 It's tempting to instead add one or more tests that we specifically
choose to have values we think are likely to exercise
platform-specific differences or otherwise find bugs - e.g. just add a
second test where the queue size and message length are both odd.  And
maybe at a test where the queue is smaller than the message size, so
that every message wraps (multiple times?).

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 First, can you retest this with the latest code?

Yeah, on it now.

 If we want to inject some randomness into the test, which parameters
 do we want to randomize and over what ranges?

I think the message length is the only particularly interesting
parameter.  It'd be nice if the length varied *within* a test, but
that would take rather considerable restructuring, so maybe it's
not worth the trouble.

 Also, if a buildfarm
 critter falls over, how will we know what value triggered the failure?

Maybe we won't, but I think knowing that it does fail on platform X is
likely to be enough to find the problem.

  It's tempting to instead add one or more tests that we specifically
 choose to have values we think are likely to exercise
 platform-specific differences or otherwise find bugs - e.g. just add a
 second test where the queue size and message length are both odd.

Meh.  I think you're putting a bit too much faith in your ability to
predict the locus of bugs that you think aren't there.

 maybe at a test where the queue is smaller than the message size, so
 that every message wraps (multiple times?).

Does the code support messages larger than the queue size?  If so, yes,
that case clearly oughta be tested.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Tom Lane
I wrote:
 Robert Haas robertmh...@gmail.com writes:
 First, can you retest this with the latest code?

 Yeah, on it now.

Early returns not good:

*** 
/Users/buildfarm/bf-data/HEAD/pgsql.93630/contrib/test_shm_mq/expected/test_shm_mq.out
  Tue Mar 18 12:00:18 2014
--- 
/Users/buildfarm/bf-data/HEAD/pgsql.93630/contrib/test_shm_mq/results/test_shm_mq.out
   Tue Mar 18 12:17:04 2014
***
*** 5,18 
  -- internal sanity tests fail.
  --
  SELECT test_shm_mq(32768, (select string_agg(chr(32+(random()*96)::int), '') 
from generate_series(1,400)), 1, 1);
!  test_shm_mq 
! -
!  
! (1 row)
! 
  SELECT test_shm_mq_pipelined(16384, (select 
string_agg(chr(32+(random()*96)::int), '') from generate_series(1,27)), 
200, 3);
!  test_shm_mq_pipelined 
! ---
!  
! (1 row)
! 
--- 5,12 
  -- internal sanity tests fail.
  --
  SELECT test_shm_mq(32768, (select string_agg(chr(32+(random()*96)::int), '') 
from generate_series(1,400)), 1, 1);
! ERROR:  message corrupted
! DETAIL:  The original message was 400 bytes but the final message is 
7492059346764176 bytes.
  SELECT test_shm_mq_pipelined(16384, (select 
string_agg(chr(32+(random()*96)::int), '') from generate_series(1,27)), 
200, 3);
! ERROR:  message corrupted
! DETAIL:  The original message was 27 bytes but the final message is 
7492059347033776 bytes.


This is C locale on a 32-bit machine, so you'll likely be seeing the same
complaint in already-online buildfarm members.

Note that size_t is definitely not int64 on this machine, so it looks to
me like your int64-ectomy was incomplete.  Those message lengths should
be impossible no matter what on this hardware.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Tom Lane
I wrote:
 Early returns not good:

Also, these compiler messages are probably relevant:

ccache gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-fpic -I. -I. -I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2  
-I/usr/include/et  -c -o test.o test.c
test.c: In function 'test_shm_mq':
test.c:89:3: warning: passing argument 2 of 'shm_mq_receive' from incompatible 
pointer type [enabled by default]
In file included from test_shm_mq.h:18:0,
 from test.c:19:
../../src/include/storage/shm_mq.h:61:22: note: expected 'Size *' but argument 
is of type 'uint64 *'
test.c: In function 'test_shm_mq_pipelined':
test.c:198:4: warning: passing argument 2 of 'shm_mq_receive' from incompatible 
pointer type [enabled by default]
In file included from test_shm_mq.h:18:0,
 from test.c:19:
../../src/include/storage/shm_mq.h:61:22: note: expected 'Size *' but argument 
is of type 'uint64 *'
ccache gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-fpic -I. -I. -I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2  
-I/usr/include/et  -c -o setup.o setup.c
ccache gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
-Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
-Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
-fpic -I. -I. -I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2  
-I/usr/include/et  -c -o worker.o worker.c
worker.c: In function 'copy_messages':
worker.c:193:3: warning: passing argument 2 of 'shm_mq_receive' from 
incompatible pointer type [enabled by default]
In file included from worker.c:25:0:
../../src/include/storage/shm_mq.h:61:22: note: expected 'Size *' but argument 
is of type 'uint64 *'

I'm thinking maybe you just forgot to update the contrib module.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Robert Haas
On Tue, Mar 18, 2014 at 1:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 Early returns not good:

 Also, these compiler messages are probably relevant:

 ccache gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
 -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
 -fpic -I. -I. -I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2  
 -I/usr/include/et  -c -o test.o test.c
 test.c: In function 'test_shm_mq':
 test.c:89:3: warning: passing argument 2 of 'shm_mq_receive' from 
 incompatible pointer type [enabled by default]
 In file included from test_shm_mq.h:18:0,
  from test.c:19:
 ../../src/include/storage/shm_mq.h:61:22: note: expected 'Size *' but 
 argument is of type 'uint64 *'
 test.c: In function 'test_shm_mq_pipelined':
 test.c:198:4: warning: passing argument 2 of 'shm_mq_receive' from 
 incompatible pointer type [enabled by default]
 In file included from test_shm_mq.h:18:0,
  from test.c:19:
 ../../src/include/storage/shm_mq.h:61:22: note: expected 'Size *' but 
 argument is of type 'uint64 *'
 ccache gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
 -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
 -fpic -I. -I. -I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2  
 -I/usr/include/et  -c -o setup.o setup.c
 ccache gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith 
 -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute 
 -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g 
 -fpic -I. -I. -I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2  
 -I/usr/include/et  -c -o worker.o worker.c
 worker.c: In function 'copy_messages':
 worker.c:193:3: warning: passing argument 2 of 'shm_mq_receive' from 
 incompatible pointer type [enabled by default]
 In file included from worker.c:25:0:
 ../../src/include/storage/shm_mq.h:61:22: note: expected 'Size *' but 
 argument is of type 'uint64 *'

 I'm thinking maybe you just forgot to update the contrib module.

Well, I definitely forgot that.  I'll count myself lucky if that's the
only problem.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Andres Freund
On 2014-03-18 13:31:47 -0400, Robert Haas wrote:
 Well, I definitely forgot that.  I'll count myself lucky if that's the
 only problem.

One minor thing missing, patch attached.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/contrib/test_shm_mq/test.c b/contrib/test_shm_mq/test.c
index dba5e69..5ff1e9a 100644
--- a/contrib/test_shm_mq/test.c
+++ b/contrib/test_shm_mq/test.c
@@ -254,12 +254,12 @@ verify_message(Size origlen, char *origdata, Size newlen, char *newdata)
 	if (origlen != newlen)
 		ereport(ERROR,
 (errmsg(message corrupted),
- errdetail(The original message was  UINT64_FORMAT  bytes but the final message is  UINT64_FORMAT  bytes.,
+ errdetail(The original message was %zu bytes but the final message is %zu bytes.,
 	 origlen, newlen)));
 
 	for (i = 0; i  origlen; ++i)
 		if (origdata[i] != newdata[i])
 			ereport(ERROR,
 	(errmsg(message corrupted),
-	 errdetail(The new and original messages differ at byte  UINT64_FORMAT  of  UINT64_FORMAT ., i, origlen)));
+	 errdetail(The new and original messages differ at byte %zu of %zu., i, origlen)));
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Tom Lane
and...@anarazel.de (Andres Freund) writes:
 On 2014-03-18 13:31:47 -0400, Robert Haas wrote:
 Well, I definitely forgot that.  I'll count myself lucky if that's the
 only problem.

 One minor thing missing, patch attached.

setup_dynamic_shared_memory needed some more hacking too.  Committed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Tom Lane
After the last round of changes, I can confirm that my original test with
UTF8 locale works, and my HPPA box is happy too.  We could still stand
to improve the regression test though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Robert Haas
On Tue, Mar 18, 2014 at 12:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  It's tempting to instead add one or more tests that we specifically
 choose to have values we think are likely to exercise
 platform-specific differences or otherwise find bugs - e.g. just add a
 second test where the queue size and message length are both odd.

 Meh.  I think you're putting a bit too much faith in your ability to
 predict the locus of bugs that you think aren't there.

Well, I'm open to suggestions.

 maybe at a test where the queue is smaller than the message size, so
 that every message wraps (multiple times?).

 Does the code support messages larger than the queue size?  If so, yes,
 that case clearly oughta be tested.

Yep.  You should be able to send and receive any message that fits
within MaxAllocSize on even the smallest possible queue. Performance
may suck, but if that's an issue for you then don't use such a blasted
small queue.  The intended use of this is to stream (potentially long)
error messages or (potentially long and numerous) tuples between
cooperating backends without having to preallocate space for all the
data you want to send (which won't be any more feasible in a DSM than
it would be in the main segment).

Actually, you should be able to send or receive arbitrarily large
messages if you change the MemoryContextAlloc call in shm_mq.c to
MemoryContextAllocHuge, but I can't see any compelling reason to do
that.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-18 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Mar 18, 2014 at 12:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Meh.  I think you're putting a bit too much faith in your ability to
 predict the locus of bugs that you think aren't there.

 Well, I'm open to suggestions.

As a suggestion: it'd be worth explicitly testing zero-byte and one-byte
messages, those being obvious edge cases.  Then, say, randomly chosen
lengths in the range 100-1000; this would help ferret out odd-length
issues.  And something with message sizes larger than the queue size.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-17 Thread Robert Haas
On Sun, Mar 16, 2014 at 10:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 But I think there's another possible problem here.  In order for reads
 from the buffer not to suffer alignment problems, the chunk size for
 reads and writes from the buffer needs to be MAXIMUM_ALIGNOF (or some
 multiple of it).  And in order to avoid a great deal of additional and
 unwarranted complexity, the size of the message word also needs to be
 MAXIMUM_ALIGNOF (or some multiple of it).  So the message word can
 only be of size 4 if MAXIMUM_ALIGNOF is also 4.  IOW, I think your
 approach is going to run into trouble on any system where
 sizeof(Size)==4 but MAXIMUM_ALIGNOF==8.

 Well, it will result in padding space when you maxalign the length word,
 but I don't see why it wouldn't work; and it would certainly be no less
 efficient than what's there today.

Well, the problem is with this:

/* Write the message length into the buffer. */
if (!mqh-mqh_did_length_word)
{
res = shm_mq_send_bytes(mqh, sizeof(uint64), nbytes, nowait,
bytes_written);

If I change nbytes to be of type Size, and the second argument to
sizeof(Size), then it's wrong whenever sizeof(Size) % MAXIMUM_ALIGNOF
!= 0.  I could do something like:

union
{
char pad[MAXIMUM_ALIGNOF];
Size val;
} padded_size;

padded_size.val = nbytes;
res = shm_mq_send_bytes(mqh, sizeof(padded_size), padded_size,
nowait, bytes_written);

...but that probably *is* less efficient, and it's certainly a lot
uglier; a similar hack will be needed when extracting the length word,
and assertions elsewhere will need adjustment.  I wonder if it
wouldn't be better to adjust the external API to use Size just for
consistency but, internally to the module, keep using 8 byte sizes
within the buffer.  Really, I think that would boil down to going
through and making sure that we use TYPEALIGN(...,sizeof(uint64))
everywhere instead of MAXALIGN(), which doesn't seem like a big deal.
On the third hand, maybe there are or will be platforms out there
where MAXIMUM_ALIGNOF  8.  If so, it's probably best to bite the
bullet and allow for padding now, so we don't have to monkey with this
again later.

Sorry for belaboring this point, but I want to make sure I only need
to fix this once.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Mar 16, 2014 at 10:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, it will result in padding space when you maxalign the length word,
 but I don't see why it wouldn't work; and it would certainly be no less
 efficient than what's there today.

 Well, the problem is with this:

 /* Write the message length into the buffer. */
 if (!mqh-mqh_did_length_word)
 {
 res = shm_mq_send_bytes(mqh, sizeof(uint64), nbytes, nowait,
 bytes_written);

 If I change nbytes to be of type Size, and the second argument to
 sizeof(Size), then it's wrong whenever sizeof(Size) % MAXIMUM_ALIGNOF
 != 0.

Well, you need to maxalign the number of bytes physically inserted into
the queue.  Doesn't shm_mq_send_bytes do that?  Where do you do the
maxaligning of the message payload data, when the payload is odd-length?
I would have expected some logic like copy N bytes but then advance
the pointer by maxalign(N).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-17 Thread Robert Haas
On Mon, Mar 17, 2014 at 12:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Mar 16, 2014 at 10:45 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Well, it will result in padding space when you maxalign the length word,
 but I don't see why it wouldn't work; and it would certainly be no less
 efficient than what's there today.

 Well, the problem is with this:

 /* Write the message length into the buffer. */
 if (!mqh-mqh_did_length_word)
 {
 res = shm_mq_send_bytes(mqh, sizeof(uint64), nbytes, nowait,
 bytes_written);

 If I change nbytes to be of type Size, and the second argument to
 sizeof(Size), then it's wrong whenever sizeof(Size) % MAXIMUM_ALIGNOF
 != 0.

 Well, you need to maxalign the number of bytes physically inserted into
 the queue.  Doesn't shm_mq_send_bytes do that?  Where do you do the
 maxaligning of the message payload data, when the payload is odd-length?
 I would have expected some logic like copy N bytes but then advance
 the pointer by maxalign(N).

Oh, yeah.  Duh.  Clearly my brain isn't working today.  Hmm, so maybe
this will be fairly simple... will try it out.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-17 Thread Robert Haas
On Mon, Mar 17, 2014 at 1:26 PM, Robert Haas robertmh...@gmail.com wrote:
 Oh, yeah.  Duh.  Clearly my brain isn't working today.  Hmm, so maybe
 this will be fairly simple... will try it out.

OK, I tried this out.  The major complication that cropped up was
that, if we make the length word always a Size but align the buffer to
MAXIMUM_ALIGNOF, then the length word might get split if sizeof(Size)
 MAXIMUM_ALIGNOF.  That doesn't look too bad, but required changing a
couple of if statements into while loops, and changing around the
structure of a shm_mq_handle a bit.  See attached.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/src/backend/storage/ipc/shm_mq.c b/src/backend/storage/ipc/shm_mq.c
index 2d298a3..b31f4fb 100644
--- a/src/backend/storage/ipc/shm_mq.c
+++ b/src/backend/storage/ipc/shm_mq.c
@@ -72,7 +72,7 @@ struct shm_mq
 	PGPROC	   *mq_sender;
 	uint64		mq_bytes_read;
 	uint64		mq_bytes_written;
-	uint64		mq_ring_size;
+	Size		mq_ring_size;
 	bool		mq_detached;
 	uint8		mq_ring_offset;
 	char		mq_ring[FLEXIBLE_ARRAY_MEMBER];
@@ -103,15 +103,16 @@ struct shm_mq
  * locally by copying the chunks into a backend-local buffer.  mqh_buffer is
  * the buffer, and mqh_buflen is the number of bytes allocated for it.
  *
- * mqh_partial_message_bytes, mqh_expected_bytes, and mqh_did_length_word
+ * mqh_partial_message_bytes, mqh_expected_bytes, and mqh_length_word_complete
  * are used to track the state of non-blocking operations.  When the caller
  * attempts a non-blocking operation that returns SHM_MQ_WOULD_BLOCK, they
  * are expected to retry the call at a later time with the same argument;
  * we need to retain enough state to pick up where we left off.
- * mqh_did_length_word tracks whether we read or wrote the length word,
- * mqh_partial_message_bytes tracks the number of payload bytes read or
- * written, and mqh_expected_bytes - which is used only for reads - tracks
- * the expected total size of the payload.
+ * mqh_length_word_complete tracks whether we are done sending or receiving
+ * (whichever we're doing) the entire length word.  mqh_partial_bytes tracks
+ * the number of bytes read or written for either the length word or the
+ * message itself, and mqh_expected_bytes - which is used only for reads -
+ * tracks the expected total size of the payload.
  *
  * mqh_counterparty_attached tracks whether we know the counterparty to have
  * attached to the queue at some previous point.  This lets us avoid some
@@ -128,25 +129,25 @@ struct shm_mq_handle
 	dsm_segment *mqh_segment;
 	BackgroundWorkerHandle *mqh_handle;
 	char	   *mqh_buffer;
-	uint64		mqh_buflen;
-	uint64		mqh_consume_pending;
-	uint64		mqh_partial_message_bytes;
-	uint64		mqh_expected_bytes;
-	bool		mqh_did_length_word;
+	Size		mqh_buflen;
+	Size		mqh_consume_pending;
+	Size		mqh_partial_bytes;
+	Size		mqh_expected_bytes;
+	bool		mqh_length_word_complete;
 	bool		mqh_counterparty_attached;
 	MemoryContext mqh_context;
 };
 
-static shm_mq_result shm_mq_send_bytes(shm_mq_handle *mq, uint64 nbytes,
-  void *data, bool nowait, uint64 *bytes_written);
-static shm_mq_result shm_mq_receive_bytes(shm_mq *mq, uint64 bytes_needed,
-	 bool nowait, uint64 *nbytesp, void **datap);
+static shm_mq_result shm_mq_send_bytes(shm_mq_handle *mq, Size nbytes,
+  void *data, bool nowait, Size *bytes_written);
+static shm_mq_result shm_mq_receive_bytes(shm_mq *mq, Size bytes_needed,
+	 bool nowait, Size *nbytesp, void **datap);
 static bool shm_mq_wait_internal(volatile shm_mq *mq, PGPROC * volatile *ptr,
 	 BackgroundWorkerHandle *handle);
 static uint64 shm_mq_get_bytes_read(volatile shm_mq *mq, bool *detached);
-static void shm_mq_inc_bytes_read(volatile shm_mq *mq, uint64 n);
+static void shm_mq_inc_bytes_read(volatile shm_mq *mq, Size n);
 static uint64 shm_mq_get_bytes_written(volatile shm_mq *mq, bool *detached);
-static void shm_mq_inc_bytes_written(volatile shm_mq *mq, uint64 n);
+static void shm_mq_inc_bytes_written(volatile shm_mq *mq, Size n);
 static shm_mq_result shm_mq_notify_receiver(volatile shm_mq *mq);
 static void shm_mq_detach_callback(dsm_segment *seg, Datum arg);
 
@@ -163,7 +164,7 @@ shm_mq *
 shm_mq_create(void *address, Size size)
 {
 	shm_mq	   *mq = address;
-	uint64		data_offset = MAXALIGN(offsetof(shm_mq, mq_ring));
+	Size		data_offset = MAXALIGN(offsetof(shm_mq, mq_ring));
 
 	/* If the size isn't MAXALIGN'd, just discard the odd bytes. */
 	size = MAXALIGN_DOWN(size);
@@ -289,8 +290,8 @@ shm_mq_attach(shm_mq *mq, dsm_segment *seg, BackgroundWorkerHandle *handle)
 	mqh-mqh_buflen = 0;
 	mqh-mqh_consume_pending = 0;
 	mqh-mqh_context = CurrentMemoryContext;
-	mqh-mqh_partial_message_bytes = 0;
-	mqh-mqh_did_length_word = false;
+	mqh-mqh_partial_bytes = 0;
+	mqh-mqh_length_word_complete = false;
 	mqh-mqh_counterparty_attached = false;
 
 	if (seg != NULL)
@@ -314,41 +315,48 @@ shm_mq_attach(shm_mq *mq, dsm_segment *seg, 

Re: [HACKERS] Portability issues in shm_mq

2014-03-17 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 OK, I tried this out.  The major complication that cropped up was
 that, if we make the length word always a Size but align the buffer to
 MAXIMUM_ALIGNOF, then the length word might get split if sizeof(Size)
  MAXIMUM_ALIGNOF.

Hmm ... do we support any platforms where that's actually the case?
It's possible I guess but I don't know of any offhand.  The reverse
case is real, but I'm not sure about this one.

 That doesn't look too bad, but required changing a
 couple of if statements into while loops, and changing around the
 structure of a shm_mq_handle a bit.  See attached.

Would it get noticeably simpler or faster if you omitted support for
the sizeof(Size)  MAXIMUM_ALIGNOF case?  It looks like perhaps not,
but if we were paying anything much I'd be tempted to just put in
a static assert to the contrary and see if anyone complains.

BTW, can't we drop the MAXALIGN64 stuff again?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Fri, Mar 14, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 How is that leading to a crash?  Well, this machine is 32-bit, so MAXALIGN
 is only 4.  This means it is possible for an odd-length message cum
 message length word to not exactly divide the size of the shared memory
 ring buffer, resulting in cases where an 8-byte message length word is
 wrapped around the end of the buffer.

 Argh.  I think I forced the size of the buffer to be MAXALIGN'd, but
 what it really needs is to be a multiple of the size of uint64.

After sleeping on it, I think what you're proposing here is to double down
on a wrong design decision.  ISTM you should change the message length
words to be size_t (or possibly ssize_t, if you're depending on signed
arithmetic), which would let you keep using MAXALIGN as the alignment
macro.  There is absolutely no benefit, either for performance or code
readability, in forcing 32-bit machines to use 64-bit message length
words.  Indeed, by not using the same alignment macros as everywhere else
and not being able to use %zu for debug printouts, I think the only real
effect you're producing is to make the DSM/MQ stuff more and more randomly
unlike the rest of Postgres.  Please reconsider while it's still not too
late to change those APIs.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-16 Thread Robert Haas
On Sun, Mar 16, 2014 at 11:26 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Fri, Mar 14, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 How is that leading to a crash?  Well, this machine is 32-bit, so MAXALIGN
 is only 4.  This means it is possible for an odd-length message cum
 message length word to not exactly divide the size of the shared memory
 ring buffer, resulting in cases where an 8-byte message length word is
 wrapped around the end of the buffer.

 Argh.  I think I forced the size of the buffer to be MAXALIGN'd, but
 what it really needs is to be a multiple of the size of uint64.

 After sleeping on it, I think what you're proposing here is to double down
 on a wrong design decision.  ISTM you should change the message length
 words to be size_t (or possibly ssize_t, if you're depending on signed
 arithmetic), which would let you keep using MAXALIGN as the alignment
 macro.  There is absolutely no benefit, either for performance or code
 readability, in forcing 32-bit machines to use 64-bit message length
 words.  Indeed, by not using the same alignment macros as everywhere else
 and not being able to use %zu for debug printouts, I think the only real
 effect you're producing is to make the DSM/MQ stuff more and more randomly
 unlike the rest of Postgres.  Please reconsider while it's still not too
 late to change those APIs.

Hmm.  That's not a terrible idea.  I think part of the reason I did it
this way because, although the size of an individual message can be
limited to size_t, the queue maintains a counter of the total number
of bytes ever sent and received, and that has to use 64-bit arithmetic
so it doesn't overflow (much as we do for LSNs).  It seemed simpler to
make all of the lengths uint64 rather than the lengths of individual
messages Size and the total number of bytes sent and received uint64.
However, that could probably be worked out.

But I think there's another possible problem here.  In order for reads
from the buffer not to suffer alignment problems, the chunk size for
reads and writes from the buffer needs to be MAXIMUM_ALIGNOF (or some
multiple of it).  And in order to avoid a great deal of additional and
unwarranted complexity, the size of the message word also needs to be
MAXIMUM_ALIGNOF (or some multiple of it).  So the message word can
only be of size 4 if MAXIMUM_ALIGNOF is also 4.  IOW, I think your
approach is going to run into trouble on any system where
sizeof(Size)==4 but MAXIMUM_ALIGNOF==8.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-16 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 But I think there's another possible problem here.  In order for reads
 from the buffer not to suffer alignment problems, the chunk size for
 reads and writes from the buffer needs to be MAXIMUM_ALIGNOF (or some
 multiple of it).  And in order to avoid a great deal of additional and
 unwarranted complexity, the size of the message word also needs to be
 MAXIMUM_ALIGNOF (or some multiple of it).  So the message word can
 only be of size 4 if MAXIMUM_ALIGNOF is also 4.  IOW, I think your
 approach is going to run into trouble on any system where
 sizeof(Size)==4 but MAXIMUM_ALIGNOF==8.

Well, it will result in padding space when you maxalign the length word,
but I don't see why it wouldn't work; and it would certainly be no less
efficient than what's there today.

I'll be quite happy to test the results on my old HPPA box, which has
exactly those properties, if you're worried about it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Portability issues in shm_mq

2014-03-15 Thread Robert Haas
On Fri, Mar 14, 2014 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Whilst setting up a buildfarm member on an old, now-spare Mac, I was
 somewhat astonished to discover that contrib/test_shm_mq crashes thus:
 TRAP: FailedAssertion(!(rb = sizeof(uint64)), File: shm_mq.c, Line: 429)
 but only in UTF8 locales, not in C locale.  You'd have bet your last
 dollar that that code was locale-independent, right?

 The reason appears to be that in the payload string generated with
 (select string_agg(chr(32+(random()*96)::int), '') from 
 generate_series(1,400))
 the chr() argument rounds up to 128 every so often.  In UTF8 encoding,
 that causes chr() to return a multibyte character instead of a single
 byte.  So, instead of always having a fixed payload string length of
 400 bytes, the payload length moves around a bit --- in a few trials
 I see anywhere from 400 to 409 bytes.

 How is that leading to a crash?  Well, this machine is 32-bit, so MAXALIGN
 is only 4.  This means it is possible for an odd-length message cum
 message length word to not exactly divide the size of the shared memory
 ring buffer, resulting in cases where an 8-byte message length word is
 wrapped around the end of the buffer.  shm_mq_receive_bytes makes no
 attempt to hide that situation from its caller, and happily returns just
 4 bytes with SHM_MQ_SUCCESS.  shm_mq_receive, on the other hand, is so
 confident that it will always get an indivisible length word that it just
 Asserts that that's the case.

Argh.  I think I forced the size of the buffer to be MAXALIGN'd, but
what it really needs is to be a multiple of the size of uint64.

 Recommendations:

 1. Reduce the random() multiplier from 96 to 95.  In multibyte encodings
 other than UTF8, chr() would flat out reject values of 128, so this test
 case is unportable.

Agreed.

 2. Why in the world is the test case testing exactly one message length
 that happens to be a multiple of 8?  Put some randomness into that,
 instead.

Good idea.  I think that started out as a performance test rather than
an integrity test, and I didn't think hard enough when revising it
about what would make a good integrity test.

 3. Either you need to work a bit harder at forcing alignment, or you need
 to fix shm_mq_receive to cope with split message length words.

The first one is what is intended.  I will look at it.

 4. The header comment for shm_mq_receive_bytes may once have described its
 API accurately, but that appears to have been a long long time ago in a
 galaxy far far away.  Please fix.

Ugh, looks like I forgot to update that when I introduced the
shm_mq_result return type.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Portability issues in shm_mq

2014-03-14 Thread Tom Lane
Whilst setting up a buildfarm member on an old, now-spare Mac, I was
somewhat astonished to discover that contrib/test_shm_mq crashes thus:
TRAP: FailedAssertion(!(rb = sizeof(uint64)), File: shm_mq.c, Line: 429)
but only in UTF8 locales, not in C locale.  You'd have bet your last
dollar that that code was locale-independent, right?

The reason appears to be that in the payload string generated with
(select string_agg(chr(32+(random()*96)::int), '') from generate_series(1,400))
the chr() argument rounds up to 128 every so often.  In UTF8 encoding,
that causes chr() to return a multibyte character instead of a single
byte.  So, instead of always having a fixed payload string length of
400 bytes, the payload length moves around a bit --- in a few trials
I see anywhere from 400 to 409 bytes.

How is that leading to a crash?  Well, this machine is 32-bit, so MAXALIGN
is only 4.  This means it is possible for an odd-length message cum
message length word to not exactly divide the size of the shared memory
ring buffer, resulting in cases where an 8-byte message length word is
wrapped around the end of the buffer.  shm_mq_receive_bytes makes no
attempt to hide that situation from its caller, and happily returns just
4 bytes with SHM_MQ_SUCCESS.  shm_mq_receive, on the other hand, is so
confident that it will always get an indivisible length word that it just
Asserts that that's the case.

Recommendations:

1. Reduce the random() multiplier from 96 to 95.  In multibyte encodings
other than UTF8, chr() would flat out reject values of 128, so this test
case is unportable.

2. Why in the world is the test case testing exactly one message length
that happens to be a multiple of 8?  Put some randomness into that,
instead.

3. Either you need to work a bit harder at forcing alignment, or you need
to fix shm_mq_receive to cope with split message length words.

4. The header comment for shm_mq_receive_bytes may once have described its
API accurately, but that appears to have been a long long time ago in a
galaxy far far away.  Please fix.


Also, while this is not directly your problem, it's becoming clear that we
don't have enough buildfarm coverage of not-64-bit platforms; this problem
would have been spotted awhile ago if we did.  I'm going to spin up a
couple of critters on old machines lying around my office.  We should
probably also encourage owners of existing critters to expand their test
coverage a bit, eg try locales other than C.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers