Re: [HACKERS] SerializedSnapshotData alignment

2017-03-01 Thread Noah Misch
On Mon, Feb 27, 2017 at 12:11:54PM +0530, Robert Haas wrote:
> On Mon, Feb 27, 2017 at 5:13 AM, Noah Misch  wrote:
> > Dear 7b4ac19 authors,
> >
> > Field ps_snapshot_data usually receives four-byte alignment within
> > ParallelIndexScanDescData, but it contains the eight-byte whenTaken field.
> > The select_parallel test dies with SIGBUS on "Oracle Solaris 10 1/13
> > s10s_u11wos_24a SPARC", building with gcc 4.9.2.  Some credible fixes:
> >
> > 1. Move the SerializedSnapshotData declaration from snapmgr.c to snapmgr.h 
> > and
> >declare the ps_snapshot_data field to be of type SerializedSnapshotData.
> >Probably also add a field "TransactionId xids[FLEXIBLE_ARRAY_MEMBER]" to
> >SerializedSnapshotData, to assert the variable-length nature.
> >
> > 2. Change "char ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER]" to "int64 ...".  I
> >have attached this in SerializedSnapshot-int64-v1.patch.
> >
> > 3. Change no declarations, and make snapmgr.c memcpy() the
> >SerializedSnapshotData through a local buffer.  I have attached this as
> >SerializedSnapshot-memcpy-v1.patch.
> >
> > I like (2) well enough, but I don't see that technique used elsewhere in the
> > tree.  (1) is more typical of PostgreSQL, though I personally like it when
> > structs can stay private to a file.  (3) is also well-attested, particularly
> > in xlog replay code.  I am leaning toward (2).  Other opinions?
> 
> In my imagination, we were already doing #3, so I'd probably favor
> that approach.

Pushed that way.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SerializedSnapshotData alignment

2017-03-01 Thread Noah Misch
On Mon, Feb 27, 2017 at 03:08:35PM +1300, Thomas Munro wrote:
> On Mon, Feb 27, 2017 at 2:18 PM, Noah Misch  wrote:
> > I wondered the same thing; if nothing else, why don't protosciurus and
> > castoroides fail the same way?  They do use older compilers, "Sun C 5.10
> > SunOS_sparc 2009/06/03" and gcc 3.4.3.  I have "Sun C 5.12 SunOS_sparc
> > 2011/11/16" and gcc 4.9.2, both of which are alignment-sensitive in several
> > configurations, according to the attached test program.  However, in a 
> > 32-bit
> > build with this Sun C, I don't get alignment-related bus errors.  (Those
> > animals build 64-bit, so this isn't the full story.)
> 
> It's been a while but I seem to recall that Sun C defaulted to a
> -xmemalign setting that tolerated misaligned reads in 32 bit builds,
> but used a different default on 64 bit builds.  "Solaris Application
> Programming" 5.8.5 seems to confirm this:  "For 32-bit applications,
> since Sun Studio 9, the default is for the compiler to assume 8-byte
> alignment and to trap and correct any misaligned data accesses.  For
> 64-bit applications, the compiler assumes 8-byte alignment, but the
> application will SIGBUS on a misaligned access."
> 
> Yet castoroides seems to be building with -m64, so that's not the
> explanation.  Could it be correctly aligned by coincidence?

Coincidental alignment was it.  ps_snapshot_data has offset 16 in a 64-bit
build and offset 12 in a 32-bit build, so it is well-aligned in 64-bit only.
I was building a 32-bit PostgreSQL; when I build a 64-bit PostgreSQL, I no
longer get the SIGBUS.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SerializedSnapshotData alignment

2017-02-26 Thread Robert Haas
On Mon, Feb 27, 2017 at 5:13 AM, Noah Misch  wrote:
> Dear 7b4ac19 authors,
>
> Field ps_snapshot_data usually receives four-byte alignment within
> ParallelIndexScanDescData, but it contains the eight-byte whenTaken field.
> The select_parallel test dies with SIGBUS on "Oracle Solaris 10 1/13
> s10s_u11wos_24a SPARC", building with gcc 4.9.2.  Some credible fixes:
>
> 1. Move the SerializedSnapshotData declaration from snapmgr.c to snapmgr.h and
>declare the ps_snapshot_data field to be of type SerializedSnapshotData.
>Probably also add a field "TransactionId xids[FLEXIBLE_ARRAY_MEMBER]" to
>SerializedSnapshotData, to assert the variable-length nature.
>
> 2. Change "char ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER]" to "int64 ...".  I
>have attached this in SerializedSnapshot-int64-v1.patch.
>
> 3. Change no declarations, and make snapmgr.c memcpy() the
>SerializedSnapshotData through a local buffer.  I have attached this as
>SerializedSnapshot-memcpy-v1.patch.
>
> I like (2) well enough, but I don't see that technique used elsewhere in the
> tree.  (1) is more typical of PostgreSQL, though I personally like it when
> structs can stay private to a file.  (3) is also well-attested, particularly
> in xlog replay code.  I am leaning toward (2).  Other opinions?

In my imagination, we were already doing #3, so I'd probably favor
that approach.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SerializedSnapshotData alignment

2017-02-26 Thread Amit Kapila
On Mon, Feb 27, 2017 at 5:13 AM, Noah Misch  wrote:
> Dear 7b4ac19 authors,
>
> Field ps_snapshot_data usually receives four-byte alignment within
> ParallelIndexScanDescData, but it contains the eight-byte whenTaken field.
> The select_parallel test dies with SIGBUS on "Oracle Solaris 10 1/13
> s10s_u11wos_24a SPARC", building with gcc 4.9.2.  Some credible fixes:
>
> 1. Move the SerializedSnapshotData declaration from snapmgr.c to snapmgr.h and
>declare the ps_snapshot_data field to be of type SerializedSnapshotData.
>Probably also add a field "TransactionId xids[FLEXIBLE_ARRAY_MEMBER]" to
>SerializedSnapshotData, to assert the variable-length nature.
>
> 2. Change "char ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER]" to "int64 ...".  I
>have attached this in SerializedSnapshot-int64-v1.patch.
>
> 3. Change no declarations, and make snapmgr.c memcpy() the
>SerializedSnapshotData through a local buffer.  I have attached this as
>SerializedSnapshot-memcpy-v1.patch.
>
> I like (2) well enough, but I don't see that technique used elsewhere in the
> tree.
>

You seem to have forgotten to change all the callers of
Serialize/RestoreSnapshot to int64* which is causing below warnings on
my m/c:

1>src/backend/access/transam/parallel.c(334): warning C4133:
'function' : incompatible types - from 'char *' to 'int64 *'
1>src/backend/access/transam/parallel.c(338): warning C4133:
'function' : incompatible types - from 'char *' to 'int64 *'
1>src/backend/access/transam/parallel.c(1072): warning C4133:
'function' : incompatible types - from 'char *' to 'int64 *'
1>src/backend/access/transam/parallel.c(1078): warning C4133:
'function' : incompatible types - from 'char *' to 'int64 *'


>  (1) is more typical of PostgreSQL, though I personally like it when
> structs can stay private to a file.  (3) is also well-attested, particularly
> in xlog replay code.  I am leaning toward (2).  Other opinions?
>

Your Approach-2 patch looks like a sane idea to me, if we decide to do
it some other way, I can write a patch unless you prefer to do it by
yourself.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SerializedSnapshotData alignment

2017-02-26 Thread Thomas Munro
On Mon, Feb 27, 2017 at 2:18 PM, Noah Misch  wrote:
> I wondered the same thing; if nothing else, why don't protosciurus and
> castoroides fail the same way?  They do use older compilers, "Sun C 5.10
> SunOS_sparc 2009/06/03" and gcc 3.4.3.  I have "Sun C 5.12 SunOS_sparc
> 2011/11/16" and gcc 4.9.2, both of which are alignment-sensitive in several
> configurations, according to the attached test program.  However, in a 32-bit
> build with this Sun C, I don't get alignment-related bus errors.  (Those
> animals build 64-bit, so this isn't the full story.)

It's been a while but I seem to recall that Sun C defaulted to a
-xmemalign setting that tolerated misaligned reads in 32 bit builds,
but used a different default on 64 bit builds.  "Solaris Application
Programming" 5.8.5 seems to confirm this:  "For 32-bit applications,
since Sun Studio 9, the default is for the compiler to assume 8-byte
alignment and to trap and correct any misaligned data accesses.  For
64-bit applications, the compiler assumes 8-byte alignment, but the
application will SIGBUS on a misaligned access."

Yet castoroides seems to be building with -m64, so that's not the
explanation.  Could it be correctly aligned by coincidence?

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SerializedSnapshotData alignment

2017-02-26 Thread Noah Misch
On Sun, Feb 26, 2017 at 07:53:15PM -0500, Tom Lane wrote:
> Noah Misch  writes:
> > Dear 7b4ac19 authors,
> > Field ps_snapshot_data usually receives four-byte alignment within
> > ParallelIndexScanDescData, but it contains the eight-byte whenTaken field.
> > The select_parallel test dies with SIGBUS on "Oracle Solaris 10 1/13
> > s10s_u11wos_24a SPARC", building with gcc 4.9.2.
> 
> It's a little distressing that the buildfarm didn't find this already.
> Is there some reason why it's specific to that particular compiler,
> rather than generic to alignment-picky 64-bit machines?

I wondered the same thing; if nothing else, why don't protosciurus and
castoroides fail the same way?  They do use older compilers, "Sun C 5.10
SunOS_sparc 2009/06/03" and gcc 3.4.3.  I have "Sun C 5.12 SunOS_sparc
2011/11/16" and gcc 4.9.2, both of which are alignment-sensitive in several
configurations, according to the attached test program.  However, in a 32-bit
build with this Sun C, I don't get alignment-related bus errors.  (Those
animals build 64-bit, so this isn't the full story.)

> In general, though, I agree that using a char[] member to represent
> anything that has any alignment requirement at all is seriously bad
> coding style that is almost certain to fail eventually.
> 
> A solution you didn't mention is to change the ParallelIndexScanDescData
> field to be a pointer, perhaps "struct SerializedSnapshotData *", while
> leaving that struct opaque so far as relscan.h is concerned.  This could
> avoid the need to use the unsafe blind casts that I'm sure must be
> involved in accesses to that field at present.

That, too, would be reasonable.
/* run with: for n in 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15; do ./a.out $n; 
done */

#include 
#include 

int main(int argc, char **argv)
{
char foo[] = "abcdefghijklmnopqrstuvwxyz";
if (argc != 2)
{
puts("bad usage");
return 1;
}

printf("%llx\n", *(long long *)(foo + atoi(argv[1])));
return 0;
}

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SerializedSnapshotData alignment

2017-02-26 Thread Tom Lane
Noah Misch  writes:
> Dear 7b4ac19 authors,
> Field ps_snapshot_data usually receives four-byte alignment within
> ParallelIndexScanDescData, but it contains the eight-byte whenTaken field.
> The select_parallel test dies with SIGBUS on "Oracle Solaris 10 1/13
> s10s_u11wos_24a SPARC", building with gcc 4.9.2.

It's a little distressing that the buildfarm didn't find this already.
Is there some reason why it's specific to that particular compiler,
rather than generic to alignment-picky 64-bit machines?

In general, though, I agree that using a char[] member to represent
anything that has any alignment requirement at all is seriously bad
coding style that is almost certain to fail eventually.

A solution you didn't mention is to change the ParallelIndexScanDescData
field to be a pointer, perhaps "struct SerializedSnapshotData *", while
leaving that struct opaque so far as relscan.h is concerned.  This could
avoid the need to use the unsafe blind casts that I'm sure must be
involved in accesses to that field at present.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] SerializedSnapshotData alignment

2017-02-26 Thread Noah Misch
Dear 7b4ac19 authors,

Field ps_snapshot_data usually receives four-byte alignment within
ParallelIndexScanDescData, but it contains the eight-byte whenTaken field.
The select_parallel test dies with SIGBUS on "Oracle Solaris 10 1/13
s10s_u11wos_24a SPARC", building with gcc 4.9.2.  Some credible fixes:

1. Move the SerializedSnapshotData declaration from snapmgr.c to snapmgr.h and
   declare the ps_snapshot_data field to be of type SerializedSnapshotData.
   Probably also add a field "TransactionId xids[FLEXIBLE_ARRAY_MEMBER]" to
   SerializedSnapshotData, to assert the variable-length nature.

2. Change "char ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER]" to "int64 ...".  I
   have attached this in SerializedSnapshot-int64-v1.patch.

3. Change no declarations, and make snapmgr.c memcpy() the
   SerializedSnapshotData through a local buffer.  I have attached this as
   SerializedSnapshot-memcpy-v1.patch.

I like (2) well enough, but I don't see that technique used elsewhere in the
tree.  (1) is more typical of PostgreSQL, though I personally like it when
structs can stay private to a file.  (3) is also well-attested, particularly
in xlog replay code.  I am leaning toward (2).  Other opinions?

Field phs_snapshot_data happens to receive eight-byte alignment within
ParallelHeapScanDescData; otherwise, such scans would fail the same way.
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index f232c84..a46a73e 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -2013,7 +2013,7 @@ EstimateSnapshotSpace(Snapshot snap)
  * memory location at start_address.
  */
 void
-SerializeSnapshot(Snapshot snapshot, char *start_address)
+SerializeSnapshot(Snapshot snapshot, int64 *start_address)
 {
SerializedSnapshotData *serialized_snapshot;
 
@@ -2069,7 +2069,7 @@ SerializeSnapshot(Snapshot snapshot, char *start_address)
  * to 0.  The returned snapshot has the copied flag set.
  */
 Snapshot
-RestoreSnapshot(char *start_address)
+RestoreSnapshot(int64 *start_address)
 {
SerializedSnapshotData *serialized_snapshot;
Sizesize;
diff --git a/src/include/access/relscan.h b/src/include/access/relscan.h
index ce3ca8d..3a5c5c9 100644
--- a/src/include/access/relscan.h
+++ b/src/include/access/relscan.h
@@ -38,7 +38,7 @@ typedef struct ParallelHeapScanDescData
slock_t phs_mutex;  /* mutual exclusion for block 
number fields */
BlockNumber phs_startblock; /* starting block number */
BlockNumber phs_cblock; /* current block number */
-   charphs_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
+   int64   phs_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
 }  ParallelHeapScanDescData;
 
 typedef struct HeapScanDescData
@@ -138,7 +138,7 @@ typedef struct ParallelIndexScanDescData
Oid ps_relid;
Oid ps_indexid;
Sizeps_offset;  /* Offset in bytes of am 
specific structure */
-   charps_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
+   int64   ps_snapshot_data[FLEXIBLE_ARRAY_MEMBER];
 } ParallelIndexScanDescData;
 
 /* Struct for heap-or-index scans of system tables */
diff --git a/src/include/utils/snapmgr.h b/src/include/utils/snapmgr.h
index ab95366..4b5feb7 100644
--- a/src/include/utils/snapmgr.h
+++ b/src/include/utils/snapmgr.h
@@ -106,8 +106,8 @@ extern void TeardownHistoricSnapshot(bool is_error);
 extern bool HistoricSnapshotActive(void);
 
 extern Size EstimateSnapshotSpace(Snapshot snapshot);
-extern void SerializeSnapshot(Snapshot snapshot, char *start_address);
-extern Snapshot RestoreSnapshot(char *start_address);
+extern void SerializeSnapshot(Snapshot snapshot, int64 *start_address);
+extern Snapshot RestoreSnapshot(int64 *start_address);
 extern void RestoreTransactionSnapshot(Snapshot snapshot, void *master_pgproc);
 
 #endif   /* SNAPMGR_H */
diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index f232c84..e6d7a19 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -2015,34 +2015,35 @@ EstimateSnapshotSpace(Snapshot snap)
 void
 SerializeSnapshot(Snapshot snapshot, char *start_address)
 {
-   SerializedSnapshotData *serialized_snapshot;
+   SerializedSnapshotData serialized_snapshot;
 
Assert(snapshot->subxcnt >= 0);
 
-   serialized_snapshot = (SerializedSnapshotData *) start_address;
-
/* Copy all required fields */
-   serialized_snapshot->xmin = snapshot->xmin;
-   serialized_snapshot->xmax = snapshot->xmax;
-   serialized_snapshot->xcnt = snapshot->xcnt;
-   serialized_snapshot->subxcnt = snapshot->subxcnt;
-   serialized_snapshot->suboverflowed = snapshot->suboverflowed;
-   serialized_snapshot->takenDuringRecovery = 
snapshot->takenDuringRecovery;
-   serialized_snapshot->curcid = snapshot->curcid;
-   serializ