Re: [HACKERS] Deadlock in XLogInsert at AIX
Noah Misch writes: > On Wed, Oct 09, 2019 at 01:15:29PM -0400, Tom Lane wrote: >> * I still think that the added configure test is a waste of build cycles. >> It'd be sufficient to test "#ifdef HAVE__BUILTIN_CONSTANT_P" where you >> are testing HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P, because our previous >> buildfarm go-round with this showed that all supported compilers >> interpret "i" this way. > xlc does not interpret "i" that way: > https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hornet=2019-09-14%2016%3A42%3A32=config Hm, how'd I miss that while looking at the buildfarm results? Anyway, you're clearly right on this one --- objection withdrawn. >> * I really dislike building the asm calls with macros as you've done >> here. > For a macro local to one C file, I think readability is the relevant metric. > In particular, it would be wrong to add arguments to make these more like > header file macros. I think the macros make the code somewhat more readable, > and you think they make the code less readable. I have removed the macros. Personally I definitely find this way more readable, though I agree beauty is in the eye of the beholder. I've checked that this patch set compiles and passes regression tests on my old Apple machines, so it's good to go as far as I can see. regards, tom lane
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Wed, Oct 09, 2019 at 01:15:29PM -0400, Tom Lane wrote: > Noah Misch writes: > > On Mon, Oct 07, 2019 at 03:06:35PM -0400, Tom Lane wrote: > >> This still fails on Apple's compilers. ... > > > Thanks for testing. That error boils down to "need to use some other > > register". The second operand of addi is one of the ppc instruction > > operands > > that can hold a constant zero or a register number[1], so the proper > > constraint is "b". I've made it so and added a comment. > > Ah-hah. This version does compile and pass check-world for me. > > > I should probably > > update s_lock.h, too, in a later patch. I don't know how it has > > mostly-avoided this failure mode, but its choice of constraint could explain > > https://postgr.es/m/flat/36E70B06-2C5C-11D8-A096-0005024EF27F%40ifrance.com > > Indeed. It's a bit astonishing that more people haven't hit that. > This should be back-patched. I may as well do that first, so there's no time when s_lock.h disagrees with arch-ppc.h about the constraint to use. I'm attaching that patch, too. > * I still think that the added configure test is a waste of build cycles. > It'd be sufficient to test "#ifdef HAVE__BUILTIN_CONSTANT_P" where you > are testing HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P, because our previous > buildfarm go-round with this showed that all supported compilers > interpret "i" this way. xlc does not interpret "i" that way: https://buildfarm.postgresql.org/cgi-bin/show_stage_log.pl?nm=hornet=2019-09-14%2016%3A42%3A32=config > * I really dislike building the asm calls with macros as you've done > here. The macros violate project style, and are not remotely general- > purpose, because they have hard-wired references to variables that are > not in their argument lists. While that could be fixed with more > arguments, I don't think that the approach is readable or maintainable > --- it's impossible for example to understand the register constraints > without looking simultaneously at the calls and the macro definition. > And, as we've seen in this "b" issue, the interactions between the chosen > instruction types and the constraints are subtle enough to make me wonder > whether you won't need even more arguments to allow some of the other > constraints to be variable. I think it'd be far better just to write out > the asm in-line and accept the not-very-large amount of code duplication > you'd get. For a macro local to one C file, I think readability is the relevant metric. In particular, it would be wrong to add arguments to make these more like header file macros. I think the macros make the code somewhat more readable, and you think they make the code less readable. I have removed the macros. > * src/tools/pginclude/headerscheck needs the same adjustment as you > made in cpluspluscheck. Done. diff --git a/configure b/configure index 74627a7..02a905b 100755 --- a/configure +++ b/configure @@ -14518,6 +14518,46 @@ $as_echo "$pgac_cv_have_ppc_mutex_hint" >&6; } $as_echo "#define HAVE_PPC_LWARX_MUTEX_HINT 1" >>confdefs.h fi +# Check if compiler accepts "i"(x) when __builtin_constant_p(x). +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether __builtin_constant_p(x) implies \"i\"(x) acceptance" >&5 +$as_echo_n "checking whether __builtin_constant_p(x) implies \"i\"(x) acceptance... " >&6; } +if ${pgac_cv_have_i_constraint__builtin_constant_p+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +static inline int + addi(int ra, int si) + { + int res = 0; + if (__builtin_constant_p(si)) + __asm__ __volatile__( + " addi %0,%1,%2\n" : "=r"(res) : "b"(ra), "i"(si)); + return res; + } + int test_adds(int x) { return addi(3, x) + addi(x, 5); } +int +main () +{ + + ; + return 0; +} +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + pgac_cv_have_i_constraint__builtin_constant_p=yes +else + pgac_cv_have_i_constraint__builtin_constant_p=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_have_i_constraint__builtin_constant_p" >&5 +$as_echo "$pgac_cv_have_i_constraint__builtin_constant_p" >&6; } +if test x"$pgac_cv_have_i_constraint__builtin_constant_p" = xyes ; then + +$as_echo "#define HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P 1" >>confdefs.h + +fi ;; esac diff --git a/configure.in b/configure.in index 0d16c1a..88d3a59 100644 --- a/configure.in +++ b/configure.in @@ -1539,6 +1539,26 @@ case $host_cpu in if test x"$pgac_cv_have_ppc_mutex_hint" = xyes ; then AC_DEFINE(HAVE_PPC_LWARX_MUTEX_HINT, 1, [Define to 1 if the assembler supports PPC's LWARX mutex hint bit.]) fi +# Check if compiler accepts "i"(x) when __builtin_constant_p(x). +AC_CACHE_CHECK([whether __builtin_constant_p(x) implies "i"(x) acceptance], +
Re: [HACKERS] Deadlock in XLogInsert at AIX
Noah Misch writes: > On Mon, Oct 07, 2019 at 03:06:35PM -0400, Tom Lane wrote: >> This still fails on Apple's compilers. ... > Thanks for testing. That error boils down to "need to use some other > register". The second operand of addi is one of the ppc instruction operands > that can hold a constant zero or a register number[1], so the proper > constraint is "b". I've made it so and added a comment. Ah-hah. This version does compile and pass check-world for me. > I should probably > update s_lock.h, too, in a later patch. I don't know how it has > mostly-avoided this failure mode, but its choice of constraint could explain > https://postgr.es/m/flat/36E70B06-2C5C-11D8-A096-0005024EF27F%40ifrance.com Indeed. It's a bit astonishing that more people haven't hit that. This should be back-patched. Now that the patch passes mechanical checks, I took a closer look, and there are some things I don't like: * I still think that the added configure test is a waste of build cycles. It'd be sufficient to test "#ifdef HAVE__BUILTIN_CONSTANT_P" where you are testing HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P, because our previous buildfarm go-round with this showed that all supported compilers interpret "i" this way. * I really dislike building the asm calls with macros as you've done here. The macros violate project style, and are not remotely general- purpose, because they have hard-wired references to variables that are not in their argument lists. While that could be fixed with more arguments, I don't think that the approach is readable or maintainable --- it's impossible for example to understand the register constraints without looking simultaneously at the calls and the macro definition. And, as we've seen in this "b" issue, the interactions between the chosen instruction types and the constraints are subtle enough to make me wonder whether you won't need even more arguments to allow some of the other constraints to be variable. I think it'd be far better just to write out the asm in-line and accept the not-very-large amount of code duplication you'd get. * src/tools/pginclude/headerscheck needs the same adjustment as you made in cpluspluscheck. regards, tom lane
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Mon, Oct 07, 2019 at 03:06:35PM -0400, Tom Lane wrote: > Noah Misch writes: > > [ fetch-add-gcc-xlc-unify-v2.patch ] > > This still fails on Apple's compilers. The first failure I get is > > ccache gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith > -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute > -Wformat-security -fno-strict-aliasing -fwrapv -g -O2 -I../../../src/include > -I/usr/local/include -isysroot /Developer/SDKs/MacOSX10.5.sdk-c -o > nodeHashjoin.o nodeHashjoin.c > /var/tmp//ccXUM8ep.s:449:Parameter error: r0 not allowed for parameter 2 > (code as 0 not r0) > > Line 449 of the assembly file is the addi in > > LM87: > sync > lwarx r0,0,r2 > addir11,r0,1 > stwcx. r11,0,r2 > bne $-12 > isync > > which I suppose comes out of PG_PPC_FETCH_ADD. I find this idea of > constructing assembly code by string-pasting pretty unreadable and am not > tempted to try to debug it, but I don't immediately see why this doesn't > work when the existing s_lock.h code does. I think that the assembler > error message is probably misleading: while it seems to be saying to > s/r0/0/ in the addi, gcc consistently uses "rN" syntax for the second > parameter elsewhere. I do note that gcc never generates r0 as addi's > second parameter in several files I checked through, so maybe what it > means is "you need to use some other register"? (Which would imply that > the constraint for this asm argument is too loose.) Thanks for testing. That error boils down to "need to use some other register". The second operand of addi is one of the ppc instruction operands that can hold a constant zero or a register number[1], so the proper constraint is "b". I've made it so and added a comment. I should probably update s_lock.h, too, in a later patch. I don't know how it has mostly-avoided this failure mode, but its choice of constraint could explain https://postgr.es/m/flat/36E70B06-2C5C-11D8-A096-0005024EF27F%40ifrance.com > I'm also wondering why this isn't following s_lock.h's lead as to > USE_PPC_LWARX_MUTEX_HINT and USE_PPC_LWSYNC. Most or all of today's pg_atomic_compare_exchange_u32() usage does not have the property that the mutex hint would signal. pg_atomic_compare_exchange_u32() specifies "Full barrier semantics", which lwsync does not provide. We might want to relax the specification to make lwsync acceptable, but that would be a separate, architecture-independent project. (generic-gcc.h:pg_atomic_compare_exchange_u32_impl() speculates along those lines, writing "FIXME: we can probably use a lower consistency model".) [1] "Notice that addi and addis use the value 0, not the contents of GPR 0, if RA=0." -- https://openpowerfoundation.org/?resource_lib=power-isa-version-3-0 diff --git a/configure b/configure index 7a6bfc2..d6ecb33 100755 --- a/configure +++ b/configure @@ -14650,6 +14650,46 @@ $as_echo "$pgac_cv_have_ppc_mutex_hint" >&6; } $as_echo "#define HAVE_PPC_LWARX_MUTEX_HINT 1" >>confdefs.h fi +# Check if compiler accepts "i"(x) when __builtin_constant_p(x). +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether __builtin_constant_p(x) implies \"i\"(x) acceptance" >&5 +$as_echo_n "checking whether __builtin_constant_p(x) implies \"i\"(x) acceptance... " >&6; } +if ${pgac_cv_have_i_constraint__builtin_constant_p+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +static inline int + addi(int ra, int si) + { + int res = 0; + if (__builtin_constant_p(si)) + __asm__ __volatile__( + " addi %0,%1,%2\n" : "=r"(res) : "r"(ra), "i"(si)); + return res; + } + int test_adds(int x) { return addi(3, x) + addi(x, 5); } +int +main () +{ + + ; + return 0; +} +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + pgac_cv_have_i_constraint__builtin_constant_p=yes +else + pgac_cv_have_i_constraint__builtin_constant_p=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_have_i_constraint__builtin_constant_p" >&5 +$as_echo "$pgac_cv_have_i_constraint__builtin_constant_p" >&6; } +if test x"$pgac_cv_have_i_constraint__builtin_constant_p" = xyes ; then + +$as_echo "#define HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P 1" >>confdefs.h + +fi ;; esac diff --git a/configure.in b/configure.in index dde3eec..510bd76 100644 --- a/configure.in +++ b/configure.in @@ -1541,6 +1541,26 @@ case $host_cpu in if test x"$pgac_cv_have_ppc_mutex_hint" = xyes ; then AC_DEFINE(HAVE_PPC_LWARX_MUTEX_HINT, 1, [Define to 1 if the assembler supports PPC's LWARX mutex hint bit.]) fi +# Check if compiler accepts "i"(x) when __builtin_constant_p(x). +AC_CACHE_CHECK([whether __builtin_constant_p(x) implies "i"(x) acceptance], +
Re: [HACKERS] Deadlock in XLogInsert at AIX
Noah Misch writes: > [ fetch-add-gcc-xlc-unify-v2.patch ] This still fails on Apple's compilers. The first failure I get is ccache gcc -std=gnu99 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -g -O2 -I../../../src/include -I/usr/local/include -isysroot /Developer/SDKs/MacOSX10.5.sdk-c -o nodeHashjoin.o nodeHashjoin.c /var/tmp//ccXUM8ep.s:449:Parameter error: r0 not allowed for parameter 2 (code as 0 not r0) Line 449 of the assembly file is the addi in LM87: sync lwarx r0,0,r2 addir11,r0,1 stwcx. r11,0,r2 bne $-12 isync which I suppose comes out of PG_PPC_FETCH_ADD. I find this idea of constructing assembly code by string-pasting pretty unreadable and am not tempted to try to debug it, but I don't immediately see why this doesn't work when the existing s_lock.h code does. I think that the assembler error message is probably misleading: while it seems to be saying to s/r0/0/ in the addi, gcc consistently uses "rN" syntax for the second parameter elsewhere. I do note that gcc never generates r0 as addi's second parameter in several files I checked through, so maybe what it means is "you need to use some other register"? (Which would imply that the constraint for this asm argument is too loose.) I'm also wondering why this isn't following s_lock.h's lead as to USE_PPC_LWARX_MUTEX_HINT and USE_PPC_LWSYNC. regards, tom lane
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Sat, Aug 31, 2019 at 03:30:26PM -0700, Noah Misch wrote: > On Sat, Aug 31, 2019 at 02:27:55PM -0400, Tom Lane wrote: > > Noah Misch writes: > > > Done. fetch-add-variable-test-v1.patch just adds tests for non-constant > > > addends and 16-bit edge cases. Today's implementation handles those, > > > PostgreSQL doesn't use them, and I might easily have broken them. > > > fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add() > > > intrinsic to inline asm. fetch-add-gcc-xlc-unify-v1.patch moves > > > fetch_add to > > > inline asm for all other ppc compilers. gcc-7.2.0 generates equivalent > > > code > > > before and after. I plan to keep the third patch HEAD-only, > > > back-patching the > > > other two. I tested with xlc v12 and v13. > > > > Hm, no objection to the first two patches, but I don't understand > > why the third patch goes to so much effort just to use "addi" rather > > than (one assumes) "li" then "add"? It doesn't seem likely that > > that's buying much. > > Changing an addi to li+add may not show up on benchmarks, but I can't claim > it's immaterial. I shouldn't unify the code if that makes the compiled code > materially worse than what the gcc intrinsics produce today, hence the > nontrivial (~50 line) bits to match the intrinsics' capabilities. The first two patches have worked so far, but fetch-add-gcc-xlc-unify-v1.patch broke older gcc: https://postgr.es/m/flat/7517.1568470...@sss.pgh.pa.us. That side thread settled on putting pg_atomic_compare_exchange_u{32,64}_impl implementations in arch-ppc.h. Attached fetch-add-gcc-xlc-unify-v2.patch does so. For compare_exchange, the generated code doesn't match gcc intrinsics exactly; code comments discuss this. Compared to v1, its other change is to extract common __asm__ templates into macros instead of repeating them four times (32/64bit, variable/const). diff --git a/configure b/configure index 7a6bfc2..d6ecb33 100755 --- a/configure +++ b/configure @@ -14650,6 +14650,46 @@ $as_echo "$pgac_cv_have_ppc_mutex_hint" >&6; } $as_echo "#define HAVE_PPC_LWARX_MUTEX_HINT 1" >>confdefs.h fi +# Check if compiler accepts "i"(x) when __builtin_constant_p(x). +{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether __builtin_constant_p(x) implies \"i\"(x) acceptance" >&5 +$as_echo_n "checking whether __builtin_constant_p(x) implies \"i\"(x) acceptance... " >&6; } +if ${pgac_cv_have_i_constraint__builtin_constant_p+:} false; then : + $as_echo_n "(cached) " >&6 +else + cat confdefs.h - <<_ACEOF >conftest.$ac_ext +/* end confdefs.h. */ +static inline int + addi(int ra, int si) + { + int res = 0; + if (__builtin_constant_p(si)) + __asm__ __volatile__( + " addi %0,%1,%2\n" : "=r"(res) : "r"(ra), "i"(si)); + return res; + } + int test_adds(int x) { return addi(3, x) + addi(x, 5); } +int +main () +{ + + ; + return 0; +} +_ACEOF +if ac_fn_c_try_compile "$LINENO"; then : + pgac_cv_have_i_constraint__builtin_constant_p=yes +else + pgac_cv_have_i_constraint__builtin_constant_p=no +fi +rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext +fi +{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_have_i_constraint__builtin_constant_p" >&5 +$as_echo "$pgac_cv_have_i_constraint__builtin_constant_p" >&6; } +if test x"$pgac_cv_have_i_constraint__builtin_constant_p" = xyes ; then + +$as_echo "#define HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P 1" >>confdefs.h + +fi ;; esac diff --git a/configure.in b/configure.in index dde3eec..510bd76 100644 --- a/configure.in +++ b/configure.in @@ -1541,6 +1541,26 @@ case $host_cpu in if test x"$pgac_cv_have_ppc_mutex_hint" = xyes ; then AC_DEFINE(HAVE_PPC_LWARX_MUTEX_HINT, 1, [Define to 1 if the assembler supports PPC's LWARX mutex hint bit.]) fi +# Check if compiler accepts "i"(x) when __builtin_constant_p(x). +AC_CACHE_CHECK([whether __builtin_constant_p(x) implies "i"(x) acceptance], + [pgac_cv_have_i_constraint__builtin_constant_p], +[AC_COMPILE_IFELSE([AC_LANG_PROGRAM( +[static inline int + addi(int ra, int si) + { + int res = 0; + if (__builtin_constant_p(si)) + __asm__ __volatile__( + " addi %0,%1,%2\n" : "=r"(res) : "r"(ra), "i"(si)); + return res; + } + int test_adds(int x) { return addi(3, x) + addi(x, 5); }], [])], +[pgac_cv_have_i_constraint__builtin_constant_p=yes], +[pgac_cv_have_i_constraint__builtin_constant_p=no])]) +if test x"$pgac_cv_have_i_constraint__builtin_constant_p" = xyes ; then + AC_DEFINE(HAVE_I_CONSTRAINT__BUILTIN_CONSTANT_P, 1, +[Define to 1 if __builtin_constant_p(x) implies "i"(x) acceptance.]) +fi ;; esac diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in index 512213a..9be84fc 100644 --- a/src/include/pg_config.h.in +++ b/src/include/pg_config.h.in @@ -332,6 +332,9 @@ /*
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Sat, Aug 31, 2019 at 02:27:55PM -0400, Tom Lane wrote: > Noah Misch writes: > > Done. fetch-add-variable-test-v1.patch just adds tests for non-constant > > addends and 16-bit edge cases. Today's implementation handles those, > > PostgreSQL doesn't use them, and I might easily have broken them. > > fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add() > > intrinsic to inline asm. fetch-add-gcc-xlc-unify-v1.patch moves fetch_add > > to > > inline asm for all other ppc compilers. gcc-7.2.0 generates equivalent code > > before and after. I plan to keep the third patch HEAD-only, back-patching > > the > > other two. I tested with xlc v12 and v13. > > Hm, no objection to the first two patches, but I don't understand > why the third patch goes to so much effort just to use "addi" rather > than (one assumes) "li" then "add"? It doesn't seem likely that > that's buying much. Changing an addi to li+add may not show up on benchmarks, but I can't claim it's immaterial. I shouldn't unify the code if that makes the compiled code materially worse than what the gcc intrinsics produce today, hence the nontrivial (~50 line) bits to match the intrinsics' capabilities.
Re: [HACKERS] Deadlock in XLogInsert at AIX
Noah Misch writes: > Done. fetch-add-variable-test-v1.patch just adds tests for non-constant > addends and 16-bit edge cases. Today's implementation handles those, > PostgreSQL doesn't use them, and I might easily have broken them. > fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add() > intrinsic to inline asm. fetch-add-gcc-xlc-unify-v1.patch moves fetch_add to > inline asm for all other ppc compilers. gcc-7.2.0 generates equivalent code > before and after. I plan to keep the third patch HEAD-only, back-patching the > other two. I tested with xlc v12 and v13. Hm, no objection to the first two patches, but I don't understand why the third patch goes to so much effort just to use "addi" rather than (one assumes) "li" then "add"? It doesn't seem likely that that's buying much. regards, tom lane
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Wed, Jan 17, 2018 at 12:36:31AM -0800, Noah Misch wrote: > On Tue, Jan 16, 2018 at 08:50:24AM -0800, Andres Freund wrote: > > On 2018-01-16 16:12:11 +0900, Michael Paquier wrote: > > > On Fri, Feb 03, 2017 at 12:26:50AM +, Noah Misch wrote: > > > > Since this emits double syncs with older xlc, I recommend instead > > > > replacing > > > > the whole thing with inline asm. As I opined in the last message of the > > > > thread you linked above, the intrinsics provide little value as > > > > abstractions > > > > if one checks the generated code to deduce how to use them. Now that > > > > the > > > > generated code is xlc-version-dependent, the port is better off with > > > > compiler-independent asm like we have for ppc in s_lock.h. > > > > > > Could it be cleaner to just use __xlc_ver__ to avoid double syncs on > > > past versions? I think that it would make the code more understandable > > > than just listing directly the instructions. > > > > Given the quality of the intrinsics on AIX, see past commits and the > > comment in the code quoted above, I think we're much better of doing > > this via inline asm. > > For me, verifiability is the crucial benefit of inline asm. Anyone with an > architecture manual can thoroughly review an inline asm implementation. Given > intrinsics and __xlc_ver__ conditionals, the same level of review requires > access to every xlc version. > > > As Heikki is not around these days, Noah, could you provide a new > > > version of the patch? This bug has been around for some time now, it > > > would be nice to move on.. > > Not soon. Done. fetch-add-variable-test-v1.patch just adds tests for non-constant addends and 16-bit edge cases. Today's implementation handles those, PostgreSQL doesn't use them, and I might easily have broken them. fetch-add-xlc-asm-v1.patch moves xlc builds from the __fetch_and_add() intrinsic to inline asm. fetch-add-gcc-xlc-unify-v1.patch moves fetch_add to inline asm for all other ppc compilers. gcc-7.2.0 generates equivalent code before and after. I plan to keep the third patch HEAD-only, back-patching the other two. I tested with xlc v12 and v13. diff --git a/src/test/regress/regress.c b/src/test/regress/regress.c index 7f03b7e..b00b76f 100644 --- a/src/test/regress/regress.c +++ b/src/test/regress/regress.c @@ -687,7 +687,7 @@ test_atomic_uint32(void) if (pg_atomic_read_u32() != 3) elog(ERROR, "atomic_read_u32() #2 wrong"); - if (pg_atomic_fetch_add_u32(, 1) != 3) + if (pg_atomic_fetch_add_u32(, pg_atomic_read_u32() - 2) != 3) elog(ERROR, "atomic_fetch_add_u32() #1 wrong"); if (pg_atomic_fetch_sub_u32(, 1) != 4) @@ -712,6 +712,20 @@ test_atomic_uint32(void) if (pg_atomic_fetch_add_u32(, INT_MAX) != INT_MAX) elog(ERROR, "pg_atomic_add_fetch_u32() #3 wrong"); + pg_atomic_fetch_add_u32(, 2); /* wrap to 0 */ + + if (pg_atomic_fetch_add_u32(, PG_INT16_MAX) != 0) + elog(ERROR, "pg_atomic_fetch_add_u32() #3 wrong"); + + if (pg_atomic_fetch_add_u32(, PG_INT16_MAX + 1) != PG_INT16_MAX) + elog(ERROR, "pg_atomic_fetch_add_u32() #4 wrong"); + + if (pg_atomic_fetch_add_u32(, PG_INT16_MIN) != 2 * PG_INT16_MAX + 1) + elog(ERROR, "pg_atomic_fetch_add_u32() #5 wrong"); + + if (pg_atomic_fetch_add_u32(, PG_INT16_MIN - 1) != PG_INT16_MAX) + elog(ERROR, "pg_atomic_fetch_add_u32() #6 wrong"); + pg_atomic_fetch_add_u32(, 1); /* top up to UINT_MAX */ if (pg_atomic_read_u32() != UINT_MAX) @@ -787,7 +801,7 @@ test_atomic_uint64(void) if (pg_atomic_read_u64() != 3) elog(ERROR, "atomic_read_u64() #2 wrong"); - if (pg_atomic_fetch_add_u64(, 1) != 3) + if (pg_atomic_fetch_add_u64(, pg_atomic_read_u64() - 2) != 3) elog(ERROR, "atomic_fetch_add_u64() #1 wrong"); if (pg_atomic_fetch_sub_u64(, 1) != 4) diff --git a/src/include/port/atomics/generic-xlc.h b/src/include/port/atomics/generic-xlc.h index 5ddccff..8b5c732 100644 --- a/src/include/port/atomics/generic-xlc.h +++ b/src/include/port/atomics/generic-xlc.h @@ -73,11 +73,27 @@ pg_atomic_compare_exchange_u32_impl(volatile pg_atomic_uint32 *ptr, static inline uint32 pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_) { + uint32 _t; + uint32 res; + /* -* __fetch_and_add() emits a leading "sync" and trailing "isync", thereby -* providing sequential consistency. This is undocumented. +* xlc has a no-longer-documented __fetch_and_add() intrinsic. In xlc +* 12.01.., it emits a leading "sync" and trailing "isync". In +* xlc 13.01.0003.0004, it emits neither. Hence, using the intrinsic +* would add redundant syncs on xlc 12. */ - return __fetch_and_add((volatile int *)>value, add_); + __asm__ __volatile__( + "
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Mon, Jan 22, 2018 at 4:23 PM, Tom Lanewrote: >> If nobody is willing to put in the effort to keep AIX supported under >> XLC, then we should just update the documentation to say that it isn't >> supported. Our support for that platform is pretty marginal anyway if >> we're only supporting it up through 6.1; that was released in 2007 and >> went out of support in Q2 of last year. > > Huh? We have somebody just upthread saying that they're working out > issues on newer AIX. We should at least give them time to finish > that research before making any decisions. Well, I certainly have no problem with that in theory. That's why my sentence started with the word "if". On a practical level, this thread has been dead for almost a year, and has revived slightly because Michael bumped it, but nobody has really made a firm commitment to do any specific thing to move this issue forward. I'm happy if they do, but it might not happen. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Mon, Jan 22, 2018 at 04:23:55PM -0500, Tom Lane wrote: > Robert Haaswrites: >> I agree with Noah. It's true that having unfixed bugs isn't >> particularly good, but it doesn't justify activating a CommitFest >> entry under someone else's name. > > Agreed. The CF app is not a bug tracker. OK, thanks for the feedback. Based on that I am deleting this entry from the CF. And my apologies for the noise. >> If nobody is willing to put in the effort to keep AIX supported under >> XLC, then we should just update the documentation to say that it isn't >> supported. Our support for that platform is pretty marginal anyway if >> we're only supporting it up through 6.1; that was released in 2007 and >> went out of support in Q2 of last year. > > Huh? We have somebody just upthread saying that they're working out > issues on newer AIX. We should at least give them time to finish > that research before making any decisions. Yes, Tony stated so, still he has not published any patches yet. Regarding the potential documentation impact, please let me suggest just to mention that support of XLC up to 12.1 is supported, per the buildfarm this is rather stable and does not face the same issues as what we are seeing with XLC 13.1. Is there any point to wait for the docs to not say that now? As far as I know, if support for XLC 13.1 moves on in the future we can update the docs so as the support range is increased later on. -- Michael signature.asc Description: PGP signature
Re: [HACKERS] Deadlock in XLogInsert at AIX
Robert Haaswrites: > I agree with Noah. It's true that having unfixed bugs isn't > particularly good, but it doesn't justify activating a CommitFest > entry under someone else's name. Agreed. The CF app is not a bug tracker. > If nobody is willing to put in the effort to keep AIX supported under > XLC, then we should just update the documentation to say that it isn't > supported. Our support for that platform is pretty marginal anyway if > we're only supporting it up through 6.1; that was released in 2007 and > went out of support in Q2 of last year. Huh? We have somebody just upthread saying that they're working out issues on newer AIX. We should at least give them time to finish that research before making any decisions. regards, tom lane
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Wed, Jan 17, 2018 at 12:36:31AM -0800, Noah Misch wrote: > For me, verifiability is the crucial benefit of inline asm. Anyone with an > architecture manual can thoroughly review an inline asm implementation. Given > intrinsics and __xlc_ver__ conditionals, the same level of review requires > access to every xlc version. Okay. > The most recent patch version is Returned with Feedback. As a matter of > procedure, I discourage creating commitfest entries as a tool to solicit new > patch versions. If I were the author of a RwF patch, I would dislike finding > a commitfest entry that I did not create with myself listed as author. Per my understanding, this is a bug, and we don't want to lose track of them. > If you do choose to proceed, the entry should be Waiting on Author. Right. > Note that fixing this bug is just the start of accepting XLC 13.1 as a > compiler of PostgreSQL. If we get a buildfarm member with a few dozen clean > runs (blocked by, at a minimum, fixing this and the inlining bug), we'll have > something. Until then, support for XLC 13.1 is an anti-feature. Per my understanding of this thread, this is a bug. My point is that the documentation states that AIX is supported from 4.3.3 to 6.1, however there are no restrictions related to the compiler, hence I would have thought that the docs imply XLC 13.1 as a supported compiler. And IBM states that XLC 13.1 is supported from AIX 6.1: https://www-01.ibm.com/support/docview.wss?uid=swg21326972 True that the docs tell as well to look at the buildfarm animals, which don't use XLC 13.1 if you don't look at them closely. Perhaps an explicit mention of the compiler compatibilities in the docs would help making the support range clear to anybody then. I would expect more people to look at the docs than the buildfarm internal contents. -- Michael signature.asc Description: PGP signature
Re: [HACKERS] Deadlock in XLogInsert at AIX
Am Dienstag, den 16.01.2018, 08:25 + schrieb REIX, Tony: > I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1 > and, using times outputs provided by PostgreSQL tests, XLC seems to > provide at least 8% more speed. We also plan to run professional > performance tests in order to compare PostgreSQL 10.1 on AIX vs > Linux/Power. I saw some 2017 performance slides, made with older > versions of PostgreSQL and XLC, that show bad PostgreSQL performance > on AIX vs Linux/Power, and I cannot believe it. We plan to > investigate this. I assume you are referring to the attached graph i've showed on PGConf.US 2017 ? The numbers we've got on that E850 machine (pgbench SELECT-only, scale 1000) weren't really good in comparison to Linux on the same machine. We tried many options to make the performance better, overall the graph shows the best performance from Linux *and* AIX with gcc. XL C We used some knobs to get the best out on AIX: export OBJECT_MODE=64; gcc -m64 ldedit -b forkpolicy:cor -b textpsize:64K -b datapsize:64K -b stackpsize:64K postgres export MALLOCOPTIONS=multiheap:16,considersize,pool,no_mallinfo schedo -p -o vpm_fold_policy=4 There are many other things you can tune on AIX, but they didn't seem to give the improvement we'd like to see. run_bench.sh Description: application/shellscript
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Tue, Jan 16, 2018 at 08:50:24AM -0800, Andres Freund wrote: > On 2018-01-16 16:12:11 +0900, Michael Paquier wrote: > > On Fri, Feb 03, 2017 at 12:26:50AM +, Noah Misch wrote: > > > Since this emits double syncs with older xlc, I recommend instead > > > replacing > > > the whole thing with inline asm. As I opined in the last message of the > > > thread you linked above, the intrinsics provide little value as > > > abstractions > > > if one checks the generated code to deduce how to use them. Now that the > > > generated code is xlc-version-dependent, the port is better off with > > > compiler-independent asm like we have for ppc in s_lock.h. > > > > Could it be cleaner to just use __xlc_ver__ to avoid double syncs on > > past versions? I think that it would make the code more understandable > > than just listing directly the instructions. > > Given the quality of the intrinsics on AIX, see past commits and the > comment in the code quoted above, I think we're much better of doing > this via inline asm. For me, verifiability is the crucial benefit of inline asm. Anyone with an architecture manual can thoroughly review an inline asm implementation. Given intrinsics and __xlc_ver__ conditionals, the same level of review requires access to every xlc version. > > As there have been other > > bug reports from Tony Reix who has been working on AIX with XLC 13.1 and > > that this thread got lost in the wild, I have added an entry in the next > > CF: > > https://commitfest.postgresql.org/17/1484/ The most recent patch version is Returned with Feedback. As a matter of procedure, I discourage creating commitfest entries as a tool to solicit new patch versions. If I were the author of a RwF patch, I would dislike finding a commitfest entry that I did not create with myself listed as author. If you do choose to proceed, the entry should be Waiting on Author. > > As Heikki is not around these days, Noah, could you provide a new > > version of the patch? This bug has been around for some time now, it > > would be nice to move on.. Not soon. Note that fixing this bug is just the start of accepting XLC 13.1 as a compiler of PostgreSQL. If we get a buildfarm member with a few dozen clean runs (blocked by, at a minimum, fixing this and the inlining bug), we'll have something. Until then, support for XLC 13.1 is an anti-feature. nm
Re: [HACKERS] Deadlock in XLogInsert at AIX
On 2018-01-16 16:12:11 +0900, Michael Paquier wrote: > On Fri, Feb 03, 2017 at 12:26:50AM +, Noah Misch wrote: > > On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote: > >> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile > >> pg_atomic_uint32 *ptr, > >> static inline uint32 > >> pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_) > >> { > >> + uint32 ret; > >> + > >>/* > >> - * __fetch_and_add() emits a leading "sync" and trailing "isync", > >> thereby > >> - * providing sequential consistency. This is undocumented. > >> + * Use __sync() before and __isync() after, like in compare-exchange > >> + * above. > >> */ > >> - return __fetch_and_add((volatile int *)>value, add_); > >> + __sync(); > >> + > >> + ret = __fetch_and_add((volatile int *)>value, add_); > >> + > >> + __isync(); > >> + > >> + return ret; > >> } > > > > Since this emits double syncs with older xlc, I recommend instead replacing > > the whole thing with inline asm. As I opined in the last message of the > > thread you linked above, the intrinsics provide little value as abstractions > > if one checks the generated code to deduce how to use them. Now that the > > generated code is xlc-version-dependent, the port is better off with > > compiler-independent asm like we have for ppc in s_lock.h. > > Could it be cleaner to just use __xlc_ver__ to avoid double syncs on > past versions? I think that it would make the code more understandable > than just listing directly the instructions. Given the quality of the intrinsics on AIX, see past commits and the comment in the code quoted above, I think we're much better of doing this via inline asm. Greetings, Andres Freund
RE:[HACKERS] Deadlock in XLogInsert at AIX
Thanks Noah ! Hummm You have a big machine, more powerful than mine. However, it seems that you do not see the random failure I see. Cordialement, Tony Reix ATOS / Bull SAS ATOS Expert IBM Coop Architect & Technical Leader Office : +33 (0) 4 76 29 72 67 1 rue de Provence - 38432 Échirolles - France www.atos.net De : Noah Misch [n...@leadboat.com] Envoyé : mardi 16 janvier 2018 17:19 À : REIX, Tony Cc : Michael Paquier; Heikki Linnakangas; Konstantin Knizhnik; PostgreSQL Hackers; Bernd Helmle; OLIVA, PASCAL; EMPEREUR-MOT, SYLVIE Objet : Re: [HACKERS] Deadlock in XLogInsert at AIX On Tue, Jan 16, 2018 at 01:50:29PM +, REIX, Tony wrote: > And, on BuildFarm, I do not see any details about the logical/physical > configuration of the AIX VMs, like hornet. > Being able to run real concurrent parallel stress programs, thus required > multi-physical-CPU VM, would help. It has 48 virtual CPUs. Here's the prtconf output: System Model: IBM,8231-E2C Machine Serial Number: 104C0CT Processor Type: PowerPC_POWER7 Processor Implementation Mode: POWER 7 Processor Version: PV_7_Compat Number Of Processors: 12 Processor Clock Speed: 3720 MHz CPU Type: 64-bit Kernel Type: 64-bit LPAR Info: 1 10-4C0CT Memory Size: 127488 MB Good Memory Size: 127488 MB Platform Firmware level: AL740_100 Firmware Version: IBM,AL740_100 Console Login: enable Auto Restart: true Full Core: false NX Crypto Acceleration: Not Capable Network Information Host Name: power-aix IP Address: 140.211.15.154 Sub Netmask: 255.255.255.0 Gateway: 140.211.15.1 Name Server: 140.211.166.130 Domain Name: osuosl.org Paging Space Information Total Paging Space: 12288MB Percent Used: 1% Volume Groups Information == Active VGs == homevg: PV_NAME PV STATE TOTAL PPs FREE PPsFREE DISTRIBUTION hdisk1active558 183 00..00..00..71..112 hdisk2active558 0 00..00..00..00..00 hdisk3active558 0 00..00..00..00..00 hdisk4active558 0 00..00..00..00..00 == rootvg: PV_NAME PV STATE TOTAL PPs FREE PPsFREE DISTRIBUTION hdisk0active558 510 111..93..83..111..112 hdisk5active558 514 111..97..83..111..112 == INSTALLED RESOURCE LIST The following resources are installed on the machine. +/- = Added or deleted from Resource List. * = Diagnostic support not available. Model Architecture: chrp Model Implementation: Multiple Processor, PCI bus + sys0 System Object + sysplanar0 System Planar * vio0 Virtual I/O Bus * vsa2 U78AB.001.WZSHZY0-P1-T2 LPAR Virtual Serial Adapter * vty2 U78AB.001.WZSHZY0-P1-T2-L0 Asynchronous Terminal * vsa1 U78AB.001.WZSHZY0-P1-T1 LPAR Virtual Serial Adapter * vty1 U78AB.001.WZSHZY0-P1-T1-L0 Asynchronous Terminal * vsa0 U8231.E2C.104C0CT-V1-C0 LPAR Virtual Serial Adapter * vty0 U8231.E2C.104C0CT-V1-C0-L0 Asynchronous Terminal * pci8 U78AB.001.WZSHZY0-P1PCI Express Bus * pci7 U78AB.001.WZSHZY0-P1PCI Express Bus * pci6 U78AB.001.WZSHZY0-P1PCI Express Bus * pci5 U78AB.001.WZSHZY0-P1PCI Express Bus * pci4 U78AB.001.WZSHZY0-P1PCI Express Bus * pci10U78AB.001.WZSHZY0-P1-C2 PCI Bus + cor0 U78AB.001.WZSHZY0-P1-C2-T1 GXT145 Graphics Adapter * pci3 U78AB.001.WZSHZY0-P1PCI Express Bus + ent0 U78AB.001.WZSHZY0-P1-C7-T1 4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004) + ent1 U78AB.001.WZSHZY0-P1-C7-T2 4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004) + ent2 U78AB.001.WZSHZY0-P1-C7-T3 4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004) + ent3 U78AB.001.WZSHZY0-P1-C7-T4 4-Port Gigabit Ethernet PCI-Express Adapter (e414571614102004) * pci2 U78AB.001.WZSHZY0-P1PCI Express Bus * pci1 U78AB.001.WZSHZY0-P1PCI Express Bus * pci9 U78AB.001.WZSHZY0-P1PCI Bus + usbhc0 U78AB.001.WZSHZY0-P1USB Host Controller (33103500) + usbhc1 U78AB.001.WZSHZY0-P1USB Host Controller (33103500) + usbhc2 U78AB.001.WZSHZY0-P1U
Re: [HACKERS] Deadlock in XLogInsert at AIX
On 01/16/2018 08:50 AM, REIX, Tony wrote: > Hi Michael > > You said: > >> Setting up a buildfarm member with the combination of compiler and >> environment where you are seeing the failures would be the best answer >> in my opinion: >> https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto >> >> This does not require special knowledge of PostgreSQL internals, and the >> in-core testing framework has improved the last couple of years to allow >> for more advanced tests. I do use it as well for some tests on my own >> modules (company stuff). The buildfarm code has also followed the pace, >> which really helps a lot, thanks to Andrew Dunstan. >> >> Developers and committers are more pro-active if they can see automated >> tests failing in the central community place. And buildfarm animals >> usually don't stay red for more than a couple of days. > Hu I quickly read this HowTo and I did not find any explanation about the > "protocole" > used for exchanging data between my VM and the PostgreSQL BuildFarm. > My machine is behind firewalls and have restricted access to the outside. > Either I'll see when that does not work... or I can get some information > about which port > (or anything else) I have to ask to be opened, if needed. > Anyway, I'll read it in depth now and I'll try to implement it. > > > Communication is only done via outbound port 443 (https). There are no passwords required and no inbound connections, ever. Uploads are signed using a shared secret. Communication can can be via a proxy. If you need the client to use a proxy with git that's a bit more complex, but possible. Ping me if you need help setting this up. cheers andrew -- Andrew Dunstanhttps://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
RE:[HACKERS] Deadlock in XLogInsert at AIX
Hi Michael You said: > Setting up a buildfarm member with the combination of compiler and > environment where you are seeing the failures would be the best answer > in my opinion: > https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto > > This does not require special knowledge of PostgreSQL internals, and the > in-core testing framework has improved the last couple of years to allow > for more advanced tests. I do use it as well for some tests on my own > modules (company stuff). The buildfarm code has also followed the pace, > which really helps a lot, thanks to Andrew Dunstan. > > Developers and committers are more pro-active if they can see automated > tests failing in the central community place. And buildfarm animals > usually don't stay red for more than a couple of days. Hu I quickly read this HowTo and I did not find any explanation about the "protocole" used for exchanging data between my VM and the PostgreSQL BuildFarm. My machine is behind firewalls and have restricted access to the outside. Either I'll see when that does not work... or I can get some information about which port (or anything else) I have to ask to be opened, if needed. Anyway, I'll read it in depth now and I'll try to implement it. About the random error we see, I guess that I may see it, though PostgreSQL BuildFarm AIX VMs do not see it, because I'm using a not-too-small VM, using variable Physical Processing units (CPUs) since my VM is uncapped (I may use all Physical CPU if available): up to 4 physical processors and up to 8 virtual processors. And, on BuildFarm, I do not see any details about the logical/physical configuration of the AIX VMs, like hornet. Being able to run real concurrent parallel stress programs, thus required multi-physical-CPU VM, would help. Regards, Tony
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Tue, Jan 16, 2018 at 08:25:51AM +, REIX, Tony wrote: > My team and my company (ATOS/Bull) are involved in improving the > quality of PostgreSQL on AIX. Cool to hear that! > We have AIX 6.1, 7.1, and 7.2 Power8 systems, with several > logical/physical processors. And I plan to have a more powerful (more > processors) machine for running PostgreSQL stress tests. > A DB-expert colleague has started to write some new not-too-complex > stress tests that we'd like to submit to PostgreSQL project later. > For now, using latest versions of XLC 12 (12.1.0.19) and 13 (13.1.3.4 > with a patch), we have only (on AIX 6.1 and 7.2) one remaining random > failure (dealing with src/bin/pgbench/t/001_pgbench.pl test), for > PostgreSQL 9.6.6 and 10.1 . And, on AIX 7.1, we have one more > remaining failure that may be due to some other dependent > software. Investigating. > XLC 13.1.3.4 shows an issue with -O2 and I have a work-around that > fixes it in ./src/backend/parser/gram.c . We have opened a PMR > (defect) against XLC. Note that our tests are now executed without the > PG_FORCE_DISABLE_INLINE "inline" trick in src/include/port/aix.h that > suppresses the inlining of routines on AIX. I think that older > versions of XLC have shown issues that have now disappeared (or, at > least, many of them). > I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1 and, > using times outputs provided by PostgreSQL tests, XLC seems to provide > at least 8% more speed. We also plan to run professional performance > tests in order to compare PostgreSQL 10.1 on AIX vs Linux/Power. I saw > some 2017 performance slides, made with older versions of PostgreSQL > and XLC, that show bad PostgreSQL performance on AIX vs Linux/Power, > and I cannot believe it. We plan to investigate this. That's interesting investigation. The community is always interested in such stories. You could have material for a conference talk. > Though I have very very little skills about PostgreSQL (I'm porting > too now GCC Go on AIX), we can help, at least by > compiling/testing/investigating/stressing in a different AIX > environment than the AIX ones (32/64bit, XLC/GCC) you have in your > BuildFarm. Setting up a buildfarm member with the combination of compiler and environment where you are seeing the failures would be the best answer in my opinion: https://wiki.postgresql.org/wiki/PostgreSQL_Buildfarm_Howto This does not require special knowledge of PostgreSQL internals, and the in-core testing framework has improved the last couple of years to allow for more advanced tests. I do use it as well for some tests on my own modules (company stuff). The buildfarm code has also followed the pace, which really helps a lot, thanks to Andrew Dunstan. Developers and committers are more pro-active if they can see automated tests failing in the central community place. And buildfarm animals usually don't stay red for more than a couple of days. -- Michael signature.asc Description: PGP signature
RE:[HACKERS] Deadlock in XLogInsert at AIX
Hi Michael, My team and my company (ATOS/Bull) are involved in improving the quality of PostgreSQL on AIX. We have AIX 6.1, 7.1, and 7.2 Power8 systems, with several logical/physical processors. And I plan to have a more powerful (more processors) machine for running PostgreSQL stress tests. A DB-expert colleague has started to write some new not-too-complex stress tests that we'd like to submit to PostgreSQL project later. For now, using latest versions of XLC 12 (12.1.0.19) and 13 (13.1.3.4 with a patch), we have only (on AIX 6.1 and 7.2) one remaining random failure (dealing with src/bin/pgbench/t/001_pgbench.pl test), for PostgreSQL 9.6.6 and 10.1 . And, on AIX 7.1, we have one more remaining failure that may be due to some other dependent software. Investigating. XLC 13.1.3.4 shows an issue with -O2 and I have a work-around that fixes it in ./src/backend/parser/gram.c . We have opened a PMR (defect) against XLC. Note that our tests are now executed without the PG_FORCE_DISABLE_INLINE "inline" trick in src/include/port/aix.h that suppresses the inlining of routines on AIX. I think that older versions of XLC have shown issues that have now disappeared (or, at least, many of them). I've been able to compare PostgreSQL compiled with XLC vs GCC 7.1 and, using times outputs provided by PostgreSQL tests, XLC seems to provide at least 8% more speed. We also plan to run professional performance tests in order to compare PostgreSQL 10.1 on AIX vs Linux/Power. I saw some 2017 performance slides, made with older versions of PostgreSQL and XLC, that show bad PostgreSQL performance on AIX vs Linux/Power, and I cannot believe it. We plan to investigate this. Though I have very very little skills about PostgreSQL (I'm porting too now GCC Go on AIX), we can help, at least by compiling/testing/investigating/stressing in a different AIX environment than the AIX ones (32/64bit, XLC/GCC) you have in your BuildFarm. Let me know how we can help. Regards, Cordialement, Tony Reix ATOS / Bull SAS ATOS Expert IBM Coop Architect & Technical Leader Office : +33 (0) 4 76 29 72 67 1 rue de Provence - 38432 Échirolles - France www.atos.net De : Michael Paquier [michael.paqu...@gmail.com] Envoyé : mardi 16 janvier 2018 08:12 À : Noah Misch Cc : Heikki Linnakangas; Konstantin Knizhnik; PostgreSQL Hackers; Bernd Helmle Objet : Re: [HACKERS] Deadlock in XLogInsert at AIX On Fri, Feb 03, 2017 at 12:26:50AM +, Noah Misch wrote: > On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote: >> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile >> pg_atomic_uint32 *ptr, >> static inline uint32 >> pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_) >> { >> +uint32 ret; >> + >> /* >> - * __fetch_and_add() emits a leading "sync" and trailing "isync", >> thereby >> - * providing sequential consistency. This is undocumented. >> + * Use __sync() before and __isync() after, like in compare-exchange >> + * above. >> */ >> -return __fetch_and_add((volatile int *)>value, add_); >> +__sync(); >> + >> +ret = __fetch_and_add((volatile int *)>value, add_); >> + >> +__isync(); >> + >> +return ret; >> } > > Since this emits double syncs with older xlc, I recommend instead replacing > the whole thing with inline asm. As I opined in the last message of the > thread you linked above, the intrinsics provide little value as abstractions > if one checks the generated code to deduce how to use them. Now that the > generated code is xlc-version-dependent, the port is better off with > compiler-independent asm like we have for ppc in s_lock.h. Could it be cleaner to just use __xlc_ver__ to avoid double syncs on past versions? I think that it would make the code more understandable than just listing directly the instructions. As there have been other bug reports from Tony Reix who has been working on AIX with XLC 13.1 and that this thread got lost in the wild, I have added an entry in the next CF: https://commitfest.postgresql.org/17/1484/ As Heikki is not around these days, Noah, could you provide a new version of the patch? This bug has been around for some time now, it would be nice to move on.. I think I could have written patches myself, but I don't have an AIX machine at hand. Of course not with XLC 13.1. -- Michael
Re: [HACKERS] Deadlock in XLogInsert at AIX
On Fri, Feb 03, 2017 at 12:26:50AM +, Noah Misch wrote: > On Wed, Feb 01, 2017 at 02:39:25PM +0200, Heikki Linnakangas wrote: >> @@ -73,11 +73,19 @@ pg_atomic_compare_exchange_u32_impl(volatile >> pg_atomic_uint32 *ptr, >> static inline uint32 >> pg_atomic_fetch_add_u32_impl(volatile pg_atomic_uint32 *ptr, int32 add_) >> { >> +uint32 ret; >> + >> /* >> - * __fetch_and_add() emits a leading "sync" and trailing "isync", >> thereby >> - * providing sequential consistency. This is undocumented. >> + * Use __sync() before and __isync() after, like in compare-exchange >> + * above. >> */ >> -return __fetch_and_add((volatile int *)>value, add_); >> +__sync(); >> + >> +ret = __fetch_and_add((volatile int *)>value, add_); >> + >> +__isync(); >> + >> +return ret; >> } > > Since this emits double syncs with older xlc, I recommend instead replacing > the whole thing with inline asm. As I opined in the last message of the > thread you linked above, the intrinsics provide little value as abstractions > if one checks the generated code to deduce how to use them. Now that the > generated code is xlc-version-dependent, the port is better off with > compiler-independent asm like we have for ppc in s_lock.h. Could it be cleaner to just use __xlc_ver__ to avoid double syncs on past versions? I think that it would make the code more understandable than just listing directly the instructions. As there have been other bug reports from Tony Reix who has been working on AIX with XLC 13.1 and that this thread got lost in the wild, I have added an entry in the next CF: https://commitfest.postgresql.org/17/1484/ As Heikki is not around these days, Noah, could you provide a new version of the patch? This bug has been around for some time now, it would be nice to move on.. I think I could have written patches myself, but I don't have an AIX machine at hand. Of course not with XLC 13.1. -- Michael signature.asc Description: PGP signature