Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Andres Freund
On 2017-03-24 13:50:54 -0400, Robert Haas wrote:
> On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas  wrote:
> > On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
> >> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  
> >> wrote:
> >>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
> >>>
> >>> For normal commits and aborts we already reset PgXact->xmin
> >>> Avoiding touching highly contented shmem improves concurrent
> >>> performance.
> >>>
> >>> Simon Riggs
> >>
> >> I'm getting occasional crashes with backtraces that look like this:
> >>
> >> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit= >> temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
> >> snapmgr.c:1154
> >> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
> >>
> >> I suspect that is the fault of this patch.  Please fix or revert.
> >
> > Also, the entire buildfarm is turning red.
> >
> > longfin, spurfowl, and magpie all show this assertion failure in the
> > log.  I haven't checked the others.
> >
> > TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 
> > 1154)
> 
> Another thing that is interesting is that when I run make -j8
> check-world, the overall tests appear to succeed even though there are
> failures mid-way through:
> 
> test tablefunc... FAILED (test process exited with exit code 
> 2)
> 
> ...but then later we end with:
> 
> ok
> All tests successful.
> Files=11, Tests=80, 251 wallclock secs ( 0.07 usr  0.02 sys + 19.77
> cusr 14.45 csys = 34.31 CPU)
> Result: PASS

> real4m27.421s
> user3m50.047s
> sys1m31.937s

> That's unrelated to the current problem of course, but it seems to
> suggest that make's -j option doesn't entirely do what you'd expect
> when used with make check-world.
> 

That's likely the output of a different test from the one that failed.
It's a lot easier to see the result if you're doing
&& echo success || echo failure

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Simon Riggs
On 24 March 2017 at 16:14, Robert Haas  wrote:

> I suspect that is the fault of this patch.  Please fix or revert.

Will revert then fix.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 12:27 PM, Robert Haas  wrote:
> On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
>> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
>>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>>>
>>> For normal commits and aborts we already reset PgXact->xmin
>>> Avoiding touching highly contented shmem improves concurrent
>>> performance.
>>>
>>> Simon Riggs
>>
>> I'm getting occasional crashes with backtraces that look like this:
>>
>> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=> temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
>> snapmgr.c:1154
>> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
>>
>> I suspect that is the fault of this patch.  Please fix or revert.
>
> Also, the entire buildfarm is turning red.
>
> longfin, spurfowl, and magpie all show this assertion failure in the
> log.  I haven't checked the others.
>
> TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

Another thing that is interesting is that when I run make -j8
check-world, the overall tests appear to succeed even though there are
failures mid-way through:

test tablefunc... FAILED (test process exited with exit code 2)

...but then later we end with:

ok
All tests successful.
Files=11, Tests=80, 251 wallclock secs ( 0.07 usr  0.02 sys + 19.77
cusr 14.45 csys = 34.31 CPU)
Result: PASS

real4m27.421s
user3m50.047s
sys1m31.937s

That's unrelated to the current problem of course, but it seems to
suggest that make's -j option doesn't entirely do what you'd expect
when used with make check-world.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 12:14 PM, Robert Haas  wrote:
> On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
>> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>>
>> For normal commits and aborts we already reset PgXact->xmin
>> Avoiding touching highly contented shmem improves concurrent
>> performance.
>>
>> Simon Riggs
>
> I'm getting occasional crashes with backtraces that look like this:
>
> #4  0x000107e4be2b in AtEOXact_Snapshot (isCommit= temporarily unavailable, due to optimizations>, isPrepare=0 '\0') at
> snapmgr.c:1154
> #5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
>
> I suspect that is the fault of this patch.  Please fix or revert.

Also, the entire buildfarm is turning red.

longfin, spurfowl, and magpie all show this assertion failure in the
log.  I haven't checked the others.

TRAP: FailedAssertion("!(MyPgXact->xmin == 0)", File: "snapmgr.c", Line: 1154)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Avoid SnapshotResetXmin() during AtEOXact_Snapshot()

2017-03-24 Thread Robert Haas
On Fri, Mar 24, 2017 at 10:23 AM, Simon Riggs  wrote:
> Avoid SnapshotResetXmin() during AtEOXact_Snapshot()
>
> For normal commits and aborts we already reset PgXact->xmin
> Avoiding touching highly contented shmem improves concurrent
> performance.
>
> Simon Riggs

I'm getting occasional crashes with backtraces that look like this:

#0  0x7fff9679c286 in __pthread_kill ()
#1  0x7fff94e1a9f9 in pthread_kill ()
#2  0x7fff9253a9a3 in abort ()
#3  0x000107e0659e in ExceptionalCondition (conditionName=, errorType=0x6 , fileName=, lineNumber=) at assert.c:54
#4  0x000107e4be2b in AtEOXact_Snapshot (isCommit=, isPrepare=0 '\0') at
snapmgr.c:1154
#5  0x000107a76c06 in CleanupTransaction () at xact.c:2643
#6  0x000107a76267 in CommitTransactionCommand () at xact.c:2818
#7  0x000107cecfc2 in exec_simple_query
(query_string=0x7f975481e640 "ABORT TRANSACTION") at postgres.c:2461
#8  0x000107ceabb7 in PostgresMain (argc=, argv=, dbname=, username=) at postgres.c:4071
#9  0x000107c6bb58 in PostmasterMain (argc=, argv=) at postmaster.c:4317
#10 0x000107be5cdd in main (argc=, argv=) at main.c:228

I suspect that is the fault of this patch.  Please fix or revert.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers