Re: Scaling XLog insertion (was Re: [HACKERS] Moving more work outside WALInsertLock)

Fujii Masao Thu, 16 Feb 2012 01:16:22 -0800

On Thu, Feb 16, 2012 at 5:02 AM, Heikki Linnakangas
<[email protected]> wrote:
> On 15.02.2012 18:52, Fujii Masao wrote:
>>
>> On Thu, Feb 16, 2012 at 1:01 AM, Heikki Linnakangas
>> <[email protected]>  wrote:
>>>
>>> Are you still seeing this failure with the latest patch I posted
>>> (http://archives.postgresql.org/message-id/[email protected])?
>>
>>
>> Yes. Just to be safe, I again applied the latest patch to HEAD,
>> compiled that and tried the same test. Then unfortunately I got the
>> same failure again.
>
>
> Ok.
>
>> I ran the configure with '--enable-debug' '--enable-cassert'
>> 'CPPFLAGS=-DWAL_DEBUG',
>> and make with -j 2 option.
>>
>> When I ran the test with wal_debug = on, I got the following assertion
>> failure.
>>
>> LOG:  INSERT @ 0/17B3F90: prev 0/17B3F10; xid 998; len 31: Heap -
>> insert: rel 1663/12277/16384; tid 0/197
>> STATEMENT:  create table t (i int); insert into t
>> values(generate_series(1,10000)); delete from t
>> LOG:  INSERT @ 0/17B3FD0: prev 0/17B3F50; xid 998; len 31: Heap -
>> insert: rel 1663/12277/16384; tid 0/198
>> STATEMENT:  create table t (i int); insert into t
>> values(generate_series(1,10000)); delete from t
>> TRAP: FailedAssertion("!(((bool) (((void*)(&(target->tid)) != ((void
>> *)0))&&  ((&(target->tid))->ip_posid != 0))))", File: "heapam.c",
>> Line: 5578)
>> LOG:  xlog bg flush request 0/17B4000; write 0/17A6000; flush 0/179D5C0
>> LOG:  xlog bg flush request 0/17B4000; write 0/17B0000; flush 0/17B0000
>> LOG:  server process (PID 16806) was terminated by signal 6: Abort trap
>>
>> This might be related to the original problem which Jeff and I saw.
>
>
> That's strange. I made a fresh checkout, too, and applied the patch, but
> still can't reproduce. I used the attached script to test it.
>
> It's surprising that the crash happens when the records are inserted, not at
> recovery. I don't see anything obviously wrong there, so could you please
> take a look around in gdb and see if you can get a clue what's going on?
> What's the stack trace?


Judging from the log messages above, one thing looks strange: the location
of the first WAL record (0/17B3F90) does not match the "prev" pointer of the
WAL record that follows it (0/17B3F50). Is this intentional?
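
To make the check above concrete, here is a minimal sketch (not part of the
patch or of PostgreSQL) that scans wal_debug output and reports every record
whose "prev" value differs from the location of the record logged just before
it. The function names and the embedded sample are hypothetical; the sample
lines are copied from the log quoted above. It assumes the log lines appear in
insertion order, which may not hold when several backends insert concurrently.

```python
import re

# Matches the "INSERT @ X/Y: prev A/B" prefix printed when wal_debug = on.
INSERT_RE = re.compile(
    r"INSERT @ ([0-9A-Fa-f]+)/([0-9A-Fa-f]+): prev ([0-9A-Fa-f]+)/([0-9A-Fa-f]+)"
)


def parse_lsn(hi, lo):
    """Combine the two hex halves of a WAL location into one integer."""
    return (int(hi, 16) << 32) | int(lo, 16)


def format_lsn(lsn):
    """Render an integer WAL location back in the X/Y hex form."""
    return "%X/%X" % (lsn >> 32, lsn & 0xFFFFFFFF)


def find_prev_mismatches(log_text):
    """Return (expected_prev, record_location, actual_prev) tuples.

    A tuple is reported whenever a record's prev pointer is not equal to
    the location of the record logged immediately before it.
    """
    mismatches = []
    last_location = None
    for m in INSERT_RE.finditer(log_text):
        location = parse_lsn(m.group(1), m.group(2))
        prev = parse_lsn(m.group(3), m.group(4))
        if last_location is not None and prev != last_location:
            mismatches.append((last_location, location, prev))
        last_location = location
    return mismatches


# Sample taken from the log messages quoted in this thread.
SAMPLE_LOG = """\
LOG:  INSERT @ 0/17B3F90: prev 0/17B3F10; xid 998; len 31: Heap - insert: rel 1663/12277/16384; tid 0/197
LOG:  INSERT @ 0/17B3FD0: prev 0/17B3F50; xid 998; len 31: Heap - insert: rel 1663/12277/16384; tid 0/198
"""

for expected, location, actual in find_prev_mismatches(SAMPLE_LOG):
    print("%s: prev is %s, expected %s"
          % (format_lsn(location), format_lsn(actual), format_lsn(expected)))
# prints: 0/17B3FD0: prev is 0/17B3F50, expected 0/17B3F90
```

Running the same function over a full server log would show whether the
mismatch is limited to these two records or appears throughout the run.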

BTW, when I ran the test on my Ubuntu machine, I could not reproduce the
problem. I could reproduce it only on MacOS.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

