On Tue, Mar 31, 2015 at 1:29 AM, Michael Paquier <michael.paqu...@gmail.com>
wrote:

>
>
> On Tue, Mar 31, 2015 at 3:42 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>
>> On Mon, Mar 30, 2015 at 8:54 PM, Jeff Janes <jeff.ja...@gmail.com> wrote:
>>
>>> After freeing up the rows at the end of the table so it is eligible for
>>> truncation, then running a manual VACUUM to actually release the space, I
>>> kept running into the problem that the truncation scan was consistently
>>> suspended and then aborted due to a conflicting lock requested/held.
>>>
>>> But the perversity is that that conflicting lock request can only be
>>> coming, as far as I can tell, from the autovac process.  I'm not sure how
>>> this happens, as I thought autovac never waited for locks but only obtained
>>> one if it were instantaneously available, but that it is the only
>>> explanation I can think of.
>>>
>>> I'm not seeing this in 9.4, but I'm not sure how deterministic it is so
>>> maybe that is just luck.
>>>
>>
>>
>> It looks like the culprit is this:
>>
>> commit 0d831389749a3baaced7b984205b9894a82444b9
>> Author: Alvaro Herrera <alvhe...@alvh.no-ip.org>
>> Date:   Wed Mar 18 11:52:33 2015 -0300
>>
>>     Rationalize vacuuming options and parameters
>>
>> I'd guess the autovac nature of the autovac process is getting lost in
>> there where, but I don't see where.
>>
>
> Hm. I ran a couple of tests and am noticing that a manual VACUUM is not
> able to truncate all the pages (it should, no?)... For example with your
> test case on REL9_4_STABLE VACUUM VERBOSE reports that all the pages are
> truncated:
> INFO:  00000: "pgbench_accounts": truncated 16394 to 0 pages
>
> OK, on HEAD this does not seem to work:
> INFO:  00000: "pgbench_accounts": truncated 16394 to 13554 pages
>
> But if I try as well with 4559167c (0d831389~1) I am getting a similar
> result:
> INFO:  00000: "pgbench_accounts": truncated 16394 to 3309 pages
>

I was not seeing that on commits before 0d831389--although often once the
truncation scan had run for a while without problem, I would abort the
session and call it good, as my failures were always quick ones.

Did it tell you why?  If it surrendered the lock to a competing process, it
should report that as previous INFO messages. If it doesn't give one of
those, then it probably just thinks there are some tuples it can't remove
yet somewhere.  What did it give earlier up in the verbose output, for the
number of removed and nonremovable tuples?

Cheers,

Jeff

Reply via email to