Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-04-03 Thread Amit Kapila
On Tuesday, April 02, 2013 9:49 PM Peter Eisentraut wrote:
 I'm going to ignore most of the discussion that led up to this and give
 this patch a fresh look.

Thank you.

 + <screen>
 + SET PERSISTENT max_connections To 10;
 + </screen>
 
 The "To" should probably be capitalized.

Okay, shall fix it.
 
 I doubt this example actually works because changing max_connections
 requires a restart.  Try to find a better example.

Any parameter changed with this command takes effect only after a
restart or a SIGHUP.
So I will change the example to use a SIGHUP parameter.

 It's weird that SET LOCAL and SET SESSION actually *set* the value, and
 the second key word determines how long the setting will last.  SET
 PERSISTENT doesn't actually set the value.  I predict that this will be
 a new favorite help-it-doesn't-work FAQ.
 
   <varlistentry>
    <term><literal>SCHEMA</literal></term>
    <listitem>
 !   <para><literal>SET [ PERSISTENT ] SCHEMA
 '<replaceable>value</>'</> is an alias for
  <literal>SET search_path TO <replaceable>value</></>.  Only one
  schema can be specified using this syntax.
 </para>
 
 I don't think [ PERSISTENT ] needs to be added in these and similar
 snippets.  We don't mention LOCAL etc. here either.

Agreed, shall fix this.

 --- 34,41 ----
    <filename>postgresql.conf</filename>, <filename>pg_hba.conf</filename>, and
    <filename>pg_ident.conf</filename> are traditionally stored in
    <varname>PGDATA</> (although in <productname>PostgreSQL</productname> 8.0 and
 !  later, it is possible to place them elsewhere). By default the directory
 !  config is stored in <varname>PGDATA</>, however it should be kept along with
 !  <filename>postgresql.conf</filename>.
    </para>
 
    <table tocentry="1" id="pgdata-contents-table">
 
 This chunk doesn't make any sense to me.  "Should" is always tricky.
 Why should I, and why might I not?

What I mean to say here is that the user needs to move the config directory
and its contents along with
postgresql.conf.
Shall we change it as below:
By default the config directory is stored in PGDATA; however, it needs to be
kept along with postgresql.conf.


     <row>
 +    <entry><filename>config</></entry>
 +    <entry>Subdirectory containing automatically generated configuration files</entry>
 +   </row>
 +
 +   <row>
      <entry><filename>base</></entry>
      <entry>Subdirectory containing per-database subdirectories</entry>
     </row>
 
 Only automatically generated ones?

No, other files can also be present.
How about changing it to:
Subdirectory containing generated configuration files.
Any other suggestions?

This new directory will be used to place generated files.

 COPY_STRING_FIELD(name);
 COPY_NODE_FIELD(args);
 COPY_SCALAR_FIELD(is_local);
 +   COPY_SCALAR_FIELD(is_persistent);
 
 return newnode;
   }
 
 I suggest changing is_local into a new trivalued field that stores
 LOCAL
 or SESSION or PERSISTENT.
 
 n->is_local = false;
 $$ = (Node *) n;
 }

Okay, I will change it.
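
For illustration, a minimal sketch of what that tri-valued field could look
like (a hypothetical sketch, not the committed design; the surrounding
VariableSetStmt fields follow parsenodes.h):

    /* hypothetical sketch only */
    typedef enum VariableSetScope
    {
        VAR_SCOPE_SESSION,          /* plain SET */
        VAR_SCOPE_LOCAL,            /* SET LOCAL */
        VAR_SCOPE_PERSISTENT        /* SET PERSISTENT */
    } VariableSetScope;

    typedef struct VariableSetStmt
    {
        NodeTag         type;
        VariableSetKind kind;
        char           *name;       /* variable to be set */
        List           *args;       /* list of A_Const nodes */
        VariableSetScope scope;     /* replaces is_local/is_persistent */
    } VariableSetStmt;

The copyfuncs.c hunk quoted above would then shrink to a single
COPY_SCALAR_FIELD(scope).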

 +   | SET PERSISTENT set_persistent
 +   {
 +   VariableSetStmt *n = $3;
 +   n->is_persistent = true;
 +   $$ = (Node *) n;
 +   }
 ;
 
   set_rest:
 
 Why can't you use SET PERSISTENT set_rest?

Because SET PERSISTENT cannot be used with some of the syntaxes, for example
SESSION AUTHORIZATION, and it is also not supported inside transaction blocks.

 
 *** a/src/backend/replication/basebackup.c
 --- b/src/backend/replication/basebackup.c
 ***
 *** 755,760  sendDir(char *path, int basepathlen, bool sizeonly)
 --- 755,766 
 strlen(PG_TEMP_FILE_PREFIX)) == 0)
 continue;
 
 +   /* skip auto conf temporary file */
 +   if (strncmp(de->d_name,
 +               PG_AUTOCONF_FILENAME ".",
 +               sizeof(PG_AUTOCONF_FILENAME)) == 0)
 +   continue;
 +
 
 Maybe pg_basebackup should be taught to ignore certain kinds of
 temporary files in general.  The file name shouldn't be hardcoded into
 pg_basebackup.  This would effectively make the configuration file
 naming scheme part of the replication protocol.  See other thread about
 pg_basebackup client/server compatibility.  This needs to be
 generalized.

As we are just trying to ignore the file during backup, why should it affect
the replication protocol?
I mean, the protocol would be affected if we tried to send something new or
didn't send what is expected on the other side.
Also, the same is done for the temp and backup label files.

 (Side thought: Does pg_basebackup copy editor temp and backup files?)

Currently pg_basebackup doesn't copy temp or backup files. The check I made
in code is similar to check for those two.
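
To make the generalization concrete, a hedged sketch of a centralized skip
test (PG_TEMP_FILE_PREFIX and PG_AUTOCONF_FILENAME exist in the tree, though
the fallback values below are assumptions; the helper itself is invented for
illustration and is not actual PostgreSQL code):

    #include <stdbool.h>
    #include <string.h>

    #ifndef PG_TEMP_FILE_PREFIX
    #define PG_TEMP_FILE_PREFIX  "pgsql_tmp"            /* as in the tree */
    #endif
    #ifndef PG_AUTOCONF_FILENAME
    #define PG_AUTOCONF_FILENAME "postgresql.auto.conf" /* assumed value */
    #endif

    /* illustrative only: one place to decide what a base backup skips */
    static bool
    skip_in_basebackup(const char *name)
    {
        /* backend temporary files, e.g. pgsql_tmpNNN.N */
        if (strncmp(name, PG_TEMP_FILE_PREFIX,
                    strlen(PG_TEMP_FILE_PREFIX)) == 0)
            return true;

        /* temporary auto-config files; sizeof() includes the NUL,
         * so this matches the base name plus a trailing "." */
        if (strncmp(name, PG_AUTOCONF_FILENAME ".",
                    sizeof(PG_AUTOCONF_FILENAME)) == 0)
            return true;

        return false;
    }

sendDir() would then have a single test to maintain rather than one
hardcoded comparison per file kind.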

 

Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-04-03 Thread Amit Kapila
On Wednesday, April 03, 2013 2:23 AM Greg Smith wrote:
 On 4/1/13 5:44 AM, Amit Kapila wrote:
 
  I think in that case we can have 3 separate patches
  1. Memory growth defect fix
  2. Default postgresql.conf to include config directory and SET
 Persistent
  into single file implementation
  3. Rearrangement of GUC validation into validate_conf_option
 function.
 
  As already there is review happened for point 2 and point 1 is an
 existing
  code defect fix, so in my opinion
  Patch 1 and 2 should be considered for 9.3.
 
 They have been considered for 9.3, I just doubt they could get
 committed
 right now.  In order for this to go in as part of the very last 9.3
 feature changes (which is where we're at in the development cycle),
 you'd need to have a committer volunteer to take on the job of doing a
 second level of review on it.  And then that would have to happen
 without any other issues popping up.  That usually is not how it works.
   Normally the first round of committer review finds another round of
 issues, and there's at least one more update before commit.  (Note that
 this is exactly what just happened today, with the review from Peter
 Eisentraut)
 
 I'm not saying it's impossible for this feature to go in to 9.3, but
 I'm
 not seeing any committers volunteer to take on the job either.  I do
 want to see this feature go in--I'll update it for 9.4 even if you
 don't--but we're already into April.  There isn't much time left for
 9.3.  And the security release this week has gobbled up a good chunk of
 committer and packager time unexpectedly, which is just bad luck for
 your submission.
 
  From a process perspective, features that enter the last CF of a
 release that are very close to ready from the start have good odds of
 being committed.  You've done an excellent job of updating this in
 response to feedback, but it has involved a long list of changes so
 far.
   It's fair to say this was still a rough feature at the start of CF
 2013-01, and now it's good but can be usefully polished a bit more.
 
 For something of this size, going from rough feature to commit quality
 normally takes more than one CF.  I don't have any obvious errors to
 point out right now.  But I think there's still some room to improve on
 this before commit.  Andres mentioned on another thread that he thought
 merging some of your ideas with the version Zoltan did was useful to
 look at, and I was thinking of something similar.  This is close to
 being ready, and I hope you won't get discouraged just because it's
 probably going to slip to 9.4.

Thank you for keeping me motivated for this patch now and throughout the
last CF by providing valuable suggestions and comments.


With Regards,
Amit Kapila.





Re: [HACKERS] spoonbill vs. -HEAD

2013-04-03 Thread Stefan Kaltenbrunner
On 04/03/2013 12:59 AM, Tom Lane wrote:
 I wrote:
 I think the simplest fix is to insert PG_SETMASK(UnBlockSig) into
 StatementCancelHandler() and any other handlers that might exit via
 longjmp.  I'm a bit inclined to only do this on platforms where a
 problem is demonstrable, which so far is only OpenBSD.  (You'd
 think that all BSDen would have the same issue, but the buildfarm
 shows otherwise.)
 
 BTW, on further thought it seems like maybe this is an OpenBSD bug,
 at least in part: what is evidently happening is that the temporary
 blockage of SIGINT during the handler persists even after we've
 longjmp'd back to the main loop.  But we're using sigsetjmp(..., 1)
 to establish that longjmp handler --- so why isn't the original signal
 mask reinstalled when we return to the main loop?
 
 If (your version of) OpenBSD is getting this wrong, it'd explain why
 we've not seen similar behavior elsewhere.

hmm trolling the openbsd cvs history brings up this:

http://www.openbsd.org/cgi-bin/cvsweb/src/sys/arch/sparc64/sparc64/machdep.c?r1=1.143;sortby=date#rev1.143


Stefan




Re: [HACKERS] WIP: index support for regexp search

2013-04-03 Thread Erikjan Rijkers
On Tue, April 2, 2013 23:54, Alexander Korotkov wrote:

 [trgm-regexp-0.15.patch.gz]

Yes, it does look good now; Attached a list of measurements. Most of the 
searches that I put in
that test-program are now speeded up very much.

There still are a few regressions, for example:

HEAD  azjunk6  x[aeiou]{4,5}q  83  Seq Scan  1393.465 ms
trgm_regex15  azjunk6  x[aeiou]{4,5}q  83  Bitmap Heap Scan  1728.319 ms

HEAD  azjunk7  x[aeiou]{1,3}q  190031  Seq Scan 16819.555 ms
trgm_regex15  azjunk7  x[aeiou]{1,3}q  190031  Bitmap Heap Scan 21286.804 ms

Not exactly negligible, and ideally those regressions would be removed but with 
the huge
advantages for other cases I'd say it's worth it.

hth,

Erik Rijkers





re-head-13-15-20130403-0708.txt.bz2
Description: BZip2 compressed data



Re: [HACKERS] WIP: index support for regexp search

2013-04-03 Thread Alexander Korotkov
On Wed, Apr 3, 2013 at 11:10 AM, Erikjan Rijkers e...@xs4all.nl wrote:

 On Tue, April 2, 2013 23:54, Alexander Korotkov wrote:

  [trgm-regexp-0.15.patch.gz]

 Yes, it does look good now; Attached a list of measurements. Most of the
 searches that I put in
 that test-program are now speeded up very much.

 There still are a few regressions, for example:

 HEAD  azjunk6  x[aeiou]{4,5}q  83  Seq Scan
  1393.465 ms
 trgm_regex15  azjunk6  x[aeiou]{4,5}q  83  Bitmap Heap Scan
  1728.319 ms

 HEAD  azjunk7  x[aeiou]{1,3}q  190031  Seq Scan
 16819.555 ms
 trgm_regex15  azjunk7  x[aeiou]{1,3}q  190031  Bitmap Heap Scan
 21286.804 ms

 Not exactly negligible, and ideally those regressions would be removed but
 with the huge
 advantages for other cases I'd say it's worth it.


Thank you for testing!
Exploring the results in more detail, I found version 13 to be buggy. This
version is a dead end; we have quite a different API now. Could you use v12
instead of v13 in the comparison, please?
Sometimes we have a regression in comparison with head for two reasons:
1) We select an index scan in both cases, but with the patch we spend more
time on analysis. That is an inevitable disadvantage of any index. We can
only take care that the analysis doesn't take too long. Current testing
results don't show this reason to be significant.
2) Sometimes we select an index scan while a sequential scan would be faster.
That is also an inevitable disadvantage until we have relevant statistics. We
have a similar situation now, for example, with in-core geometrical search
and LIKE/ILIKE search in pg_trgm. However, the situation could probably be
improved somehow even without such statistics. But I don't think we can draw
such a conclusion based on synthetic testing, because improvements for
synthetic cases could turn out to be a worsening for real-life cases.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] regression test failed when enabling checksum

2013-04-03 Thread Andres Freund
On 2013-04-01 19:51:19 -0700, Jeff Janes wrote:
 On Mon, Apr 1, 2013 at 10:37 AM, Jeff Janes jeff.ja...@gmail.com wrote:
 
  On Tue, Mar 26, 2013 at 4:23 PM, Jeff Davis pg...@j-davis.com wrote:
 
 
  Patch attached. Only brief testing done, so I might have missed
  something. I will look more closely later.
 
 
  After applying your patch, I could run the stress test described here:
 
  http://archives.postgresql.org/pgsql-hackers/2012-02/msg01227.php
 
  But altered to make use of initdb -k, of course.
 
  Over 10,000 cycles of crash and recovery, I encountered two cases of
  checksum failures after recovery, example:
  ...
 
 
 
  Unfortunately I already cleaned up the data directory before noticing the
  problem, so I have nothing to post for forensic analysis.  I'll try to
  reproduce the problem.
 
 
 I've reproduced the problem, this time in block 74 of relation
 base/16384/4931589, and a tarball of the data directory is here:
 
 https://docs.google.com/file/d/0Bzqrh1SO9FcELS1majlFcTZsR0k/edit?usp=sharing
 
 (the table is in database jjanes under role jjanes, the binary is commit
 9ad27c215362df436f8c)
 
 What I would probably really want is the data as it existed after the crash
 but before recovery started, but since the postmaster immediately starts
 recovery after the crash, I don't know of a good way to capture this.
 
 I guess one thing to do would be to extract from the WAL the most recent
 FPW for block 74 of relation base/16384/4931589  (assuming it hasn't been
 recycled already) and see if it matches what is actually in that block of
 that data file, but I don't currently know how to do that.
 
 11500 SELECT 2013-04-01 12:01:56.926 PDT:WARNING:  page verification
 failed, calculated checksum 54570 but expected 34212
 11500 SELECT 2013-04-01 12:01:56.926 PDT:ERROR:  invalid page in block 74
 of relation base/16384/4931589
 11500 SELECT 2013-04-01 12:01:56.926 PDT:STATEMENT:  select sum(count) from
 foo

I just checked and unfortunately your dump doesn't contain all that much
valid WAL:
rmgr: XLOGlen (rec/tot): 72/   104, tx:  0, lsn: 
7/AB28, prev 7/AA90, bkp: , desc: checkpoint: redo 7/AB28; tli 
1; prev tli 1; fpw true; xid 0/156747297; oid 4939781; multi 1; offset 0; 
oldest xid 1799 in DB 1; oldest multi 1 in DB 1; oldest running xid 0; online
rmgr: XLOGlen (rec/tot): 72/   104, tx:  0, lsn: 
7/AB90, prev 7/AB28, bkp: , desc: checkpoint: redo 7/AB90; tli 
1; prev tli 1; fpw true; xid 0/156747297; oid 4939781; multi 1; offset 0; 
oldest xid 1799 in DB 1; oldest multi 1 in DB 1; oldest running xid 0; shutdown
pg_xlogdump: FATAL:  error in WAL record at 7/AB90: record with zero length 
at 7/ABF8

So just two checkpoint records.

Unfortunately I fear that won't be enough to diagnose the problem;
could you reproduce it with a higher wal_keep_segments?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] CREATE EXTENSION BLOCKS

2013-04-03 Thread Albe Laurenz
David E. Wheeler wrote:
 I am working on scripts to copy data from Oracle via oracle_fdw. They each do 
 something like this:
 
 CREATE SCHEMA migrate_stuff;
 SET search_path TO migrate_stuff,public;
 CREATE EXTENSION oracle_fdw SCHEMA migrate_rules;
 
[...]
 
 Then I run them in parallel:
 
 for file in migrate*.sql; do
 psql -d foo -f $file &
 done
 wait
 
 This works fine except for one thing: the first CREATE EXTENSION statement 
 blocks all the others. Even
 when I create the extension in separate schemas in each script! I have to 
 remove the CREATE EXTENSION
 statement, create it in public before any of the scripts run, then drop it 
 when they're done. I'm okay
 with this workaround, but wasn't sure if the blocking of CREATE EXTENSION was 
 intentional or a known
 issue (I did not see it documented in 
 http://www.postgresql.org/docs/current/static/sql-
 createextension.html).

I'd expect one of the CREATE EXTENSION commands to succeed
and the others to block until the transaction is committed,
then to fail with 'extension oracle_fdw already exists'.

If that is what happens, it's what I'd expect since
extension names are unique (see the unique constraint on
pg_extension).

Yours,
Laurenz Albe




Re: [HACKERS] Extra security measures for next week's releases

2013-04-03 Thread roadrunner6

Am 28.03.2013 18:03, schrieb Tom Lane:

The core committee has decided that one of the security issues due to be
fixed next week is sufficiently bad that we need to take extra measures
to prevent it from becoming public before packages containing the fix
are available.  (This is a scenario we've discussed before, but never
had to actually implement.)



8.3 reached EOL in February 2013; I guess there will be no fix for 
8.3, right?







Re: [HACKERS] Extra security measures for next week's releases

2013-04-03 Thread Magnus Hagander
On Wed, Apr 3, 2013 at 12:09 PM,  roadrunn...@gmx.at wrote:
 Am 28.03.2013 18:03, schrieb Tom Lane:

 The core committee has decided that one of the security issues due to be
 fixed next week is sufficiently bad that we need to take extra measures
 to prevent it from becoming public before packages containing the fix
 are available.  (This is a scenario we've discussed before, but never
 had to actually implement.)


 8.3 has reached EOL in February 2013, I guess there will be no fix for 8.3,
 right?

That is correct.

Some distributions may backpatch fixes manually, but there will be no
official patch for 8.3.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Florian Pflug
On Apr1, 2013, at 13:43 , Robert Haas robertmh...@gmail.com wrote:
 I don't think the current behavior is broken.  I found it
 counterintuitive at first, but then I got used to it.  It's reasonably
 self-consistent: arrays can't have empty dimensions, therefore the
 empty array is unique and dimensionless.  Is that the behavior I would
 have picked if I had designed the type?  No, it isn't.  I wouldn't
 have tried to support one-dimensional arrays and multi-dimensional
 arrays in the same data type either, nor would I have supported
 non-default lower bounds.  But all of those ships have sailed, and the
 time to change them is not after people have spent 10 years building
 applications that work with the current behavior.  If we want to
 introduce a new type with different, perhaps better, behavior, well, I
 think that might be a fine idea.  But I *don't* think imposing a hard
 compatibility break on users of arrays is a good idea.

+1.

Especially since having infinitely many variants of empty is IMHO
hardly an improvement over the existing situation.

If we're going to break compatibility, we should IMHO get rid of
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most languages' native array types…

best regards,
Florian Pflug





[HACKERS] Typo in documentation for function to_json

2013-04-03 Thread Michael Paquier
Hi all,

While reading some documentation about json functions, I found a typo with
the example used with function to_json. A bracket was missing. Please find
attached the patch correcting that.

Regards,
-- 
Michael


20120403_to_json_typo.patch
Description: Binary data



Re: [HACKERS] in-catalog Extension Scripts and Control parameters (templates?)

2013-04-03 Thread Dimitri Fontaine
Hi,

Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Documentation doesn't build, multiple errors. In addition to the reference
 pages, there should be a section in the main docs about these templates.

 I'm now working on that, setting up the documentation tool set.

Fixed in the attached version 6 of the patch.

I couldn't manage to fix the documentation build tool chain on my main
workstation, so I had to revive a VM where I was lucky enough to find my
old setup was still working, all with shared directories, VPATH build
setup etc. Anyways.

I also expanded a little on the docs we did have, adding a section in
the main extension chapter and some example in the CREATE TEMPLATE FOR
EXTENSION reference page.

That version has no flaw that I'm aware of and implements the design we
reached after almost 2 years on the topic which is more complex than it
would appear at first sight. I hope that you will like the patch if not
the feature itself, as I have big plans for it in the near future.

Many thanks to all involved for helping with the design and the
reviewing, regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



templates.v6.patch.gz
Description: Binary data



Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Andrew Dunstan


On 04/02/2013 02:46 PM, Florian Pflug wrote:

On Apr1, 2013, at 13:43 , Robert Haas robertmh...@gmail.com wrote:

I don't think the current behavior is broken.  I found it
counterintuitive at first, but then I got used to it.  It's reasonably
self-consistent: arrays can't have empty dimensions, therefore the
empty array is unique and dimensionless.  Is that the behavior I would
have picked if I had designed the type?  No, it isn't.  I wouldn't
have tried to support one-dimensional arrays and multi-dimensional
arrays in the same data type either, nor would I have supported
non-default lower bounds.  But all of those ships have sailed, and the
time to change them is not after people have spent 10 years building
applications that work with the current behavior.  If we want to
introduce a new type with different, perhaps better, behavior, well, I
think that might be a fine idea.  But I *don't* think imposing a hard
compatibility break on users of arrays is a good idea.

+1.

Especially since having infinitely many variants of empty is IMHO
hardly an improvement over the existing situation.

If we're going to break compatibility, we should IMHO get rid of
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…



That would actually break a HUGE number of users, since the default 
lower bound is 1. I have seen any number of pieces of code that rely on 
that.


Doesn't the standard have something to say about all this?

cheers

andrew





Re: [HACKERS] Typo in documentation for function to_json

2013-04-03 Thread Andrew Dunstan


On 04/03/2013 09:22 AM, Michael Paquier wrote:

Hi all,

While reading some documentation about json functions, I found a typo 
with the example used with function to_json. A bracket was missing. 
Please find attached the patch correcting that.





I have a patch for this and other docs errors waiting in the wings. I 
will commit it when the current hiatus is done, some time tomorrow.


cheers

andrew




Re: [HACKERS] Typo in documentation for function to_json

2013-04-03 Thread Michael Paquier
On Wed, Apr 3, 2013 at 10:32 PM, Andrew Dunstan and...@dunslane.net wrote:

 I have a patch for this and other docs errors waiting in the wings. I will
 commit it when the current hiatus is done, some time tomorrow.

OK thanks.
-- 
Michael


[HACKERS] Typo in FDW documentation

2013-04-03 Thread Albe Laurenz
I found a small typo, patch attached.

Yours,
Laurenz Albe


fdw-doc.patch
Description: fdw-doc.patch



Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Robert Haas
On Tue, Apr 2, 2013 at 1:20 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2013-04-02 12:56:56 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-04-02 12:22:03 -0400, Tom Lane wrote:
  I agree in general, though I'm not sure the bgwriter process can
  reasonably handle this need along with what it's already supposed to be
  doing.  We may need another background process that is just responsible
  for keeping the freelist populated.

  What else is the bgwriter actually doing otherwise? Sure, it doesn't put the
  pages on the freelist, but otherwise it's trying to do the above, isn't it?

 I think it will be problematic to tie buffer-undirtying to putting both
 clean and dirty buffers into the freelist.  It might chance to work all
 right to use one scan process for both, but I'm afraid it's more likely
 that we'd end up either overserving one goal or underserving the other.

 Hm. I had imagined that we would only ever put clean buffers into the
 freelist and that we would never write out a buffer that we don't need
 for a new page. I don't see much point in randomly writing out buffers
 that won't be needed for something else soon. Currently we can't do much
 better than basically undirtying random buffers since we don't really know
 which page will reach a usagecount of zero since bgwriter doesn't
 manipulate usagecounts.

 One other scenario I can see is the problem of strategy buffer reusage
 of dirtied pages (hint bits, pruning) during seqscans where we would
 benefit from pages being written out fast, but I can't imagine that that
 could be handled very well by something like the bgwriter?

 Am I missing something?

I've had the same thought.  I think we should consider having a
background process that listens on a queue, sort of like the fsync
absorption queue.  When a process using a buffer access strategy
dirties a buffer, it adds it to that queue and sets the latch for the
background process, which then wakes up and starts cleaning the
buffers that have been added to its queue.  The hope is that, by the
time the ring buffer wraps around, the background process will have
cleaned the buffer, preventing the foreground process from having to
wait for the buffer write (and, perhaps, xlog flush).

The main hesitation I've had about actually implementing such a scheme
is that I find it a bit unappealing to have a background process
dedicated to just this.  But maybe it could be combined with some of
the other ideas presented here.  Perhaps we should have one process
that scans the buffer arena and populates the freelists; as a side
effect, if it runs across a dirty buffer, it kicks it over to the
process described in the previous paragraph (which could still, also,
absorb requests from other backends using buffer access strategies).
Then we'd end up with nothing that looks exactly like the background
writer we have now, but maybe no one would miss it.

I think that as we go through the process of trying to improve this,
we should also look hard at trying to make the algorithms more
self-tuning.  For example, instead of having a fixed delay between
rounds for the buffer-arena-scanning process, I think we should try to
make it adaptive.  If it sticks a bunch of buffers on the freelist and
the freelist then runs dry before it wakes up again, the backend that
notices that the list is empty (or below some low watermark) should
set a latch to wake up the buffer-arena-scanning process; and
the next time that process goes back to sleep, it should sleep for a
shorter period of time.  As things are today, what the background
writer actually does is unhelpful enough that there might not be much
point in fiddling with this, but as we get to having a more sensible
scheme overall, I think it will pay dividends.
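
As a self-contained sketch of that adaptive-delay idea (nothing below is
PostgreSQL code; all names and constants are invented for illustration):

    #include <stdbool.h>

    enum { MIN_DELAY_MS = 10, MAX_DELAY_MS = 1000 };

    /*
     * Called once per scanning round: shrink the sleep when a backend
     * found the freelist empty (or below the low watermark) and woke us
     * early; otherwise drift back toward the maximum delay.
     */
    static int
    next_scan_delay(int cur_delay_ms, bool freelist_ran_dry)
    {
        if (freelist_ran_dry)
        {
            cur_delay_ms /= 2;
            if (cur_delay_ms < MIN_DELAY_MS)
                cur_delay_ms = MIN_DELAY_MS;
        }
        else
        {
            cur_delay_ms += cur_delay_ms / 4 + 1;
            if (cur_delay_ms > MAX_DELAY_MS)
                cur_delay_ms = MAX_DELAY_MS;
        }
        return cur_delay_ms;
    }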

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Florian Pflug
On Apr3, 2013, at 15:30 , Andrew Dunstan and...@dunslane.net wrote:
 On 04/02/2013 02:46 PM, Florian Pflug wrote:
 If we're going to break compatibility, we should IMHO get rid of
 non-zero lower bounds all together. My guess is that the number of
 affected users wouldn't be much higher than for the proposed patch,
 and it'd allow lossless mapping to most language's native array types…
 
 That would actually break a HUGE number of users, since the default lower
 bound is 1. I have seen any number of pieces if code that rely on that.

Uh, yeah, we should make it 1, not 0, then. As long as the bound
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.

best regards,
Florian Pflug





Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 04/02/2013 02:46 PM, Florian Pflug wrote:
 If we're going to break compatibility, we should IMHO get rid of
 non-zero lower bounds all together. My guess is that the number of
 affected users wouldn't be much higher than for the proposed patch,
 and it'd allow lossless mapping to most language's native array types…

 That would actually break a HUGE number of users, since the default 
 lower bound is 1. I have seen any number of pieces if code that rely on 
 that.

I assume he meant non-one lower bounds.  But in any case, removing the
option for other lower bounds would be very clearly a removal of useful
functionality, so I don't see it happening ... certainly not if we're
so tied to bug-compatibility that we can't redefine the number of
dimensions an empty array has.

I think though that the upthread argument that we'd have multiple
interpretations of the same thing is bogus.  To me, the core idea that's
being suggested here is that '{}' should mean a zero-length 1-D array,
not a zero-D array as formerly.  We would still need a way to represent
zero-D arrays, if only because they'd still exist on-disk in existing
databases (assuming we're not willing to break pg_upgrade for this).
I suggest that that ought *not* involve any braces.  Perhaps '[]=' would
be a suitable representation.  In the other direction, ISTM that
'{{},{},{}}' is a zero-by-three array, entirely distinct from '{}' or
'{{}}' in dimensionality if not content.  I haven't worked this out in
complete detail, but if different strings mean the same thing then we
don't have the representation defined right.

regards, tom lane




Re: [HACKERS] Typo in documentation for function to_json

2013-04-03 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 I have a patch for this and other docs errors waiting in the wings. I 
 will commit it when the current hiatus is done, some time tomorrow.

There's no reason not to commit now.  The blockage is on propagation to
the anongit mirror, not on what committers can do.

regards, tom lane




Re: [HACKERS] Typo in documentation for function to_json

2013-04-03 Thread Andrew Dunstan


On 04/03/2013 10:11 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

I have a patch for this and other docs errors waiting in the wings. I
will commit it when the current hiatus is done, some time tomorrow.

There's no reason not to commit now.  The blockage is on propagation to
the anongit mirror, not on what committers can do.





True, but there's no great advantage either. I am not planning to commit 
anything else before then, and I like to check the committers email when 
I commit to make sure everything is kosher.


cheers

andrew




Re: [HACKERS] Typo in FDW documentation

2013-04-03 Thread Tom Lane
Albe Laurenz laurenz.a...@wien.gv.at writes:
 I found a small typo, patch attached.

Drat, thought I'd fixed that before, but obviously not.  Done now, thanks.

regards, tom lane




Re: [HACKERS] Drastic performance loss in assert-enabled build in HEAD

2013-04-03 Thread Tom Lane
[ sorry for being slow to respond, things are crazy this week ]

Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 Anyway, the immediate takeaway is that this represents a horribly
 expensive way for pg_dump to find out which matviews aren't
 scannable.  The cheap way for it to find out would be if we had a
 boolean flag for that in pg_class.  Would you review the bidding
 as to why things were done the way they are?  Because in general,
 having to ask the kernel something is going to suck in any case,
 so basing it on the size of the disk file doesn't seem to me to
 be basically a good thing.

 The early versions of the patch had a boolean in pg_class to track
 this, but I got complaints from Robert and Noah (and possibly
 others?) that this got too ugly in combination with the support for
 unlogged matviews, and they suggested the current approach.

Meh.  I don't think that we should allow that corner case to drive the
design like this.  I would *far* rather not have unlogged matviews at
all than this boondoggle.  I'm not even convinced that the existing
semantics are desirable: who's to say that having an unlogged matview
revert to empty on crash renders it invalid?  If it's a summary of
underlying unlogged tables, then that's actually the right thing.
(If someone is desperately unhappy with such a behavior, why are they
insisting on it being unlogged, anyway?)

If we go with this implementation, I think we are going to be painting
ourselves into a corner that will be difficult or impossible to get out
of, because the scannability state of a matview is not just a matter of
catalog contents but of on-disk files.  That will make it difficult to
impossible for pg_upgrade to cope reasonably with any future rethinking
of scannability status.

In fact, I'm going to go further and say that I do not like the entire
concept of scannability, either as to design or implementation, and
I think we should just plain rip it out.  We can rethink that for 9.4
or later, but what we've got right now is a kluge, and I don't want
to find ourselves required to be bug-compatible with it forevermore.
To take just the most salient point: assuming that we've somehow
determined that some matview is too out-of-date to be usable, why is the
appropriate response to that to cause queries on it to fail altogether?
That seems like about the least useful feature that could be devised.
Why not, say, have queries fall back to expanding the view definition as
though it were a regular view?

I think matviews without any scannability control are already a pretty
useful feature, and one I'd not be ashamed to ship just like that for
9.3.  But the scannability stuff is clearly going to need to be
revisited.  IMO, driving a stake in the ground at the exact spot that
this stake is placed is not a good plan.  If we simply remove that
aspect altogether for now, I think we'll have more room to maneuver in
future releases.

I apologize for not having griped about this earlier, but I've really
paid no attention to the matviews work up to now; there are only so
many cycles in a day.

regards, tom lane




[HACKERS] Regex with 32k different chars causes a backend crash

2013-04-03 Thread Heikki Linnakangas
While playing with Alexander's pg_trgm regexp patch, I noticed that the 
regexp library trips an assertion (if enabled) or crashes, when passed 
an input string that contains more than 32k different characters:


select 'foo' ~ (select string_agg(chr(x),'') from generate_series(100, 
35000) x) as nastyregex;


This is because it uses 'short' as the datatype to identify colors. When 
it overflows, -32768 is used as an index into the colordesc array, and you 
get a crash. AFAICS this can't reliably be used for anything more 
sinister than crashing the backend.


A regex with that many different colors is an extreme case, so I think 
it's enough to turn the assertion in newcolor() into a run-time check, 
and throw a "too many colors in regexp" error. Alternatively, we could 
expand 'color' from short to int, but that would double the memory usage 
of sane regexps with fewer distinct characters.
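
A rough sketch of what that run-time check could look like (CERR() and
COLORLESS are existing regex-library internals, but the MAX_COLOR limit
and the REG_ECOLORS error code are illustrative names, not committed ones):

    /* in newcolor(), in place of the assertion -- sketch only */
    if (cm->max == MAX_COLOR)
    {
        CERR(REG_ECOLORS);      /* "too many colors in regexp" */
        return COLORLESS;
    }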


Thoughts?

- Heikki




Re: [HACKERS] Regex with 32k different chars causes a backend crash

2013-04-03 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 A regex with that many different colors is an extreme case, so I think 
 it's enough to turn the assertion in newcolor() into a run-time check, 
 and throw a too many colors in regexp error. Alternatively, we could 
 expand 'color' from short to int, but that would double the memory usage 
 of sane regexps with less different characters.

Obviously Henry didn't think that far ahead.  I agree that throwing
an error is the best solution, and that widening color is probably
not what we want to do.  You want to fix that, or shall I?

regards, tom lane




Re: [HACKERS] Regex with 32k different chars causes a backend crash

2013-04-03 Thread Heikki Linnakangas

On 03.04.2013 18:21, Tom Lane wrote:

Heikki Linnakangashlinnakan...@vmware.com  writes:

A regex with that many different colors is an extreme case, so I think
it's enough to turn the assertion in newcolor() into a run-time check,
and throw a too many colors in regexp error. Alternatively, we could
expand 'color' from short to int, but that would double the memory usage
of sane regexps with less different characters.


Obviously Henry didn't think that far ahead.  I agree that throwing
an error is the best solution, and that widening color is probably
not what we want to do.  You want to fix that, or shall I?


I can do it. I assume that Tcl has the same bug, so I'll submit a report 
there, too.


- Heikki




[HACKERS] c language functions

2013-04-03 Thread Rodrigo Barboza
Hello.
I'm trying to create a generic add function.
I have defined a type my_uint and it needs a '+' operator.
This operator should work like normal int + int operation.
The function is defined expecting arguments (my_uint, anyelement).

I'm confused about how to retrieve the anyelement type and value, then do the
add operation and return the correct value and type.
I tried to use PG_GETARG_DATUM, but I don't know how to extract the value
from it (should it be a uint32, uint64, float or double).

Any tips?
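
(A minimal sketch of the usual approach: look up the actual type of the
polymorphic argument with get_fn_expr_argtype() and branch on it. The
function name and error text are illustrative; the my_uint handling itself
is elided:)

    #include "postgres.h"
    #include "fmgr.h"
    #include "catalog/pg_type.h"

    PG_FUNCTION_INFO_V1(my_uint_add_any);

    Datum
    my_uint_add_any(PG_FUNCTION_ARGS)
    {
        /* OID of the type actually passed as the second argument */
        Oid argtype = get_fn_expr_argtype(fcinfo->flinfo, 1);

        switch (argtype)
        {
            case INT4OID:
                /* int32 v = PG_GETARG_INT32(1); ... */
                break;
            case INT8OID:
                /* int64 v = PG_GETARG_INT64(1); ... */
                break;
            case FLOAT8OID:
                /* float8 v = PG_GETARG_FLOAT8(1); ... */
                break;
            default:
                ereport(ERROR,
                        (errcode(ERRCODE_DATATYPE_MISMATCH),
                         errmsg("operator + not defined for this type")));
        }
        /* do the addition, range-check, and return a my_uint datum */
        PG_RETURN_NULL();
    }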


Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Gavin Flower

On 04/04/13 03:02, Florian Pflug wrote:

On Apr3, 2013, at 15:30 , Andrew Dunstan and...@dunslane.net wrote:

On 04/02/2013 02:46 PM, Florian Pflug wrote:

If we're going to break compatibility, we should IMHO get rid of
non-zero lower bounds all together. My guess is that the number of
affected users wouldn't be much higher than for the proposed patch,
and it'd allow lossless mapping to most language's native array types…

That would actually break a HUGE number of users, since the default lower
bound is 1. I have seen any number of pieces if code that rely on that.

Uh, yeah, we should make it 1 then, not 0, then. As long as the bound
is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
be lossless.

best regards,
Florian Pflug


Zero as the default lower bound is consistent with most languages 
(especially the common ones like C, C++, Java, and Python), in fact I 
don't remember any language where that is not the case (ignoring SQL) - 
and I've written programs in about 20 languages.


Maybe we should adopt the famous compromise of '0.5'?  :-)


Cheers,
Gavin





Re: [HACKERS] Regex with 32k different chars causes a backend crash

2013-04-03 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 On 03.04.2013 18:21, Tom Lane wrote:
 Obviously Henry didn't think that far ahead.  I agree that throwing
 an error is the best solution, and that widening color is probably
 not what we want to do.  You want to fix that, or shall I?

 I can do it. I assume that Tcl has the same bug, so I'll submit a report 
 there, too.

Yes, definitely.

It occurs to me that at some point it might be useful to convert color
to unsigned short, so that you could have 64K of them --- but we'd still
need the error check anyway, and there's no reason to tackle such a
change today.  (This possible change is, however, another reason to not
want pg_trgm looking directly at the internals of the data structure...)

regards, tom lane




Re: [HACKERS] commit dfda6ebaec67 versus wal_keep_segments

2013-04-03 Thread Jeff Janes
On Tue, Apr 2, 2013 at 10:08 PM, Jeff Janes jeff.ja...@gmail.com wrote:

 This commit introduced a problem with wal_keep_segments:

 commit dfda6ebaec6763090fb78b458a979b558c50b39b



The problem seems to be that the underflow warned about is happening,
because the check to guard it was checking the wrong thing.  However, I
don't really understand KeepLogSeg.  It seems like segno, and hence recptr,
don't actually serve any purpose.

Cheers,

Jeff

diff --git a/src/backend/access/transam/xlog.c
b/src/backend/access/transam/xlog.c
index 2f9209f..3643be8 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -8264,7 +8264,7 @@ KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
XLByteToSeg(recptr, segno);

/* avoid underflow, don't go below 1 */
-   if (segno <= wal_keep_segments)
+   if (*logSegNo <= wal_keep_segments)
segno = 1;
else
segno = *logSegNo - wal_keep_segments;
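
For context, a paraphrase of the function with the fix applied (my reading
of the 9.3-era xlog.c, not a verbatim copy); note that after the fix the
XLByteToSeg() result is immediately overwritten on both branches, which is
presumably why segno and recptr look purposeless:

    /* paraphrased sketch of KeepLogSeg() */
    XLogSegNo   segno;

    XLByteToSeg(recptr, segno);         /* segment containing recptr */

    /* avoid underflow, don't go below 1 */
    if (*logSegNo <= wal_keep_segments)
        segno = 1;
    else
        segno = *logSegNo - wal_keep_segments;

    /* don't let *logSegNo advance past the retention point */
    if (segno < *logSegNo)
        *logSegNo = segno;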


Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Pavel Stehule
2013/4/3 Gavin Flower gavinflo...@archidevsys.co.nz

 On 04/04/13 03:02, Florian Pflug wrote:

 On Apr3, 2013, at 15:30 , Andrew Dunstan and...@dunslane.net wrote:

 On 04/02/2013 02:46 PM, Florian Pflug wrote:

 If we're going to break compatibility, we should IMHO get rid of
 non-zero lower bounds all together. My guess is that the number of
 affected users wouldn't be much higher than for the proposed patch,
 and it'd allow lossless mapping to most language's native array types…

 That would actually break a HUGE number of users, since the default lower
 bound is 1. I have seen any number of pieces if code that rely on that.

 Uh, yeah, we should make it 1 then, not 0, then. As long as the bound
 is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
 be lossless.

 best regards,
 Florian Pflug


  Zero as the default lower bound is consistent with most languages
 (especially the common ones like C, C++, Java,  Python), in fact I don't
 remember any language where that is not the case (ignoring SQL) - and I've
 written programs in about 20 languages.


Pascal, Ada, and ALGOL-like languages

Regards

Pavel





 Maybe we should adopt the famous compromise of '0.5'?  :-)


 Cheers,
 Gavin






[HACKERS] track_activities is mostly broken

2013-04-03 Thread Tom Lane
The point of turning off pgstat_track_activities is, IMO, to eliminate
the overhead of updating one's PgBackendStatus entry in main memory.
Commit 4f42b546 broke this, however, because it confused getting into
the reporting disabled state with what should happen once already
in the state.  If you try turning off track_activities in a backend,
you'll find that you can't see its current query any more, but that's
not because it's not reporting it --- it's only because
pg_stat_get_activity() is hiding it from you.  The overhead is still
being paid, as you can confirm by noting that query_start and
state_change continue to update whenever the backend does something.

Will fix.

regards, tom lane




Re: [HACKERS] CREATE EXTENSION BLOCKS

2013-04-03 Thread David E. Wheeler
On Apr 3, 2013, at 2:37 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote:

 I'd expect one of the CREATE EXTENSION commands to succeed
 and the others to block until the transaction is committed,
 then to fail with 'extension oracle_fdw already exists'.
 
 If that is what happens, it's what I'd expect since
 extension names are unique (see the unique constraint on
 pg_extension).

Oh, they are not unique per schema? I guess they are global to the database,
and their objects are in the specified schema, then.

Thanks,

David





Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Gavin Flower

On 04/04/13 04:58, Pavel Stehule wrote:




2013/4/3 Gavin Flower gavinflo...@archidevsys.co.nz


On 04/04/13 03:02, Florian Pflug wrote:

On Apr3, 2013, at 15:30 , Andrew Dunstan and...@dunslane.net wrote:

On 04/02/2013 02:46 PM, Florian Pflug wrote:

If we're going to break compatibility, we should IMHO
get rid of
non-zero lower bounds all together. My guess is that
the number of
affected users wouldn't be much higher than for the
proposed patch,
and it'd allow lossless mapping to most language's
native array types...

That would actually break a HUGE number of users, since
the default lower
bound is 1. I have seen any number of pieces if code that
rely on that.

Uh, yeah, we should make it 1 then, not 0, then. As long as
the bound
is fixed, conversion to native C/Java/Ruby/Python/... arrays
would still
be lossless.

best regards,
Florian Pflug


Zero as the default lower bound is consistent with most languages
(especially the common ones like C, C++, Java,  Python), in fact
I don't remember any language where that is not the case (ignoring
SQL) - and I've written programs in about 20 languages.


pascal, ADA, and ALGOL like languages

Regards

Pavel

ALGOL 60 was zero-based by default, as I remember deliberately setting 
the lower bound to 1; I managed to avoid PASCAL and I only glanced at ADA.



Cheers,
Gavin


Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Pavel Stehule
2013/4/3 Gavin Flower gavinflo...@archidevsys.co.nz

  On 04/04/13 04:58, Pavel Stehule wrote:




 2013/4/3 Gavin Flower gavinflo...@archidevsys.co.nz

 On 04/04/13 03:02, Florian Pflug wrote:

 On Apr3, 2013, at 15:30 , Andrew Dunstan and...@dunslane.net wrote:

 On 04/02/2013 02:46 PM, Florian Pflug wrote:

 If we're going to break compatibility, we should IMHO get rid of
 non-zero lower bounds all together. My guess is that the number of
 affected users wouldn't be much higher than for the proposed patch,
 and it'd allow lossless mapping to most language's native array types…

 That would actually break a HUGE number of users, since the default
 lower
 bound is 1. I have seen any number of pieces if code that rely on that.

 Uh, yeah, we should make it 1 then, not 0, then. As long as the bound
 is fixed, conversion to native C/Java/Ruby/Python/... arrays would still
 be lossless.

 best regards,
 Florian Pflug


  Zero as the default lower bound is consistent with most languages
 (especially the common ones like C, C++, Java,  Python), in fact I don't
 remember any language where that is not the case (ignoring SQL) - and I've
 written programs in about 20 languages.


  pascal, ADA, and ALGOL like languages

  Regards

  Pavel

ALOGOL 60 was zero based by default, as I remember deliberately
 setting the lower bound to 1, I managed to avoid PASCAL and I only glanced
 at ADA.


http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28array%29

Regards

Pavel




 Cheers,
 Gavin



Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Pavel Stehule

  Pavel

ALOGOL 60 was zero based by default, as I remember deliberately
 setting the lower bound to 1, I managed to avoid PASCAL and I only glanced
 at ADA.



 http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28array%29


In Pascal and similar languages (the Wirth family) it is possible to define
the lower bound as any value (like PL/pgSQL). It depends on the school - but
a lot of people used 1.

I remember it was one argument in the Pascal/C flame wars.

Regards

Pavel



 Regards



Pavel




 Cheers,
 Gavin





Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Tom Lane
 Zero as the default lower bound is consistent with most languages
 (especially the common ones like C, C++, Java,  Python), in fact
 I don't remember any language where that is not the case (ignoring
 SQL) - and I've written programs in about 20 languages.

Fortran ... Basic ... actually I'd have thought that zero was a
minority position.  Fashions change I guess.

regards, tom lane




Re: [HACKERS] Drastic performance loss in assert-enabled build in HEAD

2013-04-03 Thread Robert Haas
On Wed, Apr 3, 2013 at 10:59 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 In fact, I'm going to go further and say that I do not like the entire
 concept of scannability, either as to design or implementation, and
 I think we should just plain rip it out.

This has been my feeling from the beginning, so I'm happy to support
this position.  I think the current version - where scan-ability is
tracked in just one way - is an improvement over the previous version
of the patch - where it was tracked in two different ways with a
confusing shuffle of information from one place to the other.  But my
favorite number of places to track it would be zero.

...Robert




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread David E. Wheeler
On Apr 3, 2013, at 9:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Fortran ... Basic ... actually I'd have thought that zero was a
 minority position.  Fashions change I guess.

I say we turn the default lower bound up to 11.

David






Re: [HACKERS] [sepgsql 2/3] Add db_schema:search permission checks

2013-04-03 Thread Robert Haas
On Tue, Apr 2, 2013 at 2:22 PM, Kohei KaiGai kai...@kaigai.gr.jp wrote:
 OK, please check the attached ones.

Thanks.  I reviewed the schema-search patch and I think it looks
reasonable, but shouldn't we be calling the event OAT_NAMESPACE_SEARCH
rather than OAT_SCHEMA_SEARCH?  And, similarly,
ObjectAccessNamespaceSearch and InvokeNamespaceSearchHook?  I think
this terminology could be confusing to backend hackers who are used to
seeing the term namespace internally when referring to schemas.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Andrew Dunstan


On 04/03/2013 11:34 AM, Gavin Flower wrote:


Zero as the default lower bound is consistent with most languages 
(especially the common ones like C, C++, Java,  Python), in fact I 
don't remember any language where that is not the case (ignoring SQL) 
- and I've written programs in about 20 languages.


Pascal and Ada are obvious examples.

I'm quite sure there are others, but don't have time to do the research.

cheers

andrew





Re: [HACKERS] regression test failed when enabling checksum

2013-04-03 Thread Jeff Janes
On Wed, Apr 3, 2013 at 2:31 AM, Andres Freund and...@2ndquadrant.comwrote:



 I just checked and unfortunately your dump doesn't contain all that much
 valid WAL:
 ...



 So just two checkpoint records.

 Unfortunately I  fear that won't be enough to diagnose the problem,
 could you reproduce it with a higher wal_keep_segments?


I've been trying, but see the message "commit dfda6ebaec67 versus
wal_keep_segments".


Looking at some of the log files more, I see that vacuum is involved, but
in some way I don't understand.  The crash always happens on a test
cycle immediately after the sleep that allows the autovac to kick in and
finish.  So the sequence of events goes something like this:

...
run the frantic updating of foo until crash
recovery
query foo and verify the results are consistent with expectations
sleep to allow autovac to do its job.
truncate foo and repopulate it.
run the frantic updating of foo until crash
recovery
attempt to query foo but get the checksum failure.

What the vacuum is doing that corrupts the system in a way that survives
the truncate is a mystery to me.

Also, at one point I had the harness itself exit as soon as it detected the
problem, but I failed to have it shut down the server.  So the server kept
running idle and having autovac do its thing, which produced some
interesting log output:

WARNING:  relation foo page 45 is uninitialized --- fixing
WARNING:  relation foo page 46 is uninitialized --- fixing
...
WARNING:  relation foo page 72 is uninitialized --- fixing
WARNING:  relation foo page 73 is uninitialized --- fixing
WARNING:  page verification failed, calculated checksum 54570 but expected
34212
ERROR:  invalid page in block 74 of relation base/16384/4931589

This happened 3 times.  Every time, the warnings started on page 45, and
they continued up until the invalid page was found (which varied, being 74,
86, and 74 again).

I wonder if the bug is in checksums, or if the checksums are doing their
job by finding some other bug.  And why did those uninitialized pages
trigger warnings when they were autovacced, but not when they were seq
scanned in a query?

Cheers,

Jeff


Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Robert Haas
On Wed, Apr 3, 2013 at 12:36 PM, David E. Wheeler da...@kineticode.com wrote:
 On Apr 3, 2013, at 9:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fortran ... Basic ... actually I'd have thought that zero was a
 minority position.  Fashions change I guess.

 I say we turn the default lower bound up to 11.

+1.  I think that's clearly a compromise position that will make
everyone equally happy.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] regression test failed when enabling checksum

2013-04-03 Thread Andres Freund
On 2013-04-03 09:48:54 -0700, Jeff Janes wrote:
 On Wed, Apr 3, 2013 at 2:31 AM, Andres Freund and...@2ndquadrant.com wrote:
 
 
 
  I just checked and unfortunately your dump doesn't contain all that much
  valid WAL:
  ...
 
 
 
  So just two checkpoint records.
 
  Unfortunately I  fear that won't be enough to diagnose the problem,
  could you reproduce it with a higher wal_keep_segments?
 
 
 I've been trying, but see the message "commit dfda6ebaec67 versus
 wal_keep_segments".

Setting up an archive_command could help if it's affordable from the
space perspective :/

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Regex with 32k different chars causes a backend crash

2013-04-03 Thread Heikki Linnakangas

On 03.04.2013 18:41, Tom Lane wrote:

Heikki Linnakangas hlinnakan...@vmware.com writes:

On 03.04.2013 18:21, Tom Lane wrote:

Obviously Henry didn't think that far ahead.  I agree that throwing
an error is the best solution, and that widening color is probably
not what we want to do.  You want to fix that, or shall I?



I can do it. I assume that Tcl has the same bug, so I'll submit a report
there, too.


Yes, definitely.

It occurs to me that at some point it might be useful to convert color
to unsigned short, so that you could have 64K of them --- but we'd still
need the error check anyway, and there's no reason to tackle such a
change today.


I was just thinking the same. In practice, expanding it to 64k doesn't 
get you much farther. There is this in newdfa():


	d->incarea = (struct arcp *) MALLOC(nss * cnfa->ncolors * 
sizeof(struct arcp));


That's (number of states) * (number of colors) * (constant). The test 
case I posted earlier would require about 40 GB of RAM for that 
allocation alone, and fails with an "out of memory" error. Maybe it 
would be possible to construct a regexp that has a lot of colors but few 
states, but that's an even more marginal use case.



Attached is a patch to add the overflow check. I used the error message 
"too many distinct characters in regex". That's not totally accurate, 
because there isn't a limit on distinct characters per se, but on the 
number of colors. Conceivably, you could have a regexp with more than 
32k different characters, but where most of them are mapped to the same 
color. In practice, it's not helpful to the user to say "too many 
colors"; he will have no clue what a color is.


PS. I was mistaken when I said that this causes an assertion failure; it 
segfaults even with assertions enabled.


- Heikki
diff --git a/src/backend/regex/regc_color.c b/src/backend/regex/regc_color.c
index 1c60566..e6aa899 100644
--- a/src/backend/regex/regc_color.c
+++ b/src/backend/regex/regc_color.c
@@ -247,7 +247,15 @@ newcolor(struct colormap * cm)
 		/* oops, must allocate more */
 		struct colordesc *newCd;
 
+		if (cm->max == MAX_COLOR)
+		{
+			CERR(REG_ECOLORS);
+			return COLORLESS;	/* too many colors */
+		}
+
 		n = cm->ncds * 2;
+		if (n > MAX_COLOR + 1)
+			n = MAX_COLOR + 1;
 		if (cm->cd == cm->cdspace)
 		{
 			newCd = (struct colordesc *) MALLOC(n * sizeof(struct colordesc));
diff --git a/src/include/regex/regerrs.h b/src/include/regex/regerrs.h
index a761371..6516d7e 100644
--- a/src/include/regex/regerrs.h
+++ b/src/include/regex/regerrs.h
@@ -77,3 +77,7 @@
 {
 	REG_ETOOBIG, "REG_ETOOBIG", "nfa has too many states"
 },
+
+{
+	REG_ECOLORS, "REG_ECOLORS", "too many distinct characters in regex"
+},
diff --git a/src/include/regex/regex.h b/src/include/regex/regex.h
index 616c2c6..a7fb14f 100644
--- a/src/include/regex/regex.h
+++ b/src/include/regex/regex.h
@@ -153,6 +153,7 @@ typedef struct
 #define REG_MIXED	17			/* character widths of regex and string differ */
 #define REG_BADOPT	18			/* invalid embedded option */
 #define REG_ETOOBIG 19			/* nfa has too many states */
+#define REG_ECOLORS 20			/* too many distinct characters in regex */
 /* two specials for debugging and testing */
 #define REG_ATOI	101			/* convert error-code name to number */
 #define REG_ITOA	102			/* convert error-code number to name */
diff --git a/src/include/regex/regguts.h b/src/include/regex/regguts.h
index e1e406f..fc7f5ae 100644
--- a/src/include/regex/regguts.h
+++ b/src/include/regex/regguts.h
@@ -148,6 +148,7 @@
 typedef short color;			/* colors of characters */
 typedef int pcolor;			/* what color promotes to */
 
+#define MAX_COLOR	32767		/* max value that fits in 'color' datatype */
 #define COLORLESS	(-1)		/* impossible color */
 #define WHITE		0			/* default color, parent of all others */
 



Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-04-03 Thread Robert Haas
On Tue, Apr 2, 2013 at 12:19 PM, Peter Eisentraut pete...@gmx.net wrote:
 It's weird that SET LOCAL and SET SESSION actually *set* the value, and
 the second key word determines how long the setting will last.  SET
 PERSISTENT doesn't actually set the value.  I predict that this will be
 a new favorite help-it-doesn't-work FAQ.

I think this is another argument against this particular syntax.  I
have always thought that something along the lines of ALTER SYSTEM
would be more appropriate.  ALTER DATABASE .. SET and ALTER ROLE ..
SET don't change the value immediately either, and nobody gets
confused about that to my knowledge.  But I can see where SET
PERSISTENT could cause that sort of confusion.

...Robert




Re: [HACKERS] c language functions

2013-04-03 Thread Robert Haas
On Wed, Apr 3, 2013 at 11:26 AM, Rodrigo Barboza
rodrigombu...@gmail.com wrote:
 Hello.
 I'm trying to create a generic add function.
 I have defined a type my_uint and it needs a '+' operator.
 This operator should work like normal int + int operation.
 The function is defined expecting arguments (my_uint, anyelement).

 I'm confused in retrieving the anyelement type and value, and then doing
 the add operation and returning the correct value and type.
 I tried to use PG_GETARG_DATUM, but I don't know how to extract the value
 from it (should it be a uint32, uint64, float or double).

 Any tips?

Well, the information the function ends up receiving is going to
depend on how the function is declared at an SQL level.  So if you are
defining the function to take an argument of anyelement (which seems
unlikely to be a useful thing to do, but let's suppose you do it
anyway) then look at the C code for some other function that takes an
anyelement argument and look at how that function unpacks it.

Similarly, if you declare the function to take an int4 argument, then go
look at the C code for a function that takes an int4 argument and see what
it does to unpack it.
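
For concreteness, here is a minimal sketch of the anyelement approach
(hypothetical: it assumes a my_uint that is passed by value as a uint32,
and only accepts int2/int4 on the right). The type actually bound to the
polymorphic argument can be recovered at run time with
get_fn_expr_argtype():

#include "postgres.h"
#include "fmgr.h"
#include "catalog/pg_type.h"
#include "utils/builtins.h"

PG_FUNCTION_INFO_V1(my_uint_add_any);

Datum
my_uint_add_any(PG_FUNCTION_ARGS)
{
	uint32		left = PG_GETARG_UINT32(0);
	Oid			argtype = get_fn_expr_argtype(fcinfo->flinfo, 1);

	switch (argtype)
	{
		case INT2OID:
			PG_RETURN_UINT32(left + (uint32) PG_GETARG_INT16(1));
		case INT4OID:
			PG_RETURN_UINT32(left + (uint32) PG_GETARG_INT32(1));
		default:
			/* hypothetical behavior: reject anything we don't handle */
			ereport(ERROR,
					(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
					 errmsg("cannot add my_uint to type %s",
							format_type_be(argtype))));
	}
	PG_RETURN_NULL();			/* not reached; keeps the compiler quiet */
}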

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Drastic performance loss in assert-enabled build in HEAD

2013-04-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Apr 3, 2013 at 10:59 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 In fact, I'm going to go further and say that I do not like the entire
 concept of scannability, either as to design or implementation, and
 I think we should just plain rip it out.

 This has been my feeling from the beginning, so I'm happy to support
 this position.  I think the current version - where scan-ability is
 tracked in just one way - is an improvement over the previous version
 of the patch - where it was tracked in two different ways with a
 confusing shuffle of information from one place to the other.  But my
 favorite number of places to track it would be zero.

To be clear, I think we'll end up tracking some notion of scannability
eventually.  I just don't think the current notion is sufficiently baked
to want to promise to be upward-compatible with it in future.

regards, tom lane




Re: [HACKERS] c language functions

2013-04-03 Thread Rodrigo Barboza
Why not useful?
If I don't make it receive anyelement, I will have to create an add
function for each type.
Correct me if I'm wrong.


On Wed, Apr 3, 2013 at 2:27 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Apr 3, 2013 at 11:26 AM, Rodrigo Barboza
 rodrigombu...@gmail.com wrote:
  Hello.
  I'm trying to create a generic add function.
  I have defined a type my_uint and it needs a '+' operator.
  This operator should work like normal int + int operation.
  The function is defined expecting arguments (my_uint, anyelement).
 
  I'm confused in retrieving the anyelement type and value, and then doing
  the add operation and returning the correct value and type.
  I tried to use PG_GETARG_DATUM, but I don't know how to extract the value
  from it (should it be a uint32, uint64, float or double).
 
  Any tips?

 Well, the information the function ends up receiving is going to
 depend on how the function is declared at an SQL level.  So if you are
 defining the function to take an argument of anyelement (which seems
 unlikely to be a useful thing to do, but let's suppose you do it
 anyway) then look at the C code for some other function that takes an
 anyelement argument and look at how that function unpacks it.

 Similarly, if you declare the function to take int4 argument, then go
 look at the C code function that takes an int4 argument and see what
 it does to unpack it.

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company



Re: [HACKERS] Regex with 32k different chars causes a backend crash

2013-04-03 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 Attached is a patch to add the overflow check. I used the error message 
 "too many distinct characters in regex". That's not totally accurate, 
 because there isn't a limit on distinct characters per se, but on the 
 number of colors. Conceivably, you could have a regexp with more than 
 32k different characters, but where most of them are mapped to the same 
 color. In practice, it's not helpful to the user to say "too many 
 colors"; he will have no clue what a color is.

Patch looks good except perhaps for wordsmithing the message text.

One thought is that we don't need to identify this as a regex error
because the PG code will report it with "invalid regular expression: %s".

I think there's a good argument for saying "too many character colors"
and relying on the "invalid regular expression" context to clue in the
clueless.  After all, most of them don't know what an NFA is either, but
no one has complained about the REG_ETOOBIG message.  I think if you get
to the point where you're triggering this error, you probably know
something about regexes, or even if you don't, the phrase "too many" will
give you a fair idea what's wrong.  There is much to be said for
specifically identifying the implementation limit that's been hit, even
if most users don't know what it is.  So I'd just as soon not fall back
on something imprecise.

regards, tom lane




Re: [HACKERS] commit dfda6ebaec67 versus wal_keep_segments

2013-04-03 Thread Heikki Linnakangas

On 03.04.2013 18:58, Jeff Janes wrote:

On Tue, Apr 2, 2013 at 10:08 PM, Jeff Janes jeff.ja...@gmail.com wrote:


This commit introduced a problem with wal_keep_segments:

commit dfda6ebaec6763090fb78b458a979b558c50b39b


The problem seems to be that the underflow warned about is happening,
because the check to guard it was checking the wrong thing.  However, I
don't really understand KeepLogSeg.  It seems like segno, and hence recptr,
don't actually serve any purpose.


Hmm, the check is actually correct, but the assignment in the 
else-branch isn't. The idea of KeepLogSeg is to calculate recptr - 
wal_keep_segments, and assign that to *logSegNo. But only if *logSegNo 
is not already smaller than the calculated value. Does the attached look 
correct to you?



At some point when it is over-pruning and recycling, it recycles the log
files that are still needed for recovery, and if the database crashes at
that point it will not recover because it can't find either the primary
or secondary checkpoint records.


So, KeepLogSeg incorrectly sets *logSegNo to 0, and CreateCheckPoint 
decrements it, causing it to underflow to 2^64-1. Now RemoveOldXlogFiles 
feels free to remove every WAL segment.


- Heikki
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 3227d4c..2f79af6 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7518,9 +7518,9 @@ CreateRestartPoint(int flags)
 }
 
 /*
- * Calculate the last segment that we need to retain because of
- * wal_keep_segments, by subtracting wal_keep_segments from
- * the given xlog location, recptr.
+ * Retreat *logSegNo to the last segment that we need to retain because of
+ * wal_keep_segments. This is calculated by subtracting wal_keep_segments
+ * from the given xlog location, recptr.
  */
 static void
 KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
@@ -7536,7 +7536,7 @@ KeepLogSeg(XLogRecPtr recptr, XLogSegNo *logSegNo)
 	if (segno <= wal_keep_segments)
 		segno = 1;
 	else
-		segno = *logSegNo - wal_keep_segments;
+		segno = segno - wal_keep_segments;
 
 	/* don't delete WAL segments newer than the calculated segment */
 	if (segno < *logSegNo)
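
To make the arithmetic concrete, here is a standalone sketch of the
corrected rule (not part of the patch; wal_keep_segments is hard-coded to
10 for the example):

#include <stdint.h>
#include <stdio.h>

typedef uint64_t XLogSegNo;

static void
keep_log_seg(XLogSegNo recptr_segno, XLogSegNo *logSegNo)
{
	const XLogSegNo wal_keep_segments = 10;
	XLogSegNo	segno;

	if (recptr_segno <= wal_keep_segments)
		segno = 1;					/* avoid underflowing to 2^64-1 */
	else
		segno = recptr_segno - wal_keep_segments;

	/* only ever retreat *logSegNo, never advance it */
	if (segno < *logSegNo)
		*logSegNo = segno;
}

int
main(void)
{
	XLogSegNo	keep = 85;

	keep_log_seg(90, &keep);		/* 90 - 10 = 80, which is < 85 */
	printf("%llu\n", (unsigned long long) keep);	/* prints 80 */
	return 0;
}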



Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Dean Rasheed
On 3 April 2013 15:10, Tom Lane t...@sss.pgh.pa.us wrote:
 I think though that the upthread argument that we'd have multiple
 interpretations of the same thing is bogus.  To me, the core idea that's
 being suggested here is that '{}' should mean a zero-length 1-D array,
 not a zero-D array as formerly.  We would still need a way to represent
 zero-D arrays, if only because they'd still exist on-disk in existing
 databases (assuming we're not willing to break pg_upgrade for this).
 I suggest that that ought *not* involve any braces.  Perhaps '[]=' would
 be a suitable representation.

Then restoring an old SQL dump would silently convert all old zero-D
arrays to zero-length 1-D arrays. Also, if you failed to completely
rewrite all your application code, and accidentally missed a few
functions, you could very easily end up in a situation where your
tables contained a mixture of zero-D arrays and zero-length 1-D
arrays that wouldn't compare equally, leading to all sorts of nasty
bugs.

 In the other direction, ISTM that
 '{{},{},{}}' is a zero-by-three array,

Isn't that a three-by-zero array rather than a zero-by-three array?
Either way, what is the use-case for it? I certainly can't remember
ever using a 0x3 or 3x0 matrix.

 entirely distinct from '{}' or
 '{{}}' in dimensionality if not content.

And '{{}}' is a one-by-zero array. There is currently no proposed
syntax for representing a zero-by-zero array, or a zero-by-one array.

All I see here is endless opportunity for confusion and code breakage,
with very little real benefit.

I actually don't think the current behaviour is broken. I find it
perfectly logical and consistent to have a single dimensionless
concept of an empty array. It has no elements, so questions like "are
they arranged in 1-D or 2-D" and "what index do they start at" are
meaningless, and so should return NULL. The only counter-intuitive
thing is that array_length() returns NULL instead of 0 for the empty
array, and it doesn't take long to get used to that behaviour.

Much of this seems to have grown out of a desire to fix the confusing
behaviour of array_length(), but seems to have ended up in a much more
confusing place.

Given that array_length() takes 2 arguments, it is somewhat forgivable
for it to return NULL when you ask for the array's length in a
dimension that it doesn't have. Perhaps what might make the API easier
to work with is a new single-argument function --- e.g., array_size()
that would return the number of elements in the array --- 0 for '{}',
3 for '{1,2,3}', 6 for '{{1,2},{3,4},{5,6}}' and so on.
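
(Purely for illustration, a bare-bones C sketch of such a function,
reusing the backend's existing ArrayGetNItems() helper; the SQL-level
declaration would take anyarray and return integer. No such function
exists today.)

#include "postgres.h"
#include "fmgr.h"
#include "utils/array.h"

PG_FUNCTION_INFO_V1(array_size);

Datum
array_size(PG_FUNCTION_ARGS)
{
	ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);

	/* ArrayGetNItems multiplies out the dimensions, and returns 0 for
	 * a zero-D array, so '{}' would yield 0 as proposed above */
	PG_RETURN_INT32(ArrayGetNItems(ARR_NDIM(v), ARR_DIMS(v)));
}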

Regards,
Dean




Re: [HACKERS] regression test failed when enabling checksum

2013-04-03 Thread Jeff Davis
On Wed, 2013-04-03 at 09:48 -0700, Jeff Janes wrote:

  And why did those uninitialized pages trigger warnings when they were
 autovacced, but not when they were seq scanned in a query?
 
A scan won't trigger that warning. Empty pages are sometimes the
expected result of a crash when the file is extended but the page is not
written yet. So empty pages aren't necessarily an indication of
corruption (though I'd like to fix that eventually, because sometimes
zeroing is corruption).

Regards,
Jeff Davis






Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:

 Fortran ... Basic ... actually I'd have thought that zero was a
 minority position.  Fashions change I guess.

When I started programming the top four languages for business
programming were COBOL, BASIC, RPG II, and assembly language.
(Pascal and C came later, and I never saw much use of Fortran by
anyone other than mathematicians.) Except for assembly language,
the subscripts for arrays either started with 1 always, or that was
the default.  Given when it was first developed, it's not too
surprising that the SQL standard adopted 1 as the first element of
an array.

Which is more natural depends on whether you think of the subscript
in terms of ordinal positions or offsets from the base address.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] c language functions

2013-04-03 Thread Tom Lane
Rodrigo Barboza rodrigombu...@gmail.com writes:
 Why not useful?
 If I don't make it receive anyelement, I will have to create an add
 function for each type.

If you make it anyelement, then you're contracting to be able to add
any datatype whatsoever to a my_uint.  This is nonsensical.

You'd be better off declaring several specific addition functions,
one for each other type.  This will be an order of magnitude easier
to write, and probably run an order of magnitude faster too, because
just checking to see what type you got would already be significantly
more expensive than adding a couple of integers ought to be.

Look at the built-in types and functions for precedent.  There are
indeed separate functions for int2 + int2, int2 + int4, int4 + int2,
int4 + int4, etc etc.  If we were starting from scratch, we might reduce
that to just int4 + int4 and rely on the implicit coercion from int2 to
int4 to handle the other cases; but there's no way we'd put in run-time
type determination.
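
For illustration, one such dedicated function might look like this (a
sketch only: my_uint is again assumed to be a by-value uint32, and the
range check merely stands in for whatever the type's real semantics
require):

#include "postgres.h"
#include "fmgr.h"

PG_FUNCTION_INFO_V1(my_uint_add_int4);

Datum
my_uint_add_int4(PG_FUNCTION_ARGS)
{
	uint32		a = PG_GETARG_UINT32(0);
	int32		b = PG_GETARG_INT32(1);

	/* no run-time type dispatch: the parser has already matched the
	 * arguments against this function's declared signature */
	int64		sum = (int64) a + (int64) b;

	if (sum < 0 || sum > (int64) 0xFFFFFFFF)
		ereport(ERROR,
				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
				 errmsg("my_uint out of range")));

	PG_RETURN_UINT32((uint32) sum);
}

Each such function would get its own CREATE FUNCTION and CREATE OPERATOR
declaration, so the parser picks the right variant at parse time.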

regards, tom lane




Re: [HACKERS] CREATE EXTENSION BLOCKS

2013-04-03 Thread Alvaro Herrera
David E. Wheeler wrote:
 On Apr 3, 2013, at 2:37 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote:
 
  I'd expect one of the CREATE EXTENSION commands to succeed
  and the others to block until the transaction is committed,
  then to fail with 'extension oracle_fdw already exists'.
  
  If that is what happens, it's what I'd expect since
  extension names are unique (see the unique constraint on
  pg_extension).
 
 Oh, they are not unique per-schema? I guess they are global to the database 
 but then their objects are in the specified schema, then.

Right -- an extension is not considered to live within a schema; they
are database-global.  The objects might live in a particular schema (if
the extension is relocatable), and there's support to move those to a
different schema, but this doesn't affect the extension itself.

-- 
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Gavin Flower

On 04/04/13 05:21, Pavel Stehule wrote:




Pavel


ALGOL 60 was zero based by default; I remember deliberately
setting the lower bound to 1. I managed to avoid PASCAL
and I only glanced at ADA.


http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28array%29


In Pascal and similar languages (the Wirth family) it is possible to 
define the lower bound as any value (like PL/pgSQL). It depends on the 
school, but a lot of people used 1.


I remember it was one argument in the Pascal/C flame wars.

Regards

Pavel


[...]

At the time (just over 40 years ago!) I was adamant that arrays should 
start with an index of one; now I much prefer zero.  Probably I prefer 
zero partly because now I understand what is happening at the machine 
code level, and partly because zero is the default for the main 
languages I use.


Wasting time on Google (I have 'more' important things I 'should' be 
doing!), I find ALGOL 60 did not appear to have a default value for the 
lower index - not only that, but one could make it negative! see: 
http://www.csci.csusb.edu/dick/samples/algol60.syntax.html


Anyhow, I think we should standardise on zero as the initial index to be 
as consistent as practicable.  However, not with a religious zeal at the 
expense of practical considerations!



Cheers,
Gavin


Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Andres Freund
On 2013-04-04 07:42:36 +1300, Gavin Flower wrote:
 On 04/04/13 05:21, Pavel Stehule wrote:
 
 
 Pavel
 
 ALGOL 60 was zero based by default; I remember deliberately
 setting the lower bound to 1. I managed to avoid PASCAL
 and I only glanced at ADA.
 
 
 
  http://en.wikipedia.org/wiki/Comparison_of_programming_languages_%28array%29
 
 
 In Pascal and similar languages (the Wirth family) it is possible to define
 the lower bound as any value (like PL/pgSQL). It depends on the school, but
 a lot of people used 1.
 
 I remember it was one argument in the Pascal/C flame wars.
 
 Regards
 
 Pavel
 
 [...]
 
 At the time (just over 40 years ago!) I was adamant that arrays should start
 with an index of one, now I much prefer zero.  Probably I prefer zero in
 part, because now I understand what is happening at the machine code level,
 and partly because zero is the default for the main languages I use.
 
 Wasting time on Google (I have 'more' important things I 'should' be
 doing!), I find ALGOL 60 did not appear to have a default value for the
 lower index - not only that, but one could make it negative! see:
 http://www.csci.csusb.edu/dick/samples/algol60.syntax.html
 
 Anyhow, I think we should standardise on zero as the initial index to be as
 consistent as practicable.  However, not with a religious zeal at the
 expense of practical considerations!

Changing this now, rather than on a green field, strikes me as a pretty
absurd exercise in frustrating users by breaking queries subtly. Since 1
is still a valid index, the breakage won't always be visible.

-1

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Gavin Flower

On 04/04/13 05:30, Tom Lane wrote:

Zero as the default lower bound is consistent with most languages
(especially the common ones like C, C++, Java, & Python), in fact
I don't remember any language where that is not the case (ignoring
SQL) - and I've written programs in about 20 languages.

Fortran ... Basic ... actually I'd have thought that zero was a
minority position.  Fashions change I guess.

regards, tom lane
I had forgotten the indexing in BASIC & FORTRAN; I now recall COBOL's 
TABLEs were equivalent to arrays, and they started at one.



Cheers,
Gavin


Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-04-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Apr 2, 2013 at 12:19 PM, Peter Eisentraut pete...@gmx.net wrote:
 It's weird that SET LOCAL and SET SESSION actually *set* the value, and
 the second key word determines how long the setting will last.  SET
 PERSISTENT doesn't actually set the value.  I predict that this will be
 a new favorite help-it-doesn't-work FAQ.

 I think this is another argument against this particular syntax.  I
 have always thought that something along the lines of ALTER SYSTEM
 would be more appropriate.  ALTER DATABASE .. SET and ALTER ROLE ..
 SET don't change the value immediately either, and nobody gets
 confused about that to my knowledge.  But I can see where SET
 PERSISTENT could cause that sort of confusion.

Yeah, I think I argued for using the SET syntax to start with, but
I'm coming around to the position that SET PERSISTENT is too much
unlike the behavior of other varieties of SET.  ALTER is sounding
more attractive to me now.  Not sure about ALTER SYSTEM in particular
though --- it's not clear that that has any real merit other than
already existing as a keyword.  (Not that that's negligible.)
ALTER CONFIGURATION is another alternative using an existing keyword
that might be worth considering.

regards, tom lane




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Kevin Grittner
Gavin Flower gavinflo...@archidevsys.co.nz wrote:

 Anyhow, I think we should standardise on zero as the initial
 index to be as consistent as practicable.

If you want to suggest a default of zero for the first subscript of
an array in SQL, please don't confuse the issue by using any form
of the word standard in that proposal.  There are ANSI and ISO
standards for SQL, and they require that the first element of an
array is one.  I'm OK with *extending* the standard by *allowing*
other values, but let's not flout the standard and break existing
code by changing the *default*.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Gavin Flower

On 04/04/13 05:36, David E. Wheeler wrote:

On Apr 3, 2013, at 9:30 AM, Tom Lane t...@sss.pgh.pa.us wrote:


Fortran ... Basic ... actually I'd have thought that zero was a
minority position.  Fashions change I guess.

I say we turn the default lower bound up to 11.

David


In keeping with the level of irrationality in this thread, maybe we 
should set it to an irrational number like the square root of 2, or 
transcend ourselves and make it a transcendental number like pi!  :-)


I suppose using the square root of minus one would be considered too 
imaginative???  :-)



Cheers,
Gavin


Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Gavin Flower

On 04/04/13 07:58, Kevin Grittner wrote:

Gavin Flower gavinflo...@archidevsys.co.nz wrote:


Anyhow, I think we should standardise on zero as the initial
index to be as consistent as practicable.

If you want to suggest a default of zero for the first subscript of
an array in SQL, please don't confuse the issue by using any form
of the word standard in that proposal.  There are ANSI and ISO
standards for SQL, and they require that the first element of an
array is one.  I'm OK with *extending* the standard by *allowing*
other values, but let's not flout the standard and break existing
code by changing the *default*.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
You omitted my rider 'However, not with a religious zeal at the expense 
of practical considerations!', which was meant to cover concerns like yours.



Cheers,
Gavin




Re: [HACKERS] regression test failed when enabling checksum

2013-04-03 Thread Jeff Davis
On Mon, 2013-04-01 at 19:51 -0700, Jeff Janes wrote:

 I've reproduced the problem, this time in block 74 of relation
 base/16384/4931589, and a tarball of the data directory is here:
 
 
 https://docs.google.com/file/d/0Bzqrh1SO9FcELS1majlFcTZsR0k/edit?usp=sharing
 
 
 
 (the table is in database jjanes under role jjanes, the binary is
 commit 9ad27c215362df436f8c)
 
 
 What I would probably really want is the data as it existed after the
 crash but before recovery started, but since the postmaster
 immediately starts recovery after the crash, I don't know of a good
 way to capture this.

Can you just turn off the restart_after_crash GUC? I had a chance to
look at this, and seeing the block before and after recovery would be
nice. I didn't see a log file in the data directory, but it didn't go
through recovery, so I assume it already did that.

The block is corrupt as far as I can tell. The first third is written,
and the remainder is all zeros. The header looks like this:

(These numbers are mostly from pageinspect. The checksum value that
comes from pageinspect needed to be cast back to an unsigned short to
match the error message above -- should that be changed in
pageinspect?).

lsn: 7/252E4080
checksum: 34212
flags: 1
lower: 1188
upper: 5952
special: 8192
pagesize: 8192
version: 4
prune_xid: 156833911

So the header looks good, but most of the page data is missing. I tried
with pg_filedump (the 9.2.0 version, which should be fine), and it
agrees with me that the page is corrupt.

Interestingly, that doesn't result in a user-visible error when
ignore_checksum_failure=on. That's because the item pointers appear to
all be either not normal, or they point to the zeroed region. Testing the
visibility of a zeroed tuple header always returns false, so no problem.

You'd still think this would cause incorrect results, but I think what's
happening is that this is a new page (otherwise it would have been
written with something other than zeroes before). So, the tuples that
are supposed to be there may be uncommitted anyway.

So, the page may be corrupt without checksums as well, but it just
happens to be hidden for the same reason. Can you try to reproduce it
without -k? And on the checkin right before checksums were added?
Without checksums, you'll need to use pg_filedump (or similar) to find
whether an error has happened.

To start speculating about the root cause: something is violating the
"WAL before data" rule, or not writing a FPI when it's supposed to, or not
properly restoring the FPI during recovery, or something sets the wrong
LSN. This could still be caused by the checksums patch, but it seems a
little less likely now. The reason I say that is because it's a new page
with tuples on it, so that means something in the insert/update path
ended up not writing the FPI before writing the page.

Regards,
Jeff Davis





Re: [HACKERS] c language functions

2013-04-03 Thread Rodrigo Barboza
Well, I was checking inside my function the type of the second argument and
switching between the allowed types.
This way kind of does the same thing as many functions, doesn't it?


On Wed, Apr 3, 2013 at 3:39 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Rodrigo Barboza rodrigombu...@gmail.com writes:
  Why not useful?
  If I don't make it receive anyelement, I will have to create an add
  function for each type.

 If you make it anyelement, then you're contracting to be able to add
 any datatype whatsoever to a my_uint.  This is nonsensical.

 You'd be better off declaring several specific addition functions,
 one for each other type.  This will be an order of magnitude easier
 to write, and probably run an order of magnitude faster too, because
 just checking to see what type you got would already be significantly
 more expensive than adding a couple of integers ought to be.

 Look at the built-in types and functions for precedent.  There are
 indeed separate functions for int2 + int2, int2 + int4, int4 + int2,
 int4 + int4, etc etc.  If we were starting from scratch, we might reduce
 that to just int4 + int4 and rely on the implicit coercion from int2 to
 int4 to handle the other cases; but there's no way we'd put in run-time
 type determination.

 regards, tom lane



Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Andres Freund
On 2013-04-04 08:03:03 +1300, Gavin Flower wrote:
 On 04/04/13 07:58, Kevin Grittner wrote:
 Gavin Flower gavinflo...@archidevsys.co.nz wrote:
 
 Anyhow, I think we should standardise on zero as the initial
 index to be as consistent as practicable.
 If you want to suggest a default of zero for the first subscript of
 an array in SQL, please don't confuse the issue by using any form
 of the word standard in that proposal.  There are ANSI and ISO
 standards for SQL, and they require that the first element of an
 array is one.  I'm OK with *extending* the standard by *allowing*
 other values, but let's not flout the standard and break existing
 code by changing the *default*.

That's already possible:

postgres=# SELECT ('[0:3]={e1,e2,e3,e4}'::text[])[0];
 text 
------
 e1
(1 row)

Not too convenient, but ...

 You omitted my rider 'However, not with a religious zeal at the expense of
 practical considerations!' Which was meant to cover concerns like yours.

Given it's already possible, I don't understand what you propose then. A
GUC that allows changing the default? That would need to be included in
dumps and such, and would make old dumps - which won't include an
explicit SET of the current value - ambiguous to load. So that doesn't
seem to be a good idea either.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] commit dfda6ebaec67 versus wal_keep_segments

2013-04-03 Thread Jeff Janes
On Wed, Apr 3, 2013 at 11:14 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 03.04.2013 18:58, Jeff Janes wrote:

 On Tue, Apr 2, 2013 at 10:08 PM, Jeff Janesjeff.ja...@gmail.com  wrote:

  This commit introduced a problem with wal_keep_segments:

  commit dfda6ebaec6763090fb78b458a979b558c50b39b


 The problem seems to be that the underflow warned about is happening,
 because the check to guard it was checking the wrong thing.  However, I
 don't really understand KeepLogSeg.  It seems like segno, and hence
 recptr,
 don't actually serve any purpose.


 Hmm, the check is actually correct, but the assignment in the else-branch
 isn't. The idea of KeepLogSeg is to calculate recptr - wal_keep_segments,
 and assign that to *logSegNo. But only if *logSegNo is not already smaller
 than the calculated value. Does the attached look correct to you?


Let me describe what I think is going on.  My description is: on start,
recptr is the redo location of the just-completed checkpoint, and logSegNo
is the redo location segment of the checkpoint before that one.  We want to
keep the previous-checkpoint redo location, and we also want to keep
wal_keep_segments before the current-checkpoint redo location, so we take
whichever is earlier.

If my understanding is now correct, then I think your patch looks correct.
 (Also, applying it fixed the problem I was having.)

Why do we keep wal_keep_segments before the just-finished checkpoint,
rather than keeping that many before the previous checkpoint?  It seems like
it would be more intuitive (to the DBA) for that parameter to mean "keep
this many more segments than you otherwise would."  I'm not proposing we
change it, I'm just curious about why it is done that way.

Thanks,

Jeff


Re: [HACKERS] Drastic performance loss in assert-enabled build in HEAD

2013-04-03 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:

 In fact, I'm going to go further and say that I do not like the
 entire concept of scannability, either as to design or
 implementation, and I think we should just plain rip it out.

 This has been my feeling from the beginning, so I'm happy to
 support this position.  I think the current version - where
 scan-ability is tracked in just one way - is an improvement over
 the previous version of the patch - where it was tracked in two
 different ways with a confusing shuffle of information from one
 place to the other. But my favorite number of places to track it
 would be zero.

 To be clear, I think we'll end up tracking some notion of
 scannability eventually.  I just don't think the current notion
 is sufficiently baked to want to promise to be upward-compatible
 with it in future.

To be honest, I don't think I've personally seen a single use case
for matviews where they could be used if you couldn't count on an
error if attempting to use them without the contents reflecting a
materialization of the associated query at *some* point in time.
There probably are such, but I think removing this entirely takes
the percentage of use cases covered by the implementation in this
release down from 10% to 2%.

Consider where the Wisconsin Courts use home-grown materialized
views currently:

(1) On the public web site for circuit court data, visibility is
based on supreme court rules and the advice of a committee
consisting of judges, representatives of the press, defense
attorneys, prosecuting attorneys, etc.  There are cases in the
database which, for one reason or another, should not show up on
the public web site.  On a weekly basis, where monitoring shows the
lowest usage, the table of cases which are too old to be shown
according to the rules thus determined is regenerated.  If there
was the possibility that a dump and load could fail to fill this,
and the queries would run without error, they could not move from
ad hoc matview techniques to the new feature without excessive
risk.

(2) Individual judges have a dashboard showing such things as the
number of court cases which have gone beyond certain thresholds
without action.  They can drill down to detail so that cases
which have slipped through the cracks can be scheduled for some
appropriate action.  Justice delayed... and all of that.  It
would be much better to get an error which would result in
"information currently unavailable" than to give the impression
that there are no such cases.

Since the main point of this patch is to get the basis for a more
complete matview feature into place while still supporting *some*
use cases, and a high priority needs to be placed on not painting
ourselves into a corner, I agree we should rip this out if we think
it does so.  I have spent some time looking at what we will want to
add in future releases, and a more sophisticated concept of what is
"fresh enough" to allow use of the materialized data is certainly
on the list, and falling back to running the underlying query like
a normal view would be a reasonable option to be able to choose
in some cases; but I think that will *always* need to start with
information about whether data *has* been generated, and an empty
set has to be considered valid data if it has been.  If we come up
with a way to track that which isn't too fragile, I'm confident
that it will remain useful as the feature evolves. Making sure that
the heap has at least one page if data has been generated seems
like a not-entirely-unreasonable way to do that, although there
remains at least one vacuum bug to fix if we keep it, in addition
to Tom's concerns.  It has the advantage of playing nicely with
unlogged tables.  If this is not going to be what we use long term,
do we have a clue what is?

Personally, I think it would be sad to reduce the use cases for
which matviews in 9.3 can be used to those which can tolerate an
error-free reference to a matview for which data has not been
generated, but I'll rip out support for that distinction if that is
the consensus.

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] c language functions

2013-04-03 Thread Robert Haas
On Wed, Apr 3, 2013 at 3:25 PM, Rodrigo Barboza rodrigombu...@gmail.com wrote:
 Well, I was checking inside my function the type of the second argument and
 switching between the allowed types.
 This way kind of does the same thing as many functions, doesn't it?

Sure, except that it will also call your function when someone does
my_int + text or my_int + bytea or my_int + box, which are surely a
lot less sensible.  It's probably a good idea to assume that if you
make your type work like the existing types, it's more likely to end
well than if you make it behave completely differently.

...Robert




Re: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review]

2013-04-03 Thread Robert Haas
On Wed, Apr 3, 2013 at 2:54 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Apr 2, 2013 at 12:19 PM, Peter Eisentraut pete...@gmx.net wrote:
 It's weird that SET LOCAL and SET SESSION actually *set* the value, and
 the second key word determines how long the setting will last.  SET
 PERSISTENT doesn't actually set the value.  I predict that this will be
 a new favorite help-it-doesn't-work FAQ.

 I think this is another argument against this particular syntax.  I
 have always thought that something along the lines of ALTER SYSTEM
 would be more appropriate.  ALTER DATABASE .. SET and ALTER ROLE ..
 SET don't change the value immediately either, and nobody gets
 confused about that to my knowledge.  But I can see where SET
 PERSISTENT could cause that sort of confusion.

 Yeah, I think I argued for using the SET syntax to start with, but
 I'm coming around to the position that SET PERSISTENT is too much
 unlike the behavior of other varieties of SET.  ALTER is sounding
 more attractive to me now.  Not sure about ALTER SYSTEM in particular
 though --- it's not clear that that has any real merit other than
 already existing as a keyword.  (Not that that's negligible.)
 ALTER CONFIGURATION is another alternative using an existing keyword
 that might be worth considering.

Yeah, I thought about something like that.  Aside from saving on
keywords, the reason I like ALTER SYSTEM or similar is that I suspect
there will be other system-wide things that we may want to let people
ALTER in the future, so I think that route might avoid an unnecessary
proliferation of top-level commands.  I am not, however, deadly
attached to the idea, if someone's got a good reason for preferring
something else.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] c language functions

2013-04-03 Thread Rodrigo Barboza
I see, that's true.
I'm returning an unknown type, so there is a little more overhead. But it's
working now. =]
Thanks for the help guys!


On Wed, Apr 3, 2013 at 6:17 PM, Robert Haas robertmh...@gmail.com wrote:

 On Wed, Apr 3, 2013 at 3:25 PM, Rodrigo Barboza rodrigombu...@gmail.com
 wrote:
  Well, I was checking inside my function the type of the second argument
 and
  switching between the allowed types.
  This way kind of does the same thing as many functions, doesn't it?

 Sure, except that it will also call your function when someone does
 my_int + text or my_int + bytea or my_int + box, which are surely a
 lot less sensible.  It's probably a good idea to assume that if you
 make your type work like the existing types, it's more likely to end
 well than if you make it behave completely differently.

 ...Robert



Re: [HACKERS] Drastic performance loss in assert-enabled build in HEAD

2013-04-03 Thread Tom Lane
Kevin Grittner kgri...@ymail.com writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 In fact, I'm going to go further and say that I do not like the
 entire concept of scannability, either as to design or
 implementation, and I think we should just plain rip it out.

 To be honest, I don't think I've personally seen a single use case
 for matviews where they could be used if you couldn't count on an
 error if attempting to use them without the contents reflecting a
 materialization of the associated query at *some* point in time.

Well, if we remove the WITH NO DATA clause from CREATE MATERIALIZED
VIEW, that minimum requirement is satisfied, no?

I wouldn't actually want to remove that option, because pg_dump will
need it to avoid circular-reference problems.  But if you simply don't
use it then you have the minimum guarantee.  And I do not see where
the current implementation is giving you any more guarantees.

What it *is* doing is setting a rather dubious behavioral precedent that
we will no doubt hear Robert complaining about when (not if) we decide
we don't want to be backward-compatible with it anymore.

Granting that throwing an error is actually of some use to some people,
I would not think that people would want to turn it on via a command
that throws away the existing view contents altogether, nor turn it off
with a full-throated REFRESH.  There are going to need to be ways to
incrementally update matviews, and ways to disable/enable access that
are not tied to a complete rebuild, not to mention being based on
user-determined rather than hard-wired criteria for what's too stale.
So I don't think this is a useful base to build on.

If you feel that scannability disable is an absolute must for version 0,
let's invent a matview reloption or some such to implement it and let
users turn it on and off as they wish.  That seems a lot more likely
to still be useful two years from now.  And if you're absolutely
convinced that unlogged matviews mustn't work as I suggest, we can
lose those from 9.3, too.

What I'd actually rather see us spending time on right now is making
some provision for incremental updates, which I will boldly propose
could be supported by user-written triggers on the underlying tables
if we only diked out the prohibitions against INSERT/UPDATE/DELETE on
matviews, and allowed them to operate on a matview's contents just like
it was a table.  Now admittedly that would foreclose allowing matviews
to be updatable in the updatable-view sense, but that's a feature I
would readily give up if it meant users could build incremental update
mechanisms this year and not two years down the road.

 (1) On the public web site for circuit court data, visibility is
 based on supreme court rules and the advice of a committee
 consisting of judges, representatives of the press, defense
 attorneys, prosecuting attorneys, etc.  There are cases in the
 database which, for one reason or another, should not show up on
 the public web site.  On a weekly basis, where monitoring shows the
 lowest usage, the table of cases which are too old to be shown
 according to the rules thus determined is regenerated.  If there
 was the possibility that a dump and load could fail to fill this,
 and the queries would run without error, they could not move from
 ad hoc matview techniques to the new feature without excessive
 risk.

Why exactly do you think that what I'm suggesting would cause a dump and
reload to not regenerate the data?  (Personally, I think pg_dump ought
to have an option selecting whether or not to repopulate matviews, but
again, if that's not what you want *don't use it*.)

 (2) Individual judges have a dashboard showing such things as the
 number of court cases which have gone beyond certain thresholds
 without action.  They can drill down to detail so that cases
 which have slipped through the cracks can be scheduled for some
 appropriate action.  Justice delayed... and all of that.  It
 would be much better to get an error which would result in
 information currently unavailable than to give the impression
 that there are no such cases.

If you need 100% accuracy, which is what this scenario appears to be
demanding, matviews are not for you (yet).  The existing implementation
certainly is nowhere near satisfying this scenario.

 ... Making sure that
 the heap has at least one page if data has been generated seems
 like a not-entirely-unreasonable way to do that, although there
 remains at least one vacuum bug to fix if we keep it, in addition
 to Tom's concerns.

No.  This is an absolute disaster.  It's taking something we have always
considered to be an irrelevant implementation detail and making it into
user-visible DDL state, despite the fact that it doesn't begin to satisfy
basic transactional behaviors.  We *need* to get rid of that aspect of
things.  If you must have scannability state in version 0, okay, but
it has to be a catalog property not this.

 It has the advantage of playing nicely with
 

[HACKERS] Clang compiler warning on 9.3 HEAD

2013-04-03 Thread Will Leinweber
On ref 8507907, when compiling with clang on OS X, I got this warning, which
seems like a possible bug.

I thought to report this because I imagine clang isn't frequently used
day-to-day by most.

dependency.c:213:36: warning: implicit conversion from enumeration type
'ObjectClass' (aka 'enum ObjectClass') to different
  enumeration type 'ObjectType' (aka 'enum ObjectType') [-Wconversion]

EventTriggerSupportsObjectType(getObjectClass(thisobj)))
~~
^~~
1 warning generated.
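
The diagnostic is easy to reproduce in isolation; a minimal standalone
sketch (cut-down hypothetical enums, not the real backend definitions),
compiled with clang and -Wconversion:

enum ObjectClass { OCLASS_CLASS, OCLASS_PROC };
enum ObjectType { OBJECT_TABLE, OBJECT_FUNCTION };

static int
supports_object_type(enum ObjectType objtype)
{
	return objtype == OBJECT_TABLE;
}

int
main(void)
{
	enum ObjectClass oclass = OCLASS_PROC;

	/* clang warns here: ObjectClass is implicitly converted to the
	 * unrelated ObjectType, just as in the dependency.c report */
	return supports_object_type(oclass);
}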


Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Jim Nasby

On 4/3/13 10:34 AM, Gavin Flower wrote:

Maybe we should adopt the famous compromise of '0.5'?


+0.5. ;P
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




[HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Jeff Janes
I've changed the subject from "regression test failed when enabling
checksum" because I now know they are totally unrelated.

My test case didn't need to depend on archiving being on, and so with a
simple tweak I rendered the two issues orthogonal.


On Wed, Apr 3, 2013 at 12:15 PM, Jeff Davis pg...@j-davis.com wrote:

 On Mon, 2013-04-01 at 19:51 -0700, Jeff Janes wrote:

  What I would probably really want is the data as it existed after the
  crash but before recovery started, but since the postmaster
  immediately starts recovery after the crash, I don't know of a good
  way to capture this.

 Can you just turn off the restart_after_crash GUC? I had a chance to
 look at this, and seeing the block before and after recovery would be
 nice. I didn't see a log file in the data directory, but it didn't go
 through recovery, so I assume it already did that.


You don't know that the cluster is in the bad state until after it goes
through recovery because most crashes recover perfectly fine.  So it would
have to make a side-copy of the cluster after the crash, then recover the
original and see how things go, then either retain or delete the side-copy.
 Unfortunately my testing harness can't do this at the moment, because the
perl script storing the consistency info needs to survive over the crash
and recovery.   It might take me awhile to figure out how to make it do
this.

I have the server log just go to stderr, where it gets mingled together
with messages from my testing harness.  I had not uploaded that file.

Here is a new upload. It contains both a data_dir tarball including xlog,
and the mingled stderr (do_new.out)

https://drive.google.com/folderview?id=0Bzqrh1SO9FcEQmVzSjlmdWZvUHc&usp=sharing

The other 3 files in it constitute the harness.  It is quite a mess, with
some hard-coded paths.  The just-posted fix for wal_keep_segments will also
have to be applied.




 The block is corrupt as far as I can tell. The first third is written,
 and the remainder is all zeros. The header looks like this:


Yes, that part is by my design.  Why it didn't get fixed from a FPI is not
by my design, of course.



 So, the page may be corrupt without checksums as well, but it just
 happens to be hidden for the same reason. Can you try to reproduce it
 without -k?


No, things run (seemingly) fine without -k.


 And on the checkin right before checksums were added?
 Without checksums, you'll need to use pg_filedump (or similar) to find
 whether an error has happened.


I'll work on it, but it will take awhile to figure out exactly how to use
pg_filedump to do that, and how to automate that process and incorporate it
into the harness.

In the meantime, I tested the checksum commit itself (96ef3b8ff1c) and the
problem occurs there, so if the problem is not the checksums themselves
(and I agree it probably isn't) it must have been introduced before that
rather than after.

Cheers,

Jeff


Re: [HACKERS] Clang compiler warning on 9.3 HEAD

2013-04-03 Thread Peter Geoghegan
On Wed, Apr 3, 2013 at 11:52 PM, Will Leinweber w...@heroku.com wrote:
 On ref 8507907, when compiling with clang on OS X, I got this warning which
 seems like a possible bug.

 I thought to report this because I imagine clang isn't frequently used
 day-to-day by most.

I actually reported a bug that was thrown up by Clang for similar
reasons over a year ago (i.e. an implicit conversion of one enum type
to another). That bug was fixed (by this commit:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=05e83968929f4ec1eba058fcae755fd2df98864e
), and the diagnostics were considered generally sound and useful at
the time. This looks like a real problem to me.
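
In miniature, the pattern clang is complaining about looks like this (a
contrived sketch with stand-in enum members, not the PostgreSQL code in
question; only the shape matches the warning quoted upthread):

#include <stdbool.h>

/* Stand-in enums: passing an ObjectClass where an ObjectType is
 * expected is an implicit enum-to-enum conversion, which clang flags
 * under -Wconversion while gcc of that era generally stayed quiet. */
typedef enum ObjectClass
{
    OCLASS_CLASS,
    OCLASS_PROC
} ObjectClass;

typedef enum ObjectType
{
    OBJECT_TABLE,
    OBJECT_FUNCTION
} ObjectType;

static bool
EventTriggerSupportsObjectType(ObjectType obtype)
{
    return obtype == OBJECT_TABLE;
}

static bool
example(ObjectClass oclass)
{
    /* clang warns here: implicit conversion 'ObjectClass' -> 'ObjectType' */
    return EventTriggerSupportsObjectType(oclass);
}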


-- 
Peter Geoghegan




Re: [HACKERS] CREATE EXTENSION BLOCKS

2013-04-03 Thread David E. Wheeler
On Apr 3, 2013, at 11:41 AM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Oh, they are not unique per-schema? I guess they are global to the database 
 but then their objects are in the specified schema, then.
 
 Right -- an extension is not considered to live within a schema, they
 are database-global.  The objects might live in a particular schema (if
 it is relocatable), and there's support to move those to a different
 schema, but this doesn't affect the extension itself.

Thanks. I humbly submit this patch to help prevent silly questions like this in 
the future.

diff --git a/doc/src/sgml/ref/create_extension.sgml 
b/doc/src/sgml/ref/create_extension.sgml
index 4f3b9a5..4ab3dff 100644
--- a/doc/src/sgml/ref/create_extension.sgml
+++ b/doc/src/sgml/ref/create_extension.sgml
@@ -93,6 +93,8 @@ CREATE EXTENSION [ IF NOT EXISTS ] replaceable 
class=parameterextension_name
 relocated.  The named schema must already exist.
 If not specified, and the extension's control file does not specify a
 schema either, the current default object creation schema is used.
+Note that only the extension objects will be placed into the named
+schema; the extension itself is a database-global object.
/para
   /listitem
  /varlistentry

Best,

David





Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Greg Stark
On Wed, Apr 3, 2013 at 3:00 PM, Robert Haas robertmh...@gmail.com wrote:

 The main hesitation I've had about actually implementing such a scheme
 is that I find it a bit unappealing to have a background process
 dedicated to just this.  But maybe it could be combined with some of
 the other ideas presented here.  Perhaps we should have one process
 that scans the buffer arena and populates the freelists; as a side
 effect, if it runs across a dirty buffer, it kicks it over to the
 process described in the previous paragraph (which could still, also,
 absorb requests from other backends using buffer access strategies).
 Then we'd end up with nothing that looks exactly like the background
 writer we have now, but maybe no one would miss it.


I think the general pattern of development has led in the opposite
direction. Every time there's been one daemon responsible for two things
it's ended up causing contorted code and difficult to tune behaviours and
we've ended up splitting the two.

In particular in this case it seems like an especially poor choice. In the
time one buffer write might take the entire freelist might empty out. I
could easily imagine this happening *every* time a write I/O happens. It
seems more likely that you'll need multiple processes running the buffer
cleaning to keep up with a decent I/O subsystem.

I'm still skeptical about the idea of a freelist. That just seems like a
terrible point of contention. However perhaps that's because I'm picturing
an LRU linked list. Perhaps the right thing is to maintain a pool of
buffers in some less contention-prone data structure which lets each
backend pick buffers out more or less independently of the others.

-- 
greg


Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-03 15:57:49 -0700, Jeff Janes wrote:
 I've changed the subject from "regression test failed when enabling
 checksum" because I now know they are totally unrelated.
 
 My test case didn't need to depend on archiving being on, and so with a
 simple tweak I rendered the two issues orthogonal.
 
 
 On Wed, Apr 3, 2013 at 12:15 PM, Jeff Davis pg...@j-davis.com wrote:
 
  On Mon, 2013-04-01 at 19:51 -0700, Jeff Janes wrote:
 
   What I would probably really want is the data as it existed after the
   crash but before recovery started, but since the postmaster
   immediately starts recovery after the crash, I don't know of a good
   way to capture this.
 
  Can you just turn off the restart_after_crash GUC? I had a chance to
  look at this, and seeing the block before and after recovery would be
  nice. I didn't see a log file in the data directory, but it didn't go
  through recovery, so I assume it already did that.
 
 
 You don't know that the cluster is in the bad state until after it goes
 through recovery because most crashes recover perfectly fine.  So it would
 have to make a side-copy of the cluster after the crash, then recover the
 original and see how things go, then either retain or delete the side-copy.
  Unfortunately my testing harness can't do this at the moment, because the
 perl script storing the consistency info needs to survive over the crash
 and recovery.   It might take me a while to figure out how to make it do
 this.
 
 I have the server log just go to stderr, where it gets mingled together
 with messages from my testing harness.  I had not uploaded that file.
 
 Here is a new upload. It contains both a data_dir tarball including xlog,
 and the mingled stderr (do_new.out)
 
 https://drive.google.com/folderview?id=0Bzqrh1SO9FcEQmVzSjlmdWZvUHc&usp=sharing
 
 The other 3 files in it constitute the harness.  It is quite a mess, with
 some hard-coded paths.  The just-posted fix for wal_keep_segments will also
 have to be applied.
 
 
 
 
  The block is corrupt as far as I can tell. The first third is written,
  and the remainder is all zeros. The header looks like this:
 
 
 Yes, that part is by my design.  Why it didn't get fixed from a FPI is not
 by my design, of course.

There are no FPIs (if you mean a full page image with that) afaics:

Your logs tell us about recovery:
27692  2013-04-03 10:09:15.621 PDT:LOG:  redo starts at 1/3190
27692  2013-04-03 10:09:15.647 PDT:LOG:  incorrect resource manager data 
checksum in record at 1/31169A68
27692  2013-04-03 10:09:15.647 PDT:LOG:  redo done at 1/31169A38

And then the error:

27698 SELECT 2013-04-03 10:09:16.688 PDT:WARNING:  page verification failed, 
calculated checksum 16296 but expected 49517
27698 SELECT 2013-04-03 10:09:16.688 PDT:ERROR:  invalid page in block 90 of 
relation base/16384/835589


looking at xlog from that time, the first records are:
rmgr: XLOG    len (rec/tot): 72/   104, tx:  0, lsn:
1/3128, prev 1/3090, bkp: 0000, desc: checkpoint: redo 1/3128; tli
1; prev tli 1; fpw true; xid 0/26254997; oid 835589; multi 1; offset 0; oldest
xid 1799 in DB 1; oldest multi 1 in DB 1; oldest running xid 0; online
rmgr: XLOG    len (rec/tot):  4/36, tx:  0, lsn:
1/3190, prev 1/3128, bkp: 0000, desc: nextOid: 843781
rmgr: Storage len (rec/tot): 16/48, tx:  0, lsn:
1/31B8, prev 1/3190, bkp: 0000, desc: file create: base/16384/835589
rmgr: Heap    len (rec/tot): 37/  7905, tx:   26254997, lsn:
1/31E8, prev 1/31B8, bkp: 1000, desc: hot_update: rel 1663/16384/12660;
tid 0/47 xmax 26254997 ; new tid 0/33 xmax 0

So the file was created after the last checkpoint, so no full pages need
to be logged. And we theoretically should be able to recover all things
like torn pages from the WAL since that should cover everything that
happened to that file.

The bkp block 1 indicated in the 4th record above is the only one in
that period of WAL btw.

Looking a bit more, the last page that's covered by WAL is:
rmgr: Heap    len (rec/tot): 35/67, tx:   26254998, lsn:
1/311699A8, prev 1/31169960, bkp: 0000, desc: insert: rel 1663/16384/835589;
tid 44/56

which is far earlier than block 90, where the error is found, unless I
misunderstand something. Which is strange, since non-zero content of
those pages shouldn't go to disk until the respective WAL is
flushed. Huh, are we missing something here?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Gavin Flower

On 04/04/13 11:55, Jim Nasby wrote:

On 4/3/13 10:34 AM, Gavin Flower wrote:

Maybe we should adopt the famous compromise of '0.5'?


+0.5. ;P

Far too positive for our grim world!  How about '-0.5' instead?  :-)

I notice you call yourself a 'Data Architect' - never too sure if I 
should call myself

a 'software architect who pretends he can program' or
a 'programmer with pretensions of being a software architect'?  :-)


Re: [HACKERS] patch to add \watch to psql

2013-04-03 Thread Will Leinweber
Here is an updated patch that addresses several of the points brought up so
far, such as the sleep, internationalization banner, and zero wait check,
and it removes the premature input check.

Unfortunately rl_clear_screen() is not included at all in libedit, causing
compilation to fail, and I was completely unable to find a way to
distinguish libedit from readline on OS X. It tries extraordinarily hard to
pretend that it's readline. Instead, falling back to simple control
characters to clear the screen worked very well on both Linux and OS X.
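
For reference, the fallback amounts to something like this (a minimal
sketch, not the patch itself; it assumes an ANSI/VT100-compatible
terminal, which covers both the Linux and OS X cases above):

#include <stdio.h>

/* Clear the screen without readline: ESC[H homes the cursor,
 * ESC[2J erases the display. */
static void
clear_screen(void)
{
    fputs("\033[H\033[2J", stdout);
    fflush(stdout);
}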


On Tue, Mar 26, 2013 at 2:14 PM, Peter Eisentraut pete...@gmx.net wrote:

 On 3/24/13 3:10 PM, Tom Lane wrote:
  I also concur with the complaint here
 
 http://www.postgresql.org/message-id/caazkufzxyj-rt1aec6s0g7zm68tdlfbbm1r6hgrbbxnz80k...@mail.gmail.com
  that allowing a minimum sleep of 0 is rather dangerous

 The original watch command apparently silently corrects a delay of 0
 to 0.1 seconds.

  Another minor question is whether we really need the time-of-day in the
  banner,

 That's also part of the original watch and occasionally useful, I think.




psql-watch-v4.patch
Description: Binary data



Re: [HACKERS] spoonbill vs. -HEAD

2013-04-03 Thread Tom Lane
Stefan Kaltenbrunner ste...@kaltenbrunner.cc writes:
 On 04/03/2013 12:59 AM, Tom Lane wrote:
 BTW, on further thought it seems like maybe this is an OpenBSD bug,
 at least in part: what is evidently happening is that the temporary
 blockage of SIGINT during the handler persists even after we've
 longjmp'd back to the main loop.  But we're using sigsetjmp(..., 1)
 to establish that longjmp handler --- so why isn't the original signal
 mask reinstalled when we return to the main loop?
 
 If (your version of) OpenBSD is getting this wrong, it'd explain why
 we've not seen similar behavior elsewhere.

 hmm trolling the openbsd cvs history brings up this:
 http://www.openbsd.org/cgi-bin/cvsweb/src/sys/arch/sparc64/sparc64/machdep.c?r1=1.143;sortby=date#rev1.143

That's about alternate signal stacks, which we're not using.

I put together a simple test program (attached) and tried it on
spoonbill, and it says that the signal *does* get unblocked when control
returns to the sigsetjmp(...,1).  So now I'm really confused.  Somehow
the results we're getting in a full-fledged backend do not match up with
the results gotten by this test program ... but why?

regards, tom lane

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <setjmp.h>

typedef void (*pqsigfunc) (int signo);

sigjmp_buf *PG_exception_stack;

static void
printmask(const char *labl)
{
    sigset_t sigs;
    int i;

    printf("masked signals at %s: ", labl);
    if (sigprocmask(SIG_SETMASK, NULL, &sigs) == 0)
    {
        for (i = 1; i <= 32; i++)
        {
//          if (sigs.sigset[0] & sigmask(i))
            if (sigs & sigmask(i))
                printf("%d, ", i);
        }
    }
    else
        printf("failed!");
    printf("\n");
    fflush(stdout);
}

static void
sigint_handler(int signo)
{
    printmask("sigint_handler");
    siglongjmp(*PG_exception_stack, 1);
    abort();
}

static pqsigfunc
pqsignal(int signo, pqsigfunc func)
{
    struct sigaction act,
                oact;

    act.sa_handler = func;
    sigemptyset(&act.sa_mask);
    act.sa_flags = 0;
    if (signo != SIGALRM)
        act.sa_flags |= SA_RESTART;
#ifdef SA_NOCLDSTOP
    if (signo == SIGCHLD)
        act.sa_flags |= SA_NOCLDSTOP;
#endif
    if (sigaction(signo, &act, &oact) < 0)
        return SIG_ERR;
    return oact.sa_handler;
}

void
intermediate_func(int level)
{
    do {
        sigjmp_buf *save_exception_stack = PG_exception_stack;
        sigjmp_buf local_sigjmp_buf;

        if (sigsetjmp(local_sigjmp_buf, 0) == 0)
        {
            PG_exception_stack = &local_sigjmp_buf;

            if (level > 0)
                intermediate_func(level - 1);
            else
            {
                kill(getpid(), SIGINT);
                sleep(1);
                abort();
            }
        }
        else
        {
            PG_exception_stack = save_exception_stack;

            printmask("inner longjmp catcher");
            siglongjmp(*PG_exception_stack, 1);
            abort();
        }
        PG_exception_stack = save_exception_stack;
    } while (0);
}

int
main(int argc, char **argv)
{
    sigjmp_buf  outer_sigjmp_buf;

    printmask("startup");
    pqsignal(SIGINT, sigint_handler);

    if (sigsetjmp(outer_sigjmp_buf, 1) != 0)
    {
        printmask("outer longjmp catcher");
        return 0;
    }
    PG_exception_stack = &outer_sigjmp_buf;

    intermediate_func(2);

    printmask("unexpected exit");
    return 0;
}



Re: [HACKERS] patch to add \watch to psql

2013-04-03 Thread Dickson S. Guedes
2013/4/3 Will Leinweber w...@heroku.com:
 Here is an updated patch that addresses several of the points brought up so
 far, such as the sleep, internationalization banner, and zero wait check,
 and it removes the premature input check.

 Unfortunately rl_clear_screen() is not included at all in libedit, causing
 compilation to fail, and I was completely unable to find a way to
 distinguish libedit from readline on OS X. It tries extraordinarily hard to
 pretend that it's readline. Instead falling back to simple control
 characters to clear the screen worked very well on both linux and OS X.

I don't have access to an OSX box ATM but term_clear_screen(), in
libedit, didn't help?

-- 
Dickson S. Guedes
mail/xmpp: gue...@guedesoft.net - skype: guediz
http://github.com/guedes - http://guedesoft.net
http://www.postgresql.org.br




Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-04 01:52:41 +0200, Andres Freund wrote:
 On 2013-04-03 15:57:49 -0700, Jeff Janes wrote:
  I've changed the subject from "regression test failed when enabling
  checksum" because I now know they are totally unrelated.
  
  My test case didn't need to depend on archiving being on, and so with a
  simple tweak I rendered the two issues orthogonal.
  
  
  On Wed, Apr 3, 2013 at 12:15 PM, Jeff Davis pg...@j-davis.com wrote:
  
   On Mon, 2013-04-01 at 19:51 -0700, Jeff Janes wrote:
  
What I would probably really want is the data as it existed after the
crash but before recovery started, but since the postmaster
immediately starts recovery after the crash, I don't know of a good
way to capture this.
  
   Can you just turn off the restart_after_crash GUC? I had a chance to
   look at this, and seeing the block before and after recovery would be
   nice. I didn't see a log file in the data directory, but it didn't go
   through recovery, so I assume it already did that.
  
  
  You don't know that the cluster is in the bad state until after it goes
  through recovery because most crashes recover perfectly fine.  So it would
  have to make a side-copy of the cluster after the crash, then recover the
  original and see how things go, then either retain or delete the side-copy.
   Unfortunately my testing harness can't do this at the moment, because the
  perl script storing the consistency info needs to survive over the crash
  and recovery.   It might take me a while to figure out how to make it do
  this.
  
  I have the server log just go to stderr, where it gets mingled together
  with messages from my testing harness.  I had not uploaded that file.
  
  Here is a new upload. It contains both a data_dir tarball including xlog,
  and the mingled stderr (do_new.out)
  
  https://drive.google.com/folderview?id=0Bzqrh1SO9FcEQmVzSjlmdWZvUHc&usp=sharing
  
  The other 3 files in it constitute the harness.  It is quite a mess, with
  some hard-coded paths.  The just-posted fix for wal_keep_segments will also
  have to be applied.
  
  
  
  
   The block is corrupt as far as I can tell. The first third is written,
   and the remainder is all zeros. The header looks like this:
  
  
  Yes, that part is by my design.  Why it didn't get fixed from a FPI is not
  by my design, of course.
 
 There are no FPIs (if you mean a full page image with that) afaics:
 
 Your logs tell us about recovery:
 27692  2013-04-03 10:09:15.621 PDT:LOG:  redo starts at 1/3190
 27692  2013-04-03 10:09:15.647 PDT:LOG:  incorrect resource manager data 
 checksum in record at 1/31169A68
 27692  2013-04-03 10:09:15.647 PDT:LOG:  redo done at 1/31169A38
 
 And then the error:
 
 27698 SELECT 2013-04-03 10:09:16.688 PDT:WARNING:  page verification failed, 
 calculated checksum 16296 but expected 49517
 27698 SELECT 2013-04-03 10:09:16.688 PDT:ERROR:  invalid page in block 90 of 
 relation base/16384/835589
 
 
 looking at xlog from that time, the first records are:
 rmgr: XLOG    len (rec/tot): 72/   104, tx:  0, lsn:
 1/3128, prev 1/3090, bkp: 0000, desc: checkpoint: redo 1/3128;
 tli 1; prev tli 1; fpw true; xid 0/26254997; oid 835589; multi 1; offset 0;
 oldest xid 1799 in DB 1; oldest multi 1 in DB 1; oldest running xid 0; online
 rmgr: XLOG    len (rec/tot):  4/36, tx:  0, lsn:
 1/3190, prev 1/3128, bkp: 0000, desc: nextOid: 843781
 rmgr: Storage len (rec/tot): 16/48, tx:  0, lsn:
 1/31B8, prev 1/3190, bkp: 0000, desc: file create: base/16384/835589
 rmgr: Heap    len (rec/tot): 37/  7905, tx:   26254997, lsn:
 1/31E8, prev 1/31B8, bkp: 1000, desc: hot_update: rel
 1663/16384/12660; tid 0/47 xmax 26254997 ; new tid 0/33 xmax 0
 
 So the file was created after the last checkpoint, so no full pages need
 to be logged. And we theoretically should be able to recover all things
 like torn pages from the WAL since that should cover everything that
 happened to that file.
 
 The bkp block 1 indicated in the 4th record above is the only one in
 that period of WAL btw.
 
 Looking a bit more, the last page that's covered by WAL is:
 rmgr: Heap    len (rec/tot): 35/67, tx:   26254998, lsn:
 1/311699A8, prev 1/31169960, bkp: 0000, desc: insert: rel 1663/16384/835589;
 tid 44/56
 
 which is far earlier than block 90, where the error is found, unless I
 misunderstand something. Which is strange, since non-zero content of
 those pages shouldn't go to disk until the respective WAL is
 flushed. Huh, are we missing something here?

Looking at the page lsn's with dd I noticed something peculiar:

page 0:
01 00 00 00 18 c2 00 31 = 1/3100C218
page 1:
01 00 00 00 80 44 01 31 = 1/31014480
page 10:
01 00 00 00 60 ce 05 31 = 1/3105ce60
page 43:
01 00 00 00 58 7a 16 31 = 1/31167a58
page 44:
01 00 00 00 f0 99 16 31 = 1/311699f0
page 45:
00 00 00 00 00 00 00 00 = 0/0
page 90:
01 00 00 00 90 17 1d 32 = 1/321d1790
page 91:
01 00 00 00 38 ef 1b 32 = 1/321bef38

So we have written out pages that are after pages without an LSN that
have an LSN that's *beyond* the point XLOG has successfully been written
to disk (1/31169A38)?
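
For anyone reproducing this, the pd_lsn values above can be read straight
out of the relation file with a few lines of C (a sketch resting on two
assumptions: a little-endian machine, and the 9.3 page layout whose first
eight bytes are {uint32 xlogid; uint32 xrecoff}):

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define BLCKSZ 8192

/* Print the pd_lsn of one 8 kB block of a relation file.  Assumes
 * little-endian byte order and the 9.3 page header layout. */
int
main(int argc, char **argv)
{
    unsigned char page[BLCKSZ];
    uint32_t    xlogid;
    uint32_t    xrecoff;
    long        blkno;
    FILE       *f;

    if (argc != 3)
    {
        fprintf(stderr, "usage: %s relfile blkno\n", argv[0]);
        return 1;
    }
    blkno = atol(argv[2]);
    f = fopen(argv[1], "rb");
    if (f == NULL ||
        fseek(f, blkno * BLCKSZ, SEEK_SET) != 0 ||
        fread(page, 1, BLCKSZ, f) != BLCKSZ)
    {
        perror("read");
        return 1;
    }
    memcpy(&xlogid, page, 4);
    memcpy(&xrecoff, page + 4, 4);
    printf("page %ld: %X/%08X\n", blkno, xlogid, xrecoff);
    return 0;
}

Run against base/16384/835589 with block 90 this should print the same
LSN (1/321D1790) as the dd output above.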

Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Jeff Davis
 On Wed, 2013-04-03 at 15:57 -0700, Jeff Janes wrote:



 You don't know that the cluster is in the bad state until after it
 goes through recovery because most crashes recover perfectly fine.  So
 it would have to make a side-copy of the cluster after the crash, then
 recover the original and see how things go, then either retain or
 delete the side-copy.  Unfortunately my testing harness can't do this
 at the moment, because the perl script storing the consistency info
 needs to survive over the crash and recovery.   It might take
 me a while to figure out how to make it do this.

Hmm... you're right, that is difficult to integrate into a test.

Another approach is to expand PageHeaderIsValid in a pre-checksums
version to do some extra checks (perhaps the same checks as pg_filedump
does). It might be able to at least detect simple cases, like all zeros
between pd_upper and pd_special (when pd_upper < pd_special). Still not
easy, but maybe more practical than saving the directory like that. We
may even want a GUC for that to aid future debugging (obviously it would
have a performance impact).
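
Something along these lines, say (a hypothetical helper, not the real
PageHeaderIsValid; the field offsets follow the 9.3 page header, with
pd_upper at byte 14 and pd_special at byte 16):

#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* On a sane page the bytes between pd_upper and pd_special hold tuple
 * data, so finding them all zero hints at a torn or lost write. */
static bool
page_tuple_area_is_zeroed(const unsigned char *page)
{
    uint16_t    pd_upper;
    uint16_t    pd_special;
    size_t      i;

    memcpy(&pd_upper, page + 14, sizeof(pd_upper));
    memcpy(&pd_special, page + 16, sizeof(pd_special));

    if (pd_upper >= pd_special)     /* no tuple data stored yet */
        return false;

    for (i = pd_upper; i < pd_special; i++)
    {
        if (page[i] != 0)
            return false;
    }
    return true;
}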

 The other 3 files in it constitute the harness.  It is quite a mess,
 with some hard-coded paths.  The just-posted fix for wal_keep_segments
 will also have to be applied.

I'm still trying to reproduce the problem myself. I keep trying simpler
versions of your test and I don't see the problem. It's a little awkward
to try to run the full version of your test (and I'm jumping between
code inspection and various other ideas).

 I'll work on it, but it will take a while to figure out exactly how to
 use pg_filedump to do that, and how to automate that process and
 incorporate it into the harness.

It might be more productive to try to expand PageHeaderIsValid as I
suggested above.

 In the meantime, I tested the checksum commit itself (96ef3b8ff1c) and
 the problem occurs there, so if the problem is not the checksums
 themselves (and I agree it probably isn't) it must have been
 introduced before that rather than after.

Thank you for all of your work on this. I have also attached another
test patch that disables most of the complex areas of the checksums
patch (VM bits, hint bits, etc.). If you apply different portions of the
patch (or the whole thing), it will help eliminate possibilities. In
particular, eliminating the calls to PageSetAllVisible+visibilitymap_set
could tell us a lot.

Also, the first data directory you posted had a problem with a page that
appeared to be a new page (otherwise I would have expected either old or
new data at the end). Do you think this might be a problem only for new
pages? Or do you think your test just makes it likely that many writes
will happen on new pages?

I did find one potential problem in the checksums code (aside from my
previous patch involving wal_level=archive): visibilitymap_set is called
sometimes before the heap page is marked dirty. I will examine a little
more closely, but I expect that to require another patch. Not sure if
that could explain this problem or not.

Regards,
Jeff Davis

*** a/src/backend/access/heap/pruneheap.c
--- b/src/backend/access/heap/pruneheap.c
***
*** 257,264  heap_page_prune(Relation relation, Buffer buffer, TransactionId OldestXmin,
  		 * point in repeating the prune/defrag process until something else
  		 * happens to the page.
  		 */
! 		if (((PageHeader) page)->pd_prune_xid != prstate.new_prune_xid ||
! 			PageIsFull(page))
  		{
  			((PageHeader) page)->pd_prune_xid = prstate.new_prune_xid;
  			PageClearFull(page);
--- 257,264 
  		 * point in repeating the prune/defrag process until something else
  		 * happens to the page.
  		 */
! 		if (false && (((PageHeader) page)->pd_prune_xid != prstate.new_prune_xid ||
! 	  PageIsFull(page)))
  		{
  			((PageHeader) page)->pd_prune_xid = prstate.new_prune_xid;
  			PageClearFull(page);
*** a/src/backend/commands/vacuumlazy.c
--- b/src/backend/commands/vacuumlazy.c
***
*** 668,674  lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
  			freespace = PageGetHeapFreeSpace(page);
  
  			/* empty pages are always all-visible */
! 			if (!PageIsAllVisible(page))
  			{
  PageSetAllVisible(page);
  MarkBufferDirty(buf);
--- 668,674 
  			freespace = PageGetHeapFreeSpace(page);
  
  			/* empty pages are always all-visible */
! 			if (false && !PageIsAllVisible(page))
  			{
  PageSetAllVisible(page);
  MarkBufferDirty(buf);
***
*** 901,907  lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
  		freespace = PageGetHeapFreeSpace(page);
  
  		/* mark page all-visible, if appropriate */
! 		if (all_visible)
  		{
  			if (!PageIsAllVisible(page))
  			{
--- 901,907 
  		freespace = PageGetHeapFreeSpace(page);
  
  		/* mark page all-visible, if appropriate */
! 		if (false && all_visible)
  		{
  			if (!PageIsAllVisible(page))
  			{
***
*** 1149,1155  lazy_vacuum_page(Relation onerel, 

Re: [HACKERS] Regex with 32k different chars causes a backend crash

2013-04-03 Thread Noah Misch
On Wed, Apr 03, 2013 at 08:09:15PM +0300, Heikki Linnakangas wrote:
 --- a/src/include/regex/regguts.h
 +++ b/src/include/regex/regguts.h
 @@ -148,6 +148,7 @@
  typedef short color; /* colors of characters */
  typedef int pcolor;  /* what color promotes to */
  
 +#define MAX_COLOR	32767	/* max value that fits in 'color' datatype */

This should use SHRT_MAX, no?  (Not that any supported platform differs here.)
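
That is, presumably something like (a sketch of the suggestion, not a
patch):

#include <limits.h>

typedef short color;            /* colors of characters */

#define MAX_COLOR   SHRT_MAX    /* max value that fits in 'color' datatype */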

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-04 02:28:32 +0200, Andres Freund wrote:
 On 2013-04-04 01:52:41 +0200, Andres Freund wrote:
  On 2013-04-03 15:57:49 -0700, Jeff Janes wrote:
   I've changed the subject from "regression test failed when enabling
   checksum" because I now know they are totally unrelated.
   
   My test case didn't need to depend on archiving being on, and so with a
   simple tweak I rendered the two issues orthogonal.
   
   
   On Wed, Apr 3, 2013 at 12:15 PM, Jeff Davis pg...@j-davis.com wrote:
   
On Mon, 2013-04-01 at 19:51 -0700, Jeff Janes wrote:
   
 What I would probably really want is the data as it existed after the
 crash but before recovery started, but since the postmaster
 immediately starts recovery after the crash, I don't know of a good
 way to capture this.
   
Can you just turn off the restart_after_crash GUC? I had a chance to
look at this, and seeing the block before and after recovery would be
nice. I didn't see a log file in the data directory, but it didn't go
through recovery, so I assume it already did that.
   
   
   You don't know that the cluster is in the bad state until after it goes
   through recovery because most crashes recover perfectly fine.  So it would
   have to make a side-copy of the cluster after the crash, then recover the
   original and see how things go, then either retain or delete the 
   side-copy.
Unfortunately my testing harness can't do this at the moment, because the
   perl script storing the consistency info needs to survive over the crash
   and recovery.   It might take me a while to figure out how to make it do
   this.
   
   I have the server log just go to stderr, where it gets mingled together
   with messages from my testing harness.  I had not uploaded that file.
   
   Here is a new upload. It contains both a data_dir tarball including xlog,
   and the mingled stderr (do_new.out)
   
   https://drive.google.com/folderview?id=0Bzqrh1SO9FcEQmVzSjlmdWZvUHc&usp=sharing
   
   The other 3 files in it constitute the harness.  It is quite a mess,
   with
   some hard-coded paths.  The just-posted fix for wal_keep_segments will 
   also
   have to be applied.
   
   
   
   
The block is corrupt as far as I can tell. The first third is written,
and the remainder is all zeros. The header looks like this:
   
   
   Yes, that part is by my design.  Why it didn't get fixed from a FPI is not
   by my design, of course.
  
  There are no FPIs (if you mean a full page image with that) afaics:
  
  Your logs tell us about recovery:
  27692  2013-04-03 10:09:15.621 PDT:LOG:  redo starts at 1/3190
  27692  2013-04-03 10:09:15.647 PDT:LOG:  incorrect resource manager data 
  checksum in record at 1/31169A68
  27692  2013-04-03 10:09:15.647 PDT:LOG:  redo done at 1/31169A38

 Looking at the page lsn's with dd I noticed something peculiar:
 
 page 0:
 01 00 00 00 18 c2 00 31 = 1/3100C218
 page 1:
 01 00 00 00 80 44 01 31 = 1/31014480
 page 10:
 01 00 00 00 60 ce 05 31 = 1/3105ce60
 page 43:
 01 00 00 00 58 7a 16 31 = 1/31167a58
 page 44:
 01 00 00 00 f0 99 16 31 = 1/311699f0
 page 45:
 00 00 00 00 00 00 00 00 = 0/0
 page 90:
 01 00 00 00 90 17 1d 32 = 1/321d1790
 page 91:
 01 00 00 00 38 ef 1b 32 = 1/321bef38
 
 So we have written out pages that are after pages without an LSN that
 have an LSN that's *beyond* the point XLOG has successfully been written
 to disk (1/31169A38)?

By now I am pretty sure the issue is that the end-of-wal is detected too
early. Above the end is detected at 1/31169A38, but the next WAL file
contains valid data:
pg_xlogdump /tmp/tmp/data2/pg_xlog/000100010032 -n 10
rmgr: Heap2   len (rec/tot): 26/58, tx:  0, lsn:
1/3230, prev 1/31D8, bkp: 0000, desc: clean: rel 1663/16384/835589; blk
122 remxid 26328926
rmgr: Heap    len (rec/tot): 51/83, tx:   26328964, lsn:
1/3270, prev 1/3230, bkp: 0000, desc: update: rel 1663/16384/835589;
tid 122/191 xmax 26328964 ; new tid 129/140 xmax 0
rmgr: Heap2   len (rec/tot): 30/62, tx:  0, lsn:
1/32C8, prev 1/3270, bkp: 0000, desc: clean: rel 1663/16384/835589; blk
63 remxid 26328803
rmgr: Btree   len (rec/tot): 34/66, tx:   26328964, lsn:
1/32000108, prev 1/32C8, bkp: 0000, desc: insert: rel 1663/16384/835590;
tid 25/14

That would explain exactly those symptoms, wouldn't it? What's causing
that - I am not sure yet.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 I'm still skeptical about the idea of a freelist. That just seems like a
 terrible point of contention. However perhaps that's because I'm picturing
 an LRU linked list. Perhaps the right thing is to maintain a pool of
 buffers in some less contention-prone data structure which lets each
 backend pick buffers out more or less independently of the others.

I think the original vision of the clock sweep algorithm included the
idea that different backends could be running the sweep over different
parts of the buffer ring concurrently.  If we could get rid of the need
to serialize that activity, it'd pretty much eliminate the bottleneck
I should think.  The problem is how to manage it to ensure that (a)
backends aren't actually contending on the same buffers as they do this,
and (b) there's a reasonably fair rate of usage_count decrements for
each buffer, rather than possibly everybody ganging up on the same area
sometimes.  Thoughts?

regards, tom lane




Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Tom Lane
and...@anarazel.de (Andres Freund) writes:
 Looking at the page lsn's with dd I noticed something peculiar:

 page 0:
 01 00 00 00 18 c2 00 31 = 1/3100C218
 page 1:
 01 00 00 00 80 44 01 31 = 1/31014480
 page 10:
 01 00 00 00 60 ce 05 31 = 1/3105ce60
 page 43:
 01 00 00 00 58 7a 16 31 = 1/31167a58
 page 44:
 01 00 00 00 f0 99 16 31 = 1/311699f0
 page 45:
 00 00 00 00 00 00 00 00 = 0/0
 page 90:
 01 00 00 00 90 17 1d 32 = 1/321d1790
 page 91:
 01 00 00 00 38 ef 1b 32 = 1/321bef38

 So we have written out pages that are after pages without an LSN that
 have an LSN that's *beyond* the point XLOG has successfully been written
 to disk (1/31169A38)?

If you're looking into the FPIs, those would contain the page's older
LSN, not the one assigned by the current WAL record.

regards, tom lane




Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-03 20:45:51 -0400, Tom Lane wrote:
 and...@anarazel.de (Andres Freund) writes:
  Looking at the page lsn's with dd I noticed something peculiar:
 
  page 0:
  01 00 00 00 18 c2 00 31 = 1/3100C218
  page 1:
  01 00 00 00 80 44 01 31 = 1/31014480
  page 10:
  01 00 00 00 60 ce 05 31 = 1/3105ce60
  page 43:
  01 00 00 00 58 7a 16 31 = 1/31167a58
  page 44:
  01 00 00 00 f0 99 16 31 = 1/311699f0
  page 45:
  00 00 00 00 00 00 00 00 = 0/0
  page 90:
  01 00 00 00 90 17 1d 32 = 1/321d1790
  page 91:
  01 00 00 00 38 ef 1b 32 = 1/321bef38
 
  So we have written out pages that are after pages without an LSN that
  have an LSN that's *beyond* the point XLOG has successfully been written
  to disk (1/31169A38)?
 
 If you're looking into the FPIs, those would contain the page's older
 LSN, not the one assigned by the current WAL record.

Nope, that's from the heap, and the LSNs are *newer* than what startup
recovered to. I am pretty sure by now we are missing out on valid WAL, I
am just not sure why.

Unfortunately we can't easily diagnose what happened at:
27692  2013-04-03 10:09:15.647 PDT:LOG:  incorrect resource manager data 
checksum in record at 1/31169A68
since the startup process wrote its end of recovery checkpoint there:
rmgr: XLOG    len (rec/tot): 72/   104, tx:  0, lsn:
1/31169A68, prev 1/31169A38, bkp: 0000, desc: checkpoint: redo 1/31169A68; tli
1; prev tli 1; fpw true; xid 0/26254999; oid 843781; multi 1; offset 0; oldest 
xid 1799 in DB 1; oldest multi 1 in DB 1; oldest running xid 0; shutdown

Starting from some blocks later in that WAL segment:
pg_xlogdump /tmp/tmp/data2/pg_xlog/000100010031 -s 1/3116c000 -n 10
first record is after 1/3116C000, at 1/3116D9D8, skipping over 6616 bytes
rmgr: Heap    len (rec/tot): 51/83, tx:   26254999, lsn:
1/3116D9D8, prev 1/3116BA20, bkp: 0000, desc: update: rel 1663/16384/835589;
tid 38/148 xmax 26254999 ; new tid 44/57 xmax 0
rmgr: Btree   len (rec/tot): 34/66, tx:   26254999, lsn:
1/3116DA30, prev 1/3116D9D8, bkp: 0000, desc: insert: rel 1663/16384/835590;
tid 25/319
rmgr: Heap    len (rec/tot): 51/83, tx:   26255000, lsn:
1/3116DA78, prev 1/3116DA30, bkp: 0000, desc: update: rel 1663/16384/835589;
tid 19/214 xmax 26255000 ; new tid 44/58 xmax 0

the records continue again.

Greetings,


Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services




[HACKERS] Minor erratum for 9.2.4 release notes

2013-04-03 Thread Ian Lawrence Barwick
I guess the 9.2.4 release notes haven't been finalized yet; apologies
if this is already addressed, but the following sentence:

   para
Also, if you are upgrading from a version earlier than 9.2.2,
see the release notes for 9.2.2.
   /para

should read:

   para
Also, if you are upgrading from a version earlier than 9.2.3,
see the release notes for 9.2.3.
   /para


Regards


Ian Barwick




Re: [HACKERS] Interesting post-mortem on a near disaster with git

2013-04-03 Thread Jim Nasby

On 3/26/13 6:42 AM, Cédric Villemain wrote:

Le lundi 25 mars 2013 19:35:12, Daniel Farina a écrit :
  On Mon, Mar 25, 2013 at 11:07 AM, Stefan Kaltenbrunner
  ste...@kaltenbrunner.cc wrote:
   Back when we used CVS for quite a few years I kept 7 day rolling
   snapshots of the CVS repo, against just such a difficulty as this. But
   we seem to be much better organized with infrastructure these days so I
   haven't done that for a long time.

   well there is always room for improvement (and for learning from others)
   - but I agree that this proposal seems way overkill. If people think we
   should keep online delayed mirrors we certainly have the resources to
   do that on our own if we want though...

  What about rdiff-backup? I've set it up for personal use years ago
  (via the handy open source bash script backupninja) and it
  has a pretty nice no-frills point-in-time, self-expiring, file-based
  automatic backup program that works well with file synchronization
  like rsync (I rdiff-backup to one disk and rsync the entire
  rdiff-backup output to another disk). I've enjoyed using it quite a
  bit during my own personal-computer emergencies and thought the
  maintenance required from me has been zero, and I have used it from
  time to time to restore, proving it even works. Hardlinks can be used
  to tag versions of a file-directory tree recursively relatively
  compactly.

  It won't be as compact as a git-aware solution (since git tends to
  rewrite entire files, which will confuse file-based incremental
  differential backup), but the amount of data we are talking about is
  pretty small, and as far as a lowest-common-denominator tradeoff for
  use in emergencies, I have to give it a lot of praise. The main
  advantage it has here is it implements point-in-time recovery
  operations that are easy to use and actually seem to work. That said,
  I've mostly done targeted recoveries rather than trying to recover
  entire trees.

 I have the same set up, and same feedback.


I had the same setup, but got tired of how rdiff-backup behaved when a backup 
was interrupted (very lengthy cleanup process). Since then I've switched to an 
rsync setup that does essentially the same thing as rdiff-backup (uses 
hardlinks between multiple copies).

The only downside I'm aware of is that my rsync backups aren't guaranteed to be 
consistent (for however consistent a backup of an active FS would be...).
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] Minor erratum for 9.2.4 release notes

2013-04-03 Thread Tom Lane
Ian Lawrence Barwick barw...@gmail.com writes:
 I guess the 9.2.4 release notes haven't been finalized yet; apologies
 if this is already addressed, but the following sentence:

para
 Also, if you are upgrading from a version earlier than 9.2.2,
 see the release notes for 9.2.2.
/para

 should read:

para
 Also, if you are upgrading from a version earlier than 9.2.3,
 see the release notes for 9.2.3.
/para

Um, no, there's no special instructions attached to 9.2.3.  The point of
that para is to thread back to the last minor release that had something
extra for you to do.

regards, tom lane




Re: [HACKERS] Minor erratum for 9.2.4 release notes

2013-04-03 Thread Ian Lawrence Barwick
2013/4/4 Tom Lane t...@sss.pgh.pa.us:
 Ian Lawrence Barwick barw...@gmail.com writes:
 I guess the 9.2.4 release notes haven't been finalized yet; apologies
 if this is already addressed, but the following sentence:

para
 Also, if you are upgrading from a version earlier than 9.2.2,
 see the release notes for 9.2.2.
/para

 should read:

para
 Also, if you are upgrading from a version earlier than 9.2.3,
 see the release notes for 9.2.3.
/para

 Um, no, there's no special instructions attached to 9.2.3.  The point of
 that para is to thread back to the last minor release that had something
 extra for you to do.

Got it, sorry for the noise

Regards

Ian Barwick




Re: [PATCH] Exorcise zero-dimensional arrays (Was: Re: [HACKERS] Should array_length() Return NULL)

2013-04-03 Thread Brendan Jurd
On 4 April 2013 01:10, Tom Lane t...@sss.pgh.pa.us wrote:
 I think though that the upthread argument that we'd have multiple
 interpretations of the same thing is bogus.  To me, the core idea that's
 being suggested here is that '{}' should mean a zero-length 1-D array,
 not a zero-D array as formerly.  We would still need a way to represent
 zero-D arrays, if only because they'd still exist on-disk in existing
 databases (assuming we're not willing to break pg_upgrade for this).

Tom,

My thought was that on-disk zero-D arrays should be converted into
empty 1-D arrays (with default lower bounds of course) when they are
read by array_recv.  Any SQL operation on your zero-D arrays would
therefore resolve as though they were 1-D.  A pg_dump/restore would
result in the arrays being 1-D on the restore side.  If pg_upgrade
conserves the zero-D array in binary form, that's okay since the
receiving end will just treat it as 1-D out of array_recv anyway.

My intention was that the zero-D array could continue to live
indefinitely in binary form, but would never be observable as such by
any application code.

Cheers,
BJ




Re: [HACKERS] Interesting post-mortem on a near disaster with git

2013-04-03 Thread Daniel Farina
On Wed, Apr 3, 2013 at 6:18 PM, Jim Nasby j...@nasby.net wrote:
   What about rdiff-backup? I've set it up for personal use years ago
   (via the handy open source bash script backupninja) and it
   has a pretty nice no-frills point-in-time, self-expiring, file-based
   automatic backup program that works well with file synchronization
   like rsync (I rdiff-backup to one disk and rsync the entire
   rdiff-backup output to another disk). I've enjoyed using it quite a
   bit during my own personal-computer emergencies and thought the
   maintenance required from me has been zero, and I have used it from
   time to time to restore, proving it even works. Hardlinks can be used
   to tag versions of a file-directory tree recursively relatively
   compactly.

   It won't be as compact as a git-aware solution (since git tends to
   rewrite entire files, which will confuse file-based incremental
   differential backup), but the amount of data we are talking about is
   pretty small, and as far as a lowest-common-denominator tradeoff for
   use in emergencies, I have to give it a lot of praise. The main
   advantage it has here is it implements point-in-time recovery
   operations that are easy to use and actually seem to work. That said,
   I've mostly done targeted recoveries rather than trying to recover
   entire trees.

  I have the same set up, and same feedback.


 I had the same setup, but got tired of how rdiff-backup behaved when a
 backup was interrupted (very lengthy cleanup process). Since then I've
 switched to an rsync setup that does essentially the same thing as
 rdiff-backup (uses hardlinks between multiple copies).

 The only downside I'm aware of is that my rsync backups aren't guaranteed to
 be consistent (for however consistent a backup of an active FS would
 be...).

I forgot to add one more thing to my first mail, although it's very
important to my feeble recommendation: the problem is that blind
synchronization is a great way to propagate destruction.

rdiff-backup (but perhaps others, too) has a file/directory structure
that is, as far as I know, additive, and the pruning can be done
independently at different replicas that can have different
retention...and if done just right (I'm not sure about the case of
concurrent backups being taken) one can write a re-check that no files
are to be modified or deleted by the synchronization as a safeguard.

-- 
fdr




Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Ants Aasma
On Thu, Apr 4, 2013 at 3:41 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I think the original vision of the clock sweep algorithm included the
 idea that different backends could be running the sweep over different
 parts of the buffer ring concurrently.  If we could get rid of the need
 to serialize that activity, it'd pretty much eliminate the bottleneck
 I should think.  The problem is how to manage it to ensure that (a)
 backends aren't actually contending on the same buffers as they do this,
 and (b) there's a reasonably fair rate of usage_count decrements for
 each buffer, rather than possibly everybody ganging up on the same area
 sometimes.  Thoughts?

Fairness and avoiding each other imply that some coordination is
required. One wild idea I had is to partition buffers into N partitions
and have one clock sweep each, protected by an lwlock.

To reduce contention, the clocksweep runners use something like
sched_getcpu() to determine which partition to use to find their
buffer. Using the CPU to determine the partition makes it necessary
for the process to be scheduled out in the critical section for some
other backend to contend on the lock. And if some backend does contend
on it, it is likely to reside on the same CPU and by sleeping will
make room for the lockholder to run.

To ensure fairness for buffers, every time one of the clocksweeps
wraps around, a global offset counter is incremented. This re-assigns
all CPUs/backends to the next partition, sort of like the mad hatter's
tea party.
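
In code form the idea might look roughly like this (all names are
hypothetical, and pthread_mutex_t merely stands in for an lwlock):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

#define NUM_SWEEP_PARTITIONS 16

/* Each partition owns a slice of the buffer pool and its own clock
 * hand, protected by its own lock. */
typedef struct SweepPartition
{
    pthread_mutex_t lock;       /* protects nextVictim */
    int         firstBuffer;    /* first buffer id of this slice */
    int         numBuffers;     /* slice size */
    int         nextVictim;     /* clock hand, 0 .. numBuffers - 1 */
} SweepPartition;

static SweepPartition partitions[NUM_SWEEP_PARTITIONS];
static int  sweepOffset;        /* bumped whenever any hand wraps */

/*
 * Pick the partition this backend sweeps: keyed on the current CPU, so
 * contention implies the lock holder was scheduled out, and shifted by
 * sweepOffset so assignments rotate and every buffer sees a fair rate
 * of usage_count decrements.
 */
static SweepPartition *
my_sweep_partition(void)
{
    int         cpu = sched_getcpu();   /* Linux-specific */

    if (cpu < 0)                /* sched_getcpu can fail */
        cpu = 0;
    return &partitions[(cpu + sweepOffset) % NUM_SWEEP_PARTITIONS];
}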

The scenario that I'm most worried about here is what happens when a
process holding the clocksweep lock is migrated to another CPU and
then scheduled out. The processes on the original CPU will sooner or
later block behind the lock while the processes on the CPU where the
lock holder now waits keep hogging the CPU. This will create an
imbalance that the scheduler might try to correct, possibly creating a
nasty feedback loop. It could be that in practice the scenario works
out to be too far fetched to matter, but who knows.

I don't have a slightest idea yet how the background writer would
function in this environment. But if redesigning the bgwriter
mechanism was on the table I thought I would throw the idea out here.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de




Re: [HACKERS] corrupt pages detected by enabling checksums

2013-04-03 Thread Andres Freund
On 2013-04-04 02:58:43 +0200, Andres Freund wrote:
 On 2013-04-03 20:45:51 -0400, Tom Lane wrote:
  and...@anarazel.de (Andres Freund) writes:
   Looking at the page lsn's with dd I noticed something peculiar:
 
   page 0:
   01 00 00 00 18 c2 00 31 = 1/3100C218
   page 1:
   01 00 00 00 80 44 01 31 = 1/31014480
   page 10:
   01 00 00 00 60 ce 05 31 = 1/3105ce60
   page 43:
   01 00 00 00 58 7a 16 31 = 1/31167a58
   page 44:
   01 00 00 00 f0 99 16 31 = 1/311699f0
   page 45:
   00 00 00 00 00 00 00 00 = 0/0
   page 90:
   01 00 00 00 90 17 1d 32 = 1/321d1790
   page 91:
   01 00 00 00 38 ef 1b 32 = 1/321bef38
 
   So we have written out pages that are after pages without an LSN that
   have an LSN that's *beyond* the point XLOG has successfully been written
   to disk (1/31169A38)?
 
  If you're looking into the FPIs, those would contain the page's older
  LSN, not the one assigned by the current WAL record.

 Nope, that's from the heap, and the LSNs are *newer* than what startup
 recovered to. I am pretty sure by now we are missing out on valid WAL, I
 am just not sure why.

Ok, I think I see the bug. And I think it's been introduced in the
checksums patch.

Remember, as explained in other mails, I am pretty sure the end of wal
was erroneously detected, resulting in the following error:
27692  2013-04-03 10:09:15.647 PDT:LOG:  incorrect resource manager data 
checksum in record at 1/31169A68

Note that without a *hardware* crash, end of WAL is normally found first
by the checks for wrong LSNs or wrong record lengths. Not here.

My theory is that a page that was full-page-written changed while it was
being inserted into the WAL, which caused the error. A possible scenario for
that would be e.g. two concurrent calls to MarkBufferDirtyHint().
Note:
 * 2. The caller might have only share-lock instead of exclusive-lock on the
 *buffer's content lock.

	if ((bufHdr->flags & (BM_DIRTY | BM_JUST_DIRTIED)) !=
		(BM_DIRTY | BM_JUST_DIRTIED))
	{
		if (DataChecksumsEnabled() && (bufHdr->flags & BM_PERMANENT))
		{
			MyPgXact->delayChkpt = delayChkpt = true;
			lsn = XLogSaveBufferForHint(buffer); /* PART 1 */
		}
	}

	LockBufHdr(bufHdr);
	Assert(bufHdr->refcount > 0);

	if (!(bufHdr->flags & BM_DIRTY))
	{
		dirtied = true;			/* Means will be dirtied by this action */

		/*
		 * Set the page LSN if we wrote a backup block. We aren't
		 * supposed to set this when only holding a share lock but
		 * as long as we serialise it somehow we're OK. We choose to
		 * set LSN while holding the buffer header lock, which causes
		 * any reader of an LSN who holds only a share lock to also
		 * obtain a buffer header lock before using PageGetLSN().
		 * Fortunately, that's not too many places.
		 *
		 * If checksums are enabled, you might think we should reset the
		 * checksum here. That will happen when the page is written
		 * sometime later in this checkpoint cycle.
		 */
		if (!XLogRecPtrIsInvalid(lsn))
			PageSetLSN(page, lsn); /* PART 2 */
	}

So XLogSaveBufferForHint(buffer) is called for all buffers which are not
yet dirty. Without any locks. It just does:

XLogSaveBufferForHint(Buffer buffer)
{
	rdata[0].data = (char *) (&watermark);
	rdata[0].len = sizeof(int);
	rdata[0].next = &(rdata[1]);

	rdata[1].data = NULL;
	rdata[1].len = 0;
	rdata[1].buffer = buffer;
	rdata[1].buffer_std = true;
	rdata[1].next = NULL;

	return XLogInsert(RM_XLOG_ID, XLOG_HINT, rdata);
}

I think what happens is that one backend grabs WALInsertLock first,
it computes a full page write for the buffer, including a CRC. And
returns a valid LSN. Then it continues towards the PageSetLSN() after
LockBufHdr().
All the while the second backend notices that WALInsertLock is free again
and continues towards the WALInsertLock protected parts of XLogInsert().

Only that it already has done that part:
	/*
	 * Calculate CRC of the data, including all the backup blocks
	 *
	 * Note that the record header isn't added into the CRC initially since we
	 * don't know the prev-link yet.  Thus, the CRC will represent the CRC of
	 * the whole record in the order: rdata, then backup blocks, then record
	 * header.
	 */
	INIT_CRC32(rdata_crc);
	for (rdt = rdata; rdt != NULL; rdt = rdt->next)
		COMP_CRC32(rdata_crc, rdt->data, rdt->len);

It goes on to write all the data to the 

Re: [HACKERS] Multi-pass planner

2013-04-03 Thread Greg Stark
On Fri, Aug 21, 2009 at 6:54 PM, decibel deci...@decibel.org wrote:

 Would it? Risk seems like it would just be something along the lines of
 the high-end of our estimate. I don't think confidence should be that hard
 either. IE: hard-coded guesses have a low confidence. Something pulled
 right out of most_common_vals has a high confidence. Something estimated
 via a bucket is in-between, and perhaps adjusted by the number of tuples.


I used to advocate a similar idea. But when questioned on list I tried to
work out the details and ran into some problem coming up with a concrete
plan.

How do you compare a plan that you think has a 99% chance of running in 1ms
but a 1% chance of taking 1s against a plan that has a 90% chance of 1ms
and a 10% chance of taking 100ms? Which one is actually riskier? They might
even both have the same 95% percentile run-time.
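
(For what it's worth, the expected costs come out almost identical:
0.99 * 1ms + 0.01 * 1000ms = 10.99ms for the first plan versus
0.90 * 1ms + 0.10 * 100ms = 10.9ms for the second, which is exactly why a
single risk number settles so little.)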

And additionally there are different types of unknowns. Do you want to
treat plans where we have a statistical sample that gives us a
probabilistic answer the same as plans where we think our model just has a
10% chance of being wrong? The model is going to either be consistently
right or consistently wrong for a given query but the sample will vary from
run to run. (Or vice versa depending on the situation).


-- 
greg


Re: [HACKERS] Page replacement algorithm in buffer cache

2013-04-03 Thread Greg Smith

On 4/2/13 11:54 AM, Robert Haas wrote:

But, having said that, I still think the best idea is what Andres
proposed, which pretty much matches my own thoughts: the bgwriter
needs to populate the free list, so that buffer allocations don't have
to wait for linear scans of the buffer array.


I was hoping this one would make it to a full six years of being on the 
TODO list before it came up again, missed it by a few weeks.  The 
funniest part is that Amit even submitted a patch on this theme a few 
months ago without much feedback: 
http://www.postgresql.org/message-id/6C0B27F7206C9E4CA54AE035729E9C382852FF97@szxeml509-mbs 
 That stalled where a few things have, on a) needing more regression 
test workloads, and b) wondering just what the deal with large 
shared_buffers setting degrading performance was.


I saw refactoring in this area as waiting behind it being easier to 
experiment with adding new processes, but that barrier has fallen now. 
Maybe it needs a new freelist process, maybe it doesn't, today the code 
needed to try both is relatively cheap.


The other thing that always seemed to stop me was never having a typical 
Linux system big enough to hit some of these problems available all the 
time.  What I did this week on that front was just go buy a 24 core 
server with 64GB of RAM that lives in my house.  I just need to keep it 
two floors away if I want to sleep at night.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com



