date:20141021

Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup

2014-10-21 Thread Heikki Linnakangas

On 10/20/2014 11:02 PM, jes...@krogh.cc wrote:

>>I do suspect the majority is from 30 concurrent processes updating an
>>506GB GIN index, but it would be nice to confirm that. There is also a
>>message-queue in the DB with a fairly high turnaround.

>
>A 506GB GIN index? Uh, interesting :). What's it used for? Trigrams?

It is for full-text-search, but it is being updated entirely regulary,
~100M records. A dump/restore cycle typically reduces the size to 30-40%
of current size.

Try 9.4 beta. The on-disk format of GIN indexes was rewritten in 9.4, 
making them a lot smaller. That might help with WAL volume too. Or not, 
but I'd love to hear what the impact is, in a real-life database :-).

- Heikki

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] narwhal and PGDLLIMPORT

2014-10-21 Thread Noah Misch

On Mon, Oct 20, 2014 at 01:03:31AM -0400, Noah Misch wrote:
> I reproduced narwhal's problem using its toolchain on another 32-bit Windows
> Server 2003 system.  The crash happens at the SHGetFolderPath() call in
> pqGetHomeDirectory().  A program can acquire that function via shfolder.dll or
> via shell32.dll; we've used the former method since commit 889f038, for better
> compatibility[1] with Windows NT 4.0.  On this system, shfolder.dll's version
> loads and unloads shell32.dll.  In PostgreSQL built using this older compiler,
> shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading shell32!
> That started with commit 846e91e.  I don't expect to understand the mechanism
> behind it, but I recommend we switch back to linking libpq with shell32.dll.
> The MSVC build already does that in all supported branches, and it feels right
> for the MinGW build to follow suit in 9.4+.  Windows versions that lack the
> symbol in shell32.dll are now ancient history.

That allowed narwhal to proceed a bit further than before, but it crashes
later in the dblink test suite.  Will test again...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup

2014-10-21 Thread Craig Ringer

On 10/21/2014 03:03 AM, jes...@krogh.cc wrote:

> That being said, along comes the backup, scheduled ones a day and tries to
> read off these wal-files, which to the backup looks like "an awfull lot of
> small files", our backup utillized a single thread to read of those files
> and levels of at reading through 30-40MB/s from a 21 drive Raid50 of
> rotating drives, which is quite bad. That causes a daily incremental run
> to take in the order of 24h. Differential picking up larger deltas and
> full are even worse.

What's the backup system?

151952 files should be a trivial matter for any backup system. I'm very
surprised you're seeing those kind of run times for 2TB of WAL, and
think it's worth investigating just why the backup system is behaving
this way.

What does 'filefrag' say about the WAL segments? Are they generally a
single extent each? If not, how many extents?

It'd be useful to know the kernel version, file system, RAID controller,
whether you use LVM, and other relevant details? What's your RAID
array's stripe size?

> A short test like:
> find . -type f -ctime -1 | tail -n 50 | xargs cat | pipebench > /dev/null
> confirms the backup speed to be roughly the same as seen by the backup
> software.
> Another test from the same volume doing:
> find . -type f -ctime -1 | tail -n 50 | xargs cat  > largefile
> And then wait for the fs to not cache the file any more and
> cat largefile | pipebench > /dev/null
> confirms that the disk-subsystem can do way (150-200MB/s) better on larger
> files.

OK, so a larger contiguously allocated file looks like it's probably
read faster. That doesn't mean there's any guarantee that big WAL
segment would be allocated contiguously if there are lots of other
writes interspersed, but the FS will try.

(What does 'filefrag' say about your 'largefile'?)

I'm wondering if you're having issues related to a RAID stripe size that
is close to, or bigger than, your WAL segment size. So each segment is
only being read from one disk or a couple of disks. If that's the case
you're probably not getting ideal write performance either.

That said, I don't see any particular reason why readahead wouldn't
result in you getting similar results from multiple smaller WAL segments
that're allocated contiguously, and they usually would be if they're
created one after the other. What are your readahead settings? (There
are often several at different levels; what exists depends on how your
storage is configured, use of LVM, use of SW RAID, etc).

In my opinion RAID 50, or RAID 5, are generally pretty poor options for
a database file system in performance terms anyway. Especially for
transaction logs. RAID 50 is also not wonderfully durable for arrays of
larger numbers of bigger disks given modern disks' sizes, even with the
low block error rates and relatively low disk failure rates. I
personally tend to consider two parity disks the minimum acceptable for
arrays of more than four or five disks. I'd certainly want continuous
archiving or streaming replication in place if I was running RAID 50 on
a big array.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Deferring some AtStart* allocations?

2014-10-21 Thread Andres Freund

Hi,

On 2014-10-09 15:01:19 -0400, Robert Haas wrote:
>  /*
> @@ -960,18 +966,38 @@ AtEOXact_Inval(bool isCommit)
...
> + /*
> +  * We create invalidation stack entries lazily, so the parent 
> might
> +  * not have one.  Instead of creating one, moving all the data 
> over,
> +  * and then freeing our own, we can just adjust the level of 
> our own
> +  * entry.
> +  */
> + if (myInfo->parent == NULL || myInfo->parent->my_level < 
> my_level - 1)
> + {
> + myInfo->my_level--;
> + return;
> + }
> +

I think this bit might not be correct. What if the subxact one level up
aborts? Then it'll miss dealing with these invalidation entries. Or am I
misunderstanding something?

I like the patch, except the above potential issue.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Deferring some AtStart* allocations?

2014-10-21 Thread Andres Freund

On 2014-10-08 13:52:14 -0400, Robert Haas wrote:
> On Sun, Jun 29, 2014 at 9:12 PM, Tom Lane  wrote:
> > Meh.  Even "SELECT 1" is going to be doing *far* more pallocs than that to
> > get through raw parsing, parse analysis, planning, and execution startup.
> > If you can find a few hundred pallocs we can avoid in trivial queries,
> > it would get interesting; but I'll be astonished if saving 4 is measurable.
> 
> I got nerd-sniped by this problem today, probably after staring at the
> profiling data very similar to what led Andres to ask the question in
> the first place.

I've loolked through this patch and I'm happy with it. One could argue
that the current afterTriggers == NULL checks should be replaced with a
Assert ensuring we're inside the xact. But I think you're right in
removing them, and I don't think we actually need the asserts.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Possible micro-optimization in CacheInvalidateHeapTuple

2014-10-21 Thread Jim Nasby


On 10/13/14, 8:28 PM, Tom Lane wrote:

Jim Nasby  writes:

CacheInvalidateHeapTuple currently does the following tests first; would there 
be a performance improvement to testing the system relation case first? We're 
almost never in bootstrap mode, so that test is almost always a waste. Is there 
any reason not to switch the two?
/* Do nothing during bootstrap */
if (IsBootstrapProcessingMode())
return;



/*
 * We only need to worry about invalidation for tuples that are in 
system
 * relations; user-relation tuples are never in catcaches and can't 
affect
 * the relcache either.
 */
if (!IsSystemRelation(relation))
return;


You're assuming that IsSystemRelation() is safe to apply during bootstrap
mode.  Even if it is, I don't see the point of messing with this.
IsBootstrapProcessingMode() is a macro expanding to one comparison
instruction.


Comment patch to that effect attached.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/src/backend/utils/cache/inval.c b/src/backend/utils/cache/inval.c
index a7a768e..545ccc5 100644
--- a/src/backend/utils/cache/inval.c
+++ b/src/backend/utils/cache/inval.c
@@ -1055,7 +1055,11 @@ CacheInvalidateHeapTuple(Relation relation,
Oid databaseId;
Oid relationId;
 
-   /* Do nothing during bootstrap */
+   /*
+ * Do nothing during bootstrap. It may seem silly to check this first since
+ * it will almost always be false, but it's not safe to assume that later
+ * checks can be done safely while in bootstrap.
+ */
if (IsBootstrapProcessingMode())
return;
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: Log inability to lock pages during vacuum

2014-10-21 Thread Tom Lane

Jim Nasby  writes:
> - What happens if we run out of space to remember skipped blocks?

You forget some, and are no worse off than today.  (This might be an
event worthy of logging, if the array is large enough that we don't
expect it to happen often ...)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: Log inability to lock pages during vacuum

2014-10-21 Thread Jim Nasby


On 10/21/14, 5:39 PM, Alvaro Herrera wrote:

Jim Nasby wrote:


Currently, a non-freeze vacuum will punt on any page it can't get a
cleanup lock on, with no retry. Presumably this should be a rare
occurrence, but I think it's bad that we just assume that and won't
warn the user if something bad is going on.


I think if you really want to attack this problem, rather than just
being noisy about it, what you could do is to keep a record of which
page numbers you had to skip, and then once you're done with your first
scan you go back and retry the lock on the pages you skipped.


I'm OK with that if the community is; I was just trying for minimum 
invasiveness.

If I go this route, I'd like some input though...

- How to handle storing the blockIDs. Fixed size array or something fancier? 
What should we limit it to, especially since we're already allocating 
maintenance_work_mem for the tid array.

- What happens if we run out of space to remember skipped blocks? I could do 
something like what we do for running out of space in the dead_tuples array, 
but I'm worried that will add a serious amount of complexity, especially since 
re-processing these blocks could be what actually pushes us over the limit.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Spurious set in heap_prune_chain()

2014-10-21 Thread Jim Nasby


In heap_prune_chain():

tup.t_tableOid = RelationGetRelid(relation);

rootlp = PageGetItemId(dp, rootoffnum);

/*
 * If it's a heap-only tuple, then it is not the start of a HOT chain.
 */
if (ItemIdIsNormal(rootlp))
{
htup = (HeapTupleHeader) PageGetItem(dp, rootlp);

tup.t_data = htup;
tup.t_len = ItemIdGetLength(rootlp);
tup.t_tableOid = RelationGetRelid(relation);

AFAICT the second case of setting tup.t_tableOid is pointless. Attached patch 
removes it. Passes make check.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com
diff --git a/src/backend/access/heap/pruneheap.c 
b/src/backend/access/heap/pruneheap.c
index 06b5488..4c40f7e 100644
--- a/src/backend/access/heap/pruneheap.c
+++ b/src/backend/access/heap/pruneheap.c
@@ -371,7 +371,6 @@ heap_prune_chain(Relation relation, Buffer buffer, 
OffsetNumber rootoffnum,
 
tup.t_data = htup;
tup.t_len = ItemIdGetLength(rootlp);
-   tup.t_tableOid = RelationGetRelid(relation);
ItemPointerSet(&(tup.t_self), BufferGetBlockNumber(buffer), 
rootoffnum);
 
if (HeapTupleHeaderIsHeapOnly(htup))

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Proposal: Log inability to lock pages during vacuum

2014-10-21 Thread Alvaro Herrera

Jim Nasby wrote:

> Currently, a non-freeze vacuum will punt on any page it can't get a
> cleanup lock on, with no retry. Presumably this should be a rare
> occurrence, but I think it's bad that we just assume that and won't
> warn the user if something bad is going on.

I think if you really want to attack this problem, rather than just
being noisy about it, what you could do is to keep a record of which
page numbers you had to skip, and then once you're done with your first
scan you go back and retry the lock on the pages you skipped.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simplify calls of pg_class_aclcheck when multiple modes are used

2014-10-21 Thread Fabrízio de Royes Mello

Em terça-feira, 21 de outubro de 2014, Michael Paquier <
michael.paqu...@gmail.com> escreveu:

> On Wed, Oct 22, 2014 at 5:03 AM, Peter Eisentraut  > wrote:
>
>> While looking at this, I wrote a few tests cases for sequence
>> privileges, because that was not covered at all.  That patch is attached.
>>
> +1 for those tests.
>
>
+1

Fabrízio Mello


-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello

Re: [HACKERS] expected/sequence_1.out obsolete?

2014-10-21 Thread Tom Lane

Michael Paquier  writes:
> I don't think that this is a good idea, have a look at bb3f839 explaining
> that this alternate output may happen when a checkpoint kicks in. Simple
> patch is attached.

Pushed, thanks.  (The 9.4 branch was broken too, but differently :-()

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables

2014-10-21 Thread Jim Nasby

On 10/21/14, 4:36 PM, Jeff Janes wrote:

On Mon, Oct 20, 2014 at 5:46 PM, Andres Freund mailto:and...@2ndquadrant.com>> wrote:

On 2014-10-20 17:43:26 -0700, Josh Berkus wrote:
> On 10/20/2014 05:39 PM, Jim Nasby wrote:
> > Or maybe vacuum isn't the right way to handle some of these scenarios.
> > It's become the catch-all for all of this stuff, but maybe that doesn't
> > make sense anymore. Certainly when it comes to dealing with inserts
> > there's no reason we *have* to do anything other than set hint bits and
> > possibly freeze xmin.
>
> +1

A page read is a page read. What's the point of heaving another process
do it?

It is only a page read if you have to read the page.  It would seem optimal to 
have bgwriter adventitiously set hint bits and vm bits, because that is the 
last point at which the page can be changed without risking that it be written 
out twice. At that point, it has been given the maximum amount of time it can 
be given for the interested transactions to have committed and to have aged 
past the xmin horizon.  I seem to recall that the main problem with that, 
though, is that you must be attached to a database in order to determine 
visibility, and bgwriter is not attached to a database.

It's also a bit more complex than a simple question of "is the page still in shared 
buffers". Our *real* last chance is when the page is about to be evicted from the 
filesystem cache; after that reading it back it will be extremely expensive (relatively 
speaking).

I think it's worth considering this, because if you have any moderate length 
transactions on a busy database bgwriter won't be able to help much; you'll be 
burning through shared buffers too quickly.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simplify calls of pg_class_aclcheck when multiple modes are used

2014-10-21 Thread Michael Paquier

On Wed, Oct 22, 2014 at 5:03 AM, Peter Eisentraut  wrote:

> While looking at this, I wrote a few tests cases for sequence
> privileges, because that was not covered at all.  That patch is attached.
>
+1 for those tests.
-- 
Michael

Re: [HACKERS] expected/sequence_1.out obsolete?

2014-10-21 Thread Michael Paquier

On Wed, Oct 22, 2014 at 4:46 AM, Peter Eisentraut  wrote:

> expected/sequence_1.out hasn't been updated at least since
>
> commit d90ced8bb22194cbb45f58beb0961251103aeff5
> Date:   Thu Oct 3 16:17:18 2013 -0400
>
> and nobody appears to have complained.
>
> Can it be removed?
>
I don't think that this is a good idea, have a look at bb3f839 explaining
that this alternate output may happen when a checkpoint kicks in. Simple
patch is attached.
-- 
Michael
diff --git a/src/test/regress/expected/sequence_1.out b/src/test/regress/expected/sequence_1.out
index 124967e..e426f64 100644
--- a/src/test/regress/expected/sequence_1.out
+++ b/src/test/regress/expected/sequence_1.out
@@ -91,6 +91,8 @@ SELECT nextval('serialTest2_f6_seq');
 
 -- basic sequence operations using both text and oid references
 CREATE SEQUENCE sequence_test;
+CREATE SEQUENCE IF NOT EXISTS sequence_test;
+NOTICE:  relation "sequence_test" already exists, skipping
 SELECT nextval('sequence_test'::text);
  nextval 
 -
@@ -163,6 +165,9 @@ SELECT nextval('sequence_test'::text);
   99
 (1 row)
 
+DISCARD SEQUENCES;
+SELECT currval('sequence_test'::regclass);
+ERROR:  currval of sequence "sequence_test" is not yet defined in this session
 DROP SEQUENCE sequence_test;
 -- renaming sequences
 CREATE SEQUENCE foo_seq;
@@ -341,6 +346,9 @@ SELECT lastval();
   99
 (1 row)
 
+DISCARD SEQUENCES;
+SELECT lastval();
+ERROR:  lastval is not yet defined in this session
 CREATE SEQUENCE seq2;
 SELECT nextval('seq2');
  nextval 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] expected/sequence_1.out obsolete?

2014-10-21 Thread Tom Lane

Peter Eisentraut  writes:
> expected/sequence_1.out hasn't been updated at least since
> commit d90ced8bb22194cbb45f58beb0961251103aeff5
> Date:   Thu Oct 3 16:17:18 2013 -0400
> and nobody appears to have complained.

> Can it be removed?

No, it needs to be fixed.  The alternate output from "select * from
foo_seq" will appear occasionally (from memory, if a checkpoint starts
or perhaps completes during the test).

We might have noticed this already if the buildfarm weren't so noisy
lately :-(

An alternative answer would be to remove log_cnt from the set of
columns displayed, but I think that would be taking away part of the
point of that particular test.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Autovacuum fails to keep visibility map up-to-date in mostly-insert-only-tables

2014-10-21 Thread Jeff Janes

On Mon, Oct 20, 2014 at 5:46 PM, Andres Freund 
wrote:

> On 2014-10-20 17:43:26 -0700, Josh Berkus wrote:
> > On 10/20/2014 05:39 PM, Jim Nasby wrote:
> > > Or maybe vacuum isn't the right way to handle some of these scenarios.
> > > It's become the catch-all for all of this stuff, but maybe that doesn't
> > > make sense anymore. Certainly when it comes to dealing with inserts
> > > there's no reason we *have* to do anything other than set hint bits and
> > > possibly freeze xmin.
> >
> > +1
>
> A page read is a page read. What's the point of heaving another process
> do it?

It is only a page read if you have to read the page.  It would seem optimal
to have bgwriter adventitiously set hint bits and vm bits, because that is
the last point at which the page can be changed without risking that it be
written out twice. At that point, it has been given the maximum amount of
time it can be given for the interested transactions to have committed and
to have aged past the xmin horizon.  I seem to recall that the main problem
with that, though, is that you must be attached to a database in order to
determine visibility, and bgwriter is not attached to a database.

Cheers,

Jeff

Re: [HACKERS] wal-size limited to 16MB - Performance issue for subsequent backup

2014-10-21 Thread Jeff Janes

On Mon, Oct 20, 2014 at 12:03 PM,  wrote:

> Hi.
>
> One of our "production issues" is that the system generates lots of
> wal-files, lots is like 151952 files over the last 24h, which is about
> 2.4TB worth of WAL files. I wouldn't say that isn't an issue by itself,
> but the system does indeed work fine. We do subsequently gzip the files to
> limit actual disk-usage, this makes the files roughly 30-50% in size.
>
> That being said, along comes the backup, scheduled ones a day and tries to
> read off these wal-files, which to the backup looks like "an awfull lot of
> small files", our backup utillized a single thread to read of those files
> and levels of at reading through 30-40MB/s from a 21 drive Raid50 of
> rotating drives, which is quite bad. That causes a daily incremental run
> to take in the order of 24h. Differential picking up larger deltas and
> full are even worse.
>

Why not have archive_command (which gets the files while they are still
cached) put the files directly into their final destination on the backup
server?

> Suggestions are welcome. An archive-command/restore command that could
> combine/split wal-segments might be the easiest workaround, but how about
> crash-safeness?
>

I think you would just have to combine them by looking at the file name and
seeking to a specific spot in the large file (rather than just appending to
it) so that if the archive_command fails and gets rerun, it will still end
up in the correct place.  I don't see what other crash-safeness issues you
would have, other than the ones you already have.  You would want to do the
compression afterward combining, not before, so that all segments are of
predictable size.

It should be pretty easy as long as want your combined files to consist of
either 16 or 256 (or 255 in older versions) WAL files.

You would have to pass through directly any files not matching the filename
pattern of ordinary WAL files.

Cheers,

Jeff

Re: [HACKERS] Allow format 0000-0000-0000 in postgresql MAC parser

2014-10-21 Thread Peter Eisentraut

On 10/17/14 6:37 PM, Ali Akbar wrote:
> On a side note, i'm noticing from
> http://en.wikipedia.org/wiki/MAC_address, that there is three numbering
> namespace for MAC: MAC-48, EUI-48 and EUI-64. The last one is 64 bits
> long (8 bytes). Currently PostgreSQL's macaddr is only 6 bytes long.
> Should we change it to 8 bytes (not in this patch, of course)?

It looks like nothing current is actually using EUI-64, so probably not,
unless someone comes with an actual use case.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allow format 0000-0000-0000 in postgresql MAC parser

2014-10-21 Thread Peter Eisentraut

On 10/1/14 8:34 AM, Herwin Weststrate wrote:
> It has been registered now
> (https://commitfest.postgresql.org/action/patch_view?id=1585). I've got
> an updated version of the patch with the documentation fix.

committed


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Simplify calls of pg_class_aclcheck when multiple modes are used

2014-10-21 Thread Peter Eisentraut

On 8/27/14 8:02 AM, Michael Paquier wrote:
> In a couple of code paths we do the following to check permissions on an
> object:
> if (pg_class_aclcheck(relid, userid, ACL_USAGE) != ACLCHECK_OK &&
> pg_class_aclcheck(relid, userid, ACL_UPDATE) != ACLCHECK_OK)
> ereport(ERROR, blah);
> 
> Wouldn't it be better to simplify that with a single call of
> pg_class_aclcheck, gathering together the modes that need to be checked?

Yes, it's probably just an oversight.

While looking at this, I wrote a few tests cases for sequence
privileges, because that was not covered at all.  That patch is attached.

That led me to discover this issue:
http://www.postgresql.org/message-id/5446b819.1020...@gmx.net

I'll wait for the resolution of that and then commit this.

diff --git a/src/test/regress/expected/sequence.out b/src/test/regress/expected/sequence.out
index a27b5fd..8783ca6 100644
--- a/src/test/regress/expected/sequence.out
+++ b/src/test/regress/expected/sequence.out
@@ -367,6 +367,41 @@ DROP SEQUENCE seq2;
 SELECT lastval();
 ERROR:  lastval is not yet defined in this session
 CREATE USER seq_user;
+-- privileges tests
+-- nextval
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT SELECT ON seq3 TO seq_user;
+SELECT nextval('seq3');
+ERROR:  permission denied for sequence seq3
+ROLLBACK;
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT UPDATE ON seq3 TO seq_user;
+SELECT nextval('seq3');
+ nextval 
+-
+   1
+(1 row)
+
+ROLLBACK;
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT USAGE ON seq3 TO seq_user;
+SELECT nextval('seq3');
+ nextval 
+-
+   1
+(1 row)
+
+ROLLBACK;
+-- currval
 BEGIN;
 SET LOCAL SESSION AUTHORIZATION seq_user;
 CREATE SEQUENCE seq3;
@@ -377,9 +412,97 @@ SELECT nextval('seq3');
 (1 row)
 
 REVOKE ALL ON seq3 FROM seq_user;
+GRANT SELECT ON seq3 TO seq_user;
+SELECT currval('seq3');
+ currval 
+-
+   1
+(1 row)
+
+ROLLBACK;
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+SELECT nextval('seq3');
+ nextval 
+-
+   1
+(1 row)
+
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT UPDATE ON seq3 TO seq_user;
+SELECT currval('seq3');
+ERROR:  permission denied for sequence seq3
+ROLLBACK;
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+SELECT nextval('seq3');
+ nextval 
+-
+   1
+(1 row)
+
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT USAGE ON seq3 TO seq_user;
+SELECT currval('seq3');
+ currval 
+-
+   1
+(1 row)
+
+ROLLBACK;
+-- lastval
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+SELECT nextval('seq3');
+ nextval 
+-
+   1
+(1 row)
+
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT SELECT ON seq3 TO seq_user;
+SELECT lastval();
+ lastval 
+-
+   1
+(1 row)
+
+ROLLBACK;
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+SELECT nextval('seq3');
+ nextval 
+-
+   1
+(1 row)
+
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT UPDATE ON seq3 TO seq_user;
 SELECT lastval();
 ERROR:  permission denied for sequence seq3
 ROLLBACK;
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+SELECT nextval('seq3');
+ nextval 
+-
+   1
+(1 row)
+
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT USAGE ON seq3 TO seq_user;
+SELECT lastval();
+ lastval 
+-
+   1
+(1 row)
+
+ROLLBACK;
 -- Sequences should get wiped out as well:
 DROP TABLE serialTest, serialTest2;
 -- Make sure sequences are gone:
diff --git a/src/test/regress/sql/sequence.sql b/src/test/regress/sql/sequence.sql
index 8d3b700..0dd653d 100644
--- a/src/test/regress/sql/sequence.sql
+++ b/src/test/regress/sql/sequence.sql
@@ -168,11 +168,86 @@ CREATE SEQUENCE seq2;
 
 CREATE USER seq_user;
 
+-- privileges tests
+
+-- nextval
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT SELECT ON seq3 TO seq_user;
+SELECT nextval('seq3');
+ROLLBACK;
+
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT UPDATE ON seq3 TO seq_user;
+SELECT nextval('seq3');
+ROLLBACK;
+
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT USAGE ON seq3 TO seq_user;
+SELECT nextval('seq3');
+ROLLBACK;
+
+-- currval
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+SELECT nextval('seq3');
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT SELECT ON seq3 TO seq_user;
+SELECT currval('seq3');
+ROLLBACK;
+
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+CREATE SEQUENCE seq3;
+SELECT nextval('seq3');
+REVOKE ALL ON seq3 FROM seq_user;
+GRANT UPDATE ON seq3 TO seq_user;
+SELECT currval('seq3');
+ROLLBACK;
+
+BEGIN;
+SET LOCAL SESSION AUTHORIZATION seq_user;
+C

Re: [HACKERS] Directory/File Access Permissions for COPY and Generic File Access Functions

2014-10-21 Thread Brightwell, Adam

Peter,

You patch is missing the files src/include/catalog/pg_diralias.h,
> src/include/commands/diralias.h, and src/backend/commands/diralias.c.
>
> (Hint: git add -N)
>

Yikes, sorry about that, not sure how that happened.  Attached is an
updated patch.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com
diff --git a/src/backend/catalog/Makefile b/src/backend/catalog/Makefile
index b257b02..8cdc5cb 100644
--- a/src/backend/catalog/Makefile
+++ b/src/backend/catalog/Makefile
@@ -41,7 +41,7 @@ POSTGRES_BKI_SRCS = $(addprefix $(top_srcdir)/src/include/catalog/,\
 	pg_foreign_data_wrapper.h pg_foreign_server.h pg_user_mapping.h \
 	pg_foreign_table.h pg_rowsecurity.h \
 	pg_default_acl.h pg_seclabel.h pg_shseclabel.h pg_collation.h pg_range.h \
-	toasting.h indexing.h \
+	pg_diralias.h toasting.h indexing.h \
 )
 
 # location of Catalog.pm
diff --git a/src/backend/catalog/aclchk.c b/src/backend/catalog/aclchk.c
index d30612c..3717bf5 100644
--- a/src/backend/catalog/aclchk.c
+++ b/src/backend/catalog/aclchk.c
@@ -30,6 +30,7 @@
 #include "catalog/pg_collation.h"
 #include "catalog/pg_conversion.h"
 #include "catalog/pg_database.h"
+#include "catalog/pg_diralias.h"
 #include "catalog/pg_default_acl.h"
 #include "catalog/pg_event_trigger.h"
 #include "catalog/pg_extension.h"
@@ -48,6 +49,7 @@
 #include "catalog/pg_ts_config.h"
 #include "catalog/pg_ts_dict.h"
 #include "commands/dbcommands.h"
+#include "commands/diralias.h"
 #include "commands/proclang.h"
 #include "commands/tablespace.h"
 #include "foreign/foreign.h"
@@ -3183,6 +3185,190 @@ ExecGrant_Type(InternalGrant *istmt)
 	heap_close(relation, RowExclusiveLock);
 }
 
+/*
+ * ExecuteGrantDirAliasStmt
+ *   handles the execution of the GRANT/REVOKE ON DIRALIAS command.
+ *
+ * stmt - the GrantDirAliasStmt that describes the directory aliases and
+ *permissions to be granted/revoked.
+ */
+void
+ExecuteGrantDirAliasStmt(GrantDirAliasStmt *stmt)
+{
+	Relation		pg_diralias_rel;
+	Oidgrantor;
+	List		   *grantee_ids = NIL;
+	AclMode			permissions;
+	ListCell	   *item;
+
+	/* Must be superuser to grant directory alias permissions */
+	if (!superuser())
+		ereport(ERROR,
+(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("must be superuser to grant directory alias permissions")));
+
+	/*
+	 * Grantor is optional.  If it is not provided then set it to the current
+	 * user.
+	 */
+	if (stmt->grantor)
+		grantor = get_role_oid(stmt->grantor, false);
+	else
+		grantor = GetUserId();
+
+	/* Convert grantee names to oids */
+	foreach(item, stmt->grantees)
+	{
+		PrivGrantee *grantee = (PrivGrantee *) lfirst(item);
+
+		if (grantee->rolname == NULL)
+			grantee_ids = lappend_oid(grantee_ids, ACL_ID_PUBLIC);
+		else
+		{
+			Oid roleid = get_role_oid(grantee->rolname, false);
+			grantee_ids = lappend_oid(grantee_ids, roleid);
+		}
+	}
+
+	permissions = ACL_NO_RIGHTS;
+
+	/* If ALL was provided then set permissions to ACL_ALL_RIGHTS_DIRALIAS */
+	if (stmt->permissions == NIL)
+		permissions = ACL_ALL_RIGHTS_DIRALIAS;
+	else
+	{
+		/* Condense all permissions */
+		foreach(item, stmt->permissions)
+		{
+			AccessPriv *priv = (AccessPriv *) lfirst(item);
+			permissions |= string_to_privilege(priv->priv_name);
+		}
+	}
+
+
+	/*
+	 * Though it shouldn't be possible to provide permissions other than READ
+	 * and WRITE, check to make sure no others have been set.  If they have,
+	 * then warn the user and correct the permissions.
+	 */
+	if (permissions & !((AclMode) ACL_ALL_RIGHTS_DIRALIAS))
+	{
+		ereport(WARNING,
+(errcode(ERRCODE_INVALID_GRANT_OPERATION),
+ errmsg("directory aliases only support READ and WRITE permissions")));
+
+		permissions &= ACL_ALL_RIGHTS_DIRALIAS;
+	}
+
+	pg_diralias_rel = heap_open(DirAliasRelationId, RowExclusiveLock);
+
+	/* Grant/Revoke permissions on directory aliases */
+	foreach(item, stmt->directories)
+	{
+		Datum			values[Natts_pg_diralias];
+		bool			replaces[Natts_pg_diralias];
+		bool			nulls[Natts_pg_diralias];
+		ScanKeyData		skey[1];
+		HeapScanDesc	scandesc;
+		HeapTuple		tuple;
+		HeapTuple		new_tuple;
+		Datum			datum;
+		Oidowner_id;
+		Acl			   *dir_acl;
+		Acl			   *new_acl;
+		bool			is_null;
+		intnum_old_members;
+		intnum_new_members;
+		Oid			   *old_members;
+		Oid			   *new_members;
+		Oiddiralias_id;
+		char		   *name;
+
+		name = strVal(lfirst(item));
+
+		ScanKeyInit(&skey[0],
+	Anum_pg_diralias_dirname,
+	BTEqualStrategyNumber, F_NAMEEQ,
+	CStringGetDatum(name));
+
+		scandesc = heap_beginscan_catalog(pg_diralias_rel, 1, skey);
+
+		tuple = heap_getnext(scandesc, ForwardScanDirection);
+
+		if (!HeapTupleIsValid(tuple))
+			elog(ERROR, "could not find tuple for directory alias \"%s\"", name);
+
+		/*
+		 * Get directory alias owner id.  Since all superusers are considered
+		 * to be owners of a directory alias, it is safe to assume that the
+		 * current user is an owner, given the sup

[HACKERS] expected/sequence_1.out obsolete?

2014-10-21 Thread Peter Eisentraut

expected/sequence_1.out hasn't been updated at least since

commit d90ced8bb22194cbb45f58beb0961251103aeff5
Date:   Thu Oct 3 16:17:18 2013 -0400

and nobody appears to have complained.

Can it be removed?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] PostgreSQL Service Name Enhancement - Wildcard support for LDAP/DNS lookup

2014-10-21 Thread Tom Lane

"Doyle, Bryan"  writes:
> Would specifying a special value for the service name, perhaps [%], be an 
> acceptable implementation of this enhancement/fix to my above concerns?

> Example:
> # comment
> [%]
> host=%.domain.com
> port=5433
> user=admin

This doesn't seem like a terribly good idea, because such an entry would
capture *any* service name whatsoever.  And, since we check service names
before other possibilities such as host/database names, the entry would
then proceed to capture every possible connection request.

I follow what you're trying to do, but it needs to be a more constrained
syntax.  One possibility is to insist that the wildcard be only a part
of the name string, eg

[myservers-%]
host=%.domain.com
port=5433
user=admin

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Directory/File Access Permissions for COPY and Generic File Access Functions

2014-10-21 Thread Peter Eisentraut

You patch is missing the files src/include/catalog/pg_diralias.h,
src/include/commands/diralias.h, and src/backend/commands/diralias.c.

(Hint: git add -N)



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Directory/File Access Permissions for COPY and Generic File Access Functions

2014-10-21 Thread Peter Eisentraut

On 10/16/14 12:01 PM, Stephen Frost wrote:
> This started out as a request for a non-superuser to be able to review
> the log files without needing access to the server.

I think that can be done with a security-definer function.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] PostgreSQL Service Name Enhancement - Wildcard support for LDAP/DNS lookup

2014-10-21 Thread Doyle, Bryan

Hi,



I am looking to patch the LDAP Service Name feature and would like some 
feedback prior to doing so. In general, while I personally view this as a bug 
fix, it could also easily be considered to simply be an enhancement to current 
functionality and therefore the below is written as a proposed enhancement.



Note: This will also potentially serve as my first patch submission, so this 
will be a "smoke test" of my direct interaction with the mailing list as a 
Goldman Sachs Employee for proposal/feedback as well as following up with a 
potential patch/participation in commit processes, etc. As such my replies and 
related code submissions may be delayed for processes to happen on my end, with 
my associated apologies in advance. I hope to have this process completed for 
contribution to the 9.5 release.



## Currently:


* Service names are provided as an available abstraction mechanism for a 
host:port (and user, dbname, ssl requirements, etc.) information for connection 
parameters via libpq.

* If service names are used, they must each individually have an entry in 
pg_service.conf.

* In a service name entry, host:port entries, hosts can either be directly 
provided or their values may be referenced via an LDAP location that provides 
equivalent name/value pairs for connection information.



The service name feature facilitates the use of dynamic/managed host entries or 
allows for active connection configuration management with the goal in either 
case of separating such overhead from application or other process 
configuration.



Therefore:

* Dynamic DNS entries require a fixed port or active management of the file 
(DNS resolution can happen elsewhere)

* Non-Dynamic DNS entries require active management of pg_service.conf, but may 
still be desirable for configuration management reasons - version control?

* LDAP entries are largely static, but do require active management of 
pg_service.conf if new data servers are to be added (or old ones decommissioned)



In all cases described above, a newly provisioned data server requires a new 
service name entry, which requires the file to be actively managed. This 
mitigates the value of using LDAP or Dynamic DNS entries (with fixed ports in 
the DNS case) since an active distribution mechanism must still be in place, 
especially for deployments of many servers and perhaps this useful mechanism is 
used less often as a result.



## Proposal:



* Have a wildcard entry that allows string substitution of the service name 
into either the host field or an LDAP record.



I do not want to over complicate the existing configuration file or parsing 
logic as it is pretty lightweight currently. As such, I am looking for feedback 
related to if it would be acceptable to have a "wildcard" entry for service 
names as well as what that wildcard should be. I do *not* propose that complex 
string substitution, including regular expressions, etc. be utilized for 
service name matching at this time.



Per 31.16 in the 9.4 Documentation, a valid service entry is of the following 
format:



# comment

[mydb]

host=somehost.domain.com

port=5433

user=admin



Would specifying a special value for the service name, perhaps [%], be an 
acceptable implementation of this enhancement/fix to my above concerns?



Example:



# comment

[%]

host=%.domain.com

port=5433

user=admin



...or more interestingly, per 31.17:



# comment

[%]

ldap=ldap://ldap.mycompany.com/dc=mycompany,dc=com?uniqueMember?one?(cn=%)



The location of [%] and therefore order of the service names would still be 
honored. I believe this allows for useful functionality:

* LDAP wildcard listed first - libpq can always try an LDAP lookup and proceed 
with further service names if one is not found in LDAP location

* LDAP wildcard listed last - only look up LDAP entries if a service name 
doesn't match prior service names

* LDAP wildcard in the middle - combination of first 2 behaviors

* DNS wildcard listed last - try connecting with service name in well-formed 
dns-based host entry (this likely can't be first as it would always match)

* Combination of the LDAP/keyword = value entries per 31.17:



>From the existing documentation:  "Processing of pg_service.conf is terminated 
>after a successful LDAP lookup, but is continued if the LDAP server cannot be 
>contacted. This is to provide a fallback with further LDAP URL lines that 
>point to different LDAP servers, classical keyword = value pairs, or default 
>connection options. If you would rather get an error message in this case, add 
>a syntactically incorrect line after the LDAP URL."



The "%" in the ldap, host entry would replace with the service name in use - 
also possible to make it less likely to match a future valid "%" by providing 
it as [%] in the string, though I don't know why a % would reasonably show up 
in an ldap or host record.



I

Re: [HACKERS] run xmllint during build (was Re: need xmllint on borka)

2014-10-21 Thread Peter Eisentraut

On 9/14/14 3:34 AM, Fabien COELHO wrote:
>> and rebased this patch on top of that.
> 
> Applied and tested, everything looks fine.
> 
> The only remaining question is whether the xmllint check should always
> be called. You stated that it was stricter than sgml processing, so I
> would think it worth to always call it, but this is really a marginal
> preference. I think it is okay if some slaves in the build farm do build
> the various targets.

Committed.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Question about RI checks

2014-10-21 Thread Nick Barnes

On Wed, Oct 22, 2014 at 3:19 AM, Kevin Grittner  wrote:

>
> It doesn't seem like this analysis considers all of the available ON
> DELETE and ON UPDATE behaviors available.  Besides RESTRICT there is
> CASCADE, SET NULL, SET DEFAULT, and NO ACTION.  Some of those
> require updating the referencing rows.
>

I think the logic in question is specific to RESTRICT and NO ACTION. The
other cases don't look like they need to explicitly lock anything; the
UPDATE / DELETE itself should take care of that.

On Wed, Oct 22, 2014 at 3:19 AM, Kevin Grittner  wrote:

> Florian Pflug  wrote:
>
> > So in conclusion, the lock avoids raising constraint violation errors in
>
> > a few cases in READ COMMITTED mode. In REPEATABLE READ mode, it converts
> some
> > constraint violation errors into serialization failures. Or at least
> that's
> > how it looks to me.
>
> It doesn't seem like this analysis considers all of the available ON
> DELETE and ON UPDATE behaviors available.  Besides RESTRICT there is
> CASCADE, SET NULL, SET DEFAULT, and NO ACTION.  Some of those
> require updating the referencing rows.
>
> >> And even if the lock serves a purpose, KEY SHARE is an odd choice, since
> >> the referencing field is, in general, not a "key" in this sense.
> >
> > Hm, yeah, that's certainly weird.
>
> I don't think I understand that either.
>
> --
> Kevin Grittner
> EDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

[HACKERS] Getting rid of "accept incoming network connections" prompts on OS X

2014-10-21 Thread Tom Lane

If you do any Postgres development on OS X, you've probably gotten
seriously annoyed by the way that, every single time you reinstall the
postmaster executable, you get a dialog box asking whether you'd like
to allow it to accept incoming network connections.  (At least, you
do unless you disable the OS firewall, which is not a great idea.)
It's particularly awful to run "make check-world" in this environment,
because you get a pop-up for each test install.

My Salesforce colleagues researched how to fix this, and found out
that it can be suppressed if you sign the postgres executable, which
you can easily do with a self-signed certificate.  Once you've allowed
or denied network connections for a signed executable, you don't get
prompted again when the executable is replaced, so long as it's at
the same file path and signed with the same certificate.  So you only
have to dismiss the dialogs once more during a check-world run, and
you're done seeing them.  (Tested on Mavericks and Yosemite, have not
tried anything older.)

Accordingly, we'd like to propose something like the attached patch
to add an optional signing step to the build process.  It lacks any
documentation ATM, but if there are not objections to the basic idea
I'll write some.

regards, tom lane

diff --git a/configure.in b/configure.in
index 527b0762053e38af39c72ad137f52195f81a722b..bf31ecbecd1fbee614152c7fc4ffd709618765da 100644
*** a/configure.in
--- b/configure.in
*** AC_CHECK_PROGS(OSX, [osx sgml2xml sx])
*** 1877,1882 
--- 1877,1912 
  #
  AC_CHECK_PROGS(PROVE, prove)
  
+ #
+ # Do code-signing? (currently only for OS X)
+ #
+ PGAC_ARG_REQ(with, codesigning, [STRING],
+ [use certificate STRING to code-sign binaries])
+ AC_SUBST(with_codesigning)
+ 
+ if test ! -z "$with_codesigning"; then
+   if test "$PORTNAME" = "darwin"; then
+ 
+ AC_CHECK_PROGS(SECURITY, security)
+ AC_CHECK_PROGS(CODESIGN, codesign)
+ 
+ AC_MSG_CHECKING([valid identity for codesigning])
+ cs_valid_identities=`$SECURITY find-identity -p codesigning | sed -n -E -e '/Valid identities only/,$ p' | sed '1 d' | grep "\"$with_codesigning\"" | wc -l`
+ if test $cs_valid_identities -lt 1; then
+   AC_MSG_ERROR([No valid identity '$with_codesigning' found.])
+ elif test $cs_valid_identities -gt 1; then
+   AC_MSG_ERROR([Ambiguous identity '$with_codesigning'.])
+ else
+   AC_MSG_RESULT([$with_codesigning])
+ fi;
+ 
+   else
+ 
+ AC_MSG_ERROR([--with-codesigning is not supported for $PORTNAME port])
+ 
+   fi;
+ fi;
+ 
  # Thread testing
  
  # We have to run the thread test near the end so we have all our symbols
diff --git a/configure b/configure
index f0580ceb5e5dcb3fdae2789f29eaf3bc757d08ae..f222fd30a7c68457f7d614597f81e9d9425e3a3e 100755
*** a/configure
--- b/configure
*** ac_includes_default="\
*** 627,632 
--- 627,635 
  
  ac_subst_vars='LTLIBOBJS
  vpath_build
+ CODESIGN
+ SECURITY
+ with_codesigning
  PROVE
  OSX
  XSLTPROC
*** with_gnu_ld
*** 838,843 
--- 841,847 
  enable_largefile
  enable_float4_byval
  enable_float8_byval
+ with_codesigning
  '
ac_precious_vars='build_alias
  host_alias
*** Optional Packages:
*** 1524,1529 
--- 1528,1535 
use system time zone data in DIR
--without-zlib  do not use Zlib
--with-gnu-ld   assume the C compiler uses GNU ld [default=no]
+   --with-codesigning=STRING
+   use certificate STRING to code-sign binaries
  
  Some influential environment variables:
CC  C compiler command
*** fi
*** 14785,14790 
--- 14791,14929 
  done
  
  
+ #
+ # Do code-signing? (currently only for OS X)
+ #
+ 
+ 
+ 
+ # Check whether --with-codesigning was given.
+ if test "${with_codesigning+set}" = set; then :
+   withval=$with_codesigning;
+   case $withval in
+ yes)
+   as_fn_error $? "argument required for --with-codesigning option" "$LINENO" 5
+   ;;
+ no)
+   as_fn_error $? "argument required for --with-codesigning option" "$LINENO" 5
+   ;;
+ *)
+ 
+   ;;
+   esac
+ 
+ fi
+ 
+ 
+ 
+ 
+ if test ! -z "$with_codesigning"; then
+   if test "$PORTNAME" = "darwin"; then
+ 
+ for ac_prog in security
+ do
+   # Extract the first word of "$ac_prog", so it can be a program name with args.
+ set dummy $ac_prog; ac_word=$2
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for $ac_word" >&5
+ $as_echo_n "checking for $ac_word... " >&6; }
+ if ${ac_cv_prog_SECURITY+:} false; then :
+   $as_echo_n "(cached) " >&6
+ else
+   if test -n "$SECURITY"; then
+   ac_cv_prog_SECURITY="$SECURITY" # Let the user override the test.
+ else
+ as_save_IFS=$IFS; IFS=$PATH_SEPARATOR
+ for as_dir in $PATH
+ do
+   IFS=$as_save_IFS
+   test -z "$as_dir" && as_dir=.
+ for ac_exec_ext in '' $ac_executable_extensions; do
+   if as_fn_executable_p "$as_dir/$ac_word$a

Re: [HACKERS] Question about RI checks

2014-10-21 Thread Nick Barnes

Thanks! I've been mulling this over for weeks; nice to know it wasn't just
staring me in the face...

So in conclusion, the lock avoids raising constraint violation errors in
> a few cases in READ COMMITTED mode. In REPEATABLE READ mode, it converts
> some
> constraint violation errors into serialization failures. Or at least that's
> how it looks to me.
>

Yeah, it had occurred to me that this is one place you might see some
benefit. But waiting around on a potentially irrelevant update, just in
case the RI violation resolves itself, didn't really sound like a net win.
Not to mention the possibility of a deadlock, if the other transaction
updates our PK or adds another reference to it.

Thanks again,
Nick Barnes

Re: [HACKERS] Question about RI checks

2014-10-21 Thread Kevin Grittner

Florian Pflug  wrote:

> So in conclusion, the lock avoids raising constraint violation errors in

> a few cases in READ COMMITTED mode. In REPEATABLE READ mode, it converts some
> constraint violation errors into serialization failures. Or at least that's
> how it looks to me.

It doesn't seem like this analysis considers all of the available ON
DELETE and ON UPDATE behaviors available.  Besides RESTRICT there is
CASCADE, SET NULL, SET DEFAULT, and NO ACTION.  Some of those
require updating the referencing rows.

>> And even if the lock serves a purpose, KEY SHARE is an odd choice, since
>> the referencing field is, in general, not a "key" in this sense.
>
> Hm, yeah, that's certainly weird.

I don't think I understand that either.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [TODO] Track number of files ready to be archived in pg_stat_archiver

2014-10-21 Thread Brightwell, Adam

Julien,


> Actually, I used the same loop as the archiver one (see
> backend/postmaster/pgarch.c, function pgarch_readyXlog) to get the exact
> same number of files.
>

Ah, I see.


> If we change it in this patch, it would be better to change it everywhere.
> What do you think ?
>

Hmm... I'd have to defer to the better judgement of a committer on that
one.  Though, I would think that the general desire would be to keep the
patch relevant ONLY to the necessary changes.  I would not qualify making
those types of changes as relevant, IMHO.  I do think this is potential for
cleanup, however, I would suspect that would be best done in a separate
patch.  But again, I'd defer to a committer whether such changes are even
necessary/acceptable.

-Adam

-- 
Adam Brightwell - adam.brightw...@crunchydatasolutions.com
Database Engineer - www.crunchydatasolutions.com

Re: [HACKERS] Question about RI checks

2014-10-21 Thread Florian Pflug

(CCing Alvaro, since he implemented KEY SHARE locks)

On Oct16, 2014, at 15:51 , Nick Barnes  wrote:
> One of the queries in ri_triggers.c has be a little baffled.
> 
> For (relatively) obvious reasons, a FK insert triggers a SELECT 1 FROM pk_rel 
> ... FOR KEY SHARE.
> For not-so-obvious reasons, a PK delete triggers a SELECT 1 FROM fk_rel ... 
> FOR KEY SHARE.
> 
> I can't see what the lock on fk_rel achieves. Both operations are already
> contending for the lock on the PK row, which seems like enough to cover
> every eventuality.

I remember wondering about this too, quite a while ago. That was before we
had KEY locks, i.e. it simply read "FOR SHARE" back then.

I don't think I ever figured out if and why that lock is necessary - so here's
an attempt at unraveling this.

The lock certainly isn't required to ensure that we see all the child rows in
the fk_rel -- since we're doing this in an AFTER trigger, we're already holding
the equivalent of an UPDATE lock on the parent row, so no new fk_rel rows can
appear concurrently. Note that "seeing" here refers to an up-to-date snapshot
-- in REPEATABLE READ mode and above, our transaction's snapshot doesn't
necessarily see those rows, but ri_PerformCheck ensure that we error out if our
transaction's snapshot prevents us from seeing any newly added child rows
(c.f. detectNewRows).

However, DELETing from fk_rel isn't restricted by any of the constraint 
triggers,
so child rows may still be deleted concurrently. So without the lock, it might
happen that we report a constraint violation, yet the offending row is already
gone by the time we report the error. Note that ri_PerformCheck's detectNewRows
flag doesn't enter here -- AFAIK, it only raises an error if the transaction's
snapshot sees *fewer* rows than a current snapshot. 

So AFAICS, the current behaviour (sans the KEY SHARE / SHARE confusion, see
below) is this:

In READ COMMITED mode, we'll ignore rows that are deleted or reparented
concurrently (after waiting for the concurrent transaction to commit, of course)

In REPEATABLE READ mode and above, we'll report a serialization error if a child
row is deleted or reparented concurrently (if the concurrent transaction 
committed,
of course)

Without the lock, we'd report a constraint violation in both cases.

So in conclusion, the lock avoids raising constraint violation errors in
a few cases in READ COMMITTED mode. In REPEATABLE READ mode, it converts some
constraint violation errors into serialization failures. Or at least that's
how it looks to me.

> And even if the lock serves a purpose, KEY SHARE is an odd choice, since
> the referencing field is, in general, not a "key" in this sense.

Hm, yeah, that's certainly weird. AFAICS, what this will do is prevent any
concurrent update of any columns mentions in any unique index on the *fk_rel*
- but as you say, that set doesn't necessarily the set of columns in the
FK constraint at all.

So currently, it seems that the lock only prevent concurrent DELETES, but 
not necessarily concurrent UPDATEs, even if such an UPDATE changes the parent
that a child row refers to.

Independent from whether the lock is actually desirable or not, that
inconsistency certainly looks like a bug to me...

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-10-21 Thread Amit Kapila

On Wed, Oct 8, 2014 at 6:17 PM, Andres Freund 
wrote:
>
> On 2014-06-25 19:06:32 +0530, Amit Kapila wrote:
> > 2.
> > LWLockWakeup()
> > {
> > ..
> > #ifdef LWLOCK_STATS
> > lwstats->spin_delay_count += SpinLockAcquire(&lock->mutex);
> > #else
> > SpinLockAcquire(&lock->mutex);
> > #endif
> > ..
> > }
> >
> > Earlier while releasing lock, we don't count it towards LWLock stats
> > spin_delay_count.  I think if we see other places in lwlock.c, it only
gets
> > counted when we try to acquire it in a loop.
>
> I think the previous situation was clearly suboptimal. I've now modified
> things so all spinlock acquirations are counted.

Code has mainly 4 stats (sh_acquire_count, ex_acquire_count,
block_count, spin_delay_count) to track, if I try to see
all stats together to understand the contention situation,
the unpatched code makes sense.   spin_delay_count gives
how much delay has happened to acquire spinlock which when
combined with other stats gives the clear situation about
the contention around aquisation of corresponding LWLock.
Now if we want to count the spin lock delay for Release call
as well, then the meaning of the stat is getting changed.
It might be that new meaning of spin_delay_count stat is more
useful in some situations, however the older one has its own
benefits, so I am not sure if changing this as part of this
patch is the best thing to do.

> > 3.
> > LWLockRelease()
> > {
> > ..
> > /* grant permission to run, even if a spurious share lock increases
> > lockcount */
> > else if (mode == LW_EXCLUSIVE && have_waiters)
> > check_waiters = true;
> > /* nobody has this locked anymore, potential exclusive lockers get a
chance
> > */
> > else if (lockcount == 0 && have_waiters)
> > check_waiters = true;
> > ..
> > }
> >
> > It seems comments have been reversed in above code.
>
> No, they look right. But I've expanded them in the version I'm going to
> post in a couple minutes.

okay.

> > 5.
> > LWLockWakeup()
> > {
> > ..
> > dlist_foreach_modify(iter, (dlist_head *) &wakeup)
> > {
> > PGPROC *waiter = dlist_container(PGPROC, lwWaitLink, iter.cur);
> > LOG_LWDEBUG("LWLockRelease", l, mode, "release waiter");
> > dlist_delete(&waiter->lwWaitLink);
> > pg_write_barrier();
> > waiter->lwWaiting = false;
> > PGSemaphoreUnlock(&waiter->sem);
> > }
> > ..
> > }
> >
> > Why can't we decrement the nwaiters after waking up? I don't think
> > there is any major problem even if callers do that themselves, but
> > in some rare cases LWLockRelease() might spuriously assume that
> > there are some waiters and tries to call LWLockWakeup().  Although
> > this doesn't create any problem, keeping the value sane is good unless
> > there is some problem in doing so.
>
> That won't work because then LWLockWakeup() wouldn't be called when
> necessary - precisely because nwaiters is 0.

> The reason I've done so is that it's otherwise much harder to debug
> issues where there are backend that have been woken up already, but
> haven't rerun yet. Without this there's simply no evidence of that
> state. I can't see this being relevant for performance, so I'd rather
> have it stay that way.

I am not sure what useful information we can get during debugging by not
doing this in LWLockWakeup() and w.r.t performance it can lead extra
function call, few checks and I think in some cases even can
acquire/release spinlock.

> > 6.
> > LWLockWakeup()
> > {
> > ..
> > dlist_foreach_modify(iter, (dlist_head *) &l->waiters)
> > {
> > ..
> > if (wokeup_somebody && waiter->lwWaitMode == LW_EXCLUSIVE)
> > continue;
> > ..
> > if (waiter->lwWaitMode != LW_WAIT_UNTIL_FREE)
> > {
> > ..
> > wokeup_somebody = true;
> > }
> > ..
> > }
> > ..
> > }
> >
> > a.
> > IIUC above logic, if the waiter queue is as follows:
> > (S-Shared; X-Exclusive) S X S S S X S S
> >
> > it can skip the exclusive waiters and release shared waiter.
> >
> > If my understanding is right, then I think instead of continue, there
> > should be *break* in above logic.
>
> No, it looks correct to me. What happened is that the first S was woken
> up. So there's no point in waking up an exclusive locker, but further
> non-exclusive lockers can be woken up.

Okay, even then it makes the current logic of wakingup
different which I am not sure is what this patch is intended
for.

> > b.
> > Consider below sequence of waiters:
> > (S-Shared; X-Exclusive) S S X S S
> >
> > I think as per un-patched code, it will wakeup waiters uptill
(including)
> > first Exclusive, but patch will wake up uptill (*excluding*) first
> > Exclusive.
>
> I don't think the current code does that.

LWLockRelease()
{
..
/*
 * If the front waiter wants exclusive lock, awaken him only.
 *
Otherwise awaken as many waiters as want shared access.
 */
if (proc-
>lwWaitMode != LW_EXCLUSIVE)
{
while (proc->lwWaitLink !=
NULL &&
   proc->lwWaitLink->lwWaitMode != LW_EXCLUSIVE)
 {
if (proc->lwWaitMode != LW_WAIT_UNTIL_FREE)
 releaseOK = false;
proc = proc->lwWaitLink;
 }
}
/* proc is now the last PGPROC to be

Re: [HACKERS] Trailing comma support in SELECT statements

2014-10-21 Thread Tom Lane

Jim Nasby  writes:
> On 10/20/14, 11:16 AM, Andrew Dunstan wrote:
>> The JSON spec is quite clear on this. Leading and trailing commas are not 
>> allowed. I would fight tooth and nail not to allow it for json (and by 
>> implication jsonb, since they use literally the same parser - in fact we do 
>> that precisely so their input grammars can't diverge).

> +1. Data types that implement specs should follow the spec.

> I was more concerned about things like polygon, but the real point (ha!) is 
> that we need to think about the data types too. (I will say I don't think 
> things that mandate an exact number of elements (like point, box, etc) should 
> support extra delimiters).

I'm pretty strongly against this, as it would create cross-version hazards
for data.  Having queries that depend on newer-version SQL features is
something that people are used to coping with ... but data that loads into
some versions and not others seems like a hassle we do not need to invent.

(Of course, I'm not for the feature w.r.t. SQL either.  But breaking data
compatibility is just adding an entire new dimension of trouble.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Inconsistencies in documentation of row-level locking

2014-10-21 Thread Michael Paquier

On Tue, Oct 21, 2014 at 10:26 AM, Jim Nasby 
wrote:

> Did this get committed? Should probably add it to the commitfest if not...
>
Already done in CF3, I should have mentioned it:
https://commitfest.postgresql.org/action/patch_view?id=1594
-- 
Michael

Re: [HACKERS] Would you help to review our modifications

2014-10-21 Thread Merlin Moncure

On Mon, Oct 20, 2014 at 11:02 AM, David G Johnston
 wrote:
> rohtodeveloper wrote
>> So how to deal with this kind of situation if I want a implicit
>> conversion?
>
> As of the out-of-support 8.3 release many of the implicit casts previously
> defined have been changed to explicit casts.  It is a catalog change -
> obviously, since you can still define implicit casts - so if you absolutely
> must have the pre-existing cast be implicit you can modify the catalog
> directly.
>
> You may wish to describe why you think this is the solution you need - with
> implicit casting there are generally more downsides that upsides.

I feel your pain. My company just last year completed a nine month
effort to validate a sprawling code base for post 8.3 casts.  We were
orphaned on 8.1 and were very nearly forced to switch to another
database.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: Add launchd Support

2014-10-21 Thread Florian Pflug

On Oct21, 2014, at 02:53 , Wim Lewis  wrote:
> 
>> 2) AFAICS, this .plist file doesn't do anything about launchd's habit of
>> not waiting for the network to come up. 

>> If true, the job will be kept alive as long as the network is up, where
>> up is defined as at least one non-loopback  interface being up and having
>> IPv4 or IPv6 addresses assigned to them.  If false, the job will be kept
>> alive in the inverse condition.
> 
> On the other hand, it’s not unreasonable to have postgres running on a
> machine with only a loopback interface, depending on the use.

This, I think, shows the gist of the problem -- "Networking is up" is not
a clearly defined stated on modern desktop machines. Interfaces can come and
go all the time, and even things like ethernet may take a while to come up
during boot, or even depend on a user logging in if something like 802.11X
is used.

>> We might be able to put something in LaunchEvents that gets it to fire
>> when the network launches, but documentation is hella thin (and may only
>> be supported on Yosemite, where there are a bunch of poorly-documented
>> launchd changes).
> 
> If one were desperate enough... it’s possible to dig through the launchd
> sources to make up for the gaps in the documentation (also on
> opensource.apple.com; there used to be a community-ish site for it at
> macosforge.org as well). It’s rough going, though, IIRC.

The correct way to deal with this would be to teach postgres to react to
connectivity changes dynamically. During postmaster launch, we'd ignore all
"listen" entries which mention non-bindable addresses. If the network
configuration changes, we'd rescan the "listen" entries, and attempt to
create any missing sockets.

OS X provides an API (via the SystemConfiguration framework IIRC) which allows
to subscribe to networking-changed notifications. I believe systemd provides
something similar via dbus or some such.

Whether or not doing this is worth the effort though, I don't know. I guess
it depends on whether postgres on OS X is actually used for productions --
my guess is that most installations are for development and testing purposes,
but maybe my view's biased there.

>>> (3) I don't think you want Disabled = true.
>> 
>> It’s the default. When you run `launchctl load -w` it overrides it to false 
>> in
>> its database. I’m fine to have it be less opaque, though.
> 
> Yes, AFAICT it’s conventional to specify Disabled=true in a launchd plist and
> use launchctl to enable the item.

Yup, macports also has Disabled=true in the launchd items they install.

best regards,
Florian Pflug

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [TODO] Track number of files ready to be archived in pg_stat_archiver

2014-10-21 Thread Julien Rouhaud

On Tue, Oct 21, 2014 at 7:35 AM, Brightwell, Adam <
adam.brightw...@crunchydatasolutions.com> wrote:

> Julien,
>
> The following is an initial review:
>
>
Thanks for the review.


> * Applies cleanly to master (f330a6d).
> * Regression tests updated and pass, including 'check-world'.
> * Documentation updated and builds successfully.
> * Might want to consider replacing the following magic number with a
> constant or perhaps calculated value.
>
> +   int basenamelen = (int) strlen(rlde->d_name) - 6;
>
> * Wouldn't it be easier, or perhaps more reliable to use "strrchr()" with
> the following instead?
>
> +   strcmp(rlde->d_name + basenamelen, ".ready") == 0)
>
> char *extension = strrchr(ride->d_name, '.');
> ...
> strcmp(extension, ".ready") == 0)
>
> I think this approach might also help to resolve the magic number above.
> For example:
>
> char *extension = strrchr(ride->d_name, '.');
> int basenamelen = (int) strlen(ride->d_name) - strlen(extension);
>
>
Actually, I used the same loop as the archiver one (see
backend/postmaster/pgarch.c, function pgarch_readyXlog) to get the exact
same number of files.

If we change it in this patch, it would be better to change it everywhere.
What do you think ?

-Adam
>
> --
> Adam Brightwell - adam.brightw...@crunchydatasolutions.com
> Database Engineer - www.crunchydatasolutions.com
>

Re: [HACKERS] [PATCH] Simplify EXISTS subqueries containing LIMIT

2014-10-21 Thread Marti Raudsepp

Hi

Thanks for taking a look.

On Sun, Oct 19, 2014 at 1:22 PM, David Rowley  wrote:
> the argument for this would
> have been much stronger if anti join support had just been added last week.
> It's been quite a few years now and the argument for this must be getting
> weaker with every release.

I see your point, but I would put it another way: we've had this for a
few years, but people haven't learned and are *still* using LIMIT 1.

Actually I thought of a cleverer approach to solve this, that doesn't
require calling eval_const_expressions and works with any expression.

Given query:
  EXISTS (SELECT ... WHERE foo LIMIT any_expr())
we could turn it into the almost-equivalent form:
  EXISTS (SELECT ... WHERE foo AND any_expr() > 0)

The only problem is that we'd no longer be able to throw an error for
negative values and that seems like a deal-breaker.

Regards,
Marti

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch: Add launchd Support

2014-10-21 Thread Marti Raudsepp

On Tue, Oct 21, 2014 at 3:53 AM, Wim Lewis  wrote:
> I think the idea of OnDemand is for launchd items to act a bit like inetd
> does: launchd creates the listening socket (or mach port or file-change
> notification) on the port specified in the plist, and only starts the
> process when someone tries to connect to it. This might not be a terrible
> thing to support, but it would require more changes throughout postgres
> (postmaster would have to check in with launchd at start time to get the
> listen socket; ports and socket paths would no longer be specifiable in
> postgres’ config, etc) and I’m skeptical that it’d be worth the work
> without a more concrete motivation.

systemd socket activation on Linux works pretty much the same way. If
this will ever be implemented, it's most reasonable to tackle both
launchd and systemd together.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] inherit support for foreign tables

2014-10-21 Thread Etsuro Fujita


(2014/10/14 20:00), Etsuro Fujita wrote:

Here are separated patches.

fdw-chk.patch  - CHECK constraints on foreign tables
fdw-inh.patch  - table inheritance with foreign tables

The latter has been created on top of [1].



[1] http://www.postgresql.org/message-id/540da168.3040...@lab.ntt.co.jp


To be exact, it has been created on top of [1] and fdw-chk.patch.

I noticed that the latter disallows TRUNCATE on inheritance trees that 
contain at least one child foreign table.  But I think it would be 
better to allow it, with the semantics that we quietly ignore the child 
foreign tables and apply the operation to the child plain tables, which 
is the same semantics as ALTER COLUMN SET STORAGE on such inheritance 
trees.  Comments welcome.


Thanks,

Best regards,
Etsuro Fujita


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Wait free LW_SHARED acquisition - v0.9

2014-10-21 Thread Amit Kapila

On Fri, Oct 17, 2014 at 11:41 PM, Andres Freund 
wrote:
> On 2014-10-17 17:14:16 +0530, Amit Kapila wrote:
> > On Tue, Oct 14, 2014 at 11:34 AM, Amit Kapila 
> > wrote:
> >  HEAD – commit 494affb + wait free lw_shared_v2
> >
> >  Shared_buffers=8GB; Scale Factor = 3000
> >
> >  Client Count/No. Of Runs (tps) 64 128  Run-1 286209 274922  Run-2
289101
> > 274495  Run-3 289639 273633
>
> So here the results with LW_SHARED were consistently better, right?

Yes.

> You
> saw performance degradations here earlier?

Yes.

> > So I am planning to proceed further with the review/test of your
> > latest patch.
>
> > According to me, below things are left from myside:
> > a. do some basic tpc-b tests with patch

I have done few tests, the results of which are below, the data indicates
that neither there is any noticeable gain nor any noticeable loss on tpc-b
tests which I think is what could have been expected of this patch.
There is slight variation at few client counts (for sync_commit =off,
at 32 and 128), however I feel that is just noise as I don't see any
general trend.

Performance Data

IBM POWER-8 24 cores, 192 hardware threads
RAM = 492GB
Database Locale =C
max_connections =300
checkpoint_segments=300
checkpoint_timeout=15min
maintenance_work_mem = 1GB
checkpoint_completion_target = 0.9
Client Count = number of concurrent sessions and threads (ex. -c 8 -j 8)
Duration of each individual run = 30mins
Test mode - tpc-b

Below data is median of 3 runs, detailed data is attached with this
mail.


Scale_factor =3000; shared_buffers=8GB;

  Patch/Client_count 8 16 32 64 128  HEAD 3849 4889 3569 3845 4547
LW_SHARED 3844 4787 3532 3814 4408

Scale_factor =3000; shared_buffers=8GB; synchronous_commit=off;

  Patch/Client_count 8 16 32 64 128  HEAD 5966 8297 10084 9348 8836
LW_SHARED 6070 8612 8839 9503 8584

While doing performance tests, I noticed a hang at higher client
counts with patch. I have tried to check call stack for few of
processes and it is as below:

#0  0x008010933e54 in .semop () from /lib64/libc.so.6
#1  0x10286e48 in .PGSemaphoreLock ()
#2  0x102f68bc in .LWLockAcquire ()
#3  0x102d1ca0 in .ReadBuffer_common ()
#4  0x102d2ae0 in .ReadBufferExtended ()
#5  0x100a57d8 in ._bt_getbuf ()
#6  0x100a6210 in ._bt_getroot ()
#7  0x100aa910 in ._bt_search ()
#8  0x100ab494 in ._bt_first ()
#9  0x100a8e84 in .btgettuple ()
..

#0  0x008010933e54 in .semop () from /lib64/libc.so.6
#1  0x10286e48 in .PGSemaphoreLock ()
#2  0x102f68bc in .LWLockAcquire ()
#3  0x102d1ca0 in .ReadBuffer_common ()
#4  0x102d2ae0 in .ReadBufferExtended ()
#5  0x100a57d8 in ._bt_getbuf ()
#6  0x100a6210 in ._bt_getroot ()
#7  0x100aa910 in ._bt_search ()
#8  0x100ab494 in ._bt_first ()
...

The test configuration is as below:
Test env - Power - 7 (hydra)
scale_factor - 3000
shared_buffers - 8GB
test mode - pgbench read only

test execution -
./pgbench -c 128 -j 128 -T 1800 -S -M prepared postgres

I have ran it for half an hour, but it doesn't came out even after
~2 hours.  It doesn't get reproduced every time, currently I am
able to reproduce it and the m/c is in same state, if you want any
info, let me know (unfortunately binaries are in release mode, so
might not get enough information).


> > b. re-review latest version posted by you
>
> Cool!

I will post my feedback for code separately, once I am able to
completely review the new versions.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


perf_lwlock_contention_tpcb_data_v1.ods
Description: application/vnd.oasis.opendocument.spreadsheet

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] alter user/role CURRENT_USER

2014-10-21 Thread Kyotaro HORIGUCHI

Hello,

> Kyotaro,
> 
> Food for thought.  Couldn't you reduce the following block:
> 
> + if (strcmp(stmt->role, "current_user") == 0)
> + {
> + roleid = GetUserId();
> + tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
> + if (!HeapTupleIsValid(tuple))
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("roleid %d does not exist", roleid)));
> + }
> + else
> + {
> + tuple = SearchSysCache1(AUTHNAME, PointerGetDatum(stmt->role));
> + if (!HeapTupleIsValid(tuple))
> + ereport(ERROR,
> + (errcode(ERRCODE_UNDEFINED_OBJECT),
> + errmsg("role \"%s\" does not exist", stmt->role)));
> 
> To:
> 
> if (strcmp(stmt->role, "current_user") == 0)
> roleid = GetUserId();
> else
> roleid = get_role_oid(stmt->role, false);
> 
> tuple = SearchSysCache1(AUTHOID, ObjectIdGetDatum(roleid));
> 
> if (!HeapTupleIsValid(tuple))
> ereport(ERROR,
> (errcode(ERRCODE_UNDEFINED_OBJECT),
>  errmsg("roleid %d does not exist", roleid)));
> 
> I think this makes it a bit cleaner.  It also reuses existing code as
> 'get_role_oid()' already does a valid role name check and will raise the
> proper error.

Year, far cleaner. I missed the function. Thank you.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] alter user/role CURRENT_USER

2014-10-21 Thread Kyotaro HORIGUCHI

Thank you for reviewing,

 2014 13:10:57 +0530, Rushabh Lathia  wrote in 

> I gone through patch and here is the review for this patch:
> 
> 
> .) patch go applied on master branch with patch -p1 command
>(git apply failed)
> .) regression make check run fine
> .) testcase coverage is missing in the patch
> .) Over all coding/patch looks good.
> 
> Few comments:
> 
> 1) Any particular reason for not adding SESSION_USER/USER also made usable
> for this command ?

No particular reson. This patch was the first step and if this is
the adequate way and adding them is desirable, I will add them.

> 2) I think RoleId_or_curruser can be used for following role:
> 
> /* ALTER TABLE  OWNER TO RoleId */
> | OWNER TO RoleId

Good catch. I'll try it.

> 3) In the documentation patch, added for CURRENT_USER but CURRENT_ROLE is
> missing.

Mmm.. I didn't added CURRENT_ROLE in the previous patch.

I suppose CURRENT_ROLE is an implicit alias of CURRENT_USER
because it is not explained in the page below in spite of
existing in syntax.

http://www.postgresql.org/docs/9.4/static/functions-info.html

and it is currently usable only in expressions, so the following
SQL failed and all syntax using auth_ident will fail.

postgres=# CREATE USER MAPPING FOR CURRENT_ROLE SERVER s1;
ERROR:  syntax error at or near "current_role"

share/doc/html/sql-createusermapping.html

| Synopsis
| 
| CREATE USER MAPPING FOR { user_name | USER | CURRENT_USER | PUBLIC }

I don't know why the 'USER' is added here, but anyway I can add
'CURRENT_ROLE' there in gram.y but it seems not necessary to add
it to document.


Ok, I'll modify this patch so that,

  - Make CURRENT_USER/ROLE usable in TABLE OWNER TO.

and since adding CURRENT_ROLE is the another thing, I'll do the
following things as additional patch.

 - Add USER, CURRENT_ROLE and SESSION_* for the place where
   CURRENT_USER is usable now.  auth_ident and
   RoleId_or_curruser.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

46 matches

Mail list logo