date:20140702

On Tue, Jul 1, 2014 at 7:18 AM, Fujii Masao masao.fu...@gmail.com wrote:

 On Tue, Jul 1, 2014 at 5:51 AM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
  Thanks for the review!
 
 
  On 06/27/2014 09:12 AM, Michael Paquier wrote:
 
  Looking at this patch, it does not compile when assertions are enabled
  because of this assertion in heapam.c:
  Assert(recdata == data + datalen);
  It seems like a thinko as removing it makes the code compile.
 
 
  Fixed.
 
 
  Some comments about the code:
  1) In src/backend/access/transam/README:
  's/no further action is no required/no further action is required/g'
 
 
  Fixed.
 
 
  2) Description of XLogBeginInsert, XLogHasBlockRef, XLogGetBlockRef and
  XLogGetBlockRefIds are missing in src/backend/access/transam/README.
 
 
  Added descriptions of XLogBeginInsert and XLogHasBlockRef. Ḯ'm not sure
  what to do with the latter two. They're not really intended for use by
 redo
  routines, but for things like pg_xlogdump that want to extract
 information
  about any record type.
 
 
  3) There are a couple of code blocks marked as FIXME and BROKEN. There
 are
  as well some NOT_USED blocks.
 
 
  Yeah. The FIXMEs and BROKEN blocks are all in rm_desc routines. They
 mostly
  stand for code that used to print block numbers or relfilenodes, which
  doesn't make much sense to do in an ad hoc way in rmgr-specific desc
  routines anymore. Will need to go through them all an decide what's the
  important information to print for each record type.
 
  The few NOT_USED blocks are around code in redo routines that extract
 some
  information from the WAL record, that isn't needed anymore. We could
 remove
  the fields from the WAL records altogether, but might be useful to keep
 for
  debugging purposes.
 
 
4) Current patch crashes when running make check in
  contrib/test_decoding.
  Looking closer, wal_level = logical is broken:
  =# show wal_level;
wal_level
  ---
logical
  (1 row)
  =# create database foo;
  The connection to the server was lost. Attempting reset: Failed.
 
 
  Fixed.
 
 
  Looking even closer, log_heap_new_cid is broken:
 
 
  Fixed.
 
 
  6) XLogRegisterData creates a XLogRecData entry and appends it in one of
  the static XLogRecData entries if buffer_id = 0 or to
  rdata_head/rdata_tail if buffer_id = -1 (for permanent data related to
 the
  WAL record). XLogRegisterXLogRecData does something similar, but with an
  entry of XLogRecData already built. What do you think about removing
  XLogRegisterXLogRecData and keeping only XLogRegisterData. This makes
 the
  interface simpler, and XLogRecData could be kept private in xlog.c. This
  would need more work especially on gin side where I am seeing for
 example
  constructLeafRecompressWALData directly building a XLogRecData entry.
 
 
  Hmm. I considered that, but punted because I didn't want to rewrite all
 the
  places that currently use XLogRecDatas in less straightforward ways. I
 kept
  XLogRegisterXLogRecData as an escape hatch for those.
 
  But I bit the bullet now, and got rid of all the
 XLogRegisterXLogRecData()
  calls. XLogRecData struct is now completely private to xlog.c.
 
  Another big change in the attached patch version is that
 XLogRegisterData()
  now *copies* the data to a working area, instead of merely adding a
  XLogRecData entry to the chain. This simplifies things in xlog.c, now
 that
  the XLogRecData concept is not visible to the rest of the system. This
 is a
  subtle semantic change for the callers: you can no longer register a
 struct
  first, and fill the fields afterwards.
 
  That adds a lot of memcpys to the critical path, which could be bad for
  performance. In practice, I think it's OK, or even a win, because a
 typical
  WAL record payload is very small. But I haven't measured that. If it
 becomes
  an issue, we could try to optimize, and link larger data blocks to the
 rdata
  chain, but memcpy small small chunks of data. There are a few WAL record
  types that don't fit in a small statically allocated buffer, so I
 provided a
  new function, XLogEnsureSpace(), that can be called before XLogBegin() to
  allocate a large-enough working area.
 
  These changes required changing a couple of places where WAL records are
  created:
 
  * ginPlaceToPage. This function has a strange split of responsibility
  between ginPlaceToPage and dataPlaceToPage/entryPlaceToPage.
 ginPlaceToPage
  calls dataPlaceToPage or entryPlaceToPage, depending on what kind of a
 tree
  it is, and dataPlaceToPage or entryPlaceToPage contributes some data to
 the
  WAL record. Then ginPlaceToPage adds some more, and finally calls
  XLogInsert(). I simplified the SPLIT case so that we now just create full
  page images of both page halves. We were already logging all the data on
  both page halves, but we were doing it in a smarter way, knowing what
 kind
  of pages they were. For example, for an entry tree page, we included an
  IndexTuple struct for every tuple on the

Re: [HACKERS] 9.5 CF1

Performance
---

scalable LWLocks
Generic atomics implementation
Both under active discussion.

KNN-GiST with recheck
Partial sort
Some earlier discussion, but no conclusions, and as far as I know
nobody is working on these patches at the moment.

lowering array_agg memory requirements
I'm waiting to hear from Tomas if he expects to post an updated
version of the patch soon. Otherwise I'll move it to 2014-08.

Don't require a NBuffer sized PrivateRefCount array of local buffer pins
Pending performance tests before commit. Is anyone working on this?

Allow more join types to be removed
Updated patch pending review from Simon. Updates welcome.

Sort support for text with strxfrm() poor man's keys
Lots of earlier discussion, but no conclusions as far as I can tell.
Does anyone want to take a look at this?

Scaling shared buffer eviction
Pending review by Andres. Any updates?

Spread shared memory across NUMA memory nodes
Marked waiting on author, but status unclear. Any updates?

Use unique index for longer pathkeys
No reviews, no reviewers. I took a quick look, and it's not clear if
this is useful in itself, or if it just enables more interesting
optimisations later. Does anyone want to look at this?

XLogLockBlockRangeForCleanup
Amit Khandekar plans to post a review this week.

SKIP LOCKED
Two patches available: original by Simon, updated/merged version by
Thomas Munro. Simon said he'll look into both with a view towards
committing the functionality. Timeline not yet clear.

Allow NOT IN to use anti joins
Revised patch marked pending review. Updates welcome.

CSN snapshots
Work in progress, plenty of discussion, no reviews. Probably nothing
to do here. (Heikki, are you waiting for feedback or anything?)

There's not even one ready for committer patch in this category.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RETURNING PRIMARY KEY syntax extension

2014-07-02 Thread Ian Barwick

On 14/07/01 23:13, Robert Haas wrote:
 On Tue, Jul 1, 2014 at 8:00 AM, Rushabh Lathia rushabh.lat...@gmail.com 
 wrote:
 .) In map_primary_key_to_list() patch using INDEX_ATTR_BITMAP_IDENTITY_KEY
 bitmap to get the keycols. In IndexAttrBitmapKind there is also
 INDEX_ATTR_BITMAP_KEY, so was confuse between INDEX_ATTR_BITMAP_KEY and
 INDEX_ATTR_BITMAP_IDENTITY_KEY and also haven't found related comments in
 the code. Later with use of testcase and debugging found confirmed that
 INDEX_ATTR_BITMAP_IDENTITY_KEY only returns the Primary key.
 
 Actually, that depends on how REPLICA IDENTITY is set.  IOW, you can't
 assume that will give you the primary key.

Damn, fooled by the name. Thanks for the info; I'll rework the patch
accordingly.

Regards


Ian Barwick


-- 
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] better atomics - v0.5

2014-07-02 Thread Ants Aasma

On Fri, Jun 27, 2014 at 9:00 PM, Andres Freund and...@2ndquadrant.com wrote:
 Imagine the situation for the buffer header spinlock which is one of the
 bigger performance issues atm. We could aim to replace all usages of
 that with clever and complicated logic, but it's hard.

 The IMO reasonable (and prototyped) way to do it is to make the common
 paths lockless, but fall back to the spinlock for the more complicated
 situations. For the buffer header that means that pin/unpin and buffer
 lookup are lockless, but IO and changing the identity of a buffer still
 require the spinlock. My attempts to avoid the latter basically required
 a buffer header specific reimplementation of spinlocks.

There is a 2010 paper [1] that demonstrates a fully non-blocking
approach to buffer management using the same generalized clock
algorithm that PostgreSQL has. The site also has an implementation for
Apache Derby. You may find some interesting ideas in there.

[1] 
http://code.google.com/p/derby-nb/source/browse/trunk/derby-nb/ICDE10_conf_full_409.pdf

Regards,
Ants Aasma
-- 
Cybertec Schönig  Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] alter user set local_preload_libraries.

2014-07-02 Thread Kyotaro HORIGUCHI

Hello, This is about a document fix of local_preload_libraries
for the versions 9.3 and earlier to at least 8.4.


There's inconsistency between documentation about
local_preload_libraries for 9.3 and earlier and their behavior.

The 9.3 document for local_preload_libraries says that:

http://www.postgresql.org/docs/9.3/static/runtime-config-client.html

 | For example, debugging could be enabled for all sessions under
 | a given user name by setting this parameter with ALTER ROLE
 | SET.

Ok, let's do that.

postgres=# alter user horiguti2 set local_preload_libraries='libname';
ERROR:  parameter local_preload_libraries cannot be set after connection start

Back to 8.4 shows the same behavior.

session_preload_libraries works as expected since 9.4. It is
added at 9.4 vested the context PGC_SUSET so its setting in
pg_db_role_setting can complete its mission. On the other hand,
local_preload_libraries seems to continue to have the context
PCG_BACKEND and the behavior is same to the older versions, as
shown above. And the description about 'ALTER ROLE SET' has been
removed from config.sgml. These are done by the commit
070518ddab2.

I think we should regard the real behavior as right, right?

Eventually, local_preload_libraries seems to have almost the same
function to shared_preload_libraries, except a restriction on
load module directory.

Putting aside the meaning of its existence, anyway the third
paragraph of the description for local_preload_libraries should
be removed. The change made by the attached patch for 9.3 will
show in runtime-config-client.html 18.11. Client Connection
Defaults


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index cc6dcaf..a88c58c 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5672,17 +5672,6 @@ dynamic_library_path = 'C:\tools\postgresql;H:\my_project\lib;$libdir'
/para
 
para
-Unlike xref linkend=guc-shared-preload-libraries, there is no
-performance advantage to loading a library at session
-start rather than when it is first used.  Rather, the intent of
-this feature is to allow debugging or performance-measurement
-libraries to be loaded into specific sessions without an explicit
-commandLOAD/ command being given.  For example, debugging could
-be enabled for all sessions under a given user name by setting
-this parameter with commandALTER ROLE SET/.
-   /para
-
-   para
 If a specified library is not found,
 the connection attempt will fail.
/para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Audit of logout

On Mon, Jun 23, 2014 at 5:42 PM, Fujii Masao masao.fu...@gmail.com wrote:
 On Sat, Jun 21, 2014 at 12:59 PM, Joe Conway m...@joeconway.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 06/13/2014 07:29 AM, Tom Lane wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 On Thu, Jun 12, 2014 at 8:51 PM, Fujii Masao
 masao.fu...@gmail.com wrote:
 Some users enable log_disconnections in postgresql.conf to
 audit all logouts. But since log_disconnections is defined with
 PGC_BACKEND, it can be changed at connection start. This means
 that any client (even nonsuperuser) can freely disable
 log_disconnections not to log his or her logout even when the
 system admin enables it in postgresql.conf. Isn't this
 problematic for audit?

 That's harmful for audit purpose. I think that we should make
 log_disconnections PGC_SUSET rather than PGC_BACKEND in order to
 forbid non-superusers from changing its setting. Attached patch
 does this.

 This whole argument seems wrong unless I'm missing something:

 test=# set log_connections = on;
 ERROR:  parameter log_connections cannot be set after connection start
 test=# set log_disconnections = off;
 ERROR:  parameter log_disconnections cannot be set after connection
 start

Hmm... I found that you had marked this proposal as Returned with Feedback.
But I don't think that we reached the consensus to do that. I think that it's
still worth discussing this topic in this CF. So I marked this as Needs Review
again.

If you strongly think that this proposal should be marked as
Returned with Feedback, could you let me know why you think so?

Regards

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WAL format and API changes (9.5)

On Tue, Jul 1, 2014 at 5:51 AM, Heikki Linnakangas hlinnakan...@vmware.com
wrote:

 On 06/27/2014 09:12 AM, Michael Paquier wrote:

 2) Description of XLogBeginInsert, XLogHasBlockRef, XLogGetBlockRef and
 XLogGetBlockRefIds are missing in src/backend/access/transam/README.


 Added descriptions of XLogBeginInsert and XLogHasBlockRef. Ḯ'm not sure
 what to do with the latter two. They're not really intended for use by redo
 routines, but for things like pg_xlogdump that want to extract information
 about any record type.


That's definitely not part of the redo section or WAL construction section.
It could be a new section in the README describing how to get the list of
blocks touched by a WAL record. I'd rather have those APIs documented
somewhere than nowhere, and the README would be well-suited for that (as
well as in-code comments) as those APIs are aimed at developers.


  3) There are a couple of code blocks marked as FIXME and BROKEN. There are
 as well some NOT_USED blocks.


 Yeah. The FIXMEs and BROKEN blocks are all in rm_desc routines. They
 mostly stand for code that used to print block numbers or relfilenodes,
 which doesn't make much sense to do in an ad hoc way in rmgr-specific desc
 routines anymore. Will need to go through them all an decide what's the
 important information to print for each record type.

 The few NOT_USED blocks are around code in redo routines that extract some
 information from the WAL record, that isn't needed anymore. We could remove
 the fields from the WAL records altogether, but might be useful to keep for
 debugging purposes.

They could be removed and be kept as a part of the git history... That's
not really a vital point btw.



  6) XLogRegisterData creates a XLogRecData entry and appends it in one of
 the static XLogRecData entries if buffer_id = 0 or to
 rdata_head/rdata_tail if buffer_id = -1 (for permanent data related to the
 WAL record). XLogRegisterXLogRecData does something similar, but with an
 entry of XLogRecData already built. What do you think about removing
 XLogRegisterXLogRecData and keeping only XLogRegisterData. This makes the
 interface simpler, and XLogRecData could be kept private in xlog.c. This
 would need more work especially on gin side where I am seeing for example
 constructLeafRecompressWALData directly building a XLogRecData entry.


 Another big change in the attached patch version is that
 XLogRegisterData() now *copies* the data to a working area, instead of
 merely adding a XLogRecData entry to the chain. This simplifies things in
 xlog.c, now that the XLogRecData concept is not visible to the rest of the
 system. This is a subtle semantic change for the callers: you can no longer
 register a struct first, and fill the fields afterwards.

 That adds a lot of memcpys to the critical path, which could be bad for
 performance. In practice, I think it's OK, or even a win, because a typical
 WAL record payload is very small. But I haven't measured that. If it
 becomes an issue, we could try to optimize, and link larger data blocks to
 the rdata chain, but memcpy small small chunks of data. There are a few WAL
 record types that don't fit in a small statically allocated buffer, so I
 provided a new function, XLogEnsureSpace(), that can be called before
 XLogBegin() to allocate a large-enough working area.


I just ran the following test on my laptop with your latest patch:
- Patched version:
=# CREATE TABLE foo AS SELECT generate_series(1,1000) as a;
SELECT 1000
Time: 32913.419 ms
=# CREATE TABLE foo2 AS SELECT generate_series(1,1000) as a;
SELECT 1000
Time: 32261.045 ms
- master (d97e98e):
=#  CREATE TABLE foo AS SELECT generate_series(1,1000) as a;
SELECT 1000
Time: 29651.394 ms
=#  CREATE TABLE foo2 AS SELECT generate_series(1,1000) as a;
SELECT 1000
Time: 29734.887 ms
10% difference... That was just a quick test though.

These changes required changing a couple of places where WAL records are
 created:

 * ginPlaceToPage. This function has a strange split of responsibility
 between ginPlaceToPage and dataPlaceToPage/entryPlaceToPage.
 ginPlaceToPage calls dataPlaceToPage or entryPlaceToPage, depending on what
 kind of a tree it is, and dataPlaceToPage or entryPlaceToPage contributes
 some data to the WAL record. Then ginPlaceToPage adds some more, and
 finally calls XLogInsert(). I simplified the SPLIT case so that we now just
 create full page images of both page halves. We were already logging all
 the data on both page halves, but we were doing it in a smarter way,
 knowing what kind of pages they were. For example, for an entry tree page,
 we included an IndexTuple struct for every tuple on the page, but didn't
 include the item pointers. A straight full page image takes more space, but
 not much, so I think in any real application it's going to be lost in
 noise. (again, haven't measured it though)

Thanks, the code looks cleaner this way. Having looked a bit at the way
ginPlaceToPage

Re: [HACKERS] WAL format and API changes (9.5)

On Wed, Jul 2, 2014 at 4:09 PM, Michael Paquier michael.paqu...@gmail.com
wrote:

 3) I noticed a bug in gin redo code path when trying to use the WAL replay
 facility.


Looking at the backtrace, it seems that a page for gin is corrupted. The
buffer capture patch does some sanity checks on the page format when
performing masking and it is failing in one of them on a standby when
kicking ginRedoUpdateMetapage:
if (pd_lower  pd_upper || pd_special  pd_upper ||
pd_lower  SizeOfPageHeaderData || pd_special  BLCKSZ)
{
elog(ERROR, invalid page at %X/%08X\n,
((PageHeader) page)-pd_lsn.xlogid,
((PageHeader) page)-pd_lsn.xrecoff);
}

frame #4: 0x00010437dec5 postgres`CheckForBufferLeaks + 165 at
bufmgr.c:1778
frame #5: 0x00010437df1e postgres`AtProcExit_Buffers(code=1, arg=0)
+ 30 at bufmgr.c:1750
frame #6: 0x00010438fded postgres`shmem_exit(code=1) + 301 at
ipc.c:263
frame #7: 0x00010438fc1c postgres`proc_exit_prepare(code=1) + 124
at ipc.c:187
frame #8: 0x00010438fb63 postgres`proc_exit(code=1) + 19 at
ipc.c:102
frame #9: 0x000104555b3c postgres`errfinish(dummy=0) + 1180 at
elog.c:555
frame #10: 0x0001045590de postgres`elog_finish(elevel=20,
fmt=0x000104633d4f) + 830 at elog.c:1362
frame #11: 0x00010437c1af
postgres`mask_unused_space(page=0x7fff5bc20a70) + 159 at bufcapt.c:78
frame #12: 0x00010437b53d
postgres`mask_heap_page(page=0x7fff5bc20a70) + 29 at bufcapt.c:95
frame #13: 0x00010437b1cd
postgres`buffer_capture_write(newcontent=0x000104ab3980, blkno=0) + 205
at bufcapt.c:329
frame #14: 0x00010437bc7d postgres`buffer_capture_forget(buffer=82)
+ 349 at bufcapt.c:433
frame #15: 0x0001043801c9 postgres`LockBuffer(buffer=82, mode=0) +
233 at bufmgr.c:2773
frame #16: 0x0001043800c8 postgres`UnlockReleaseBuffer(buffer=82) +
24 at bufmgr.c:2554
frame #17: 0x000103fefb03
postgres`ginRedoUpdateMetapage(lsn=335350144, record=0x7fb1740382f0) +
1843 at ginxlog.c:580
frame #18: 0x000103fede96 postgres`gin_redo(lsn=335350144,
record=0x7fb1740382f0) + 534 at ginxlog.c:724
frame #19: 0x0001040ad692 postgres`StartupXLOG + 8482 at xlog.c:6810
frame #20: 0x000104330e0e postgres`StartupProcessMain + 430 at
startup.c:224
frame #21: 0x0001040c64d9 postgres`AuxiliaryProcessMain(argc=2,
argv=0x7fff5bc231b0) + 1897 at bootstrap.c:416
frame #22: 0x00010432b1a8
postgres`StartChildProcess(type=StartupProcess) + 328 at postmaster.c:5090
frame #23: 0x0001043292b9 postgres`PostmasterMain(argc=3,
argv=0x7fb173c044e0) + 5401 at postmaster.c:1212
frame #24: 0x00010426f995 postgres`main(argc=3,
argv=0x7fb173c044e0) + 773 at main.c:219
Note that I have never seen that with vanilla, only with this patch.
Regards,
-- 
Michael

Re: [HACKERS] 9.5 CF1

On Wed, Jul 2, 2014 at 2:44 PM, Abhijit Menon-Sen a...@2ndquadrant.com
wrote:

 WAL format  API changes
 I'm not sure what's happening here. Will look more closely, but
 updates are welcome from the people who've been participating in
 the discussion/review.

Patch has been reviewed once. Heikki has submitted an updated that still
has some issues, nothing unbearable though IMO. I imagine that there are
still some discussions needed for the potention performance drop that this
patch could create because of the individual memcpy calls that are
performed for each WAL record registration.
Regards,
-- 
Michael

Re: [HACKERS] Spinlocks and compiler/memory barriers

On 2014-07-01 20:21:37 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote:
  Also if you're struggling for Sun buildfarm animals, recent versions of 
  QEMU
  will quite happily install and run later versions of 32-bit Solaris over
  serial, and 2.0 even manages to give you a cgthree framebuffer for the full
  experience.
 
  Well. I have to admit I'm really not interested in investing that much
  time in something I've no stake in. If postgres developers have to put
  emulated machines to develop features something imo went seriously
  wrong. That's more effort than at least I'm willing to spend.
 
 Perhaps more to the point, I have no faith at all that an emulator will
 mimic multiprocessor timing behavior to the level of detail needed to
 tell whether memory-barrier-related logic works.  See the VAX discussion
 just a couple days ago.

Well, it would allow us to see wether fixed stuff actually compiles and
runs - that's not nothing. The biggest problem with fixing stuff like
armv5, sparc8, whatever is that it requires adding stuff to our inline
assembly. It's easy to accidentally make it not compile, but
comparatively harder to make the behaviour even worse than before.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] better atomics - v0.5

On 2014-07-02 09:27:52 +0300, Ants Aasma wrote:
 On Fri, Jun 27, 2014 at 9:00 PM, Andres Freund and...@2ndquadrant.com wrote:
  Imagine the situation for the buffer header spinlock which is one of the
  bigger performance issues atm. We could aim to replace all usages of
  that with clever and complicated logic, but it's hard.
 
  The IMO reasonable (and prototyped) way to do it is to make the common
  paths lockless, but fall back to the spinlock for the more complicated
  situations. For the buffer header that means that pin/unpin and buffer
  lookup are lockless, but IO and changing the identity of a buffer still
  require the spinlock. My attempts to avoid the latter basically required
  a buffer header specific reimplementation of spinlocks.
 
 There is a 2010 paper [1] that demonstrates a fully non-blocking
 approach to buffer management using the same generalized clock
 algorithm that PostgreSQL has. The site also has an implementation for
 Apache Derby. You may find some interesting ideas in there.
 
 [1] 
 http://code.google.com/p/derby-nb/source/browse/trunk/derby-nb/ICDE10_conf_full_409.pdf

Interesting. Thanks for the link. I think I had pretty much all the
parts they described lockless as well (excluding the buffer mapping
hashtable itself, which they didn't focus on either), it was just
operations like replacing a dirty victim buffer where I fell back to the
spinlock.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Spinlocks and compiler/memory barriers

2014-07-02 Thread Mark Cave-Ayland


On 02/07/14 08:36, Andres Freund wrote:


On 2014-07-01 20:21:37 -0400, Tom Lane wrote:

Andres Freund and...@2ndquadrant.com writes:

On 2014-07-01 23:21:07 +0100, Mark Cave-Ayland wrote:

Also if you're struggling for Sun buildfarm animals, recent versions of QEMU
will quite happily install and run later versions of 32-bit Solaris over
serial, and 2.0 even manages to give you a cgthree framebuffer for the full
experience.



Well. I have to admit I'm really not interested in investing that much
time in something I've no stake in. If postgres developers have to put
emulated machines to develop features something imo went seriously
wrong. That's more effort than at least I'm willing to spend.


Perhaps more to the point, I have no faith at all that an emulator will
mimic multiprocessor timing behavior to the level of detail needed to
tell whether memory-barrier-related logic works.  See the VAX discussion
just a couple days ago.


Well, it would allow us to see wether fixed stuff actually compiles and
runs - that's not nothing. The biggest problem with fixing stuff like
armv5, sparc8, whatever is that it requires adding stuff to our inline
assembly. It's easy to accidentally make it not compile, but
comparatively harder to make the behaviour even worse than before.


The point I wanted to make was that there are certain applications for 
which SPARCv8 is still certified for particular military/aerospace use. 
While I don't use it myself, some people are still using it enough to 
want to contribute QEMU patches.


In terms of QEMU, the main reason for mentioning it was that if the 
consensus were to keep SPARCv8 support then this could be one possible 
way to provide basic buildfarm support (although as Tom rightly points 
out, there would need to be testing to build up confidence that the 
emulator does the right thing in a multiprocessor environment).



ATB,

Mark.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

At 2014-06-30 10:59:22 -0400, robertmh...@gmail.com wrote:

 That means we need some review of the patch for what it is, which
 there hasn't been much of, yet.

Regardless of the eventual fate of pgaudit, we'd certainly welcome some
review of the code.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

At 2014-07-01 21:39:27 +0900, maumau...@gmail.com wrote:

 Won't it be burden and a headache to maintain pgaudit code when it
 becomes obsolete in the near future?

Maybe it's a bit unfair to single out this statement to respond to,
because it seems at best tangential to your larger point, but:

If it were to really become obsolete (not sure about the near future),
it wouldn't need much maintenance. It already works about as well as it
ever will on older releases (e.g., we have no hopes of ever backporting
enough of event triggers to provide DDL deparsing in 9.3).

 I'm afraid they would be disappointed if PostgreSQL provides auditing
 functionality which does not conform to any real regulations like PCI
 DSS, NIST

I foresee lots of disappointment, then. I don't think even Stephen is
advocating NIST-compliance as the *baseline* for serious auditing in
core, just that we need a design that lets us get there sometime.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] WAL replay bugs

2014-07-02 Thread Kyotaro HORIGUCHI

Hello,

 Thanks for your comments. Looking forward to seeing some more input.

Thank you. bufcapt.c was a poser.

bufcapt.c: 326 memcpy(tail, page[BLCKSZ - 2], 2);

  This seems duzzling..  Isn't *(uint16*)(page[BLCKSZ - 2]) applicable?

bufcapt.c: 331 else if (PageGetSpecial

  Generally saying, the code to identify the type of page is too
  heuristic and seems fragile.

  Pehaps this is showing that no tidy or generalized way to tell
  what a page is used for. Many of the modules which have their
  own page format has a magic value and the values seem to be
  selected carefully. But no one-for-all method to retrieve that.

  Each type of page can be confirmed by the following way *if*
  its type is previously *hinted* except for gin.

  btree: 32bit magic at pd-opaque
  gin  : No magic
  gist : 16-bit magic at ((GISTPageOpaque*)pd-opaque)-gist_page_id
  spgist   : 16-bit magic at ((SpGistPageOpaque*)pd-opaque)-spgist_page_id
  hash : 16-bit magic at ((HashPageOpaque*)pd-paque)-hasho_page_id
  sequence : 16-bit magic at pd-opaque, the last 2 bytes of the page.

  # Is it comprehensive? and correct?

  The majority is 16-bit magic at the TAIL of opaque struct. If
  we can unify them into , say, 16-bit magic at
  *(uint16*)(pd-opaque) the sizeof() labyrinth become stable and
  simple and other tools which should identify the type of pages
  will be happy. Do you think we can do that?


# Sorry, time's up for today.


  - contrib/buffer_capture_cmp/README
..
 Yeah right... This was a rest of some previous hacking on this feature.
 Paragraph was rather unclear so I rewrote it, mentioning that the custom
 script can use PGPORT to connect to the node where tests can be run.
 
 - contrib/buffer_capture_cmp/Makefile
 
   make check does nothing when BUFFER_CAPTURE is not defined, as
...
 Sure, I added such a message in the makefile.
 
 - buffer_capture_cmp.c
 
This source generates void executable when BUFFER_CAPTURE is
..
 Done. The compilation of this utility is now independent on BUFFER_CAPTURE.
 At the same time I made test.sh a bit smarter to have it grab the value of
 BUFFER_CAPTURE_FILE directly from bufcapt.h.
 
 - buffer_capture_cmp.c/main()
 
The parameters for this command are the parent directories for
...
 Fixed. I changed back the utility to directly file names instead of data
 folders as arguments.
 
 Updated patches addressing those comments are attached.
 Regards,

Thank you I'll look into them later.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RLS Design

2014-07-02 Thread Yeb Havinga


On 01/07/14 21:51, Robert Haas wrote:

On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed dean.a.rash...@gmail.com wrote:


That seems like a pretty strong argument.

If RLS quals are instead regarded as constraints on access, and
multiple policies apply, then it seems that the quals should now be
combined with AND rather than OR, right?

Yeah, maybe.  I intuitively feel that OR would be more useful, so it
would be nice to find a design where that makes sense.
Looking at the use cases we described earlier in 
http://www.postgresql.org/message-id/attachment/32196/mini-rim.sql I see 
more OR than AND, for instance 'if the row is sensitive then the user 
must be related to the row' which translates to (NOT sensitive) OR the 
user is related.


An addition to that rule could be a breakglass method or other reasons 
to access, e.g.
(NOT sensitive) OR user is related OR break glass OR legally required 
access.



   But it depends
a lot, in my view, on what syntax we end up with.  For example,
suppose we add just one command:

ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;

If the given role inherits from multiple roles that have different
filters, I think the user will naturally expect all of the filters to
be applied.


Suppose a building administrator gives a single person that has multiple 
roles multiple key cards to access appropriate rooms in a building. You 
could draw a venn diagram of the rooms those key cards open, and the 
intuition here probably is that the person can enter any room if one of 
the key cards matches, not all cards.



But you could do it other ways.  For example:

ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;

If a table is set to NO ROW LEVEL SECURITY then it behaves just like
it does now: anyone who accesses it sees all the rows, restricted to
those columns for which they have permission.  If the table is set to
ROW LEVEL SECURITY then the default is to show no rows.  The second
command then allows access to a subset of the rows for a give role
name.  In this case, it is probably logical for access to be combined
via OR.


regards,
Yeb Havinga



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Add the number of pinning backends to pg_buffercache's output

2014-07-02 Thread Rajeev rastogi

On 23 June 2014 15:22, Andres Freund Wrote:
 
  This may be harmless but pinning_backends should be defined as int4
  rather than int8 because BufferDesc-refcount is just defined as
  unsigned and it's converted to Datum by Int32GetDatum().
 
 Well, in theory a uint32 can't generally be converted to a int32.
 That's why I chose a int64 because it's guaranteed to have sufficient
 range. It's fairly unlikely to have that many pins, but using a int64
 seems cheap enough here.

But since we have declared pinning_backends as int8, we should use 
Int64GetDatum to form the tuple datum.

Using Int32GetDatum will lead to issue specially incase of WIN32 platform or 
any other platform where int8
is not considered as USE_FLOAT8_BYVAL (or FLOAT8PASSBYVAL as 0). On such 
platform while initializing the tuple, 
type by value will be marked as false but still value (not address) will be 
passed to datum, which will lead to crash.

I have done testing to verify this on win32 and able to observe the crash where 
as it works fine on Linux.

Also can we change the description of function pg_buffercache_pages to 
include pinning_backend also in the description.

Thanks and Regards,
Kumar Rajeev Rastogi






-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gaussian distribution pgbench

2014-07-02 Thread Fabien COELHO



Hello Mitsumasa-san,


And I'm also interested in your decile percents output like under
followings,
decile percents: 39.6% 24.0% 14.6% 8.8% 5.4% 3.3% 2.0% 1.2% 0.7% 0.4%


Sure, I'm really fine with that.


I think that it is easier than before. Sum of decile percents is just 100%.


That's a good property:-)

However, I don't prefer highest/lowest percentage because it will be 
confused with decile percentage for users, and anyone cannot understand 
this digits. I cannot understand 4.9%, 0.0% when I see the first time. 
Then, I checked the source code, I understood it:( It's not good 
design... #Why this parameter use 100?


What else? People have ten fingers and like powers of 10, and are used to 
percents?



So I'd like to remove it if you like. It will be more simple.


I think that for the exponential distribution it helps, especially for 
high threshold, to have the lowest/highest percent density. For low 
thresholds, the decile is also definitely useful. So I'm fine with both 
outputs as you have put them.


I have just updated the wording so that it may be clearer:

 decile percents: 69.9% 21.0% 6.3% 1.9% 0.6% 0.2% 0.1% 0.0% 0.0% 0.0%
 probability of fist/last percent of the range: 11.3% 0.0%


Attached patch is fixed version, please confirm it.


Attached a v15 which just fixes a typo and the above wording update. I'm 
validating it for committers.



#Of course, World Cup is being held now. I'm not hurry at all.


I'm not a soccer kind of person, so it does not influence my 
availibility.:-)



Suggested commit message:

Add drawing random integers with a Gaussian or truncated exponentional 
distributions to pgbench.


Test variants with these distributions are also provided and triggered
with options --gaussian=... and --exponential=


Have a nice day/night,

--
Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 4aa8a50..3541b7e 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -41,6 +41,7 @@
 #include math.h
 #include signal.h
 #include sys/time.h
+#include assert.h
 #ifdef HAVE_SYS_SELECT_H
 #include sys/select.h
 #endif
@@ -98,6 +99,8 @@ static int	pthread_join(pthread_t th, void **thread_return);
 #define LOG_STEP_SECONDS	5	/* seconds between log messages */
 #define DEFAULT_NXACTS	10		/* default nxacts */
 
+#define MIN_GAUSSIAN_THRESHOLD		2.0	/* minimum threshold for gauss */
+
 int			nxacts = 0;			/* number of transactions per client */
 int			duration = 0;		/* duration in seconds */
 
@@ -171,6 +174,14 @@ bool		is_connect;			/* establish connection for each transaction */
 bool		is_latencies;		/* report per-command latencies */
 int			main_pid;			/* main process id used in log filename */
 
+/* gaussian distribution tests: */
+double		stdev_threshold;   /* standard deviation threshold */
+booluse_gaussian = false;
+
+/* exponential distribution tests: */
+double		exp_threshold;   /* threshold for exponential */
+bool		use_exponential = false;
+
 char	   *pghost = ;
 char	   *pgport = ;
 char	   *login = NULL;
@@ -332,6 +343,88 @@ static char *select_only = {
 	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
 };
 
+/* --exponential case */
+static char *exponential_tpc_b = {
+	\\set nbranches  CppAsString2(nbranches)  * :scale\n
+	\\set ntellers  CppAsString2(ntellers)  * :scale\n
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts exponential :exp_threshold\n
+	\\setrandom bid 1 :nbranches\n
+	\\setrandom tid 1 :ntellers\n
+	\\setrandom delta -5000 5000\n
+	BEGIN;\n
+	UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n
+	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
+	UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n
+	UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n
+	INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n
+	END;\n
+};
+
+/* --exponential with -N case */
+static char *exponential_simple_update = {
+	\\set nbranches  CppAsString2(nbranches)  * :scale\n
+	\\set ntellers  CppAsString2(ntellers)  * :scale\n
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts exponential :exp_threshold\n
+	\\setrandom bid 1 :nbranches\n
+	\\setrandom tid 1 :ntellers\n
+	\\setrandom delta -5000 5000\n
+	BEGIN;\n
+	UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n
+	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
+	INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n
+	END;\n
+};
+
+/* --exponential with -S case */
+static char *exponential_select_only = {
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts exponential :exp_threshold\n
+	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
+};
+
+/* --gaussian case */
+static char *gaussian_tpc_b = {
+	\\set nbranches

[HACKERS] Missing IPv6 for pgbuildfarm.org

2014-07-02 Thread Torsten Zuehlsdorff


Hello,

i've tried to setup a FreeBSD 10 machine as buildfarm-member. But it's 
an IPv6 only machine and there is no IPv6 for the homepage.


Can anyone add support for IPv6 to it?

Greetings,
Torsten


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5 CF1

2014-07-02 Thread Etsuro Fujita


(2014/07/02 14:06), Abhijit Menon-Sen wrote:


Foreign table inheritance
 Moved from ready for committer back to waiting on author after
 Noah's review comments. Should we expect an updated patch to be
 posted in this CF?

 
http://archives.postgresql.org/message-id/20140702022327.gc1586...@tornado.leadboat.com


Noah, thank you for the review!

In addition to the issues pointed out by Noah, the following should be 
resolved, I think.  And ISTM we should work on the latter first, for 
which I plan to add a separate patch to the next CF.  So, I think it 
would be better to move this to the next CF.


http://www.postgresql.org/message-id/3492.1404136...@sss.pgh.pa.us

Thanks,

Best regards,
Etsuro Fujita


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-07-02 Thread Christoph Berg

Re: Tom Lane 2014-07-01 20654.1404247...@sss.pgh.pa.us
 Yeah, I'm unexcited about this proposal.  In any case, given the two
 existing APIs we have to deal with, allowing PG_OOM_ADJUST_VALUE to
 default to 0 is sane in both APIs but a default for the file name
 can work for only one.

Nod.

 Fair enough.  I went for a minimum-change approach when hacking that
 script, but we could change it some more in the name of readability.
 Will do something about that.

Thanks, it's much nicer now. There's one uglyness left though: The
name PG_CHILD_OOM_SCORE_ADJ should match what actually gets passed to
the backends,

DAEMON_ENV=PG_OOM_ADJUST_FILE=$PG_OOM_ADJUST_FILE 
PG_OOM_ADJUST_VALUE=$PG_CHILD_OOM_SCORE_ADJ

would better be PG_OOM_ADJUST_VALUE=$PG_OOM_ADJUST_VALUE.

(Possibly the smart way to fix this would be to change
src/backend/postmaster/fork_process.c to use PG_CHILD_OOM_SCORE_ADJ
instead.)

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5 CF1

Thanks for the update. I have marked the patch returned with feedback
and moved it to the 2014-08 CF.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-07-02 Thread Simon Riggs

On 1 July 2014 18:32, Stephen Frost sfr...@snowman.net wrote:

 Having functions to control the auditing would work, but it's not
 exactly the ideal approach, imv, and

What aspect is less than ideal?

 the only reason it's being
 discussed here is because it might be a way to allow an extension to
 provide the auditing- not because it's actually a benefit to anyone.

That is a false statement, as well as being a personal one. It's sad
to hear personal comments in this.

It seems strange to be advocating new grammar at a time when we are
actively reducing the size of it (see recent commits and long running
hackers discussions). Functions don't carry the same overhead, in fact
they cost nothing if you're not using them, which is the most
important point.

The right to execute functions can be delegated easily to any group
that wants access. There is no special benefit to SQL grammar on that
point.


 However, if we have such functions in a contrib extension, I worry what
 the pg_upgrade path is from that extension to an in-core solution.

Functions are already used heavily for many aspects of PostgreSQL.
http://www.postgresql.org/docs/devel/static/functions-admin.html

Presumably you don't advocate an in core solution to replace
pg_cancel_backend() etc?

My proposed route for making this in-core is simply to accept that
the extension is in core. Auditing should, in my view, always be
optional, since not everyone needs it. Cryptographic functions aren't
in-core either and I'm guessing various security conscious
organizations will use them and be happy. How does pgaudit differ from
pgcrypto?


Given the tone of this discussion, I don't see it going anywhere
further anytime soon - that is good since there is no big rush.
pgaudit is a sincere attempt to add audit functionality to Postgres.
If you or anyone else wants to make a similarly sincere attempt to add
audit functionality to Postgres, lets see the design and its
connection to requirements.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5 CF1

2014-07-02 Thread Etsuro Fujita


(2014/07/02 18:19), Abhijit Menon-Sen wrote:

Thanks for the update. I have marked the patch returned with feedback
and moved it to the 2014-08 CF.


OK  I've changed the status from returned with feedback to Waiting on 
Author.


Thanks,

Best regards,
Etsuro Fujita


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-07-02 Thread Jeevan Chalke

On Sun, Jun 29, 2014 at 4:18 PM, David Rowley dgrowle...@gmail.com wrote:

 I think I'm finally ready for a review again, so I'll update the
 commitfest app.


I have reviewed this on code level.

1. Patch gets applied cleanly.
2. make/make install/make check all are fine

No issues found till now.

However some cosmetic points:

1.
 * The API of this function is identical to convert_ANY_sublink_to_join's,
 * except that we also support the case where the caller has found NOT
EXISTS,
 * so we need an additional input parameter under_not.

Since now, we do support NOT IN handling in convert_ANY_sublink_to_join() we
do have under_not parameter there too. So above comments near
convert_EXISTS_sublink_to_join() function is NO more valid.


2. Following is the unnecessary change. Can be removed:

@@ -506,6 +560,7 @@ pull_up_sublinks_qual_recurse(PlannerInfo *root, Node
*node,
return NULL;
}
}
+
}
/* Else return it unmodified */
return node;


3. Typos:

a.
+ * queryTargetListListCanHaveNulls
...
+queryTargetListCanHaveNulls(Query *query)

Function name is queryTargetListCanHaveNulls() not
queryTargetListListCanHaveNulls()

b.
 *For example the presense of; col IS NOT NULL, or col = 42 would
allow

presense = presence


4. get_attnotnull() throws an error for invalid relid/attnum.
But I see other get_att* functions which do not throw an error rather
returning some invalid value, eg. NULL in case of attname, InvalidOid in
case of atttype and InvalidAttrNumber in case of attnum. I have observed
that we cannot return such invalid value for type boolean and I guess that's
the reason you are throwing an error. But somehow looks unusual. You had put
a note there which is good. But yes, caller of this function should be
careful enough and should not assume its working like other get_att*()
functions.
However for this patch, I would simply accept this solution. But I wonder if
someone thinks other way round.


Testing more on SQL level.

However, assigning it to author to think on above cosmetic issues.

Thanks

-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company

[HACKERS] Aggregate function API versus grouping sets

I've been assisting Atri with development of an implementation of
GROUPING SETS, beginning with a one-pass implementation of ROLLUP.
Two issues have arisen regarding the API for calling aggregate
transition and final functions that I think need answering now,
since they relate to changes in 9.4.

1. Assumptions about the relationship between aggcontext and fn_extra

Tom's version of the ordered-set aggregate code makes the assumption
that it is safe to store the aggcontext returned by AggCheckCallContext
in a structure hung off flinfo-fn_extra. This implicitly assumes that
the function will be called in only one aggcontext, or at least only one
per flinfo.

Doing rollup via GroupAggregate by maintaining multiple transition
values at a time (one per grouping set) means that the transfn is being
called interleaved for transition values in different contexts. So the
question becomes: is it wrong for the transition function to assume that
aggcontext can be cached this way, or is it necessary for the executor
to use a separate flinfo for each concurrent grouping set?

2. AggGetPerAggEContext

The comment documents this as returning the per-output-tuple econtext,
and the ordered-set code assumes that the rescan of this context implies
that the aggcontext is also about to be reset (so cleanup functions can
be called).

Since it seems that the cleanup callback is the sole reason for this
function to exist, is it acceptable to remove any implication that the
context returned is the overall per-output-tuple one, and simply state
that it is a context whose cleanup functions are called at the
appropriate time before aggcontext is reset? (In fact what I'm
considering is making aggcontext be the per-tuple memory context of an
ExprContext created for each grouping set, and having
AggGetPerAggEContext return this - but it won't be the actual context in
which the result projection happens.)

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Allowing NOT IN to use ANTI joins

2014-07-02 Thread David Rowley

On Wed, Jul 2, 2014 at 9:25 PM, Jeevan Chalke 
jeevan.cha...@enterprisedb.com wrote:



 Testing more on SQL level.



I'm just looking into an issue I've found in the find_inner_rels()
function, where it does not properly find the rel in the from list in
certain cases, for example:

explain select * from a where id not in (select b.id from b left outer join
c on b.id=c.id);

fails to use an ANTI JOIN, but if you remove the left join to c, it works
perfectly.

Currently I'm just getting my head around how the jointree is structured
and reading over deconstruct_jointree to see how it handles this. I may
change the function to find_outer_rels and just look for outer joins in the
function.

However, assigning it to author to think on above cosmetic issues.


Thanks for the review. I'll fix the issues you listed soon, but I'll likely
delay posting the updated patch until I have the other fix in place.

Regards

David Rowley

Thanks

 --
 Jeevan B Chalke
 Principal Software Engineer, Product Development
 EnterpriseDB Corporation
 The Enterprise PostgreSQL Company

Re: [HACKERS] Selectivity estimation for inet operators

2014-07-02 Thread Dilip kumar

On, 15 May 2014 14:04 Emre Hasegeli Wrote, 

 
 * matching first MCV to second MCV
 * searching first MCV in the second histogram
 * searching second MCV in the first histogram
 * searching boundaries of the first histogram in the second histogram
 
 Comparing the lists with each other slows down the function when
 statistics set to higher values. To avoid this problem I only use
 log(n) values of the lists. It is the first log(n) value for MCV,
 evenly separated values for histograms. In my tests, this optimization
 does not affect the planning time when statistics = 100, but does
 affect accuracy of the estimation. I can send the version without this
 optimization, if slow down with larger statistics is not a problem
 which should be solved on the selectivity estimation function.


I have started reviewing this patch, so far I have done basic reviews and some 
testing/debugging.

1. Patch applied to git head.
2. Basic testing works fine.

I have one query,

In inet_his_inclusion_selec function, 
When the constant matches only the right side of the bucket, and if it’s a last 
bucket then it's never considered as partial match candidate.
In my opinion, if it's not a last bucket then for next bucket it will become 
left boundary and this will be treated as partial match so no problem, but 
in-case of last bucket it can give wrong selectivity.

Can't we consider it as partial bucket match if it is last bucket ?

Apart from that there is one spell check you can correct
-- in inet_his_inclusion_selec comments
histogram boundies  - histogram boundaries :)

Thanks  Regards,
Dilip Kumar





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gaussian distribution pgbench

2014-07-02 Thread Fabien COELHO





I have just updated the wording so that it may be clearer:


Oops, I have sent the wrong patch, without the wording fix. Here is the 
real updated version, which I tested.



probability of fist/last percent of the range: 11.3% 0.0%


--
Fabien.diff --git a/contrib/pgbench/pgbench.c b/contrib/pgbench/pgbench.c
index 4aa8a50..f8ad17e 100644
--- a/contrib/pgbench/pgbench.c
+++ b/contrib/pgbench/pgbench.c
@@ -41,6 +41,7 @@
 #include math.h
 #include signal.h
 #include sys/time.h
+#include assert.h
 #ifdef HAVE_SYS_SELECT_H
 #include sys/select.h
 #endif
@@ -98,6 +99,8 @@ static int	pthread_join(pthread_t th, void **thread_return);
 #define LOG_STEP_SECONDS	5	/* seconds between log messages */
 #define DEFAULT_NXACTS	10		/* default nxacts */
 
+#define MIN_GAUSSIAN_THRESHOLD		2.0	/* minimum threshold for gauss */
+
 int			nxacts = 0;			/* number of transactions per client */
 int			duration = 0;		/* duration in seconds */
 
@@ -171,6 +174,14 @@ bool		is_connect;			/* establish connection for each transaction */
 bool		is_latencies;		/* report per-command latencies */
 int			main_pid;			/* main process id used in log filename */
 
+/* gaussian distribution tests: */
+double		stdev_threshold;   /* standard deviation threshold */
+booluse_gaussian = false;
+
+/* exponential distribution tests: */
+double		exp_threshold;   /* threshold for exponential */
+bool		use_exponential = false;
+
 char	   *pghost = ;
 char	   *pgport = ;
 char	   *login = NULL;
@@ -332,6 +343,88 @@ static char *select_only = {
 	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
 };
 
+/* --exponential case */
+static char *exponential_tpc_b = {
+	\\set nbranches  CppAsString2(nbranches)  * :scale\n
+	\\set ntellers  CppAsString2(ntellers)  * :scale\n
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts exponential :exp_threshold\n
+	\\setrandom bid 1 :nbranches\n
+	\\setrandom tid 1 :ntellers\n
+	\\setrandom delta -5000 5000\n
+	BEGIN;\n
+	UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n
+	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
+	UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n
+	UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n
+	INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n
+	END;\n
+};
+
+/* --exponential with -N case */
+static char *exponential_simple_update = {
+	\\set nbranches  CppAsString2(nbranches)  * :scale\n
+	\\set ntellers  CppAsString2(ntellers)  * :scale\n
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts exponential :exp_threshold\n
+	\\setrandom bid 1 :nbranches\n
+	\\setrandom tid 1 :ntellers\n
+	\\setrandom delta -5000 5000\n
+	BEGIN;\n
+	UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n
+	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
+	INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n
+	END;\n
+};
+
+/* --exponential with -S case */
+static char *exponential_select_only = {
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts exponential :exp_threshold\n
+	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
+};
+
+/* --gaussian case */
+static char *gaussian_tpc_b = {
+	\\set nbranches  CppAsString2(nbranches)  * :scale\n
+	\\set ntellers  CppAsString2(ntellers)  * :scale\n
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n
+	\\setrandom bid 1 :nbranches\n
+	\\setrandom tid 1 :ntellers\n
+	\\setrandom delta -5000 5000\n
+	BEGIN;\n
+	UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n
+	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
+	UPDATE pgbench_tellers SET tbalance = tbalance + :delta WHERE tid = :tid;\n
+	UPDATE pgbench_branches SET bbalance = bbalance + :delta WHERE bid = :bid;\n
+	INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n
+	END;\n
+};
+
+/* --gaussian with -N case */
+static char *gaussian_simple_update = {
+	\\set nbranches  CppAsString2(nbranches)  * :scale\n
+	\\set ntellers  CppAsString2(ntellers)  * :scale\n
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts gaussian :stdev_threshold\n
+	\\setrandom bid 1 :nbranches\n
+	\\setrandom tid 1 :ntellers\n
+	\\setrandom delta -5000 5000\n
+	BEGIN;\n
+	UPDATE pgbench_accounts SET abalance = abalance + :delta WHERE aid = :aid;\n
+	SELECT abalance FROM pgbench_accounts WHERE aid = :aid;\n
+	INSERT INTO pgbench_history (tid, bid, aid, delta, mtime) VALUES (:tid, :bid, :aid, :delta, CURRENT_TIMESTAMP);\n
+	END;\n
+};
+
+/* --gaussian with -S case */
+static char *gaussian_select_only = {
+	\\set naccounts  CppAsString2(naccounts)  * :scale\n
+	\\setrandom aid 1 :naccounts

Re: [HACKERS] New functions in sslinfo module

2014-07-02 Thread Воронин Дмитрий

24.06.2014, 00:07, "Andreas Karlsson" andr...@proxel.se: On 04/21/2014 07:51 AM, Воронин Дмитрий wrote:  I put patch generated on git diffs to this letter. I make an a thread in  postgresql commit fest:  https://commitfest.postgresql.org/action/patch_view?id=1438 Thanks for the patch, it seems like a useful feature. === General === - Applies cleanly to HEAD and compiles without warnings. - The test suite passes. - x509v3 support was added in OpenSSL 0.9.2 and sslinfo already depends heavily on OpenSSL so no new dependencies. === User functionality === - If you are a user of the sslinfo extension the new functions should be useful additions. - I tested the code without SSL, with certificate but without client certificate, with client certificates first without extensions and the with. All of this worked fine except for some usability which could be improved, see below. - I cannot see the use for ssl_get_count_of_extensions(). When would anyone need to know the number of extensions? I think this is an implementation detail of OpenSSL which we do not need to expose. If any user wants this feature he can count the extension names. - Documentation is missing for the new functions. - I think the names of the new functions should be change. Below are my suggestions. Other suggestions are welcome. * ssl_extension_value(text) * ssl_extension_is_critical() * ssl_extension_names() * ssl_extension_count() (If we should keep it.) - Would it be interesting to have a set returning function which returns all extensions with both the names and the values? Like the below. $ SELECT * FROM ssl_extensions(); name   |    value --+--   basicConstraints | CA:FALSE   keyUsage | Digital Signature, Non Repudiation, Key Encipherment (2 rows) - I do not think that ssl_get_extension_names() should return a single row with NULL when the certificate has no extensions or when there is no certificate at all. Returning zero rows in this case should make it easier to use. - Currently ssl_is_critical_extension() and ssl_get_extension_value() throw an error when the requested extension is not in the certificate. I am not sure if I like this behavior. I think I would prefer if the code always throws an error when name lookup fails, and never when it is successful. For a valid extension name I think NULL should be returned when it does not exist. === Code review: main === - Why are X509_NAME_field_to_text(), X509_NAME_to_text(), ASN1_STRING_to_text() and get_extension() not static? None of these are a symbol which should be exported. - I have not worked with extension myself, but since your change adds functions to the extension I think you need to create a version 1.1 instead of modifying 1.0 in place. If so you also need to write an upgrade script from 1.0 to 1.1. See dblink--1.0--1.1.sql for an example. - Please break out the comment fix for ssl_cipher() into a separate patch. - Why do you use pg_do_encoding_conversion() over pg_any_to_server()? pg_any_to_server() is implemented using pg_do_encoding_conversion(). - I think you should use OBJ_txt2nid() rather than OBJ_sn2nid() + OBJ_ln2nid(). You should probably also use OBJ_txt2obj() since X509_get_ext_by_NID() will call OBJ_nid2obj() anyway. - You should free the extension_name string. I do not think it is ok to leak it to the end of the query. - I think you need to convert the extension values and names to the server encoding. I just wonder if we need to support data which is incorrectly encoded. === Code review: style issues === - Trailing whitespace in sslinfo--1.0.sql and sslinfo.c.q - sslinfo--1.0.sql does not end in a newline. - I do not think the PostgreSQL project adds authors in the top comment of files in cases like this. Authors get credit in the commit messages. - I think you can remove the prototypes of all the ssl_* functions. - Adding the have in "Indicates whether current client have provided a certificate" is not necessary. The previous wording looks correct to my non-native speaker eyes. - Too much white space in variable declarations in get_extension(). - Extra space before -1 in "X509_get_ext_by_NID(certificate, extension_nid,  -1);" - Please do not initialize variables unless necessary. Compilers are pretty good at warning about uninitialized usage. For example both locate and extension_nid do not need to be initialized. - Remove empty line directly before ssl_get_extension_value(). - Try to follow variable naming conventions from other functions (e.g. use nid rather than extension_nid, membuf rather than bio, sp rather than value). - I am pretty sure the variable you call locate should be called location (or loc for short). - There should not be any spaces around "-". - The declaration of *extension in ssl_get_extension_value is not aligned properly. - Remove white space in variable declaration in ssl_get_count_of_extensions(). - Incorrectly alignment of variable

Re: [HACKERS] New functions in sslinfo module

2014-07-02 Thread Воронин Дмитрий

Oh, how can I write a documentation for my functions?

02.07.2014, 16:17, Воронин Дмитрий carriingfat...@yandex.ru:
 24.06.2014, 00:07, Andreas Karlsson andr...@proxel.se:
  On 04/21/2014 07:51 AM, Воронин Дмитрий wrote:
   I put patch generated on git diffs to this letter. I make an a thread in
   postgresql commit fest:
   https://commitfest.postgresql.org/action/patch_view?id=1438
  Thanks for the patch, it seems like a useful feature.

  === General ===

  - Applies cleanly to HEAD and compiles without warnings.

  - The test suite passes.

  - x509v3 support was added in OpenSSL 0.9.2 and sslinfo already depends
  heavily on OpenSSL so no new dependencies.

  === User functionality ===

  - If you are a user of the sslinfo extension the new functions should be
  useful additions.

  - I tested the code without SSL, with certificate but without client
  certificate, with client certificates first without extensions and the
  with. All of this worked fine except for some usability which could be
  improved, see below.

  - I cannot see the use for ssl_get_count_of_extensions(). When would
  anyone need to know the number of extensions? I think this is an
  implementation detail of OpenSSL which we do not need to expose. If any
  user wants this feature he can count the extension names.

  - Documentation is missing for the new functions.

  - I think the names of the new functions should be change. Below are my
  suggestions. Other suggestions are welcome.

  * ssl_extension_value(text)
  * ssl_extension_is_critical()
  * ssl_extension_names()
  * ssl_extension_count() (If we should keep it.)

  - Would it be interesting to have a set returning function which returns
  all extensions with both the names and the values? Like the below.

  $ SELECT * FROM ssl_extensions();
  name   |    value
  --+--
    basicConstraints | CA:FALSE
    keyUsage | Digital Signature, Non Repudiation, Key Encipherment
  (2 rows)

  - I do not think that ssl_get_extension_names() should return a single
  row with NULL when the certificate has no extensions or when there is no
  certificate at all. Returning zero rows in this case should make it
  easier to use.

  - Currently ssl_is_critical_extension() and ssl_get_extension_value()
  throw an error when the requested extension is not in the certificate.

  I am not sure if I like this behavior. I think I would prefer if the
  code always throws an error when name lookup fails, and never when it is
  successful. For a valid extension name I think NULL should be returned
  when it does not exist.

  === Code review: main ===

  - Why are X509_NAME_field_to_text(), X509_NAME_to_text(),
  ASN1_STRING_to_text() and get_extension() not static? None of these are
  a symbol which should be exported.

  - I have not worked with extension myself, but since your change adds
  functions to the extension I think you need to create a version 1.1
  instead of modifying 1.0 in place. If so you also need to write an
  upgrade script from 1.0 to 1.1. See dblink--1.0--1.1.sql for an example.

  - Please break out the comment fix for ssl_cipher() into a separate patch.

  - Why do you use pg_do_encoding_conversion() over pg_any_to_server()?
  pg_any_to_server() is implemented using pg_do_encoding_conversion().

  - I think you should use OBJ_txt2nid() rather than OBJ_sn2nid() +
  OBJ_ln2nid(). You should probably also use OBJ_txt2obj() since
  X509_get_ext_by_NID() will call OBJ_nid2obj() anyway.

  - You should free the extension_name string. I do not think it is ok to
  leak it to the end of the query.

  - I think you need to convert the extension values and names to the
  server encoding. I just wonder if we need to support data which is
  incorrectly encoded.

  === Code review: style issues ===

  - Trailing whitespace in sslinfo--1.0.sql and sslinfo.c.q

  - sslinfo--1.0.sql does not end in a newline.

  - I do not think the PostgreSQL project adds authors in the top comment
  of files in cases like this. Authors get credit in the commit messages.

  - I think you can remove the prototypes of all the ssl_* functions.

  - Adding the have in Indicates whether current client have provided a
  certificate is not necessary. The previous wording looks correct to my
  non-native speaker eyes.

  - Too much white space in variable declarations in get_extension().

  - Extra space before -1 in X509_get_ext_by_NID(certificate,
  extension_nid,  -1);

  - Please do not initialize variables unless necessary. Compilers are
  pretty good at warning about uninitialized usage. For example both
  locate and extension_nid do not need to be initialized.

  - Remove empty line directly before ssl_get_extension_value().

  - Try to follow variable naming conventions from other functions (e.g.
  use nid rather than extension_nid, membuf rather than bio, sp rather
  than value).

  - I am pretty sure

Re: [HACKERS] New functions in sslinfo module

On Wed, Jul 2, 2014 at 9:19 PM, Воронин Дмитрий
carriingfat...@yandex.ru wrote:

 Oh, how can I write a documentation for my functions?

You will need to edit the sgml documentation and to include the diffs
in your patch. Hence in your case simply list the new functions and a
description of what they do in doc/src/sgml/sslinfo.sgml.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

* Abhijit Menon-Sen (a...@2ndquadrant.com) wrote:
 I foresee lots of disappointment, then. I don't think even Stephen is
 advocating NIST-compliance as the *baseline* for serious auditing in
 core, just that we need a design that lets us get there sometime.

Agreed.  I'm not suggesting that we must be fully NIST/PCI-DSS/whatever
compliant before we add anything but rather that we have a plan for how
to get there.  Note that the plan is needed primairly to ensure that we
don't end up in a situation that makes it difficult to move forward in
the future due to compatibility or upgrade issues.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

On Wed, Jul 2, 2014 at 5:19 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On 1 July 2014 18:32, Stephen Frost sfr...@snowman.net wrote:
 Having functions to control the auditing would work, but it's not
 exactly the ideal approach, imv, and

 What aspect is less than ideal?

 the only reason it's being
 discussed here is because it might be a way to allow an extension to
 provide the auditing- not because it's actually a benefit to anyone.

 That is a false statement, as well as being a personal one. It's sad
 to hear personal comments in this.

I am not sure that it was personal, but I agree it's false.

 Auditing should, in my view, always be
 optional, since not everyone needs it. Cryptographic functions aren't
 in-core either and I'm guessing various security conscious
 organizations will use them and be happy. How does pgaudit differ from
 pgcrypto?

+1.

 Given the tone of this discussion, I don't see it going anywhere
 further anytime soon - that is good since there is no big rush.
 pgaudit is a sincere attempt to add audit functionality to Postgres.
 If you or anyone else wants to make a similarly sincere attempt to add
 audit functionality to Postgres, lets see the design and its
 connection to requirements.

Agreed all around.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5 CF1

2014-07-02 Thread Alexander Korotkov

On Wed, Jul 2, 2014 at 10:08 AM, Abhijit Menon-Sen a...@2ndquadrant.com
wrote:

 KNN-GiST with recheck
 Partial sort
 Some earlier discussion, but no conclusions, and as far as I know
 nobody is working on these patches at the moment.


I'm now working on next revision of KNN-GiST with recheck. Also, I think
partial sort need a look of somebody more aware of planner than me and
Marti.

--
With best regards,
Alexander Korotkov.

Re: [HACKERS] better atomics - v0.5

Hi,

On 2014-06-30 20:28:52 +0200, Andres Freund wrote:
 To quantify this, on my 2 socket xeon E5520 workstation - which is too
 small to heavily show the contention problems in pgbench -S - the
 numbers are:

 pgbench -M prepared -c 16 -j 16 -T 10 -S (best of 5, noticeably variability)
 master: 152354.294117
 lwlocks-atomics: 170528.784958
 lwlocks-atomics-spin: 159416.202578

 pgbench -M prepared -c 1 -j 1 -T 10 -S (best of 5, noticeably variability)
 master: 16273.702191
 lwlocks-atomics: 16768.712433
 lwlocks-atomics-spin: 16744.913083

 So, there really isn't a problem with the emulation. It's not actually
 that surprising - the absolute number of atomic ops is prety much the
 same. Where we earlier took a spinlock to increment shared we now still
 take one.

Tom, you've voiced concern that using atomics will regress performance
for platforms that don't use atomics in a nearby thread. Can these
numbers at least ease those a bit?
I've compared the atomics vs emulated atomics on 32bit x86, 64bit x86
and POWER7. That's all I've access to at the moment.  In all cases the
performance with lwlocks ontop emulated atomics was the same as the old
spinlock based algorithm or even a bit better. I'm sure that it's
possible to design atomics based algorithms where that's not the case,
but I don't think we need to solve those before we meet them.

I think for most contended places which possibly can be improved two
things matter: The number of instructions that need to work atomically
(i.e. TAS/CAS/xchg for spinlocks, xadd/cmpxchg for atomics) and the
duration locks are held. When converting to atomics, even if emulated,
the hot paths shouldn't need more locked instructions than before and
often enough the duration during which spinlocks are held will be
smaller.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-07-02 Thread Jonathan Corbet

On Tue, 1 Jul 2014 15:57:25 -0400
Robert Haas robertmh...@gmail.com wrote:

 Of course, we have no guarantee that the Linux kernel guys won't
 change this again.  Apparently we don't break userspace is a
 somewhat selectively-enforced principle.

It's selectively enforced in that kernel developers don't (and can't)
scan the world for software that might break.  OTOH, if somebody were
to respond to a proposed change by saying this breaks PostgreSQL, I
would be amazed if the change were to be merged in that form.

In other words, it's important to scream.  There were concerns during
the last round of messing with the OOM logic, but nobody pointed to
something that actually broke.

I predict that somebody will certainly try to rewrite the OOM code
again in the future.  The nature of that code is such that it is never
going to work as well as people would like.  But if application
developers make it clear that they are dependent on the current
interface working, kernel developers will be constrained to keep it
working.

jon


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] better atomics - v0.5

On 2014-06-26 00:42:37 +0300, Heikki Linnakangas wrote:
 On 06/25/2014 11:36 PM, Andres Freund wrote:
 
 - I completely loathe the file layout you've proposed in
 src/include/storage.  If we're going to have separate #include files
 for each architecture (and I'd rather we didn't go that route, because
 it's messy and unlike what we have now), then shouldn't that stuff go
 in src/include/port or a new directory src/include/port/atomics?
 Polluting src/include/storage with a whole bunch of files called
 atomics-whatever.h strikes me as completely ugly, and there's a lot of
 duplicate code in there.
 I don't mind moving it somewhere else and it's easy enough to change.
 
 I think having a separate file for each architecture is nice. I totally
 agree that they don't belong in src/include/storage, though. s_lock.h has
 always been misplaced there, but we've let it be for historical reasons, but
 now that we're adding a dozen new files, it's time to move them out.

So,
src/include/port/atomics.h,
src/include/port/atomics/generic-$compiler.h,
src/include/port/atomics/arch-$arch.h,
src/include/port/atomics/fallback.h,
src/include/port/atomics/generic.h,
src/backend/port/atomics.c
?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

Simon,

* Simon Riggs (si...@2ndquadrant.com) wrote:
 On 1 July 2014 18:32, Stephen Frost sfr...@snowman.net wrote:
  Having functions to control the auditing would work, but it's not
  exactly the ideal approach, imv, and
 
 What aspect is less than ideal?

I certainly don't like passing table names to functions, just to point
out one item.  It's necessary in cases where we want to programatically
get access to information (eg: pg_total_relation_size(),
has_*_privilege, etc).  I do not see any functions today which take a
'name' type argument (or, at a quick look, any of the other ones) and
update a catalog object- and I don't think that's a mistake.  Reading
further, I take it that you are advocating pgaudit have functions along
these lines:

-- Enable auditing for a table
pgaudit_add_audit_to_table(name);

-- Audit only a given role's actions on the table
pgaudit_add_audit_to_table(name,role);

-- Audit specific actions of a role on a table
pgaudit_add_audit_to_table(name,role,aclitem);

whereby this information would end up stored somehow through reloptions.
I'd be happy to be incorrect here as that's not a design that I like,
but if that isn't correct, getting feedback as to what the future plan
*is* would be great.

  the only reason it's being
  discussed here is because it might be a way to allow an extension to
  provide the auditing- not because it's actually a benefit to anyone.
 
 That is a false statement, as well as being a personal one. It's sad
 to hear personal comments in this.

Simon, it wasn't intended as a personal attack.  My frustration with
this thread lead me to overstate it but there has been very little
discussion about why functions are the *right* solution for controlling
auditing.  The arguments for it today have been, to try and summarize:

- Keeps the grammar small

  While this is a general benefit, I don't view it as a reason for using
  functions instead of new grammar when it comes to modifying objects in
  the system.

- Not everyone will want auditing

  I fully support this- not everyone wants column level privileges
  either though, or JSON in their database.  I also don't feel this is
  accurate- we get a lot of complaints about the lack of flexibility in
  our logging system and my hope is that an auditing system will help to
  address that.  The two are not identical but I expect them to share a
  lot of the infrastructure.

- Permissions can be set up on functions

  I admit that I had not considered this, but even so- that's very
  coarse- you can either execute the function or not.  We might be able
  to start with this but I'd expect us to need more control very
  quickly- auditor A should be able to control the auditing for tables
  in the 'low security' schema, while auditor B should be able to
  control the auditing for tables in both the 'low security' and 'high
  security' schemas.  In the end, I don't see this as being a workable
  solution- and then what?  To add that control, we add more functions
  to allow the admin to control who can define auditing for what, and
  then have to implement the permissions handling inside of the audit
  configuration functions, and then relax the permissions on those
  functions?  Perhaps you see a better approach and I'm just missing it,
  but if so, it'd be great if you could outline that.

 It seems strange to be advocating new grammar at a time when we are
 actively reducing the size of it (see recent commits and long running
 hackers discussions). Functions don't carry the same overhead, in fact
 they cost nothing if you're not using them, which is the most
 important point.

I do not agree that new grammar for auditing should be off the table
because it adds to the grammar.  I agree that we don't want to add new
grammar without good justification for it and further agree that we
should be looking at opportunities to keep the grammar small.  I view
auditing as similar to the GRANT system though and while we could reduce
our grammar by moving everything done via GRANT into functions, I don't
feel that's a good solution.

 The right to execute functions can be delegated easily to any group
 that wants access. There is no special benefit to SQL grammar on that
 point.

Addressed above.

  However, if we have such functions in a contrib extension, I worry what
  the pg_upgrade path is from that extension to an in-core solution.
 
 Functions are already used heavily for many aspects of PostgreSQL.
 http://www.postgresql.org/docs/devel/static/functions-admin.html
 
 Presumably you don't advocate an in core solution to replace
 pg_cancel_backend() etc?

I don't view cancelling a backend as being part of the grammar's
responsibility.  One of the grammar's roles is to define the objects
which exist in the system and to further support modification of those
objects.  We don't have a function called 'create_table', for example.

 My proposed route for making this in-core is simply to accept that
 the extension is in core.

* Robert Haas (robertmh...@gmail.com) wrote:
On Tue, Jul 1, 2014 at 3:20 PM, Dean Rasheed dean.a.rash...@gmail.com wrote:
If RLS quals are instead regarded as constraints on access, and
multiple policies apply, then it seems that the quals should now be
combined with AND rather than OR, right?

I do feel that RLS quals are constraints on access, but I don't see how
it follows that multiple quals should be AND'd together because of that.
I view the RLS policies on each table as being independent and standing
alone regarding what can be seen. If you have access to a table today
through policy A, and then later policy B is added, using AND would mean
that the set of rows returned is less than if only policy A existed.
That doesn't seem correct to me.

Yeah, maybe. I intuitively feel that OR would be more useful, so it
would be nice to find a design where that makes sense. But it depends
a lot, in my view, on what syntax we end up with. For example,
suppose we add just one command:

ALTER TABLE table_name FILTER [ role_name | PUBLIC ] USING qual;

If the given role inherits from multiple roles that have different
filters, I think the user will naturally expect all of the filters to
be applied.

Agreed.

But you could do it other ways. For example:

ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;

If a table is set to NO ROW LEVEL SECURITY then it behaves just like
it does now: anyone who accesses it sees all the rows, restricted to
those columns for which they have permission. If the table is set to
ROW LEVEL SECURITY then the default is to show no rows. The second
command then allows access to a subset of the rows for a give role
name. In this case, it is probably logical for access to be combined
via OR.

I can see value is having a table-level option to indicate if RLS is
applied for that table or not, but I had been thinking we'd just
automatically manage that. That is to say that once you define an RLS
policy for a table, we go look and see what policy should be applied in
each case. With the user able to control that, what happens if they say
row security on the table and there are no policies? All access would
show the table as empty? What if policies exist and they decide to
'turn off' RLS for the table- suddenly everyone can see all the rows?

My answers to the above (which are making me like the idea more,
actually...) would be:

Yes, if they turn on RLS for the table and there aren't any policies,
then the table appears empty for anyone with normal SELECT rights (table
owner and superusers would still see everything).

If policies exist and the user asks to turn off RLS, I'd throw an ERROR
as there is a security risk there. We could support a CASCADE option
which would go and drop the policies from the table first.

Otherwise, I'm generally liking Dean's thoughts in
http://www.postgresql.org/message-id/CAEZATCVftksFH=x+9mvmbnmzo5ksup+rk0kb4oro92jofjo...@mail.gmail.com
along with the table-level enable RLS option.

Are we getting to a point where there is sufficient agreement that it'd
be worthwhile to really start implementing this? I'd suggest that we
either forgo or at least table the notion of per-column policy
definitions- RLS controls whole rows and so I don't feel that per-column
policies really make sense.

Thanks,

Stephen

signature.asc
Description: Digital signature

Re: [HACKERS] Spinlocks and compiler/memory barriers

On Wed, Jul 2, 2014 at 4:06 AM, Mark Cave-Ayland
mark.cave-ayl...@ilande.co.uk wrote:
 The point I wanted to make was that there are certain applications for which
 SPARCv8 is still certified for particular military/aerospace use. While I
 don't use it myself, some people are still using it enough to want to
 contribute QEMU patches.

 In terms of QEMU, the main reason for mentioning it was that if the
 consensus were to keep SPARCv8 support then this could be one possible way
 to provide basic buildfarm support (although as Tom rightly points out,
 there would need to be testing to build up confidence that the emulator does
 the right thing in a multiprocessor environment).

I think we're getting off-track here.  The PostgreSQL project is
always willing to accept new buildfarm machines.  In the absence of
buildfarm machines, we try not to go around randomly breaking things
that were previously working.  But the current situation is that we
have good reason to suspect that a couple of very old platforms are
subtly broken.  In that situation, I think removing the
thought-to-be-broken-code is the only sensible approach.

Now, we're not crossing any sort of Rubicon here.  If buildfarm
machines show up - or, hell, if ONE person with access to the hardware
OR a simulator shows up and can compile test EVEN ONCE that a patch to
re-add support compiles and that the regression tests pass - then we
can add support back in two shakes of a lamb's tail.  In the meantime,
leaving broken code in our tree is not a service to anyone.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] alter user set local_preload_libraries.

Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 postgres=# alter user horiguti2 set local_preload_libraries='libname';
 ERROR:  parameter local_preload_libraries cannot be set after connection 
 start

Hm ... it's kind of annoying that that doesn't work; it's certainly not
hard to imagine use-cases for per-user or per-database settings of
local_preload_libraries.  However, I'm not sure if we apply per-user or
per-database settings early enough in backend startup that they could
affect library loading.  If we do, then the ALTER code needs to be
taught to allow this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Audit of logout

2014-07-02 Thread Joe Conway

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/01/2014 11:55 PM, Fujii Masao wrote:
 Hmm... I found that you had marked this proposal as Returned with
 Feedback. But I don't think that we reached the consensus to do
 that. I think that it's still worth discussing this topic in this
 CF. So I marked this as Needs Review again.
 
 If you strongly think that this proposal should be marked as 
 Returned with Feedback, could you let me know why you think so?

Returned with Feedback means, well exactly that ;-) -- the patch as
linked to the CF page is wrong and cannot be applied the way it is
currently. It is therefore returned to you to be fixed. It does not
mean Rejected which is what you seem to infer.

As mentioned on the CF the documentation change is flat wrong, and you
yourself have said that you think PGC_SUSET is wrong now and
PGC_SIGHUP should be used instead.

Furthermore I doubt you have tested the change to PGC_SIGHUP because I
did a quick test, and for some reason I haven't yet tracked down SHOW
does not see the change to log_disconnections as you said it would
after a reload, unlike other PGC_SIGHUP vars. So there is more
thought, testing, and possibly coding needed to make this a viable change.

Joe

- -- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support
-BEGIN PGP SIGNATURE-
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJTtBm6AAoJEDfy90M199hlEzQQAJOCZrUb1fQw5ArGCBRm3T2n
gB1SobGGbmSweY3M8D/U55/IpjWEp3Emma3869tTVG51d6r1vRchwax1Ol2IUuek
Z7YwdgysoUOY1RNbqsa/WUVS23djKplgA7azrDu+MXFJpupBQjCH7lumtJ/L1dbC
1ua3NKZg914HkDTNHkD2AKL6o4LupfRM0hcSmBUZRG0fWLgSBiHza+idzFR1TfK2
6SpK0T6u3H6R7eU/7YluapY6LC/nA0bx+zM7sBEsWE2Wb1NQ/DER/fkuSO0V0N3X
/ljRUP7sds2KOlIJ95081JVbN5lTM2rQfd6ZLu5CsTKEDKR+PL8rOGgCX2mMoTS5
gc8vDLtyArqzcZpmRTufyBUvQAAS7CyIG7JxJNyWkEDwnn/B8HqBuGdLIdS8VdV5
oEdhbcKuN5cMTodMAv1h/QPcpSE72O/zZ6XxJGD5Wcpury7BTmBkLNJnZOsY4GU0
Nlxcc3tTvMsZYpvrJYBVfypQ6J7PCCwx1qra2GLtA7fkSeH+tXuqQ0IKVXi2bYEr
TDC2Cx/37cn56Z9PObAUJlYmoolix8MD27m6cEW0PvkJrDe7tEPWpEf38MUeZ0ae
kinXrB0MtxrrqICsWBjtxz0HGdsbQUreYs3JadnmkAR8cvhunDuHFShZv6GGzaZL
PzhOgGxxHemWGF7er2YO
=tiPs
-END PGP SIGNATURE-


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Aggregate function API versus grouping sets

Andrew Gierth and...@tao11.riddles.org.uk writes:
 1. Assumptions about the relationship between aggcontext and fn_extra

 Tom's version of the ordered-set aggregate code makes the assumption
 that it is safe to store the aggcontext returned by AggCheckCallContext
 in a structure hung off flinfo-fn_extra. This implicitly assumes that
 the function will be called in only one aggcontext, or at least only one
 per flinfo.

 Doing rollup via GroupAggregate by maintaining multiple transition
 values at a time (one per grouping set) means that the transfn is being
 called interleaved for transition values in different contexts. So the
 question becomes: is it wrong for the transition function to assume that
 aggcontext can be cached this way, or is it necessary for the executor
 to use a separate flinfo for each concurrent grouping set?

Hm.  We could probably move gcontext into the per-group data.  I'm not
sure though if there are any other dependencies there on the groups being
evaluated serially.  More generally, I wonder whether user-defined
aggregates are likely to be making assumptions that will be broken by
this.

 2. AggGetPerAggEContext

 The comment documents this as returning the per-output-tuple econtext,
 and the ordered-set code assumes that the rescan of this context implies
 that the aggcontext is also about to be reset (so cleanup functions can
 be called).

 Since it seems that the cleanup callback is the sole reason for this
 function to exist, is it acceptable to remove any implication that the
 context returned is the overall per-output-tuple one, and simply state
 that it is a context whose cleanup functions are called at the
 appropriate time before aggcontext is reset?

Well, I think the implication is that it's the econtext in which the
result of the aggregation will be used.

Another approach would be to remove AggGetPerAggEContext as such from the
API altogether, and instead offer an interface that says register an
aggregate cleanup callback, leaving it to the agg/window core code to
figure out which context to hang that on.  I had thought that exposing
this econtext and the per-input-tuple one would provide useful generality,
but maybe we should rethink that.

Obviously, the time grows short to reconsider this API, but it's not
quite too late.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RLS Design

On Wed, Jul 2, 2014 at 9:47 AM, Stephen Frost sfr...@snowman.net wrote:
 But you could do it other ways.  For example:

 ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
 ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;

 If a table is set to NO ROW LEVEL SECURITY then it behaves just like
 it does now: anyone who accesses it sees all the rows, restricted to
 those columns for which they have permission.  If the table is set to
 ROW LEVEL SECURITY then the default is to show no rows.  The second
 command then allows access to a subset of the rows for a give role
 name.  In this case, it is probably logical for access to be combined
 via OR.

 I can see value is having a table-level option to indicate if RLS is
 applied for that table or not, but I had been thinking we'd just
 automatically manage that.  That is to say that once you define an RLS
 policy for a table, we go look and see what policy should be applied in
 each case.  With the user able to control that, what happens if they say
 row security on the table and there are no policies?  All access would
 show the table as empty?

I said the same thing in the text you quoted immediately above this reply.

 What if policies exist and they decide to
 'turn off' RLS for the table- suddenly everyone can see all the rows?

That'd be my vote.  Sorta like disabling triggers.

 Are we getting to a point where there is sufficient agreement that it'd
 be worthwhile to really start implementing this?

I think we're converging, but it might be a good idea to summarize a
specific proposal before you start implementing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Patch to send transaction commit/rollback stats to the stats collector unconditionally.

On Tue, Jul 1, 2014 at 7:32 PM, Gurjeet Singh gurj...@singh.im wrote:
 On Tue, Jul 1, 2014 at 10:05 AM, Kevin Grittner kgri...@ymail.com wrote:
 Tom Lane t...@sss.pgh.pa.us wrote:

 If we're going to do it like this, then I think the force flag
 should be considered to do nothing except override the clock
 check, which probably means it shouldn't be tested in the initial
 if() at all.

 That makes sense, and is easily done.

 Attached is the patch to save you a few key strokes :)

 The only question left is
 how far back to take the patch.  I'm inclined to only apply it to
 master and 9.4.  Does anyone think otherwise?

 Considering this as a bug-fix, I'd vote for it to be applied to all
 supported releases. But since this may cause unforeseen performance
 penalty, I think it should be applied only as far back as the
 introduction of PGSTAT_STAT_INTERVAL throttle.

 The throttle was implemeted in 641912b, which AFAICT was part of 8.3.
 So I guess all the supported releases it is.

I'll vote for master-only.  I view this as a behavior change, and it
isn't nice to spring those on people in minor releases.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5 CF1

Abhijit Menon-Sen a...@2ndquadrant.com wrote:

 Spread shared memory across NUMA memory nodes
 Marked waiting on author, but status unclear. Any updates?

The review was a little sparse:

http://www.postgresql.org/message-id/CADyhKSXs+oUetngSbeiM0tVSRy=QeCaSNBQBDbM=sfqtdg+...@mail.gmail.com

In particular, there was no benchmarking of any sort in the review.
I did some benchmarking on a 16 core Power PC machine with 4 NUMA
memory nodes, and found that the benefit was overall about 2% in
the unbalanced memory node tests I was able to devise.  Unless
memory usage was unbalanced I saw no overall difference.  Timings
with the patch varied less from run to run than without the patch. 
Where I saw worse performance in the field (prompting me to work on
this patch) was on an Intel 40 core 4 memory node system.  Perhaps
that is necessary to see the really bad behavior.

I am not sure that the modest worst-case benefits I have seen
justify applying the patch, especially since to see that I also
needed to set up a custom cpuset to run PostgreSQL under so that
the OS also used interleaving for its cache.  I did have one
outlier for the unpatched code which was about 20% worse than
average, and no such outliers for patched code.  I have seen
reports of much worse performance hits from an unbalanced load than
I was able to reliably produce, so I was hoping that someone with
more creativity in creating worst case tests would see what they
could do.  IMV, the best argument for the patch is as insurance
against such pessimal events.

The suggestion that there be a GUC to suppress the interleaving of
the main shared memory segment among NUMA memory nodes is
interesting.  I'm dubious, but I guess it would allow an opt-out
escape hatch if someone found unexpected performance problems with
it in the field.  It's certainly trivial to add; the main argument
against it is that it is another knob that might confuse users.

Anyway, Waiting on Author is probably not the best state for this
patch; there's nothing that I feel there is a consensus to change. 
I'm OK with either Needs Review or Returned with Feedback. If
someone can produce a benchmark showing more benefit than I was
able to show, it can be revived in a later CF.  Perhaps as more
machines with high core counts and multiple NUMA memory nodes
become more common the cases where we run into trouble will become
more clear.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] autovacuum: found orphan temp table

2014-07-02 Thread Joshua D. Drake



Hello,

While it is obvious what is happening in $SUBJECT as well as reasonably 
obvious why it can happen. What isn't obvious is what to do about it. It 
seems we log in as a super user and drop the temp tables.


However, I would think if we know that it is orphaned that autovacuum 
should just clean those up?


JD
--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, @cmdpromptinc
If we send our children to Caesar for their education, we should
 not be surprised when they come back as Romans.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autovacuum: found orphan temp table

Joshua D. Drake wrote:

Hi,

 While it is obvious what is happening in $SUBJECT as well as
 reasonably obvious why it can happen. What isn't obvious is what to
 do about it. It seems we log in as a super user and drop the temp
 tables.
 
 However, I would think if we know that it is orphaned that
 autovacuum should just clean those up?

As far as I recall, if you wait long enough it will drop them.  Looking
at the code: yes, it's forcibly dropped when it becomes a problem for
xid wraparound.  Of course, you can drop it as superuser as well; that
will save you a few months of WARNING spam ...

Cheers

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] autovacuum: found orphan temp table

Joshua D. Drake j...@commandprompt.com writes:
 While it is obvious what is happening in $SUBJECT as well as reasonably 
 obvious why it can happen. What isn't obvious is what to do about it. It 
 seems we log in as a super user and drop the temp tables.

You don't need to do anything --- the table will go away the next time
a session makes use of that pg_stat_temp schema.

 However, I would think if we know that it is orphaned that autovacuum 
 should just clean those up?

That seems to me to be outside the charter of autovacuum, particularly
since orphan tables might be of interest for forensic reasons.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5 CF1

* Abhijit Menon-Sen (a...@2ndquadrant.com) wrote:
 Row-security based on Updatable security barrier views
 Lots of discussion that I haven't dared to look at properly yet. I
 gather there's still plenty of design-level work needed, and this
 is not in any imminent danger of being committed.

I feel like we're actually getting pretty close on the design side..
That said, the design we're arriving at will require a bit more work to
implement that seems unlikely to be completed in the next two weeks,
though I could be wrong.  I'd be fine leaving this open for now at least
('waiting on author' perhaps) and then bumping it to August only if
necessary.  I'm still very interested in getting this committed early in
this cycle to allow for a good bit of testing, of course.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

Christoph Berg c...@df7cb.de writes:
 Re: Tom Lane 2014-07-01 20654.1404247...@sss.pgh.pa.us
 Fair enough.  I went for a minimum-change approach when hacking that
 script, but we could change it some more in the name of readability.
 Will do something about that.

 Thanks, it's much nicer now. There's one uglyness left though: The
 name PG_CHILD_OOM_SCORE_ADJ should match what actually gets passed to
 the backends,

Meh ... I don't find that to be important.  Making a parallel with
PG_MASTER_OOM_SCORE_ADJ seemed more helpful here.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] RLS Design

* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Jul 2, 2014 at 9:47 AM, Stephen Frost sfr...@snowman.net wrote:
  But you could do it other ways.  For example:
 
  ALTER TABLE table_name [ NO ] ROW LEVEL SECURITY;
  ALTER TABLE table_name GRANT ROW ACCESS TO role_name USING qual;
 
  If a table is set to NO ROW LEVEL SECURITY then it behaves just like
  it does now: anyone who accesses it sees all the rows, restricted to
  those columns for which they have permission.  If the table is set to
  ROW LEVEL SECURITY then the default is to show no rows.  The second
  command then allows access to a subset of the rows for a give role
  name.  In this case, it is probably logical for access to be combined
  via OR.
 
  I can see value is having a table-level option to indicate if RLS is
  applied for that table or not, but I had been thinking we'd just
  automatically manage that.  That is to say that once you define an RLS
  policy for a table, we go look and see what policy should be applied in
  each case.  With the user able to control that, what happens if they say
  row security on the table and there are no policies?  All access would
  show the table as empty?
 
 I said the same thing in the text you quoted immediately above this reply.

huh.  Somehow I managed to only read the first sentence in that
paragraph.  Clearly I need to go get (more) coffee.  Still- sounds like
agreement. :)

  What if policies exist and they decide to
  'turn off' RLS for the table- suddenly everyone can see all the rows?
 
 That'd be my vote.  Sorta like disabling triggers.

Hmm.  Ok- how would you feel about at least spitting out a WARNING if
there are still policies on the table in that case..?  Just makes me a
bit nervous to have a case where policies can be defined on a table but
are not actually being enforced..

  Are we getting to a point where there is sufficient agreement that it'd
  be worthwhile to really start implementing this?
 
 I think we're converging, but it might be a good idea to summarize a
 specific proposal before you start implementing.

Right- will do so later today.

Thanks!

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] RLS Design

On Wed, Jul 2, 2014 at 11:42 AM, Stephen Frost sfr...@snowman.net wrote:
  What if policies exist and they decide to
  'turn off' RLS for the table- suddenly everyone can see all the rows?

 That'd be my vote.  Sorta like disabling triggers.

 Hmm.  Ok- how would you feel about at least spitting out a WARNING if
 there are still policies on the table in that case..?  Just makes me a
 bit nervous to have a case where policies can be defined on a table but
 are not actually being enforced..

Sounds like nanny-ism to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Patch to send transaction commit/rollback stats to the stats collector unconditionally.

Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jul 1, 2014 at 7:32 PM, Gurjeet Singh gurj...@singh.im wrote:

 Considering this as a bug-fix, I'd vote for it to be applied to
 all supported releases.

I initially considered that position, but balanced that against
this:

 I view this as a behavior change, and it isn't nice to spring
 those on people in minor releases.

I tend to be very conservative on what should go into a minor
release.  In this case the user impact is pretty minimal --
distortion of numbers in monitoring software.  That doesn't strike
me as a serious enough problem to risk even a trivial behavior
change.

 I'll vote for master-only.

Well, it seems tame enough to me to apply to the release still in
beta testing.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Patch to send transaction commit/rollback stats to the stats collector unconditionally.

On Wed, Jul 2, 2014 at 11:45 AM, Kevin Grittner kgri...@ymail.com wrote:
 I'll vote for master-only.

 Well, it seems tame enough to me to apply to the release still in
 beta testing.

It didn't seem important enough to me to justify a beta-to-release
behavior change, but I won't complain too much if you feel otherwise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] slotname vs slot_name

On 2014-06-05 12:57:57 +0200, Andres Freund wrote:
  BTW, what about also renaming pg_llog directory? I'm afraid that
  a user can confuse pg_log with pg_llog.
 
 We have:
 * pg_ldecoding (Heikki)
 * pg_lcse or pg_lcset (Petr)
 * pg_logical (Andres)
 
 I like, what a surprise, my own suggestion best. The name seems more
 versatile because it's not restricted to decoding.

So, we haven't really come to any conclusion in this thread and we
better change it soon, if at all. So, unless somebody protests against
it I'm going to rename pg_llog to pg_logical and document it.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Keepalive-related socket options under FreeBSD 9, 10

2014-07-02 Thread Palle Girgensohn

 Since upgrading FreeBSD from 8 to 9, I've noticed the following messages
 showing up in logs when a connection with pgAdmin3 is made:
 
 LOG:  getsockopt(TCP_KEEPCNT) failed: Protocol not available
 STATEMENT:  SELECT setting FROM pg_settings WHERE name IN ('autovacuum',
 'track_counts')
 LOG:  getsockopt(TCP_KEEPIDLE) failed: Protocol not available
 STATEMENT:  SELECT setting FROM pg_settings WHERE name IN ('autovacuum',
 'track_counts')
 LOG:  getsockopt(TCP_KEEPINTVL) failed: Protocol not available
 STATEMENT:  SELECT setting FROM pg_settings WHERE name IN ('autovacuum',
 'track_counts')
 
 tcp_keepalives_idle, tcp_keepalives_interval, and tcp_keepalives_count
 are all set to the default (0), which means system default.
 
 My guess as to what causes this:
 
 src/backend/libpq/pqcomm.c apparently assumes that if TCP_KEEPIDLE 
 friends are defined, then the respective options are readable, but
 according to man tcp, that is not the case for FreeBSD 9 (and 10):
 
 TCP_KEEPINIT
 This write-only setsockopt(2) option accepts a per-socket
 timeout argument of u_int in seconds, for new, non-estab-
 lished TCP connections.  For the global default in mil-
 liseconds see keepinit in the MIB Variables section fur-
 ther down.
 
 As a work-around, I've set the keepalive options to the system defaults
 provided by man tcp.
 

Hi,

This only happens with some clients, right? I'm seeing it for only happens for 
some programs, like connecting from perl? I've seem discussions a while back 
about why they are read-only.

I see it to, but only for a perl connection, nothing for psql or the likes.

Do you have any suggestions for a patch for the port, in the mean while?

Palle



signature.asc
Description: Message signed with OpenPGP using GPGMail

Re: [HACKERS] RLS Design

* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Jul 2, 2014 at 11:42 AM, Stephen Frost sfr...@snowman.net wrote:
   What if policies exist and they decide to
   'turn off' RLS for the table- suddenly everyone can see all the rows?
 
  That'd be my vote.  Sorta like disabling triggers.
 
  Hmm.  Ok- how would you feel about at least spitting out a WARNING if
  there are still policies on the table in that case..?  Just makes me a
  bit nervous to have a case where policies can be defined on a table but
  are not actually being enforced..
 
 Sounds like nanny-ism to me.

Alright, fair enough.  Clearly, the individual changing the RLS on the
table will have to have appropriate rights to do so.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: [HACKERS] Re: Patch to send transaction commit/rollback stats to the stats collector unconditionally.

Robert Haas robertmh...@gmail.com writes:
 On Wed, Jul 2, 2014 at 11:45 AM, Kevin Grittner kgri...@ymail.com wrote:
 I'll vote for master-only.

 Well, it seems tame enough to me to apply to the release still in
 beta testing.

 It didn't seem important enough to me to justify a beta-to-release
 behavior change, but I won't complain too much if you feel otherwise.

FWIW, master + 9.4 seems like a reasonable compromise to me too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

On Wed, Jul 2, 2014 at 9:08 AM, Jonathan Corbet cor...@lwn.net wrote:
 On Tue, 1 Jul 2014 15:57:25 -0400
 Robert Haas robertmh...@gmail.com wrote:
 Of course, we have no guarantee that the Linux kernel guys won't
 change this again.  Apparently we don't break userspace is a
 somewhat selectively-enforced principle.

 It's selectively enforced in that kernel developers don't (and can't)
 scan the world for software that might break.  OTOH, if somebody were
 to respond to a proposed change by saying this breaks PostgreSQL, I
 would be amazed if the change were to be merged in that form.

 In other words, it's important to scream.  There were concerns during
 the last round of messing with the OOM logic, but nobody pointed to
 something that actually broke.

 I predict that somebody will certainly try to rewrite the OOM code
 again in the future.  The nature of that code is such that it is never
 going to work as well as people would like.  But if application
 developers make it clear that they are dependent on the current
 interface working, kernel developers will be constrained to keep it
 working.

Well, of course the reality is that if you don't break some things,
you can never change anything.  And clearly we want Linux to be able
to change things, to make them better.

On the flip side, interface changes do cause some pain, and it's not
realistic to expect every software project to monitor LKML.

This one is not really a big deal, though; I think we just need to
keep in mind, as you say, that it might change again.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5 CF1

Abhijit Menon-Sen wrote:

 Minmax indexes
 Álvaro will respond to the design questions Heikki raised.

I'm currently reworking the patch so that things that need to be
abstract per discussion are abstract, without enlarging the scope
excessively.  I'm working on an opclass implementation that can be used
for sortable datatypes, so that it covers the datatypes it covers
today (basically anything with a btree opclass).  I don't intend to
implement opclasses for any other type, but with the design I expect to
have it should be possible to cover stuff like geometric types usefully.
(I guess it will be possible to index arrays in a general way also, but
I don't think it will be of any usefulness).

Once I post that version, I will focus on reviewing other patches, and
then maybe someone interested can try to implement other opclasses on
top of it, to ensure that the API I propose is sensible or at least
useful.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Aggregate function API versus grouping sets

 Tom == Tom Lane t...@sss.pgh.pa.us writes:

  Doing rollup via GroupAggregate by maintaining multiple transition
  values at a time (one per grouping set) means that the transfn is
  being called interleaved for transition values in different
  contexts. So the question becomes: is it wrong for the transition
  function to assume that aggcontext can be cached this way, or is
  it necessary for the executor to use a separate flinfo for each
  concurrent grouping set?

 Tom Hm.  We could probably move gcontext into the per-group data.
 Tom I'm not sure though if there are any other dependencies there on
 Tom the groups being evaluated serially.  More generally, I wonder
 Tom whether user-defined aggregates are likely to be making
 Tom assumptions that will be broken by this.

The thing is that almost everything _except_ aggcontext and
AggGetPerAggEContext that a transfn might want to hang off fn_extra
really is going to be constant across all groups.

The big question, as you say, is whether this is going to be an issue
for existing user-defined aggregates.

  Since it seems that the cleanup callback is the sole reason for
  this function to exist, is it acceptable to remove any implication
  that the context returned is the overall per-output-tuple one, and
  simply state that it is a context whose cleanup functions are
  called at the appropriate time before aggcontext is reset?

 Tom Well, I think the implication is that it's the econtext in which
 Tom the result of the aggregation will be used.

In the rollup case, though, it does not seem reasonable to have
multiple result-tuple econtexts (that would significantly complicate
the projection of rows, the handling of rescans, etc.).

 Tom Another approach would be to remove AggGetPerAggEContext as such
 Tom from the API altogether, and instead offer an interface that
 Tom says register an aggregate cleanup callback, leaving it to the
 Tom agg/window core code to figure out which context to hang that
 Tom on.  I had thought that exposing this econtext and the
 Tom per-input-tuple one would provide useful generality, but maybe
 Tom we should rethink that.

Works for me.

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-07-02 Thread Jeff Janes

On Mon, Jun 30, 2014 at 3:17 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 Jeff Janes wrote:

 In particular, pgpipe is almost an exact duplicate between them,
 except the copy in vac_parallel.c has fallen behind changes made to
 parallel.c.  (Those changes would have fixed the Windows warnings).  I
 think that this function (and perhaps other parts as
 well--exit_horribly for example) need to refactored into a common
 file that both files can include.  I don't know where the best place
 for that would be, though.  (I haven't done this type of refactoring
 myself.)

 I think commit d2c1740dc275543a46721ed254ba3623f63d2204 is apropos.
 Maybe we should move pgpipe back to src/port and have pg_dump and this
 new thing use that.  I'm not sure about the rest of duplication in
 vac_parallel.c; there might be a lot in common with what
 pg_dump/parallel.c does too.  Having two copies of code is frowned upon
 for good reasons.  This patch introduces 1200 lines of new code in
 vac_parallel.c, ugh.

 If we really require 1200 lines to get parallel vacuum working for
 vacuumdb, I would question the wisdom of this effort.  To me, it seems
 better spent improving autovacuum to cover whatever it is that this
 patch is supposed to be good for --- or maybe just enable having a shell
 script that launches multiple vacuumdb instances in parallel ...

I would only envision using the parallel feature for vacuumdb after a
pg_upgrade or some other major maintenance window (that is the only
time I ever envision using vacuumdb at all).  I don't think autovacuum
can be expected to handle such situations well, as it is designed to
be a smooth background process.

I guess the ideal solution would be for manual VACUUM to have a
PARALLEL option, then vacuumdb could just invoke that one table at a
time.  That way you would get within-table parallelism which would be
important if one table dominates the entire database cluster. But I
don't foresee that happening any time soon.

I don't know how to calibrate the number of lines that is worthwhile.
If you write in C and need to have cross-platform compatibility and
robust error handling, it seems to take hundreds of lines to do much
of anything.  The code duplication is a problem, but I don't think
just raw line count is, especially since it has already been written.

The trend in this project seems to be for shell scripts to eventually
get converted into C programs.  In fact, src/bin/scripts now has no
scripts at all.  Also it is important to vacuum/analyze tables in the
same database at the same time, otherwise you will not get much
speed-up in the ordinary case where there is only one meaningful
database.  Doing that in a shell script would be fairly hard.  It
should be pretty easy in Perl (at least for me--I'm sure others
disagree), but that also doesn't seem to be the way we do things for
programs intended for end users.

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] 9.5 CF1

At 2014-07-02 11:19:10 -0400, sfr...@snowman.net wrote:

 I'd be fine leaving this open for now at least ('waiting on author'
 perhaps) and then bumping it to August only if necessary.

Since there's active discussion/development happening, I wasn't planning
to change the status anyway. It looks fine as-is.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

Jeff Janes wrote:

 I would only envision using the parallel feature for vacuumdb after a
 pg_upgrade or some other major maintenance window (that is the only
 time I ever envision using vacuumdb at all).  I don't think autovacuum
 can be expected to handle such situations well, as it is designed to
 be a smooth background process.

That's a fair point.  One thing that would be pretty neat but I don't
think I would get anyone to implement it, is having the user control the
autovacuum launcher in some way.  For instance please vacuum this set
of tables as quickly as possible, and it would launch as many workers
are configured.  It would take months to get a UI settled for this,
however.

 I guess the ideal solution would be for manual VACUUM to have a
 PARALLEL option, then vacuumdb could just invoke that one table at a
 time.  That way you would get within-table parallelism which would be
 important if one table dominates the entire database cluster. But I
 don't foresee that happening any time soon.

I see this as a completely different feature, which might also be pretty
neat, at least if you're open to spending more I/O bandwidth processing
a single table: have several processes scanning the heap simultaneously.
Since I think vacuum is mostly I/O bound at the moment, I'm not sure
there is much point in this currently.

 I don't know how to calibrate the number of lines that is worthwhile.
 If you write in C and need to have cross-platform compatibility and
 robust error handling, it seems to take hundreds of lines to do much
 of anything.  The code duplication is a problem, but I don't think
 just raw line count is, especially since it has already been written.

Well, there are (at least) two types of duplicate code: first you have
these common routines such as pgpipe that are duplicates for no good
reason.  Just move them to src/port or something and it's all good.  But
the OP said there is code that cannot be shared even though it's very
similar in both incarnations.  That means we cannot (or it's difficult
to) just have one copy, which means as they fix bugs in one copy we need
to update the other.  This is bad -- witness the situation with ecpg's
copy of date/time code, where there are bugs fixed in the backend
version but the ecpg version does not have the fix.  It's difficult to
keep track of these things.

 The trend in this project seems to be for shell scripts to eventually
 get converted into C programs.  In fact, src/bin/scripts now has no
 scripts at all.  Also it is important to vacuum/analyze tables in the
 same database at the same time, otherwise you will not get much
 speed-up in the ordinary case where there is only one meaningful
 database.  Doing that in a shell script would be fairly hard.  It
 should be pretty easy in Perl (at least for me--I'm sure others
 disagree), but that also doesn't seem to be the way we do things for
 programs intended for end users.

Yeah, shipping shell scripts doesn't work very well for us.  I'm
thinking perhaps we can have sample scripts in which we show how to use
parallel(1) to run multiple vacuumdb's in parallel in Unix and some
similar mechanism in Windows, and that's it.  So we wouldn't provide the
complete toolset, but the platform surely has ways to make it happen.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] TODO : Allow parallel cores to be used by vacuumdb [ WIP ]

2014-07-02 Thread Sawada Masahiko

On Wed, Jul 2, 2014 at 2:27 PM, Dilip kumar dilip.ku...@huawei.com wrote:
 On 01 July 2014 22:17, Sawada Masahiko Wrote,


 I have executed latest patch.
 One question is that how to use --jobs option is correct?
 $ vacuumdb  -d postgres  --jobs=30

 I got following error.
 vacuumdb: unrecognized option '--jobs=30'
 Try vacuumdb --help for more information.


 Thanks for comments, Your usage are correct, but there are some problem in 
 code and I have fixed the same in attached patch.


This patch allows to set 0 to -j option?
When I set 0 to -j option, I think that the behavior of this is same
as when I set to 1.

Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] log_error_verbosity and unexpected errors

2014-07-02 Thread Greg Stark

I think log_error_verbosity is a strange variable. It's useless for
expected user-facing errors but essential for unexpected errors that
indicate bugs in the code -- and you can only have it on for
everything or off for everything.

I'm finding I usually want it set to 'verbose' for anything that
PANICs or is generated by an elog() but it's just noise for anything
generated by an ereport() and is ERROR or below.

The minimum suggested change would to make it implicitly true for
PANIC and any unexpected elog()s and leave the GUC for enabling it in
the other cases.

-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Aggregate function API versus grouping sets

Andrew Gierth and...@tao11.riddles.org.uk writes:
 Tom == Tom Lane t...@sss.pgh.pa.us writes:
  Tom Another approach would be to remove AggGetPerAggEContext as such
  Tom from the API altogether, and instead offer an interface that
  Tom says register an aggregate cleanup callback, leaving it to the
  Tom agg/window core code to figure out which context to hang that
  Tom on.  I had thought that exposing this econtext and the
  Tom per-input-tuple one would provide useful generality, but maybe
  Tom we should rethink that.

 Works for me.

If we're going to do that, I think it needs to be in 9.4.  Are you
going to work up a patch?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] alter user set local_preload_libraries.

On Wed, Jul 2, 2014 at 11:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 postgres=# alter user horiguti2 set local_preload_libraries='libname';
 ERROR:  parameter local_preload_libraries cannot be set after connection 
 start

 Hm ... it's kind of annoying that that doesn't work; it's certainly not
 hard to imagine use-cases for per-user or per-database settings of
 local_preload_libraries.  However, I'm not sure if we apply per-user or
 per-database settings early enough in backend startup that they could
 affect library loading.  If we do, then the ALTER code needs to be
 taught to allow this.

ISTM that's what session_preload_libraries does. Anyway I agree with
Horiguchi that we should remove incorrect description from the document
of old versions.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] alter user set local_preload_libraries.

Fujii Masao masao.fu...@gmail.com writes:
 On Wed, Jul 2, 2014 at 11:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 postgres=# alter user horiguti2 set local_preload_libraries='libname';
 ERROR:  parameter local_preload_libraries cannot be set after connection 
 start

 Hm ... it's kind of annoying that that doesn't work; it's certainly not
 hard to imagine use-cases for per-user or per-database settings of
 local_preload_libraries.  However, I'm not sure if we apply per-user or
 per-database settings early enough in backend startup that they could
 affect library loading.  If we do, then the ALTER code needs to be
 taught to allow this.

 ISTM that's what session_preload_libraries does.

No, the reason for the distinction is that session_preload_libraries is
superuser-only, and can load anything, while local_preload_libraries is
(supposed to be) settable by ordinary users, and therefore is restricted
to load only a subset of available libraries.  But if you can't apply it
via ALTER USER that seems like it takes away a lot of the value of having
a non-SUSET GUC in this area.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] log_error_verbosity and unexpected errors

Greg Stark st...@mit.edu writes:
 I think log_error_verbosity is a strange variable. It's useless for
 expected user-facing errors but essential for unexpected errors that
 indicate bugs in the code -- and you can only have it on for
 everything or off for everything.

 I'm finding I usually want it set to 'verbose' for anything that
 PANICs or is generated by an elog() but it's just noise for anything
 generated by an ereport() and is ERROR or below.

 The minimum suggested change would to make it implicitly true for
 PANIC and any unexpected elog()s and leave the GUC for enabling it in
 the other cases.

Hm.  I'm okay with the PANIC idea, I think, but the other not so much.
That would change the rule of thumb that elog is for non user facing
errors into something more than just permission to be lazy.  In
particular, elog currently is (and is documented to be) equivalent to
an ereport call with suitable parameters.  I'm not terribly happy about
losing that equivalence, because I don't think that people have been
especially careful about which to use.  contrib is still full of
user-facing conditions reported via elog, for instance, and I expect
that third-party code is worse.

[ thinks for a bit... ]  A slightly cleaner approach is to nominate
a specified set of SQLSTATEs, certainly including XX000 and perhaps
some others, as being ones that force verbose reporting.  That would
have the same practical effect as far as elogs go, but wouldn't break
the nominal functional equivalence.

And that brings up the previous work on SQLSTATE-dependent choices
about whether to log at all.  I remember a patch was submitted for
that but don't remember offhand why it didn't get committed.  ISTM
we should think about reviving that and making the choice be not just
log or not, but no log, terse log, normal log, verbose log.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] alter user set local_preload_libraries.

On Thu, Jul 3, 2014 at 3:53 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 On Wed, Jul 2, 2014 at 11:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Kyotaro HORIGUCHI horiguchi.kyot...@lab.ntt.co.jp writes:
 postgres=# alter user horiguti2 set local_preload_libraries='libname';
 ERROR:  parameter local_preload_libraries cannot be set after connection 
 start

 Hm ... it's kind of annoying that that doesn't work; it's certainly not
 hard to imagine use-cases for per-user or per-database settings of
 local_preload_libraries.  However, I'm not sure if we apply per-user or
 per-database settings early enough in backend startup that they could
 affect library loading.  If we do, then the ALTER code needs to be
 taught to allow this.

 ISTM that's what session_preload_libraries does.

 No, the reason for the distinction is that session_preload_libraries is
 superuser-only, and can load anything, while local_preload_libraries is
 (supposed to be) settable by ordinary users, and therefore is restricted
 to load only a subset of available libraries.  But if you can't apply it
 via ALTER USER that seems like it takes away a lot of the value of having
 a non-SUSET GUC in this area.

Agreed. I was also thinking we can set it per role, but got surprised
when I found that's impossible. Is this a problem of only
local_preload_libraries? I'm afraid that all PGC_BACKEND parameters
have the same problem.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Should extension--unpackaged-$ver.sql's \echo use FROM unpackaged;?

Hi,

Fujii noticed that I'd used
\echo Use CREATE EXTENSION pg_buffercache to load this file. \quit
in an extension upgrade script. That's obviously wrong, because it
should say ALTER EXTENSION. The reason it's that way is that I copied
the line from the unpackaged--1.0.sql file.
I noticed all unpackaged--$version transition files miss the FROM
unpackaged in that echo.
I guess that's just a oversight which we should correct? Or is there
some rationale behind it?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] log_error_verbosity and unexpected errors

2014-07-02 Thread Merlin Moncure

On Wed, Jul 2, 2014 at 2:10 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark st...@mit.edu writes:
 I think log_error_verbosity is a strange variable. It's useless for
 expected user-facing errors but essential for unexpected errors that
 indicate bugs in the code -- and you can only have it on for
 everything or off for everything.

 I'm finding I usually want it set to 'verbose' for anything that
 PANICs or is generated by an elog() but it's just noise for anything
 generated by an ereport() and is ERROR or below.

 The minimum suggested change would to make it implicitly true for
 PANIC and any unexpected elog()s and leave the GUC for enabling it in
 the other cases.

 Hm.  I'm okay with the PANIC idea, I think, but the other not so much.
 That would change the rule of thumb that elog is for non user facing
 errors into something more than just permission to be lazy.  In
 particular, elog currently is (and is documented to be) equivalent to
 an ereport call with suitable parameters.  I'm not terribly happy about
 losing that equivalence, because I don't think that people have been
 especially careful about which to use.  contrib is still full of
 user-facing conditions reported via elog, for instance, and I expect
 that third-party code is worse.

 [ thinks for a bit... ]  A slightly cleaner approach is to nominate
 a specified set of SQLSTATEs, certainly including XX000 and perhaps
 some others, as being ones that force verbose reporting.  That would
 have the same practical effect as far as elogs go, but wouldn't break
 the nominal functional equivalence.

That's a really nifty idea.  If you recall, I griped a while back
about the fact that there was no good setting for the psql VERBOSITY
switch.  In particular, it would be nice to have context for errors
but not for notices.

Marko had proposed a new verbosity level (see commentary leading to
unfinished patch here;
http://postgresql.1045698.n5.nabble.com/PL-pgSQL-RAISE-and-error-context-td5768176.html).
Not trying to hijack your thread, just wondering out load if a
SQLSTATE driven verbosity decision, if you were to do that,
could/should also be hooked to client console logging and/or psql.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] docs: additional subsection for page-level locks in explicit-locking section

2014-07-02 Thread Michael Banck

Hi,

While reading through the Explicit Locking section of the manual today,
I felt the last paragraph of section 13.3.2. (Row-level Locks) might
merit its own subsection.  It talks about page-level locks as distinct
from table- and row-level locks.  Then again, it is just one paragraph,
so maybe this was deliberate and/or rejected before (though I couldn't
find prior discussion off-hand).  Proposed patch attached.


Cheers,

Michael

-- 
Michael Banck
Projektleiter / Berater
Tel.: +49 (2161) 4643-171
Fax:  +49 (2161) 4643-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Hohenzollernstr. 133, 41061 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
diff --git a/doc/src/sgml/mvcc.sgml b/doc/src/sgml/mvcc.sgml
index 12b7814..84501e0 100644
--- a/doc/src/sgml/mvcc.sgml
+++ b/doc/src/sgml/mvcc.sgml
@@ -1140,6 +1140,9 @@ ERROR:  could not serialize access due to read/write dependencies among transact
  will result in disk writes.
 /para
 
+   sect2 id=locking-pages
+titlePage-level Locks/title
+  
 para
  In addition to table and row locks, page-level share/exclusive locks are
  used to control read/write access to table pages in the shared buffer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] alter user set local_preload_libraries.

Fujii Masao masao.fu...@gmail.com writes:
 Agreed. I was also thinking we can set it per role, but got surprised
 when I found that's impossible. Is this a problem of only
 local_preload_libraries? I'm afraid that all PGC_BACKEND parameters
 have the same problem.

Well, there aren't that many PGC_BACKEND parameters.

Two of them are log_connections and log_disconnections, which we've
been arguing over whether they should be controllable by non-superusers
in the first place.  The only other ones are ignore_system_indexes and 
post_auth_delay, which are debugging things that I can't see using with
ALTER.  So this may be the only one where it's really of much interest.

But I agree that the problem is triggered by the PGC_BACKEND categorization
and not the meaning of this specific GUC.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-02 Thread Martijn van Oosterhout

On Tue, Jul 01, 2014 at 02:36:55PM -0500, Merlin Moncure wrote:
 On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout
 klep...@svana.org wrote:
  On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote:
  The simplified scene:
  select slowfunction(s) from a order by b limit 1;
  is slow than
  select slowfunction(s) from (select s from a order by b limit 1) as z;
  if there are many records in table 'a'.
 
 
  The real scene. Function  ST_Distance_Sphere is slow, the query:
  SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road 
  order by c limit 1;
  is slow than:
  select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from (SELECT s 
  from road order by c limit 1) as a;
  There are about 7000 records in 'road'.
 
  I think to help here I think we need the EXPLAIN ANALYSE output for
  both queries.
 
 Well, I think the problem is a well understood one: there is no
 guarantee that functions-in-select-list are called exactly once per
 output row.  This is documented -- for example see here:
 http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS.
 In short, if you want very precise control of function evaluation use
 a subquery, or, if you're really paranoid, a CTE.

I'm probably dense, but I'm not sure I understand. Or it is that the
slowfunction() is called prior to the sort? That seems insane.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer


signature.asc
Description: Digital signature

Re: [HACKERS] Audit of logout

On Wed, Jul 2, 2014 at 11:39 PM, Joe Conway m...@joeconway.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 07/01/2014 11:55 PM, Fujii Masao wrote:
 Hmm... I found that you had marked this proposal as Returned with
 Feedback. But I don't think that we reached the consensus to do
 that. I think that it's still worth discussing this topic in this
 CF. So I marked this as Needs Review again.

 If you strongly think that this proposal should be marked as
 Returned with Feedback, could you let me know why you think so?

 Returned with Feedback means, well exactly that ;-) -- the patch as
 linked to the CF page is wrong and cannot be applied the way it is
 currently. It is therefore returned to you to be fixed. It does not
 mean Rejected which is what you seem to infer.

I think that we should use Waiting on Author in that case.

Returned with Feedback is not Rejected. That's right. But it basically
means that the bugfixed or revised version of the patch will NOT be
reviewed in this CF. IOW, the author has to wait for the patch review
until next CF.

 As mentioned on the CF the documentation change is flat wrong, and you
 yourself have said that you think PGC_SUSET is wrong now and
 PGC_SIGHUP should be used instead.

 Furthermore I doubt you have tested the change to PGC_SIGHUP because I
 did a quick test, and for some reason I haven't yet tracked down SHOW
 does not see the change to log_disconnections as you said it would
 after a reload, unlike other PGC_SIGHUP vars.

No. If we change it to PGC_SIGHUP, SHOW command does display
the changed value after a reload. It's the same behavior as other
PGC_SIGHUP parameters do. Attached patch just changes it to PGC_SIGHUP.
You can test that by using the patch.

OTOH, the current behavior, i.e., log_disconnections with PGC_BACKEND,
is that SHOW command doesn't display the changed value after a reload.

 So there is more
 thought, testing, and possibly coding needed to make this a viable change.

+1

Regards,

-- 
Fujii Masao
From 098f11cfbf8f9370829e24e9f8d704c050c98875 Mon Sep 17 00:00:00 2001
From: MasaoFujii masao.fu...@gmail.com
Date: Fri, 13 Jun 2014 22:09:39 +0900
Subject: [PATCH] Make log_disconnections PGC_SUSET rather than PGC_SIGHUP.

So far even non-superusers could disable log_disconnections in order to
prevent their session logout from being logged because the parameter was
defined with PGC_BACKEND. This was harmful in the systems which need to
audit all session logouts by using log_disconnections. For this problem,
this commit changes the GUC context of log_disconnections to PGC_SIGHUP
and forbids any client from changing its setting.
---
 doc/src/sgml/config.sgml | 3 ++-
 src/backend/utils/misc/guc.c | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 37d78b3..f5f0324 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4257,7 +4257,8 @@ local0.*/var/log/postgresql
 varnamelog_connections/varname but at session termination,
 and includes the duration of the session.  This is off by
 default.
-This parameter cannot be changed after session start.
+This parameter can only be set in the filenamepostgresql.conf/
+file or on the server command line.
/para
   /listitem
  /varlistentry
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 3a31a75..19e521b 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -919,7 +919,7 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 	{
-		{log_disconnections, PGC_BACKEND, LOGGING_WHAT,
+		{log_disconnections, PGC_SIGHUP, LOGGING_WHAT,
 			gettext_noop(Logs end of a session, including duration.),
 			NULL
 		},
-- 
1.7.12.4 (Apple Git-37)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] log_error_verbosity and unexpected errors

Merlin Moncure mmonc...@gmail.com writes:
 Not trying to hijack your thread, just wondering out load if a
 SQLSTATE driven verbosity decision, if you were to do that,
 could/should also be hooked to client console logging and/or psql.

Yeah, you could certainly argue that a similar facility on the client side
would be equally useful.  That would also lend some support to my gut
feeling that it should not be elog-vs-ereport that drives the decision,
because the client is not going to be able to tell which was used.
But the client side could have functionally equivalent behavior if it were
based on SQLSTATE and/or severity level.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Aggregate function API versus grouping sets

 Tom == Tom Lane t...@sss.pgh.pa.us writes:

 Tom Another approach would be to remove AggGetPerAggEContext as such
 Tom from the API altogether, and instead offer an interface that
 Tom says register an aggregate cleanup callback, leaving it to the
 Tom agg/window core code to figure out which context to hang that
 Tom on.  I had thought that exposing this econtext and the
 Tom per-input-tuple one would provide useful generality, but maybe
 Tom we should rethink that.

  Works for me.

 Tom If we're going to do that, I think it needs to be in 9.4.  Are
 Tom you going to work up a patch?

Do we want a decision on the fn_extra matter first, or shall I do one
patch for the econtext, and a following one for fn_extra?

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch to send transaction commit/rollback stats to the stats collector unconditionally.

In preparing to push the patch, I noticed I hadn't responded to this:

Gurjeet Singh gurj...@singh.im wrote:
 Kevin Grittner kgri...@ymail.com wrote:
 I have reviewed this patch, and think we should do what the patch
 is trying to do, but I don't think the submitted patch would
 actually work.

 Just curious, why do you think it won't work.

Because you didn't touch this part of the function:

    /*
 * Send partial messages.  If force is true, make sure that any pending
 * xact commit/abort gets counted, even if no table stats to send.
 */
    if (regular_msg.m_nentries  0 ||
    (force  (pgStatXactCommit  0 || pgStatXactRollback  0)))
    pgstat_send_tabstat(regular_msg);

The statistics would not actually be sent unless a table had been
accessed or it was forced because the connection was closing.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Audit of logout

2014-07-02 Thread Joe Conway

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

On 07/02/2014 12:43 PM, Fujii Masao wrote:
 I think that we should use Waiting on Author in that case.
 
 Returned with Feedback is not Rejected. That's right. But it
 basically means that the bugfixed or revised version of the patch
 will NOT be reviewed in this CF. IOW, the author has to wait for
 the patch review until next CF.

Doesn't mean that to me but feel free to change it to Waiting on
Author if you prefer :-)

Is there any official explanation as to what those states mean
documented anywhere?


 No. If we change it to PGC_SIGHUP, SHOW command does display the
 changed value after a reload. It's the same behavior as other 
 PGC_SIGHUP parameters do. Attached patch just changes it to
 PGC_SIGHUP. You can test that by using the patch.

I tried it already myself and it did not work the same for me. Perhaps
I messed something up but I tried two or three times and the value of
log_disconnections did not change in the open session when I did SHOW
after reload, but in new backends it did. On the other hand I also
tried a different SIGHUP var (log_checkpoints) and saw that effect
immediately in the open session. Was too tired to follow up last night
though. Have you verified for yourself that it behaves the way you say?

Joe



- -- 
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting,  24x7 Support
-BEGIN PGP SIGNATURE-
Version: GnuPG v1
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJTtGMTAAoJEDfy90M199hlZs0P/A4ptorZOFkvW3DJRDfKoE2f
v9wiWHYQJtN31sdDdljRD+3WvfiGJhMVCVukzZbaZejVGsEOModEOSLJZgkPhqa/
n/TSD77KdfHWkiAENkBmsBJLfqccysRxpWsJ5rPSiXqmyz+hxHJ6u8R4RGf/gPXn
15Rq3ZYNtGP25pTlHUr869y1D9dqnEzYdhf69ql4iIQPgsGow+bppZGVEEFd6MsF
2rpUTgdczhn4yl/kMTW25CFm19dHjj+m/v+Yv9uTg0JWX8xBtQ5hT9Z8Pgc/xU2U
WDsdfQ46ORinBOfPd/RPoJoIjxpMGMtuB5dX0mHsVzp7+kqSdKI+5amSz8YxqBE8
ii01AOfH2fOws236QQtVywWRHJMs7xWj5k/YNe1ABNoh5wFQRaaFkVC5r/i+Url2
3lDHUN/MjTeXieBQMUDNmqQbrRShYQqBnWl05n8dn/6V5U42eundJCr0tO+awW9V
3gTVTpwymsv7AkJfk3LjUybTYYDee3YM1A2YhdufYrhmQJttjh1kaeai9G05y5SZ
vU6DL+FJkVukxq5n6MTDzcxKrL0SYcRUVW0FZmng/Wn5SrsBC8AifkuR3Z1uFY5s
TjgCMDr/RzTGlXjITJQOl9smc6P/0yI8JKvdkAGnjeKY9zYsVn35fs/6HGMPmusZ
DWEp8jAdDohoWgiZCz4x
=0qGo
-END PGP SIGNATURE-


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Can simplify 'limit 1' with slow function?

2014-07-02 Thread David G Johnston

Martijn van Oosterhout wrote
 On Tue, Jul 01, 2014 at 02:36:55PM -0500, Merlin Moncure wrote:
 On Tue, Jul 1, 2014 at 2:16 PM, Martijn van Oosterhout
 lt;

 kleptog@

 gt; wrote:
  On Sun, Jun 29, 2014 at 10:05:50PM +0800, gotoschool6g wrote:
  The simplified scene:
  select slowfunction(s) from a order by b limit 1;
  is slow than
  select slowfunction(s) from (select s from a order by b limit 1) as z;
  if there are many records in table 'a'.
 
 
  The real scene. Function  ST_Distance_Sphere is slow, the query:
  SELECT ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from road
 order by c limit 1;
  is slow than:
  select ST_Distance_Sphere(s, ST_GeomFromText('POINT(1 1)')) from
 (SELECT s from road order by c limit 1) as a;
  There are about 7000 records in 'road'.
 
  I think to help here I think we need the EXPLAIN ANALYSE output for
  both queries.
 
 Well, I think the problem is a well understood one: there is no
 guarantee that functions-in-select-list are called exactly once per
 output row.  This is documented -- for example see here:
 http://www.postgresql.org/docs/9.1/static/explicit-locking.html#ADVISORY-LOCKS.
 In short, if you want very precise control of function evaluation use
 a subquery, or, if you're really paranoid, a CTE.
 
 I'm probably dense, but I'm not sure I understand. Or it is that the
 slowfunction() is called prior to the sort? That seems insane.

The basic reality is that limit applies to the final set of rows that could
be output.  Since stuff like group by and distinct require knowledge of the
exact values of every output column all expressions must necessarily be
evaluated before limit.

If you want to pick just 10 rows and then process them you need a subquery.

David J.






--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Can-simplify-limit-1-with-slow-function-tp5809997p5810297.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] gaussian distribution pgbench

2014-07-02 Thread Gavin Flower


On 02/07/14 21:05, Fabien COELHO wrote:


Hello Mitsumasa-san,


And I'm also interested in your decile percents output like under
followings,
decile percents: 39.6% 24.0% 14.6% 8.8% 5.4% 3.3% 2.0% 1.2% 0.7% 0.4%


Sure, I'm really fine with that.

I think that it is easier than before. Sum of decile percents is just 
100%.


That's a good property:-)

However, I don't prefer highest/lowest percentage because it will 
be confused with decile percentage for users, and anyone cannot 
understand this digits. I cannot understand 4.9%, 0.0% when I see 
the first time. Then, I checked the source code, I understood it:( 
It's not good design... #Why this parameter use 100?


What else? People have ten fingers and like powers of 10, and are used 
to percents?



So I'd like to remove it if you like. It will be more simple.


I think that for the exponential distribution it helps, especially for 
high threshold, to have the lowest/highest percent density. For low 
thresholds, the decile is also definitely useful. So I'm fine with 
both outputs as you have put them.


I have just updated the wording so that it may be clearer:

 decile percents: 69.9% 21.0% 6.3% 1.9% 0.6% 0.2% 0.1% 0.0% 0.0% 0.0%
 probability of fist/last percent of the range: 11.3% 0.0%


Attached patch is fixed version, please confirm it.


Attached a v15 which just fixes a typo and the above wording update. 
I'm validating it for committers.



#Of course, World Cup is being held now. I'm not hurry at all.


I'm not a soccer kind of person, so it does not influence my 
availibility.:-)



Suggested commit message:

Add drawing random integers with a Gaussian or truncated exponentional 
distributions to pgbench.


Test variants with these distributions are also provided and triggered
with options --gaussian=... and --exponential=


Have a nice day/night,



I would suggest that probabilities should NEVER be expressed in 
percentages! As a percentage probability looks weird, and is never used 
for serious statistical work - in my experience at least.


I think probabilities should be expressed in the range 0 ... 1 - i.e. 
0.35 rather than 35%.



Cheers,
Gavin

Re: [HACKERS] Audit of logout

Fujii Masao masao.fu...@gmail.com writes:
 On Wed, Jul 2, 2014 at 11:39 PM, Joe Conway m...@joeconway.com wrote:
 Returned with Feedback means, well exactly that ;-) -- the patch as
 linked to the CF page is wrong and cannot be applied the way it is
 currently. It is therefore returned to you to be fixed. It does not
 mean Rejected which is what you seem to infer.

 I think that we should use Waiting on Author in that case.

I don't think there's a huge distinction between Waiting on Author and
Returned with Feedback.  The former means we think the author will
probably submit a new version shortly, the latter means we think it'll
take awhile, but in any particular case those predictions could turn
out wrong.  I don't have a problem with moving a patch from Returned with
Feedback back to Needs Review if the author is able to turn it around
while the CF is still going on.

As far as the particular case goes, it strikes me that the real issue
here is that we're confusing privilege level with time-duration of the
setting's effect.  The point of marking log_connections as PGC_BACKEND is
that changing it within a session after a given session starts is useless,
and it's probably better to freeze it so it can tell you what was actually
done by the current session.  The point of marking log_disconnections as
PGC_BACKEND is so that you can freeze the determination of whether a given
session will log its disconnection at the same time that its determination
of whether to log its connection got frozen, and thus ensure that each
connection log entry should eventually have a disconnection log entry,
assuming you have both enabled.  These considerations are not invalidated
by questions of which users should be allowed to adjust the value.

In short, maybe we ought to invent a new category PGC_SU_BACKEND (not
wedded to this spelling), which is to PGC_BACKEND as PGC_SUSET is to
PGC_USERSET, ie same when-it-can-be-changed behavior but only superusers
are allowed to change it.  I don't have any objection to making these two
settings only adjustable by superusers --- I just don't want to give up
the existing timing restrictions for them.

(If we did this, there are probably some other settings that should become
PGC_SU_BACKEND, eg session_preload_libraries ...)

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Aggregate function API versus grouping sets

Andrew Gierth and...@tao11.riddles.org.uk writes:
 Tom == Tom Lane t...@sss.pgh.pa.us writes:
  Tom If we're going to do that, I think it needs to be in 9.4.  Are
  Tom you going to work up a patch?

 Do we want a decision on the fn_extra matter first, or shall I do one
 patch for the econtext, and a following one for fn_extra?

I think they're somewhat independent, and probably best patched
separately.  In any case orderedsetagg.c's use of fn_extra is a local
matter that we'd not really have to fix in 9.4, except to the extent
that you think third-party code might copy it.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Aggregate function API versus grouping sets

 Tom == Tom Lane t...@sss.pgh.pa.us writes:

  Do we want a decision on the fn_extra matter first, or shall I do
  one patch for the econtext, and a following one for fn_extra?

 Tom I think they're somewhat independent, and probably best patched
 Tom separately.  In any case orderedsetagg.c's use of fn_extra is a
 Tom local matter that we'd not really have to fix in 9.4, except to
 Tom the extent that you think third-party code might copy it.

Given that there's been no attempt to expose ordered_set_startup /
ordered_set_transition* as some sort of API, I think it's virtually
inevitable that people will cargo-cult all of that code into any new
ordered set aggregate they might wish to create.

(Had one request so far for a mode() variant that returns the unique
modal value if one exists, otherwise null; so the current set of
ordered-set aggs by no means exhausts the possible applications.)

-- 
Andrew (irc:RhodiumToad)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Can simplify 'limit 1' with slow function?

David G Johnston david.g.johns...@gmail.com writes:
 Martijn van Oosterhout wrote
 I'm probably dense, but I'm not sure I understand. Or it is that the
 slowfunction() is called prior to the sort? That seems insane.

 The basic reality is that limit applies to the final set of rows that could
 be output.

It's not so much the limit as that the sort has to happen before the
limit, and yes, evaluation of the targetlist happens before the sort.

This is fundamental to the SQL conceptual model; remember that SQL92 had
SELECT slowfunction(), ... ORDER BY 1, which certainly requires the
function to be evaluated before the sort happens.  And there's nothing in
the conceptual model suggesting that different targetlist entries should
be evaluated at different times, so just ordering by something other than
the slowfunction() entry doesn't get you out of that.

I'm not sure how much of this there is chapter and verse for in the
SQL standard, but ISTM the stage sequencing we lay out in our SELECT
reference page is pretty much forced by the standard.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Audit of logout

At 2014-07-02 12:52:51 -0700, m...@joeconway.com wrote:

 Doesn't mean that to me but feel free to change it to Waiting on
 Author if you prefer :-)
 
 Is there any official explanation as to what those states mean
 documented anywhere?

I don't know if there's an official definition, but my understanding of
returned with feedback was always pretty much in agreement with what
Fujii wrote. If the author is expected to post a revised patch soon, it
gets marked waiting on author, and returned with feedback means it
will take longer, probably in the next CF.

But I also treat these labels as a matter of convenience, and definitely
not some mark of shame where the author should feel upset that the patch
was put in one state or the other. As far as I'm concerned, patches can
be switched from returned with feedback to needs review to ready
for committer at the drop of a hat if updates are made in time.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Re: Patch to send transaction commit/rollback stats to the stats collector unconditionally.

Tom Lane t...@sss.pgh.pa.us wrote:


 FWIW, master + 9.4 seems like a reasonable compromise to me too.

Pushed to those branches.  Marked in CF.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Audit of logout

I wrote:
 In short, maybe we ought to invent a new category PGC_SU_BACKEND (not
 wedded to this spelling), which is to PGC_BACKEND as PGC_SUSET is to
 PGC_USERSET, ie same when-it-can-be-changed behavior but only superusers
 are allowed to change it.  I don't have any objection to making these two
 settings only adjustable by superusers --- I just don't want to give up
 the existing timing restrictions for them.

Another idea would be to get rid of PGC_SUSET as a separate category, and
instead have a superuser-only bit in the GUC flags, which would apply to
all categories.  This would be a bit more orthogonal, though likely a
much more invasive change.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Audit of logout

Abhijit Menon-Sen wrote:
 At 2014-07-02 12:52:51 -0700, m...@joeconway.com wrote:
 
  Doesn't mean that to me but feel free to change it to Waiting on
  Author if you prefer :-)
  
  Is there any official explanation as to what those states mean
  documented anywhere?
 
 I don't know if there's an official definition, but my understanding of
 returned with feedback was always pretty much in agreement with what
 Fujii wrote. If the author is expected to post a revised patch soon, it
 gets marked waiting on author, and returned with feedback means it
 will take longer, probably in the next CF.

I think the main thing with returned with feedback is the patch is not
supposed to be handled any further in the current commitfest.  Normally
if a new version is submitted, it's opened as a new entry in a future
commitfest.  So it's one of the three final states for a patch, the
other two being committed and rejected.  The other status values,
needs review, waiting on author, and ready for committer are
transient and can change to any other state.

So I disagree with you here:

 But I also treat these labels as a matter of convenience, and definitely
 not some mark of shame where the author should feel upset that the patch
 was put in one state or the other. As far as I'm concerned, patches can
 be switched from returned with feedback to needs review to ready
 for committer at the drop of a hat if updates are made in time.

A patch that is Returned with Feedback is *not* supposed to go back to
needs review or any of the other states.  If we expect that the author
is going to update the patch, the right state to use is Waiting on
author.

In any case, no state is a mark of shame on the author.  We are not
supposed to judge people here.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

[HACKERS] Re: Patch to send transaction commit/rollback stats to the stats collector unconditionally.

2014-07-02 Thread Gurjeet Singh

On Wed, Jul 2, 2014 at 3:49 PM, Kevin Grittner kgri...@ymail.com wrote:
 In preparing to push the patch, I noticed I hadn't responded to this:

 Gurjeet Singh gurj...@singh.im wrote:
 Kevin Grittner kgri...@ymail.com wrote:
 I have reviewed this patch, and think we should do what the patch
 is trying to do, but I don't think the submitted patch would
 actually work.

 Just curious, why do you think it won't work.

 Because you didn't touch this part of the function:

 /*
  * Send partial messages.  If force is true, make sure that any pending
  * xact commit/abort gets counted, even if no table stats to send.
  */
 if (regular_msg.m_nentries  0 ||
 (force  (pgStatXactCommit  0 || pgStatXactRollback  0)))
 pgstat_send_tabstat(regular_msg);

 The statistics would not actually be sent unless a table had been
 accessed or it was forced because the connection was closing.

I sure did! In fact that was the one and only line of code that was
changed. It effectively bypassed the 'force' check if there were any
transaction stats to report.

 /*
- * Send partial messages.  If force is true, make sure that any pending
- * xact commit/abort gets counted, even if no table stats to send.
+ * Send partial messages.  Make sure that any pending xact
commit/abort gets
+ * counted, even if no table stats to send.
  */
 if (regular_msg.m_nentries  0 ||
-(force  (pgStatXactCommit  0 || pgStatXactRollback  0)))
+force || (pgStatXactCommit  0 || pgStatXactRollback  0))
 pgstat_send_tabstat(regular_msg);
 if (shared_msg.m_nentries  0)
 pgstat_send_tabstat(shared_msg);

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB : www.EnterpriseDB.com : The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Patch to send transaction commit/rollback stats to the stats collector unconditionally.

Gurjeet Singh gurj...@singh.im wrote:
 On Wed, Jul 2, 2014 at 3:49 PM, Kevin Grittner kgri...@ymail.com wrote:
 Gurjeet Singh gurj...@singh.im wrote:
 Kevin Grittner kgri...@ymail.com wrote:
 I have reviewed this patch, and think we should do what the patch
 is trying to do, but I don't think the submitted patch would
 actually work.

 Just curious, why do you think it won't work.

 Because you didn't touch this part of the function:

 /*
   * Send partial messages.  If force is true, make sure that any pending
   * xact commit/abort gets counted, even if no table stats to send.
   */
 if (regular_msg.m_nentries  0 ||
 (force  (pgStatXactCommit  0 || pgStatXactRollback  0)))
 pgstat_send_tabstat(regular_msg);

 The statistics would not actually be sent unless a table had been
 accessed or it was forced because the connection was closing.

 I sure did! In fact that was the one and only line of code that was
 changed. It effectively bypassed the 'force' check if there were any
 transaction stats to report.

Sorry, I had too many versions of things sitting around and looked
at the wrong one.  It was the test at the top that you might not
get past to be able to report things below:

    /* Don't expend a clock check if nothing to do */
    if ((pgStatTabList == NULL || pgStatTabList-tsa_used == 0) 
    !have_function_stats  !force)
    return;

The function stats might have gotten you past it for the particular
cases you were testing, but the problem could be caused without
that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Audit of logout

At 2014-07-02 16:47:16 -0400, alvhe...@2ndquadrant.com wrote:
 
 If we expect that the author is going to update the patch, the right
 state to use is Waiting on author.

Quite so. That's how I understand it, and what I've been doing. But what
if we really *don't* expect the author to update the patch, but they do
it anyway? That's the only case I was referring to.

If the right thing to do is to open an entry in the next CF (as you said
earlier in your mail), that's all right with me.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] Audit of logout