Re: [HACKERS] Allowing empty target list in SELECT (1b4f7f93b4693858cb983af3cd557f6097dab67b)

2014-05-02 Thread Rajeev rastogi
On 02 May 2014 10:00, Amit Langote wrote:

  Is the following behavior perceived fix-worthy?
 
  -- note the '1's in the outputs
 
  postgres=# CREATE TABLE test AS SELECT;
  SELECT 1
 
  postgres=# insert into test select;
  INSERT 0 1
 
 
 Or maybe, it just means 1 'null' row/record and not no row at all?

It just creates an item pointer, and the corresponding heap tuple header 
(without any data or null bitmap) gets stored as part of this insertion.
So although it does not insert anything (not even a NULL), it still reserves 
one row position. 
So during SELECT, it will not display any column data, but it will show the 
actual number of rows.

Even the following syntax is allowed:

CREATE TABLE no_column_table();

IMO, this might be useful for dynamic use of a table (columns might be added 
later using ALTER TABLE) or for use as an abstract ancestor in a class hierarchy.


Thanks and Regards,
Kumar Rajeev Rastogi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-02 Thread Ian Barwick

Hi

Here is an initial version of an auditing extension for Postgres to
generate log output suitable for compiling a comprehensive audit trail
of database operations.


Why auditing?

Various laws and regulations (HIPAA, PCI DSS, EU Data Protection
Directive etc.) as well as internal business requirements mandate
auditing at database level. While many proprietary and some open
source databases offer auditing facilities, Postgres does not currently
provide any kind of auditing feature. Availability of such a feature
will assist PostgreSQL's adoption in key sectors such as finance
and health.


About pgaudit

pgaudit uses Event Triggers to log an unambiguous representation of DDL,
as well as a combination of executor and utility hooks for other
commands (DML, including SELECT, as well as other utility commands):

https://github.com/2ndQuadrant/pgaudit

To provide fully-featured auditing capability, pgaudit exploits the
capabilities of the new Event Trigger code, which 2ndQuadrant will be
submitting to core Postgres. Currently that means you'll have to
build against an enhanced version of Postgres [1]. However the
intention is that pgaudit will be both a useful module now (it is designed
to compile against 9.3 and 9.4) and will also serve as a demonstration
of features proposed for 9.5.

[1] deparse branch of git://git.postgresql.org/git/2ndquadrant_bdr.git


Here's some example log output:

LOG:  [AUDIT],2014-04-30 17:13:55.202854+09,auditdb,ianb,ianb,DEFINITION,CREATE 
TABLE,TABLE,public.x,CREATE  TABLE  public.x (a pg_catalog.int4   , b 
pg_catalog.int4   )   WITH (oids=OFF)
LOG:  [AUDIT],2014-04-30 
17:14:06.548923+09,auditdb,ianb,ianb,WRITE,INSERT,TABLE,public.x,INSERT INTO x 
VALUES(1,1);
LOG:  [AUDIT],2014-04-30 
17:14:21.221879+09,auditdb,ianb,ianb,READ,SELECT,TABLE,public.x,SELECT * FROM x;
LOG:  [AUDIT],2014-04-30 
17:15:25.620213+09,auditdb,ianb,ianb,READ,SELECT,VIEW,public.v_x,SELECT * from 
v_x;
LOG:  [AUDIT],2014-04-30 
17:15:25.620262+09,auditdb,ianb,ianb,READ,SELECT,TABLE,public.x,SELECT * from 
v_x;
LOG:  [AUDIT],2014-04-30 
17:16:00.849868+09,auditdb,ianb,ianb,WRITE,UPDATE,TABLE,public.x,UPDATE x SET 
a=a+1;
LOG:  [AUDIT],2014-04-30 
17:16:18.291452+09,auditdb,ianb,ianb,ADMIN,VACUUM,,,VACUUM x;
LOG:  [AUDIT],2014-04-30 17:18:01.08291+09,auditdb,ianb,ianb,DEFINITION,CREATE 
FUNCTION,FUNCTION,public.func_x(),CREATE  FUNCTION public.func_x() RETURNS  
pg_catalog.int4 LANGUAGE sql  VOLATILE  CALLED ON NULL INPUT SECURITY INVOKER 
COST 100.00   AS $dprs_$SELECT a FROM x LIMIT 1;$dprs_$
LOG:  [AUDIT],2014-04-30 
17:18:09.694755+09,auditdb,ianb,ianb,FUNCTION,EXECUTE,FUNCTION,public.func_x,SELECT
 * FROM func_x();
LOG:  [AUDIT],2014-04-30 
17:18:09.694865+09,auditdb,ianb,ianb,READ,SELECT,TABLE,public.x,SELECT * FROM 
func_x();
LOG:  [AUDIT],2014-04-30 
17:18:33.703007+09,auditdb,ianb,ianb,WRITE,DELETE,VIEW,public.v_x,DELETE FROM 
v_x;
LOG:  [AUDIT],2014-04-30 
17:18:33.703051+09,auditdb,ianb,ianb,WRITE,DELETE,TABLE,public.x,DELETE FROM 
v_x;
LOG:  [AUDIT],2014-04-30 17:19:54.811244+09,auditdb,ianb,ianb,ADMIN,SET,,,set 
role ams;
LOG:  [AUDIT],2014-04-30 
17:19:57.039979+09,auditdb,ianb,ams,WRITE,INSERT,VIEW,public.v_x,INSERT INTO 
v_x VALUES(1,2);
LOG:  [AUDIT],2014-04-30 
17:19:57.040014+09,auditdb,ianb,ams,WRITE,INSERT,TABLE,public.x,INSERT INTO v_x 
VALUES(1,2);
LOG:  [AUDIT],2014-04-30 17:20:02.059415+09,auditdb,ianb,ams,ADMIN,SET,,,SET 
role ianb;
LOG:  [AUDIT],2014-04-30 17:20:09.840261+09,auditdb,ianb,ianb,DEFINITION,ALTER 
TABLE,TABLE,public.x,ALTER TABLE public.x ADD COLUMN c pg_catalog.int4
LOG:  [AUDIT],2014-04-30 17:23:58.920342+09,auditdb,ianb,ianb,ADMIN,ALTER 
ROLE,,,ALTER USER ams SET search_path = 'foo';


How is this different to log_statement='all'?

1. pgaudit logs fully-qualified relation names, so you don't have to
   wonder if SELECT * FROM x referred to public.x or other.x.

2. pgaudit creates a log entry for each affected object, so you don't
   have to wonder which tables SELECT * FROM someview accessed, and
   it's easy to identify all accesses to a particular table.

3. pgaudit allows finer-grained control over what is logged. Commands
   are classified into read, write, etc. and logging for these classes
   can be individually enabled and disabled (either via pgaudit.log in
   postgresql.conf, or as a per-database or per-user setting).


Here's a quick overview of how it works:

0. In 9.3 and 9.4, we build without USE_DEPARSE_FUNCTIONS. In the
   deparse branch (which I'll call 9.5 for convenience), we build
   with USE_DEPARSE_FUNCTIONS (set in the Makefile).

1. In 9.5, we create a ddl_command_end event trigger and use
   pg_event_trigger_{get_creation_commands,expand_command} to log
   a deparsed representation of any DDL commands supported by event
   triggers.

2. We always use an sql_drop event trigger to log DROP commands, but
   once 9.5 includes pg_event_trigger_get_deletion_commands() or some
   equivalent, we'll use that functionality as well.

3. We 

Re: [HACKERS] Allowing empty target list in SELECT (1b4f7f93b4693858cb983af3cd557f6097dab67b)

2014-05-02 Thread David Rowley
On Fri, May 2, 2014 at 3:57 PM, Amit Langote amitlangot...@gmail.comwrote:


 Hi,

 Is the following behavior perceived fix-worthy?

 -- note the '1's in the outputs

 postgres=# CREATE TABLE test AS SELECT;
 SELECT 1

 postgres=# insert into test select;
 INSERT 0 1

 My guess why this happens is because changes made in the commit in
 $SUBJECT only pertain to fixing syntax errors and nothing else.


Are you proposing that this does not insert a 0 column row?

I don't find the current behaviour wrong. If it didn't insert the row then
the query in the following would return 0 rows.

begin work;
create table nocols ();
insert into nocols select;
insert into nocols select;

create table test (value int);
insert into test values(1);

select * from nocols cross join test; -- gives 2 rows with the value 1
rollback;

Why should the above results be any different than if I created the nocols
table with a column then dropped it?
Certainly removing all of the records on the drop of the last column would
be wrong.

Regards

David Rowley

--
 Amit



Re: [HACKERS] Allowing empty target list in SELECT (1b4f7f93b4693858cb983af3cd557f6097dab67b)

2014-05-02 Thread Albe Laurenz
Amit Langote wrote:
 Is the following behavior perceived fix-worthy?


 -- note the '1's in the outputs

 postgres=# CREATE TABLE test AS SELECT;
 SELECT 1

 postgres=# insert into test select;
 INSERT 0 1

 Or maybe, it just means 1 'null' row/record and not no row at all?

Right, I'd say you end up with a table with two 0-tuples.

Maybe odd, but it shouldn't be a problem.

Yours,
Laurenz Albe



Re: [HACKERS] Allowing empty target list in SELECT (1b4f7f93b4693858cb983af3cd557f6097dab67b)

2014-05-02 Thread Amit Langote
On Fri, May 2, 2014 at 4:14 PM, David Rowley dgrowle...@gmail.com wrote:


 Why should the above results be any different than if I created the nocols
 table with a column then dropped it?
 Certainly removing all of the records on the drop of the last column would
 be wrong.


I see, dropping the only column in a table does exhibit a similar behavior.

--
Amit




Re: [HACKERS] Proposal for Merge Join for Non '=' Operators

2014-05-02 Thread Dilip kumar
On 29 April 2014 13:28, Hadi Moshayedi wrote:

This looks like a great improvement. Repeating Nicolas's question, do you have 
a real-world example of such joins?

I can think of some scenarios where a user needs to self-join and compare 
with other tuples. For example, list all the employees who earn less than 
some other employee, together with the count of employees earning more than 
them, as in the query below:

“select ta.emp_name, count(*) from t1 as ta, t1 as tb where 
ta.emp_salary < tb.emp_salary group by ta.emp_name;”

In my experience, I see more queries like self-join table A and table B where 
A.time BETWEEN B.time - '1 week' and B.time, similar to what Nicolas and Tom 
mentioned. As an example, count users who placed an order in the week 
following their registration.

Currently I have implemented a very basic POC which can work only for a < b 
queries; I think the actual patch can be enhanced for these types of queries also.

Can you send a patch so we can also try it?

Patch is attached in the mail, but for testing we need to take care of some 
points:

1.  The patch is implemented only for a < b type queries (only for a table with 
one integer field; this can be modified in create_nestloop_plan if needed, I 
have written it for a basic test with integers).



2.  What changes are done:

There are no changes to the planner cost calculation, so the hack is applied 
while generating the plan.

If the planner has selected an NLJ plan, and enable_material is set to off 
(this is the hint to select the special Merge Join), then a Sort node is 
added above the left and right trees of the NLJ.



3.  So if you want to test with a normal NLJ, no changes are needed; and 
to test using this merge join, just run ‘set enable_material=off;’


postgres=# explain select count(*) from t1 as ta, t1 as tb where ta.a < tb.a;
QUERY PLAN
---
Aggregate  (cost=396625.51..396625.52 rows=1 width=0)
   ->  Nested Loop  (cost=0.00..375758.83 rows=8346672 width=0)
         Join Filter: (ta.a < tb.a)
         ->  Seq Scan on t1 ta  (cost=0.00..73.04 rows=5004 width=4)
         ->  Materialize  (cost=0.00..98.06 rows=5004 width=4)
               ->  Seq Scan on t1 tb  (cost=0.00..73.04 rows=5004 width=4)
Planning time: 0.291 ms
(7 rows)

Now For enabling this merge Join
postgres=# set enable_material=off;
SET
postgres=# explain select count(*) from t1 as ta, t1 as tb where ta.a < tb.a;
QUERY PLAN
---
Aggregate  (cost=699432.08..699432.09 rows=1 width=0)
   ->  Nested Loop  (cost=0.00..678565.40 rows=8346672 width=0)
         Join Filter: (ta.a < tb.a)
         ->  Sort  (cost=380.51..393.02 rows=5004 width=4)
               Sort Key: ta.a
               ->  Seq Scan on t1 ta  (cost=0.00..73.04 rows=5004 width=4)
         ->  Sort  (cost=380.51..393.02 rows=5004 width=4)
               Sort Key: tb.a
               ->  Seq Scan on t1 tb  (cost=0.00..73.04 rows=5004 width=4)
Planning time: 0.286 ms
(10 rows)


Thanks & Regards,
 Dilip Kumar




merge_join_nonequal.patch
Description: merge_join_nonequal.patch



Re: [HACKERS] includedir_internal headers are not self-contained

2014-05-02 Thread Christoph Berg
Re: Tom Lane 2014-05-02 9995.1398994...@sss.pgh.pa.us
  The patch is certainly too invasive to consider back-patching into
  9.3, though.

Understood.

  I feel unsure about this.  I agree the patch is quite invasive.  Leaving
  9.3 in a broken state seems problematic.  In particular I'm not sure
  what would Debian do about the whole issue; would they have to carry the
  patch for their 9.3 packages?
 
 My recommendation to Christoph upthread was that they just look the
 other way for the time being, ie, ignore the fact that relpath.h is
 unusable by freestanding apps in 9.3.  Backpatching what I did for
 9.4 would be an ABI break, so that seems to me to be out of the
 question in 9.3.  And it's not clear that anybody outside core+contrib
 really needs relpath.h yet, anyway.  (Of course, you could argue that
 if there are no external users then the ABI break isn't a problem;
 but if there are, then it is.)

We are certainly not going to replace the old mess by a custom new
one ;)

The original problem that postgres_fe.h wasn't usable is already fixed
for 9.3, so afaict the only remaining problem there seems the
installation {rule, location} of common/, which is either taken care
of by the patch I've sent, or a trivial addition to the packaging
files on our side.

As long as there are no complaints, we'll simply ignore the fact that
the other headers in 9.3's common/ aren't self-contained; the
workaround of simply installing the server headers seems easy enough.

We should probably be able to move to 9.4 in time for the freeze of
Debian Jessie in November, so backports won't matter that much. (As
long as the 9.3-and-older server-headers are self-contained and/or
compatible with what 9.4 provides...)

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/




Re: [HACKERS] Obsolete coding in fork_process.c

2014-05-02 Thread Noah Misch
On Thu, May 01, 2014 at 11:07:51PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  On Thu, May 01, 2014 at 08:44:46PM -0400, Tom Lane wrote:
  You're only considering one aspect of the problem.  Yeah, you might not
  get duplicated output unless system() prints something before exec(),
  but we would also like to have causality: that is, whatever we sent to
  stdout before calling system() should appear there before anything the
  child process sends to stdout.
 
  Good point.  I suppose a couple of fflush() calls have negligible cost
  next to a system() or popen().  Introduce pg_popen()/pg_system(), and
  adopt a rule that they are [almost] our only callers of raw
  popen()/system()?
 
 Meh.  I'm not usually in favor of adopting nonstandard notation, and
 this doesn't seem like a place to start.  In particular, if you don't
 want to use fflush(NULL) in these proposed wrappers, then call sites
 are still going to have an issue with needing to do manual fflushes;
 pg_regress.c's spawn_process is an example:
 
 /*
  * Must flush I/O buffers before fork.  Ideally we'd use fflush(NULL) here
  * ... does anyone still care about systems where that doesn't work?
  */
 fflush(stdout);
 fflush(stderr);
 if (logfile)
 fflush(logfile);
 
 pid = fork();
 
 I think that removing the need for fflush(stdout) and fflush(stderr)
 in this context would mostly result in people forgetting to fflush
 other output files.  I'd rather have the two lines of boilerplate
 (and a comment about why we're refusing to depend on fflush(NULL))
 than take that risk.

Works for me.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] need xmllint on borka

2014-05-02 Thread Alvaro Herrera
Tom Lane wrote:

 (The subtext here is that borka is absolutely not an acceptable place
 to encounter documentation build failures.  By the time we're at that
 stage of the release cycle, I don't really care what xmllint might
 have to say; there isn't going to be time to make it happy.)

Borka is what runs the guaibasaurus animal, so failures would show up in
buildfarm ...

If the xmllint check could be made optional, I guess we could have the
failures show up in buildfarm but avoid having them cause a problem for
make dist when releases are created.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] need xmllint on borka

2014-05-02 Thread Alvaro Herrera
Peter Eisentraut wrote:
 I have been working on making the DocBook XML output valid.  The first
 part was bb4eefe7bf518e42c73797ea37b033a5d8a8e70a, I now have the rest
 ready, but I'll spare you the mostly mechanical 200kB patch for now.  In
 addition, I'd like to add the attached patch with an xmllint call to
 make sure things stay valid.
 
 But we don't have xmllint installed on borka, where we build the
 releases.  Could someone please install it?

xmllint installed on borka.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Sending out a request for more buildfarm animals?

2014-05-02 Thread Andres Freund
Hi,

There's pretty little coverage of non mainstream platforms/compilers in
the buildfarm atm. Maybe we should send an email on -announce asking for
new ones?
OS-wise, there's no coverage for:
* AIX (at all)
* HP-UX (for master at least)
(* Tru64)
(* UnixWare)

Architecture wise there's no coverage for:
* some ARM architecture variants
* mips
* s390/x
* sparc 32bit
(* s390)
(* alpha)
(* mipsel)
(* M68K)

A couple of those aren't that important (my opinion indicated by ()),
but the other ones really should be covered or desupported.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] elog a stack trace

2014-05-02 Thread Jeff Janes
Does anyone have any hints on how to get a stack trace programmatically,
rather than trying to run ps to get the pid and then attach gdb to an
ephemeral process and hoping the situation has not changed while you
are doing that?  I'd like to set

log_error_verbosity = stack

or

elog_stack(,...)

But those don't exist.

Cheers,

Jeff


Re: [HACKERS] elog a stack trace

2014-05-02 Thread Andres Freund
Hi,

On 2014-05-02 08:16:39 -0700, Jeff Janes wrote:
 Does anyone have any hints on how to get a stack trace programmatically,
 rather than trying to run ps to get the pid and then attach gdb to an
 ephemeral process and hoping the situation has not changed while you
 are doing that?  I'd like to set
 
 log_error_verbosity = stack
 
 or
 
 elog_stack(,...)

There's backtrace() on several platforms. The stack traces are less
detailed than what gdb gives you, though.

Greetings,

Andres Freund

-- 
 Andres Freund                     http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-02 Thread Josh Berkus
On 05/01/2014 11:19 PM, Ian Barwick wrote:
 
 Here is an initial version of an auditing extension for Postgres to
 generate log output suitable for compiling a comprehensive audit trail
 of database operations.

Cool!  Looking forward to seeing it around the 9.5 cycle.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-02 Thread Stephen Frost
Ian,

* Ian Barwick (i...@2ndquadrant.com) wrote:
 Here is an initial version of an auditing extension for Postgres to
 generate log output suitable for compiling a comprehensive audit trail
 of database operations.

Neat stuff.

 Why auditing?

Yeah, we really need to improve here.  I've been hoping to make progress
on this and it looks like I'll finally have some time to.

 pgaudit uses Event Triggers to log unambiguous representation of DDL,
 as well as a combination of executor and utility hooks for other
 commands (DML, including SELECT, as well as other utility commands):

While certainly a good approach to minimize the changes needed to the
backend, I'd really like to see us be able to, say, log to a table and
have more fine-grained control over what is logged, without needing an
extension.

 1. pgaudit logs fully-qualified relation names, so you don't have to
wonder if SELECT * FROM x referred to public.x or other.x.

Yeah, that's definitely an issue for any kind of real auditing.

 2. pgaudit creates a log entry for each affected object, so you don't
have to wonder which tables SELECT * FROM someview accessed, and
it's easy to identify all accesses to a particular table.

Interesting- I'm a bit on the fence about this one.  Perhaps you can
elaborate on the use-case for this?

 3. pgaudit allows finer-grained control over what is logged. Commands
are classified into read, write, etc. and logging for these classes
can be individually enabled and disabled (either via pgaudit.log in
postgresql.conf, or as a per-database or per-user setting).

This is something I've been mulling over for a couple of years (you can
see notes from the discussion at the 2011 hacker meeting on the wiki
about how we might change our logging system to allow for better
filtering).

 Planned future improvements include:
 
 1. Additional logging facilities, including to a separate audit
log file and to syslog, and potentially logging to a table
(possibly via a bgworker process). Currently output is simply
emitted to the server log via ereport().

Using the existing logging collector will almost certainly be a
contention point- we've seen that before.  I've had thoughts about
an option to log to individual files from each backend (perhaps based
on that backend's position in the proc table) or directly from each
backend to a remote service (eg: rabbitMQ/AMQP or something).

Regarding background worker processes, a thought that's been kicked
around a bit is to actually change our existing logging collector to
be a background worker (or perhaps be able to have multiple?) which
is fed from a DSM queue and then logs to a file (or maybe files), or
a table or something else.

 2. To implement per-object auditing configuration, it would be nice to use
extensible reloptions (or an equivalent mechanism)

Yeah, that's another interesting challenge.  This kind of auditing is
often about specific information (and therefore specific objects) and
it'd be ideal to have that set up and managed alongside the table
definition.  Having the auditing done in core instead of through an
extension would make this easier to address though.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-02 Thread Josh Berkus
On 05/02/2014 11:04 AM, Stephen Frost wrote:
 This is something I've been mulling over for a couple of years (you can
 see notes from the discussion at the 2011 hacker meeting on the wiki
 about how we might change our logging system to allow for better
 filtering).

Logging hooks.  We really need some contrib/ modules which take
advantage of these.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Tom Lane
I've been thinking about how we might implement the multiple column
assignment UPDATE syntax that was introduced in SQL:2003.  This feature
allows you to do

UPDATE table SET ..., (column, column, ...) = row-valued expression, ...

where the system arranges to evaluate the row-valued expression just
once per row and then assign its fields into the specified target columns.

Back in commit 6e8596a146c9b16f2c053ea8c6e361a114c0b65c we introduced
some limited support for this syntax, but it only handles a row-valued
expression that is a ROW() constructor, and it just does a simple
syntactic transformation of pulling apart the ROW() constructor and
building an independent assignment to each target column.  The actually
interesting uses for this feature don't work with that implementation
approach.  The most common case I've seen asked for is where the
expression is a sub-SELECT returning multiple columns (but at most one
row).

As far as the parser is concerned, the main hurdle to supporting this
feature is that the representation of an UPDATE's targetlist assumes that
each list element is an independent TargetEntry representing a single
assignment.  Now, there is a heck of a lot of code that knows what
targetlists look like, so I'm not eager to try to change that basic
assumption.  What seems like probably a better idea is to represent
SET (target1, target2, target3) = foo
as though it were
SET target1 = fooref.col1, target2 = fooref.col2, target3 = fooref.col3
where fooref is some Param-like reference to a separate list of
expressions that have composite outputs.  It would be understood that this
separate targetlist would be evaluated just once before evaluating the
main tlist.  This approach would allow all the existing targetlist
manipulation code to stay about the same.  It would be a bit of a
challenge for ruleutils.c to reconstruct the original syntax when printing
an UPDATE in a rule, but I think that's just a small matter of
programming.  (Possibly it would help if the elements of the separate
composite-values targetlist contained markers as to which main-tlist
elements they were for.)

Now, we could probably implement it straightforwardly just based on that
idea, though it's not quite clear where to shoehorn evaluation of the
separate targetlist into the constructed plan.  One way would be to
insert an additional evaluation level by adding a Result node on top
of the normal plan, and then have the lower level compute the composite
values as resjunk tlist elements, while the upper level does FieldSelects
from the composite values to implement the fooref.colN references.

However, I'm mainly interested in the sub-SELECT case; indeed, anything
else you might want to do could be transformed into a sub-SELECT, so
I wouldn't feel bad if we just restricted the new feature to that.
And this doesn't seem like quite the right way to do it for sub-SELECTs.

In the case of sub-SELECTs, we have almost the right execution mechanism
already, in that initPlans are capable of setting multiple PARAM_EXEC
runtime Params, one for each output column of the sub-SELECT.  So what
I called fooref.col1 etc above could just be PARAM_EXEC Params referring
to the subplan outputs --- except that initPlans are only for uncorrelated
subqueries (those without any outer references to Vars of the parent query
level).  And the interesting cases for UPDATE generally involve correlated
subqueries.

What I'm thinking about this is that we ought to make an effort to unify
the currently separate implementation paths for correlated and
uncorrelated subqueries.  Instead of SubPlans in the expression tree for
correlated subqueries, I think they should all be treated much like
initPlans are now, ie, there are PARAM_EXEC Params referencing outputs
from a list of subqueries that are attached to the expression tree's
parent plan node, and we lazily evaluate the subqueries upon first use of
one of their output parameters.  What would be different from the current
handling of initPlans is that each time we advance to a new input row,
we'd need to reset the evaluation state of the subqueries that are
correlated.  The reason for changing it like that is so that we can have
multiple separate Params referencing different output columns of a single
correlated subquery, and be sure that we evaluate the correlated subquery
only once; the current SubPlan mechanism can't support separate references
to the same subplan.  Now, this would add a small amount of new bookkeeping
overhead to use of correlated subqueries, but I find it hard to believe
that that'd be noticeable compared to the startup/shutdown cost of the
subquery.

So, if we were to revise the handling of correlated subqueries like that,
then for the case of a row-valued expression that is a sub-SELECT we
wouldn't need any explicit runtime evaluation of separate targetlist
entries.  Use of Params referencing the subplan's outputs would be enough
to cause evaluation to happen at the right 

Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-02 Thread Stephen Frost
* Josh Berkus (j...@agliodbs.com) wrote:
 Logging hooks.  We really need some contrib/ modules which take
 advantage of these.

I'm aware and I really am not convinced that pushing all of this to
contrib modules using the hooks is the right approach- for one thing, it
certainly doesn't seem to me that we've actually gotten a lot of
traction from people to actually make use of them and keep them updated.
We've had many of those hooks for quite a while.

What 2Q has done here is great, but they also point out problems with
building this as a contrib module using the hooks.  As we add more
capabilities and improve the overall PG system (new objects, etc), I'm
rather unconvinced that having to go, independently, update the contrib
modules to understand each new object is going to be a terribly workable
long-term solution.

Additionally, using triggers (either on the tables or the event
triggers), while good for many use-cases, doesn't completely answer the
auditing requirements (SELECT being the great example, but there are
others) and having to combine event triggers with various hooks just
doesn't strike me as a great design.  (I don't intend to knock what 2Q
has done here at all- they're using a minimal-backend-hacking approach,
and under that constraint they've done exactly what makes sense).

Thanks,

Stephen


signature.asc
Description: Digital signature


[HACKERS] Faster array_length()

2014-05-02 Thread Hadi Moshayedi
Hello,

The attached patch improves the performance of array_length() by detoasting
only the overhead part of the datum.

Here is a test case:

postgres=# create table array_length_test as select array_agg(a) a from
generate_series(1, 1) a, generate_series(1, 1) b group by b;

Without the patch:

postgres=#  select sum(array_length(a, 1)) from array_length_test;
sum
---
 1
(1 row)

Time: 199.002 ms

With the patch:

postgres=#  select sum(array_length(a, 1)) from array_length_test;
sum
---
 1
(1 row)

Time: 34.599 ms

The motivation for patch is that some of our customers use arrays to store
a sequence of tens of thousands of events in each row. They often need to
get the last 10 events for each row, for which we do A[array_length(A, 1) -
9: 100] (assuming 1M is an upper-bound. we could use array_length()
instead of this constant too, but that is unnecessary if we know the
upper-bound and only slows down the query). Without this optimization,
array gets detoasted twice. With this patch, array_length() becomes much
faster, and the whole query saves few seconds.

Of course this technique is applicable to some other functions too, but
they have never become a bottleneck for me, so I decided to keep the
changes only to this function.

Another alternative I could think of was introducing python style slicing,
in which negative indexes start from end of array, so -10 means 10th
element from end. I thought this would be a bigger change and is probably
unnecessary, so I decided to improve array_length() instead.
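For readers unfamiliar with it, the Python-style slicing alluded to above works like this (a plain Python sketch, not PostgreSQL syntax):

```python
# Python-style slicing: negative indexes count from the end of the sequence,
# so [-10:] takes the last 10 elements without knowing the length up front.
events = list(range(1, 101))  # a sequence of 100 events

last_ten = events[-10:]

# Equivalent to the explicit length-based form that the SQL query must use:
assert last_ten == events[len(events) - 10:]
```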

Feedback is welcome.

Thanks,
   -- Hadi
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index 91df184..5e1d9c2 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -1719,11 +1719,15 @@ array_upper(PG_FUNCTION_ARGS)
 Datum
 array_length(PG_FUNCTION_ARGS)
 {
-	ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
+	Datum		arrdatum = PG_GETARG_DATUM(0);
 	int			reqdim = PG_GETARG_INT32(1);
+	ArrayType  *v;
 	int		   *dimv;
 	int			result;
 
+	v = (ArrayType *) PG_DETOAST_DATUM_SLICE(arrdatum, 0,
+			 ARR_OVERHEAD_NONULLS(MAXDIM));
+
 	/* Sanity check: does it look like an array at all? */
 	if (ARR_NDIM(v) <= 0 || ARR_NDIM(v) > MAXDIM)
 		PG_RETURN_NULL();

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [COMMITTERS] pgsql: Fix quiet inline configure test for newer clang compilers.

2014-05-02 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-05-01 20:16:48 +, Tom Lane wrote:
 Fix quiet inline configure test for newer clang compilers.

 Since it doesn't seem to have caused any problems I think this should be
 backpatched.

Hearing no objections, done.

regards, tom lane




Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Merlin Moncure
On Fri, May 2, 2014 at 1:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I've been thinking about how we might implement the multiple column
 assignment UPDATE syntax that was introduced in SQL:2003.  This feature
 allows you to do

 UPDATE table SET ..., (column, column, ...) = row-valued expression, ...

 where the system arranges to evaluate the row-valued expression just
 once per row and then assign its fields into the specified target columns.

 Back in commit 6e8596a146c9b16f2c053ea8c6e361a114c0b65c we introduced
 some limited support for this syntax, but it only handles a row-valued
 expression that is a ROW() constructor, and it just does a simple
 syntactic transformation of pulling apart the ROW() constructor and
 building an independent assignment to each target column.  The actually
 interesting uses for this feature don't work with that implementation
 approach.  The most common case I've seen asked for is where the
 expression is a sub-SELECT returning multiple columns (but at most one
 row).
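 As a concrete illustration, the sub-SELECT case would look something like this (hypothetical syntax under discussion, not implemented at the time of this thread; table and column names are invented):

```sql
-- Assign several columns at once from a single correlated sub-SELECT,
-- evaluated once per updated row:
UPDATE accounts a
   SET (sum_debits, sum_credits) =
       (SELECT sum(t.debit), sum(t.credit)
          FROM transactions t
         WHERE t.account_id = a.id);
```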

 As far as the parser is concerned, the main hurdle to supporting this
 feature is that the representation of an UPDATE's targetlist assumes that
 each list element is an independent TargetEntry representing a single
 assignment.  Now, there is a heck of a lot of code that knows what
 targetlists look like, so I'm not eager to try to change that basic
 assumption.  What seems like probably a better idea is to represent
 SET (target1, target2, target3) = foo
 as though it were
 SET target1 = fooref.col1, target2 = fooref.col2, target3 = fooref.col3
 where fooref is some Param-like reference to a separate list of
 expressions that have composite outputs.  It would be understood that this
 separate targetlist would be evaluated just once before evaluating the
 main tlist.  This approach would allow all the existing targetlist
 manipulation code to stay about the same.  It would be a bit of a
 challenge for ruleutils.c to reconstruct the original syntax when printing
 an UPDATE in a rule, but I think that's just a small matter of
 programming.  (Possibly it would help if the elements of the separate
 composite-values targetlist contained markers as to which main-tlist
 elements they were for.)

 Now, we could probably implement it straightforwardly just based on that
 idea, though it's not quite clear where to shoehorn evaluation of the
 separate targetlist into the constructed plan.  One way would be to
 insert an additional evaluation level by adding a Result node on top
 of the normal plan, and then have the lower level compute the composite
 values as resjunk tlist elements, while the upper level does FieldSelects
 from the composite values to implement the fooref.colN references.

 However, I'm mainly interested in the sub-SELECT case; indeed, anything
 else you might want to do could be transformed into a sub-SELECT, so
 I wouldn't feel bad if we just restricted the new feature to that.
 And this doesn't seem like quite the right way to do it for sub-SELECTs.

 In the case of sub-SELECTs, we have almost the right execution mechanism
 already, in that initPlans are capable of setting multiple PARAM_EXEC
 runtime Params, one for each output column of the sub-SELECT.  So what
 I called fooref.col1 etc above could just be PARAM_EXEC Params referring
 to the subplan outputs --- except that initPlans are only for uncorrelated
 subqueries (those without any outer references to Vars of the parent query
 level).  And the interesting cases for UPDATE generally involve correlated
 subqueries.

 What I'm thinking about this is that we ought to make an effort to unify
 the currently separate implementation paths for correlated and
 uncorrelated subqueries.  Instead of SubPlans in the expression tree for
 correlated subqueries, I think they should all be treated much like
 initPlans are now, ie, there are PARAM_EXEC Params referencing outputs
 from a list of subqueries that are attached to the expression tree's
 parent plan node, and we lazily evaluate the subqueries upon first use of
 one of their output parameters.  What would be different from the current
 handling of initPlans is that each time we advance to a new input row,
 we'd need to reset the evaluation state of the subqueries that are
 correlated.  The reason for changing it like that is so that we can have
 multiple separate Params referencing different output columns of a single
 correlated subquery, and be sure that we evaluate the correlated subquery
 only once; the current SubPlan mechanism can't support separate references
 to the same subplan.  Now, this would add a small amount of new bookkeeping
 overhead to use of correlated subqueries, but I find it hard to believe
 that that'd be noticeable compared to the startup/shutdown cost of the
 subquery.

 So, if we were to revise the handling of correlated subqueries like that,
 then for the case of a row-valued expression that is a sub-SELECT we
 wouldn't need any explicit runtime evaluation of 

Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Fri, May 2, 2014 at 1:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I've been thinking about how we might implement the multiple column
 assignment UPDATE syntax that was introduced in SQL:2003.  This feature
 allows you to do
 UPDATE table SET ..., (column, column, ...) = row-valued expression, ...

 Couple quick questions:
 1) how does this interplay with RETURNING?  I guess it probably
 doesn't change, but I imagine there's be no way to reference the
 composite result in the RETURNING statement?

Not as such; obviously you could reference the assigned-to columns
in RETURNING and thereby reconstruct the composite value.

 2) I often wish that you could reference the table (or its alias)
 directly as the field list.

 UPDATE foo f set f = (...)::foo;
 or even
 UPDATE foo SET foo = foo;

Hm.  You could get there with this syntax as long as you didn't mind
writing out the field list explicitly.  Arguments why you should
want to do that are the same as for avoiding SELECT *, with maybe
a bit more urgency since at least SELECT * won't trash your data
if you get it wrong.  However, assuming that that argument doesn't
impress you ...

My draft copy of SQL99 mentions a syntax

UPDATE table SET ROW = row-valued expression [ WHERE ... ]

which does not appear in later editions of the spec, and probably wasn't
in SQL99 final either (since SQL:2003 does not mention it as a removed
feature).  I'm not sure we'd want to implement that; it would require
making ROW into a fully-reserved word, which it is not today, and that
seems rather a high price for implementing a not-per-spec feature.
But I don't think your suggestions of the table name or alias work;
they could conflict with an actual column name.

regards, tom lane




Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Merlin Moncure
On Fri, May 2, 2014 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 2) I often wish that you could reference the table (or its alias)
 directly as the field list.

 UPDATE foo f set f = (...)::foo;
 or even
 UPDATE foo SET foo = foo;

 Hm.  You could get there with this syntax as long as you didn't mind
 writing out the field list explicitly.  Arguments why you should
 want to do that are the same as for avoiding SELECT *, with maybe
 a bit more urgency since at least SELECT * won't trash your data
 if you get it wrong.  However, assuming that that argument doesn't
 impress you ...

 My draft copy of SQL99 mentions a syntax

 UPDATE table SET ROW = row-valued expression [ WHERE ... ]

 which does not appear in later editions of the spec, and probably wasn't
 in SQL99 final either (since SQL:2003 does not mention it as a removed
 feature).  I'm not sure we'd want to implement that; it would require
 making ROW into a fully-reserved word, which it is not today, and that
 seems rather a high price for implementing a not-per-spec feature.
 But I don't think your suggestions of the table name or alias work;
 they could conflict with an actual column name.

Presumably it'd follow similar rules to SELECT -- resolve the column
name in the face of ambiguity.

merlin




Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Tom Lane
Merlin Moncure mmonc...@gmail.com writes:
 On Fri, May 2, 2014 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But I don't think your suggestions of the table name or alias work;
 they could conflict with an actual column name.

 Presumably it'd follow similar rules to SELECT -- resolve the column
 name in the face of ambiguity.

Meh.  Then you could have a query that works fine until you add a column
to the table, and it stops working.  If nobody ever used column names
identical to table names it'd be all right, but unfortunately people
seem to do that a lot...

regards, tom lane




Re: [HACKERS] Supporting multiple column assignment in UPDATE (9.5 project)

2014-05-02 Thread Merlin Moncure
On Fri, May 2, 2014 at 3:03 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Merlin Moncure mmonc...@gmail.com writes:
 On Fri, May 2, 2014 at 2:47 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 But I don't think your suggestions of the table name or alias work;
 they could conflict with an actual column name.

 Presumably it'd follow similar rules to SELECT -- resolve the column
 name in the face of ambiguity.

 Meh.  Then you could have a query that works fine until you add a column
 to the table, and it stops working.  If nobody ever used column names
 identical to table names it'd be all right, but unfortunately people
 seem to do that a lot...

That's already the case with select statements, and a user who is
concerned about that always has the option of aliasing the table, as
nearly 100% of professional developers do:

SELECT f FROM foo f;
etc.

Now, I need this feature a lot less than I used to (although I do like
the symmetry with SELECT); hstore and jsonb have matured to the point
that they can handle most trigger function operations that you'd want
to abstract over multiple tables without expensive calls to
information_schema.  The main advantages for a native approach would
be type safety (although even that situation is improving at long
last), performance, and code complexity.

merlin




Re: [HACKERS] Sending out a request for more buildfarm animals?

2014-05-02 Thread Noah Misch
On Fri, May 02, 2014 at 05:04:01PM +0200, Andres Freund wrote:
 There's pretty little coverage of non mainstream platforms/compilers in
 the buildfarm atm. Maybe we should send an email on -announce asking for
 new ones?
 OS-wise, there's no coverage for:
 * AIX (at all)
 * HP-UX (for master at least)
 (* Tru64)
 (* UnixWare)
 
 Architecture wise there's no coverage for:
 * some ARM architecture variants
 * mips
 * s390/x
 * sparc 32bit
 (* s390)
 (* alpha)
 (* mipsel)
 (* M68K)
 
 A couple of those aren't that important (my opinion indicated by ()),
 but the other ones really should be covered or desupported.

More coverage of non-gcc compilers would be an asset to the buildfarm.

+1 for sending a call for help to -announce.  I agree with your importance
estimates, particularly on the OS side.  -1 for making code-level changes to
desupport a platform based on the lack of a buildfarm member, though I don't
mind documentation/advocacy changes on that basis.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




[HACKERS] regexp_replace( , , , NULL ) returns null?

2014-05-02 Thread Jim Nasby
Is there any particular reason for this:

decibel@decina.local=# SELECT regexp_replace( 'a', 'a', 'b', null ) IS NULL;
 ?column? 
--
 t
(1 row)

ISTM it’d be a lot better if it treated NULL flags the same as ‘’...
--
Jim Nasby, Lead Data Architect   (512) 569-9461





Re: [HACKERS] regexp_replace( , , , NULL ) returns null?

2014-05-02 Thread Tom Lane
Jim Nasby jna...@enova.com writes:
 Is there any particular reason for this:
 decibel@decina.local=# SELECT regexp_replace( 'a', 'a', 'b', null ) IS NULL;
  ?column? 
 --
  t
 (1 row)

Yeah: regexp_replace is strict.

 ISTM it’d be a lot better if it treated NULL flags the same as ‘’...

In Oracle's universe that probably makes sense, but to me it's not
sensible.  Why should unknown flags produce a non-unknown result?

I find it hard to envision many use-cases where you wouldn't actually
have the flags as a constant, anyway; they're too fundamental to the
behavior of the function.
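
For cases where the flags genuinely might be NULL, the usual workaround (a sketch, not something proposed in this thread) is to default them explicitly with coalesce():

```sql
-- regexp_replace() is strict: any NULL argument yields NULL.
-- To treat a NULL flags value the same as '', default it explicitly:
SELECT regexp_replace('a', 'a', 'b', coalesce(flags, ''))
  FROM (VALUES (NULL::text)) AS v(flags);   -- returns 'b' rather than NULL
```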

regards, tom lane




[HACKERS] New and interesting replication issues with 9.2.8 sync rep

2014-05-02 Thread Josh Berkus
Just got a report of a replication issue with 9.2.8 from a community member:

Here's the sequence:

1) A --> B (sync rep)

2) Shut down B

3) Shut down A

4) Start up B as a master

5) Start up A as sync replica of B

6) A successfully joins B as a sync replica, even though its transaction
log is 1016 bytes *ahead* of B.

7) Transactions written to B all hang

8) Xlog on A is now corrupt, although the database itself is OK

Now, the above sequence happened because of the user misunderstanding
what sync rep really means.  However, A should not have been able to
connect with B in replication mode, especially in sync rep mode; that
should have failed.  Any thoughts on why it didn't?

I'm trying to produce a test case ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Faster array_length()

2014-05-02 Thread Tom Lane
Hadi Moshayedi h...@moshayedi.net writes:
 The attached patch improves the performance of array_length() by detoasting
 only the overhead part of the datum.

It looks to me like this would actually make things worse for cases where
the input array wasn't toasted-out-of-line (because it would uselessly
make a copy of the header part, costing a palloc cycle).  I'm not averse
to improving the case you're worried about, but you have to pay attention
to not having bad side-effects on other cases.

Another thought is that this can only win for arrays that are external
without being compressed; when they are compressed, then
heap_tuple_untoast_attr_slice will fetch and decompress the entire array
anyway (and then, just to add insult to injury, make another copy :-().
With that in mind, I was surprised that your test case showed any
improvement at all --- it looks like the arrays aren't getting compressed
for some reason.  There are going to be a lot of other cases where this
patch doesn't help unless the user turns off compression, which will hurt
his performance in other ways.

Now, the slice detoast support was only designed to work with data stored
in external mode (that is, with compression manually disabled via the
appropriate ALTER TABLE option), and that's not unreasonable for its
originally-intended application of being able to fetch any part of an
external text string.  But it strikes me that for what you want here,
namely fetching just a few bytes from the start, it ought to be possible
to do better.  Could we teach the toast code to fetch and decompress just
an initial subset of the data?  (This might be useful even for the
original use-case of slice fetches, as long as the desired slice isn't
too close to the end of the datum.)

Bottom line for me is that you've shown that there can be a win from
improving this code, but we need to do some basic work on the
slice-fetching logic to get full value out of the idea.

 Of course this technique is applicable to some other functions too, but
 they have never become a bottleneck for me, so I decided to keep the
 changes only to this function.

I would expect a committable version of this patch to cover all the
array-dimension-related functions.

regards, tom lane




Re: [HACKERS] Faster array_length()

2014-05-02 Thread Hadi Moshayedi
Thanks for looking into this.


 With that in mind, I was surprised that your test case showed any
 improvement at all --- it looks like the arrays aren't getting compressed
 for some reason.


You are right, it seems that they were not getting compressed, probably
because the arrays were seq 1, which does not seem to get compressed by
pglz. When I changed the test data to an array containing 1 ones, there
was no speed improvement anymore.

I'll look into how to improve the compressed case and other issues you
raised.

Thanks,
   -- Hadi


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-02 Thread Abhijit Menon-Sen
At 2014-05-02 14:04:27 -0400, sfr...@snowman.net wrote:

 I'd really like to see us be able to, say, log to a table and have
 more fine-grained control over what is logged, without needing an
 extension.

There were several factors we considered in our work:

1. We did the minimum possible to produce something that gives us
   demonstrably more than «log_statement=all» in 9.3/9.4/9.5.

2. We wanted to produce something that could be used *now*, i.e. with
   9.3 and soon 9.4, to get wider feedback based on actual usage. I'm
   hoping that by the time we make a submission for 9.5, we'll have a
   clearer picture of what Postgres auditing should look like.

3. We steered clear of implementing different log targets. We know that
   ereport() doesn't cut it, but decided that doing anything else would
   be better after some feedback and wider discussion. Any suggestions
   in this regard are very welcome.

(Stephen, I can see from your mail that you've already inferred at least
some of the above, so it's more a general statement of our approach than
a response to what you said.)

  2. pgaudit creates a log entry for each affected object […]
 
 Interesting- I'm a bit on the fence about this one.  Perhaps you can
 elaborate on the use-case for this?

Who accessed public.x last month?

Answering that question would become much more difficult if one had to
account for every view that might refer to public.x. And did the view
refer to public.x before the schema change on the first Wednesday of
last month?

We don't have a deparsed representation of DML, so select * from x
is logged differently from select * from other.x. Same with potential
complications like how exactly a join is written.

The way pgaudit does it, you can just grep public.x in your audit log
and be sure (modulo bugs, of course) you're seeing everything relevant.

 This kind of auditing is often about specific information (and
 therefore specific objects) and it'd be ideal to have that set
 up and managed alongside the table definition. 

Yes, exactly.

-- Abhijit




Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-05-02 Thread Abhijit Menon-Sen
At 2014-05-02 14:22:23 -0400, sfr...@snowman.net wrote:

 I'm aware and I really am not convinced that pushing all of this to
 contrib modules using the hooks is the right approach- for one thing,
 it certainly doesn't seem to me that we've actually gotten a lot of
 traction from people to actually make use of them and keep them
 updated.

For what it's worth, I greatly appreciate *having* the hooks. Without
them, it would have been much more difficult to prototype pgaudit, and
it would have been impossible to do so in a way that could be used with
9.3/9.4.

As for whether auditing as a feature *should* be an extension, I do not
have a strong opinion yet. If a consensus formed around a better design
in-core, I certainly wouldn't object.

 I'm rather unconvinced that having to go, independently, update the
 contrib modules to understand each new object is going to be a
 terribly workable long-term solution.

(I am not expressing any opinion at this time on this larger question.)

 having to combine event triggers with various hooks just doesn't
 strike me as a great design.

Suggestions are welcome, but I have to say that I'm not a big fan of
reinventing what event triggers give us in the way of deparsing either.

-- Abhijit




[HACKERS] tab completion for setting search_path

2014-05-02 Thread Jeff Janes
I've been working with an app that uses a schema name whose spelling is
hard to type, and the lack of tab completion for SET search_path TO was
bugging me.  So see attached.

I filter out the system schemata, but not public.

For the next commitfest.

Cheers,

Jeff