Re: [HACKERS] review: tab completion for set search_path TO

2014-06-23 Thread Pavel Stehule
Hello

I am sorry, I expected that a review should start with a new thread. I was
wrong, so for future reviews I will use the existing threads.

Regards

pavel
On 23. 6. 2014 at 7:39, Michael Paquier michael.paqu...@gmail.com
wrote:

 On Sun, Jun 22, 2014 at 2:22 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:

 Hello,

 this patch https://commitfest.postgresql.org/action/patch_view?id=1443
 is trivial with zero risk.

 The patch applies without any issues, and it compiles without any
 issues too.

 Just wondering: why create a new thread for a review instead of replying
 directly to the existing one? This makes the review/patch submission
 flow rather complicated to follow.
 --
 Michael



Re: [HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-23 Thread Kyotaro HORIGUCHI
Hello, thank you for the comments. 

 On Sun, Jun 22, 2014 at 8:54 PM, Simon Riggs si...@2ndquadrant.com wrote:
  On 13 June 2014 12:27, Fujii Masao masao.fu...@gmail.com wrote:
 
  I think that pg_resetxlog should reset backup locations by default
  since they are useless (rather harmful) after pg_resetxlog. Thought?
 
  +1
 
  Do we regard that point as a bug that should be backpatched?
 
 Yep, I think so.

... Ouch, I was too short-sighted :( That is the natural thing to do,
now that I hear it. I should have raised this in the previous
discussion.

I assume the primary usage of this patch is, as described before,
resolving a recovery freeze caused by a wrongly placed backup label.
Crash recovery has already been done at that point, so pg_resetxlog's
current behavior also seems to fit the situation, I suppose.

OK, I'm modifying it to reset backup locations by default and removing
the new option '-b'. Since people seem to regard this as a bug, I'll
provide backpatches back to... 8.4?  (The final release of 8.4 is
scheduled for July 2014.)
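
For illustration, a minimal sketch of that default reset in pg_resetxlog's
control-file rewrite path (hedged: the field names are taken from
ControlFileData in recent pg_control versions; releases before 9.0 lack
them entirely):

/*
 * Sketch only, not the actual patch: clear the backup locations
 * unconditionally rather than behind a '-b' switch.
 */
ControlFile.backupStartPoint = 0;
ControlFile.backupEndPoint = 0;        /* 9.2 and later */
ControlFile.backupEndRequired = false; /* 9.2 and later */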

Concerning documentation, I see no need to edit it unless someone wants
more detail now that the visible option has been dropped.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-23 Thread Kyotaro HORIGUCHI
Hello. I don't know your environment, so I can't tell how many
additional changes are needed, but I still think that message should
not appear there.

 @Fabrizio de Royes Mello, Even upon making changes as per your suggestion,
 I could see that initdb is failing for the same reason:
 
 /mswitch/pgsql/bin # ./initdb -D ../data/
...
 creating template1 database in ../data/base/1 ... root execution of the
 PostgreSQL server is not permitted.
 The server must be started under an unprivileged user ID to prevent
..
 Please let me know if I need to make changes in any other place. Appreciate
 your help!

As far as I can see, the complaint comes from the first of the four
points I showed. Are you sure that the 'postgres' binary invoked at
that time was built after your modification? 'which postgres' will show
you the file to check and replace.

| postgresql $ find . -type f -print | xargs grep -nH 'geteuid() == 0'
| ./src/backend/main/main.c:377:  if (geteuid() == 0)
| ./src/bin/pg_ctl/pg_ctl.c:2121: if (geteuid() == 0)
| ./src/bin/initdb/initdb.c:778:  if (geteuid() == 0)  /* 0 is root's uid */
| ./src/bin/pg_resetxlog/pg_resetxlog.c:250:  if (geteuid() == 0)
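
For reference, a hedged sketch of the kind of edit being discussed, at the
initdb.c site (the other three sites look the same); the (0 && ...) trick
quoted below from Horiguchi makes the check unreachable:

/* Sketch only: short-circuit the root check so it can never fire. */
if (0 && geteuid() == 0)        /* was: if (geteuid() == 0) */
{
    fprintf(stderr, _("%s: cannot be run as root\n"), progname);
    exit(1);
}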


 On Fri, Jun 13, 2014 at 9:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  Fabrízio de Royes Mello fabriziome...@gmail.com writes:
   On Thu, Jun 12, 2014 at 10:59 PM, Kyotaro HORIGUCHI 
   horiguchi.kyot...@lab.ntt.co.jp wrote:
    Try replacing these conditions with (0 && geteuid() == 0) and
    you would see it run as root.
 
    Maybe a compile option like '--enable-run-as-root' could be added to
    allow it without the need to change the source code.
 
  If we wanted it to be easy to run PG as root, those tests wouldn't be
  there in the first place.  We don't want that, so there will never be
  anything to override that policy as easily as setting a configure option.
 
  In the case at hand, the OP wants to run on some weird system that
  apparently doesn't have multiple users at all.  It's a pretty safe bet
  that hacking out those geteuid tests is just the tip of the iceberg of
  what he's going to have to change to make it work, because it probably
  deviates from typical-Unix-environment in a lot of other ways too.
  So I can't get too concerned about how easy this one aspect is for him.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-23 Thread Fujii Masao
On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello

 third version with Erik's update

Here are some comments:

The psql documentation needs to be updated. At least a description of the new
option this patch adds needs to be added to the document.

+printf(_("  --help-variables list of available configuration variables (options), then exit\n"));

We should get rid of "of" from the message, or add "show" in front of "list of"?

+printf(_("  ECHO   write all input lines to standard output\n"));

This message seems not to be correct when ECHO=queries is set.

+printf(_("  COMP_KEYWORD_CASE  determines which letter case to use when completing an SQL key word\n"));
+printf(_("  DBNAME name of currently connected database\n"));
+printf(_("  ECHO   write all input lines to standard output\n"));

I found that some help message lines use the base form of a verb while
others do not. Should we standardize them?

+printf(_("  PROMPT1, PROMPT2, PROMPT3  specify the psql prompt\n"));

When the option name field is long, we should add a new line just
after the name field
and align the starting position of the option explanation field. That is,
for example, the above should be

printf(_("  PROMPT1, PROMPT2, PROMPT3\n"
         "                     specify the psql prompt\n"));

+printf(_("  ON_ERROR_ROLLBACK  when on, ROLLBACK on error\n"));

This message seems incorrect to me. When this option is on and an error occurs
in a transaction, the transaction continues rather than being rolled back, IIUC.
I have not checked all the help messages yet, but ISTM some of them are not
correct. It would be better to check them again.

+printf(_("  PSQL_RC            alternative location of the user's .psqlrc file\n"));

Typo: PSQL_RC should be PSQLRC

+printf(_("  PGDATABASE same as the dbname connection parameter\n"));
+printf(_("  PGHOST same as the host connection parameter\n"));
+printf(_("  PGPORT same as the port connection parameter\n"));
+printf(_("  PGUSER same as the user connection parameter\n"));
+printf(_("  PGPASSWORD possibility to set password (not recommended)\n"));
+printf(_("  PGPASSFILE password file (default ~/.pgpass)\n"));

I don't think that psql needs to display help messages even for the
environment variables supported by libpq.
Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-23 Thread Fujii Masao
On Mon, Jun 23, 2014 at 3:49 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 Hello, thank you for the comments.

 On Sun, Jun 22, 2014 at 8:54 PM, Simon Riggs si...@2ndquadrant.com wrote:
  On 13 June 2014 12:27, Fujii Masao masao.fu...@gmail.com wrote:
 
  I think that pg_resetxlog should reset backup locations by default
  since they are useless (rather harmful) after pg_resetxlog. Thought?
 
  +1
 
  Do we regard that point as a bug that should be backpatched?

 Yep, I think so.

 ... Ouch, I was too short-sighted :( That is pretty natural to do
 so after hearing that. I should have pointed this at the previous
 discusion.

 I assume the primary usage of this patch to be, as described
 before, Dissolving a recovery freezing caused by wrongly placed
 backup label. Crash recovery has been already done at that time
 so resetxlog's current behavior seems also fittin the situation,
 I suppose.

One question: is there a case where a user wants to reset only the
backup locations? I'm not sure such cases exist. If they do, we should
probably implement a new option which resets only the backup locations.
Thoughts?

 Ok, I'm doing modify it to reset backup locations by default and
 remove the new option '-b' to do that. Since this seems looking
 to be a bug for the poeple, I'll provide backpatches back
 to... 8.4?  (Final release of 8.4 is scheduled at July 2014)

I was thinking that we don't need to backpatch this to 8.4 because
8.4 doesn't have any backup locations. No?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-23 Thread Pavel Stehule
Hello


2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hello
 
  third version with Erik's update

 Here are some my comments:

 The document of psql needs to be updated. At least the description of new
 option
 this patch adds needs to be added into the document.

 +printf(_(  --help-variables list of available
 configuration variables (options), then exit\n));

 We should get rid of of from the message, or add show in front of
 list of?

 +printf(_(  ECHO   write all input lines to standard
 output\n));

 This message seems not to be correct when ECHO=queries is set.

 +printf(_(  COMP_KEYWORD_CASE  determines which letter case to
 use when completing an SQL key word\n));
 +printf(_(  DBNAME name of currently connected
 database\n));
 +printf(_(  ECHO   write all input lines to standard
 output\n));

 I found that some help message line uses a normal form of a verb, but
 other does not.
 We should standardize them?

 +printf(_(  PROMPT1, PROMPT2, PROMPT3  specify the psql prompt\n));

 When the option name field is long, we should add a new line just
 after the name field
 and align the starting position of the option explanation field. That is,
 for example, the above should be

 printf(_(  PROMPT1, PROMPT2, PROMPT3\n
   specify the psql prompt\n));

 +printf(_(  ON_ERROR_ROLLBACK  when on, ROLLBACK on error\n));

 This message seems incorrect to me. When this option is on and an error
 occurs
 in transaction, transaction continues rather than ROLLBACK occurs, IIUC.
 I did not check whole help messages yet, but ISTM some messages are not
 correct.
 It's better to check them again.

 +printf(_(  PSQL_RCalternative location of the user's
 .psqlrc file\n));

 Typo: PSQL_RC should be PSQLRC

 +printf(_(  PGDATABASE same as the dbname connection
 parameter\n));
 +printf(_(  PGHOST same as the host connection
 parameter\n));
 +printf(_(  PGPORT same as the port connection
 parameter\n));
 +printf(_(  PGUSER same as the user connection
 parameter\n));
 +printf(_(  PGPASSWORD possibility to set password (not
 recommended)\n));
 +printf(_(  PGPASSFILE password file (default ~/.pgpass)\n));

 I don't think that psql needs to display the help messages of even
 environment
 variables supported by libpq.


The main reason is PGPASSWORD -- it is probably the most-used environment
variable with psql.

PGPASSWORD=** psql is a very commonly used pattern.

Regards

Pavel Stehule


 Regards,

 --
 Fujii Masao



Re: [HACKERS] Audit of logout

2014-06-23 Thread Fujii Masao
On Sat, Jun 21, 2014 at 12:59 PM, Joe Conway m...@joeconway.com wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1

 On 06/13/2014 07:29 AM, Tom Lane wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 On Thu, Jun 12, 2014 at 8:51 PM, Fujii Masao
 masao.fu...@gmail.com wrote:
 Some users enable log_disconnections in postgresql.conf to
 audit all logouts. But since log_disconnections is defined with
 PGC_BACKEND, it can be changed at connection start. This means
 that any client (even nonsuperuser) can freely disable
 log_disconnections not to log his or her logout even when the
 system admin enables it in postgresql.conf. Isn't this
 problematic for audit?

 That's harmful for audit purpose. I think that we should make
 log_disconnections PGC_SUSET rather than PGC_BACKEND in order to
 forbid non-superusers from changing its setting. Attached patch
 does this.

 This whole argument seems wrong unless I'm missing something:

 test=# set log_connections = on;
 ERROR:  parameter "log_connections" cannot be set after connection start
 test=# set log_disconnections = off;
 ERROR:  parameter "log_disconnections" cannot be set after connection start

You can change log_connections/disconnections via connection options as follows:

$ grep log_disconnections $PGDATA/postgresql.conf
log_disconnections = on

$ psql -U hoge -d options='-c log_disconnections=off'
= show log_disconnections ;
 log_disconnections
--------------------
 off
(1 row)

= \du
                             List of roles
 Role name |                   Attributes                   | Member of
-----------+------------------------------------------------+-----------
 hoge      |                                                | {}
 postgres  | Superuser, Create role, Create DB, Replication | {}


 I wonder whether we should just get rid of log_disconnections as a
 separate variable, instead logging disconnections when
 log_connections is set.


 That might be a good idea though.

David pointed out the merit of keeping those two parameters separate
upthread, and I understand his thinking.
http://www.postgresql.org/message-id/1402675662004-5807224.p...@n5.nabble.com
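
For reference, a hedged sketch of the PGC_SUSET change proposed upthread,
following the config_bool entry layout in src/backend/utils/misc/guc.c
(the descriptive text is assumed, not taken from the actual patch):

/* Sketch only: let only superusers change the setting */
{
    {"log_disconnections", PGC_SUSET, LOGGING_WHAT,   /* was PGC_BACKEND */
        gettext_noop("Logs end of a session, including duration."),
        NULL
    },
    &Log_disconnections,
    false,
    NULL, NULL, NULL
},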

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-23 Thread Fujii Masao
On Mon, Jun 23, 2014 at 5:10 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 Hello


 2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hello
 
  third version with Erik's update

 Here are some my comments:

 The document of psql needs to be updated. At least the description of new
 option
 this patch adds needs to be added into the document.

 +printf(_(  --help-variables list of available
 configuration variables (options), then exit\n));

 We should get rid of of from the message, or add show in front of
 list of?

 +printf(_(  ECHO   write all input lines to standard
 output\n));

 This message seems not to be correct when ECHO=queries is set.

 +printf(_(  COMP_KEYWORD_CASE  determines which letter case to
 use when completing an SQL key word\n));
 +printf(_(  DBNAME name of currently connected
 database\n));
 +printf(_(  ECHO   write all input lines to standard
 output\n));

 I found that some help message line uses a normal form of a verb, but
 other does not.
 We should standardize them?

 +printf(_(  PROMPT1, PROMPT2, PROMPT3  specify the psql prompt\n));

 When the option name field is long, we should add a new line just
 after the name field
 and align the starting position of the option explanation field. That is,
 for example, the above should be

 printf(_(  PROMPT1, PROMPT2, PROMPT3\n
   specify the psql prompt\n));

 +printf(_(  ON_ERROR_ROLLBACK  when on, ROLLBACK on error\n));

 This message seems incorrect to me. When this option is on and an error
 occurs
 in transaction, transaction continues rather than ROLLBACK occurs, IIUC.
 I did not check whole help messages yet, but ISTM some messages are not
 correct.
 It's better to check them again.

 +printf(_(  PSQL_RCalternative location of the user's
 .psqlrc file\n));

 Typo: PSQL_RC should be PSQLRC

 +printf(_(  PGDATABASE same as the dbname connection
 parameter\n));
 +printf(_(  PGHOST same as the host connection
 parameter\n));
 +printf(_(  PGPORT same as the port connection
 parameter\n));
 +printf(_(  PGUSER same as the user connection
 parameter\n));
 +printf(_(  PGPASSWORD possibility to set password (not
 recommended)\n));
 +printf(_(  PGPASSFILE password file (default
 ~/.pgpass)\n));

 I don't think that psql needs to display the help messages of even
 environment
 variables supported by libpq.


 Main reason is a PGPASSWORD -- it is probably most used env variable with
 psql

 PGPASSWORD=** psql is very often used pattern

But it's not recommended, as the help message which the patch added says ;)

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-23 Thread Pavel Stehule
2014-06-23 10:57 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Mon, Jun 23, 2014 at 5:10 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hello
 
 
  2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
 
  On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule 
 pavel.steh...@gmail.com
  wrote:
   Hello
  
   third version with Erik's update
 
  Here are some my comments:
 
  The document of psql needs to be updated. At least the description of
 new
  option
  this patch adds needs to be added into the document.
 
  +printf(_(  --help-variables list of available
  configuration variables (options), then exit\n));
 
  We should get rid of of from the message, or add show in front of
  list of?
 
  +printf(_(  ECHO   write all input lines to standard
  output\n));
 
  This message seems not to be correct when ECHO=queries is set.
 
  +printf(_(  COMP_KEYWORD_CASE  determines which letter case to
  use when completing an SQL key word\n));
  +printf(_(  DBNAME name of currently connected
  database\n));
  +printf(_(  ECHO   write all input lines to standard
  output\n));
 
  I found that some help message line uses a normal form of a verb, but
  other does not.
  We should standardize them?
 
  +printf(_(  PROMPT1, PROMPT2, PROMPT3  specify the psql
 prompt\n));
 
  When the option name field is long, we should add a new line just
  after the name field
  and align the starting position of the option explanation field. That
 is,
  for example, the above should be
 
  printf(_(  PROMPT1, PROMPT2, PROMPT3\n
specify the psql prompt\n));
 
  +printf(_(  ON_ERROR_ROLLBACK  when on, ROLLBACK on error\n));
 
  This message seems incorrect to me. When this option is on and an error
  occurs
  in transaction, transaction continues rather than ROLLBACK occurs, IIUC.
  I did not check whole help messages yet, but ISTM some messages are not
  correct.
  It's better to check them again.
 
  +printf(_(  PSQL_RCalternative location of the user's
  .psqlrc file\n));
 
  Typo: PSQL_RC should be PSQLRC
 
  +printf(_(  PGDATABASE same as the dbname connection
  parameter\n));
  +printf(_(  PGHOST same as the host connection
  parameter\n));
  +printf(_(  PGPORT same as the port connection
  parameter\n));
  +printf(_(  PGUSER same as the user connection
  parameter\n));
  +printf(_(  PGPASSWORD possibility to set password (not
  recommended)\n));
  +printf(_(  PGPASSFILE password file (default
  ~/.pgpass)\n));
 
  I don't think that psql needs to display the help messages of even
  environment
  variables supported by libpq.
 
 
  Main reason is a PGPASSWORD -- it is probably most used env variable with
  psql
 
  PGPASSWORD=** psql is very often used pattern

 But it's not recommended as the help message which the patch added says ;)


Why?

Who can see this value?

regards

Pavel



 Regards,

 --
 Fujii Masao



Re: [HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-23 Thread Kyotaro HORIGUCHI
Hi,

At Mon, 23 Jun 2014 17:10:05 +0900, Fujii Masao masao.fu...@gmail.com wrote 
in cahgqgwfy_cdmfuriu6zoat2htqo_eijaj7vwewysol15oct...@mail.gmail.com
  I assume the primary usage of this patch to be, as described
  before, Dissolving a recovery freezing caused by wrongly placed
  backup label. Crash recovery has been already done at that time
  so resetxlog's current behavior seems also fittin the situation,
  I suppose.
 
 One question is; is there case where a user wants to reset only
 backup locations? I'm not sure if there are such cases. If they exist,
 probably we should implement new option which resets only backup
 locations. Thought?

As I described above, I don't see an obvious use case where ONLY the
backup locations need to be reset. The option you proposed would not
only clear the backup locations but also skip the xlog reset that would
be done without the option. That behavior would puzzle users.

  Ok, I'm doing modify it to reset backup locations by default and
  remove the new option '-b' to do that. Since this seems looking
  to be a bug for the poeple, I'll provide backpatches back
  to... 8.4?  (Final release of 8.4 is scheduled at July 2014)
 
 I was thinking that we don't need to backpatch this to 8.4 because
 8.4 doesn't have any backup locations. No?

I'm not so knowledgeable about the ancient(?) versions :p Of course the
phrase meant going back only as far as this patch has any meaning.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2014-06-23 Thread Ashutosh Bapat
Hi,
Selecting tableoid on the parent causes an error: ERROR:  cannot extract
system attribute from virtual tuple. The foreign table has an OID which
could be reported as tableoid for the rows coming from that foreign table.
Do we want to do that?


On Fri, Jun 20, 2014 at 1:34 PM, Kyotaro HORIGUCHI 
horiguchi.kyot...@lab.ntt.co.jp wrote:

 Hello,

 Before continuing the discussion, I tried this patch on the current
 head.

 The patch applies cleanly except for one hunk, because of a
 modification just AFTER it, which did no harm. All regression tests
 succeeded.

 Attached is the rebased patch of v11 up to the current master.

 regards,

 --
 Kyotaro Horiguchi
 NTT Open Source Software Center


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Add the number of pinning backends to pg_buffercache's output

2014-06-23 Thread Fujii Masao
On Sat, Apr 12, 2014 at 9:25 PM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 The last week I twice had the need to see how many backends had some
 buffers pinned. Once during development and once while analyzing a stuck
 vacuum (waiting for a cleanup lock).
 I'd like to add a column to pg_buffercache exposing that. The name I've
 come up with is 'pinning_backends' to reflect the fact that it's not the
 actual pincount due to the backend private arrays.

This name sounds good to me.

+CREATE OR REPLACE VIEW pg_buffercache AS
+SELECT P.* FROM pg_buffercache_pages() AS P
+(bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
+ relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
+ pincount int8);

pincount should be pinning_backends here?

This may be harmless, but pinning_backends should be defined as int4 rather
than int8, because BufferDesc->refcount is just defined as unsigned and it's
converted to a Datum by Int32GetDatum().

+-- complain if script is sourced in psql, rather than via CREATE EXTENSION

s/CREATE/ALTER

+\echo Use "CREATE EXTENSION pg_buffercache" to load this file. \quit

The message should be something like "ALTER EXTENSION pg_buffercache
UPDATE TO '1.1'".

+/* might not be used, but the array is long enough */
+values[8] = Int32GetDatum(fctx->record[i].pinning_backends);
+nulls[8] = false;

Why is the above source comment needed?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add the number of pinning backends to pg_buffercache's output

2014-06-23 Thread Andres Freund
On 2014-06-23 18:44:24 +0900, Fujii Masao wrote:
 On Sat, Apr 12, 2014 at 9:25 PM, Andres Freund and...@2ndquadrant.com wrote:
  Hi,
 
  The last week I twice had the need to see how many backends had some
  buffers pinned. Once during development and once while analyzing a stuck
  vacuum (waiting for a cleanup lock).
  I'd like to add a column to pg_buffercache exposing that. The name I've
  come up with is 'pinning_backends' to reflect the fact that it's not the
  actual pincount due to the backend private arrays.
 
 This name sounds good to me.
 
 +CREATE OR REPLACE VIEW pg_buffercache AS
 +SELECT P.* FROM pg_buffercache_pages() AS P
 +(bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
 + relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
 + pincount int8);
 
 pincount should be pinning_backends here?

Yes. I'd changed my mind a couple of times and apparently didn't
send the right version of the patch. Thanks.

 This may be harmless but pinning_backends should be defined as int4
 rather than int8
 because BufferDesc->refcount is just defined as unsigned and it's
 converted to Datum
 by Int32GetDatum().

Well, in theory a uint32 can't generally be converted to an int32. That's
why I chose an int64: it's guaranteed to have sufficient range. It's
fairly unlikely to have that many pins, but using an int64 seems cheap
enough here.
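
A tiny hedged sketch of the point (the variable names are assumed, not from
the patch):

/* Sketch only: a uint32 refcount may exceed INT32_MAX in theory, so
 * widen it to 64 bits for an int8 column instead of casting to int32. */
uint32 refcount = pinning_backend_count;      /* hypothetical local */
values[8] = Int64GetDatum((int64) refcount);  /* lossless widening */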

 +-- complain if script is sourced in psql, rather than via CREATE EXTENSION
 
 s/CREATE/ALTER
 
 +\echo Use CREATE EXTENSION pg_buffercache to load this file. \quit

Hm, right.

 The message should be something like ALTER EXTENSION pg_buffercache
 UPDATE TO '1.1'.
 
 +/* might not be used, but the array is long enough */
  +values[8] = Int32GetDatum(fctx->record[i].pinning_backends);
 +nulls[8] = false;
 
 Why is the above source comment needed?

It tries to explain that, while the caller doesn't necessarily look at
values[8] (if it's the old pg_proc entry), we're guaranteed to have
allocated a long enough values/nulls array.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-23 Thread Fujii Masao
On Mon, Jun 23, 2014 at 6:06 PM, Pavel Stehule pavel.steh...@gmail.com wrote:



 2014-06-23 10:57 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Mon, Jun 23, 2014 at 5:10 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hello
 
 
  2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
 
  On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule
  pavel.steh...@gmail.com
  wrote:
   Hello
  
   third version with Erik's update
 
  Here are some my comments:
 
  The document of psql needs to be updated. At least the description of
  new
  option
  this patch adds needs to be added into the document.
 
  +printf(_(  --help-variables list of available
  configuration variables (options), then exit\n));
 
  We should get rid of of from the message, or add show in front of
  list of?
 
  +printf(_(  ECHO   write all input lines to standard
  output\n));
 
  This message seems not to be correct when ECHO=queries is set.
 
  +printf(_(  COMP_KEYWORD_CASE  determines which letter case to
  use when completing an SQL key word\n));
  +printf(_(  DBNAME name of currently connected
  database\n));
  +printf(_(  ECHO   write all input lines to standard
  output\n));
 
  I found that some help message line uses a normal form of a verb, but
  other does not.
  We should standardize them?
 
  +printf(_(  PROMPT1, PROMPT2, PROMPT3  specify the psql
  prompt\n));
 
  When the option name field is long, we should add a new line just
  after the name field
  and align the starting position of the option explanation field. That
  is,
  for example, the above should be
 
  printf(_(  PROMPT1, PROMPT2, PROMPT3\n
specify the psql prompt\n));
 
  +printf(_(  ON_ERROR_ROLLBACK  when on, ROLLBACK on error\n));
 
  This message seems incorrect to me. When this option is on and an error
  occurs
  in transaction, transaction continues rather than ROLLBACK occurs,
  IIUC.
  I did not check whole help messages yet, but ISTM some messages are not
  correct.
  It's better to check them again.
 
  +printf(_(  PSQL_RCalternative location of the user's
  .psqlrc file\n));
 
  Typo: PSQL_RC should be PSQLRC
 
  +printf(_(  PGDATABASE same as the dbname connection
  parameter\n));
  +printf(_(  PGHOST same as the host connection
  parameter\n));
  +printf(_(  PGPORT same as the port connection
  parameter\n));
  +printf(_(  PGUSER same as the user connection
  parameter\n));
  +printf(_(  PGPASSWORD possibility to set password (not
  recommended)\n));
  +printf(_(  PGPASSFILE password file (default
  ~/.pgpass)\n));
 
  I don't think that psql needs to display the help messages of even
  environment
  variables supported by libpq.
 
 
  Main reason is a PGPASSWORD -- it is probably most used env variable
  with
  psql
 
  PGPASSWORD=** psql is very often used pattern

 But it's not recommended as the help message which the patch added says ;)


 why?

 who can see this value?

I'm sure that the document explains this.

http://www.postgresql.org/docs/devel/static/libpq-envars.html
---
PGPASSWORD behaves the same as the password connection parameter.
Use of this environment variable is not recommended for security reasons,
as some operating systems allow non-root users to see process environment
variables via ps; instead consider using the ~/.pgpass file
---

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add the number of pinning backends to pg_buffercache's output

2014-06-23 Thread Fujii Masao
On Mon, Jun 23, 2014 at 6:51 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-23 18:44:24 +0900, Fujii Masao wrote:
 On Sat, Apr 12, 2014 at 9:25 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Hi,
 
  The last week I twice had the need to see how many backends had some
  buffers pinned. Once during development and once while analyzing a stuck
  vacuum (waiting for a cleanup lock).
  I'd like to add a column to pg_buffercache exposing that. The name I've
  come up with is 'pinning_backends' to reflect the fact that it's not the
  actual pincount due to the backend private arrays.

 This name sounds good to me.

 +CREATE OR REPLACE VIEW pg_buffercache AS
 +SELECT P.* FROM pg_buffercache_pages() AS P
 +(bufferid integer, relfilenode oid, reltablespace oid, reldatabase oid,
 + relforknumber int2, relblocknumber int8, isdirty bool, usagecount int2,
 + pincount int8);

 pincount should be pinning_backends here?

 Yes. I'd changed my mind around a couple of times and apparently didn't
 send the right version of the patch. Thanks.

 This may be harmless but pinning_backends should be defined as int4
 rather than int8
 because BufferDesc->refcount is just defined as unsigned and it's
 converted to Datum
 by Int32GetDatum().

 Well, in theory a uint32 can't generally be converted to a int32. That's
 why I chose a int64 because it's guaranteed to have sufficient
 range. It's fairly unlikely to have that many pins, but using a int64
 seems cheap enough here.

Yep, you're right.

 +-- complain if script is sourced in psql, rather than via CREATE EXTENSION

 s/CREATE/ALTER

 +\echo Use CREATE EXTENSION pg_buffercache to load this file. \quit

 Hm, right.

 The message should be something like ALTER EXTENSION pg_buffercache
 UPDATE TO '1.1'.

 +/* might not be used, but the array is long enough */
  +values[8] = Int32GetDatum(fctx->record[i].pinning_backends);
 +nulls[8] = false;

 Why is the above source comment needed?

 It tries to explain that while the caller doesn't necessarily look at
 values[8] (if it's the old pg_proc entry) but we're guaranteed to have
 allocated a long enough values/nulls array.

Understood.

I think you can commit this patch after fixing some minor things.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] crash with assertions and WAL_DEBUG

2014-06-23 Thread Heikki Linnakangas

On 06/21/2014 01:58 PM, Heikki Linnakangas wrote:

It's a bit difficult to attach the mark to the palloc calls, as neither
the WAL_DEBUG or LWLOCK_STATS code is calling palloc directly, but
marking specific MemoryContexts as sanctioned ought to work. I'll take a
stab at that.


I came up with the attached patch. It adds a function called 
MemoryContextAllowInCriticalSection(), which can be used to exempt 
specific memory contexts from the assertion. The following contexts are 
exempted:


* ErrorContext
* MdCxt, which is used in the checkpointer to absorb fsync requests (the
checkpointer process as a whole is no longer exempt)
* The temporary StringInfos used in WAL_DEBUG (a new "WAL Debug" memory
context is now created for them)
* The LWLock stats hash table (a new "LWLock stats" context is created for it)

Barring objections, I'll commit this to master, and remove the assertion 
from REL9_4_STABLE.


- Heikki

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index abc5682..e141a28 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -60,6 +60,7 @@
 #include "storage/spin.h"
 #include "utils/builtins.h"
 #include "utils/guc.h"
+#include "utils/memutils.h"
 #include "utils/ps_status.h"
 #include "utils/relmapper.h"
 #include "utils/snapmgr.h"
@@ -1258,6 +1259,25 @@ begin:;
 	if (XLOG_DEBUG)
 	{
 		StringInfoData buf;
+		static MemoryContext walDebugCxt = NULL;
+		MemoryContext oldCxt;
+
+		/*
+		 * Allocations within a critical section are normally not allowed,
+		 * because allocation failure would lead to a PANIC. But this is just
+		 * debugging code that no-one is going to enable in production, so we
+		 * don't care. Use a memory context that's exempt from the rule.
+		 */
+		if (walDebugCxt == NULL)
+		{
+			walDebugCxt = AllocSetContextCreate(TopMemoryContext,
"WAL Debug",
+ALLOCSET_DEFAULT_MINSIZE,
+ALLOCSET_DEFAULT_INITSIZE,
+ALLOCSET_DEFAULT_MAXSIZE);
+			MemoryContextAllowInCriticalSection(walDebugCxt, true);
+		}
+		oldCxt = MemoryContextSwitchTo(walDebugCxt);
 
 		initStringInfo(&buf);
 		appendStringInfo(&buf, "INSERT @ %X/%X: ",
@@ -1282,10 +1302,11 @@ begin:;
 
 			appendStringInfoString(&buf, " - ");
 			RmgrTable[rechdr->xl_rmid].rm_desc(&buf, (XLogRecord *) recordbuf.data);
-			pfree(recordbuf.data);
 		}
 		elog(LOG, "%s", buf.data);
-		pfree(buf.data);
+
+		MemoryContextSwitchTo(oldCxt);
+		MemoryContextReset(walDebugCxt);
 	}
 #endif
 
diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index d23ac62..debce86 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -106,6 +106,7 @@ typedef struct lwlock_stats
 
 static int	counts_for_pid = 0;
 static HTAB *lwlock_stats_htab;
+static MemoryContext lwlock_stats_cxt;
 #endif
 
 #ifdef LOCK_DEBUG
@@ -143,16 +144,34 @@ init_lwlock_stats(void)
 {
 	HASHCTL		ctl;
 
-	if (lwlock_stats_htab != NULL)
+	if (lwlock_stats_cxt != NULL)
 	{
-		hash_destroy(lwlock_stats_htab);
+		MemoryContextDelete(lwlock_stats_cxt);
+		lwlock_stats_cxt = NULL;
+		/* the hash table went away with the context */
 		lwlock_stats_htab = NULL;
 	}
 
+	/*
+	 * The LWLock stats will be updated within a critical section, which
+	 * requires allocating new hash entries. Allocations within a critical
+	 * section are normally not allowed because running out of memory would
+	 * lead to a PANIC, but LWLOCK_STATS is debugging code that's not normally
+	 * turned on in production, so that's an acceptable risk. The hash entries
+	 * are small, so the risk of running out of memory is minimal in practice.
+	 */
+	lwlock_stats_cxt = AllocSetContextCreate(TopMemoryContext,
+			 "LWLock stats",
+			 ALLOCSET_DEFAULT_MINSIZE,
+			 ALLOCSET_DEFAULT_INITSIZE,
+			 ALLOCSET_DEFAULT_MAXSIZE);
+	MemoryContextAllowInCriticalSection(lwlock_stats_cxt, true);
+
 	MemSet(&ctl, 0, sizeof(ctl));
 	ctl.keysize = sizeof(lwlock_stats_key);
 	ctl.entrysize = sizeof(lwlock_stats);
 	ctl.hash = tag_hash;
+	ctl.hcxt = lwlock_stats_cxt;
 	lwlock_stats_htab = hash_create("lwlock stats", 16384, &ctl,
 	HASH_ELEM | HASH_FUNCTION);
 	counts_for_pid = MyProcPid;
diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
index 3c1c81a..4264373 100644
--- a/src/backend/storage/smgr/md.c
+++ b/src/backend/storage/smgr/md.c
@@ -219,6 +219,16 @@ mdinit(void)
 	  &hash_ctl,
    HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
 		pendingUnlinks = NIL;
+
+		/*
+		 * XXX: The checkpointer needs to add entries to the pending ops
+		 * table when absorbing fsync requests. That is done within a critical
+		 * section. It means that there's a theoretical possibility that you
+		 * run out of memory while absorbing fsync requests, which leads to
+		 * a PANIC. Fortunately the hash table is small so that's unlikely to
+		 * happen in practice.
+		 */
+		MemoryContextAllowInCriticalSection(MdCxt, true);
 	}
 }
 
diff 

Re: [HACKERS] crash with assertions and WAL_DEBUG

2014-06-23 Thread Andres Freund
On 2014-06-23 12:58:19 +0300, Heikki Linnakangas wrote:
 On 06/21/2014 01:58 PM, Heikki Linnakangas wrote:
 It's a bit difficult to attach the mark to the palloc calls, as neither
 the WAL_DEBUG or LWLOCK_STATS code is calling palloc directly, but
 marking specific MemoryContexts as sanctioned ought to work. I'll take a
 stab at that.
 
 I came up with the attached patch. It adds a function called
 MemoryContextAllowInCriticalSection(), which can be used to exempt specific
 memory contexts from the assertion. The following contexts are exempted:
 
 * ErrorContext
 * MdCxt, which is used in checkpointer to absorb fsync requests. (the
 checkpointer process as a whole is no longer exempt)
 * The temporary StringInfos used in WAL_DEBUG (a new memory WAL Debug
 context is now created for them)
 * LWLock stats hash table (a new LWLock stats context is created for it)
 
 Barring objections, I'll commit this to master, and remove the assertion
 from REL9_4_STABLE.

Sounds generally sane. Some comments:

 diff --git a/src/backend/access/transam/xlog.c 
 b/src/backend/access/transam/xlog.c
 index abc5682..e141a28 100644
 --- a/src/backend/access/transam/xlog.c
 +++ b/src/backend/access/transam/xlog.c
 @@ -60,6 +60,7 @@
  #include storage/spin.h
  #include utils/builtins.h
  #include utils/guc.h
 +#include utils/memutils.h
  #include utils/ps_status.h
  #include utils/relmapper.h
  #include utils/snapmgr.h
 @@ -1258,6 +1259,25 @@ begin:;
   if (XLOG_DEBUG)
   {
   StringInfoData buf;
  + static MemoryContext walDebugCxt = NULL;
  + MemoryContext oldCxt;
  +
  + /*
  +  * Allocations within a critical section are normally not allowed,
  +  * because allocation failure would lead to a PANIC. But this is just
  +  * debugging code that no-one is going to enable in production, so we
  +  * don't care. Use a memory context that's exempt from the rule.
  +  */
  + if (walDebugCxt == NULL)
  + {
  +     walDebugCxt = AllocSetContextCreate(TopMemoryContext,
  +                                         "WAL Debug",
  +                                         ALLOCSET_DEFAULT_MINSIZE,
  +                                         ALLOCSET_DEFAULT_INITSIZE,
  +                                         ALLOCSET_DEFAULT_MAXSIZE);
  +     MemoryContextAllowInCriticalSection(walDebugCxt, true);
  + }
  + oldCxt = MemoryContextSwitchTo(walDebugCxt);

This will only work though if the first XLogInsert() isn't called from a
critical section. I'm not sure it's a good idea to rely on that.

 diff --git a/src/backend/storage/smgr/md.c b/src/backend/storage/smgr/md.c
 index 3c1c81a..4264373 100644
 --- a/src/backend/storage/smgr/md.c
 +++ b/src/backend/storage/smgr/md.c
 @@ -219,6 +219,16 @@ mdinit(void)
 
  	  &hash_ctl,
  	  HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
    pendingUnlinks = NIL;
 +
  + /*
  +  * XXX: The checkpointer needs to add entries to the pending ops
  +  * table when absorbing fsync requests. That is done within a critical
  +  * section. It means that there's a theoretical possibility that you
  +  * run out of memory while absorbing fsync requests, which leads to
  +  * a PANIC. Fortunately the hash table is small so that's unlikely to
  +  * happen in practice.
  +  */
  + MemoryContextAllowInCriticalSection(MdCxt, true);
   }
  }

Isn't that allowing a bit too much? We shouldn't, e.g., allow
_fdvec_alloc() within a critical section. It might make sense to create
a child context for it.
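
Something like the following hedged sketch, with the context name assumed:

/* Sketch only: exempt just the pending-ops hash, not all of MdCxt */
MemoryContext pendingOpsCxt;

pendingOpsCxt = AllocSetContextCreate(MdCxt,
                                      "Pending ops context",
                                      ALLOCSET_DEFAULT_MINSIZE,
                                      ALLOCSET_DEFAULT_INITSIZE,
                                      ALLOCSET_DEFAULT_MAXSIZE);
MemoryContextAllowInCriticalSection(pendingOpsCxt, true);
hash_ctl.hcxt = pendingOpsCxt;    /* hash entries then live here */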

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Use a signal to trigger a memory context dump?

2014-06-23 Thread Andres Freund
Hi,

I wonder if it'd make sense to allow a signal to trigger a memory
context dump? I and others have more than once needed to examine memory
usage on production systems, and using gdb isn't always realistic.
I wonder if we could install a signal handler for some unused signal
(e.g. SIGPWR) to dump memory.
I'd also considered adding a SQL function that uses the SIGUSR1 signal
multiplexing for the purpose, but that's not necessarily nice if you have
to investigate while SQL access isn't yet possible. There's also the
problem that not all potentially interesting processes use the SIGUSR1
signal multiplexing.
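
A rough sketch of the idea (hedged: SIGPWR availability is assumed, and
MemoryContextStats() is not async-signal-safe, so a real patch would more
likely set a flag here and dump at the next CHECK_FOR_INTERRUPTS()):

#include "postgres.h"
#include <signal.h>
#include "utils/memutils.h"

/* Sketch only: dump the memory context tree to stderr on SIGPWR */
static void
dump_memory_contexts(SIGNAL_ARGS)
{
    MemoryContextStats(TopMemoryContext);
}

/* ... during process startup ... */
pqsignal(SIGPWR, dump_memory_contexts);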

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-06-23 Thread Fujii Masao
On Fri, May 2, 2014 at 3:19 PM, Ian Barwick i...@2ndquadrant.com wrote:
 Hi

 Here is an initial version of an auditing extension for Postgres to
 generate log output suitable for compiling a comprehensive audit trail
 of database operations.

You added this to the CF, but its patch has not been posted yet. Are you
planning to post a patch?

 Planned future improvements include:

 1. Additional logging facilities, including to a separate audit
log file and to syslog, and potentially logging to a table
(possibly via a bgworker process). Currently output is simply
emitted to the server log via ereport().

 2. To implement per-object auditing configuration, it would be nice to use
extensible reloptions (or an equivalent mechanism)

Is it possible to implement these outside PostgreSQL by using hooks?
If not, it might be better to implement the audit feature in core from
the beginning.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-23 Thread Andres Freund
On 2014-06-22 19:03:32 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 
  I think we'll want a version of this that just fails the
  transaction once we have the infrastructure. So we should choose
  a name that allows for a complementary GUC.
 
 If we stick with the rule that what is to the left of _timeout is
 what is being cancelled, then a GUC to cancel a transaction which
 remains idle for too long could be called idle_transaction_timeout.
 
 Do you disagree with the general idea of following that pattern?

I think that'd be rather confusing. For one, it'd need to be
idle_in_transaction_timeout, which already seems less clear (because the
transaction belongs to idle), and for another, that distinction seems to
be too subtle for users.

The reason I suggested
idle_in_transaction_termination/cancellation_timeout is that they map
nicely to pg_terminate/cancel_backend() and are rather descriptive.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-06-23 Thread Abhijit Menon-Sen
(I'm replying as co-author of pgaudit.)

At 2014-06-23 19:15:39 +0900, masao.fu...@gmail.com wrote:

 You added this into CF, but its patch has not been posted yet. Are you
 planning to make a patch?

It's a self-contained contrib module. I thought Ian had posted a
tarball, but it looks like he forgot to attach it (or decided to
provide only a Github link). I've attached a tarball here for
your reference.

  Planned future improvements include:
 
  1. Additional logging facilities, including to a separate audit
 log file and to syslog, and potentially logging to a table
 (possibly via a bgworker process). Currently output is simply
 emitted to the server log via ereport().
 
  2. To implement per-object auditing configuration, it would be nice
 to use extensible reloptions (or an equivalent mechanism)
 
 Is it possible to implement these outside PostgreSQL by using hooks?

There are some unresolved questions with #2 because the extensible
reloptions patch seems to have lost favour, but I'm pretty sure we
could figure out some alternative.

 If not, it might be better to implement audit feature in core from the
 beginning.

Sure, we're open to that possibility. Do you have any ideas about what
an in-core implementation should do/look like?

-- Abhijit


pgaudit.tgz
Description: GNU Unix tar archive

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-23 Thread Pavel Stehule
2014-06-23 11:53 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Mon, Jun 23, 2014 at 6:06 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
 
 
 
  2014-06-23 10:57 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
 
  On Mon, Jun 23, 2014 at 5:10 PM, Pavel Stehule pavel.steh...@gmail.com
 
  wrote:
   Hello
  
  
   2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com:
  
   On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule
   pavel.steh...@gmail.com
   wrote:
Hello
   
third version with Erik's update
  
   Here are some my comments:
  
   The document of psql needs to be updated. At least the description of
   new
   option
   this patch adds needs to be added into the document.
  
   +printf(_(  --help-variables list of available
   configuration variables (options), then exit\n));
  
   We should get rid of of from the message, or add show in front of
   list of?
  
   +printf(_(  ECHO   write all input lines to standard
   output\n));
  
   This message seems not to be correct when ECHO=queries is set.
  
   +printf(_(  COMP_KEYWORD_CASE  determines which letter case to
   use when completing an SQL key word\n));
   +printf(_(  DBNAME name of currently connected
   database\n));
   +printf(_(  ECHO   write all input lines to standard
   output\n));
  
   I found that some help message line uses a normal form of a verb, but
   other does not.
   We should standardize them?
  
   +printf(_(  PROMPT1, PROMPT2, PROMPT3  specify the psql
   prompt\n));
  
   When the option name field is long, we should add a new line just
   after the name field
   and align the starting position of the option explanation field. That
   is,
   for example, the above should be
  
   printf(_(  PROMPT1, PROMPT2, PROMPT3\n
 specify the psql prompt\n));
  
   +printf(_(  ON_ERROR_ROLLBACK  when on, ROLLBACK on error\n));
  
   This message seems incorrect to me. When this option is on and an
 error
   occurs
   in transaction, transaction continues rather than ROLLBACK occurs,
   IIUC.
   I did not check whole help messages yet, but ISTM some messages are
 not
   correct.
   It's better to check them again.
  
   +printf(_(  PSQL_RCalternative location of the
 user's
   .psqlrc file\n));
  
   Typo: PSQL_RC should be PSQLRC
  
   +printf(_(  PGDATABASE same as the dbname connection
   parameter\n));
   +printf(_(  PGHOST same as the host connection
   parameter\n));
   +printf(_(  PGPORT same as the port connection
   parameter\n));
   +printf(_(  PGUSER same as the user connection
   parameter\n));
   +printf(_(  PGPASSWORD possibility to set password (not
   recommended)\n));
   +printf(_(  PGPASSFILE password file (default
   ~/.pgpass)\n));
  
   I don't think that psql needs to display the help messages of even
   environment
   variables supported by libpq.
  
  
   Main reason is a PGPASSWORD -- it is probably most used env variable
   with
   psql
  
   PGPASSWORD=** psql is very often used pattern
 
  But it's not recommended as the help message which the patch added says
 ;)
 
 
  why?
 
  who can see this value?

 I'm sure that the document explains this.


ok

I am too Linux-centric :( it is safe there and I didn't think of the
others.

Thank you for the info.

Regards

Pavel



 http://www.postgresql.org/docs/devel/static/libpq-envars.html
 ---
 PGPASSWORD behaves the same as the password connection parameter.
 Use of this environment variable is not recommended for security reasons,
 as some operating systems allow non-root users to see process environment
 variables via ps; instead consider using the ~/.pgpass file
 ---

 Regards,

 --
 Fujii Masao



Re: [HACKERS] tab completion for setting search_path

2014-06-23 Thread Andres Freund
On 2014-06-22 20:02:57 -0700, Tom Lane wrote:
 Ian Barwick i...@2ndquadrant.com writes:
  On 23/06/14 00:58, Andres Freund wrote:
  I thought about committing this but couldn't get over this bit. If you
  type SELECT * FROM pg_cat<tab> it'll get autocompleted to
  pg_catalog.pg_ and pg_temp<tab> will list all the temp schemas
  including the numeric and toast ones. So we have precedent for *not*
  bothering about excluding any schemas. I don't think we should start
  doing so in a piecemeal fashion in an individual command's completion.
 
  There is an exception of sorts already for system schemas, in that although
  SELECT * FROM p<tab> will list the system schemas, it will not list any
  tables from them, and won't until SELECT * FROM pg_<tab> is entered
  (see note in tab-complete.c around line 3722).
 
  Personally I'd be mildly annoyed if every SET search_path TO p<tab> resulted
  in all the system schemas being displayed when all I want is public; how
  about having these listed only once pg_ is entered, i.e.
  SET search_path TO pg_<tab>?
 
 I think there is a pretty strong practical argument for excluding the
 pg_temp and pg_toast schemas from completion for search_path, namely:
 when does anyone ever need to include those in their search_path
 explicitly?

Infrequently, yes. I've only done it when trying to break stuff ;)

 The use-case for including pg_catalog in your path is perhaps a bit
 greater, but not by much.

I don't know. It feels like inappropriate nannyism to me, more
confusing than actually helpful. The schemas are there, so they should
get autocompleted.
But anyway, the common opinion seems to be swinging against my position,
so let's do it that way.
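
For concreteness, a hedged sketch in tab-complete.c's query-macro style
(the macro name is assumed, not from any posted patch):

/* Sketch only: like Query_for_list_of_schemas, minus temp/toast schemas */
#define Query_for_list_of_search_path_schemas \
" SELECT pg_catalog.quote_ident(nspname) "\
"   FROM pg_catalog.pg_namespace "\
"  WHERE nspname NOT LIKE 'pg\\_toast%%' "\
"    AND nspname NOT LIKE 'pg\\_temp%%' "\
"    AND substring(pg_catalog.quote_ident(nspname),1,%d)='%s'"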

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-23 Thread Fujii Masao
On Mon, Jun 23, 2014 at 7:48 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-22 19:03:32 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:

  I think we'll want a version of this that just fails the
  transaction once we have the infrastructure. So we should choose
  a name that allows for a complementary GUC.

 If we stick with the rule that what is to the left of _timeout is
 what is being cancelled, then a GUC to cancel a transaction which
 remains idle for too long could be called idle_transaction_timeout.

 Do you disagree with the general idea of following that pattern?

 I think that'd be rather confusing. For one it'd need to be
 idle_in_transaction_timeout which already seems less clear (because the
 transaction belongs to idle) and for another that distinction seems to
  be too subtle for users.

 The reason I suggested
 idle_in_transaction_termination/cancellation_timeout is that that maps
 nicely to pg_terminate/cancel_backend() and is rather descriptive.

Maybe we can remove IIT_termination_timeout once we've implemented
IIT_cancellation_timeout. Right? I'm not sure if IIT_termination_timeout is
still useful even in that case. *If* it's not useful, I think we don't need to
have those two parameters and can just define one parameter IIT_timeout.
That's quite simple and it's similar to the current style of statement_timeout
and lock_timeout (IOW, we don't have something like
statement_termination_timeout and lock_termination_timeout).

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-23 Thread Andres Freund
On 2014-06-23 20:29:17 +0900, Fujii Masao wrote:
 On Mon, Jun 23, 2014 at 7:48 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2014-06-22 19:03:32 -0700, Kevin Grittner wrote:
  Andres Freund and...@2ndquadrant.com wrote:
 
   I think we'll want a version of this that just fails the
   transaction once we have the infrastructure. So we should choose
    a name that allows for a complementary GUC.
 
  If we stick with the rule that what is to the left of _timeout is
  what is being cancelled, then a GUC to cancel a transaction which
  remains idle for too long could be called idle_transaction_timeout.
 
  Do you disagree with the general idea of following that pattern?
 
  I think that'd be rather confusing. For one it'd need to be
  idle_in_transaction_timeout which already seems less clear (because the
  transaction belongs to idle) and for another that distinction seems to
   be too subtle for users.
 
  The reason I suggested
  idle_in_transaction_termination/cancellation_timeout is that that maps
  nicely to pg_terminate/cancel_backend() and is rather descriptive.
 
 Maybe we can remove IIT_termination_timeout when we've implemented
 IIT_cancellation_timeout. Right? I'm not sure if IIT_termination_timeout is
 still useful even at that case.

I think both can actually be sensible depending on the use case. It's
also not nice to remove a feature without need once people have started
to rely on it.
For a web app, termination is probably more sensible; for interactive
clients, cancellation.

 *If* it's not useful, I think we don't need to
 have those two parameters and can just define one parameter IIT_timeout.
 That's quite simple and it's similar to the current style of statement_timeout
 and lock_timeout (IOW, we don't have something like
 statement_termination_timeout and lock_termination_timeout).

I don't think those really are comparable. A long idle-in-transaction
state pretty much always indicates a problematic interaction with
Postgres.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-23 Thread Vik Fearing
On 06/22/2014 07:47 PM, Andres Freund wrote:
 On 2014-06-22 09:27:24 -0700, Kevin Grittner wrote:
 Andres Freund and...@2ndquadrant.com wrote:

 The idea with the GUC name is that if we ever get support for
 cancelling transactions we can name that
 idle_in_transaction_transaction_timeout?
 That seems a bit awkward...

 No, the argument was that for all the other *_timeout settings what
 came before _timeout was the thing that was being terminated.  I
 think there were some votes in favor of the name on that basis, and
 none against.  Feel free to give your reasons for supporting some
 other name.
 
 My reasons for not liking the current GUC name are hinted at above. I think
 we'll want a version of this that just fails the transaction once we
 have the infrastructure. So we should choose a name that allows for
 a complimentary GUC.
 CAKFQuwZCg2uur=tudz_c2auwbo87offghn9mx_hz4qd-b8f...@mail.gmail.com
 suggested
 On 2014-06-19 10:39:48 -0700, David G Johnston wrote:
 idle_in_transaction_timeout=10s
 idle_in_transaction_target=session|transaction
 
 but I don't like that much. Not sure what'd be good, the best I
 currently can come up with is:
 idle_in_transaction_termination_timeout =
 idle_in_transaction_cancellation_timeout =

Except the transaction wouldn't be cancelled, it would be aborted.

idle_in_transaction_abortion_timeout seems a little... weird.
-- 
Vik




Re: [HACKERS] idle_in_transaction_timeout

2014-06-23 Thread Andres Freund
On 2014-06-23 13:33:46 +0200, Vik Fearing wrote:
 On 06/22/2014 07:47 PM, Andres Freund wrote:
  On 2014-06-22 09:27:24 -0700, Kevin Grittner wrote:
  Andres Freund and...@2ndquadrant.com wrote:
  but I don't like that much. Not sure what'd be good, the best I
  currently can come up with is:
  idle_in_transaction_termination_timeout =
  idle_in_transaction_cancellation_timeout =
 
 Except the transaction wouldn't be cancelled, it would be aborted.

That ship has sailed with pg_cancel_backend(), no?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] review: Non-recursive processing of AND/OR lists

2014-06-23 Thread Gurjeet Singh
Thanks!

On Mon, Jun 16, 2014 at 3:58 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I wrote:
 Gurjeet Singh gurj...@singh.im writes:
 I tried to eliminate the 'pending' list, but I don't see a way around it.
 We need temporary storage somewhere to store the branches encountered on
 the right; in recursion case the call stack was serving that purpose.

 I still think we should fix this in the grammar, rather than introducing
 complicated logic to try to get rid of the recursion later.  For example,
 as attached.

 I went looking for (and found) some additional obsoleted comments, and
 convinced myself that ruleutils.c is okay as-is, and pushed this.

 regards, tom lane



-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com




Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-06-23 Thread Gurjeet Singh
On Wed, Jun 18, 2014 at 8:15 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Gurjeet Singh gurj...@singh.im writes:

 Please find attached the patch. It includes the doc changes as well.

 Applied with some editorialization.

Thanks!

Would it be possible to include this in 9.4 as well?

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com




[HACKERS] Re: Request for Patch Feedback: Lag Lead Window Functions Can Ignore Nulls

2014-06-23 Thread Nicholas White
Hi Abhijit -

 What's the status of this patch?

The latest version of the patch needs a review, and I'd like to get it
committed in this CF if possible. Thanks -

Nick




Re: [HACKERS] Use a signal to trigger a memory context dump?

2014-06-23 Thread Stephen Frost
Andres,

* Andres Freund (and...@2ndquadrant.com) wrote:
 I wonder if it'd make sense to allow a signal to trigger a memory
 context dump? I and others more than once had the need to examine memory
 usage on production systems and using gdb isn't always realistic.

+100

I keep thinking we have this and then keep being disappointed when I go
try to find it.

 I wonder if we could install a signal handler for some unused signal
 (e.g. SIGPWR) to dump memory.

Interesting thought, but..

 I'd also considered adding a SQL function that uses the SIGUSR1 signal
 multiplexing for the purpose but that's not necessarily nice if you have
 to investigate while SQL access isn't yet possible. There's also the
 problem that not all possibly interesting processes use the sigusr1
 signal multiplexing.

I'd tend to think this would be sufficient.  You're suggesting a case
where you need to debug prior to SQL access (not specifically sure what
you mean by that) or processes which are hopefully less likely to have
memory issues, but you don't have gdb..
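
To make the multiplexing idea concrete, here is a minimal sketch in C
(illustrative only: the handler and flag names are made up, and since
MemoryContextStats() is not async-signal-safe a real implementation would
defer the dump to a safe point rather than report from the handler):

    /* sketch, not the actual patch */
    static volatile sig_atomic_t memdump_requested = false;

    static void
    handle_memdump_request(SIGNAL_ARGS)     /* hypothetical handler */
    {
        memdump_requested = true;           /* only set a flag here */
    }

    /* later, at a safe point such as ProcessInterrupts(): */
    if (memdump_requested)
    {
        memdump_requested = false;
        MemoryContextStats(TopMemoryContext);   /* reports to stderr */
    }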

Another thought along the lines of getting information about running
processes would be to see the call stack or execution plan..  I seem to
recall there being a patch for the latter at one point?

Thanks,

Stephen




Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-06-23 Thread Fujii Masao
On Mon, Jun 23, 2014 at 7:51 PM, Abhijit Menon-Sen a...@2ndquadrant.com wrote:
 (I'm replying as co-author of pgaudit.)

 At 2014-06-23 19:15:39 +0900, masao.fu...@gmail.com wrote:

 You added this into CF, but its patch has not been posted yet. Are you
 planning to make a patch?

 It's a self-contained contrib module. I thought Ian had posted a
 tarball, but it looks like he forgot to attach it (or decided to
 provide only a Github link). I've attached a tarball here for
 your reference.

  Planned future improvements include:
 
  1. Additional logging facilities, including to a separate audit
 log file and to syslog, and potentially logging to a table
 (possibly via a bgworker process). Currently output is simply
 emitted to the server log via ereport().
 
  2. To implement per-object auditing configuration, it would be nice
 to use extensible reloptions (or an equivalent mechanism)

 Is it possible to implement these outside PostgreSQL by using hooks?

 There are some unresolved questions with #2 because the extensible
 reloptions patch seems to have lost favour, but I'm pretty sure we
 could figure out some alternative.

 If not, it might be better to implement audit feature in core from the
 beginning.

 Sure, we're open to that possibility. Do you have any ideas about what
 an in-core implementation should do/look like?

I don't have a good idea about that. But maybe we can merge pgaudit.log
into log_statement for more flexible settings of what to log.

BTW I found that pgaudit doesn't log bind parameters when prepared
statements are executed. I feel this behavior is not good for audit
purposes. Thoughts?

Also I got only the following log message when I executed "PREPARE hoge AS
INSERT INTO mytbl VALUES($1)" and "EXECUTE hoge(1)". This means that
obviously we cannot track the statements issued via the PREPARE command...

LOG:  [AUDIT],2014-06-23
21:14:53.667059+09,postgres,postgres,postgres,WRITE,INSERT,TABLE,postgres.mytbl,execute
hoge(1);

Regards,

-- 
Fujii Masao




Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-06-23 Thread Stephen Frost
* Fujii Masao (masao.fu...@gmail.com) wrote:
 On Mon, Jun 23, 2014 at 7:51 PM, Abhijit Menon-Sen a...@2ndquadrant.com 
 wrote:
  At 2014-06-23 19:15:39 +0900, masao.fu...@gmail.com wrote:
  You added this into CF, but its patch has not been posted yet. Are you
  planning to make a patch?
 
  It's a self-contained contrib module. I thought Ian had posted a
  tarball, but it looks like he forgot to attach it (or decided to
  provide only a Github link). I've attached a tarball here for
  your reference.

I'm not a huge fan of adding this as a contrib module unless we can be
quite sure that there's a path forward from here to a rework of the
logging in core which would actually support the features pg_audit is
adding, without a lot of pain and upgrade issues.  Those issues have
kept other contrib modules from being added to core.

Splitting up contrib into other pieces, one of which is a 'features'
area, might address that but we'd really need a way to have those pieces
be able to include/add catalog tables, at least..

  If not, it might be better to implement audit feature in core from the
  beginning.
 
  Sure, we're open to that possibility. Do you have any ideas about what
  an in-core implementation should do/look like?
 
 I don't have good idea about that. But maybe we can merge pgaudit.log
 into log_statement for more flexible settings of what to log.

I'd expect a catalog table or perhaps changes to pg_class (maybe other
things also..) to define what gets logged..  I'd also like to see the
ability to log based on the connecting user, and we need to log under
what privileges a command is executing, and, really, a whole host of
other things..

Thanks,

Stephen




[HACKERS] How to use the 'char() ' as data type and a function name in the same time.

2014-06-23 Thread rohtodeveloper
Dear Hackers

When I use the pg_catalog."char"(integer) function in the postgres, I can
only use it like these:

    select pg_catalog."char"(65);
    select "char"(65);

But I want to use the function in the following way:

    select char(1);

Of course, there would be a gram error.

I know the error is caused by the fact that 'char()' has been defined as one
of the character data types in the 'src\backend\parser\gram.y' file.
However, in SQLServer, 'char()' can be used as a data type and a function
name at the same time.

For example:
--------------------------------------------------------------
select char(65);              -- char() used as a function name
A
--------------------------------------------------------------
select cast('1' as char(4));  -- char() used as data type
1
--------------------------------------------------------------

So, I wonder if there is a solution in PostgreSQL to use char() as a data
type and a function name at the same time, like in SQLServer.
==============================================================

Any help appreciated.

Re: [HACKERS] Use a signal to trigger a memory context dump?

2014-06-23 Thread Andres Freund
On 2014-06-23 08:36:02 -0400, Stephen Frost wrote:
 Andres,
 
 * Andres Freund (and...@2ndquadrant.com) wrote:
  I wonder if it'd make sense to allow a signal to trigger a memory
  context dump? I and others more than once had the need to examine memory
  usage on production systems and using gdb isn't always realistic.
 
 +100
 
 I keep thinking we have this and then keep being disappointed when I go
 try to find it.
 
  I wonder if we could install a signal handler for some unused signal
  (e.g. SIGPWR) to dump memory.
 
 Interesting thought, but..
 
  I'd also considered adding a SQL function that uses the SIGUSR1 signal
  multiplexing for the purpose but that's not necessarily nice if you have
  to investigate while SQL access isn't yet possible. There's also the
  problem that not all possibly interesting processes use the sigusr1
  signal multiplexing.
 
 I'd tend to think this would be sufficient.  You're suggesting a case
 where you need to debug prior to SQL access (not specifically sure what
 you mean by that) or processes which are hopefully less likely to have
 memory issues, but you don't have gdb..

prior to SQL access := Before crash recovery finished/hot standby
reached consistency.

And I don't agree that memory dumps from non-plain backends are that
uninteresting. E.g. background workers and logical decoding walsenders
both can be interesting.

 Another thought along the lines of getting information about running
 processes would be to see the call stack or execution plan..  I seem to
 recall there being a patch for the latter at one point?

I think these are *much* more complicated. I don't want to tackle them
at the same time, otherwise we'll never get anywhere.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] test failure on latest source

2014-06-23 Thread Marco Atzeri

On 16/04/2014 18:55, Marco Atzeri wrote:

On 16/04/2014 17:40, Tom Lane wrote:


The bigger picture though is that this code isn't failing on the
buildfarm.  So what we need to ask is what's different about Marco's
machine.


good question.
I checked again and I found that the fault is only on
the cygwin 64 bit build but not on the cygwin 32 bit one.

I was sure to have tested both, but it seems this time
I missed the comparison.


regards, tom lane


Regards
Marco



To close the issue, we have identified the fault in the getaddrinfo
system call on cygwin64.

What happens is that the field ai_addrlen is defined as socklen_t in
POSIX, but as size_t in the W32 API.  On 64 bit, socklen_t is 4 bytes
while size_t is 8 bytes.
Setting all the hintp members manually (in
contrast to calloc'ing it or memset'ing it to 0) leaves the 4 upper
bytes of the ai_addrlen untouched.  This in turn leads to a high
probability that ai_addrlen has an invalid value when entering
Winsock's getsockopt.
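
The portable idiom, in other words, is to zero the whole hints structure
before setting individual members, so that all bytes of ai_addrlen are
initialized no matter how wide the platform declares it. A minimal sketch
(illustrative, not the actual fix that was applied):

    #include <string.h>
    #include <sys/types.h>
    #include <sys/socket.h>
    #include <netdb.h>

    int
    main(void)
    {
        struct addrinfo hints;
        struct addrinfo *res;

        memset(&hints, 0, sizeof(hints));   /* clears all of ai_addrlen,
                                             * whatever its width */
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_STREAM;

        if (getaddrinfo("localhost", "5432", &hints, &res) == 0)
            freeaddrinfo(res);
        return 0;
    }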


Regards
Marco







Re: [HACKERS] tab completion for setting search_path

2014-06-23 Thread Kevin Grittner
Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-22 20:02:57 -0700, Tom Lane wrote:
 Ian Barwick i...@2ndquadrant.com writes:
 On 23/06/14 00:58, Andres Freund wrote:
 I thought about committing this but couldn't get over this bit. If you
 type SELECT * FROM pg_cat<tab> it'll get autocompleted to
 pg_catalog.pg_ and pg_temp<tab> will list all the temp schemas
 including the numeric and toast ones. So we have precedent for *not*
 bothering about excluding any schemas. I don't think we should start
 doing so in a piecemal fashion in an individual command's completion.

 There is an exception of sorts already for system schemas, in that although
 SELECT * FROM ptab will list the system schemas, it will not list any
 tables from them, and won't until SELECT * FROM pg_tab is entered
 (see note in tab-completion.c around line 3722).

 Personally I'd be mildly annoyed if every SET search_path TO p<tab> resulted
 in all the system schemas being displayed when all I want is public; how
 about having these listed only once pg_ is entered, i.e.
 SET search_path TO pg_<tab>?

 I think there is a pretty strong practical argument for excluding the
 pg_temp and pg_toast schemas from completion for search_path, namely
 that when does anyone ever need to include those in their search_path
 explicitly?

 Infrequently, yes. I've only done it when trying to break stuff ;)

 The use-case for including pg_catalog in your path is perhaps a bit
 greater, but not by much.

 I don't know. It feelds like inappropriate nannyism to me. More
 confusing than actually helpful. The schemas are there, so they should
 get autocompleted.
 But anyway, the common opinion seems to be swinging against my position,
 so lets do it that way.

I would be for excluding the pg_toast, pg_toast_temp_n, and
pg_temp_n schemas, and including public and pg_catalog.
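
Under that rule, completion in a freshly created database might then offer
something like this (illustrative psql session, assuming no user schemas
beyond the defaults):

    postgres=# SET search_path TO <tab><tab>
    information_schema  pg_catalog  public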

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Use a signal to trigger a memory context dump?

2014-06-23 Thread Stephen Frost
Andres,

* Andres Freund (and...@2ndquadrant.com) wrote:
 On 2014-06-23 08:36:02 -0400, Stephen Frost wrote:
  I'd tend to think this would be sufficient.  You're suggesting a case
  where you need to debug prior to SQL access (not specifically sure what
  you mean by that) or processes which are hopefully less likely to have
  memory issues, but you don't have gdb..
 
 prior to SQL access := Before crash recovery finished/hot standby
 reached consistency.
 
 And I don't agree that memory dumps from non-plain backends are that
 uninteresting. E.g. background workers and logical decoding walsenders
 both can be interesting.

I didn't mean they're uninteresting- I meant that if you're dealing with
those kinds of issues, having gdb isn't as huge a hurdle..

  Another thought along the lines of getting information about running
  processes would be to see the call stack or execution plan..  I seem to
  recall there being a patch for the latter at one point?
 
 I think these are *much* more complicated. I don't want to tackle them
 at the same time, otherwise we'll never get anywhere.

Sure, just some things to keep in mind as you're thinking about changes
in this area.  Just to toss another random thought out there, what about
an SQL function which does a LISTEN and then sends a signal to another
backend which throws a NOTIFY with payload including the requested info?
That'd be *very* useful as there are lots of cases where access to the
logs isn't trivial (particularly if they've been properly locked down
due to the sensitive info they can contain..).
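
Roughly, the shape could be (illustrative SQL; pg_request_info is a made-up
name, only LISTEN/NOTIFY themselves exist today):

    LISTEN memdump_chan;
    -- hypothetical: ask backend 12345 to publish its memory usage
    SELECT pg_request_info(12345, 'memory', 'memdump_chan');
    -- the signalled backend would then respond with something like:
    --   NOTIFY memdump_chan, '<memory context dump payload>';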

Thanks,

Stephen




[HACKERS] Re: [bug fix] multibyte messages are displayed incorrectly on the client

2014-06-23 Thread Heikki Linnakangas

On 04/05/2014 07:56 AM, Tom Lane wrote:

MauMau maumau...@gmail.com writes:

Then, as a happy medium, how about disabling message localization only if
the client encoding differs from the server one?  That is, compare the
client_encoding value in the startup packet with the result of
GetPlatformEncoding().  If they don't match, call
disable_message_localization().


AFAICT this is not what was agreed to in this thread.  It puts far too
much credence in the server-side default for client_encoding, which up to
now has never been thought to be very interesting; indeed I doubt most
people bother to set it at all.  The reason that this issue is even on
the table is that that default is too likely to be wrong, no?

Also, whatever possessed you to use pg_get_encoding_from_locale to
identify the server's encoding?  That's expensive and seems fairly
unlikely to yield the right answer.  I don't remember offhand where we
keep the postmaster's idea of what encoding messages should be in, but I'm
fairly sure it's stored explicitly somewhere.  Or if it isn't, we can for
sure do better than recalculating it during every connection attempt.

Having said all that, though, I'm unconvinced that this cure isn't worse
than the disease.  Somebody claimed upthread that no very interesting
messages would be delocalized by a change like this, but that's complete
nonsense: in particular, *every* message associated with client
authentication will be sent in English if we go down this path.  Given
the nearly complete lack of complaints in the many years that this code
has worked like this, I'm betting that most people will find a change
like this to be a net reduction in friendliness.

Given the changes here to extract client_encoding from the startup packet
ASAP, I wonder whether the right thing isn't just to set the client
encoding immediately when we do that.  Most application libraries pass
client encoding in the startup packet anyway (libpq certainly does).


Based on Tom's comments above, I'm marking this as "returned with 
feedback" in the commitfest. I agree that setting client_encoding as 
early as possible seems like the right thing to do.


Earlier in this thread, MauMau pointed out that we can't do encoding 
conversions until we have connected to the database because you need to 
read pg_conversion for that. That's because we support creating custom 
conversions with CREATE CONVERSION. Frankly, I don't think anyone cares 
about that feature. If we just dropped the CREATE/DROP CONVERSION 
feature altogether and hard-coded the conversions we have, there would 
be close to zero complaints. Even if you want to extend something around 
encodings and conversions, the CREATE CONVERSION interface is clunky. 
Firstly, conversions are per-database, and even schema-qualified, which 
just seems like an extra complication. You'll most likely want to modify 
the conversion across the whole system. Secondly, rather than define a 
new conversion between encodings, you'll likely want to define a whole 
new encoding with conversions to/from existing encodings, but you can't 
do that anyway without hacking the source code.


- Heikki





Re: [HACKERS] replication identifier format

2014-06-23 Thread Robert Haas
On Wed, Jun 18, 2014 at 12:46 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-18 12:36:13 -0400, Robert Haas wrote:
  I actually don't think any of the discussions I was involved in had the
  externally visible version of replication identifiers limited to 16bits?
  If you are referring to my patch, 16bits was just the width of the
  *internal* name that should basically never be looked at. User visible
  replication identifiers are always identified by an arbitrary string -
  whose format is determined by the user of the replication identifier
  facility. *BDR* currently stores the system identifer, the database id
  and a name in there - but that's nothing core needs to concern itself
  with.

 I don't think you're going to be able to avoid users needing to know
 about those IDs.  The configuration table is going to have to be the
 same on all nodes, and how are you going to get that set up without
 those IDs being user-visible?

 Why? Users and other systems only ever see the external ID. Everything
 leaving the system is converted to the external form. The short id
 basically is only used in shared memory and in wal records. For both
 using longer strings would be problematic.

 In the patch I have the user can actually see them as they're stored in
 pg_replication_identifier, but there should never be a need for that.

Hmm, so there's no requirement that the short IDs are consistent
across different clusters that are replicating to each other?  If
that's the case, that might address my concern, but I'd probably want
to go back through the latest patch and think about it a bit more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] replication identifier format

2014-06-23 Thread Andres Freund
On 2014-06-23 10:09:49 -0400, Robert Haas wrote:
 On Wed, Jun 18, 2014 at 12:46 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
  On 2014-06-18 12:36:13 -0400, Robert Haas wrote:
   I actually don't think any of the discussions I was involved in had the
   externally visible version of replication identifiers limited to 16bits?
   If you are referring to my patch, 16bits was just the width of the
   *internal* name that should basically never be looked at. User visible
   replication identifiers are always identified by an arbitrary string -
   whose format is determined by the user of the replication identifier
   facility. *BDR* currently stores the system identifer, the database id
   and a name in there - but that's nothing core needs to concern itself
   with.
 
  I don't think you're going to be able to avoid users needing to know
  about those IDs.  The configuration table is going to have to be the
  same on all nodes, and how are you going to get that set up without
  those IDs being user-visible?
 
  Why? Users and other systems only ever see the external ID. Everything
  leaving the system is converted to the external form. The short id
  basically is only used in shared memory and in wal records. For both
  using longer strings would be problematic.
 
  In the patch I have the user can actually see them as they're stored in
  pg_replication_identifier, but there should never be a need for that.
 
 Hmm, so there's no requirement that the short IDs are consistent
 across different clusters that are replicating to each other?

Nope. That seemed to be a hard requirement in the earlier discussions we
had (~2 years ago).

  If
 that's the case, that might address my concern, but I'd probably want
 to go back through the latest patch and think about it a bit more.

I'll send out a new version after I'm finished with the newest atomic
ops patch.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-06-23 Thread Amit Kapila
On Tue, Jun 17, 2014 at 8:56 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-06-17 20:47:51 +0530, Amit Kapila wrote:
  On Tue, Jun 17, 2014 at 6:35 PM, Andres Freund and...@2ndquadrant.com
  wrote:
 
  You have followed it pretty well as far as I can understand from your
  replies, as there is no reproducible test (which I think is bit tricky
to
  prepare), so it becomes difficult to explain by theory.

 I'm working an updated patch that moves the releaseOK into the
 spinlocks. Maybe that's the problem already - it's certainly not correct
 as is.

Sure, I will do the test/performance test with updated patch; you
might want to include some more changes based on comments
in mail below:

  You are right, it will wakeup the existing waiters, but I think the
  new logic has one difference which is that it can allow the backend to
  take Exclusive lock when there are already waiters in queue.  As per
  above example even though Session-2 and Session-3 are in wait
  queue, Session-4 will be able to acquire Exclusive lock which I think
  was previously not possible.

 I think that was previously possible as well - in a slightly different
 set of circumstances though. After a process releases a lock and wakes
 up some of several waiters another process can come in and acquire the
 lock. Before the woken up process gets scheduled again. lwlocks aren't
 fair locks...

Okay, but I think changing the behaviour of lwlocks might impact some
tests/applications.  As they are not fair, I think defining the exact
behaviour is not easy and we don't have any concrete scenario which
can be affected, so there should not be a problem in accepting
slightly different behaviour.

Few more comments:

1.
LWLockAcquireCommon()
{
..
iterations++;
}

In the current logic, I could not see any use of the *iterations* variable.

2.
LWLockAcquireCommon()
{
..
if (!LWLockDequeueSelf(l))
{
    /*
     * Somebody else dequeued us and has or will..
     * ..
     */
    for (;;)
    {
        PGSemaphoreLock(&proc->sem, false);
        if (!proc->lwWaiting)
            break;
        extraWaits++;
    }
    lock->releaseOK = true;
}

Do we want to set "result = false;" after waking in the above code?
The idea behind setting it false is to indicate whether we got the lock
immediately or not, which previously was decided based on whether it needed
to queue itself.

3.
LWLockAcquireCommon()
{
..
/*
 * Ok, at this point we couldn't grab the lock on the first try. We
 * cannot simply queue ourselves to the end of the list and wait to be
 * woken up because by now the lock could long have been released.
 * Instead add us to the queue and try to grab the lock again. If we
 * suceed we need to revert the queuing and be happy, otherwise we
 * recheck the lock. If we still couldn't grab it, we know that the
 * other lock will see our queue entries when releasing since they
 * existed before we checked for the lock.
 */
/* add to the queue */
LWLockQueueSelf(l, mode);

/* we're now guaranteed to be woken up if necessary */
mustwait = LWLockAttemptLock(l, mode, false, &potentially_spurious);
..
}

a. By reading the above code and comments, it is not quite clear why the
second attempt is important unless somebody thinks it through or refers to
your comments in the *Notes* section at the top of the file.  I think it's
better to indicate this in some way, so that a code reader knows to refer to
the Notes section or wherever you are planning to keep those comments.

b. There is a typo in the above comment: "suceed" should be "succeed".


4.
LWLockAcquireCommon()
{
..
if (!LWLockDequeueSelf(l))
{
    for (;;)
    {
        PGSemaphoreLock(&proc->sem, false);
        if (!proc->lwWaiting)
            break;
        extraWaits++;
    }
    lock->releaseOK = true;
..
}

Setting releaseOK in the above context might not be required, because if
control reaches this part of the code, it will not retry the acquisition
another time.

5.
LWLockWaitForVar()
{
..
else
    mustwait = false;

if (!mustwait)
    break;
..
}

I think we can directly break in else part in above code.

6.
LWLockWaitForVar()
{
..
/*
 * Quick test first to see if it the slot is free right now.
 *
 * XXX: the caller uses a spinlock before this,...
 */
if (pg_atomic_read_u32(&lock->lockcount) == 0)
    return true;
}

Is the part of the comment that refers to the spinlock still relevant
after switching to atomic ops?

7.
LWLockWaitForVar()
{
..
/*
 * Add myself to wait queue. Note that this is racy, somebody else
 * could wakeup before we're finished queuing.
 * NB: We're using nearly the same twice-in-a-row lock acquisition
 * protocol as LWLockAcquire(). Check its comments for details.
 */
LWLockQueueSelf(l, LW_WAIT_UNTIL_FREE);

/* we're now guaranteed to be woken up if necessary */
mustwait = LWLockAttemptLock(l, LW_EXCLUSIVE, false,
                             &potentially_spurious);
}

Why is it important to attempt the lock after queuing in this case? Can't
we just re-check the exclusive lock as done before queuing?

8.
LWLockWaitForVar()
{
..
PRINT_LWDEBUG("LWLockAcquire undo queue", lock, mode);
break;
}
else
{
    PRINT_LWDEBUG("LWLockAcquire waiting 4", lock, mode);
}
..
}

a. I think instead of "LWLockAcquire", here we should use
   "LWLockWaitForVar"
b. Isn't it better 

Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-23 Thread Robert Haas
On Thu, Jun 19, 2014 at 10:43 AM, Merlin Moncure mmonc...@gmail.com wrote:
 On Thu, Jun 19, 2014 at 7:07 AM, Abhijit Menon-Sen a...@2ndquadrant.com 
 wrote:
 Let's not pretend to support platforms we have no practical way of
 verifying.

 This is key.  The buildfarm defines the set of platforms we support.

This criterion has been proposed before, but I'm not sure I really
agree with it.  If having code around that targets obscure platforms
is hindering the development of new features, then that's a reason to
get rid of it.  If we're pretty sure it's useless and doesn't work, so
is that.  But if it just hasn't been tested lately, I don't personally
think that's a sufficient reason to excise it.  Telling people that
they can't have even the most minimal platform support code in
PostgreSQL unless they're willing to contribute and maintain a BF VM
indefinitely is not very friendly.  Of course, the risk of their
platform getting broken is higher if they don't, but that's different
than making it a hard requirement.  We should be looking at ways to
make it easier not harder for people who don't yet run PostgreSQL to
start doing so.  We are an open source community; we should try to be,
as far as possible, open.

But I think this is all a bit off-topic for this thread.  Andres has
already implemented a fallback for people who haven't got CAS and
fetch-and-add on their platform, so whether or not we deprecate some
more platforms has no bearing at all on this patch.  The question that
needs to be resolved in order to move forward here is this: Consider
the set of platforms we support, ignoring any that don't have
spinlocks.  Do any of them have only one of CAS and fetch-and-add, or do they
all have both or neither?  If it's the latter, then we're talking
about moving from a two-tiered system that looks like this:

1. No spinlocks.
2. Spinlocks.

...to a three-tiered system that looks like this:

1. No atomics at all.
2. Spinlocks but nothing else.
3. Spinlocks, CAS, and fetch-and-add.

I think that's eminently reasonable from a code maintenance and
developer testing perspective.  It sounds like all common systems and
architectures would fall in category 3, meaning that people wouldn't
have to worry about (just for example) significant differences between
the atomic ops supported on Intel vs. PPC.  For older or more obscure
platforms, categories 2 and 1 would still be there when needed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] releaseOk and LWLockWaitForVar

2014-06-23 Thread Amit Kapila
On Tue, Jun 17, 2014 at 5:47 PM, Andres Freund and...@2ndquadrant.com
wrote:

 Hi Heikki, All,

 Amit just pointed me to a case where the lwlock scalability patch
 apparently causes problems and I went on to review it and came across
 the following problem in 9.4/master:
 LWLockWaitForVar() doesn't set releaseOk to true when waiting
 again. Isn't that a bug? What if there's another locker coming in after
 LWLockWaitForVar() returns from the PGSemaphoreLock() but before it has
 acquired the spinlock?

I also think above mentioned scenario is a problem if releaseOk
is not set to true in above case.

While looking at the function LWLockWaitForVar(), especially the below
code:

TRACE_POSTGRESQL_LWLOCK_ACQUIRE(T_NAME(l), T_ID(l), LW_EXCLUSIVE);

I think in this function tracing is done assuming the Exclusive lock
is acquired; however, access might have been granted because of a
variable update.  Basically this function's trace doesn't distinguish
whether access was granted because there is no other
exclusive locker or because the variable was updated.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] replication identifier format

2014-06-23 Thread Robert Haas
On Mon, Jun 23, 2014 at 10:11 AM, Andres Freund and...@2ndquadrant.com wrote:
  Why? Users and other systems only ever see the external ID. Everything
  leaving the system is converted to the external form. The short id
  basically is only used in shared memory and in wal records. For both
  using longer strings would be problematic.
 
  In the patch I have the user can actually see them as they're stored in
  pg_replication_identifier, but there should never be a need for that.

 Hmm, so there's no requirement that the short IDs are consistent
 across different clusters that are replicating to each other?

 Nope. That seemed to be a hard requirement in the earlier discussions we
 had (~2 years ago).

Oh, great.  Somehow I missed the fact that that had been addressed.  I
had assumed that we still needed global identifiers in which case I
think they'd need to be 64+ bits (preferably more like 128).  If they
only need to be locally significant that makes things much better.

Is there any real reason to add a pg_replication_identifier table, or
should we just let individual replication solutions manage the
identifiers within their own configuration tables?  I guess one
question is: What happens if there are multiple replication solutions
in use on a single server?  How do they coordinate?

  If
 that's the case, that might address my concern, but I'd probably want
 to go back through the latest patch and think about it a bit more.

 I'll send out a new version after I'm finished with the newest atomic
 ops patch.

Sweet.  I'm a little backed up right now, but will look at it when able.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WIP patch for multiple column assignment in UPDATE

2014-06-23 Thread Robert Haas
On Thu, Jun 19, 2014 at 9:37 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Pavel Stehule pavel.steh...@gmail.com writes:
 I did some tests and It looks so it allows only some form of nested loop.

 [ shrug... ]  It's a subplan.  One evaluation per outer row is what
 people are expecting.

Is it theoretically possible to convert a construct like this to a
semi-join?  I notice we don't, even when this new syntax isn't used:

rhaas=# explain select a, (select b from bar where foo.a = bar.a) from foo;
                         QUERY PLAN
----------------------------------------------------------
 Seq Scan on foo  (cost=0.00..855145.00 rows=1 width=4)
   SubPlan 1
     ->  Seq Scan on bar  (cost=0.00..85.50 rows=1 width=4)
           Filter: (foo.a = a)
 Planning time: 0.078 ms

...but I'm wondering if that's an unimplemented feature or if there's
some reason why it's a bad idea.
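
For comparison, the hand-rewritten join form would be the following
(equivalent only if bar.a is unique, since the subplan errors out on more
than one match while a plain join would instead produce duplicate rows):

    -- illustrative rewrite, assuming bar.a is unique:
    SELECT foo.a, bar.b
    FROM foo
    LEFT JOIN bar ON foo.a = bar.a;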

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] replication commands and log_statements

2014-06-23 Thread Robert Haas
On Fri, Jun 20, 2014 at 9:48 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 OK, I've just implemented the patch (attached) which does this, i.e.,
 redefines log_statement as a list. Thanks to the patch, log_statement can
 be set to "none", "ddl", "mod", "dml", "all", and any combinations of them.
 The meanings of "none", "ddl", "mod" and "all" are the same as they are now.
 The new setting value "dml" logs both data modification statements like
 INSERT and query access statements like SELECT and COPY TO.

 I still don't like this one bit.  It's turning log_statement from a
 reasonably clean design into a complete mess, which will be made even
 worse after you add replication control to it.

Well, I don't object to having a separate GUC for replication command
logging if people prefer that design.  But let's not have any
delusions of adequacy about log_statement.  I've had more than one
conversation with customers about that particular parameter, all of
which involved the customer being unhappy that there were only four
choices and they couldn't log the stuff that they cared about without
logging a lot of other stuff that they didn't care about.  Now,
providing more choices there will, indisputably, add complexity, but
it will also provide utility.

What we're talking about here is not unlike what we went through with
EXPLAIN syntax.  We repeatedly rejected patches that might have added
useful functionality to EXPLAIN on the grounds that (1) we didn't want
to make new keywords and (2) even if we did add new keywords,
extending the EXPLAIN productions would produce grammar conflicts.
Then, we implemented the extensible-options stuff, and suddenly it
became possible for people to write patches adding useful
functionality to EXPLAIN that didn't get sunk before it got out of the
gate, and since then we've gotten a new EXPLAIN option every release
or two, and IMHO all of those options are pretty useful.  Similarly,
building a logging facility that meets the needs of real users is
going to require a configuration method more flexible than a total
order with four choices.  I happen to think a list of comma-separated
tokens is a pretty good choice, but something else could be OK, too.
We need something better than what we've got now, though.
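
Under the list-valued proposal quoted above, a configuration might then look
like this (hypothetical syntax from the patch under discussion, not a
released setting):

    # postgresql.conf -- proposed list form of log_statement
    log_statement = 'ddl, dml'   # log DDL plus INSERT/SELECT/COPY, skip the rest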

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] How to use the 'char() ' as data type and a function name in the same time.

2014-06-23 Thread Kevin Grittner
rohtodeveloper rohtodevelo...@outlook.com wrote:

 When I use the pg_catalog."char"(integer) function in the
 postgres, I can only use it like these:
 select pg_catalog."char"(65);  select "char"(65);

 But I want to use the function in the following way.
 select char(1);

Try using the chr() function.
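
For example (illustrative):

    select chr(65);   -- returns 'A', the counterpart of SQL Server's char(65)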

What you are running into is a result of the fact that besides the
char datatype, PostgreSQL has a "char" datatype, with a completely
different implementation.  While that may not have been the best
decision, there is enough working code that depends on both of
these types that it is unlikely to change.

http://www.postgresql.org/docs/current/interactive/datatype-character.html

Be careful, though; I would not generally recommend the use of
either char or "char" in application tables or code.  The char type
is only included for standard compliance, and has surprising
semantics and generally poorer performance than the alternatives. 
The "char" type is only intended for use in the system catalogs; if
you are considering using it elsewhere, you should consider an enum
instead.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] JSON and Postgres Variable Queries

2014-06-23 Thread Robert Haas
On Fri, Jun 20, 2014 at 11:26 AM, Joey Caughey
jcaug...@parrotmarketing.com wrote:
 I'm having an issue with JSON requests in Postgres and was wondering if
 anyone had an answer.

 I have an orders table with a field called "json_data".

 In the json data there is a "plan" object with an id value in it:
 { "plan": { "id": "1" } }

 I can do regular queries that will work, like so:
 SELECT json_data->>'plan'->>'id' as plan_id FROM orders;

 But if I try to query on the data that is returned it will fail:
 SELECT json_data->>'plan'->>'id' as plan_id FROM orders WHERE plan_id = 1;
 OR
 SELECT json_data->>'plan'->>'id' as plan_id FROM orders GROUP BY plan_id;
 OR
 SELECT json_data->>'plan'->>'id' as plan_id FROM orders ORDER BY plan_id;

 Is this something that has been overlooked? or is there another way to go
 about this?

You might find a sub-SELECT helpful:

SELECT * FROM (SELECT json_data->>'plan'->>'id' as plan_id FROM
orders) x WHERE plan_id = 1

It might be a generally useful thing for WHERE-clause items to be able
to reference items from the target list by alias, or maybe it's
problematic for some reason that I don't know about, but right now
they can't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] replication commands and log_statements

2014-06-23 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 Similarly,
 building a logging facility that meets the needs of real users is
 going to require a configuration method more flexible than a total
 order with four choices.  I happen to think a list of comma-separated
 tokens is a pretty good choice, but something else could be OK, too.
 We need something better than what we've got now, though.

Absolutely, and not all of that should be done through postgresql.conf,
imv.  The level of flexibility that is being asked for, from what I've
seen and heard, really needs to be met with new catalog tables or
extending existing ones.  The nearby thread on pg_audit would be a good
example.

I certainly don't want to be specifying specific table names or
providing a list of roles through any GUC (or configuration file of any
kind).  I'm not averse to improving log_statement to a list with tokens
that indicate various meaning levels but anything done through that
mechanism will be very coarse.

Thanks,

Stephen




Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-23 Thread Andres Freund
On 2014-06-23 10:29:54 -0400, Robert Haas wrote:
 On Thu, Jun 19, 2014 at 10:43 AM, Merlin Moncure mmonc...@gmail.com wrote:
  On Thu, Jun 19, 2014 at 7:07 AM, Abhijit Menon-Sen a...@2ndquadrant.com 
  wrote:
  Let's not pretend to support platforms we have no practical way of
  verifying.
 
  This is key.  The buildfarm defines the set of platforms we support.
 
 This criterion has been proposed before, but I'm not sure I really
 agree with it.  If having code around that targets obscure platforms
 is hindering the development of new features, then that's a reason to
 get rid of it.

I think that's the case.

That we still have !PG_USE_INLINE support although all buildfarm animals
have supported it since 4c8aa8b (fixing acc) causes absurd constructs
(STATIC_IF_INLINE) and fugly macro usage, making it harder to read
and modify code. I spent a good chunk of time just to make that work, even if
nobody's ever going to use it.

That we need a fallback for older gccs instead of relying on somebody else
having done the atomics abstraction causes a significant waste of time.

That we have support for platforms that we haven't even documented as
working (e.g. SuperH) or platforms that don't work in the real world
(m32r) means that one has to think about and research so many more
edge cases that it's really hard to make progress. I don't know how
often I've now sequentially gone through s_lock.h, s_lock.c,
src/backend/port/tas to understand all the supported combinations.

 If we're pretty sure it's useless and doesn't work, so
 is that.  But if it just hasn't been tested lately, I don't personally
 think that's a sufficient reason to excise it.

Sure, it just not being tested isn't a reason alone. But it's about more
than that. I've now spent *far* too much time understanding the idioms
of architectures that'll never get used again so I don't break them
accidentally. Sure, that's my choice. But it's imo necessary for
postgres to scale.
I don't know about you, but for me reading/understanding/modifying
assembly on some platform you've never used is *HARD*. And even harder
if there's nobody around that's going to test it. It's entirely possible
to write incorrect locking assembly that'll break when used.

 Telling people that
 they can't have even the most minimal platform support code in
 PostgreSQL unless they're willing to contribute and maintain a BF VM
 indefinitely is not very friendly.  Of course, the risk of their
 platform getting broken is higher if they don't, but that's different
 than making it a hard requirement.

I agree that we shouldn't actively try to break stuff. But having to
understand  blindly modify unused code is on the other hand of actively
breaking platforms. It's actively hindering development.

 We should be looking at ways to
 make it easier not harder for people who don't yet run PostgreSQL to
 start doing so.  We are an open source community; we should try to be,
 as far as possible, open.

Do you really think supporting a platform that 20 people worldwide run for
fun in their spare time once in a while (sparc < v8plus) is achieving
that?
I think it's more likely that easier to understand code, quick review
for patches and better testing help with that.

 But I think this is all a bit off-topic for this thread.  Andres has
 already implemented a fallback for people who haven't got CAS and
 fetch-and-add on their platform, so whether or not we deprecate some
 more platforms has no bearing at all on this patch.

While I indeed have that fallback code, that statement is still not
entirely true. We still need to add atomics support for lots of
platforms, otherwise they're just going to be 'less supported' than
now. Are we fine with that, and we'll just accept patches?

 The question that
 needs to be resolved in order to move forward here is this: Consider
 the set of platforms we support, ignoring any that don't have
 spinlocks.  Do any of them have only one of fetch-and-add, or do they
 all have both or neither?  If it's the latter, then we're talking
 about moving from a two-tiered system that looks like this:

Since fetch-and-add is trivially implemented using CAS, there's not much
need to distinguish between CAS and CAS + fetch_and_add. From my POV the
restriction to just CAS/fetch_and_add isn't actually buying much. Pretty
much all platforms but older gcc ones have atomic intrinsics since
forever. Once you've dug up the documentation for one operation not
adding two more or so doesn't save much.
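
For the record, a minimal sketch of that reduction, written against the GCC
__sync builtins (illustrative; the actual patch provides its own pg_atomic_*
layer):

    #include <stdint.h>

    /* fetch-and-add emulated with compare-and-swap: retry until no
     * concurrent writer changed *ptr between the read and the CAS */
    static uint32_t
    fetch_add_u32(volatile uint32_t *ptr, uint32_t add)
    {
        uint32_t old;

        do
        {
            old = *ptr;
        } while (!__sync_bool_compare_and_swap(ptr, old, old + add));

        return old;   /* value before the addition, as fetch-and-add returns */
    }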

 1. No spinlocks.
 2. Spinlocks.
 
 ...to a three-tiered system that looks like this:
 
 1. No atomics at all.
 2. Spinlocks but nothing else.
 3. Spinlocks, CAS, and fetch-and-add.
 
 I think that's eminently reasonable from a code maintenance and
 developer testing perspective.  It sounds like all common systems and
 architectures would fall in category 3, meaning that people wouldn't
 have to worry about (just for example) significant differences between
 the atomic ops supported on Intel vs. PPC.  

Re: [HACKERS] replication commands and log_statements

2014-06-23 Thread Robert Haas
On Mon, Jun 23, 2014 at 11:15 AM, Stephen Frost sfr...@snowman.net wrote:
 * Robert Haas (robertmh...@gmail.com) wrote:
 Similarly,
 building a logging facility that meets the needs of real users is
 going to require a configuration method more flexible than a total
 order with four choices.  I happen to think a list of comma-separated
 tokens is a pretty good choice, but something else could be OK, too.
 We need something better than what we've got now, though.

 Absolutely, and not all of that should be done through postgresql.conf,
 imv.  The level of flexibility that is being asked for, from what I've
 seen and heard, really needs to be met with new catalog tables or
 extending existing ones.  The nearby thread on pg_audit would be a good
 example.

 I certainly don't want to be specifying specific table names or
 providing a list of roles through any GUC (or configuration file of any
 kind).  I'm not adverse to improving log_statement to a list with tokens
 that indicate various meaning levels but anything done through that
 mechanism will be very coarse.

True, but it would be enough to let you make a list of commands you'd
like to audit, and I think that might be valuable enough to justify
the price of admission.  I certainly agree that logging based on which
object is being accessed needs a different design, tied into the
catalogs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] please review source (SQLServer compatible)

2014-06-23 Thread Andrew Dunstan


On 06/23/2014 10:51 AM, rohtodeveloper wrote:

Dear all,

Our application will be switched from SQL Server to PostgreSQL.
However, a few functions are not supported yet. So we decided to 
extend it.


The functions are as following:

1.SQL statement support
 INSERT statement without INTO keyword
 DELETE statement without FROM keyword
2.Build-in function
 SQUARE
 CHAR
 CHARINDEX
 LEN
 REPLICATE
 SPACE
 STR
 STUFF
 CONVERT
 DATALENGTH
 DATEADD
 DATEDIFF
 DATEPART
 DAY
 MONTH
 YEAR
 EOMONTH
 GETDATE
 SYSDATETIME
3.Operator
 operator !< (Not Less Than)
 operator !> (Not Greater Than)
 operator + (String Concatenation)
4.Other
 DataType support (smalldatetime, datetime, datetime2, uniqueidentifier)
 Date, Time, and Timestamp Escape Sequences and ODBC Scalar Functions
 OCTET_LENGTH
 CURRENT_DATE
 CURRENT_TIME

The extended functions are almost completed but your opinion is very 
important to us.

Would you please help us to review the extended source?

The attachments is the diff source.

Thank you very much.





I think this effort is fundamentally misguided. It will mean a 
maintenance nightmare for you. You would be much better off migrating 
your app to rid it of these SQLServerisms, especially those that require 
backend changes. If you have layered your application correctly, so that 
the places it calls SQL are relatively confined, then this should not be 
terribly difficult. If you have not, then you have bigger problems than 
these anyway.


cheers

andrew




Re: [HACKERS] Minmax indexes

2014-06-23 Thread Heikki Linnakangas

Some comments, aside from the design wrt. bounding boxes etc.:

On 06/15/2014 05:34 AM, Alvaro Herrera wrote:

Robert Haas wrote:

On Wed, Sep 25, 2013 at 4:34 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:

Here's an updated version of this patch, with fixes to all the bugs
reported so far.  Thanks to Thom Brown, Jaime Casanova, Erik Rijkers and
Amit Kapila for the reports.


I'm not very happy with the use of a separate relation fork for
storing this data.


Here's a new version of this patch.  Now the revmap is not stored in a
separate fork, but together with all the regular data, as explained
elsewhere in the thread.


Thanks! Please update the README accordingly.

If I understand the code correctly, the revmap is a three-level deep 
structure. The bottom level consists of regular revmap pages, and each 
regular revmap page is filled with ItemPointerDatas, which point to the 
index tuples. The middle level consists of array revmap pages, and 
each array revmap page contains an array of BlockNumbers of the regular 
revmap pages. The top level is an array of BlockNumbers of the array 
revmap pages, and it is stored in the metapage.


With 8k block size, that's just enough to cover the full range of 2^32-1 
blocks that you'll need if you set mm_pages_per_range=1. Each regular 
revmap page can store about 8192/6 = 1365 item pointers, each array 
revmap page can store about 8192/4 = 2048 block references, and the size 
of the top array is 8192/4. That's just enough; to store the required 
number of array pages in the top array, the array needs to be 
(2^32/1365)/2048 = 1536 elements large.


But with 4k or smaller blocks, it's not enough.
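
To spell that out: with 4k blocks the same arithmetic gives about
4096/6 = 682 item pointers per regular revmap page, 4096/4 = 1024 block
references per array revmap page, and 1024 entries in the top array, which
covers only 682 * 1024 * 1024, i.e. roughly 715 million ranges, well short
of the 2^32-1 needed when mm_pages_per_range = 1.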

I wonder if it would be simpler to just always store the revmap pages in 
the beginning of the index, before any other pages. Finding the revmap 
page would then be just as easy as with a separate fork. When the 
table/index is extended so that a new revmap page is needed, move the 
existing page at that block out of the way. Locking needs some 
consideration, but I think it would be feasible and simpler than you 
have now.



I have followed the suggestion by Amit to overwrite the index tuple when
a new heap tuple is inserted, instead of creating a separate index
tuple.  This saves a lot of index bloat.  This required a new entry
point in bufpage.c, PageOverwriteItemData().  bufpage.c also has a new
function PageIndexDeleteNoCompact which is similar in spirit to
PageIndexMultiDelete except that item pointers do not change.  This is
necessary because the revmap stores item pointers, and such reference
would break if we were to renumber items in index pages.


ISTM that when the old tuple cannot be updated in-place, the new index 
tuple is inserted with mm_doinsert(), but the old tuple is never deleted.


- Heikki





Re: [HACKERS] replication identifier format

2014-06-23 Thread Andres Freund
On 2014-06-23 10:45:51 -0400, Robert Haas wrote:
 On Mon, Jun 23, 2014 at 10:11 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
   Why? Users and other systems only ever see the external ID. Everything
   leaving the system is converted to the external form. The short id
   basically is only used in shared memory and in wal records. For both
   using longer strings would be problematic.
  
   In the patch I have the user can actually see them as they're stored in
   pg_replication_identifier, but there should never be a need for that.
 
  Hmm, so there's no requirement that the short IDs are consistent
  across different clusters that are replication to each other?
 
  Nope. That seemed to be a hard requirement in the earlier discussions we
  had (~2 years ago).
 
 Oh, great.  Somehow I missed the fact that that had been addressed.  I
 had assumed that we still needed global identifiers in which case I
 think they'd need to be 64+ bits (preferably more like 128).  If they
 only need to be locally significant that makes things much better.

Well, I was just talking about the 'short ids' here and how they are
used in crash recovery/shmem et al. Those indeed don't need to be
coordinated.
If you ever use logical decoding on a system that receives changes from
other systems (cascading replication, multimaster) you'll likely want to
add the *long* form of that identifier to the output in the output
plugin so the downstream nodes can identify the source. How one
specific replication solution deals with coordinating this between
systems is essentially that suite's problem.

The external identifier currently is a 'text' column, so essentially
unlimited. (Well, I just noticed that the table currently doesn't have a
toast table assigned, so it's only a couple kb right now, but ...)
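
To make the short/long split concrete, a purely illustrative sketch (none
of these names, and not even the width of the internal id, are taken from
the patch): WAL and shared memory carry only the compact internal id,
while the catalog maps it to the unbounded external name that output
plugins hand to downstream nodes.

#include <stdint.h>

typedef struct
{
    uint16_t    internal_id;    /* short form: WAL / shmem / crash recovery */
    const char *external_id;    /* long form: what other systems see */
} ToyReplIdent;

static const ToyReplIdent toy_map[] = {
    {1, "node-a.example.com/postgres"},
    {2, "node-b.example.com/postgres"},
};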

 Is there any real reason to add a pg_replication_identifier table, or
 should we just let individual replication solutions manage the
 identifiers within their own configuration tables?

I don't think that'd work. During crash recovery the short/internal IDs
are read from WAL records and need to be unique across *all*
databases. Since there's no way for different replication solutions or
even the same to coordinate this across databases (as there's no way to
add shared relations) it has to be builtin.

It's also useful so we can have stuff like the
'pg_replication_identifier_progress' view which tells you internal_id,
external_id, remote_lsn, local_lsn. Just showing the internal ID would
imo be bad.

 I guess one
 question is: What happens if there are multiple replication solutions
 in use on a single server?  How do they coordinate?

What's your concern here? You're wondering how they can make sure the
identifiers they create are non-overlapping?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON and Postgres Variable Queries

2014-06-23 Thread Andrew Dunstan


On 06/23/2014 11:06 AM, Robert Haas wrote:

On Fri, Jun 20, 2014 at 11:26 AM, Joey Caughey
jcaug...@parrotmarketing.com wrote:

I’m having an issue with JSON requests in Postgres and was wondering if
anyone had an answer.

I have an orders table with a field called "json_data".

In the json data there is a plan's array with an id value in them.
{ "plan": { "id": "1" } }

I can do regular queries that will work, like so:
SELECT json_data->>'plan'->'id' as plan_id FROM orders;

But if I try to query on the data that is returned it will fail:
SELECT json_data->>'plan'->'id' as plan_id FROM orders WHERE plan_id = 1;
OR
SELECT json_data->>'plan'->'id' as plan_id FROM orders GROUP BY plan_id;
OR
SELECT json_data->>'plan'->'id' as plan_id FROM orders ORDER BY plan_id;

Is this something that has been overlooked? or is there another way to go
about this?

You might find a sub-SELECT helpful:

SELECT * FROM (SELECT json_data->>'plan'->'id' as plan_id FROM
orders) x WHERE plan_id = 1

It might be a generally useful thing for WHERE-clause items to be able
to reference items from the target list by alias, or maybe it's
problematic for some reason that I don't know about, but right now
they can't.




Once again,

   json_data->>'plan'->'id'

is an expression guaranteed to fail, since ->> returns text but -> expects 
its left hand operand to be json, unlike


   json_data->'plan'->>'id'

or

   json_data#>>'{plan,id}'

So I don't believe the OP's original statement about what is and isn't 
working. The alias issue, of course, is not at all JSON-specific, and 
the subselect is one solution - a CTE is another. But you CAN use the 
alias in an ORDER BY or GROUP BY.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait free LW_SHARED acquisition - v0.2

2014-06-23 Thread Andres Freund
On 2014-06-23 19:59:10 +0530, Amit Kapila wrote:
 On Tue, Jun 17, 2014 at 8:56 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  On 2014-06-17 20:47:51 +0530, Amit Kapila wrote:
   On Tue, Jun 17, 2014 at 6:35 PM, Andres Freund and...@2ndquadrant.com
   wrote:
  
   You have followed it pretty well as far as I can understand from your
 replies, as there is no reproducible test (which I think is a bit tricky
 to
   prepare), so it becomes difficult to explain by theory.
 
  I'm working an updated patch that moves the releaseOK into the
  spinlocks. Maybe that's the problem already - it's certainly not correct
  as is.
 
 Sure, I will do the test/performance test with updated patch; you
 might want to include some more changes based on comments
 in mail below:

I'm nearly finished in cleaning up the atomics part of the patch which
also includes a bit of cleanup of the lwlocks code.

 Few more comments:
 
 1.
 LWLockAcquireCommon()
 {
 ..
 iterations++;
 }
 
 In current logic, I could not see any use of these *iterations* variable.

It's useful for debugging. Should be gone in the final code.

 2.
 LWLockAcquireCommon()
 {
 ..
 if (!LWLockDequeueSelf(l))
 {
 /*
 * Somebody else dequeued us and has or will..
  ..
 */
 for (;;)
 {
  PGSemaphoreLock(proc->sem, false);
 if (!proc->lwWaiting)
 break;
  extraWaits++;
 }
 lock->releaseOK = true;
 }
 
 Do we want to set result = false; after waking in above code?
 The idea behind setting false is to indicate whether we get the lock
 immediately or not which previously was decided based on if it needs
 to queue itself?

Hm. I don't think it's clear which version is better.

 3.
 LWLockAcquireCommon()
 {
 ..
 /*
  * Ok, at this point we couldn't grab the lock on the first try. We
  * cannot simply queue ourselves to the end of the list and wait to be
  * woken up because by now the lock could long have been released.
  * Instead add us to the queue and try to grab the lock again. If we
  * suceed we need to revert the queuing and be happy, otherwise we
  * recheck the lock. If we still couldn't grab it, we know that the
  * other lock will see our queue entries when releasing since they
  * existed before we checked for the lock.
  */
 /* add to the queue */
 LWLockQueueSelf(l, mode);
 
 /* we're now guaranteed to be woken up if necessary */
 mustwait = LWLockAttemptLock(l, mode, false, potentially_spurious);
 ..
 }
 
 a. By reading above code and comments, it is not quite clear why
 second attempt is important unless somebody thinks on it or refer
 your comments in *Notes* section at top of file.  I think it's better to
 indicate in some way so that code reader can refer to Notes section or
 wherever you are planning to keep those comments.

Ok.

 b. There is typo in above comment suceed/succeed.

Thanks, fixed.
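
As an aside for readers following point 3: below is a toy, compilable
sketch of the queue-then-recheck protocol quoted above. All names are
invented, and the queue and sleep are single-threaded stubs standing in
for the real wait queue and semaphore machinery.

#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct
{
    _Atomic bool held;
    bool        queued;         /* stand-in for the real wait queue */
} ToyLock;

static bool
try_lock(ToyLock *l)
{
    bool        expected = false;

    return atomic_compare_exchange_strong(&l->held, &expected, true);
}

static void queue_self(ToyLock *l)    { l->queued = true; }
static bool dequeue_self(ToyLock *l)  { bool q = l->queued; l->queued = false; return q; }
static void sleep_until_woken(void)   { /* real code blocks on the proc semaphore */ }

static void
toy_acquire(ToyLock *l)
{
    for (;;)
    {
        if (try_lock(l))
            return;             /* fast path: got it immediately */
        queue_self(l);          /* from here on a wakeup is guaranteed */
        if (try_lock(l))
        {
            /*
             * Got it after all, so revert the queuing.  If somebody
             * already dequeued us, a wakeup is in flight and has to be
             * absorbed, as in the quoted code.
             */
            if (!dequeue_self(l))
                sleep_until_woken();
            return;
        }
        sleep_until_woken();    /* the releaser saw our queue entry */
    }
}

int
main(void)
{
    ToyLock     l = {false, false};

    toy_acquire(&l);
    printf("held: %d\n", (int) atomic_load(&l.held));
    return 0;
}

The second try_lock() is what closes the race: our queue entry existed
before the recheck, so a releaser that wins the recheck must also see the
entry and wake us.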

 
 4.
 LWLockAcquireCommon()
 {
 ..
 if (!LWLockDequeueSelf(l))
 {
  for (;;)
 {
 PGSemaphoreLock(proc->sem, false);
  if (!proc->lwWaiting)
 break;
 extraWaits++;
  }
 lock->releaseOK = true;
 ..
 }
 
 Setting releaseOK in above context might not be required  because if the
 control comes in this part of code, it will not retry to acquire another
 time.

Hm. You're probably right.

 5.
 LWLockWaitForVar()
 {
 ..
 else
 mustwait = false;
 
 if (!mustwait)
 break;
 ..
 }
 
 I think we can directly break in else part in above code.

Well, there's another case of mustwait=false above which is triggered
while the spinlock is held. Don't think it'd get simpler.

 6.
 LWLockWaitForVar()
 {
 ..
 /*
  * Quick test first to see if it the slot is free right now.
  *
  * XXX: the caller uses a spinlock before this,...
  */
 if (pg_atomic_read_u32(lock->lockcount) == 0)
  return true;
 }
 
 Does the part of comment that refers to spinlock is still relevant
 after using atomic ops?

Yes. pg_atomic_read_u32() isn't a memory barrier (and explicitly
documented not to be).

 7.
 LWLockWaitForVar()
 {
 ..
 /*
  * Add myself to wait queue. Note that this is racy, somebody else
  * could wakeup before we're finished queuing.
  * NB: We're using nearly the same twice-in-a-row lock acquisition
  * protocol as LWLockAcquire(). Check its comments for details.
  */
 LWLockQueueSelf(l, LW_WAIT_UNTIL_FREE);
 
 /* we're now guaranteed to be woken up if necessary */
  mustwait = LWLockAttemptLock(l, LW_EXCLUSIVE, false,
 potentially_spurious);
 }
 
 Why is it important to Attempt lock after queuing in this case, can't
 we just re-check exclusive lock as done before queuing?

Well, that's how Heikki designed LWLockWaitForVar().

 8.
 LWLockWaitForVar()
 {
 ..
 PRINT_LWDEBUG("LWLockAcquire undo queue", lock, mode);
  break;
 }
 else
 {
  PRINT_LWDEBUG("LWLockAcquire waiting 4", lock, mode);
 }
 ..
 }
 
 a. I think instead of LWLockAcquire, here we should use
LWLockWaitForVar

right.

 b. Isn't it better to use LOG_LWDEBUG instead of PRINT_LWDEBUG(),
 as PRINT_LWDEBUG() is generally used in file at entry of functions to
 log info about locks?

Fine with me.

 9.
 LWLockUpdateVar()
 {
 /* 

Re: [HACKERS] [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns

2014-06-23 Thread Tom Lane
Michael Paquier michael.paqu...@gmail.com writes:
 Digging more into that, I have found the issue and a fix for it. It happens
 that populate_recordset_object_start, which is used to initialize the
 process for the population of the record, is taken *each* time a json
 object is found, re-creating every time the hash table for the parsing
 process, hence removing from PopulateRecordsetState all the entries already
 parsed and creating the problem reported by Matti. The fix I am proposing
 to fix this issue is rather simple: simply bypass the creation of the hash
 table if lex_level > 1 as we are in presence of a nested object and rely on
 the existing hash table.

Yes, this code is clearly not handling the nested-objects case correctly.
I had written a fix more or less equivalent to yours last night.
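
For illustration, a minimal sketch of the shape of such a fix, with
invented names rather than the actual jsonfuncs.c code (the real lexer
tracks lex_level itself):

#include <stdlib.h>

typedef struct
{
    int         lex_level;
    void       *hash;           /* stand-in for the real hash table */
} ToyState;

static void
toy_object_start(ToyState *state)
{
    state->lex_level++;
    if (state->lex_level > 1)
        return;                 /* nested object: keep the existing table */
    state->hash = calloc(1, 64);    /* stand-in for hash table creation */
}

int
main(void)
{
    ToyState    s = {0, NULL};

    toy_object_start(&s);       /* top level: creates the table */
    toy_object_start(&s);       /* nested: leaves it alone */
    return (s.hash != NULL) ? 0 : 1;
}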

However, it seems to me that these functions (json[b]_to_record[set]) are
handling the nested-json-objects case in a fairly brain-dead fashion to
start with.  I would like to propose that we should think about getting
rid of the use_json_as_text flag arguments altogether.  What purpose do
they serve?  If we're going to the trouble of parsing the nested JSON
objects anyway, why don't we just reconstruct from that data?

(IOW, we probably actually should have nested hash tables in this case.
I suspect that the current bug arose from incompletely-written logic
to do it like that.)

Since we've already decided to force an initdb for 9.4beta2, it's not
quite too late to revisit this API, and I think it needs revisiting.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-06-23 Thread Tom Lane
Gurjeet Singh gurj...@singh.im writes:
 would it be possible to include this in 9.4 as well?

While this is clearly an improvement over what we had before, it's
impossible to argue that it's a bug fix, and we are way past the 9.4
feature freeze deadline.  In particular, packagers who've already done
their 9.4 development work might be blindsided by us slipping this into
9.4 release.  So while I wouldn't have a problem with putting this change
into 9.4 from a technical standpoint, it's hard to argue that it'd meet
project norms from a development-process standpoint.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-06-23 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 I'd expect a catalog table or perhaps changes to pg_class (maybe other
 things also..) to define what gets logged.

How exactly will that work for log messages generated in contexts where
we do not have working catalog access?  (postmaster, crash recovery,
or pretty much anywhere where we're not in a valid transaction.)

This strikes me as much like the periodic suggestions we hear to get
rid of the GUC infrastructure in favor of keeping all those settings
in a table.  It doesn't work because too much of that info is needed
below the level of working table access.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] please review source(SQLServer compatible)‏

2014-06-23 Thread Kevin Grittner
Andrew Dunstan and...@dunslane.net wrote:
 On 06/23/2014 10:51 AM, rohtodeveloper wrote:

 Our application will be switched from SQL Server to PostgreSQL.
 However, a few functions are not supported yet. So we decided to
 extend it.

 The functions are as following:

 1.SQL statement support
   INSERT statement without INTO keyword
   DELETE statement without FROM keyword

Those would be pretty trivial to do in core; the question is
whether the community would agree that a few extra lines in the
parser (and compatibility sections of the docs) is worth it for
portability from SQL Server and Sybase.

 2.Build-in function
   SQUARE
   CHAR
   CHARINDEX
   LEN
   REPLICATE
   SPACE
   STR
   STUFF
   CONVERT
   DATALENGTH
   DATEADD
   DATEDIFF
   DATEPART
   DAY
   MONTH
   YEAR
   EOMONTH
   GETDATE
   SYSDATETIME
 3.Operator
   operator !< (Not Less Than)
   operator !> (Not Greater Than)
   operator + (String Concatenation)

It seems likely that you could write an extension to add these
(using the CREATE EXTENSION feature) and submit them to
http://pgxn.org if you wanted to.  Is there some reason you're not
going this route?
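
For anyone weighing that route, here is roughly what one of the listed
functions (SQUARE) looks like as a C extension function; the extension's
SQL script would then declare it with CREATE FUNCTION ... LANGUAGE C.
This is a generic illustration, not the submitted code:

#include "postgres.h"
#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(square);

/* square(float8) returns float8 */
Datum
square(PG_FUNCTION_ARGS)
{
    float8      arg = PG_GETARG_FLOAT8(0);

    PG_RETURN_FLOAT8(arg * arg);
}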

 4.Other
   DataType support(smalldatetime,datetime,datetime2,uniqueidentifier)
   Date, Time, and Timestamp Escape Sequences ODBC Scalar Functions
   OCTET_LENGTH
   CURRENT_DATE
   CURRENT_TIME

You can add data types (including within extensions), and some of
those are things which seem to be implemented in some form.

test=# select current_date;
    date    
------------
 2014-06-23
(1 row)

test=# select current_time;
       timetz       
--------------------
 10:44:36.958967-05
(1 row)

test=# select octet_length('abcd');
 octet_length
--------------
            4
(1 row)

test=# select octet_length('π');
 octet_length
--------------
            2
(1 row)

If the point is that you want to change the semantics of existing
valid PostgreSQL statements, that's probably not a good idea.

 The extended functions are almost completed but your opinion is very
 important to us.
 Would you please help us to review the extended source?

http://wiki.postgresql.org/wiki/Submitting_a_Patch

 The attachments is the diff source.

I think if you want someone to look at this, you really need to
provide a single file with a unified or context diff of the entire
source trees.  And you may have trouble finding anyone willing to
review it for free unless you are explicitly looking to share the
code for free.

 I think this effort is fundamentally misguided. It will mean a
 maintenance nightmare for you. You would be much better off migrating
 your app to rid it of these SQLServerisms, especially those that require
 backend changes. If you have layered your application correctly, so that
 the places it calls SQL are relatively confined, then this should not be
 terribly difficult. If you have not, then you have bigger problems than
 these anyway.

There is certainly something to that point of view, but
implementing compatibility shims can reduce the effort of
migration, and isn't always a bad idea.  One thing which is just
going to need to be fixed in the application code is any instance
of UPDATE with a FROM clause.  SQL Server and PostgreSQL both have
non-standard extensions which support such syntax, but with
different semantics, so such a statement written for SQL Server
will probably run without throwing an error under PostgreSQL, but
will not do what it did under SQL Server.  In many cases it will
run for a *very* long time, and if allowed to finish will probably
update rows which were not intended.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] Re: [HACKERS] please review source(SQLServer compatible)‏

2014-06-23 Thread Pavel Stehule
2014-06-23 18:00 GMT+02:00 Kevin Grittner kgri...@ymail.com:

 Andrew Dunstan and...@dunslane.net wrote:
  On 06/23/2014 10:51 AM, rohtodeveloper wrote:

  Our application will be switched from SQL Server to PostgreSQL.
  However, a few functions are not supported yet. So we decided to
  extend it.
 
  The functions are as following:
 
  1.SQL statement support
INSERT statement without INTO keyword
  DELETE statement without FROM keyword

 Those would be pretty trivial to do in core; the question is
 whether the community would agree that a few extra lines in the
 parser (and compatibility sections of the docs) is worth it for
 portability from SQL Server and Sybase.


I am strongly against - it is murder of ANSI SQL

Regards

Pavel



  2.Build-in function
SQUARE
CHAR
CHARINDEX
LEN
REPLICATE
SPACE
STR
STUFF
CONVERT
DATALENGTH
DATEADD
DATEDIFF
DATEPART
DAY
MONTH
YEAR
EOMONTH
GETDATE
SYSDATETIME
  3.Operator
    operator !< (Not Less Than)
    operator !> (Not Greater Than)
operator + (String Concatenation)

 It seems likely that you could write an extension to add these
 (using the CREATE EXTENSION feature) and submit them to
 http://pgxn.org if you wanted to.  Is there some reason you're not
 going this route?

  4.Other
    DataType support(smalldatetime,datetime,datetime2,uniqueidentifier)
Date, Time, and Timestamp Escape Sequences ODBC Scalar Functions
OCTET_LENGTH
CURRENT_DATE
CURRENT_TIME

 You can add data types (including within extensions), and some of
 those are things which seem to be implemented in some form.

 test=# select current_date;
     date
 ------------
  2014-06-23
 (1 row)

 test=# select current_time;
        timetz
 --------------------
  10:44:36.958967-05
 (1 row)

 test=# select octet_length('abcd');
  octet_length
 --------------
 4
 (1 row)

 test=# select octet_length('π');
  octet_length
 --------------
 2
 (1 row)

 If the point is that you want to change the semantics of existing
 valid PostgreSQL statements, that's probably not a good idea.

  The extended functions are almost completed but your opinion is very
  important to us.
  Would you please help us to review the extended source?

 http://wiki.postgresql.org/wiki/Submitting_a_Patch

  The attachments is the diff source.

 I think if you want someone to look at this, you really need to
 provide a single file with a unified or context diff of the entire
 source trees.  And you may have trouble finding anyone willing to
 review it for free unless you are explicitly looking to share the
 code for free.

  I think this effort is fundamentally misguided. It will mean a
  maintenance nightmare for you. You would be much better off migrating
  your app to rid it of these SQLServerisms, especially those that require
  backend changes. If you have layered your application correctly, so that
  the places it calls SQL are relatively confined, then this should not be
  terribly difficult. If you have not, then you have bigger problems than
  these anyway.

 There is certainly something to that point of view, but
 implementing compatibility shims can reduce the effort of
 migration, and isn't always a bad idea.  One thing which is just
 going to need to be fixed in the application code is any instance
 of UPDATE with a FROM clause.  SQL Server and PostgreSQL both have
 non-standard extensions which support such syntax, but with
 different semantics, so such a statement written for SQL Server
 will probably run without throwing an error under PostgreSQL, but
 will not do what it did under SQL Server.  In many cases it will
 run for a *very* long time, and if allowed to finish will probably
 update rows which were not intended.

 --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Atomics hardware support table supported architectures

2014-06-23 Thread Robert Haas
On Mon, Jun 23, 2014 at 11:16 AM, Andres Freund and...@2ndquadrant.com wrote:
 This criterion has been proposed before, but I'm not sure I really
 agree with it.  If having code around that targets obscure platforms
 is hindering the development of new features, then that's a reason to
 get rid of it.

 I think that's the case.

At some level, I agree, but see below.

 That we still have !PG_USE_INLINE support although all buildfarm animals
 support it since 4c8aa8b (fixing acc) causes absurd constructs
 (STATIC_IF_INLINE) and fugly macro usage making it harder to read
and modify code. I spent a good chunk of time just to make that work. Even if
 nobody's going to ever use it.

I expect that we're eventually going to make a decision to require
inline support, but that commit was only last month, so probably not
just yet.  We've been trying hard to maintain support for C89, but I
think even Tom is starting to wonder whether we ought to let a few
widely-implemented post-C89 features in.

 That we need fallback for older gccs instead of relying on somebody else
 having done the atomics abstraction causes significant waste of time.

I agree, but I feel that's time well-spent.  gcc is not the only
compiler, not everyone can or wants to run bleeding-edge versions, and
I don't trust the gcc implementors to a sufficient degree to throw
away our existing TAS implementations wholesale and rely on theirs
instead.  Building cross-platform software is sometimes difficult; but
it's also got a lot of value.  I've spent enough time over the years
working on obscure platforms to value software that works even on
obscure platforms, and I'm glad that PostgreSQL tries to do that, even
if we don't always succeed perfectly.

 That we have support for platforms that we haven't even documented as
 working (e.g. SuperH) or platforms that don't work in the real world
 (m32r) means that that one has to think about and research so many more
 edge cases that it's really hard to make progress. I don't know how
 often I've now sequentially gone through s_lock.h, s_lock.c,
 src/backend/port/tas to understand all the supported combinations.

I agree that s_lock.h is the most painful part of this whole thing,
mostly because I'd really like to get to the point where
SpinLockAcquire() and SpinLockRelease() function as compiler barriers.
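
For reference, a minimal sketch of acquire/release with barrier semantics,
using C11 stdatomic.h rather than PostgreSQL's s_lock machinery (names
invented; initialize the flag with ATOMIC_FLAG_INIT):

#include <stdatomic.h>

typedef struct
{
    atomic_flag locked;
} ToySpinLock;

static inline void
toy_spin_acquire(ToySpinLock *s)
{
    /* acquire ordering: later loads/stores cannot move above this */
    while (atomic_flag_test_and_set_explicit(&s->locked,
                                             memory_order_acquire))
        ;                       /* spin; real code would add backoff */
}

static inline void
toy_spin_release(ToySpinLock *s)
{
    /* release ordering: earlier loads/stores cannot move below this */
    atomic_flag_clear_explicit(&s->locked, memory_order_release);
}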

 If we're pretty sure it's useless and doesn't work, so
 is that.  But if it just hasn't been tested lately, I don't personally
 think that's a sufficient reason to excise it.

 Sure, it just not being tested isn't a reason alone. But it's about more
 than that. I've now spent *far* too much time understanding the idioms
 of architectures that'll never get used again so I don't break them
 accidentally. Sure, that's my choice. But it's imo necessary for
 postgres to scale.
 I don't know about you, but for me reading/understanding/modifying
 assembly on some platform you've never used is *HARD*. And even harder
 if there's nobody around that's going to test it. It's entirely possible
 to write incorrect locking assembly that'll break when used.

Agree.

 Telling people that
 they can't have even the most minimal platform support code in
 PostgreSQL unless they're willing to contribute and maintain a BF VM
 indefinitely is not very friendly.  Of course, the risk of their
 platform getting broken is higher if they don't, but that's different
 than making it a hard requirement.

 I agree that we shouldn't actively try to break stuff. But having to
 understand & blindly modify unused code is on the other hand of actively
 breaking platforms. It's actively hindering development.

It's actively hindering a small number of important patches, but most
developers, most of the time, do not need to care.

 We should be looking at ways to
 make it easier not harder for people who don't yet run PostgreSQL to
 start doing so.  We are an open source community; we should try to be,
 as far as possible, open.

 Do you really think supporting platform that 20 people worldwide run for
 fun in their sparetime once in a while (sparc  v8plus) is achieving
 that?
 I think it's more likely that easier to understand code, quick review
 for patches and better testing help with that.

I don't think we can solve this problem with a sledgehammer.  We can
whittle down the number of platforms in s_lock.h that require special
treatment, and eventually some of the odder cases may go away
altogether, and that's great.  But I'd rather not make this patch
dependent on the rate at which we feel comfortable removing cases from
that file.

 But I think this is all a bit off-topic for this thread.  Andres has
 already implemented a fallback for people who haven't got CAS and
 fetch-and-add on their platform, so whether or not we deprecate some
 more platforms has no bearing at all on this patch.

 While I indeed have that fallback code, that's statement is still not
 entirely true. We still need to add atomics support for lots of
 platforms, 

Re: [HACKERS] Re: [bug fix] multibyte messages are displayed incorrectly on the client

2014-06-23 Thread Tom Lane
Heikki Linnakangas hlinnakan...@vmware.com writes:
 Earlier in this thread, MauMau pointed out that we can't do encoding 
 conversions until we have connected to the database because you need to 
 read pg_conversion for that. That's because we support creating custom 
 conversions with CREATE CONVERSION. Frankly, I don't think anyone cares 
 about that feature. If we just dropped the CREATE/DROP CONVERSION 
 feature altogether and hard-coded the conversions we have, there would 
 be close to zero complaints. Even if you want to extend something around 
 encodings and conversions, the CREATE CONVERSION interface is clunky. 
 Firstly, conversions are per-database, and even schema-qualified, which 
 just seems like an extra complication. You'll most likely want to modify 
 the conversion across the whole system. Secondly, rather than define a 
 new conversion between encodings, you'll likely want to define a whole 
 new encoding with conversions to/from existing encodings, but you can't 
 do that anyway without hacking the source code.

There's certainly something to be said for that position.  If there were
any prospect of extensions defining new encodings someday, I'd argue for
keeping CREATE CONVERSION.  But the performance headaches would be
substantial, and there aren't new encodings coming down the pike often
enough to justify the work involved, so I don't see us ever doing CREATE
ENCODING; and that means that CREATE CONVERSION is of little value.

I'd kind of like to see this go just because having catalog accesses
involved in encoding conversion setup is messy and fragile.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON and Postgres Variable Queries

2014-06-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 You might find a sub-SELECT helpful:

 SELECT * FROM (SELECT json_data->>'plan'->'id' as plan_id FROM
 orders) x WHERE plan_id = 1

 It might be a generally useful thing for WHERE-clause items to be able
 to reference items from the target list by alias, or maybe it's
 problematic for some reason that I don't know about,

Standards compliance?  It's not just a trivial syntactic issue either,
but a rather fundamental conceptual one: expressions in the target list
are not supposed to be evaluated until after the WHERE clause is.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How about a proper TEMPORARY TABLESPACE?

2014-06-23 Thread Robert Haas
On Wed, Jun 18, 2014 at 9:39 PM, Matheus de Oliveira
matioli.math...@gmail.com wrote:
 Then, to summarize Matheus must do:
 * use an option instead of change the syntax and catalog to indicate that
 a tablespace is used to store temp objects

 Yes. I myself wasn't sure TEMPORARY syntax would be good, but I would just
 like to hear more about. Storing as an options seems nice to me.

I kind of like the TEMPORARY syntax, by analogy with what we do for
tables.  On the other hand, this situation isn't quite analogous: the
tablespace itself is permanent; it's only the objects inside the
tablespace that are temporary.  Hmm.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Use a signal to trigger a memory context dump?

2014-06-23 Thread MauMau

From: Andres Freund and...@2ndquadrant.com

I wonder if it'd make sense to allow a signal to trigger a memory
context dump? I and others more than once had the need to examine memory
usage on production systems and using gdb isn't always realistic.


+1

It would be nice if there were a generic infrastructure on which the DBA can 
get information about running backends.  I wish for a facility to dump 
info for all backends with a single operation as well as one backend at a 
time, because it would be difficult to ask users to choose a specific 
backend or operate on all backends, especially on Windows.  The candidate 
info (see the sketch after the list below) is:


* memory context

* stack trace: I'd like to implement this.

* GUC settings: to know that backends are running with intended settings.

* prepared statements (= pg_prepared_statements): to know if applications 
are taking advantage of prepared statements for performance.
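
A minimal POSIX sketch of the signal-flag mechanism such a facility could
build on (generic C, not PostgreSQL code): the handler only sets a flag,
and the dump itself happens at a safe point in the main loop, since doing
I/O or memory allocation inside a signal handler is unsafe.

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

static volatile sig_atomic_t dump_requested = 0;

static void
handle_dump_signal(int signo)
{
    (void) signo;
    dump_requested = 1;         /* the only safe thing to do here */
}

int
main(void)
{
    signal(SIGUSR1, handle_dump_signal);
    for (;;)
    {
        if (dump_requested)
        {
            dump_requested = 0;
            fprintf(stderr, "memory context / GUC / statement dump here\n");
        }
        sleep(1);               /* stand-in for the backend's real work */
    }
}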


Regards
MauMau



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [BUGS] BUG #10728: json_to_recordset with nested json objects NULLs columns

2014-06-23 Thread Merlin Moncure
On Mon, Jun 23, 2014 at 10:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Michael Paquier michael.paqu...@gmail.com writes:
 Digging more into that, I have found the issue and a fix for it. It happens
 that populate_recordset_object_start, which is used to initialize the
 process for the population of the record, is taken *each* time a json
 object is found, re-creating every time the hash table for the parsing
 process, hence removing from PopulateRecordsetState all the entries already
 parsed and creating the problem reported by Matti. The fix I am proposing
 to fix this issue is rather simple: simply bypass the creation of the hash
  table if lex_level > 1 as we are in presence of a nested object and rely on
 the existing hash table.

 Yes, this code is clearly not handling the nested-objects case correctly.
 I had written a fix more or less equivalent to yours last night.

 However, it seems to me that these functions (json[b]_to_record[set]) are
 handling the nested-json-objects case in a fairly brain-dead fashion to
 start with.  I would like to propose that we should think about getting
 rid of the use_json_as_text flag arguments altogether.  What purpose do
 they serve?  If we're going to the trouble of parsing the nested JSON
 objects anyway, why don't we just reconstruct from that data?

I think they should be removed.  (I called this out in the feature
level review: 
http://www.postgresql.org/message-id/CAHyXU0wqadCJk7MMkeARuuY05VrD=axdn6wdcemtuwo5p4c...@mail.gmail.com).
AIUI, the flag was introduced as a workaround to try and deal with
mapping nested structures.  Text variant 'json' flags have had them.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] please review source(SQLServer compatible)‏

2014-06-23 Thread Vik Fearing
On 06/23/2014 04:51 PM, rohtodeveloper wrote:
 1.SQL statement support
  INSERT statement without INTO keyword
  DELETE statement without FROM keyword

Why would we want this?
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replication identifier format

2014-06-23 Thread Robert Haas
On Mon, Jun 23, 2014 at 11:28 AM, Andres Freund and...@2ndquadrant.com wrote:
 Oh, great.  Somehow I missed the fact that that had been addressed.  I
 had assumed that we still needed global identifiers in which case I
 think they'd need to be 64+ bits (preferably more like 128).  If they
 only need to be locally significant that makes things much better.

 Well, I was just talking about the 'short ids' here and how they are
 used in crash recovery/shmem et al. Those indeed don't need to be
 coordinated.
 If you ever use logical decoding on a system that receives changes from
 other systems (cascading replication, multimaster) you'll likely want to
 add the *long* form of that identifier to the output in the output
 plugin so the downstream nodes can identify the source. How one
 specific replication solution deals with coordinating this between
 systems is essentially that suite's problem.

OK.

 The external identifier currently is a 'text' column, so essentially
 unlimited. (Well, I just noticed that the table currently doesn't have a
 toast table assigned, so it's only a couple kb right now, but ...)

OK.  I have no clear reason to dislike that.

 Is there any real reason to add a pg_replication_identifier table, or
 should we just let individual replication solutions manage the
 identifiers within their own configuration tables?

 I don't think that'd work. During crash recovery the short/internal IDs
 are read from WAL records and need to be unique across *all*
 databases. Since there's no way for different replication solutions or
 even the same to coordinate this across databases (as there's no way to
 add shared relations) it has to be builtin.

That makes sense.

 It's also useful so we can have stuff like the
 'pg_replication_identifier_progress' view which tells you internal_id,
 external_id, remote_lsn, local_lsn. Just showing the internal ID would
 imo be bad.

OK.

 I guess one
 question is: What happens if there are multiple replication solutions
 in use on a single server?  How do they coordinate?

 What's your concern here? You're wondering how they can make sure the
 identifiers they create are non-overlapping?

Yeah, I was just thinking that might be why you installed a catalog
table for this, but now I see that there are several other reasons
also.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL access to database attributes

2014-06-23 Thread Robert Haas
On Sun, Jun 22, 2014 at 2:59 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 I found only one problem - first patch introduce a new property
 CONNECTION_LIMIT and replace previously used CONNECTION LIMIT in
 documentation. But CONNECTION LIMIT is still supported, but it is not
 documented. So for some readers it can look like breaking compatibility, but
 it is false. This should be documented better.

Yeah, I think the old syntax should be documented also.  See, e.g.,
what we do for COPY.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics hardware support table supported architectures

2014-06-23 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 23, 2014 at 11:16 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
 Since fetch-and-add is trivially implemented using CAS, there's not much
 need to distinguish between CAS and CAS + fetch_and_add. From my POV the
 restriction to just CAS/fetch_and_add isn't actually buying much. Pretty
 much all platforms but older gcc ones have atomic intrinsics since
 forever. Once you've dug up the documentation for one operation not
 adding two more or so doesn't save much.

 Again, the concern that was expressed at the developer's meeting was
 that the performance characteristics of the CAS loop might be
 significantly different from a native atomic op as to cause us
 heartburn.  I think that's a valid concern... but if everything in
 common use has both CAS and fetch-and-add, then I think there's
 probably no issue here.

What I want to know is whether we are going to agree that the set of
atomics is going to be CAS plus fetch_and_add plus *nothing else*.

Andres seems to envision that those will be a minimal set and then
we'll freely use any other atomic op that seems handy as long as we can
provide a fallback implementation of it.  I agree with Robert that
keeping track of the actual cross-platform performance characteristics
of such code would be a nightmare.

What I want to see is a list of atomic ops that is short enough that
we can agree we do not care about PG performance on any platform that
hasn't got (fast versions of) all of them.  Then we will code and
optimize on the basis of having all those ops and no others.  We can
offer fallback implementations of those ops so that older platforms
aren't entirely dead in the water, but they will not be the focus
of any performance testing.
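
For concreteness, a minimal C11 sketch of the fetch-and-add-via-CAS point
discussed above (illustrative only, not PostgreSQL's implementation):

#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static uint32_t
fetch_add_via_cas(_Atomic uint32_t *ptr, uint32_t add)
{
    uint32_t    old = atomic_load(ptr);

    /*
     * On failure, the current value is reloaded into 'old', so we keep
     * retrying until no other thread raced with us.
     */
    while (!atomic_compare_exchange_weak(ptr, &old, old + add))
        ;
    return old;
}

int
main(void)
{
    _Atomic uint32_t v = 40;

    printf("old=%u\n", fetch_add_via_cas(&v, 2));   /* old=40 */
    printf("new=%u\n", atomic_load(&v));            /* new=42 */
    return 0;
}

The performance question raised in the thread is whether such a loop,
under contention, behaves as well as a native locked-add instruction;
functionally the two are equivalent.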

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics hardware support table supported architectures

2014-06-23 Thread Andres Freund
On 2014-06-23 12:08:08 -0400, Robert Haas wrote:
  That we still have !PG_USE_INLINE support although all buildfarm animals
  support it since 4c8aa8b (fixing acc) causes absurd constructs
  (STATIC_IF_INLINE) and fugly macro usage making it harder to read
  and modify code. I spent a good chunk of time just to make that work. Even 
  if
  nobody's going to ever use it.
 
 I expect that we're eventually going to make a decision to require
 inline support, but that commit was only last month, so probably not
 just yet.

We were in pretty much the same position before - it was just the quiet
inline test tripping us up...

  That we have support for platforms that we haven't even documented as
  working (e.g. SuperH) or platforms that don't work in the real world
  (m32r) means that that one has to think about and research so many more
  edge cases that it's really hard to make progress. I don't know how
  often I've now sequentially gone through s_lock.h, s_lock.c,
  src/backend/port/tas to understand all the supported combinations.
 
 I agree that s_lock.h is the most painful part of this whole thing,
 mostly because I'd really like to get to the point where
 SpinLockAcquire() and SpinLockRelease() function as compiler barriers.

s_lock.h is basically gone in my patch btw. Only old comments + defines
for TAS/TAS_SPIN/S_UNLOCK etc mapping the old calls onto atomic flags
remain.
The new code should, hopefully, all be barrier safe.

  I agree that we shouldn't actively try to break stuff. But having to
  understand & blindly modify unused code is on the other hand of actively
  breaking platforms. It's actively hindering development.
 
 It's actively hindering a small number of important patches, but most
 developers, most of the time, do not need to care.

True. I guess I just seem to find myself hit by this a bit heavily :)

  While I indeed have that fallback code, that's statement is still not
  entirely true. We still need to add atomics support for lots of
  platforms, otherwise they're just going to be 'less supported' than
  now. Are we fine with that and just'll accept patches?
 
 I guess it depends on which platforms are affected.  I certainly don't
 think any new facilities we add need to be more comprehensive than
 what I did in barrier.h, and in fact I think we could omit a few of
 the things I have there, like PA-RISC, Itanium, and Alpha.  And, yeah,
 if somebody needs an atomics implementation for a platform we haven't
 got yet, they can submit a patch.

I'll omit !gcc PA-RISC unless somebody complains. I don't think I'll be
able to modify hpux_hppa.s + build infrastructure correctly and there's
no buildfarm. That means it'll require --disable-spinlocks.
I've now also removed sunstudio*.s because sunstudio has provided
intrinsics for a *long* while. No idea whether it'll work out of the
box, but it shouldn't be hard to mop up eventual bugs.

  The question that
  needs to be resolved in order to move forward here is this: Consider
  the set of platforms we support, ignoring any that don't have
  spinlocks.  Do any of them have only one of fetch-and-add, or do they
  all have both or neither?  If it's the latter, then we're talking
  about moving from a two-tiered system that looks like this:
 
  Since fetch-and-add is trivially implemented using CAS, there's not much
  need to distinguish between CAS and CAS + fetch_and_add. From my POV the
  restriction to just CAS/fetch_and_add isn't actually buying much. Pretty
  much all platforms but older gcc ones have atomic intrinsics since
  forever. Once you've dug up the documentation for one operation not
  adding two more or so doesn't save much.
 
 Again, the concern that was expressed at the developer's meeting was
 that the performance characteristics of the CAS loop might be
 significantly different from a native atomic op as to cause us
 heartburn.  I think that's a valid concern... but if everything in
 common use has both CAS and fetch-and-add, then I think there's
 probably no issue here.

Well, there's platforms where both CAS and atomic add are implemented
using load linked/store conditional. So neither really is 100%
native. But that's essentially how CAS et al are implemented in
microcode on pretty much all somewhat recent cpus anyway.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-06-23 Thread Gurjeet Singh
On Mon, Jun 23, 2014 at 11:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Gurjeet Singh gurj...@singh.im writes:
 would it be possible to include this in 9.4 as well?

 While this is clearly an improvement over what we had before, it's
 impossible to argue that it's a bug fix, and we are way past the 9.4
 feature freeze deadline.  In particular, packagers who've already done
 their 9.4 development work might be blindsided by us slipping this into
 9.4 release.  So while I wouldn't have a problem with putting this change
 into 9.4 from a technical standpoint, it's hard to argue that it'd meet
 project norms from a development-process standpoint.

While I'd love to reduce the number of future installations without
this fix in place, I respect the decision to honor project policy. At
the same time, this change does not break anything. It introduces new
environment variables which change the behaviour, but behaves the old
way in the absence of those variables. So I guess it's not changing
the default behaviour in an incompatible way.
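
For context, a minimal sketch of the environment-variable behaviour being
described (the variable names here are my reading of the patch and should
be treated as assumptions):

#include <stdio.h>
#include <stdlib.h>

/* After forking a child, reset its OOM score by writing a configured
 * value to a configured file; with the variables unset, do nothing. */
static void
reset_oom_score(void)
{
    const char *file = getenv("PG_OOM_ADJUST_FILE");
    const char *value = getenv("PG_OOM_ADJUST_VALUE");
    FILE       *f;

    if (file == NULL)
        return;                 /* old behaviour: no adjustment */

    f = fopen(file, "w");
    if (f != NULL)
    {
        fputs(value ? value : "0", f);
        fclose(f);
    }
}

int
main(void)
{
    reset_oom_score();          /* in the patch this runs in forked children */
    return 0;
}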

BTW, does the project publish the feature-freeze deadlines and other
dates somewhere (apart from on this list)? I was looking the other day
and didn't find any reference. Either the commitfest app or the
'Developer' section of the wiki [1] seem to be good candidates for
this kind of information.

[1]: https://wiki.postgresql.org/wiki/Development_information

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-23 Thread Robert Haas
On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost sfr...@snowman.net wrote:
 I'm also of the opinion that this isn't strictly necessary for the
 initial RLS offering in PG- there's a clear way we could migrate
 existing users to a multi-policy system from a single-policy system.
 Sure, to get the performance and optimization benefits that we'd
 presumably have in the multi-policy case they'd need to re-work their
 RLS configuration, but for users who care, they'll likely be very happy
 to do so to gain those benefits.

I think a lot depends on the syntax we choose.  If we choose a syntax
that only makes sense in a single-policy framework, then I think
allowing upgrades to a multi-policy syntax is going to be really
difficult.  On the other hand, if we choose a syntax that allows
multiple policies, I suspect we can support multiple policies from the
beginning without much extra effort.

 - Require the user to specify in some way which of the available
 policies they want applied, and then apply only that one.

 I'd want to at least see a way to apply an ordering to the policies
 being applied, or have PG work out which one is cheapest and try that
 one first.

Cost-based comparison of policies that return different results
doesn't seem sensible to me.

 I think exactly the opposite, for the query planning reasons
 previously stated.  I think the policies will quickly get so
 complicated that they're no longer optimizable.  Here's a simple
 example:

 - Policy 1 allows the user to access rows for which complexfunc() returns 
 true.
 - Policy 2 allows the user to access rows for which a = 1.

 Most users have access only through policy 2, but some have access
 through policy 1.  Users who have access through policy 1 will always
 get a sequential scan,

 This is the thing which I most object to- if the quals being provided at
 any level are leakproof and would be able to reduce the returned set
 sufficiently that an index scan is the best bet, we should be doing
 that.  I don't anticipate the RLS quals to be as selective as the
 quals which the user is adding.

I think it would be a VERY bad idea to design the system around the
assumption that the RLS quals will be much more or less selective than
the user-supplied quals.  That's going to be different in different
environments.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] releaseOk and LWLockWaitForVar

2014-06-23 Thread Heikki Linnakangas

On 06/17/2014 03:17 PM, Andres Freund wrote:

LWLockWaitForVar() doesn't set releaseOk to true when waiting
again. Isn't that a bug?


LWLockWaitForVar() waits in LW_WAIT_UNTIL_FREE mode, because it's not 
interested in acquiring the lock, it just wants to be woken up when it's 
released (or the var is updated). LWLockRelease doesn't clear 
releaseOK when it wakes up a LW_WAIT_UNTIL_FREE-mode waiter.



What if there's another locker coming in after
LWLockWaitForVar() returns from the PGSemaphoreLock() but before it has
acquired the spinlock? Now, it might be that it's unproblematic because
of the specific way these locks are used right now, but it doesn't seem
like a good idea to leave it that way.


In that scenario, LWLockWaitForVar() will grab the spinlock, after the 
other process. What happens next depends on whether the value of the 
variable it guards was changed. If it was, LWLockWaitForVar() will see 
that it changed, and return false without waiting again. If the value 
didn't change, it will sleep until the new locker releases the lock. In 
either case, I don't see a problem with releaseOK. It seems correct as 
it is.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Atomics hardware support table supported architectures

2014-06-23 Thread Andres Freund
On 2014-06-23 09:28:19 -0700, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  On Mon, Jun 23, 2014 at 11:16 AM, Andres Freund and...@2ndquadrant.com 
  wrote:
  Since fetch-and-add is trivially implemented using CAS, there's not much
  need to distinguish between CAS and CAS + fetch_and_add. From my POV the
  restriction to just CAS/fetch_and_add isn't actually buying much. Pretty
  much all platforms but older gcc ones have atomic intrinsics since
  forever. Once you've dug up the documentation for one operation not
  adding two more or so doesn't save much.
 
  Again, the concern that was expressed at the developer's meeting was
  that the performance characteristics of the CAS loop might be
  significantly different from a native atomic op as to cause us
  heartburn.  I think that's a valid concern... but if everything in
  common use has both CAS and fetch-and-add, then I think there's
  probably no issue here.
 
 What I want to know is whether we are going to agree that the set of
 atomics is going to be CAS plus fetch_and_add plus *nothing else*.

It's going to be TAS, CAS, fetch_and_add, right? Since TAS is the only
thing supported on some platforms?

The only op I'm currently wondering about adding to that set is an atomic
exchange without compare. All platforms that support CAS also have a
non-comparing version of it.

Right now the patch also uses __sync_fetch_and_sub() in the generic gcc
implementation instead of doing the negation itself, but that's easily
fixable.

 Andres seems to envision that those will be a minimal set and then
 we'll freely use any other atomic op that seems handy as long as we can
 provide a fallback implementation of it.

Well, I *do* also want pg_atomic_fetch_and/or_u32() - but I'm totally
fine with those two only being implemented with CAS. On all
platforms. Otherwise the next scalability patch I'm going to submit will
just have to open code a CAS loop for it which doesn't seem to help.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] releaseOk and LWLockWaitForVar

2014-06-23 Thread Heikki Linnakangas

On 06/23/2014 05:38 PM, Amit Kapila wrote:

While looking at function LWLockWaitForVar(), especially below
code:

TRACE_POSTGRESQL_LWLOCK_ACQUIRE(T_NAME(l), T_ID(l), LW_EXCLUSIVE);

I think in this function tracing is done considering the Exclusive lock
is acquired, however it might have granted access because of
variable updation.  Basically this function's trace doesn't distinguish
whether the access is granted due to the reason that there is no other
exclusive locker or variable is updated.


Yeah. Not sure it's worth it to add new TRACE points for this, I'm not 
really familiar with the way the traces work or how people use them.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL access to database attributes

2014-06-23 Thread Vik Fearing
On 06/23/2014 06:21 PM, Robert Haas wrote:
 On Sun, Jun 22, 2014 at 2:59 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I found only one problem - first patch introduce a new property
 CONNECTION_LIMIT and replace previously used CONNECTION LIMIT in
 documentation. But CONNECTION LIMIT is still supported, but it is not
 documented. So for some readers it can look like breaking compatibility, but
 it is false. This should be documented better.
 
 Yeah, I think the old syntax should be documented also. 

Why do we want to document syntax that should eventually be deprecated?

 See, e.g., what we do for COPY.

Exactly.  We're still carrying around baggage from 7.2!

Backward compatibility: yes.
Backward documentation: no.
-- 
Vik


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SQL access to database attributes

2014-06-23 Thread Pavel Stehule
2014-06-23 18:39 GMT+02:00 Vik Fearing vik.fear...@dalibo.com:

 On 06/23/2014 06:21 PM, Robert Haas wrote:
  On Sun, Jun 22, 2014 at 2:59 PM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  I found only one problem - first patch introduce a new property
  CONNECTION_LIMIT and replace previously used CONNECTION LIMIT in
  documentation. But CONNECTION LIMIT is still supported, but it is not
  documented. So for some readers it can look like breaking
 compatibility, but
  it is false. This should be documented better.
 
  Yeah, I think the old syntax should be documented also.

 Why do we want to document syntax that should eventually be deprecated?


It is fair to our users. It can be deprecated, ok, we can write in the doc -
this feature will be deprecated in the next three years. Don't use it, but this
should be documented.

Pavel



  See, e.g., what we do for COPY.

 Exactly.  We're still carrying around baggage from 7.2!

 Backward compatibility: yes.
 Backward documentation: no.
 --
 Vik



Re: [HACKERS] Atomics hardware support table supported architectures

2014-06-23 Thread Robert Haas
On Mon, Jun 23, 2014 at 12:29 PM, Andres Freund and...@2ndquadrant.com wrote:
  That we have support for platforms that we haven't even documented as
  working (e.g. SuperH) or platforms that don't work in the realword
  (m32r) means that that one has to think about and research so many more
  edge cases that it's really hard to make progress. I don't know how
  often I've now sequentially gone through s_lock.h, s_lock.c,
  src/backend/port/tas to understand all the supported combinations.

 I agree that s_lock.h is the most painful part of this whole thing,
 mostly because I'd really like to get to the point where
 SpinLockAcquire() and SpinLockRelease() function as compiler barriers.

 s_lock.h is basically gone in my patch btw. Only old comments + defines
 for TAS/TAS_SPIN/S_UNLOCK etc mapping the old calls onto atomic flags
 remain.
 The new code should, hopefully, all be barrier safe.

Urp.  I'm not sure that's at all a good idea.  I don't love s_lock.h,
but if we make a wholesale change, we risk breaking things that work
today in obscure ways that are hard to find and debug.  I thought we
were talking about adding new facilities with their own fallbacks, not
massively reengineering the facilities we've already got.

 I'll omit !gcc PA-RISC unless somebody complains. I don't think I'll be
 able to modify hpux_hppa.s + build infrastructure correctly and there's
 no buildfarm. That means it'll require --disable-spinlocks.

I dunno whether we should care in that particular case, but in general
I don't think it's OK for lack of support for the *new* atomics to
prevent us from using the *existing* TAS support.

 Again, the concern that was expressed at the developer's meeting was
 that the performance characteristics of the CAS loop might be
 significantly different from a native atomic op as to cause us
 heartburn.  I think that's a valid concern... but if everything in
 common use has both CAS and fetch-and-add, then I think there's
 probably no issue here.

 Well, there's platforms where both CAS and atomic add are implemented
 using load linked/store conditional. So neither really is 100%
 native. But that's essentially how CAS et al are implemented in
 microcode on pretty much all somewhat recent cpus anyway.

I think on platforms that expose LL/SC, we have to assume that
spurious failures will be infrequent enough not to matter to
performance.  If they do, that's something that's going to have to be
fixed by the hardware manufacturer, not us.  There is a danger here
that we could end up with optimizations that are wins on some
platforms and losses on others, but I think we're going to have to
live with that.  Refusing to use atomic ops at all isn't a viable way
forward.
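
For concreteness, here is a minimal sketch of fetch-and-add built from a
CAS loop - the kind of generic fallback under discussion (GCC __atomic
builtins assumed; this is not the patch's actual code):

#include <stdbool.h>
#include <stdint.h>

static inline uint32_t
fetch_add_u32(uint32_t *ptr, uint32_t add)
{
	uint32_t	old = __atomic_load_n(ptr, __ATOMIC_RELAXED);

	/*
	 * Weak CAS: on LL/SC hardware this may fail spuriously, which just
	 * sends us around the loop again; 'old' is reloaded by the failed
	 * exchange.
	 */
	while (!__atomic_compare_exchange_n(ptr, &old, old + add, true,
										__ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST))
		;

	return old;
}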

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SQL access to database attributes

2014-06-23 Thread Robert Haas
On Mon, Jun 23, 2014 at 12:39 PM, Vik Fearing vik.fear...@dalibo.com wrote:
 On 06/23/2014 06:21 PM, Robert Haas wrote:
 On Sun, Jun 22, 2014 at 2:59 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 I found only one problem - the first patch introduces a new property
 CONNECTION_LIMIT and replaces the previously used CONNECTION LIMIT in
 the documentation. CONNECTION LIMIT is still supported, but it is no
 longer documented. So to some readers it can look like breaking
 compatibility, but it is false. This should be documented better.

 Yeah, I think the old syntax should be documented also.

 Why do we want to document syntax that should eventually be deprecated?

Because otherwise existing users will wonder if their dumps can still
be restored on newer systems.

And also, because documentation is, in general, a good thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-06-23 Thread Tom Lane
Gurjeet Singh gurj...@singh.im writes:
 On Mon, Jun 23, 2014 at 11:52 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 While this is clearly an improvement over what we had before, it's
 impossible to argue that it's a bug fix, and we are way past the 9.4
 feature freeze deadline.  In particular, packagers who've already done
 their 9.4 development work might be blindsided by us slipping this into
 9.4 release.  So while I wouldn't have a problem with putting this change
 into 9.4 from a technical standpoint, it's hard to argue that it'd meet
 project norms from a development-process standpoint.

 While I'd love to reduce the number of future installations without
 this fix in place, I respect the decision to honor project policy. At
 the same time, this change does not break anything. It introduces new
 environment variables which change the behaviour, but behaves the old
 way in the absence of those variables.

Uh, no, it doesn't.  We removed the dependence on -DLINUX_OOM_SCORE_ADJ.
If a packager is expecting that to still work in 9.4, he's going to be
unpleasantly surprised, because the system will silently fail to do what
he's expecting: it will run all the backend processes at no-OOM-kill
priority, which is likely to be bad.

It is possible to make packages that will work either way, along the lines
of the advice I sent to the Red Hat guys:
https://bugzilla.redhat.com/show_bug.cgi?id=1110969

but I think it's a bit late in the cycle to be telling packagers to do
that for 9.4.

 BTW, does the project publish the feature-freeze deadlines and other
 dates somewhere (apart from on this list)?

It's usually in the dev meeting minutes
https://wiki.postgresql.org/wiki/PgCon_2014_Developer_Meeting#9.5_Schedule

regards, tom lane




[HACKERS] checking for interrupts during heap insertion

2014-06-23 Thread Robert Haas
Hi,

While I was talking to Amit Kapila this morning, he mentioned to me that
there seem to be no CHECK_FOR_INTERRUPTS() calls anywhere in
heap_multi_insert() or the functions it calls. Should there be?

By way of contrast, heapgetpage() has this:

/*
 * Be sure to check for interrupts at least once per page.  Checks at
 * higher code levels won't be able to stop a seqscan that encounters many
 * pages' worth of consecutive dead tuples.
 */
CHECK_FOR_INTERRUPTS();

In heap_multi_insert(), we first do heap_prepare_insert() on each
tuple, which may involve dirtying many pages, since it handles TOAST.
Then, we loop over the tuples themselves and dirty a bunch more pages.
All of that will normally happen pretty quickly, but if the I/O
subsystem is very slow for some reason, such as due to heavy system
load, then it might take quite a long time.  I'm thinking we might
want a CHECK_FOR_INTERRUPTS() in the following two places:

1. Inside toast_save_datum, at the top of the loop that starts with
while (data_todo > 0).
2. Inside heap_multi_insert, at the top of the loop that starts with
while (ndone < ntuples), as sketched below.
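
In code form, the proposal amounts to something like this (a sketch
against the 9.4-era loop shapes in heapam.c and tuptoaster.c, not an
actual patch):

	/* In heap_multi_insert(): */
	while (ndone < ntuples)
	{
		CHECK_FOR_INTERRUPTS();		/* proposed: once per filled page */

		/* ... existing code: pin/lock a buffer, place as many of the
		 * remaining tuples as fit, mark it dirty, WAL-log ... */
	}

	/* In toast_save_datum(): */
	while (data_todo > 0)
	{
		CHECK_FOR_INTERRUPTS();		/* proposed: once per TOAST chunk */

		/* ... existing code: form and insert the next chunk tuple ... */
	}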

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Use a signal to trigger a memory context dump?

2014-06-23 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I wonder if it'd make sense to allow a signal to trigger a memory
 context dump? I and others more than once had the need to examine memory
 usage on production systems and using gdb isn't always realistic.
 I wonder if we could install a signal handler for some unused signal
 (e.g. SIGPWR) to dump memory.
 I'd also considered adding a SQL function that uses the SIGUSR1 signal
 multiplexing for the purpose but that's not necessarily nice if you have
 to investigate while SQL access isn't yet possible. There's also the
 problem that not all possibly interesting processes use the sigusr1
 signal multiplexing.

Well, you can't just have the signal handler call MemoryContextStats
directly.  (Even if the memory manager's state were 100% interrupt-safe,
which it ain't, fprintf itself might not be safe either.)

The closest approximation that I think would be reasonable is to
set a flag that would be noticed by the next CHECK_FOR_INTERRUPTS
macro.  So you're already buying into the assumption that the process
executes CHECK_FOR_INTERRUPTS fairly often.  Which probably means
that assuming it's using the standard sigusr1 handler isn't a big
extra limitation.
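
That is, the handler itself would only set a flag, along these lines (a
standalone sketch; MemoryDumpPending and both function names are
hypothetical, not existing PostgreSQL symbols):

#include <signal.h>

static volatile sig_atomic_t MemoryDumpPending = 0;

/* Signal handler: async-signal-safe, it only records the request. */
static void
handle_dump_request(int signo)
{
	(void) signo;
	MemoryDumpPending = 1;
}

/*
 * Called from a CHECK_FOR_INTERRUPTS()-style point, where it is safe
 * to walk the memory contexts and use fprintf().
 */
static void
process_dump_request(void)
{
	if (MemoryDumpPending)
	{
		MemoryDumpPending = 0;
		/* MemoryContextStats(TopMemoryContext); */
	}
}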

regards, tom lane




Re: [HACKERS] Minmax indexes

2014-06-23 Thread Alvaro Herrera
Heikki Linnakangas wrote:
 Some comments, aside from the design wrt. bounding boxes etc. :

Thanks.  I haven't commented on that sub-thread because I think it's
possible to come up with a reasonable design that solves the issue by
adding a couple of amprocs.  I need to do some more thinking to ensure
it is really workable, and then I'll post my ideas.

 On 06/15/2014 05:34 AM, Alvaro Herrera wrote:
 Robert Haas wrote:

 If I understand the code correctly, the revmap is a three-level deep
 structure. The bottom level consists of regular revmap pages, and
 each regular revmap page is filled with ItemPointerDatas, which
 point to the index tuples. The middle level consists of array
 revmap pages, and each array revmap page contains an array of
 BlockNumbers of the regular revmap pages. The top level is an
 array of BlockNumbers of the array revmap pages, and it is stored in
 the metapage.

Yep, that's correct.  Essentially, we still have the revmap as a linear
space (containing TIDs); the other two levels on top of that are only
there to enable locating the physical page numbers for each revmap
logical page.  I make one exception that the first logical revmap page
is always stored in page 1, to optimize the case of a smallish table
(~1360 page ranges; approximately 1.3 gigabytes of data at 128 pages per
range, or 170 megabytes at 16 pages per range.)

Each page has a page header (24 bytes) and special space (4 bytes), so
it has 8192-28=8164 bytes available for data, so 1360 item pointers.

 With 8k block size, that's just enough to cover the full range of
 2^32-1 blocks that you'll need if you set mm_pages_per_range=1. Each
 regular revmap page can store about 8192/6 = 1365 item pointers,
 each array revmap page can store about 8192/4 = 2048 block
 references, and the size of the top array is 8192/4. That's just
 enough; to store the required number of array pages in the top
 array, the array needs to be (2^32/1365)/2048 = 1536 elements large.
 
 But with 4k or smaller blocks, it's not enough.

Yeah.  As I said elsewhere, actual useful values are likely to be close
to the read-ahead setting of the underlying disk; by default that'd be
16 pages (128kB), but I think it's common advice to increase the kernel
setting to improve performance.  I don't think we need to prevent
minmax indexes with pages_per_range=1, but I don't think we need to
ensure that that setting works with the largest tables, either, because
it doesn't make any sense to set it up like that.

Also, while there are some recommendations to set up a system with
larger page sizes (32kB), I have never seen any recommendation to set it
lower.  It wouldn't make sense to build a system that has very large
tables and use a smaller page size.

So in other words, yes, you're correct that the mechanism doesn't work
in some cases (small page size and index configured for highest level of
detail), but the conditions are such that I don't think it matters.

ISTM the thing to do here is to do the math at index creation time, and
if we find that we don't have enough space in the metapage for all array
revmap pointers we need, bail out and require the index to be created
with a larger pages_per_range setting.
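
For what it's worth, that check is simple arithmetic, along these lines
(a standalone sketch; the real code would use BLCKSZ and the actual page
header sizes rather than these hard-coded numbers):

#include <stdbool.h>
#include <stdint.h>

static bool
revmap_fits(uint64_t blcksz, uint64_t pages_per_range)
{
	uint64_t	ptrs_per_page = (blcksz - 28) / 6;	/* TIDs per revmap page */
	uint64_t	refs_per_page = blcksz / 4;	/* BlockNumbers per array page */
	uint64_t	ranges = (UINT64_C(1) << 32) / pages_per_range;
	uint64_t	regular_pages = (ranges + ptrs_per_page - 1) / ptrs_per_page;
	uint64_t	array_pages = (regular_pages + refs_per_page - 1) / refs_per_page;

	/* the top-level array of array-page numbers must fit in the metapage */
	return array_pages <= refs_per_page;
}

With blcksz = 8192 this accepts pages_per_range = 1; with blcksz = 4096
it rejects it, matching the numbers upthread.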

 I wonder if it would be simpler to just always store the revmap
 pages in the beginning of the index, before any other pages. Finding
 the revmap page would then be just as easy as with a separate fork.
 When the table/index is extended so that a new revmap page is
 needed, move the existing page at that block out of the way. Locking
 needs some consideration, but I think it would be feasible and
 simpler than you have now.

Moving index items around is not easy, because you'd have to adjust the
revmap to rewrite the item pointers.

 I have followed the suggestion by Amit to overwrite the index tuple when
 a new heap tuple is inserted, instead of creating a separate index
 tuple.  This saves a lot of index bloat.  This required a new entry
 point in bufpage.c, PageOverwriteItemData().  bufpage.c also has a new
 function PageIndexDeleteNoCompact which is similar in spirit to
 PageIndexMultiDelete except that item pointers do not change.  This is
 necessary because the revmap stores item pointers, and such reference
 would break if we were to renumber items in index pages.
 
 ISTM that when the old tuple cannot be updated in-place, the new
 index tuple is inserted with mm_doinsert(), but the old tuple is
 never deleted.

It's deleted by the next vacuum.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Atomics hardware support table & supported architectures

2014-06-23 Thread Andres Freund
On 2014-06-23 12:46:11 -0400, Robert Haas wrote:
 On Mon, Jun 23, 2014 at 12:29 PM, Andres Freund and...@2ndquadrant.com 
 wrote:
   That we have support for platforms that we haven't even documented as
   working (e.g. SuperH) or platforms that don't work in the real world
   (m32r) means that one has to think about and research so many more
   edge cases that it's really hard to make progress. I don't know how
   often I've now sequentially gone through s_lock.h, s_lock.c,
   src/backend/port/tas to understand all the supported combinations.
 
  I agree that s_lock.h is the most painful part of this whole thing,
  mostly because I'd really like to get to the point where
  SpinLockAcquire() and SpinLockRelease() function as compiler barriers.
 
  s_lock.h is basically gone in my patch btw. Only old comments + defines
  for TAS/TAS_SPIN/S_UNLOCK etc mapping the old calls onto atomic flags
  remain.
  The new code should, hopefully, all be barrier safe.
 
 Urp.  I'm not sure that's at all a good idea.

I'm not sure either. But it got harder and harder to do it without that
because of the currently interwoven dependencies. Barriers depend on
spinlocks, atomics depend on barriers, atomics depend on spinlocks. And
for a useful generic spinlock implementation spinlocks depend on
atomics.

And then there's the problem that the spinlock based atomics emulation
needs to influence the semaphore spinlock implementation because you
otherwise can get deadlocks (imagine an atomic var being manipulated
while a spinlock is held. If they map to the same semaphore...).

I guess it's possible to remove all traces of s_lock.h from barrier.h;
add an atomics-based fallback to s_lock.h, forceable using a define; and
make the public part of the spinlock-based fallback not require the
spinlock implementation somehow.
But even so, I think we should get rid of s_lock.h in a later commit for
9.5.

 I don't love s_lock.h,
 but if we make a wholesale change, we risk breaking things that work
 today in obscure ways that are hard to find and debug.  I thought we
 were talking about adding new facilities with their own fallbacks, not
 massively reengineering the facilities we've already got.

I think we can separate it if we want, but I doubt it'll really make
stuff simpler.

  I'll omit !gcc PA-RISC unless somebody complains. I don't think I'll be
  able to modify hpux_hppa.s + build infrastructure correctly and there's
  no buildfarm. That means it'll require --disable-spinlocks.
 
 I dunno whether we should care in that particular case, but in general
 I don't think it's OK for lack of support for the *new* atomics to
 prevent us from using the *existing* TAS support.

I think I'll just need help from Tom for that case.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] tab completion for setting search_path

2014-06-23 Thread Robert Haas
On Mon, Jun 23, 2014 at 9:10 AM, Kevin Grittner kgri...@ymail.com wrote:
 Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-22 20:02:57 -0700, Tom Lane wrote:
 Ian Barwick i...@2ndquadrant.com writes:
 On 23/06/14 00:58, Andres Freund wrote:
 I thought about committing this but couldn't get over this bit. If you
  type SELECT * FROM pg_cat<tab> it'll get autocompleted to
  pg_catalog.pg_ and pg_temp<tab> will list all the temp schemas
  including the numeric and toast ones. So we have precedent for *not*
  bothering about excluding any schemas. I don't think we should start
  doing so in a piecemeal fashion in an individual command's completion.

 There is an exception of sorts already for system schemas, in that although
 SELECT * FROM p<tab> will list the system schemas, it will not list any
 tables from them, and won't until SELECT * FROM pg_<tab> is entered
 (see note in tab-complete.c around line 3722).

 Personally I'd be mildly annoyed if every SET search_path TO p<tab>
 resulted in all the system schemas being displayed when all I want is
 public; how about having these listed only once pg_ is entered, i.e.
 SET search_path TO pg_<tab>?

 I think there is a pretty strong practical argument for excluding the
 pg_temp and pg_toast schemas from completion for search_path - namely:
 when does anyone ever need to include those in their search_path
 explicitly?

 Infrequently, yes. I've only done it when trying to break stuff ;)

 The use-case for including pg_catalog in your path is perhaps a bit
 greater, but not by much.

 I don't know. It feels like inappropriate nannyism to me. More
 confusing than actually helpful. The schemas are there, so they should
 get autocompleted.
 But anyway, the common opinion seems to be swinging against my position,
 so let's do it that way.

 I would be for excluding the pg_toast, pg_toast_temp_n, and
 pg_temp_n schemas, and including public and pg_catalog.

+1.
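
For reference, in tab-complete.c's query-macro style the exclusion could
look roughly like this (a sketch, not the submitted patch; note that
'pg\_toast%' also covers the pg_toast_temp_n schemas):

#define Query_for_list_of_search_path_schemas \
" SELECT pg_catalog.quote_ident(nspname) "\
"   FROM pg_catalog.pg_namespace "\
"  WHERE nspname NOT LIKE 'pg\\_temp\\_%%' "\
"    AND nspname NOT LIKE 'pg\\_toast%%' "\
"    AND substring(pg_catalog.quote_ident(nspname),1,%d)='%s'"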

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Use a signal to trigger a memory context dump?

2014-06-23 Thread Andres Freund
On 2014-06-23 10:07:36 -0700, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I wonder if it'd make sense to allow a signal to trigger a memory
  context dump? I and others more than once had the need to examine memory
  usage on production systems and using gdb isn't always realistic.
  I wonder if we could install a signal handler for some unused signal
  (e.g. SIGPWR) to dump memory.
  I'd also considered adding a SQL function that uses the SIGUSR1 signal
  multiplexing for the purpose but that's not necessarily nice if you have
  to investigate while SQL access isn't yet possible. There's also the
  problem that not all possibly interesting processes use the sigusr1
  signal multiplexing.
 
 Well, you can't just have the signal handler call MemoryContextStats
 directly.  (Even if the memory manager's state were 100% interrupt-safe,
 which it ain't, fprintf itself might not be safe either.)

Yea. And fprintf() definitely isn't.

 The closest approximation that I think would be reasonable is to
 set a flag that would be noticed by the next CHECK_FOR_INTERRUPTS
 macro.  So you're already buying into the assumption that the process
 executes CHECK_FOR_INTERRUPTS fairly often.  Which probably means
 that assuming it's using the standard sigusr1 handler isn't a big
 extra limitation.

There seem to be far more subsystems doing CHECK_FOR_INTERRUPTS than
using SIGUSR1 multiplexing. Several processes have their own SIGUSR1
handlers:
* bgworkers (which certainly is a major candidate for this. And: isn't
  this a bug? Think recovery conflicts.)
* startup process (certainly interesting as well)
* checkpointer
* walreceiver
* walsender
* wal writer
* bgwriter
* archiver
* syslogger

At least bgworkers, startup process, walsenders are definitely
interesting from this POV.

It very well might be best to provide a common sigusr1 implementation
supporting a subset of multiplexing for some of those since they
essentially all do the same... Although that'd require a fair bit of
surgery in procsignal.c
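
Sketching the multiplexed route (PROCSIG_DUMP_MEMORY is an invented name;
the point is just that the ProcSignalReason enum in procsignal.h would
grow one member):

typedef enum
{
	PROCSIG_CATCHUP_INTERRUPT,	/* sinval catchup interrupt */
	PROCSIG_NOTIFY_INTERRUPT,	/* listen/notify interrupt */
	/* ... existing reasons ... */
	PROCSIG_DUMP_MEMORY,		/* hypothetical: request a memory context dump */

	NUM_PROCSIGNALS				/* must be last */
} ProcSignalReason;

The sigusr1 handler side would then only set a flag for the new reason,
to be acted on at the next CHECK_FOR_INTERRUPTS(), per Tom's point
upthread.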

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] Re: [HACKERS] please review source(SQLServer compatible)

2014-06-23 Thread Kevin Grittner
Vik Fearing vik.fear...@dalibo.com wrote:
 On 06/23/2014 04:51 PM, rohtodeveloper wrote:
 1. SQL statement support
   INSERT statement without INTO keyword
   DELETE statement without FROM keyword

 Why would we want this?

I'm pretty sure that the only argument for it is to ease migration
of software from other DBMS products which allow that non-standard
syntax for people who have chosen to use the non-standard form of
the statement instead of the standard syntax (which is also
available in all cases that I know of).

If the SQL standard were static, I would actually lean toward
allowing it, to make it easier for people to switch to PostgreSQL.
The biggest down side I see is the possibility that some future
version of the standard might implement some new syntax which is
more difficult to implement if we need to also support this
non-standard variation.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-06-23 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Stephen Frost sfr...@snowman.net writes:
  I'd expect a catalog table or perhaps changes to pg_class (maybe other
  things also..) to define what gets logged.
 
 How exactly will that work for log messages generated in contexts where
 we do not have working catalog access?  (postmaster, crash recovery,
 or pretty much anywhere where we're not in a valid transaction.)

Certainly a great question and we may have to address it by not
supporting changes immediately (in other words, cache things at backend
start, or only at specific times), or by reviewing what logging we do at
those times and working out which of those can be controlled through the
catalog and which can't.  The logging which can't be managed through the
catalog would have to be managed through GUCs or in another way.

There's a difference between being able to have finer grained control
over certain log messages and having every single ereport() query the
catalog.  I'm not advocating the latter.

 This strikes me as much like the periodic suggestions we hear to get
 rid of the GUC infrastructure in favor of keeping all those settings
 in a table.  It doesn't work because too much of that info is needed
 below the level of working table access.

I'm not suggesting this.

Thanks,

Stephen




Re: [Fwd: Re: [HACKERS] proposal: new long psql parameter --on-error-stop]

2014-06-23 Thread Pavel Stehule
Hello

I am sending a little bit modified patch per Fujii's comments - but I am
not able to fix it further; that is a task for someone with better
English skills than I have.

Regards

Pavel


2014-06-23 10:02 GMT+02:00 Fujii Masao masao.fu...@gmail.com:

 On Mon, Jun 23, 2014 at 12:04 AM, Pavel Stehule pavel.steh...@gmail.com
 wrote:
  Hello
 
  third version with Erik's update

 Here are some of my comments:

 The psql documentation needs to be updated. At least the description of
 the new option this patch adds needs to be added to the document.

 +printf(_("  --help-variables     list of available
 configuration variables (options), then exit\n"));

 We should get rid of "of" from the message, or add "show" in front of
 "list of"?

 +printf(_("  ECHO               write all input lines to standard
 output\n"));

 This message seems not to be correct when ECHO=queries is set.

 +printf(_("  COMP_KEYWORD_CASE  determines which letter case to
 use when completing an SQL key word\n"));
 +printf(_("  DBNAME             name of currently connected
 database\n"));
 +printf(_("  ECHO               write all input lines to standard
 output\n"));

 I found that some help message lines use the base form of a verb, but
 others do not.
 Should we standardize them?

 +printf(_("  PROMPT1, PROMPT2, PROMPT3  specify the psql prompt\n"));

 When the option name field is long, we should add a new line just
 after the name field
 and align the starting position of the option explanation field. That is,
 for example, the above should be

 printf(_("  PROMPT1, PROMPT2, PROMPT3\n"
   "  specify the psql prompt\n"));

 +printf(_("  ON_ERROR_ROLLBACK  when on, ROLLBACK on error\n"));

 This message seems incorrect to me. When this option is on and an error
 occurs in a transaction, the transaction continues rather than being
 rolled back, IIUC.
 I did not check whole help messages yet, but ISTM some messages are not
 correct.
 It's better to check them again.
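 
 For example, a more accurate wording might be (only a suggestion):
 
 printf(_("  ON_ERROR_ROLLBACK  when on, wrap each statement in an implicit savepoint so an error does not abort the transaction\n"));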

 +printf(_("  PSQL_RC            alternative location of the user's
 .psqlrc file\n"));

 Typo: PSQL_RC should be PSQLRC

 +printf(_("  PGDATABASE         same as the dbname connection
 parameter\n"));
 +printf(_("  PGHOST             same as the host connection
 parameter\n"));
 +printf(_("  PGPORT             same as the port connection
 parameter\n"));
 +printf(_("  PGUSER             same as the user connection
 parameter\n"));
 +printf(_("  PGPASSWORD         possibility to set password (not
 recommended)\n"));
 +printf(_("  PGPASSFILE         password file (default ~/.pgpass)\n"));

 I don't think that psql needs to display help messages even for the
 environment variables supported by libpq.

 Regards,

 --
 Fujii Masao

commit 4d8a267870f15a22818da226f72223db86944636
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Mon Jun 23 19:38:41 2014 +0200

without comments

diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index 3aa3c16..e960f34 100644
--- a/src/bin/psql/help.c
+++ b/src/bin/psql/help.c
@@ -78,12 +78,13 @@ usage(void)
 	printf(_("  -f, --file=FILENAME  execute commands from file, then exit\n"));
 	printf(_("  -l, --list           list available databases, then exit\n"));
 	printf(_("  -v, --set=, --variable=NAME=VALUE\n"
-			 "                       set psql variable NAME to VALUE\n"));
+			 "                       set psql variable NAME to VALUE e.g.: -v ON_ERROR_STOP=1\n"));
 	printf(_("  -V, --version        output version information, then exit\n"));
 	printf(_("  -X, --no-psqlrc      do not read startup file (~/.psqlrc)\n"));
 	printf(_("  -1 (\"one\"), --single-transaction\n"
 			 "                       execute as a single transaction (if non-interactive)\n"));
 	printf(_("  -?, --help           show this help, then exit\n"));
+	printf(_("  --help-variables     list of available configuration variables (options), then exit\n"));
 
 	printf(_("\nInput and output options:\n"));
 	printf(_("  -a, --echo-all       echo all input from script\n"));
@@ -279,6 +280,84 @@ slashUsage(unsigned short int pager)
 }
 
 
+/*
+ * show list of available variables (options) from command line
+ */
+void
+help_variables(void)
+{
+	printf(_("List of variables (options) for use from command line.\n"));
+
+	printf(_("psql variables:\n"));
+	printf(_("Usage:\n"));
+	printf(_("  psql --set=NAME=VALUE\n  or \\set NAME VALUE in interactive mode\n\n"));
+
+	printf(_("  AUTOCOMMIT         successful SQL commands are automatically committed\n"));
+	printf(_("  COMP_KEYWORD_CASE  determines which letter case to use when completing an SQL key word\n"));
+	printf(_("  DBNAME             name of currently connected database\n"));
+	printf(_("  ECHO               controls what input can be written to standard output [all, queries]\n"));
+	printf(_("  ECHO_HIDDEN        display internal queries (same as -E option)\n"));
+	printf(_("  ENCODING           current client character set encoding\n"));
+	printf(_("  FETCH_COUNT        fetch many rows 

[HACKERS] Re: [HACKERS] Re: [HACKERS] please review source(SQLServer compatible)

2014-06-23 Thread Pavel Stehule
2014-06-23 19:22 GMT+02:00 Kevin Grittner kgri...@ymail.com:

 Vik Fearing vik.fear...@dalibo.com wrote:
  On 06/23/2014 04:51 PM, rohtodeveloper wrote:
  1. SQL statement support
    INSERT statement without INTO keyword
    DELETE statement without FROM keyword
 
  Why would we want this?

 I'm pretty sure that the only argument for it is to ease migration
 of software from other DBMS products which allow that non-standard
 syntax for people who have chosen to use the non-standard form of
 the statement instead of the standard syntax (which is also
 available in all cases that I know of).


There is a fork of PostgreSQL, http://www.tpostgres.org/se/, that can do
this task better. We no longer support special syntax for Oracle or for
DB2, and I don't see any reason why we should do so for T-SQL.

Moreover, this is usually the simplest part of a migration from the
Sybase family to PostgreSQL - the concepts of stored procedures, temp
tables, and so on are totally different, so a simple migration is not
possible without relatively hard changes to the PostgreSQL parser.



 If the SQL standard were static, I would actually lean toward
 allowing it, to make it easier for people to switch to PostgreSQL.
 The biggest down side I see is the possibility that some future
 version of the standard might implement some new syntax which is
 more difficult to implement if we need to also support this
 non-standard variation.


yes.

Regards

Pavel



  --
 Kevin Grittner
 EDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company





Re: [HACKERS] idle_in_transaction_timeout

2014-06-23 Thread Josh Berkus
On 06/22/2014 09:02 PM, Abhijit Menon-Sen wrote:
 I (somewhat reluctantly) agree with Kevin that
 idle_in_transaction_session_timeout (for FATAL) and
 idle_transaction_timeout (for ERROR) would work.

Given that an IIT timeout has been a TODO for at least 6 years before
being addressed, I'm not convinced that we need to worry about what an
eventual error vs. fatal timeout should be named or how it should be
scoped.  Let's attack that when someone actually shows an inclination to
work on it.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] /proc/self/oom_adj is deprecated in newer Linux kernels

2014-06-23 Thread Gurjeet Singh
On Mon, Jun 23, 2014 at 12:49 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Gurjeet Singh gurj...@singh.im writes:
 While I'd love to reduce the number of future installations without
 this fix in place, I respect the decision to honor project policy. At
 the same time, this change does not break anything. It introduces new
 environment variables which change the behaviour, but behaves the old
 way in the absence of those variables.

 Uh, no, it doesn't.  We removed the dependence on -DLINUX_OOM_SCORE_ADJ.
 If a packager is expecting that to still work in 9.4, he's going to be
 unpleasantly surprised, because the system will silently fail to do what
 he's expecting: it will run all the backend processes at no-OOM-kill
 priority, which is likely to be bad.

True. I didn't think from a packager's perspective.

 It is possible to make packages that will work either way, along the lines
 of the advice I sent to the Red Hat guys:
 https://bugzilla.redhat.com/show_bug.cgi?id=1110969

 but I think it's a bit late in the cycle to be telling packagers to do
 that for 9.4.

Understandable.

 BTW, does the project publish the feature-freeze deadlines and other
 dates somewhere (apart from on this list)?

 It's usually in the dev meeting minutes
 https://wiki.postgresql.org/wiki/PgCon_2014_Developer_Meeting#9.5_Schedule

Thanks. But it doesn't spell out the specific dates, or even the month
of feature-freeze.

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com




Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-23 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Jun 18, 2014 at 2:18 PM, Stephen Frost sfr...@snowman.net wrote:
  I'm also of the opinion that this isn't strictly necessary for the
  initial RLS offering in PG- there's a clear way we could migrate
  existing users to a multi-policy system from a single-policy system.
  Sure, to get the performance and optimization benefits that we'd
  presumably have in the multi-policy case they'd need to re-work their
  RLS configuration, but for users who care, they'll likely be very happy
  to do so to gain those benefits.
 
 I think a lot depends on the syntax we choose.  If we choose a syntax
 that only makes sense in a single-policy framework, then I think
 allowing upgrades to a multi-policy syntax is going to be really
 difficult.  On the other hand, if we choose a syntax that allows
 multiple policies, I suspect we can support multiple policies from the
 beginning without much extra effort.

What are these policies going to depend on?  Will they be allowed to
overlap?  I don't see multi-policy support as being very easily added.

If there are specific ways to design the syntax which would make it
easier to support multiple policies in the future, I'm all for it.  Have
any specific thoughts regarding that?

  - Require the user to specify in some way which of the available
  policies they want applied, and then apply only that one.
 
  I'd want to at least see a way to apply an ordering to the policies
  being applied, or have PG work out which one is cheapest and try that
  one first.
 
 Cost-based comparison of policies that return different results
 doesn't seem sensible to me.

I keep coming back to the thought that, really, having multiple
overlapping policies just adds unnecessary complication to the system
for not much gain in real functionality.  Being able to specify a policy
per-role might be useful, but that's only one dimension and I can
imagine a lot of other dimensions that one might want to use to control
which policy is used.

  I think exactly the opposite, for the query planning reasons
  previously stated.  I think the policies will quickly get so
  complicated that they're no longer optimizable.  Here's a simple
  example:
 
  - Policy 1 allows the user to access rows for which complexfunc() returns 
  true.
  - Policy 2 allows the user to access rows for which a = 1.
 
  Most users have access only through policy 2, but some have access
  through policy 1.  Users who have access through policy 1 will always
  get a sequential scan,
 
  This is the thing which I most object to- if the quals being provided at
  any level are leakproof and would be able to reduce the returned set
  sufficiently that an index scan is the best bet, we should be doing
  that.  I don't anticipate the RLS quals to be as selective as the
  quals which the user is adding.
 
 I think it would be a VERY bad idea to design the system around the
 assumption that the RLS quals will be much more or less selective than
 the user-supplied quals.  That's going to be different in different
 environments.

Fine - but do you really see the query planner having a problem pushing
down whichever is the more selective qual, if the user-provided qual is
marked as leakproof?

I realize that you want multiple policies because you'd like a way for
the RLS qual to be made simpler for certain cases while also having more
complex quals for other cases.  What I keep waiting to hear is exactly
how you want to specify which policy is used because that's where it
gets ugly and complicated.  I still really don't like the idea of trying
to apply multiple policies inside of a single query execution.

Thanks,

Stephen




Re: [HACKERS] pgaudit - an auditing extension for PostgreSQL

2014-06-23 Thread Abhijit Menon-Sen
At 2014-06-23 08:50:33 -0400, sfr...@snowman.net wrote:
 
 I'm not a huge fan of adding this as a contrib module

I added it to the CF because we're interested in auditing functionality
for 9.5, and as far as I can tell, there's nothing better available. At
the moment, the contrib module has the advantage that it exists :-) and
works with 9.[34] (and could be made to work with 9.2, though I didn't
bother for the initial submission).

 unless we can be
 quite sure that there's a path forward from here to a rework of the
 logging in core which would actually support the features pg_audit is
 adding, without a lot of pain and upgrade issues.

What sort of pain and upgrade issues did you have in mind?

 I'd expect a catalog table or perhaps changes to pg_class (maybe other
 things also..) to define what gets logged.. 

Please explain?

(I wish extensions were able to add reloptions. That would have made it
relatively easy for us to implement per-object audit logging.)

 I'd also like to see the ability to log based on the connecting user,
 and we need to log under what privileges a command is executing

I imagine it's not useful to point out that you can do the former with
pgaudit (using ALTER ROLE … SET), and that we log the effective userid
for the latter (though maybe you had something more in mind)…

 and, really, a whole host of other things..

…but there's not a whole lot I can do with that, either.

Is the minimal set of auditing features that we would need in-core very
different from what pgaudit already has?

-- Abhijit




Re: [HACKERS] Minmax indexes

2014-06-23 Thread Heikki Linnakangas

On 06/23/2014 08:07 PM, Alvaro Herrera wrote:

Heikki Linnakangas wrote:

With 8k block size, that's just enough to cover the full range of
2^32-1 blocks that you'll need if you set mm_pages_per_range=1. Each
regular revmap page can store about 8192/6 = 1365 item pointers,
each array revmap page can store about 8192/4 = 2048 block
references, and the size of the top array is 8192/4. That's just
enough; to store the required number of array pages in the top
array, the array needs to be (2^32/1365)/2048 = 1536 elements large.

But with 4k or smaller blocks, it's not enough.


Yeah.  As I said elsewhere, actual useful values are likely to be close
to the read-ahead setting of the underlying disk; by default that'd be
16 pages (128kB), but I think it's common advice to increase the kernel
setting to improve performance.


My gut feeling is that it might well be best to set pages_per_range=1.
Even if you do the same amount of I/O, thanks to kernel read-ahead, you
might still avoid processing a lot of tuples. But we would need to see
some benchmarks to know.



I don't think we need to prevent
minmax indexes with pages_per_range=1, but I don't think we need to
ensure that that setting works with the largest tables, either, because
it doesn't make any sense to set it up like that.

Also, while there are some recommendations to set up a system with
larger page sizes (32kB), I have never seen any recommendation to set it
lower.  It wouldn't make sense to build a system that has very large
tables and use a smaller page size.

So in other words, yes, you're correct that the mechanism doesn't work
in some cases (small page size and index configured for highest level of
detail), but the conditions are such that I don't think it matters.

ISTM the thing to do here is to do the math at index creation time, and
if we find that we don't have enough space in the metapage for all array
revmap pointers we need, bail out and require the index to be created
with a larger pages_per_range setting.


Yeah, I agree that would be acceptable.

I feel that the below would nevertheless be simpler:


I wonder if it would be simpler to just always store the revmap
pages in the beginning of the index, before any other pages. Finding
the revmap page would then be just as easy as with a separate fork.
When the table/index is extended so that a new revmap page is
needed, move the existing page at that block out of the way. Locking
needs some consideration, but I think it would be feasible and
simpler than you have now.


Moving index items around is not easy, because you'd have to adjust the
revmap to rewrite the item pointers.


Hmm. Two alternative schemes come to mind:

1. Move each index tuple off the page individually, updating the revmap 
while you do it, until the page is empty. Updating the revmap for a 
single index tuple isn't difficult; you have to do it anyway when an 
index tuple is replaced. (MMTuples don't contain a heap block number 
ATM, but IMHO they should, see below)


2. Store the new block number of the page that you moved out of the way 
in the revmap page, and leave the revmap pointers unchanged. The revmap 
pointers can be updated later, lazily.


Both of those seem pretty straightforward.


I have followed the suggestion by Amit to overwrite the index tuple when
a new heap tuple is inserted, instead of creating a separate index
tuple.  This saves a lot of index bloat.  This required a new entry
point in bufpage.c, PageOverwriteItemData().  bufpage.c also has a new
function PageIndexDeleteNoCompact which is similar in spirit to
PageIndexMultiDelete except that item pointers do not change.  This is
necessary because the revmap stores item pointers, and such reference
would break if we were to renumber items in index pages.


ISTM that when the old tuple cannot be updated in-place, the new
index tuple is inserted with mm_doinsert(), but the old tuple is
never deleted.


It's deleted by the next vacuum.


Ah I see. Vacuum reads the whole index, and builds an in-memory hash 
table that contains an ItemPointerData for every tuple in the index. 
Doesn't that require a lot of memory, for a large index? That might be 
acceptable - you ought to have plenty of RAM if you're pushing around 
multi-terabyte tables - but it would nevertheless be nice to not have a 
hard requirement for something as essential as vacuum.


In addition to the hash table, remove_deletable_tuples() pallocs an 
array to hold an ItemPointer for every index tuple about to be removed. 
A single palloc is limited to 1GB, so that will fail outright if there 
are more than ~179 million dead index tuples. You're unlikely to hit 
that in practice, but if you do, you'll never be able to vacuum the 
index. So that's not very nice.


Wouldn't it be simpler to remove the old tuple atomically with inserting 
the new tuple and updating the revmap? Or at least mark the old tuple as 
deletable, so that vacuum can just delete it, without building the large 
hash 

Re: [HACKERS] checking for interrupts during heap insertion

2014-06-23 Thread Heikki Linnakangas

On 06/23/2014 08:07 PM, Robert Haas wrote:

While talking to Amit Kapila this morning, he mentioned to me that
there seem to be no CHECK_FOR_INTERRUPTS() calls anywhere in
heap_multi_insert() or the functions it calls. Should there be?


Haven't heard any complaints, but I guess..


By way of contrast, heapgetpage() has this:

 /*
  * Be sure to check for interrupts at least once per page.  Checks at
  * higher code levels won't be able to stop a seqscan that encounters many
  * pages' worth of consecutive dead tuples.
  */
 CHECK_FOR_INTERRUPTS();

In heap_multi_insert(), we first do heap_prepare_insert() on each
tuple, which may involve dirtying many pages, since it handles TOAST.
Then, we loop over the tuples themselves and dirty a bunch more pages.
All of that will normally happen pretty quickly, but if the I/O
subsystem is very slow for some reason, such as due to heavy system
load, then it might take quite a long time.  I'm thinking we might
want a CHECK_FOR_INTERRUPTS() in the following two places:

1. Inside toast_save_datum, at the top of the loop that starts with
while (data_todo > 0).
2. Inside heap_multi_insert, at the top of the loop that starts with
while (ndone < ntuples).


Seems reasonable.

- Heikki





Re: [HACKERS] Minmax indexes

2014-06-23 Thread Robert Haas
On Thu, Jun 19, 2014 at 12:32 PM, Greg Stark st...@mit.edu wrote:
 On Wed, Jun 18, 2014 at 4:51 AM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 Implementing something is a good way to demonstrate how it would look like.
 But no, I don't insist on implementing every possible design whenever a new
 feature is proposed.

 I liked Greg's sketch of what the opclass support functions would be. It
 doesn't seem significantly more complicated than what's in the patch now.

 As a counter-point to my own point, there will be nothing stopping us
 in the future from generalizing things. Dealing with catalogs is
 mostly book-keeping headaches and careful work. It's something that
 might be well-suited for a GSOC or first patch from someone looking to
 familiarize themselves with the system architecture. It's hard to
 invent a whole new underlying infrastructure at the same time as
 dealing with all that book-keeping and it's hard for someone
 familiarizing themselves with the system to also have a great new
 idea. Having tasks like this that are easy to explain and that mentor
 understands well can be easier to manage than tasks where the newcomer
 has some radical new idea.

Generalizing this in the future would be highly likely to change the
on-disk format for existing indexes, which would be a problem for
pg_upgrade.  I think we will likely be stuck with whatever the initial
on-disk format looks like for a very long time, which is why I think
we need to try rather hard to get this right the first time.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



