Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-09 Thread Jim Nasby

On 1/9/15, 6:54 PM, Jim Nasby wrote:

On 1/9/15, 6:44 PM, Petr Jelinek wrote:



Yep, I had the same impression when I looked at the code the first time;
however, it is defined as below. Not a matter of custom-scan itself.

/*
  * ==
  * Scan nodes
  * ==
  */
typedef struct Scan
{
 Plan    plan;
 Index   scanrelid;  /* relid is index into the range table */
} Scan;



Yeah, there are actually several places in the code where relid means an index
into the range table and not the OID of a relation; it still manages to confuse
me. Nothing this patch can do about that.


Well, since it's confused 3 of us now... should we change it (as a separate 
patch)? I'm willing to do that work but don't want to waste time if it'll just 
be rejected.

Any other examples of this I should fix too?


Sorry, to clarify... any other items besides Scan.scanrelid that I should fix?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fixing memory leak in pg_upgrade

2015-01-09 Thread Tatsuo Ishii
 On Fri, Jan 9, 2015 at 9:23 PM, Tatsuo Ishii is...@postgresql.org wrote:
 This is because gen_db_file_maps() allocates memory even if n_maps == 0.
 Purely cosmetic: the initialization n_maps = 0 before the call of
 gen_db_file_maps is unnecessary ;)

Of course. n_maps is written by calling gen_db_file_maps() anyway. I
was talking about the case after calling gen_db_file_maps().

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-09 Thread Jim Nasby

On 1/9/15, 6:44 PM, Petr Jelinek wrote:



Yep, I had the same impression when I looked at the code the first time;
however, it is defined as below. Not a matter of custom-scan itself.

/*
  * ==
  * Scan nodes
  * ==
  */
typedef struct Scan
{
 Plan    plan;
 Index   scanrelid;  /* relid is index into the range table */
} Scan;



Yeah, there are actually several places in the code where relid means an index
into the range table and not the OID of a relation; it still manages to confuse
me. Nothing this patch can do about that.


Well, since it's confused 3 of us now... should we change it (as a separate 
patch)? I'm willing to do that work but don't want to waste time if it'll just 
be rejected.

Any other examples of this I should fix too?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Escaping from blocked send() reprised.

2015-01-09 Thread Andres Freund
On 2014-11-17 18:22:54 +0300, Alex Shulgin wrote:
 
 Andres Freund and...@2ndquadrant.com writes:
 
  I've invested some more time in this:
  0002 now makes sense on its own and doesn't change anything around the
   interrupt handling. Oh, and it compiles without 0003.
 
 In this patch, the endif appears to be misplaced in PostgresMain:
 
 + if (MyProcPort != NULL)
 + {
 +#ifdef WIN32
 + pgwin32_noblock = true;
 +#else
 + if (!pg_set_noblock(MyProcPort->sock))
 + ereport(COMMERROR,
 + (errmsg("could not set socket to nonblocking mode: %m")));
 + }
 +#endif
 +

Uh. Odd. Anyway, that bit of code is now somewhere else anyway...

 One thing I did try is sending a NOTICE to the client when in
 ProcessInterrupts() and DoingCommandRead is true.  I think[1] it was
 expected to be delivered instantly, but actually the client (psql) only
 displays it after sending the next statement.

Yea, that should be psql specific though. I hope ;)

 While I'm reading on FE/BE protocol someone might want to share his
 wisdom on this subject.  My guess: psql blocks on readline/libedit call
 and can't effectively poll the server socket before complete input from
 user.

I'm not sure if it's actually a can't. It doesn't at least ;)
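
A client that wanted to could multiplex the two, roughly like this (just a
sketch, not psql's code; it assumes a notice receiver has already been
installed with PQsetNoticeReceiver and that we are on a platform with
select()):

#include <sys/select.h>
#include <unistd.h>
#include <libpq-fe.h>

/*
 * Sketch: wait for either user input on stdin or traffic from the server.
 * When the server socket becomes readable, PQconsumeInput() absorbs the
 * data and fires any notice receiver installed via PQsetNoticeReceiver(),
 * so an asynchronous NOTICE shows up immediately instead of after the
 * next statement.  Not what psql does today.
 */
static int
wait_for_input_or_notice(PGconn *conn)
{
    int     sock = PQsocket(conn);
    fd_set  readfds;

    FD_ZERO(&readfds);
    FD_SET(STDIN_FILENO, &readfds);
    FD_SET(sock, &readfds);

    if (select(sock + 1, &readfds, NULL, NULL, NULL) < 0)
        return -1;

    if (FD_ISSET(sock, &readfds))
        PQconsumeInput(conn);   /* delivers any pending NOTICEs */

    return FD_ISSET(STDIN_FILENO, &readfds);    /* 1 if the user typed something */
}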

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] max_connections documentation

2015-01-09 Thread Jim Nasby

I'm surprised to see that the docs make no mention of how max_connections, 
max_worker_processes and autovacuum_max_workers (don't) relate. I couldn't 
remember and had to actually look at the code. I'd like to clarify this in the 
max_connections section of the documentation by doing s/connections/user 
connections/ and including the formula for MaxBackends (MaxBackends = 
MaxConnections + autovacuum_max_workers + 1 + max_worker_processes). I'll also 
mention that any postgres_fdw connections are considered user connections.

Objections? Comments?

Also, my understanding is that the parallel stuff will continue to fall under 
max_worker_processes?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: [BUGS] BUG #12320: json parsing with embedded double quotes

2015-01-09 Thread Aaron Botsis

 On Jan 9, 2015, at 11:37 AM, Andrew Dunstan and...@dunslane.net wrote:
 
 
 On 01/08/2015 08:42 PM, Aaron Botsis wrote:
 
 On Jan 8, 2015, at 3:44 PM, Andrew Dunstan and...@dunslane.net wrote:
 
 
 On 01/08/2015 03:05 PM, Aaron Botsis wrote:
 
 It's also unnecessary. CSV format, while not designed for this, is 
 nevertheless sufficiently flexible to allow successful import of json 
 data meeting certain criteria (essentially no newlines), like this:
 
  copy the_table(jsonfield)
  from '/path/to/jsondata'
   csv quote e'\x01' delimiter e'\x02';
 While perhaps unnecessary, given the size and simplicity of the patch, IMO 
 it’s a no brainer to merge (it actually makes the code smaller by 3 
 lines). It also enables non-json use cases anytime one might want to 
 preserve embedded escapes, or use different ones entirely. Do you see 
 other reasons not to commit it?
 
 Well, for one thing it's seriously incomplete. You need to be able to 
 change the delimiter as well. Otherwise, any embedded tab in the json will 
 cause you major grief.
 
 Currently the delimiter and the escape MUST be a single byte non-nul 
 character, and there is a check for this in csv mode. Your patch would 
 allow any arbitrary string (including one of zero length) for the escape in 
 text mode, and would then silently ignore all but the first byte. That's 
 not the way we like to do things.
 
 And, frankly, I would need to spend quite a lot more time thinking about 
 other implications than I have given it so far. This is an area where I 
 tend to be VERY cautious about making changes. This is a fairly fragile 
 ecosystem.
 
 So I’m going to do a bit more testing with another patch tomorrow with 
 delimiters removed. If you can think of any specific cases you think will 
 break it let me know and I’ll make sure to add regression tests for them as 
 well.
 
 Well, I still need convincing that this is the best solution to the problem. 
 As I said, I need to spend more time thinking about it.


I offered an alternative (RAW mode w/record terminator) that you ignored. So in 
lieu of a productive discussion about “the best solution to the problem”, I 
went ahead and continued to complete the patch since I believe it’s a useful FE 
that could be helpful for this and other use cases. There have been multiple 
times (not even w/json) I wished COPY would stop being so smart. 

FWIW, (if anyone’s interested in it) I also hacked up some python that’ll read 
a json file, and outputs a binary file suitable for use with COPY BINARY that 
gets around all this stuff. Obviously this only works for json (not jsonb) 
columns (though you could SELECT INTO a jsonb column). Happy to pass it along.

Aaron



Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Amit Kapila
On Fri, Jan 9, 2015 at 10:54 PM, Stephen Frost sfr...@snowman.net wrote:
 * Amit Kapila (amit.kapil...@gmail.com) wrote:
  In our case as currently we don't have a mechanism to reuse parallel
  workers, so we need to account for that cost as well.  So based on that,
  I am planing to add three new parameters cpu_tuple_comm_cost,
  parallel_setup_cost, parallel_startup_cost
  *  cpu_tuple_comm_cost - Cost of CPU time to pass a tuple from worker
  to master backend with default value
  DEFAULT_CPU_TUPLE_COMM_COST as 0.1, this will be multiplied
  with tuples expected to be selected
  *  parallel_setup_cost - Cost of setting up shared memory for
parallelism
 with default value as 100.0
   *  parallel_startup_cost  - Cost of starting up parallel workers with
  default
  value as 1000.0 multiplied by number of workers decided for scan.
 
  I will do some experiments to finalise the default values, but in
general,
  I feel developing cost model on above parameters is good.

 The parameters sound reasonable but I'm a bit worried about the way
 you're describing the implementation.  Specifically this comment:

 Cost of starting up parallel workers with default value as 1000.0
 multiplied by number of workers decided for scan.

 That appears to imply that we'll decide on the number of workers, figure
 out the cost, and then consider parallel as one path and
 not-parallel as another.  I'm worried that if I end up setting the max
 parallel workers to 32 for my big, beefy, mostly-single-user system then
 I'll actually end up not getting parallel execution because we'll always
 be including the full startup cost of 32 threads.  For huge queries,
 it'll probably be fine, but there's a lot of room to parallelize things
 at levels less than 32 which we won't even consider.


Actually the main factor in deciding whether a parallel plan will be
selected is selectivity and cpu_tuple_comm_cost; parallel_startup_cost
is mainly to prevent the cases where the user has set
parallel_seqscan_degree but the table is small enough
(let us say 10,000 tuples) that it doesn't need parallelism. If you are
worried about the default cost parameters, then I think those still need
to be decided based on certain experiments.
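
To make the proposed knobs concrete, here is a rough sketch of how I imagine
them combining for a parallel sequential scan (illustrative only; the function
and the exact division of work are assumptions, not the final cost code):

/*
 * Illustrative sketch of the proposed cost model, not actual planner code.
 * seq_scan_run_cost is the run cost of the plain (non-parallel) seqscan.
 */
static double
parallel_seqscan_cost(double seq_scan_run_cost,
                      double output_tuples,         /* rows expected after quals */
                      int num_workers,
                      double cpu_tuple_comm_cost,   /* proposed default 0.1 */
                      double parallel_setup_cost,   /* proposed default 100.0 */
                      double parallel_startup_cost) /* proposed default 1000.0 */
{
    double  cost;

    /* the scan itself is divided among the workers */
    cost = seq_scan_run_cost / num_workers;

    /* one-time costs: shared memory setup, then starting each worker */
    cost += parallel_setup_cost;
    cost += parallel_startup_cost * num_workers;

    /* every selected tuple has to be passed from worker to master */
    cost += cpu_tuple_comm_cost * output_tuples;

    return cost;
}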

 What I was advocating for up-thread was to consider multiple parallel
 paths and to pick whichever ends up being the lowest overall cost.  The
 flip-side to that is increased planning time.

The main idea behind providing a parameter like parallel_seqscan_degree
is that it will try to use that many workers for a single
parallel operation (intra-node parallelism), and in case we have to perform
inter-node parallelism, having such a parameter means that each
node can use that many parallel workers. For example, if we have
to parallelize a scan as well as a sort (Select * from t1 order by c1), and
parallel_degree is specified as 2, then each of the scan and the sort can use
2 parallel workers.

This is somewhat similar to how degree of parallelism (DOP)
works in other databases; refer to the case of Oracle [1] (Setting Degree of
Parallelism).

I don't deny that it is an idea worth exploring to make the optimizer
smarter about deciding on parallel plans, but it seems to me it is an advanced
topic which will be more valuable when we try to parallelize joins or other
similar stuff, and even most papers talk about it only in those regards.
At this moment, ensuring that a parallel plan is not selected
for cases where it will perform poorly is more than enough, considering
we have lots of other work left to even make any parallel operation work.

 Perhaps we can come up
 with an efficient way of working out where the break-point is based on
 the non-parallel cost and go at it from that direction instead of
 building out whole paths for each increment of parallelism.

 I'd really like to be able to set the 'max parallel' high and then have
 the optimizer figure out how many workers should actually be spawned for
 a given query.

  Execution:
  Most other databases does partition level scan for partition on
  different disks by each individual parallel worker.  However,
  it seems amazon dynamodb [2] also works on something
  similar to what I have used in patch which means on fixed
  blocks.  I think this kind of strategy seems better than dividing
  the blocks at runtime because dividing randomly the blocks
  among workers could lead to random scan for a parallel
  sequential scan.

 Yeah, we also need to consider the i/o side of this, which will
 definitely be tricky.  There are i/o systems out there which are faster
 than a single CPU and ones where a single CPU can manage multiple i/o
 channels.  There are also cases where the i/o system handles sequential
 access nearly as fast as random and cases where sequential is much
 faster than random.  Where we can get an idea of that distinction is
 with seq_page_cost vs. random_page_cost as folks running 

Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-09 Thread Petr Jelinek

On 10/01/15 01:19, Kohei KaiGai wrote:

2015-01-10 8:18 GMT+09:00 Jim Nasby jim.na...@bluetreble.com:

On 1/6/15, 5:43 PM, Kouhei Kaigai wrote:


scan_relid != InvalidOid






Ideally, they should be OidIsValid(scan_relid)



Scan.scanrelid is an index of range-tables list, not an object-id.
So, InvalidOid or OidIsValid() are not a good choice.



I think the name needs to change then; scan_relid certainly looks like the
OID of a relation.

scan_index?


Yep, I had the same impression when I looked at the code the first time;
however, it is defined as below. Not a matter of custom-scan itself.

/*
  * ==
  * Scan nodes
  * ==
  */
typedef struct Scan
{
 Plan    plan;
 Index   scanrelid;  /* relid is index into the range table */
} Scan;



Yeah, there are actually several places in the code where relid means an 
index into the range table and not the OID of a relation; it still manages to 
confuse me. Nothing this patch can do about that.



--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Updating copyright notices to 2015 for PGDG

2015-01-09 Thread Jim Nasby

On 1/6/15, 6:15 PM, Tom Lane wrote:

Jim Nasby jim.na...@bluetreble.com writes:

On 1/6/15, 3:30 PM, Stefan Kaltenbrunner wrote:

I don't know why it is really needed, but maybe for the files that have
identical copyrights one could simply reference the COPYRIGHT file we
already have in the tree?



+1


Unless either of you is a copyright lawyer, your opinion on whether that's
sufficient is of zero relevance.


No, but a friend's dad is. I could probably get his opinion on it if that would 
be helpful.


Personally I think it's just fine if we have some mechanism that forces
text files to have trailing newlines ;-)


Which is what I was suggesting... ;) AFAIK that would be a matter of installing 
the appropriate hook on git.postgresql.org.

https://www.google.com/search?q=git+enforce+trailing+newline
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-09 Thread Kohei KaiGai
2015-01-10 9:56 GMT+09:00 Jim Nasby jim.na...@bluetreble.com:
 On 1/9/15, 6:54 PM, Jim Nasby wrote:

 On 1/9/15, 6:44 PM, Petr Jelinek wrote:


 Yep, I had the same impression when I looked at the code the first time;
 however, it is defined as below. Not a matter of custom-scan itself.

 /*
   * ==
   * Scan nodes
   * ==
   */
 typedef struct Scan
 {
  Plan    plan;
  Index   scanrelid;  /* relid is index into the range table
 */
 } Scan;


 Yeah, there are actually several places in the code where relid means an
 index into the range table and not the OID of a relation; it still manages to
 confuse me. Nothing this patch can do about that.


 Well, since it's confused 3 of us now... should we change it (as a
 separate patch)? I'm willing to do that work but don't want to waste time if
 it'll just be rejected.

 Any other examples of this I should fix too?


 Sorry, to clarify... any other items besides Scan.scanrelid that I should
 fix?

This naming is a little bit confusing; however, I don't think it should be
changed, because this structure has been used for a long time, so reworking it
will prevent back-patching when we find bugs around scanrelid.
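
For reference, code that needs the relation's OID just treats scanrelid as a
1-based range-table index and goes through the RTE; a minimal sketch (assuming
executor-style code with the plan's rtable at hand, not taken from any
existing function):

#include "postgres.h"
#include "nodes/plannodes.h"
#include "parser/parsetree.h"   /* rt_fetch() */

/*
 * Sketch only: resolve a Scan node's scanrelid to the underlying relation's
 * OID.  scanrelid is an Index (1-based position in the range table), not an
 * Oid, so the validity test is "scanrelid > 0", not OidIsValid().
 */
static Oid
scan_relation_oid(Scan *scan, List *rtable)
{
    RangeTblEntry *rte;

    if (scan->scanrelid == 0)   /* no single base relation */
        return InvalidOid;

    rte = rt_fetch(scan->scanrelid, rtable);
    return rte->relid;          /* pg_class OID for a plain relation RTE */
}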

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Amit Kapila
On Sat, Jan 10, 2015 at 2:45 AM, Stefan Kaltenbrunner
ste...@kaltenbrunner.cc wrote:

 On 01/09/2015 08:01 PM, Stephen Frost wrote:
  Amit,
 
  * Amit Kapila (amit.kapil...@gmail.com) wrote:
  On Fri, Jan 9, 2015 at 1:02 AM, Jim Nasby jim.na...@bluetreble.com
wrote:
  I agree, but we should try and warn the user if they set
  parallel_seqscan_degree close to max_worker_processes, or at least
give
  some indication of what's going on. This is something you could end up
  beating your head on wondering why it's not working.
 
  Yet another way to handle the case when enough workers are not
  available is to let user  specify the desired minimum percentage of
  requested parallel workers with parameter like
  PARALLEL_QUERY_MIN_PERCENT. For  example, if you specify
  50 for this parameter, then at least 50% of the parallel workers
  requested for any  parallel operation must be available in order for
  the operation to succeed else it will give error. If the value is set
to
  null, then all parallel operations will proceed as long as at least two
  parallel workers are available for processing.
 

  Now, for debugging purposes, I could see such a parameter being
  available but it should default to 'off/never-fail'.

 not sure what it really would be useful for - if I execute a query I
 would truly expect it to get answered - if it can be made faster if
 done in parallel that's nice but why would I want it to fail?


One use case where I could imagine it being useful is when the
query would take many hours if run sequentially but could
be finished in minutes if run with 16 parallel workers. Now let us
say that during execution fewer than 30% of the requested parallel workers
are available; that might not be acceptable to the user, who would
rather wait for some time and run the query again. If he wants
to run the query even if only 2 workers are available, he can choose not
to set such a parameter.

Having said that, I also feel this doesn't seem to be an important enough case
to introduce a new parameter and such behaviour.  I mentioned it only
because I have come across how some other databases handle
such a situation.  Let's forget this suggestion if we can't imagine any
use for such a parameter.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] max_connections documentation

2015-01-09 Thread Amit Kapila
On Sat, Jan 10, 2015 at 6:20 AM, Jim Nasby jim.na...@bluetreble.com wrote:

 I'm surprised to see that the docs make no mention of how
max_connections, max_worker_processes and autovacuum_max_workers (don't)
relate. I couldn't remember and had to actually look at the code. I'd like
to clarify this in the max_connections section of the documentation by doing
s/connections/user connections/ and including the formula for MaxBackends
(MaxBackends = MaxConnections + autovacuum_max_workers + 1 +
max_worker_processes). I'll also mention that any postgres_fdw connections
are considered user connections.


I think it makes sense to add such a clarification to the docs; however, I am
not sure if specifying it along with the max_connections parameter is a
good idea, as the formula is somewhat internal and is not directly
related to max_connections.  How about specifying it on the page below:

http://www.postgresql.org/docs/devel/static/connect-estab.html


 Also, my understanding is that the parallel stuff will continue to fall
under max_worker_processes?

Yes.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] libpq 9.4 requires /etc/passwd?

2015-01-09 Thread Bruce Momjian
On Fri, Jan  9, 2015 at 06:57:02PM -0500, Tom Lane wrote:
 I wrote:
  Christoph Berg c...@df7cb.de writes:
  libpq wants the user home directory to find .pgpass and
  .pg_service.conf files, but apparently the behavior to require the
  existence of the passwd file (or nss equivalent) is new in 9.4.
 
  There is demonstrably no direct reference to /etc/passwd in the PG code.
  It's possible that we've added some new expectation that $HOME can be
  identified, but a quick look through the code shows no such checks that
  don't look like they've been there for some time.
 
 Hmm ... actually, I'll bet it's not $HOME that's at issue, but the name
 of the user.  Commit a4c8f14364c27508233f8a31ac4b10a4c90235a9 turned
 failure of pg_fe_getauthname() into a hard connection failure, whereas
 previously it was harmless as long as the caller provided a username.
 
 I wonder if we shouldn't just revert that commit in toto.  Yeah,
 identifying an out-of-memory error might be useful, but this cure
 seems a lot worse than the disease.  What's more, this coding reports
 *any* pg_fe_getauthname failure as out of memory, which is much worse
 than useless.
 
 Alternatively, maybe don't try to resolve username this way until
 we've found that the caller isn't providing any username.

Seems we both found it at the same time.  Here is the thread where we
discussed it, and you were concerned about the memory allocation failure
too:


http://www.postgresql.org/message-id/flat/20140320154905.gc7...@momjian.us#20140320154905.gc7...@momjian.us

In particular, it appears to me that if the strdup in pg_fe_getauthname
fails, we'll just let that pass without comment, which is inconsistent
with all the other out-of-memory cases in conninfo_add_defaults.
(I wonder whether any code downstream of this supposes that we always
have a value for the user option.  It's a pretty safe bet that the
case is hardly ever tested.)

I have developed the attached patch which documents why the user name
lookup might fail, and sets the failure string to "".  It preserves the
memory allocation failure behavior.

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/interfaces/libpq/fe-auth.c b/src/interfaces/libpq/fe-auth.c
new file mode 100644
index 179793e..b9155b4
*** a/src/interfaces/libpq/fe-auth.c
--- b/src/interfaces/libpq/fe-auth.c
*** pg_fe_sendauth(AuthRequest areq, PGconn
*** 715,726 
   * pg_fe_getauthname
   *
   * Returns a pointer to dynamic space containing whatever name the user
!  * has authenticated to the system.  If there is an error, return NULL.
   */
  char *
  pg_fe_getauthname(void)
  {
! 	const char *name = NULL;
  	char	   *authn;
  
  #ifdef WIN32
--- 715,731 
   * pg_fe_getauthname
   *
   * Returns a pointer to dynamic space containing whatever name the user
!  * has authenticated to the system.  Returns NULL on memory allocation
!  * failure, and a zero-length string on user name lookup failure.
   */
  char *
  pg_fe_getauthname(void)
  {
! 	/*
! 	 * We might be in a chroot environment where we can't look up our own
! 	 * user name, so return "".
! 	 */
! 	const char *name = "";
  	char	   *authn;
  
  #ifdef WIN32

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] libpq 9.4 requires /etc/passwd?

2015-01-09 Thread Christoph Berg
Hi,

I've got several reports that postfix's pgsql lookup tables are broken
with 9.4's libpq5, while 9.3's libpq5 works just fine. The error
message looks like this:

Jan 10 00:11:40 lehmann postfix/trivial-rewrite[29960]: warning: connect to 
pgsql server localhost:5432: out of memory?
Jan 10 00:11:40 lehmann postfix/trivial-rewrite[29960]: warning: 
pgsql:/etc/postfix/pgsqltest lookup error for *

The "out of memory?" message comes from PQsetdbLogin and
PQerrorMessage in postfix-2.11.3/src/global/dict_pgsql.c:

if ((host->db = PQsetdbLogin(host->name, host->port, NULL, NULL,
 dbname, username, password)) == NULL
|| PQstatus(host->db) != CONNECTION_OK) {
msg_warn("connect to pgsql server %s: %s",
 host->hostname, PQerrorMessage(host->db));

There are two ways around the problem on the postfix side: not running
the postfix service chrooted, or using a proxy map which effectively
does the lookup also from outside the chroot.

Debian bug https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=756627
mentions another solution: creation of an /etc/passwd file inside the
postfix chroot.

libpq wants the user home directory to find .pgpass and
.pg_service.conf files, but apparently the behavior to require the
existence of the passwd file (or nss equivalent) is new in 9.4.

I've tried to make sense of the fe-connect.c code to find the issue
but couldn't spot it. Can someone explain what's going on there? Is
this a bug in libpq? (It's certainly a regression.)

Christoph
-- 
c...@df7cb.de | http://www.df7cb.de/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq 9.4 requires /etc/passwd?

2015-01-09 Thread Tom Lane
Christoph Berg c...@df7cb.de writes:
 libpq wants the user home directory to find .pgpass and
 .pg_service.conf files, but apparently the behavior to require the
 existence of the passwd file (or nss equivalent) is new in 9.4.

There is demonstrably no direct reference to /etc/passwd in the PG code.
It's possible that we've added some new expectation that $HOME can be
identified, but a quick look through the code shows no such checks that
don't look like they've been there for some time.

Are the complainants doing anything that would result in SSL certificate
checking?  More generally, it'd be useful to see an exact example of
what are the connection parameters and environment that result in a
failure.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq 9.4 requires /etc/passwd?

2015-01-09 Thread Tom Lane
I wrote:
 Christoph Berg c...@df7cb.de writes:
 libpq wants the user home directory to find .pgpass and
 .pg_service.conf files, but apparently the behavior to require the
 existence of the passwd file (or nss equivalent) is new in 9.4.

 There is demonstrably no direct reference to /etc/passwd in the PG code.
 It's possible that we've added some new expectation that $HOME can be
 identified, but a quick look through the code shows no such checks that
 don't look like they've been there for some time.

Hmm ... actually, I'll bet it's not $HOME that's at issue, but the name
of the user.  Commit a4c8f14364c27508233f8a31ac4b10a4c90235a9 turned
failure of pg_fe_getauthname() into a hard connection failure, whereas
previously it was harmless as long as the caller provided a username.

I wonder if we shouldn't just revert that commit in toto.  Yeah,
identifying an out-of-memory error might be useful, but this cure
seems a lot worse than the disease.  What's more, this coding reports
*any* pg_fe_getauthname failure as out of memory, which is much worse
than useless.

Alternatively, maybe don't try to resolve username this way until
we've found that the caller isn't providing any username.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] libpq 9.4 requires /etc/passwd?

2015-01-09 Thread Bruce Momjian
On Fri, Jan  9, 2015 at 06:42:19PM -0500, Tom Lane wrote:
 Christoph Berg c...@df7cb.de writes:
  libpq wants the user home directory to find .pgpass and
  .pg_service.conf files, but apparently the behavior to require the
  existence of the passwd file (or nss equivalent) is new in 9.4.
 
 There is demonstrably no direct reference to /etc/passwd in the PG code.
 It's possible that we've added some new expectation that $HOME can be
 identified, but a quick look through the code shows no such checks that
 don't look like they've been there for some time.
 
 Are the complainants doing anything that would result in SSL certificate
 checking?  More generally, it'd be useful to see an exact example of
 what are the connection parameters and environment that result in a
 failure.

I see similar error strings, but nothing that ends with a question mark:
"out of memory?".  However, I wonder if the crux of the problem is that
they are running in a chroot environment where the user id can't be
looked up.

Looking at libpq's pg_fe_getauthname(), I see that if it can't get the
OS user name of the effective user, it assumes it failed and returns
NULL:

/*
 * We document PQconndefaults() to return NULL for a memory allocation
 * failure.  We don't have an API to return a user name lookup failure,
 * so we just assume it always succeeds.
 */
#ifdef WIN32
	if (GetUserName(username, &namesize))
		name = username;
#else
	if (pqGetpwuid(geteuid(), &pwdstr, pwdbuf, sizeof(pwdbuf), &pw) == 0)
		name = pw->pw_name;
#endif

	authn = name ? strdup(name) : NULL;

and conninfo_add_defaults() assumes a pg_fe_getauthname() failure is a
memory allocation failure:

	option->val = pg_fe_getauthname();
	if (!option->val)
	{
		if (errorMessage)
			printfPQExpBuffer(errorMessage,
							  libpq_gettext("out of memory\n"));
		return false;

It was my 9.4 commit that changed this:

commit a4c8f14364c27508233f8a31ac4b10a4c90235a9
Author: Bruce Momjian br...@momjian.us
Date:   Thu Mar 20 11:48:31 2014 -0400

libpq:  pass a memory allocation failure error up to 
PQconndefaults()

Previously user name memory allocation failures were ignored and the
default user name set to NULL.

I am thinking we have to now assume that user name lookups can fail for
other reasons.  I am unclear how this worked in the past though.

-- 
  Bruce Momjian  br...@momjian.us    http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Transactions involving multiple postgres foreign servers

2015-01-09 Thread Jim Nasby

On 1/8/15, 12:00 PM, Kevin Grittner wrote:

Robert Haas robertmh...@gmail.com wrote:

On Thu, Jan 8, 2015 at 10:19 AM, Kevin Grittner kgri...@ymail.com wrote:

Robert Haas robertmh...@gmail.com wrote:

Andres is talking in my other ear suggesting that we ought to
reuse the 2PC infrastructure to do all this.


If you mean that the primary transaction and all FDWs in the
transaction must use 2PC, that is what I was saying, although
apparently not clearly enough.  All nodes *including the local one*
must be prepared and committed with data about the nodes saved
safely off somewhere that it can be read in the event of a failure
of any of the nodes *including the local one*.  Without that, I see
this whole approach as a train wreck just waiting to happen.


Clearly, all the nodes other than the local one need to use 2PC.  I am
unconvinced that the local node must write a 2PC state file only to
turn around and remove it again almost immediately thereafter.


The key point is that the distributed transaction data must be
flagged as needing to commit rather than roll back between the
prepare phase and the final commit.  If you try to avoid the
PREPARE, flagging, COMMIT PREPARED sequence by building the
flagging of the distributed transaction metadata into the COMMIT
process, you still have the problem of what to do on crash
recovery.  You really need to use 2PC to keep that clean, I think.


If we had an independent transaction coordinator then I agree with you Kevin. I 
think Robert is proposing that if we are controlling one of the nodes that's 
participating as well as coordinating the overall transaction that we can take 
some shortcuts. AIUI a PREPARE means you are completely ready to commit. In 
essence you're just waiting to write and fsync the commit message. That is in 
fact the state that a coordinating PG node would be in by the time everyone 
else has done their prepare. So from that standpoint we're OK.

Now, as soon as ANY of the nodes commit, our coordinating node MUST be able to 
commit as well! That would require it to have a real prepared transaction of 
its own created. However, as long as there is zero chance of any other 
prepared transactions committing before our local transaction, that step isn't 
actually needed. Our local transaction will either commit or abort, and that 
will determine what needs to happen on all other nodes.

I'm ignoring the question of how the local node needs to store info about the 
other nodes in case of a crash, but AFAICT you could reliably recover manually 
from what I just described.
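
To spell out the sequence I'm describing (a bare libpq sketch against the
remote nodes; the GID naming and error handling are made up, and persisting
the GIDs before the local commit is deliberately glossed over):

#include <stdio.h>
#include <libpq-fe.h>

/*
 * Sketch of the coordinator-side commit sequence discussed above:
 * prepare every remote participant, commit locally, then commit the
 * prepared remote transactions.  Real code must persist the GIDs before
 * the local commit so a crash can be recovered (manually or otherwise).
 */
static int
run(PGconn *conn, const char *sql)
{
    PGresult   *res = PQexec(conn, sql);
    int         ok = (PQresultStatus(res) == PGRES_COMMAND_OK);

    PQclear(res);
    return ok;
}

static int
commit_distributed(PGconn *local, PGconn **remotes, int nremotes)
{
    char        cmd[64];
    int         i;

    /* Phase 1: prepare all remote participants. */
    for (i = 0; i < nremotes; i++)
    {
        snprintf(cmd, sizeof(cmd), "PREPARE TRANSACTION 'fdw_txn_%d'", i);
        if (!run(remotes[i], cmd))
            return 0;           /* must abort everywhere instead */
    }

    /* The local commit decides the outcome of the whole transaction. */
    if (!run(local, "COMMIT"))
        return 0;               /* must ROLLBACK PREPARED on the remotes */

    /* Phase 2: commit the prepared remote transactions. */
    for (i = 0; i < nremotes; i++)
    {
        snprintf(cmd, sizeof(cmd), "COMMIT PREPARED 'fdw_txn_%d'", i);
        run(remotes[i], cmd);   /* real code must retry until this succeeds */
    }
    return 1;
}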

I think the question is: are we OK with going under the skirt in this 
fashion? Presumably it would provide better performance, whereas forcing ourselves to eat 
our own 2PC dogfood would presumably make it easier for someone to plugin an external 
coordinator instead of using our own. I think there's also a lot to be said for getting a 
partial implementation of this available today (requiring manual recovery), so long as 
it's not in core.

BTW, I found 
https://www.cs.rutgers.edu/~pxk/417/notes/content/transactions.html a useful 
read, specifically the 2PC portion.


I'm not really clear on the mechanism that is being proposed for
doing this, but one way would be to have the PREPARE of the local
transaction be requested explicitly and to have that cause all FDWs
participating in the transaction to also be prepared.  (That might
be what Andres meant; I don't know.)


We want this to be client-transparent, so that the client just says
COMMIT and everything Just Works.


What about the case where one or more nodes doesn't support 2PC.
Do we silently make the choice, without the client really knowing?


We abort. (Unless we want to have a running_with_scissors GUC...)


That doesn't strike me as the
only possible mechanism to drive this, but it might well be the
simplest and cleanest.  The trickiest bit might be to find a good
way to persist the distributed transaction information in a way
that survives the failure of the main transaction -- or even the
abrupt loss of the machine it's running on.


I'd be willing to punt on surviving a loss of the entire machine.  But
I'd like to be able to survive an abrupt reboot.


As long as people are aware that there is an urgent need to find
and fix all data stores to which clusters on the failed machine
were connected via FDW when there is a hard machine failure, I
guess it is OK.  In essence we just document it and declare it to
be somebody else's problem.  In general I would expect a
distributed transaction manager to behave well in the face of any
single-machine failure, but if there is one aspect of a
full-featured distributed transaction manager we could give up, I
guess that would be it.


ISTM that one option here would be to simply write and sync WAL record(s) of 
all externally prepared transactions. That would be enough for a hot standby to find all 
the other servers and tell them to either commit or abort, based on whether 

Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Jim Nasby

On 1/9/15, 3:34 PM, Stephen Frost wrote:

* Stefan Kaltenbrunner (ste...@kaltenbrunner.cc) wrote:

On 01/09/2015 08:01 PM, Stephen Frost wrote:

Now, for debugging purposes, I could see such a parameter being
available but it should default to 'off/never-fail'.


not sure what it really would be useful for - if I execute a query I
would truly expect it to get answered - if it can be made faster if
done in parallel that's nice but why would I want it to fail?


I was thinking for debugging only, though I'm not really sure why you'd
need it if you get a NOTICE when you don't end up with all the workers
you expect.


Yeah, debugging is my concern as well. You're working on a query, you expect it 
to be using parallelism, and EXPLAIN is showing it's not. Now you're scratching 
your head.
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-09 Thread Kohei KaiGai
2015-01-10 8:18 GMT+09:00 Jim Nasby jim.na...@bluetreble.com:
 On 1/6/15, 5:43 PM, Kouhei Kaigai wrote:

 scan_relid != InvalidOid
  

 
 Ideally, they should be OidIsValid(scan_relid)
 

 Scan.scanrelid is an index of range-tables list, not an object-id.
 So, InvalidOid or OidIsValid() are not a good choice.


 I think the name needs to change then; scan_relid certainly looks like the
 OID of a relation.

 scan_index?

Yep, I had the same impression when I looked at the code the first time;
however, it is defined as below. Not a matter of custom-scan itself.

/*
 * ==
 * Scan nodes
 * ==
 */
typedef struct Scan
{
Plan    plan;
Index   scanrelid;  /* relid is index into the range table */
} Scan;

-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Possible typo in create_policy.sgml

2015-01-09 Thread Dean Rasheed
On 9 January 2015 at 20:46, Stephen Frost sfr...@snowman.net wrote:
 I'd suggest we further clarify
 with:

The <command>CREATE POLICY</command> command defines a new policy for a
table.  Note that row level security must also be enabled on the table using
<command>ALTER TABLE</command> in order for created policies to be applied.
Once row level security has been enabled, a default-deny policy is used and
no rows in the table are visible, except to the table owner or
superuser, unless permitted by a specific policy.

A policy permits SELECT, INSERT, UPDATE or DELETE commands to access rows
in a table that has row level security enabled.  Access to existing table
rows is granted if they match a policy expression specified via USING,
while new rows that would be created via INSERT or UPDATE are checked
against policy expressions specified via WITH CHECK.  For policy
expressions specified via USING which grant access to existing rows, the
system will generally test the policy expressions prior to any
qualifications that appear in the query itself, in order to prevent the
inadvertent exposure of the protected data to user-defined functions which
might not be trustworthy.  However, functions and operators marked by the
system (or the system administrator) as LEAKPROOF may be evaluated before
policy expressions, as they are assumed to be trustworthy.


Looks good to me.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Jim Nasby

On 1/9/15, 11:24 AM, Stephen Frost wrote:

What I was advocating for up-thread was to consider multiple parallel
paths and to pick whichever ends up being the lowest overall cost.  The
flip-side to that is increased planning time.  Perhaps we can come up
with an efficient way of working out where the break-point is based on
the non-parallel cost and go at it from that direction instead of
building out whole paths for each increment of parallelism.


I think at some point we'll need the ability to stop planning part-way through 
for queries producing really small estimates. If the first estimate you get is 
1000 units, does it really make sense to do something like try every possible 
join permutation, or attempt to parallelize?
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Translating xlogreader.c

2015-01-09 Thread Alvaro Herrera
Heikki Linnakangas wrote:
 xlogreader.c contains a bunch of strings passed to the report_invalid_record
 function, that are supposed to be translated. src/backend/nls.mk lists
 report_invalid_record as a gettext trigger.
 
 In 9.2 and below, when those functions were still in xlog, those strings
 were in the postgres.pot file, and translated. But not since 9.3, when they
 were moved to xlogreader.c
 
 What's going on?

I missed a :2 in GETTEXT_TRIGGERS.  Fixed now, backpatched to 9.3.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving RLS qual pushdown

2015-01-09 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
 A while ago [1] I proposed an enhancement to the way qual pushdown
 safety is decided in RLS / security barrier views. Currently we just
 test for the presence of leaky functions in the qual, but it is
 possible to do better than that, by further testing if the leaky
 function is actually being passed information that we don't want to be
 leaked.

This certainly sounds reasonable to me.

 In fact the majority of builtin functions aren't marked leakproof, and
 probably most user functions aren't either, so this could potentially
 be useful in a wide range of real-world queries, where it is common to
 write quals of the form column operator expression, and the
 expression may contain leaky functions.

Agreed.

Looks like you've already added it to the next commitfest, which is
great.  I'm definitely interested but probably won't get to it right
away as I have a few other things to address.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Comment typo in src/backend/executor/execMain.c

2015-01-09 Thread Stephen Frost
Etsuro,

* Etsuro Fujita (fujita.ets...@lab.ntt.co.jp) wrote:
 I ran into a comment type.  Please find attached a patch.

Fix pushed.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Dean Rasheed
On 9 January 2015 at 00:49, Stephen Frost sfr...@snowman.net wrote:
 Peter,

 * Peter Geoghegan (p...@heroku.com) wrote:
 For column level privileges, you wouldn't expect to only get an error
 about not having the relevant update permissions at runtime, when the
 update path happens to be taken. And so it is for RLS.

 Right, that's the precedent we should be considering.  Column-level
 privileges is a great example- you need both insert and update
 privileges for the columns involved for the command to succeed.  It
 shouldn't depend on which path actually ends up being taken.


It's not the same as column-level privileges though, because RLS
policies depend on the new/existing data, not just the table metadata.
So for example an UPDATE ... WHERE false can fail column-level
privilege checks even though it isn't updating anything, but it cannot
fail a RLS policy check.

I was trying to think up an example where you might actually have
different INSERT and UPDATE policies, and the best I can think of is
some sort of mod_count column where you have an INSERT CHECK
(mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
checking both policies would make an UPSERT impossible, whereas if you
think of it as doing either an INSERT or an UPDATE, as the syntax
suggests, it becomes possible.

(I'm assuming here from your description that you intend for the
policy expressions to be AND'ed together, so it might end up being an
AND of 2 OR expressions.)

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Peter Geoghegan
On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 I was trying to think up an example where you might actually have
 different INSERT and UPDATE policies, and the best I can think of is
 some sort of mod_count column where you have an INSERT CHECK
 (mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
 checking both policies would make an UPSERT impossible, whereas if you
 think of it as doing either an INSERT or an UPDATE, as the syntax
 suggests, it becomes possible.

Why does this user want to do this upsert? If they're upserting, then
the inserted row could only reasonably have a value of (mod_count =
0). If updating, then they must have a constant value for the update
path (a value that's greater than 0, naturally - say 2), which doesn't
make any sense in the context of an upsert's auxiliary update - what
happened to the 0 value? Sorry, but I don't think your example makes
sense - I can't see what would motivate anyone to write a query like
that with those RLS policies in place. It sounds like you're talking
about an insert and a separate update that may or may not affect the
same row, and not an upsert. Then those policies make sense, but in
practice they render the upsert you describe contradictory.

FWIW, I'm not suggesting that there couldn't possibly be a use case
that doesn't do well with this convention where we enforce RLS
deepening on the path taken. The cases are just very marginal, as I
think your difficulty in coming up with a convincing counter-argument
shows. I happen to think what Stephen and I favor (bunching together
USING() barrier quals and check options from INSERT and UPDATE
policies) is likely to be the best alternative available on balance.

More generally, you could point out that I'm actually testing
different tuples at different points in query processing under that
regime (e.g. the post-insert tuple, or the before-update conflicting,
existing tuple from the target, or the post update tuple) and so
things could fail when the update path is taken despite the fact that
they didn't fail when the insert path was taken. That's technically
true, of course, but with idiomatic usage it isn't true, and that's
what I care about.

Does anyone have another counter-example of a practical upsert
statement that cannot be used with certain RLS policies due to the
fact that we chose to bunch together INSERT and UPDATE RLS policies?
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_rewind in contrib

2015-01-09 Thread Heikki Linnakangas

On 01/08/2015 10:44 PM, Peter Eisentraut wrote:

On 1/6/15 7:17 PM, Andres Freund wrote:

One problem is that it doesn't use the replication protocol,

so the setup is going to be inconsistent with pg_basebackup.  Maybe the
replication protocol could be extended to provide the required data.

I'm not particularly bothered by the requirement of also requiring a
normal, not replication, connection. In most cases that'll already be
allowed.


I don't agree.  We have separated out replication access, especially in
pg_hba.conf and with the replication role attribute, for a reason.  (I
hope there was a reason; I don't remember.)  It is not unreasonable to
set things up so that non-replication access is only from the
application tier, and replication access is only from within the
database tier.


I agree. A big difference between a replication connection and a normal 
database connection is that a replication connection is not bound to any 
particular database. It also cannot run transactions, and many other things.


It would nevertheless be handy to be able to do more stuff in a 
replication connection. For example, you might want to run functions 
like pg_create_restore_point(), pg_xlog_replay_pause/resume etc. in a 
replication connection. We should extend the replication protocol to 
allow such things. I'm not sure what it would look like; we can't run 
arbitrary queries when not being connected to a database, or arbitrary 
functions, but there are a lot of functions that would make sense.



Now I understand that making pg_rewind work over a replication
connection is a lot more work, and maybe we don't want to spend it, at
least right now.  But then we either need to document this as an
explicit deficiency and think about fixing it later, or we should
revisit how replication access control is handled.


Right, that's how I've been thinking about this. I don't want to start 
working on the replication protocol right now, I think we can live with 
it as it is, but it sure would be nicer if it worked over a replication 
connection.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What exactly is our CRC algorithm?

2015-01-09 Thread Abhijit Menon-Sen
Hi Heikki.

I've attached two regenerated CRC patches, split up as before.

1. The slicing-by-8 patch contains numerous changes:

a. A configure test for __builtin_bswap32
b. A comment referencing the slicing-by-8 paper (which is behind a
   paywall, unfortunately, so I haven't even read it). Are more
   comments needed? If so, where/what kind?
c. A byte-reversed big-endian version of the 8*256 table. In Linux,
   there's only one table that uses __constant_swab32, but for us
   it's simpler to have two tables.
d. Thanks to (c), we can avoid the bswap32 in the hot loop.
e. On big-endian systems, FIN_CRC32C now bswap32()s the CRC before
   finalising it. (We don't need to do this in INIT_CRC32C only
   because the initialiser is 0x.)

2. The sse4.2 patch has only some minor compile fixes.

I have built and tested both patches individually on little-endian
(amd64) and big-endian (ppc) systems. I verified that the _sse is
chosen at startup on the former, and _sb8 on the latter, and that
both implementations function correctly with respect to HEAD.
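
(If anyone wants an independent cross-check of the output, here is a minimal
bit-at-a-time CRC-32C reference -- deliberately slow and table-free, and not
the code from either patch -- whose result the sb8 and sse versions should
match for the same input once INIT_CRC32C/FIN_CRC32C are applied.)

#include <stddef.h>
#include <stdint.h>

/*
 * Minimal reference CRC-32C (Castagnoli): reflected polynomial 0x82F63B78,
 * initial value 0xFFFFFFFF, final xor 0xFFFFFFFF.  Useful only for
 * cross-checking faster implementations.
 */
static uint32_t
crc32c_reference(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t    crc = 0xFFFFFFFF;

    while (len--)
    {
        int         bit;

        crc ^= *p++;
        for (bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0x82F63B78 & -(crc & 1));
    }
    return crc ^ 0xFFFFFFFF;
}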

Please let me know if there's anything else I need to do.

-- Abhijit
From a202673fa8e4e60e7fd5087fd8577a90a75e90ee Mon Sep 17 00:00:00 2001
From: Abhijit Menon-Sen a...@2ndquadrant.com
Date: Fri, 9 Jan 2015 10:38:47 +0530
Subject: Implement slicing-by-8 CRC calculation

The COMP_CRC32C macro now calls pg_comp_crc32c(), which processes eight
data bytes at a time. Its output is identical to the byte-at-a-time CRC
code. (This change does not apply to the LEGACY_CRC32 computation.)

Reviewers: Andres Freund, Heikki Linnakangas

Author: Abhijit Menon-Sen
---
 config/c-compiler.m4  |   17 +
 configure |   30 +
 configure.in  |1 +
 src/include/pg_config.h.in|3 +
 src/include/pg_config.h.win32 |3 +
 src/include/utils/pg_crc.h|   11 +-
 src/include/utils/pg_crc_tables.h | 1128 ++---
 src/port/pg_crc.c |  105 +++-
 8 files changed, 1228 insertions(+), 70 deletions(-)

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index 90b56e7..509f961 100644
--- a/config/c-compiler.m4
+++ b/config/c-compiler.m4
@@ -193,6 +193,23 @@ fi])# PGAC_C_TYPES_COMPATIBLE
 
 
 
+# PGAC_C_BUILTIN_BSWAP32
+# -
+# Check if the C compiler understands __builtin_bswap32(),
+# and define HAVE__BUILTIN_BSWAP32 if so.
+AC_DEFUN([PGAC_C_BUILTIN_BSWAP32],
+[AC_CACHE_CHECK(for __builtin_bswap32, pgac_cv__builtin_bswap32,
+[AC_TRY_COMPILE([static unsigned long int x = __builtin_bswap32(0xaabbccdd);],
+[],
+[pgac_cv__builtin_bswap32=yes],
+[pgac_cv__builtin_bswap32=no])])
+if test x$pgac_cv__builtin_bswap32 = xyes ; then
+AC_DEFINE(HAVE__BUILTIN_BSWAP32, 1,
+  [Define to 1 if your compiler understands __builtin_bswap32.])
+fi])# PGAC_C_BUILTIN_BSWAP32
+
+
+
 # PGAC_C_BUILTIN_CONSTANT_P
 # -
 # Check if the C compiler understands __builtin_constant_p(),
diff --git a/configure b/configure
index fce4908..27092ec 100755
--- a/configure
+++ b/configure
@@ -10324,6 +10324,36 @@ if test x$pgac_cv__types_compatible = xyes ; then
 $as_echo "#define HAVE__BUILTIN_TYPES_COMPATIBLE_P 1" >>confdefs.h
 
 fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_bswap32" >&5
+$as_echo_n "checking for __builtin_bswap32... " >&6; }
+if ${pgac_cv__builtin_bswap32+:} false; then :
+  $as_echo_n "(cached) " >&6
+else
+  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
+/* end confdefs.h.  */
+static unsigned long int x = __builtin_bswap32(0xaabbccdd);
+int
+main ()
+{
+
+  ;
+  return 0;
+}
+_ACEOF
+if ac_fn_c_try_compile "$LINENO"; then :
+  pgac_cv__builtin_bswap32=yes
+else
+  pgac_cv__builtin_bswap32=no
+fi
+rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+fi
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv__builtin_bswap32" >&5
+$as_echo "$pgac_cv__builtin_bswap32" >&6; }
+if test x$pgac_cv__builtin_bswap32 = xyes ; then
+
+$as_echo "#define HAVE__BUILTIN_BSWAP32 1" >>confdefs.h
+
+fi
 { $as_echo "$as_me:${as_lineno-$LINENO}: checking for __builtin_constant_p" >&5
 $as_echo_n "checking for __builtin_constant_p... " >&6; }
 if ${pgac_cv__builtin_constant_p+:} false; then :
 if ${pgac_cv__builtin_constant_p+:} false; then :
diff --git a/configure.in b/configure.in
index 6aa69fb..0206836 100644
--- a/configure.in
+++ b/configure.in
@@ -1176,6 +1176,7 @@ PGAC_C_SIGNED
 PGAC_C_FUNCNAME_SUPPORT
 PGAC_C_STATIC_ASSERT
 PGAC_C_TYPES_COMPATIBLE
+PGAC_C_BUILTIN_BSWAP32
 PGAC_C_BUILTIN_CONSTANT_P
 PGAC_C_BUILTIN_UNREACHABLE
 PGAC_C_VA_ARGS
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 83f04e9..7962757 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -666,6 +666,9 @@
 /* Define to 1 if you have the <winldap.h> header file. */
 #undef HAVE_WINLDAP_H
 
+/* Define to 1 if your compiler understands __builtin_bswap32. */
+#undef HAVE__BUILTIN_BSWAP32
+
 /* Define to 1 if your compiler understands 

Re: [HACKERS] TABLESAMPLE patch

2015-01-09 Thread Michael Paquier
On Fri, Jan 9, 2015 at 1:10 AM, Petr Jelinek p...@2ndquadrant.com wrote:
 On 06/01/15 14:22, Petr Jelinek wrote:

 On 06/01/15 08:51, Michael Paquier wrote:

 On Tue, Dec 23, 2014 at 5:21 AM, Petr Jelinek p...@2ndquadrant.com
 wrote:

 Attached is v3 which besides the fixes mentioned above also includes
 changes
 discussed with Tomas (except the CREATE/DROP TABLESAMPLE METHOD),
 fixes for
 crash with FETCH FIRST and is rebased against current master.

 This patch needs a rebase, there is a small conflict in
 parallel_schedule.


 Sigh, I really wish we had automation that checks this automatically for
 patches in CF.


 Here is rebase against current master.
Thanks!


 Structurally speaking, I think that the tsm methods should be added in
 src/backend/utils and not src/backend/access which is more high-level
 as tsm_bernoulli.c and tsm_system.c contain only a set of new


 I am not sure if I parsed this correctly, do you mean to say that only
 low-level access functions belong to src/backend/access? Makes sense.


 Made this change.


 procedure functions. Having a single header file tsm.h would be also a
 good thing.



 Moved into single tablesample.h file.


 Regarding the naming, is tsm (table sample method) really appealing?
 Wouldn't it be better to use simply tablesample_* for the file names
 and the method names?


 I created the src/backend/tablesample and files are named just system.c and
 bernoulli.c, but I kept tsm_ prefix for methods as they become too long for
 my taste when prefixing with tablesample_.

 This is a large patch... Wouldn't sampling.[c|h] extracted from
 ANALYZE live better as a refactoring patch? This would limit a bit bug
 occurrences on the main patch.


 That's a good idea, I'll split it into patch series.


 I've split the sampling.c/h into separate patch.

 I also wrote basic CREATE/DROP TABLESAMPLE METHOD support, again as a separate
 patch in the attached patch set. This also includes a test module with a simple
 custom tablesample method.

 There are some very minor cleanups in the main tablesample code itself but
 no functional changes.

Some comments about the 1st patch:
1) Nitpicking:
+ *   Block sampling routines shared by ANALYZE and TABLESAMPLE.
TABLESAMPLE is not added yet. You may as well simplify this
description to something like "Sampling routines for relation
blocks".

2) file_fdw and postgres_fdw do not compile correctly as they still
use anl_random_fract. This function is still mentioned in vacuum.h as
well.

3) Not really an issue of this patch, but I'd think that this comment
should be reworked:
+ * BlockSampler is used for stage one of our new two-stage tuple
+ * sampling mechanism as discussed on pgsql-hackers 2004-04-02 (subject
+ * Large DB).  It selects a random sample of samplesize blocks out of
+ * the nblocks blocks in the table.  If the table has less than
+ * samplesize blocks, all blocks are selected.

4) As a refactoring patch, why is the function providing a random
value changed? Shouldn't sample_random_fract be consistent with
anl_random_fract?

5) The seed numbers can be changed to RAND48_SEED_0, RAND48_SEED_1 and
RAND48_SEED_2 instead of being hardcoded?
Regards,
-- 
Michael




Re: [HACKERS] Async execution of postgres_fdw.

2015-01-09 Thread Kyotaro HORIGUCHI
Hello, thank you for the comment.

This is the second version of the patch.

- Refactored to make the code simpler and clearer.
- Added comment about logic outline and struct members.
- Removed trailing white spaces.

- No additional test yet.


==
 warning: 3 lines add whitespace errors.

Whoops. Fixed.

 2. The patches compile cleanly.
 3. The regression is clean, even in contrib/postgres_fdw and
 contrib/file_fdw
 
 Tests
 ---
 We need tests to make sure that the logic remains intact even after further
 changes in this area. Couple of tests which require multiple foreign scans
 within the same query fetching rows more than fetch size (100) would be
 required. Also, some DMLs, which involve multiple foreign scans would test
 the sanity when UPDATE/DELETE interleave such scans. By multiple foreign
 scans I mean both multiple scans on a single foreign server and multiple
 scans spread across multiple foreign servers.

Additional tests might indeed be needed. Some of the testing related
to this patch is done implicitly by the present regression
tests, but there are no explicit ones.

fetch_size is currently a bare constant, so I think it is not really
necessary to test other fetch sizes. Even if a different size could
potentially cause a problem, it would only be found when that
different number is actually applied.

In the current design, an async scan is started only for the first
scan on a connection, and if the next scan or modify claims the
same connection, the async state is immediately finished and the
behavior is the same as in the unpatched version. But since
asynchronous/parallel scanning is being introduced in some form, that
kind of test does seem to be needed.

Multi-server tests are not done in the unpatched version either, but
there is no difference between multiple foreign servers on the same
remote host and servers distributed across multiple remote hosts. The
async scan in this patch works only within the same foreign server, so
there seems to be no need for that kind of test. Do you have any
specific concern about this?

In any case, I will add some explicit tests for async canceling in
the next patch.

 Code
 ---
 Because previous conn is now replaced by conn-pgconn, the double
 indirection makes the code a bit ugly and prone to segfaults (conn being
 NULL or invalid pointer). Can we minimize such code or wrap it under a
 macro?

Agreed. It was an annoyance for me as well. I've done the following
things to encapsulate PgFdwConn to some extent in the second
version of this patch; they are described below.

 We need some comments about the structure definition of PgFdwConn and its
 members explaining the purpose of this structure and its members.

Thank you for pointing that out; I had forgotten it. I added simple
comments there.

 Same is the case with enum PgFdwConnState. In fact, the state diagram of a
 connection has become more complicated with the async connections, so it
 might be better to explain that state diagram at one place in the code
 (through comments). The definition of the enum might be a good place to do
 that.

I added a comment describing the logic and the meaning of the
states just above the enum declaration.

 Otherwise, the logic of connection maintenance is spread at multiple
 places and is difficult to understand by looking at the code.
 
 In function GetConnection(), at line
 elog(DEBUG3, new postgres_fdw connection %p for server \%s\,
 -entry-conn, server-servername);
 +entry-conn-pgconn, server-servername);

Thank you, I replaced conn references of this form with PFC_PGCONN(conn).

 entry-conn-pgconn may not necessarily be a new connection and may be NULL
 (as the next line check it for being NULL). So, I think this line should be
 moved within the following if block after pgconn has been initialised with
 the new connection.
 +   if (entry-conn-pgconn == NULL)
 +   {
 +   entry-conn-pgconn = connect_pg_server(server, user);
 +   entry-conn-nscans = 0;
 +   entry-conn-async_state = PGFDW_CONN_IDLE;
 +   entry-conn-async_scan = NULL;
 +   }
 
 The if condition if (entry-conn == NULL) in GetConnection(), used to track
 whether there is a PGConn active for the given entry, now it tracks whether
 it has PgFdwConn for the same.

After some consideration, I decided to make PgFdwConn be handled
more like PGconn. The patch has shrunk as a result and now looks
clearer.

- Added macros to encapsulate PgFdwConn struct. (One of them is a function)

- Added macros to call PQxxx functions taking PgFdwConn.

- connect_pg_server() returns PgFdwConn.

- connection.c does not touch the inside of PgFdwConn except at a few
  points. PgFdwConn's memory is allocated with malloc(), as PGconn's is,
  and freed by PFCfinish(), the counterpart of PQfinish(). (A sketch of
  the wrapper's shape follows below.)
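
For illustration only, here is a rough sketch of that shape. The field names
match the quoted hunks above; the second enum value and the exact layout are
assumptions, not necessarily the patch's actual definition.

#include "libpq-fe.h"

typedef enum PgFdwConnState
{
    PGFDW_CONN_IDLE,            /* no async fetch owns the connection */
    PGFDW_CONN_ASYNC_RUNNING    /* assumed name: an async fetch is in flight */
} PgFdwConnState;

typedef struct PgFdwConn
{
    PGconn         *pgconn;     /* underlying libpq connection */
    int             nscans;     /* number of scans using this connection */
    PgFdwConnState  async_state;    /* where we are in the async protocol */
    void           *async_scan; /* scan state that started the async fetch */
} PgFdwConn;

/* accessor, so callers do not reach into the struct directly */
#define PFC_PGCONN(c)   ((c)->pgconn)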

As a result of those changes, the patch has taken the following shape.

- All points where PGconn was used now use PgFdwConn. These are
  seemingly simple replacements.

- The major functional changes are concentrated within
  

Re: [HACKERS] Possible typo in create_policy.sgml

2015-01-09 Thread Dean Rasheed
On 8 January 2015 at 18:57, Stephen Frost sfr...@snowman.net wrote:
 What do you think of the attached rewording?

 Rewording it this way is a great idea.  Hopefully that will help address
 the confusion which we've seen.  The only comment I have offhand is:
 should we should add a sentence to this paragraph about the default-deny
 policy?


Yes, good idea, although I think perhaps that sentence should be added
to the preceding paragraph, after noting that RLS has to be enabled on
the table for the policies to be applied:

   The commandCREATE POLICY/command command defines a new policy for a
   table.  Note that row level security must also be enabled on the table using
   commandALTER TABLE/command in order for created policies to be applied.
   Once row level security has been enabled, a default-deny policy is
used and no rows
   in the table are visible unless permitted by a specific policy.

   A policy permits SELECT, INSERT, UPDATE or DELETE commands to access rows
   in a table that has row level security enabled.  Access to existing table
   rows is granted if they match a policy expression specified via USING,
   while new rows that would be created via INSERT or UPDATE are checked
   against policy expressions specified via WITH CHECK.  For policy
   expressions specified via USING which grant access to existing rows, the
   system will generally test the policy expressions prior to any
   qualifications that appear in the query itself, in order to prevent the
   inadvertent exposure of the protected data to user-defined functions which
   might not be trustworthy.  However, functions and operators marked by the
   system (or the system administrator) as LEAKPROOF may be evaluated before
   policy expressions, as they are assumed to be trustworthy.

Also, perhaps the ALTER TABLE in the first paragraph should be
turned into a link.

Regards,
Dean




Re: [HACKERS] POLA violation with \c service=

2015-01-09 Thread David Fetter
On Thu, Jan 08, 2015 at 08:04:47PM -0800, David Fetter wrote:
 On Mon, Jan 05, 2015 at 02:26:59PM -0800, David Fetter wrote:
  On Tue, Dec 30, 2014 at 04:48:11PM -0800, David Fetter wrote:
   On Wed, Dec 17, 2014 at 08:14:04AM -0500, Andrew Dunstan wrote:

Yeah, that's the correct solution. It should not be terribly difficult 
to
create a test for a conninfo string in the dbname parameter. That's what
libpq does after all. We certainly don't want psql to have to try to
interpret the service file. psql just needs to let libpq do its work in 
this
situation.
   
   This took a little longer to get time to polish than I'd hoped, but
   please find attached a patch which:
   
   - Correctly connects to service= and postgres(ql)?:// with \c
   - Disallows tab completion in the above cases
   
   I'd like to see about having tab completion actually work correctly in
   at least the service= case, but that's a matter for a follow-on patch.
   
   Thanks to Andrew Dunstan for the original patch, and to Andrew Gierth
   for his help getting it into shape.
   
   Cheers,
   David.
  
  I should mention that the patch also corrects a problem where the
  password was being saved/discarded at inappropriate times.  Please
  push this patch to the back branches :)
 
 Per discussion with Stephen Frost, I've documented the previously
 undocumented behavior with conninfo strings and URIs.

Some C cleanups...

I think longer term we should see about having libpq export the
functions I've put in common.[ch], but that's for a later patch.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index bdfb67c..eb6a57b 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -796,18 +796,20 @@ testdb=gt;
   /varlistentry
 
   varlistentry
-termliteral\c/literal or literal\connect/literal literal[ 
replaceable class=parameterdbname/replaceable [ replaceable 
class=parameterusername/replaceable ] [ replaceable 
class=parameterhost/replaceable ] [ replaceable 
class=parameterport/replaceable ] ]/literal/term
+termliteral\c/literal or literal\connect/literal literal [ 
{ [ replaceable class=parameterdbname/replaceable [ replaceable 
class=parameterusername/replaceable ] [ replaceable 
class=parameterhost/replaceable ] [ replaceable 
class=parameterport/replaceable ] ] | replaceable 
class=parameterconninfo/replaceable string | replaceable 
class=parameterURI/replaceable } ] /literal/term
 listitem
 para
 Establishes a new connection to a productnamePostgreSQL/
-server. If the new connection is successfully made, the
-previous connection is closed. If any of replaceable
+server using positional parameters as described below, a
+parameterconninfo/parameter string, or a acronymURI/acronym. 
If the new connection is
+successfully made, the
+previous connection is closed.  When using positional parameters, if 
any of replaceable
 class=parameterdbname/replaceable, replaceable
 class=parameterusername/replaceable, replaceable
 class=parameterhost/replaceable or replaceable
 class=parameterport/replaceable are omitted or specified
 as literal-/literal, the value of that parameter from the
-previous connection is used. If there is no previous
+previous connection is used.  If using positional parameters and there 
is no previous
 connection, the applicationlibpq/application default for
 the parameter's value is used.
 /para
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 4ac21f2..f290fbc 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1610,6 +1610,8 @@ do_connect(char *dbname, char *user, char *host, char 
*port)
PGconn *o_conn = pset.db,
   *n_conn;
char   *password = NULL;
+   boolkeep_password = true;
+   boolhas_connection_string = false;
 
if (!o_conn  (!dbname || !user || !host || !port))
{
@@ -1623,14 +1625,32 @@ do_connect(char *dbname, char *user, char *host, char 
*port)
return false;
}
 
-   if (!dbname)
-   dbname = PQdb(o_conn);
if (!user)
user = PQuser(o_conn);
+   else if (strcmp(user, PQuser(o_conn)) != 0)
+   keep_password = false;
+
if (!host)
host = PQhost(o_conn);
+   else if (strcmp(host, PQhost(o_conn)) != 0)
+   keep_password = false;
+
if (!port)
port = PQport(o_conn);
+   else if (strcmp(port, PQport(o_conn)) != 0)
+

Re: [HACKERS] Patch: [BUGS] BUG #12320: json parsing with embedded double quotes

2015-01-09 Thread Andrew Dunstan


On 01/08/2015 08:42 PM, Aaron Botsis wrote:



On Jan 8, 2015, at 3:44 PM, Andrew Dunstan and...@dunslane.net wrote:


On 01/08/2015 03:05 PM, Aaron Botsis wrote:



It's also unnecessary. CSV format, while not designed for this, is nevertheless 
sufficiently flexible to allow successful import of json data meeting certain 
criteria (essentially no newlines), like this:

  copy the_table(jsonfield)
  from '/path/to/jsondata'
  csv quote e'\x01' delimiter e'\x02';

While perhaps unnecessary, given the size and simplicity of the patch, IMO it’s 
a no brainer to merge (it actually makes the code smaller by 3 lines). It also 
enables non-json use cases anytime one might want to preserve embedded escapes, 
or use different ones entirely. Do you see other reasons not to commit it?


Well, for one thing it's seriously incomplete. You need to be able to change 
the delimiter as well. Otherwise, any embedded tab in the json will cause you 
major grief.

Currently the delimiter and the escape MUST be a single byte non-nul character, 
and there is a check for this in csv mode. Your patch would allow any arbitrary 
string (including one of zero length) for the escape in text mode, and would 
then silently ignore all but the first byte. That's not the way we like to do 
things.

And, frankly, I would need to spend quite a lot more time thinking about other 
implications than I have given it so far. This is an area where I tend to be 
VERY cautious about making changes. This is a fairly fragile ecosystem.

Good point.

This version:

* doesn't allow ENCODING in BINARY mode (existing bug)
* doesn’t allow ESCAPE in BINARY mode
* makes COPY TO work with escape
* ensures escape char length is < 2 for text mode, 1 for CSV

Couple more things to realize: setting both the escape and delimiter characters 
to null won’t be any different than how you fiddled with them in CSV mode. The 
code paths will be the same because we should never encounter a null in the 
middle of the string. And even if we did (and the encoding didn’t catch it), 
we’d treat it just like any other field delimiter or escape character.

So I’m going to do a bit more testing with another patch tomorrow with 
delimiters removed. If you can think of any specific cases you think will break 
it let me know and I’ll make sure to add regression tests for them as well.




Well, I still need convincing that this is the best solution to the 
problem. As I said, I need to spend more time thinking about it.


cheers

andrew




Re: [HACKERS] Patch: [BUGS] BUG #12320: json parsing with embedded double quotes

2015-01-09 Thread Aaron Botsis


 On Jan 8, 2015, at 3:44 PM, Andrew Dunstan and...@dunslane.net wrote:
 
 
 On 01/08/2015 03:05 PM, Aaron Botsis wrote:
 
 
 It's also unnecessary. CSV format, while not designed for this, is 
 nevertheless sufficiently flexible to allow successful import of json data 
 meeting certain criteria (essentially no newlines), like this:
 
  copy the_table(jsonfield)
  from '/path/to/jsondata'
  csv quote e'\x01' delimiter e'\x02';
 
 While perhaps unnecessary, given the size and simplicity of the patch, IMO 
 it’s a no brainer to merge (it actually makes the code smaller by 3 lines). 
 It also enables non-json use cases anytime one might want to preserve 
 embedded escapes, or use different ones entirely. Do you see other reasons 
 not to commit it?
 
 
 Well, for one thing it's seriously incomplete. You need to be able to change 
 the delimiter as well. Otherwise, any embedded tab in the json will cause you 
 major grief.
 
 Currently the delimiter and the escape MUST be a single byte non-nul 
 character, and there is a check for this in csv mode. Your patch would allow 
 any arbitrary string (including one of zero length) for the escape in text 
 mode, and would then silently ignore all but the first byte. That's not the 
 way we like to do things.
 
 And, frankly, I would need to spend quite a lot more time thinking about 
 other implications than I have given it so far. This is an area where I tend 
 to be VERY cautious about making changes. This is a fairly fragile ecosystem.

Good point.

This version:

* doesn't allow ENCODING in BINARY mode (existing bug)
* doesn’t allow ESCAPE in BINARY mode
* makes COPY TO work with escape
* ensures escape char length is < 2 for text mode, 1 for CSV

Couple more things to realize: setting both the escape and delimiter characters 
to null won’t be any different than how you fiddled with them in CSV mode. The 
code paths will be the same because we should never encounter a null in the 
middle of the string. And even if we did (and the encoding didn’t catch it), 
we’d treat it just like any other field delimiter or escape character.

So I’m going to do a bit more testing with another patch tomorrow with 
delimiters removed. If you can think of any specific cases you think will break 
it let me know and I’ll make sure to add regression tests for them as well. 

Cheers!

Aaron



escape-without-csv-v4.patch
Description: Binary data






Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Stephen Frost
Amit,

* Amit Kapila (amit.kapil...@gmail.com) wrote:
 On Fri, Jan 9, 2015 at 1:02 AM, Jim Nasby jim.na...@bluetreble.com wrote:
  I agree, but we should try and warn the user if they set
  parallel_seqscan_degree close to max_worker_processes, or at least give
  some indication of what's going on. This is something you could end up
  beating your head on wondering why it's not working.
 
 Yet another way to handle the case when enough workers are not
 available is to let user  specify the desired minimum percentage of
 requested parallel workers with parameter like
 PARALLEL_QUERY_MIN_PERCENT. For  example, if you specify
 50 for this parameter, then at least 50% of the parallel workers
 requested for any  parallel operation must be available in order for
 the operation to succeed else it will give error. If the value is set to
 null, then all parallel operations will proceed as long as at least two
 parallel workers are available for processing.

Ugh.  I'm not a fan of this..  Based on how we're talking about modeling
this, if we decide to parallelize at all, then we expect it to be a win.
I don't like the idea of throwing an error if, at execution time, we end
up not being able to actually get the number of workers we want-
instead, we should degrade gracefully all the way back to serial, if
necessary.  Perhaps we should send a NOTICE or something along those
lines to let the user know we weren't able to get the level of
parallelization that the plan originally asked for, but I really don't
like just throwing an error.

Now, for debugging purposes, I could see such a parameter being
available but it should default to 'off/never-fail'.

Thanks,

Stephen




Re: [HACKERS] Patch: [BUGS] BUG #12320: json parsing with embedded double quotes

2015-01-09 Thread Aaron Botsis

 On Jan 7, 2015, at 2:45 PM, Andrew Dunstan and...@dunslane.net wrote:
 
 
 On 01/07/2015 08:25 AM, Aaron Botsis wrote:
 Hi folks, I was having a problem importing json data with COPY. Lots of 
 things export data nicely as one json blob per line. This is excellent for 
 directly importing into a JSON/JSONB column for analysis.
 
 ...Except when there’s an embedded doublequote. Or anything that’s escaped. 
 COPY handles this, but by the time the escaped char hit the JSON parser, 
 it's not escaped anymore. This breaks the JSON parsing. This means I need to 
 manipulate the input data to double-escape it. See bug #12320 for an 
 example. Yuck.
 
 I propose this small patch that simply allows specifying COPY … ESCAPE 
 without requiring the CSV parser. It will make it much easier to directly 
 use json formatted export data for folks going forward. This seemed like the 
 simplest route.
 
 This isn't a bug. Neither CSV format nor TEXT format are partucularly 
 suitable for json. I'm quite certain I could compose legal json that will 
 break your proposal (for example, with an embedded newline in the white 
 space.)

Sorry - though I originally reported it as a bug, I didn’t mean to imply it 
ultimately was one. :) The patch is a feature enhancement.

 It's also unnecessary. CSV format, while not designed for this, is 
 nevertheless sufficiently flexible to allow successful import of json data 
 meeting certain criteria (essentially no newlines), like this:
 
   copy the_table(jsonfield)
   from '/path/to/jsondata'
   csv quote e'\x01' delimiter e'\x02’;

While perhaps unnecessary, given the size and simplicity of the patch, IMO it’s 
a no brainer to merge (it actually makes the code smaller by 3 lines). It also 
enables non-json use cases anytime one might want to preserve embedded escapes, 
or use different ones entirely. Do you see other reasons not to commit it?

 You aren't the first person to encounter this problem. See 
 http://adpgtech.blogspot.com/2014/09/importing-json-data.html 
 http://adpgtech.blogspot.com/2014/09/importing-json-data.html
 
 Maybe we need to add something like this to the docs, or to the wiki.

In your post you acknowledged a text mode copy with null escape character would 
have solved your problem, and to me, that was the intuitive/obvious choice as 
well. *shrug*

 Note too my comment in that blog post:
 
   Now this solution is a bit of a hack. I wonder if there's a case for
   a COPY mode that simply treats each line as a single datum. I also
   wonder if we need some more specialized tools for importing JSON,
   possibly one or more Foreign Data Wrappers. Such things could
   handle, say, embedded newline punctuation.

Agree. Though https://github.com/citusdata/json_fdw 
https://github.com/citusdata/json_fdw already exists (I haven’t used it). How 
do folks feel about a COPY mode that reads a single datum until it finds a 
single character record terminator alone on a line? something like:

=# COPY test(data) from stdin terminator '}';
End with a backslash and a period on a line by itself.
{
  "a": 123, "c": "string",  "b": [1, 2, 3
 ]
}
{
  [1,2,3]
}
\.
COPY 2

You could also get fancy and support multi-character record terminators which 
would allow the same thing in a slightly different way:
COPY test(data) from stdin terminator '\n}';

I don’t know how crazy you could get without a lot of rework. It might have to 
be used in conjunction with a more constrained mode like COPY…RAW”.

Aaron

Re: [HACKERS] Fixing memory leak in pg_upgrade

2015-01-09 Thread Bruce Momjian
On Fri, Jan  9, 2015 at 11:34:24AM -0500, Tom Lane wrote:
 Tatsuo Ishii is...@postgresql.org writes:
  According to Coverity, there's a memory leak bug in transfer_all_new_dbs().
 
 It's pretty difficult to get excited about that; how many table-free
 databases is pg_upgrade likely to see in one run?  But surely we could
 just move the pg_free call to after the if-block.

I have fixed this with the attached, applied patch.  I thought malloc(0)
would return null, but our src/common pg_malloc() has:

/* Avoid unportable behavior of malloc(0) */
if (size == 0)
size = 1;

so some memory is allocated, and has to be freed.  I looked at avoiding
the call to gen_db_file_maps() for old_db-rel_arr.nrels == 0, but there
are checks in there comparing the old/new relation counts, so it can't
be skipped.  I also removed the unnecessary memory initialization.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
commit ac7009abd228362042edd10e6b12556ddef35171
Author: Bruce Momjian br...@momjian.us
Date:   Fri Jan 9 12:12:30 2015 -0500

pg_upgrade:  fix one-byte per empty db memory leak

Report by Tatsuo Ishii, Coverity

diff --git a/contrib/pg_upgrade/relfilenode.c b/contrib/pg_upgrade/relfilenode.c
new file mode 100644
index 70753f2..423802b
*** a/contrib/pg_upgrade/relfilenode.c
--- b/contrib/pg_upgrade/relfilenode.c
*** transfer_all_new_dbs(DbInfoArr *old_db_a
*** 110,119 
pg_fatal(old database \%s\ not found in the new 
cluster\n,
 old_db-db_name);
  
-   n_maps = 0;
mappings = gen_db_file_maps(old_db, new_db, n_maps, old_pgdata,

new_pgdata);
- 
if (n_maps)
{
print_maps(mappings, n_maps, new_db-db_name);
--- 110,117 
*** transfer_all_new_dbs(DbInfoArr *old_db_a
*** 123,131 
  #endif
transfer_single_new_db(pageConverter, mappings, n_maps,
   
old_tablespace);
- 
-   pg_free(mappings);
}
}
  
return;
--- 121,129 
  #endif
transfer_single_new_db(pageConverter, mappings, n_maps,
   
old_tablespace);
}
+   /* We allocate something even for n_maps == 0 */
+   pg_free(mappings);
}
  
return;



Re: [HACKERS] Fixing memory leak in pg_upgrade

2015-01-09 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 According to Coverity, there's a memory leak bug in transfer_all_new_dbs().

It's pretty difficult to get excited about that; how many table-free
databases is pg_upgrade likely to see in one run?  But surely we could
just move the pg_free call to after the if-block.

regards, tom lane




Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Stephen Frost
Amit,

* Amit Kapila (amit.kapil...@gmail.com) wrote:
 On Fri, Dec 19, 2014 at 7:57 PM, Stephen Frost sfr...@snowman.net wrote:
  There's certainly documentation available from the other RDBMS' which
  already support parallel query, as one source.  Other academic papers
  exist (and once you've linked into one, the references and prior work
  helps bring in others).  Sadly, I don't currently have ACM access (might
  have to change that..), but there are publicly available papers also,
 
 I have gone through a couple of papers on what some other databases
 do in the case of parallel sequential scan, and here is a brief summary
 of that and of how I am planning to handle it in the patch:

Great, thanks!

 Costing:
 In one of the papers [1] suggested by you, below is the summarisation:
   a. Startup costs are negligible if processes can be reused
rather than created afresh.
 b. Communication cost consists of the CPU cost of sending
and receiving messages.
   c. Communication costs can exceed the cost of operators such
 as scanning, joining or grouping
 These findings lead to the important conclusion that
 Query optimization should be concerned with communication costs
 but not with startup costs.
 
 In our case, as we currently don't have a mechanism to reuse parallel
 workers, we need to account for that cost as well.  Based on that,
 I am planning to add three new parameters: cpu_tuple_comm_cost,
 parallel_setup_cost, parallel_startup_cost
 *  cpu_tuple_comm_cost - Cost of CPU time to pass a tuple from worker
 to master backend with default value
 DEFAULT_CPU_TUPLE_COMM_COST as 0.1, this will be multiplied
 with tuples expected to be selected
 *  parallel_setup_cost - Cost of setting up shared memory for parallelism
with default value as 100.0
  *  parallel_startup_cost  - Cost of starting up parallel workers with
 default
 value as 1000.0 multiplied by number of workers decided for scan.
 
 I will do some experiments to finalise the default values, but in general
 I feel that developing the cost model on the above parameters is good.

The parameters sound reasonable but I'm a bit worried about the way
you're describing the implementation.  Specifically this comment:

Cost of starting up parallel workers with default value as 1000.0
multiplied by number of workers decided for scan.

That appears to imply that we'll decide on the number of workers, figure
out the cost, and then consider parallel as one path and
not-parallel as another.  I'm worried that if I end up setting the max
parallel workers to 32 for my big, beefy, mostly-single-user system then
I'll actually end up not getting parallel execution because we'll always
be including the full startup cost of 32 threads.  For huge queries,
it'll probably be fine, but there's a lot of room to parallelize things
at levels less than 32 which we won't even consider.

What I was advocating for up-thread was to consider multiple parallel
paths and to pick whichever ends up being the lowest overall cost.  The
flip-side to that is increased planning time.  Perhaps we can come up
with an efficient way of working out where the break-point is based on
the non-parallel cost and go at it from that direction instead of
building out whole paths for each increment of parallelism.

I'd really like to be able to set the 'max parallel' high and then have
the optimizer figure out how many workers should actually be spawned for
a given query.
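
To illustrate the break-even idea, here is a purely hypothetical sketch, not
the patch's costing code: it plugs in the three parameters and default values
proposed upthread and assumes a simple cost shape in which the serial run
cost is split across the workers plus the participating master.

static int
choose_parallel_degree(double serial_run_cost, double ntuples, int max_workers)
{
    const double parallel_setup_cost = 100.0;   /* proposed defaults */
    const double parallel_startup_cost = 1000.0;
    const double cpu_tuple_comm_cost = 0.1;

    double      best_cost = serial_run_cost;
    int         best_workers = 0;       /* 0 means stay serial */
    int         nworkers;

    for (nworkers = 1; nworkers <= max_workers; nworkers++)
    {
        double      cost = parallel_setup_cost
                         + parallel_startup_cost * nworkers
                         + cpu_tuple_comm_cost * ntuples
                         + serial_run_cost / (nworkers + 1);

        if (cost < best_cost)
        {
            best_cost = cost;
            best_workers = nworkers;
        }
    }
    return best_workers;
}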

 Execution:
 Most other databases do partition-level scans, with each individual
 parallel worker scanning a partition on a different disk.  However,
 it seems Amazon DynamoDB [2] also works on something
 similar to what I have used in the patch, which means on fixed
 blocks.  I think this kind of strategy seems better than dividing
 the blocks at runtime, because randomly dividing the blocks
 among workers could lead to random access for a parallel
 sequential scan.

Yeah, we also need to consider the i/o side of this, which will
definitely be tricky.  There are i/o systems out there which are faster
than a single CPU and ones where a single CPU can manage multiple i/o
channels.  There are also cases where the i/o system handles sequential
access nearly as fast as random and cases where sequential is much
faster than random.  Where we can get an idea of that distinction is
with seq_page_cost vs. random_page_cost as folks running on SSDs tend to
lower random_page_cost from the default to indicate that.

 Also I find, in whatever I have read (Oracle, DynamoDB), that most
 databases divide work among workers and the master backend acts
 as coordinator; at least that's what I could understand.

Yeah, I agree that's more typical.  Robert's point that the master
backend should participate is interesting but, as I recall, it was based
on the idea that the master could finish faster than the worker- but if
that's the case then we've planned it out wrong from the beginning.

Thanks!

Stephen



Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-09 Thread Kouhei Kaigai
 On Tue, Jan 6, 2015 at 9:17 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
  The attached patch is newer revision of custom-/foreign-join
  interface.
 
 It seems that the basic purpose of this patch is to allow a foreign scan
 or custom scan to have scanrelid == 0, reflecting the case where we are
 scanning a joinrel rather than a baserel.  The major problem that seems
 to create is that we can't set the target list from the relation descriptor,
 because there isn't one.  To work around that, you've added fdw_ps_list
 and custom_ps_tlist, which the FDW or custom-plan provider must set.  I
 don't know off-hand whether that's a good interface or not.  How does the
 FDW know what to stick in there?

In the most usual scenario, the FDW/CSP will build a ps_tlist according to
the target list of the joinrel (which contains a mixture of Var nodes that
reference the left side and the right side), plus the qualifier's expression
tree, if any. As long as the FDW can construct a remote query, it knows which
attributes will be returned and which relation each of them comes from; that
is exactly the information ps_tlist conveys to the core optimizer.


 There's a comment that seems to be trying to explain this:
 
 + * An optional fdw_ps_tlist is used to map a reference to an attribute
 + of
 + * underlying relation(s) on a pair of INDEX_VAR and alternative varattno.
 + * It looks like a scan on pseudo relation that is usually result of
 + * relations join on remote data source, and FDW driver is responsible
 + to
 + * set expected target list for this. If FDW returns records as
 + foreign-
 + * table definition, just put NIL here.
 
 ...but I can't understand what that's telling me.
 
Sorry, let me explain it another way.

A joinrel has a target list that may contain references to both the left
and the right relation. These are eventually mapped to either INNER_VAR or
OUTER_VAR, and the executor then switches the TupleTableSlot (either
ecxt_innertuple or ecxt_outertuple) according to that special varno.
On the other hand, because ForeignScan/CustomScan is a scan plan, it has
just one TupleTableSlot at execution time. Thus, we need a mechanism that
maps attributes from both relations onto particular positions in that slot;
such references are eventually translated into Var nodes with INDEX_VAR
that reference ecxt_scantuple.
Of course, ps_tlist is not necessary if the ForeignScan/CustomScan scans
a base relation literally. In that case, the interface contract expects
NIL to be set in the ps_tlist field.
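
To make that contract concrete, here is a minimal illustrative sketch of how
a provider might build a ps_tlist from a joinrel; it assumes the joinrel's
reltargetlist contains plain Vars and is not taken from the patch itself.

#include "postgres.h"
#include "nodes/makefuncs.h"
#include "nodes/relation.h"

static List *
build_ps_tlist(RelOptInfo *joinrel)
{
    List       *ps_tlist = NIL;
    ListCell   *lc;

    foreach(lc, joinrel->reltargetlist)
    {
        Var        *var = (Var *) lfirst(lc);   /* assumed to be a plain Var */

        /* each entry's resno becomes the attribute number in ecxt_scantuple */
        ps_tlist = lappend(ps_tlist,
                           makeTargetEntry((Expr *) copyObject(var),
                                           list_length(ps_tlist) + 1,
                                           NULL,
                                           false));
    }
    return ps_tlist;
}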

 You've added an Oid fdw_handler field to the ForeignScan and RelOptInfo
 structures.  I think this is the OID of the pg_proc entry for the handler
 function; and I think we need it because, if scanrelid == 0 then we don't
 have a relation that we can trace to a foreign table, to a server, to an
 FDW, and then to a handler.  So we need to get that information some other
 way.  When building joinrels, the fdw_handler OID, and the associated
 routine, are propagated from any two relations that share the same
 fdw_handler OID to the resulting joinrel.  I guess that's reasonable,
 although it feels a little weird that we're copying around both the OID
 and the structure-pointer.

Unlike a CustomScan node, a ForeignScan node does not carry function
pointers. In addition, it is dynamically allocated with palloc(), so we
have no guarantee that a pointer constructed at plan stage is still valid
when the executor begins.
That is why I put the OID of the FDW handler routine there.
Any better idea?
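
For what it's worth, the existing GetFdwRoutine() can turn that OID back into
the callback table at executor time; a hedged sketch (fdw_handler is the field
this patch adds, and the helper function is illustrative only):

#include "postgres.h"
#include "foreign/fdwapi.h"
#include "nodes/plannodes.h"

static FdwRoutine *
routine_for_foreign_join(ForeignScan *fscan)
{
    /* fdw_handler is the pg_proc OID carried in the ForeignScan node */
    return GetFdwRoutine(fscan->fdw_handler);
}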

 For non-obvious reasons, you've made create_plan_recurse() non-static.
 
When a custom-scan node replaces a join plan, it will have at least two
child plan nodes. The callback handler of PlanCustomPath needs to be able
to call create_plan_recurse() to transform the underlying path nodes into
plan nodes, because the custom-scan node may take other built-in scan or
sub-join nodes as its inner/outer input.
In the case of an FDW, the underlying scan relations are pushed to the
remote side, so we would not expect a ForeignScan to have underlying plans...

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com



Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Peter Geoghegan
On Fri, Jan 9, 2015 at 2:22 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Whoa, hang on. I think you're being a bit quick to dismiss that
 example. Why shouldn't I want an upsert where the majority of the
 table columns follow the usual make it so pattern of an upsert, but
 there is also this kind of audit column to be maintained? Then I would
 write something like

 INSERT INTO tbl (some values, 0)
   ON CONFLICT UPDATE SET same values, mod_count=mod_count+1;

 The root of the problem is the way that you're proposing to combine
 the RLS policies (using AND), which runs contrary to the way RLS
 policies are usually combined (using OR), which is why this kind of
 example fails -- RLS policies in general aren't intended to all be
 true simultaneously.

In case I wasn't clear, I'm proposing that we AND together the already
OR'd together UPDATE and INSERT quals.

-- 
Peter Geoghegan




Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Stephen Frost
* Peter Geoghegan (p...@heroku.com) wrote:
 On Fri, Jan 9, 2015 at 1:09 PM, Stephen Frost sfr...@snowman.net wrote:
  Therefore,
  I'm not sure that I see the point in checking the INSERT tuple against
  the UPDATE policy.
 
 I guess it wouldn't be hard to modify the struct WithCheckOption to
 store the cmd (e.g. RowSecurityPolicy-style ACL_*_CHR permissions),
 usable only in this context. That way, when ExecWithCheckOptions() is
 called from the INSERT, we can tell it to only enforce
 INSERT-applicable policy cmds (in particular, not
 UPDATE-only-applicable policy cmds that happen to end up associated
 with the same ResultRelation). Whereas, at the end of ExecUpdate(),
 that final ExecWithCheckOptions() call (call 3 of 3 when the UPDATE
 path is taken), INSERT-based policies can still be enforced against
 the final tuple (as can USING() quals that would ordinarily not be
 checked at that point either). A given ResultRelation's policies
 wouldn't necessarily always need to be enforced within
 ExecWithCheckOptions(), which is a change. Does that make sense to
 you?

That sounds reasonable on first blush.  I'll note that I've not looked
deeply at the UPSERT patch, but if the above means that the INSERT
policy is always checked against the INSERT tuple and the UPDATE policy
is always checked against the tuple-resulting-from-an-UPDATE, and that
tuples which are not visible due to the UPDATE policy throw an error in
that case, then it's working as I'd expect.

This does mean that the UPDATE policy won't be checked when the INSERT
path is used but that seems appropriate to me, as we can't check a
policy against a tuple that never exists (the result of the update
applied to an existing tuple).

 Note that I've already taught ExecWithCheckOptions() to not leak the
 existing, target tuple when the UPDATE path is taken (the tuple that
 can be referenced in the UPDATE using the TARGET.* alias), while still
 performing enforcement against it. I can teach it this too.

Ok.

Thanks!

Stephen




[HACKERS] libpq bad async behaviour

2015-01-09 Thread Daurnimator
I'm worried about libpq blocking in some circumstances; particularly
around SSL renegotiations.
This came up while writing an async postgres library for lua, I
realised that this code was dangerous:
https://github.com/daurnimator/cqueues-pgsql/blob/ee9c3fc85c94669b8128386d99d730fe93d9dbad/cqueues-pgsql.lua#L121


e.g. 1:
When a libpq connection is in non-blocking mode and PQflush returns 1, the docs say:
 "wait for the socket to be write-ready and call it again"
However, if the SSL layer is waiting on data for a renegotiation,
write readiness is not enough:
waiting for POLLOUT and calling PQflush again will (untested) just
return 1 again, and continue to do so until data is received.
This is a busy-loop, and will block the host application.
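
For concreteness, a minimal sketch of exactly that documented pattern (this is
not code from libpq); the comment marks why it can spin:

#include <poll.h>
#include <libpq-fe.h>

static int
flush_until_done(PGconn *conn)
{
    int         r;

    while ((r = PQflush(conn)) == 1)
    {
        struct pollfd pfd = {PQsocket(conn), POLLOUT, 0};

        if (poll(&pfd, 1, -1) < 0)
            return -1;
        /* write-ready does not guarantee progress during an SSL renegotiation */
    }
    return r;                   /* 0 on success, -1 on error */
}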

e.g. 2:
An SSL renegotiation happens while trying to receive a response.
According to 'andres' on IRC, inside of `PQisBusy` there is a busy loop:
 14:22:32 andres You'll not see that. Even though the explanation for it is 
 absolutely horrid.
 14:23:32 andres There's a busy retry loop because of exactly that reason 
 inside libpq's ssl read function whenever it hits a WANT_WRITE.
 14:23:58 daurnimator so... libpq will block my process? :(
 14:24:25 andres daurnimator: That case is unlikely to be hit often luckily 
 because of the OS buffering. But yea, it's really unsatisfying.
 14:26:06 andres daurnimator: I think this'll need a new API to be properly 
 fixed.


One idea that came to mind, if we want to keep the same API, is to hide
the socket behind an epoll file descriptor: an epoll fd always polls as
read-ready when an fd in its set becomes ready.
I think something similar is also possible with kqueue on BSD, event
ports on Solaris, and IOCP on Windows.
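
A rough, Linux-only sketch of that wrapper idea, assuming the caller then
waits for read-readiness on the returned epoll fd and the wrapper toggles
EPOLLOUT via EPOLL_CTL_MOD only while libpq has pending output:

#include <sys/epoll.h>
#include <unistd.h>
#include <libpq-fe.h>

static int
wrap_pq_socket(PGconn *conn)
{
    int         epfd = epoll_create1(0);
    struct epoll_event ev;

    if (epfd < 0)
        return -1;

    ev.events = EPOLLIN;        /* add EPOLLOUT only while output is pending */
    ev.data.fd = PQsocket(conn);

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, PQsocket(conn), &ev) < 0)
    {
        close(epfd);
        return -1;
    }
    return epfd;                /* the caller waits for readability on this fd */
}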


Regards,
Daurnimator.




Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Stephen Frost
Dean, Peter,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
 On 9 January 2015 at 08:49, Peter Geoghegan p...@heroku.com wrote:
  On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed dean.a.rash...@gmail.com 
  wrote:
  I was trying to think up an example where you might actually have
  different INSERT and UPDATE policies, and the best I can think of is
  some sort of mod_count column where you have an INSERT CHECK
  (mod_count = 0) and an UPDATE CHECK (mod_count  0). In that case,
  checking both policies would make an UPSERT impossible, whereas if you
  think of it as doing either an INSERT or an UPDATE, as the syntax
  suggests, it becomes possible.
 
  Why does this user want to do this upsert? If they're upserting, then
  the inserted row could only reasonably have a value of (mod_count =
  0). If updating, then they must have a constant value for the update
  path (a value that's greater than 0, naturally - say 2), which doesn't
  make any sense in the context of an upsert's auxiliary update - what
  happened to the 0 value? Sorry, but I don't think your example makes
  sense - I can't see what would motivate anyone to write a query like
  that with those RLS policies in place. It sounds like you're talking
  about an insert and a separate update that may or may not affect the
  same row, and not an upsert. Then those policies make sense, but in
  practice they render the upsert you describe contradictory.
 
 
 Whoa, hang on. I think you're being a bit quick to dismiss that
 example. Why shouldn't I want an upsert where the majority of the
 table columns follow the usual make it so pattern of an upsert, but
 there is also this kind of audit column to be maintained? Then I would
 write something like
 
 INSERT INTO tbl (some values, 0)
   ON CONFLICT UPDATE SET same values, mod_count=mod_count+1;

That's a good point- it doesn't make sense to compare the INSERT values
against the UPDATE policy as the result of the UPDATE could be a
completely different tuple, one we can't know the contents of until we
actually find a conflicting tuple.  We can't simply combine the policies
and run them at the same time, but I'm not sure that then means we
should let an INSERT .. ON CONFLICT UPDATE pass in values to the INSERT
which are not permitted by any INSERT policy on the relation.

Further, forgetting about RLS, what happens if a user doesn't have
INSERT rights on the mod_count column at all?  We still allow the above
to execute if the row already exists?  That doesn't strike me as a good
idea.  In this case, I do think it makes sense to consider RLS and our
regular permissions system as they are both 'OR' cases.  I might have
UPDATE rights for a particular column through multiple different roles
for a given relation, but if none of them give me INSERT rights for that
column, then I'd expect to get an error if I try to do an INSERT into
that column.

 The root of the problem is the way that you're proposing to combine
 the RLS policies (using AND), which runs contrary to the way RLS
 policies are usually combined (using OR), which is why this kind of
 example fails -- RLS policies in general aren't intended to all be
 true simultaneously.

I agree that RLS policies, in general, are not intended to all be true
simultaneously.  I don't think it then follows that an INSERT .. ON
CONFLICT UPDATE should be allowed if, for example, there are no INSERT
policies on an RLS-enabled relation, because the call happens to end up
doing an UPDATE.

To try and outline another possible use-case, consider that you have
tuples coming in from two independent sets of users, orange and blue.
All users are allowed to INSERT, but the 'color' column must equal the
user's.  For UPDATEs, the blue users can see both orange and blue
records while orange users can only see orange records.  Rows resulting
from an UPDATE can be orange or blue for blue users, but must be orange
for orange users.

Further, in this environment, attempts by orange users to insert blue
records are flagged to be audited for review because orange users should
never have access to nor be handling blue data.  With the approach
you're outlining, a command like:

INSERT INTO data VALUES (... , 'blue')
  ON CONFLICT UPDATE set ... = ...;

run by an orange user wouldn't error if it happened to result in an
UPDATE (presuming, of course, that the UPDATE didn't try to change the
existing color), but from a security perspective, it's clearly an error
for an orange user to be attempting to insert blue data.

I don't see this as impacting the more general case of how RLS policies
are applied.  An individual blue user might also be a member of some
orange-related role and it wouldn't make any sense for that user to then
be only able to see orange records (were we to AND RLS policies
together).

Where this leaves me, at least, is feeling like we should always apply
the INSERT WITH CHECK policy, then if there is a conflict, check the
UPDATE USING policy and throw an error if the row 

Re: [HACKERS] PQputCopyEnd doesn't adhere to its API contract

2015-01-09 Thread Bruce Momjian
On Thu, Sep  4, 2014 at 12:52:14PM -0400, Robert Haas wrote:
 On Wed, Sep 3, 2014 at 6:24 PM, Bruce Momjian br...@momjian.us wrote:
  On Fri, May  9, 2014 at 12:03:36PM -0400, Robert Haas wrote:
  On Thu, May 8, 2014 at 5:21 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   Perhaps the text should be like this:
  
   The result is 1 if the termination message was sent; or in nonblocking
   mode, this may only indicate that the termination message was 
   successfully
   queued.  (In nonblocking mode, to be certain that the data has been sent,
   you should next wait for write-ready and call functionPQflush/,
   repeating until it returns zero.)  Zero indicates that the function could
   not queue the termination message because of full buffers; this will only
   happen in nonblocking mode.  (In this case, wait for write-ready and try
   the PQputCopyEnd call again.)  If a hard error occurs, -1 is returned; 
   you
   can use functionPQerrorMessage/function to retrieve details.
 
  That looks pretty good.   However, I'm realizing this isn't the only
  place where we probably need to clarify the language.  Just to take
  one example near at hand, PQputCopyData may also return 1 when it's
  only queued the data; it seems to try even less hard than PQputCopyEnd
  to ensure that the data is actually sent.
 
  Uh, where are we on this?
 
 I think someone needs to take Tom's proposed language and make it into
 a patch.  And figure out which other functions in the documentation
 need similar updates.

I have developed such a patch --- attached.  I didn't see any other
mentions of blocking in the docs.
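
For readers wanting the concrete shape of what the revised wording describes,
here is a minimal poll()-based sketch (illustrative only, not part of the patch):

#include <poll.h>
#include <libpq-fe.h>

static int
end_copy_nonblocking(PGconn *conn)
{
    int         r;

    /* queue the termination message; 0 means the buffers were full */
    while ((r = PQputCopyEnd(conn, NULL)) == 0)
    {
        struct pollfd pfd = {PQsocket(conn), POLLOUT, 0};

        if (poll(&pfd, 1, -1) < 0)
            return -1;
    }
    if (r < 0)
        return -1;              /* see PQerrorMessage(conn) */

    /* then make sure the message has actually been sent */
    while ((r = PQflush(conn)) == 1)
    {
        struct pollfd pfd = {PQsocket(conn), POLLOUT, 0};

        if (poll(&pfd, 1, -1) < 0)
            return -1;
    }
    return (r == 0) ? 0 : -1;
}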

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
new file mode 100644
index de272c5..ad104a3
*** a/doc/src/sgml/libpq.sgml
--- b/doc/src/sgml/libpq.sgml
*** int PQsetnonblocking(PGconn *conn, int a
*** 4318,4325 
para
 In the nonblocking state, calls to
 functionPQsendQuery/function, functionPQputline/function,
!functionPQputnbytes/function, and
!functionPQendcopy/function will not block but instead return
 an error if they need to be called again.
/para
  
--- 4318,4325 
para
 In the nonblocking state, calls to
 functionPQsendQuery/function, functionPQputline/function,
!functionPQputnbytes/function, functionPQputCopyData/function,
!and functionPQendcopy/function will not block but instead return
 an error if they need to be called again.
/para
  
*** int PQputCopyData(PGconn *conn,
*** 4961,4969 
para
 Transmits the commandCOPY/command data in the specified
 parameterbuffer/, of length parameternbytes/, to the server.
!The result is 1 if the data was sent, zero if it was not sent
!because the attempt would block (this case is only possible if the
!connection is in nonblocking mode), or -1 if an error occurred.
 (Use functionPQerrorMessage/function to retrieve details if
 the return value is -1.  If the value is zero, wait for write-ready
 and try again.)
--- 4961,4969 
para
 Transmits the commandCOPY/command data in the specified
 parameterbuffer/, of length parameternbytes/, to the server.
!The result is 1 if the data was queued, zero if it was not queued
!because of full buffers (this will only happen in nonblocking mode),
!or -1 if an error occurred.
 (Use functionPQerrorMessage/function to retrieve details if
 the return value is -1.  If the value is zero, wait for write-ready
 and try again.)
*** int PQputCopyEnd(PGconn *conn,
*** 5009,5021 
 connections.)
/para
  
!   para
!The result is 1 if the termination data was sent, zero if it was
!not sent because the attempt would block (this case is only possible
!if the connection is in nonblocking mode), or -1 if an error
!occurred.  (Use functionPQerrorMessage/function to retrieve
!details if the return value is -1.  If the value is zero, wait for
!write-ready and try again.)
/para
  
para
--- 5009,5026 
 connections.)
/para
  
!   para 
!The result is 1 if the termination message was sent; or in
!nonblocking mode, this may only indicate that the termination
!message was successfully queued.  (In nonblocking mode, to be
!certain that the data has been sent, you should next wait for
!write-ready and call functionPQflush/, repeating until it
!returns zero.)  Zero indicates that the function could not queue
!the termination message because of full buffers; this will only
!happen in nonblocking mode.  (In this case, wait 

Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-09 Thread Jim Nasby

On 1/6/15, 5:43 PM, Kouhei Kaigai wrote:

scan_relid != InvalidOid
 


Ideally, they should be OidIsValid(scan_relid)


Scan.scanrelid is an index of range-tables list, not an object-id.
So, InvalidOid or OidIsValid() are not a good choice.


I think the name needs to change then; scan_relid certainly looks like the OID 
of a relation.

scan_index?
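
For reference, a minimal sketch of how scanrelid is resolved today: it indexes
the range table via the existing rt_fetch() macro, and the relation OID lives
in the RTE (the helper function here is illustrative only).

#include "postgres.h"
#include "nodes/execnodes.h"
#include "nodes/plannodes.h"
#include "parser/parsetree.h"

static Oid
scan_relation_oid(Scan *scan, EState *estate)
{
    /* scanrelid is a 1-based index into the range table, not a relation OID */
    RangeTblEntry *rte = rt_fetch(scan->scanrelid, estate->es_range_table);

    return rte->relid;          /* the actual pg_class OID lives in the RTE */
}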
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] Possible typo in create_policy.sgml

2015-01-09 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
 On 8 January 2015 at 18:57, Stephen Frost sfr...@snowman.net wrote:
  What do you think of the attached rewording?
 
  Rewording it this way is a great idea.  Hopefully that will help address
  the confusion which we've seen.  The only comment I have offhand is:
  should we should add a sentence to this paragraph about the default-deny
  policy?
 
 Yes, good idea, although I think perhaps that sentence should be added
 to the preceding paragraph, after noting that RLS has to be enabled on
 the table for the policies to be applied:

I'm a bit on the fence about these ending up as different paragraphs
then, but ignoring that for the moment, I'd suggest we further clarify
with:

   The commandCREATE POLICY/command command defines a new policy for a
   table.  Note that row level security must also be enabled on the table using
   commandALTER TABLE/command in order for created policies to be applied.
   Once row level security has been enabled, a default-deny policy is used and
   no rows in the table are visible, except to the table owner or
   superuser, unless permitted by a specific policy.

   A policy permits SELECT, INSERT, UPDATE or DELETE commands to access rows
   in a table that has row level security enabled.  Access to existing table
   rows is granted if they match a policy expression specified via USING,
   while new rows that would be created via INSERT or UPDATE are checked
   against policy expressions specified via WITH CHECK.  For policy
   expressions specified via USING which grant access to existing rows, the
   system will generally test the policy expressions prior to any
   qualifications that appear in the query itself, in order to prevent the
   inadvertent exposure of the protected data to user-defined functions which
   might not be trustworthy.  However, functions and operators marked by the
   system (or the system administrator) as LEAKPROOF may be evaluated before
   policy expressions, as they are assumed to be trustworthy.

 Also, perhaps the ALTER TABLE in the first paragraph should be
 turned into a link.

Ah, yes, agreed.

Thanks!

Stephen




Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Stephen Frost
Peter,

* Peter Geoghegan (p...@heroku.com) wrote:
 On Fri, Jan 9, 2015 at 2:22 AM, Dean Rasheed dean.a.rash...@gmail.com wrote:
  Whoa, hang on. I think you're being a bit quick to dismiss that
  example. Why shouldn't I want an upsert where the majority of the
  table columns follow the usual make it so pattern of an upsert, but
  there is also this kind of audit column to be maintained? Then I would
  write something like
 
  INSERT INTO tbl (some values, 0)
ON CONFLICT UPDATE SET same values, mod_count=mod_count+1;
 
  The root of the problem is the way that you're proposing to combine
  the RLS policies (using AND), which runs contrary to the way RLS
  policies are usually combined (using OR), which is why this kind of
  example fails -- RLS policies in general aren't intended to all be
  true simultaneously.
 
 In case I wasn't clear, I'm proposing that we AND together the already
 OR'd together UPDATE and INSERT quals.

I'm not sure how that would work exactly though, since the tuple the
UPDATE results in might be different from what the INSERT has, as Dean
pointed out.  The INSERT tuple might even pass the UPDATE policy where
the resulting tuple from the actual UPDATE part doesn't.

Thanks!

Stephen




Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Peter Geoghegan
On Fri, Jan 9, 2015 at 12:53 PM, Stephen Frost sfr...@snowman.net wrote:
 I'm not sure how that would work exactly though, since the tuple the
 UPDATE results in might be different from what the INSERT has, as Dean
 pointed out.  The INSERT tuple might even pass the UPDATE policy where
 the resulting tuple from the actual UPDATE part doesn't.

I'm not suggesting that Dean's example is totally without merit (his
revised explanation made it clearer what he meant). I just think, on
balance, that enforcing all INSERT + UPDATE policies at various points
is the best behavior. Part of the reason for that is that if you ask
people how they think it works, you'll get as many answers as the
number of people you ask. My proposal (which may or may not be the
same as yours, but if not is only slightly more restrictive) is
unlikely to affect many users negatively, and is relatively easy to
explain and reason about. There are workarounds for Dean's case, too.

-- 
Peter Geoghegan




Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Stefan Kaltenbrunner
On 01/09/2015 08:01 PM, Stephen Frost wrote:
 Amit,
 
 * Amit Kapila (amit.kapil...@gmail.com) wrote:
 On Fri, Jan 9, 2015 at 1:02 AM, Jim Nasby jim.na...@bluetreble.com wrote:
 I agree, but we should try and warn the user if they set
 parallel_seqscan_degree close to max_worker_processes, or at least give
 some indication of what's going on. This is something you could end up
 beating your head on wondering why it's not working.

 Yet another way to handle the case when enough workers are not
 available is to let the user specify the desired minimum percentage of
 requested parallel workers with a parameter like
 PARALLEL_QUERY_MIN_PERCENT. For example, if you specify
 50 for this parameter, then at least 50% of the parallel workers
 requested for any parallel operation must be available in order for
 the operation to succeed, else it will give an error. If the value is set to
 null, then all parallel operations will proceed as long as at least two
 parallel workers are available for processing.
 
 Ugh.  I'm not a fan of this..  Based on how we're talking about modeling
 this, if we decide to parallelize at all, then we expect it to be a win.
 I don't like the idea of throwing an error if, at execution time, we end
 up not being able to actually get the number of workers we want-
 instead, we should degrade gracefully all the way back to serial, if
 necessary.  Perhaps we should send a NOTICE or something along those
 lines to let the user know we weren't able to get the level of
 parallelization that the plan originally asked for, but I really don't
 like just throwing an error.

yeah, this seems like the behaviour I would expect; if we can't get
enough parallel workers we should just use as many as we can get.
Everything else, and especially erroring out, will just cause random
application failures and easy DoS vectors.
I think all we need initially is being able to specify a maximum number
of workers per query as well as a maximum number of workers in total
for parallel operations.


 
 Now, for debugging purposes, I could see such a parameter being
 available but it should default to 'off/never-fail'.

not sure what it really would be useful for - if I execute a query I
would truly expect it to get answered - if it can be made faster if
done in parallel, that's nice, but why would I want it to fail?


Stefan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Peter Geoghegan
On Fri, Jan 9, 2015 at 1:09 PM, Stephen Frost sfr...@snowman.net wrote:
 To flip it around a bit, I don't think we can avoid checking the
 *resulting* tuple from the UPDATE against the UPDATE policy.

We can avoid it - by not updating. What I'm suggesting is that an
enforcement occurs that is more or less equivalent to the enforcement
that occurs when the UPDATE path is taken, without it actually being
taken. I accept that that isn't perfect, but it has its advantages.

 Therefore,
 I'm not sure that I see the point in checking the INSERT tuple against
 the UPDATE policy.

I guess it wouldn't be hard to modify the struct WithCheckOption to
store the cmd (e.g. RowSecurityPolicy-style ACL_*_CHR permissions),
usable only in this context. That way, when ExecWithCheckOptions() is
called from the INSERT, we can tell it to only enforce
INSERT-applicable policy cmds (in particular, not
UPDATE-only-applicable policy cmds that happen to end up associated
with the same ResultRelation). Whereas, at the end of ExecUpdate(),
that final ExecWithCheckOptions() call (call 3 of 3 when the UPDATE
path is taken), INSERT-based policies can still be enforced against
the final tuple (as can USING() quals that would ordinarily not be
checked at that point either). A given ResultRelation's policies
wouldn't necessarily always need to be enforced within
ExecWithCheckOptions(), which is a change. Does that make sense to
you?
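
In rough outline -- with the existing fields abbreviated and all names here
being assumptions for illustration rather than the actual node definition --
what I have in mind is something like:

    /* hypothetical sketch only; existing fields and names are assumptions */
    typedef struct WithCheckOption
    {
        NodeTag     type;
        char       *relname;        /* relation the option came from */
        Node       *qual;           /* policy expression to check */
        bool        cascaded;
        char        polcmd;         /* new: ACL_INSERT_CHR, ACL_UPDATE_CHR, ... */
    } WithCheckOption;

ExecWithCheckOptions() could then be told which polcmd values to enforce for
the phase at hand, e.g. only the INSERT-applicable options when checking the
tuple proposed for insertion.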

Note that I've already taught ExecWithCheckOptions() to not leak the
existing, target tuple when the UPDATE path is taken (the tuple that
can be referenced in the UPDATE using the TARGET.* alias), while still
performing enforcement against it. I can teach it this too.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Stephen Frost
* Stefan Kaltenbrunner (ste...@kaltenbrunner.cc) wrote:
 On 01/09/2015 08:01 PM, Stephen Frost wrote:
  Now, for debugging purposes, I could see such a parameter being
  available but it should default to 'off/never-fail'.
 
 not sure what it really would be useful for - if I execute a query I
 would truly expect it to get answered - if it can be made faster if
 done in parallel, that's nice, but why would I want it to fail?

I was thinking for debugging only, though I'm not really sure why you'd
need it if you get a NOTICE when you don't end up with all the workers
you expect.

Thanks,

Stephen




Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Peter Geoghegan
On Fri, Jan 9, 2015 at 12:26 PM, Stephen Frost sfr...@snowman.net wrote:
 Where this leaves me, at least, is feeling like we should always apply
 the INSERT WITH CHECK policy, then if there is a conflict, check the
 UPDATE USING policy and throw an error if the row isn't visible but
 otherwise perform the UPDATE and then check the UPDATE WITH CHECK
 policy.  I see your point that this runs counter to the 'mod_count'
 example use-case and could cause problems for users using RLS with such
 a strategy.  For my part, I expect users of RLS to expect errors in such
 a case rather than allowing it, but it's certainly a judgement call.

I mostly agree, but I don't know that I fully agree. Specifically, I
also think we should check the update policy even when we only insert,
on the theory that if we did go to update, the would-be inserted row
would be a proxy for what we'd check then (the existing, target
relation's tuple). What do you think of that?

I certainly agree that the correct behavior here is at least a bit
subjective. We cannot exactly generalize from other areas of the code,
nor can we look for a precedent set by other systems (AFAICT there is
none).

 The only reasonable way that I can see to support both sides would be to
 allow UPSERT to be a top-level policy definition in its own right and
 let users specify exactly what is allowed in the UPSERT case (possibly
 requiring three different expressions to represent the initial INSERT,
 what the UPDATE can see, and what the UPDATE results in..).  I tend to
 doubt it would be worth it unless we end up supporting UPSERT-specific
 triggers and permissions..

Agreed. That would technically make everyone happy, in that it defers
to the user, but is unlikely to be worth it.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Stephen Frost
* Peter Geoghegan (p...@heroku.com) wrote:
 On Fri, Jan 9, 2015 at 12:26 PM, Stephen Frost sfr...@snowman.net wrote:
  Where this leaves me, at least, is feeling like we should always apply
  the INSERT WITH CHECK policy, then if there is a conflict, check the
  UPDATE USING policy and throw an error if the row isn't visible but
  otherwise perform the UPDATE and then check the UPDATE WITH CHECK
  policy.  I see your point that this runs counter to the 'mod_count'
  example use-case and could cause problems for users using RLS with such
  a strategy.  For my part, I expect users of RLS to expect errors in such
  a case rather than allowing it, but it's certainly a judgement call.
 
 I mostly agree, but I don't know that I fully agree. Specifically, I
 also think we should check the update policy even when we only insert,
 on the theory that if we did go to update, the would-be inserted row
 would be a proxy for what we'd check then (the existing, target
 relation's tuple). What do you think of that?

To flip it around a bit, I don't think we can avoid checking the
*resulting* tuple from the UPDATE against the UPDATE policy.  Therefore,
I'm not sure that I see the point in checking the INSERT tuple against
the UPDATE policy.  I also don't like the idea that a completely
legitimate command (one which allows the to-be-inserted row via the
INSERT policy AND allows the tuple-post-update via the UPDATE policy) to
throw an error because the to-be-inserted row violated the UPDATE
policy.  That doesn't make sense to me.

Thanks!

Stephen




Re: [HACKERS] INSERT ... ON CONFLICT UPDATE and RLS

2015-01-09 Thread Dean Rasheed
On 9 January 2015 at 08:49, Peter Geoghegan p...@heroku.com wrote:
 On Fri, Jan 9, 2015 at 12:19 AM, Dean Rasheed dean.a.rash...@gmail.com 
 wrote:
 I was trying to think up an example where you might actually have
 different INSERT and UPDATE policies, and the best I can think of is
 some sort of mod_count column where you have an INSERT CHECK
 (mod_count = 0) and an UPDATE CHECK (mod_count > 0). In that case,
 checking both policies would make an UPSERT impossible, whereas if you
 think of it as doing either an INSERT or an UPDATE, as the syntax
 suggests, it becomes possible.

 Why does this user want to do this upsert? If they're upserting, then
 the inserted row could only reasonably have a value of (mod_count =
 0). If updating, then they must have a constant value for the update
 path (a value that's greater than 0, naturally - say 2), which doesn't
 make any sense in the context of an upsert's auxiliary update - what
 happened to the 0 value? Sorry, but I don't think your example makes
 sense - I can't see what would motivate anyone to write a query like
 that with those RLS policies in place. It sounds like you're talking
 about an insert and a separate update that may or may not affect the
 same row, and not an upsert. Then those policies make sense, but in
 practice they render the upsert you describe contradictory.


Whoa, hang on. I think you're being a bit quick to dismiss that
example. Why shouldn't I want an upsert where the majority of the
table columns follow the usual make it so pattern of an upsert, but
there is also this kind of audit column to be maintained? Then I would
write something like

INSERT INTO tbl (some values, 0)
  ON CONFLICT UPDATE SET same values, mod_count=mod_count+1;

The root of the problem is the way that you're proposing to combine
the RLS policies (using AND), which runs contrary to the way RLS
policies are usually combined (using OR), which is why this kind of
example fails -- RLS policies in general aren't intended to all be
true simultaneously.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Fixing memory leak in pg_upgrade

2015-01-09 Thread Tatsuo Ishii
According to Coverity, there's a memory leak bug in transfer_all_new_dbs().

        mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
                                    new_pgdata);

        if (n_maps)
        {
            print_maps(mappings, n_maps, new_db->db_name);

#ifdef PAGE_CONVERSION
            pageConverter = setupPageConverter();
#endif
            transfer_single_new_db(pageConverter, mappings, n_maps,
                                   old_tablespace);

            pg_free(mappings);
        }                               <-- leaks mappings
    }
    return;
}

This is because gen_db_file_maps() allocates memory even if n_maps == 0.
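
One simple way to fix it (just a sketch of one option; there may be a better
place to free it) would be to free the map array unconditionally, since it is
always allocated:

        mappings = gen_db_file_maps(old_db, new_db, &n_maps, old_pgdata,
                                    new_pgdata);

        if (n_maps)
        {
            print_maps(mappings, n_maps, new_db->db_name);

#ifdef PAGE_CONVERSION
            pageConverter = setupPageConverter();
#endif
            transfer_single_new_db(pageConverter, mappings, n_maps,
                                   old_tablespace);
        }

        /* gen_db_file_maps() always returns an allocation, so always free it */
        pg_free(mappings);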

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom/Foreign-Join-APIs (Re: [HACKERS] [v9.5] Custom Plan API)

2015-01-09 Thread Robert Haas
On Tue, Jan 6, 2015 at 9:17 AM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:
 The attached patch is newer revision of custom-/foreign-join
 interface.

It seems that the basic purpose of this patch is to allow a foreign
scan or custom scan to have scanrelid == 0, reflecting the case where
we are scanning a joinrel rather than a baserel.  The major problem
that seems to create is that we can't set the target list from the
relation descriptor, because there isn't one.  To work around that,
you've added fdw_ps_tlist and custom_ps_tlist, which the FDW or
custom-plan provider must set.  I don't know off-hand whether that's a
good interface or not.  How does the FDW know what to stick in there?
There's a comment that seems to be trying to explain this:

+ * An optional fdw_ps_tlist is used to map a reference to an attribute of
+ * underlying relation(s) on a pair of INDEX_VAR and alternative varattno.
+ * It looks like a scan on pseudo relation that is usually result of
+ * relations join on remote data source, and FDW driver is responsible to
+ * set expected target list for this. If FDW returns records as foreign-
+ * table definition, just put NIL here.

...but I can't understand what that's telling me.
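
For what it's worth, my best guess at the intent - and spelling this out is
exactly what that comment should be doing - is that the FDW builds a target
list describing the columns of the joined result, perhaps along these lines
(a sketch only, not code from the patch; I'm assuming the joinrel's
reltargetlist holds the Vars to be emitted):

    List       *fdw_ps_tlist = NIL;
    AttrNumber  resno = 1;
    ListCell   *lc;

    foreach(lc, joinrel->reltargetlist)
    {
        Var *var = (Var *) lfirst(lc);   /* still numbered by base-rel RT index */

        fdw_ps_tlist = lappend(fdw_ps_tlist,
                               makeTargetEntry((Expr *) copyObject(var),
                                               resno++, NULL, false));
    }

with references from above then getting rewritten to (INDEX_VAR, resno)
pairs.  But that is a guess, and the comment needs to say it.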

You've added an Oid fdw_handler field to the ForeignScan and
RelOptInfo structures.  I think this is the OID of the pg_proc entry
for the handler function; and I think we need it because, if scanrelid
== 0 then we don't have a relation that we can trace to a foreign
table, to a server, to an FDW, and then to a handler.  So we need to
get that information some other way.  When building joinrels, the
fdw_handler OID, and the associated routine, are propagated from any
two relations that share the same fdw_handler OID to the resulting
joinrel.  I guess that's reasonable, although it feels a little weird
that we're copying around both the OID and the structure-pointer.
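
If I'm reading it right, the propagation amounts to roughly this (a
paraphrase, not the patch's code, and the routine-pointer field name is my
shorthand):

    if (OidIsValid(outer_rel->fdw_handler) &&
        outer_rel->fdw_handler == inner_rel->fdw_handler)
    {
        joinrel->fdw_handler = outer_rel->fdw_handler;
        joinrel->fdwroutine = outer_rel->fdwroutine;   /* field name assumed */
    }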

For non-obvious reasons, you've made create_plan_recurse() non-static.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Translating xlogreader.c

2015-01-09 Thread Heikki Linnakangas
xlogreader.c contains a bunch of strings passed to the 
report_invalid_record function that are supposed to be translated. 
src/backend/nls.mk lists report_invalid_record as a gettext trigger.


In 9.2 and below, when those functions were still in xlog, those strings 
were in the postgres.pot file, and translated. But not since 9.3, when 
they were moved to xlogreader.c.


What's going on?

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Amit Kapila
On Fri, Jan 9, 2015 at 1:02 AM, Jim Nasby jim.na...@bluetreble.com wrote:

 On 1/5/15, 9:21 AM, Stephen Frost wrote:

 * Robert Haas (robertmh...@gmail.com) wrote:

 I think it's right to view this in the same way we view work_mem.  We
 plan on the assumption that an amount of memory equal to work_mem will
 be available at execution time, without actually reserving it.


 Agreed- this seems like a good approach for how to address this.  We
 should still be able to end up with plans which use less than the max
 possible parallel workers though, as I pointed out somewhere up-thread.
 This is also similar to work_mem- we certainly have plans which don't
 expect to use all of work_mem and others that expect to use all of it
 (per node, of course).


 I agree, but we should try and warn the user if they set
parallel_seqscan_degree close to max_worker_processes, or at least give
some indication of what's going on. This is something you could end up
beating your head on wondering why it's not working.


Yet another way to handle the case when enough workers are not
available is to let the user specify the desired minimum percentage of
requested parallel workers with a parameter like
PARALLEL_QUERY_MIN_PERCENT. For example, if you specify
50 for this parameter, then at least 50% of the parallel workers
requested for any parallel operation must be available in order for
the operation to succeed, else it will give an error. If the value is set to
null, then all parallel operations will proceed as long as at least two
parallel workers are available for processing.

This is similar to how other commercial databases handle such a
situation.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


[HACKERS] Improving RLS qual pushdown

2015-01-09 Thread Dean Rasheed
A while ago [1] I proposed an enhancement to the way qual pushdown
safety is decided in RLS / security barrier views. Currently we just
test for the presence of leaky functions in the qual, but it is
possible to do better than that, by further testing if the leaky
function is actually being passed information that we don't want to be
leaked.

Attached is a patch that does that, allowing restriction clause quals
to be pushed down into subquery RTEs if they contain leaky functions,
provided that the arglists of those leaky functions contain no Vars
(such Vars would necessarily refer to output columns of the subquery,
which is the data that must not be leaked).
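
In outline, a qual is now rejected only when a non-leakproof function is
actually given a Var as an argument.  Simplified relative to the attached
patch (which also has to handle OpExpr, ScalarArrayOpExpr, coercions and so
on), the walker does something like this:

    static bool
    contain_leaked_vars_walker(Node *node, void *context)
    {
        if (node == NULL)
            return false;

        if (IsA(node, FuncExpr))
        {
            FuncExpr   *expr = (FuncExpr *) node;

            /* only unsafe if a leaky function is passed subquery data */
            if (!get_func_leakproof(expr->funcid) &&
                contain_var_clause((Node *) expr->args))
                return true;
        }

        return expression_tree_walker(node, contain_leaked_vars_walker,
                                      context);
    }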

An example of the sort of query this will help optimise is:

SELECT * FROM table_with_rls WHERE created > now() - '1 hour'::interval;

where currently this qual cannot be pushed down because neither now()
nor timestamptz_mi_interval() are leakproof, but since they are not
being passed any data from the table, they can't actually leak
anything, so the qual can be safely pushed down, allowing indexes to
be used if available.

In fact the majority of builtin functions aren't marked leakproof, and
probably most user functions aren't either, so this could potentially
be useful in a wide range of real-world queries, where it is common to
write quals of the form column operator expression, and the
expression may contain leaky functions.

Regards,
Dean

[1] 
http://www.postgresql.org/message-id/caezatcwkcpfwilocnmfmszklgoabz7axkmgz-mk83gd3jg8...@mail.gmail.com
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
new file mode 100644
index 58d78e6..4433468
*** a/src/backend/optimizer/path/allpaths.c
--- b/src/backend/optimizer/path/allpaths.c
*** targetIsInAllPartitionLists(TargetEntry
*** 1982,1988 
   * 2. If unsafeVolatile is set, the qual must not contain any volatile
   * functions.
   *
!  * 3. If unsafeLeaky is set, the qual must not contain any leaky functions.
   *
   * 4. The qual must not refer to the whole-row output of the subquery
   * (since there is no easy way to name that within the subquery itself).
--- 1982,1989 
   * 2. If unsafeVolatile is set, the qual must not contain any volatile
   * functions.
   *
!  * 3. If unsafeLeaky is set, the qual must not contain any leaky functions
!  * that might reveal values from the subquery as side effects.
   *
   * 4. The qual must not refer to the whole-row output of the subquery
   * (since there is no easy way to name that within the subquery itself).
*** qual_is_pushdown_safe(Query *subquery, I
*** 2009,2015 
  
  	/* Refuse leaky quals if told to (point 3) */
  	if (safetyInfo->unsafeLeaky &&
! 		contain_leaky_functions(qual))
  		return false;
  
  	/*
--- 2010,2016 
  
  	/* Refuse leaky quals if told to (point 3) */
  	if (safetyInfo->unsafeLeaky &&
! 		contain_leaked_vars(qual))
  		return false;
  
  	/*
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
new file mode 100644
index b340b01..7f82d54
*** a/src/backend/optimizer/util/clauses.c
--- b/src/backend/optimizer/util/clauses.c
*** static bool contain_mutable_functions_wa
*** 97,103 
  static bool contain_volatile_functions_walker(Node *node, void *context);
  static bool contain_volatile_functions_not_nextval_walker(Node *node, void *context);
  static bool contain_nonstrict_functions_walker(Node *node, void *context);
! static bool contain_leaky_functions_walker(Node *node, void *context);
  static Relids find_nonnullable_rels_walker(Node *node, bool top_level);
  static List *find_nonnullable_vars_walker(Node *node, bool top_level);
  static bool is_strict_saop(ScalarArrayOpExpr *expr, bool falseOK);
--- 97,103 
  static bool contain_volatile_functions_walker(Node *node, void *context);
  static bool contain_volatile_functions_not_nextval_walker(Node *node, void *context);
  static bool contain_nonstrict_functions_walker(Node *node, void *context);
! static bool contain_leaked_vars_walker(Node *node, void *context);
  static Relids find_nonnullable_rels_walker(Node *node, bool top_level);
  static List *find_nonnullable_vars_walker(Node *node, bool top_level);
  static bool is_strict_saop(ScalarArrayOpExpr *expr, bool falseOK);
*** contain_nonstrict_functions_walker(Node
*** 1318,1343 
  }
  
  /*
!  *		  Check clauses for non-leakproof functions
   */
  
  /*
!  * contain_leaky_functions
!  *		Recursively search for leaky functions within a clause.
   *
!  * Returns true if any function call with side-effect may be present in the
!  * clause.  Qualifiers from outside the a security_barrier view should not
!  * be pushed down into the view, lest the contents of tuples intended to be
!  * filtered out be revealed via side effects.
   */
  bool

Re: [HACKERS] Fixing memory leak in pg_upgrade

2015-01-09 Thread Michael Paquier
On Fri, Jan 9, 2015 at 9:23 PM, Tatsuo Ishii is...@postgresql.org wrote:
 This is because gen_db_file_maps() allocates memory even if n_maps == 0.
Purely cosmetic: the initialization n_maps = 0 before the call of
gen_db_file_maps is unnecessary ;)
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Compression of full-page-writes

2015-01-09 Thread Rahila Syed
So this test can be used to evaluate how shorter records influence 
performance since the master waits for flush confirmation from the 
standby, right? 

Yes. This test can help measure the performance improvement due to reduced I/O
on the standby, as the master waits for the WAL records to be flushed on the
standby.

Isn't that GB and not MB? 
Yes. That is a typo. It should be GB.

How many FPWs have been generated and how many dirty buffers have been 
flushed for the 3 checkpoints of each test? 

Any data about the CPU activity? 
The above data is not available for this run. I will rerun the tests to gather
the above data.

Thank you,
Rahila Syed





--
View this message in context: 
http://postgresql.nabble.com/Compression-of-full-page-writes-tp5769039p5833389.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Seq Scan

2015-01-09 Thread Amit Kapila
On Fri, Dec 19, 2014 at 7:57 PM, Stephen Frost sfr...@snowman.net wrote:


 There's certainly documentation available from the other RDBMS' which
 already support parallel query, as one source.  Other academic papers
 exist (and once you've linked into one, the references and prior work
 helps bring in others).  Sadly, I don't currently have ACM access (might
 have to change that..), but there are publicly available papers also,

I have gone through a couple of papers and looked at what some other
databases do in the case of parallel sequential scan, and here is a brief
summarization of that and how I am planning to handle it in the patch:

Costing:
In one of the papers [1] suggested by you, below is the summary:
  a. Startup costs are negligible if processes can be reused
     rather than created afresh.
  b. Communication cost consists of the CPU cost of sending
     and receiving messages.
  c. Communication costs can exceed the cost of operators such
     as scanning, joining or grouping.
These findings lead to the important conclusion that query
optimization should be concerned with communication costs
but not with startup costs.

In our case, since we currently don't have a mechanism to reuse parallel
workers, we need to account for that cost as well.  Based on that, I am
planning to add three new parameters: cpu_tuple_comm_cost,
parallel_setup_cost and parallel_startup_cost.
*  cpu_tuple_comm_cost - cost of CPU time to pass a tuple from a worker
   to the master backend, with default value DEFAULT_CPU_TUPLE_COMM_COST
   of 0.1; this will be multiplied by the number of tuples expected to
   be selected
*  parallel_setup_cost - cost of setting up shared memory for parallelism,
   with a default value of 100.0
*  parallel_startup_cost - cost of starting up parallel workers, with a
   default value of 1000.0, multiplied by the number of workers decided
   for the scan.

I will do some experiments to finalise the default values, but in general,
I feel developing the cost model on the above parameters is a good approach.
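
Just as a rough illustration of how these parameters could combine (the way
the run cost is divided among workers below is an assumption on my part, not
something finalised):

    /* rough illustration only; names follow the proposal above */
    double cpu_tuple_comm_cost = 0.1;     /* DEFAULT_CPU_TUPLE_COMM_COST */
    double parallel_setup_cost = 100.0;
    double parallel_startup_cost = 1000.0;

    static double
    parallel_seqscan_cost_sketch(double seqscan_run_cost, double ntuples,
                                 int nworkers)
    {
        double  startup = parallel_setup_cost +
                          parallel_startup_cost * nworkers;
        double  comm = cpu_tuple_comm_cost * ntuples;     /* worker -> master */
        double  run = seqscan_run_cost / (nworkers + 1);  /* master scans too */

        return startup + comm + run;
    }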

Execution:
Most other databases do a partition-level scan, with each individual
parallel worker scanning a partition on a different disk.  However,
it seems Amazon DynamoDB [2] also works on something
similar to what I have used in the patch, which means on fixed
blocks.  I think this kind of strategy seems better than dividing
the blocks at runtime, because randomly dividing the blocks
among workers could turn a parallel sequential scan into a
random scan.
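
As a trivial illustration of what I mean by fixed blocks (the exact
assignment scheme in the patch may differ), worker i would get one
contiguous slice of the relation:

    /* illustration only: each participant reads a contiguous run of blocks */
    static void
    assign_block_range(BlockNumber nblocks, int nworkers, int worker_id,
                       BlockNumber *first_block, BlockNumber *blocks_assigned)
    {
        BlockNumber per_worker = nblocks / nworkers;
        BlockNumber remainder = nblocks % nworkers;

        *first_block = worker_id * per_worker +
            Min((BlockNumber) worker_id, remainder);
        *blocks_assigned = per_worker +
            (((BlockNumber) worker_id < remainder) ? 1 : 0);
    }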
Also, I find in whatever I have read (Oracle, DynamoDB) that most
databases divide the work among workers and the master backend acts
as coordinator, at least that's what I could understand.

Let me know your opinion about the same?

I am planning to proceed with the above ideas to strengthen the patch
in the absence of any objections or better ideas.


[1] : http://i.stanford.edu/pub/cstr/reports/cs/tr/96/1570/CS-TR-96-1570.pdf
[2] :
http://docs.aws.amazon.com/amazondynamodb/latest/developerguide/QueryAndScan.html#QueryAndScanParallelScan


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [PATCH] server_version_num should be GUC_REPORT

2015-01-09 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 While looking into client code that relies on parsing server_version
 instead of checking server_version_num, I was surprised to discover that
 server_version_num isn't sent to the client by the server as part of the
 standard set of parameters reported post-auth.

Why should it be?  server_version is what's documented to be sent.

 The attached patch marks server_version_num GUC_REPORT and documents
 that it's reported to the client automatically.

I think this is just a waste of network bandwidth.  No client-side code
could safely depend on its being available for many years yet, therefore
they're going to keep using server_version.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers