Re: [HACKERS] Proposing pg_hibernate
On Wed, May 28, 2014 at 7:31 AM, Gurjeet Singh gurj...@singh.im wrote:

Caveats
--
- The buffer list is saved only when Postgres is shut down in smart and fast modes. That is, the buffer list is not saved when the database crashes, or on immediate shutdown.
- A reduction in `shared_buffers` is not detected. If `shared_buffers` is reduced across a restart, and if the combined saved buffer list is larger than the new shared_buffers, Postgres Hibernator continues to read and restore blocks even after `shared_buffers` worth of buffers have been restored.

How about the cases when shared buffers already contain some data:

a. Before the Readers start filling shared buffers, if this cluster wishes to join replication as a slave and receive data from the master, then this utility might need to evict some buffers filled during the startup phase.

b. As soon as the server completes startup (reaches the consistent point), it allows new connections, which can also use some shared buffers before the Reader process could use them. Or are you planning to change the time when users can connect to the database?

I am not sure if replacing shared buffer contents in such cases can always be considered useful.

With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
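For illustration, the guard missing per the second caveat could be a simple bound on the restore loop. A minimal sketch, in which only NBuffers (the backend global holding shared_buffers in pages) is real and all other names are hypothetical:

    /* Hypothetical sketch: stop restoring once shared_buffers is full,
     * rather than evicting buffers that are already live.
     * read_next_saved_block() and restore_block() are illustrative only. */
    int         restored = 0;
    SavedBlock  blk;

    while (read_next_saved_block(fp, &blk))
    {
        if (restored >= NBuffers)
            break;              /* new shared_buffers is smaller; give up */
        restore_block(&blk);
        restored++;
    }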
Re: [HACKERS] json casts
On 05/27/2014 11:55 PM, Robert Haas wrote: On Tue, May 27, 2014 at 5:00 PM, Tom Lane t...@sss.pgh.pa.us wrote: I'd be inclined to think a more useful answer to this issue would be to make json.c special-case timestamps, as it already does for numerics. I wonder if anyone besides me is nervous about changing the semantics here. It seems like the sort of backward-compatibility break we normally avoid. If we do make such a compatibility break, it should certainly be release-noted. Yes, certainly it should be. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
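For readers following along, the compatibility concern is that json output of timestamps currently goes through the type's ordinary output function, so it depends on DateStyle; special-casing would presumably emit ISO 8601 instead. A sketch of the difference (illustrative output, not from a committed patch):

    -- Current behavior is DateStyle-dependent:
    SET datestyle = 'Postgres';
    SELECT to_json(now());
    -- today:    "Wed May 28 10:00:00 2014 EDT"
    -- proposed: "2014-05-28T10:00:00-04:00"   (always ISO 8601)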
Re: [HACKERS] Spreading full-page writes
On 27 May 2014 13:20, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 05/27/2014 03:18 PM, Simon Riggs wrote: IIRC Koichi had a patch for prefetch during recovery. Heikki, is that the reason you also discussed changing the WAL record format to allow us to identify the blocks touched by recovery more easily? Yeah, that was one use case I had in mind for the WAL format changes. See http://www.postgresql.org/message-id/533d6cbf.6080...@vmware.com. Those proposals suggest some very big changes to the way WAL works. Prefetch can work easily enough for most records - do we really need that much churn? You mentioned Btree vacuum records, but I'm planning to optimize those another way. Why don't we just have the prefetch code in core and forget the WAL format changes? -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
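For context, the prefetch pass itself is small once the blocks a record touches can be enumerated, which is exactly what the format debate is about. A rough sketch in which PrefetchBuffer() is the existing core primitive (posix_fadvise based) and the block-enumeration names are hypothetical:

    /* xlrec_get_blocks() is hypothetical: with the proposed format it
     * would be one generic routine; today it needs per-record-type code. */
    nblocks = xlrec_get_blocks(record, blocks);
    for (i = 0; i < nblocks; i++)
        PrefetchBuffer(blocks[i].reln, MAIN_FORKNUM, blocks[i].blkno);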
Re: [HACKERS] Spreading full-page writes
On 27 May 2014 18:18, Jeff Janes jeff.ja...@gmail.com wrote: On Mon, May 26, 2014 at 8:15 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, May 26, 2014 at 1:22 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: I don't think we know that. The server might have crashed before that second record got generated. (This appears to be an unfixable flaw in this proposal.) The second record is generated before the checkpoint is finished and the checkpoint record is written. So it will be there. (if you crash before the checkpoint is finished, the in-progress checkpoint is no good for recovery anyway, and won't be used) Hmm, I see. It's not great to have to generate WAL at buffer-eviction time, though. Normally, when we go to evict a buffer, the WAL is already written. We might have to wait for it to be flushed, but if the WAL writer is doing its job, hopefully not. But here we'll definitely have to wait for the WAL flush. I'm not sure we do need to flush it. If the checkpoint finishes, then the WAL surely got flushed as part of the process of recording the end of the checkpoint. If the checkpoint does not finish, recovery will start from the previous checkpoint, which does contain the FPW (because if it didn't, the page would not be eligible for this treatment) and so the possibly torn page will get overwritten in full. I think Robert is correct, you would need to flush WAL before writing the disk buffer. That is the current invariant of WAL before data. However, we don't need to do this in a simple way: FPW-flush-buffer, we can do that with more buffering. So it seems like a reasonable idea to do this using a 64 buffer BulkAccessStrategy object and flush the WAL every 64 buffers. That's beginning to look more like double buffering though... -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
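For illustration, the "flush the WAL every 64 buffers" idea amounts to amortizing the WAL-before-data flush across a batch: log the full-page images for a batch, flush once, then write the data pages. A minimal sketch in which only XLogFlush() and Min() are real primitives and the rest are hypothetical names:

    int        i, start;
    XLogRecPtr lsn = InvalidXLogRecPtr;

    for (start = 0; start < nvictims; start += 64)
    {
        int end = Min(start + 64, nvictims);

        for (i = start; i < end; i++)
            lsn = log_buffer_fpw(victims[i]);   /* hypothetical: emit FPW record */

        XLogFlush(lsn);                         /* one WAL flush per batch */

        for (i = start; i < end; i++)
            write_buffer(victims[i]);           /* data writes only after flush */
    }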
Re: [HACKERS] Spreading full-page writes
On 05/28/2014 09:41 AM, Simon Riggs wrote: On 27 May 2014 13:20, Heikki Linnakangas hlinnakan...@vmware.com wrote: On 05/27/2014 03:18 PM, Simon Riggs wrote: IIRC Koichi had a patch for prefetch during recovery. Heikki, is that the reason you also discussed changing the WAL record format to allow us to identify the blocks touched by recovery more easily? Yeah, that was one use case I had in mind for the WAL format changes. See http://www.postgresql.org/message-id/533d6cbf.6080...@vmware.com. Those proposals suggest some very big changes to the way WAL works. Prefetch can work easily enough for most records - do we really need that much churn? You mentioned Btree vacuum records, but I'm planning to optimize those another way. Why don't we just have the prefetch code in core and forget the WAL format changes? Well, the prefetching was just one example of why the proposed WAL format changes are a good idea. The changes will make life easier for any external (or internal, for that matter) tool that wants to read WAL records. The thing that finally really got me into doing that was pg_rewind. For pg_rewind it's not enough to cover most records, you have to catch all modifications to data pages for correctness, and that's difficult to maintain as new WAL record types are added and old ones are modified in every release. Also, the changes make WAL-logging and -replaying code easier to write. Which reduces the potential for bugs. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Race condition within _bt_findinsertloc()? (new page split code)
On 05/28/2014 02:15 AM, Peter Geoghegan wrote: On Tue, May 27, 2014 at 12:19 PM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Fair enough, but I don't think that affects correctness either way (I don't think that you meant to imply that this was a necessary precaution that you'd taken - right?). Right. So, the comments within _bt_search() suggest that the _bt_moveright() call will perform a _bt_finish_split() call opportunistically iff it's called from _bt_doinsert() (i.e. access == BT_WRITE). There is no reason to not do so in all circumstances though, assuming that it's desirable to do so as soon as possible (which I *don't* actually assume). You can't write in a hot standby, so that's one reason to only do it when inserting, not when querying. Even when you're not in a hot standby, it seems like a nice property that a read-only query doesn't do writes. I know we do hint bit updates and other kinds of write-action when reading anyway, but still. If I'm not mistaken, it's also true that it would be strictly speaking correct to never do it there. Hmm. No, it wouldn't. It is not allowed to have two incompletely-split pages adjacent to each other. If you move right to the right-half of an incomplete split, i.e. a page that does not have a downlink in its parent, and then try to split the page again, _bt_insert_parent() would fail to find the location to insert the new downlink to. It assumes that there is a downlink to the page being split in the parent, and uses that to find the location for the new downlink. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] rangetypes spgist questions/refactoring
On Tue, May 20, 2014 at 11:18:29AM -0700, Jeff Davis wrote: I think this can be done without breaking upgrade compatibility, because I think the structure already satisfies the invariants I mentioned in the other email (aside from the special case of a root tuple with two nodes and no prefix). That being said, it's a little scary to refactor indexing code while trying to keep it upgrade-compatible.

We can make pg_upgrade mark such indexes as invalid and create a user script to reindex all the indexes after the upgrade. We have done that in the past.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
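For illustration, the kind of user script pg_upgrade can emit here just regenerates REINDEX commands for whatever it flagged invalid. A sketch against the system catalogs, assuming the affected indexes are marked with indisvalid = false:

    -- Generate REINDEX statements for all invalid indexes.
    SELECT 'REINDEX INDEX ' || quote_ident(n.nspname) || '.' ||
           quote_ident(c.relname) || ';'
    FROM pg_index i
    JOIN pg_class c ON c.oid = i.indexrelid
    JOIN pg_namespace n ON n.oid = c.relnamespace
    WHERE NOT i.indisvalid;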
Re: [HACKERS] Allowing join removals for more join types
On Sun, May 25, 2014 at 5:42 AM, Tom Lane t...@sss.pgh.pa.us wrote: David Rowley dgrowle...@gmail.com writes: I agree that there are not many cases left to remove the join that remain after is_simple_subquery() has decided not to pullup the subquery. Some of the perhaps more common cases would be having windowing functions in the subquery, as this is what you need to do if you want to include the results of a windowing function from within the where clause. Another case, though I can't imagine it would be common, is ORDER BY in the subquery... But for that one I can't quite understand why is_simple_subquery() stops that being flattened in the first place.

The problem there is that (in general) pushing qual conditions to below a window function will change the window function's results. If we flatten such a subquery then the outer query's quals can get evaluated before the window function, so that's no good. Another issue is that flattening might cause the window function call to get copied to places in the outer query where it can't legally go, such as the WHERE clause.

I should have explained more clearly. I was meaning that a query such as this:

SELECT a.*
FROM a
LEFT OUTER JOIN
  (SELECT id, LAG(id) OVER (ORDER BY id) AS prev_id FROM b) b
  ON a.id = b.id;

assuming that id is the primary key, could have the join removed. I was just commenting on this as it's probably a fairly common thing to have a subquery with windowing functions in order to perform some sort of filtering of window function columns in the outer query. The other use cases, for example:

SELECT a.*
FROM a
LEFT OUTER JOIN (SELECT id FROM b LIMIT 10) b ON a.id = b.id;

are likely less common.

Regards David Rowley
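To make Tom's qual-pushdown point concrete, here is a pair of queries (table b is illustrative) showing why flattening a window-function subquery changes results:

    -- Unflattened: LAG() runs over all rows of b, then the outer qual filters.
    SELECT id, prev_id
    FROM (SELECT id, lag(id) OVER (ORDER BY id) AS prev_id FROM b) s
    WHERE id > 10;

    -- If the subquery were flattened, the qual would run first, so LAG()
    -- would never see rows with id <= 10 and prev_id would differ:
    SELECT id, lag(id) OVER (ORDER BY id) AS prev_id
    FROM b
    WHERE id > 10;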
Re: [HACKERS] pg9.4b1: unhelpful error message when creating a collation
Searching for that error turned up: https://sourceware.org/bugzilla/show_bug.cgi?id=14247 https://bugzilla.redhat.com/show_bug.cgi?id=827510 Indeed. Thanks for the pointer. I have reported the issue on launchpad (ubuntu bug tracking site) with a link to the redhat bug and Tom's test program attached. https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1323953 We'll see what happens. -- Fabien. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposing pg_hibernate
On Wed, May 28, 2014 at 2:15 AM, Amit Kapila amit.kapil...@gmail.com wrote: On Wed, May 28, 2014 at 7:31 AM, Gurjeet Singh gurj...@singh.im wrote: Caveats -- - Buffer list is saved only when Postgres is shut down in smart and fast modes. That is, the buffer list is not saved when the database crashes, or on immediate shutdown. - A reduction in `shared_buffers` is not detected. If `shared_buffers` is reduced across a restart, and if the combined saved buffer list is larger than the new shared_buffers, Postgres Hibernator continues to read and restore blocks even after `shared_buffers` worth of buffers have been restored.

How about the cases when shared buffers already contain some data: a. Before the Readers start filling shared buffers, if this cluster wishes to join replication as a slave and receive data from the master, then this utility might need to evict some buffers filled during the startup phase.

A cluster that wishes to be a replication standby would do so while it's in the startup phase. The BlockReaders are launched immediately on the cluster reaching a consistent state, at which point, I presume, in most cases, most of the buffers would be unoccupied. Hence the BlockReaders might evict the occupied buffers, which may be a small fraction of total shared_buffers.

b. As soon as the server completes startup (reaches the consistent point), it allows new connections, which can also use some shared buffers before the Reader process could use them. Or are you planning to change the time when users can connect to the database?

The BlockReaders are launched immediately after the cluster reaches a consistent state, that is, just about when it is ready to accept connections. So yes, there is a possibility that the I/O caused by the BlockReaders may affect the performance of queries executed right at cluster startup. But given that the performance of those queries was going to be low anyway (because of empty shared buffers), that BlockReaders tend to cause sequential reads, and that by default there's only one BlockReader active at a time, I think this won't be a concern in most cases. By the time the shared buffers start getting filled up, the buffer replacement strategy will evict any buffers populated by BlockReaders if they are not used by the normal queries.

In the 'Sample Runs' section of my blog [1], I compared the cases 'Hibernator w/ App' and 'Hibernator then App', which demonstrate that launching application load while the BlockReaders are active does cause each to impact the other's performance. But overall it's a net win for application performance.

I am not sure if replacing shared buffer contents in such cases can always be considered useful.

IMHO, all of these caveats would affect a very small fraction of use-cases and are eclipsed by the benefits this extension provides in normal cases.

[1]: http://gurjeet.singh.im/blog/2014/04/30/postgres-hibernator-reduce-planned-database-down-times/

Best regards, -- Gurjeet Singh http://gurjeet.singh.im/ EDB www.EnterpriseDB.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
* Robert Haas (robertmh...@gmail.com) wrote: On Mon, May 26, 2014 at 10:39 AM, Stephen Frost sfr...@snowman.net wrote: It'd need to be explicitly requested, eg a 'CASCADE' option. Why? Would any sane person NOT want this behavior? [...] Now maybe there are options other than trying to reproduce what the original CREATE OR REPLACE statement would have done against the new type. For example, we could look through views that depend on t.a and rewrite each reference to that column to t.a::oldtype. This might lead to odd results with multiple nested casts and generally funny behavior if the column is re-typed multiple times; but maybe there's some way to fix that. This. Also, it might not really be the semantics you want if you were hoping the type update would truly cascade. But it might still be better than a sharp stick in the eye, which is kinda what we offer today. I hadn't even considered the idea that we would go through and try to change everything which referenced that view to now be the new type- but in that case, I'd want to know that there were other changes which were happening beyond the single view which I was updating. Perhaps a NOTICE would be enough, but it doesn't feel correct to me. Also consider MatViews which would need to be rewritten for the new type, or pl/pgsql functions which we couldn't possibly fix entirely (we're going to change the variable's type definition because it selects out a column from this view?) and so they'd just break instead. Thanks, Stephen signature.asc Description: Digital signature
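For reference, the "sharp stick in the eye" offered today is a hard error; the tables below are illustrative:

    CREATE TABLE t (a int);
    CREATE VIEW v AS SELECT a FROM t;
    ALTER TABLE t ALTER COLUMN a TYPE bigint;
    -- ERROR:  cannot alter type of a column used by a view or rule
    -- DETAIL:  rule _RETURN on view v depends on column "a"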
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
Robert Haas robertmh...@gmail.com writes: Well, pg_dump is trying to do something different than what you're trying to do here. pg_dump wants to make sure that the view, when fed back into psql, creates the same view that exists now, regardless of whether that's what the user created originally. For example, if a view is created referring to table foo, and table foo is later renamed to bar, then pg_dump wants to (and does) dump a statement referring to bar, not foo - even if there's a *new* table called foo against which the view could have been defined. Similarly, pg_dump will schema-qualify functions and operators, or not, based on whether that's necessary to reference the exact same operators that were selected when the original CREATE VIEW command was run, regardless of whether the original references were schema-qualified.

Sorry, I don't see how any of the above is a problem in my use case. Should a table have been renamed, naturally we want to re-create the view referring to the *old* table, but under its *new name*. The same goes for schema-qualifying objects.

None of that involves answering hypothetical questions; but what you want to do does, and that I think is the problem in a nutshell.

In a nutshell, I'd like PostgreSQL to just re-parse the *current* view definition. Should that throw an error, user intervention will be required anyway, but most of the time it should just work.

-- Alex -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
Stephen Frost sfr...@snowman.net writes: I hadn't even considered the idea that we would go through and try to change everything which referenced that view to now be the new type- but in that case, I'd want to know that there were other changes which were happening beyond the single view which I was updating. Perhaps a NOTICE would be enough, but it doesn't feel correct to me. Also consider MatViews which would need to be rewritten for the new type That might be costly but not impossible. A user would need to do that anyway, though manually. or pl/pgsql functions which we couldn't possibly fix entirely (we're going to change the variable's type definition because it selects out a column from this view?) and so they'd just break instead. I'm not suggesting that we try to *fix* functions, but if we can detect function breakage by re-parsing them it would be nice to alert the user of any problems at the instant of running ALTER TABLE statement, not when the user tries to actually run the function (see my mail upthread for sample scenario.) -- Alex -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
* ash (a...@commandprompt.com) wrote: Stephen Frost sfr...@snowman.net writes: I hadn't even considered the idea that we would go through and try to change everything which referenced that view to now be the new type- but in that case, I'd want to know that there were other changes which were happening beyond the single view which I was updating. Perhaps a NOTICE would be enough, but it doesn't feel correct to me. Also consider MatViews which would need to be rewritten for the new type That might be costly but not impossible. A user would need to do that anyway, though manually. I was pointing out why it should need to be explicitly requested, but all-in-all, I don't really see this proposal going anywhere. It's a neat idea, and if there's a sensible way to do it, then the user should have to explicitly request it, imv. or pl/pgsql functions which we couldn't possibly fix entirely (we're going to change the variable's type definition because it selects out a column from this view?) and so they'd just break instead. I'm not suggesting that we try to *fix* functions, but if we can detect function breakage by re-parsing them it would be nice to alert the user of any problems at the instant of running ALTER TABLE statement, not when the user tries to actually run the function (see my mail upthread for sample scenario.) We're not going to re-parse every function in the system, like, ever. I might be willing to buy off on too bad in those cases (it's not like we're going and fixing them today for ALTER TABLE .. TYPE cases anyway) but the point is that cascading the change to a column's type through all of its dependencies is likely to cause breakage somewhere in even modestly complex systems and we shouldn't just start doing that automatically. Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
Stephen Frost sfr...@snowman.net writes: * ash (a...@commandprompt.com) wrote: Stephen Frost sfr...@snowman.net writes: Also consider MatViews which would need to be rewritten for the new type That might be costly but not impossible. A user would need to do that anyway, though manually. I was pointing out why it should need to be explicitly requested, but all-in-all, I don't really see this proposal going anywhere. It's a neat idea, and if there's a sensible way to do it, then the user should have to explicitly request it, imv. Ah, understood. or pl/pgsql functions which we couldn't possibly fix entirely (we're going to change the variable's type definition because it selects out a column from this view?) and so they'd just break instead. I'm not suggesting that we try to *fix* functions, but if we can detect function breakage by re-parsing them it would be nice to alert the user of any problems at the instant of running ALTER TABLE statement, not when the user tries to actually run the function (see my mail upthread for sample scenario.) We're not going to re-parse every function in the system, like, ever. Well, only every *affected* function, which might be pretty minimal in the usual case. I might be willing to buy off on too bad in those cases (it's not like we're going and fixing them today for ALTER TABLE .. TYPE cases anyway) but the point is that cascading the change to a column's type through all of its dependencies is likely to cause breakage somewhere in even modestly complex systems and we shouldn't just start doing that automatically. What I am suggesting is that we try to detect such breakage at the time the user runs ALTER TABLE (issuing NOTICE or ERROR at user discretion.) If changing column type of a table breaks some functions down the way, the user will hit it anyway, but better know it soon than when he wants to *run* that function. -- Alex -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_llog not mentioned in Database File Layout
On Wed, May 28, 2014 at 4:36 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Mon, May 26, 2014 at 12:33 PM, Amit Langote amitlangot...@gmail.com wrote: Hi, Just noticed pg_llog is not mentioned in the Database File Layout section. Wonder if it's an oversight? Yes, it is an oversight. Patch attached. ISTM pg_stat directory is also missing in the doc. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_llog not mentioned in Database File Layout
Hi, On 2014-05-27 00:33:12 +0900, Amit Langote wrote: Just noticed pg_llog is not mentioned in the Database File Layout section. Wonder if it's an oversight? Yes, it should be mentioned there. I'll fix it. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
* ash (a...@commandprompt.com) wrote: What I am suggesting is that we try to detect such breakage at the time the user runs ALTER TABLE (issuing NOTICE or ERROR at user discretion.) If changing column type of a table breaks some functions down the way, the user will hit it anyway, but better know it soon than when he wants to *run* that function. Sure, if we had the information about what would break then we could tell the user about it. We don't and that's not likely to ever change... Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] pg_llog not mentioned in Database File Layout
On Wed, May 28, 2014 at 10:46 AM, Fujii Masao masao.fu...@gmail.com wrote: On Wed, May 28, 2014 at 4:36 AM, Fabrízio de Royes Mello fabriziome...@gmail.com wrote: On Mon, May 26, 2014 at 12:33 PM, Amit Langote amitlangot...@gmail.com wrote: Hi, Just noticed pg_llog is not mentioned in the Database File Layout section. Wonder if it's an oversight? Yes, it is an oversight. Patch attached. ISTM pg_stat directory is also missing in the doc.

Added pg_stat too. Attached patch!

-- Fabrízio de Royes Mello Consultoria/Coaching PostgreSQL
Timbira: http://www.timbira.com.br
Blog sobre TI: http://fabriziomello.blogspot.com
Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello

diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 57e7f09..41f81e9 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -78,6 +78,11 @@ Item
 </row>

 <row>
+ <entry><filename>pg_llog/</filename></entry>
+ <entry>Subdirectory containing logical replication status data</entry>
+</row>
+
+<row>
  <entry><filename>pg_multixact/</filename></entry>
  <entry>Subdirectory containing multitransaction status data
   (used for shared row locks)</entry>
@@ -104,6 +109,11 @@ Item
 </row>

 <row>
+ <entry><filename>pg_stat/</filename></entry>
+ <entry>Subdirectory containing files for the statistics subsystem</entry>
+</row>
+
+<row>
  <entry><filename>pg_stat_tmp/</filename></entry>
  <entry>Subdirectory containing temporary files for the statistics subsystem</entry>

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_stat directory and pg_stat_statements
Hi, Thanks to 187492b6c2e8cafc5b39063ca3b67846e8155d24, pgstat files are now saved to $PGDATA/pg_stat directory at shutdown. But pg_stat_statements file is saved under $PGDATA/global yet. Is this intentional or just oversight? Saving that file to global is harmless, though. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
ash a...@commandprompt.com writes: Stephen Frost sfr...@snowman.net writes: We're not going to re-parse every function in the system, like, ever. Well, only every *affected* function, which might be pretty minimal in the usual case. We don't store dependency information for function bodies, so there's no way to do this except by reparsing everything in sight. A larger issue with the idea is that a function might fail reparsing for reasons having nothing to do with the proposed ALTER TABLE. For instance, it's not at all unusual for functions to contain references to tables that don't normally exist, but are created when the function is to be called (or maybe even by the function itself). Because of this problem, reparsing, in the sense of detecting semantic rather than purely syntactic problems in a function body, is something that we don't actually do *at all*, ever, except when the function is actually executed. (This is part of the reason why there's no dependency info.) Pavel Stehule has made some efforts towards improving that situation for plpgsql functions: https://commitfest.postgresql.org/action/patch_view?id=884 but that patch remains pretty controversial and may never get committed. Even if it does get in, it wouldn't move the goalposts for any other PL. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
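To see why semantic re-validation would misfire, consider a function that parses fine today but references a table created only at call time; re-checking it during an unrelated ALTER TABLE would raise a spurious error. An illustrative example:

    CREATE FUNCTION use_scratch() RETURNS bigint LANGUAGE plpgsql AS $$
    BEGIN
      -- The table does not exist until the function runs, so any
      -- create-time semantic check of the SELECT below would fail.
      CREATE TEMP TABLE scratch (x int) ON COMMIT DROP;
      INSERT INTO scratch VALUES (1), (2);
      RETURN (SELECT sum(x) FROM scratch);
    END $$;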
Re: [HACKERS] IMPORT FOREIGN SCHEMA statement
On Tue, May 27, 2014 at 09:41:06AM -0400, Stephen Frost wrote: * David Fetter (da...@fetter.org) wrote: - We make type mappings settable at the level of: - FDW - Instance (a.k.a. cluster) - Database - Schema - Table - Column While I like the general idea, you seem to be making this appear much more complex than it actually is. For example, I see no value in a table-level uint4 - int8 definition. You chose a particularly silly example, so I have to assume it's a straw man. My point was that since we don't know what things are relevant to preserve and transform here in the design stage, we need to leave them settable by people who have needs we don't know about, i.e. the users of the feature. If you want something which is not the default, just set it on the individual columns of the foreign table, exactly how we handle column name differences. There might be some value in being able to redefine what the default is at the FOREIGN SERVER level, as perhaps MySQL DB X and MySQL DB Y could have different default mappings for some reason, but adding in the rest of those levels doesn't add value imv. This would consist of: - The remote type - The local type to which it maps - The inbound transformation (default identity) - The outbound transformation (default identity) The remote type and the local type are known already to the FDW, no? The above aren't separable items. They all have to be there, even if for user convenience we have two default transformations which are the identity. The FDW will need to know how to do whatever conversions we're talking about also, right? No. The writer of the FDW is not the same person as the user of the FDW, and the former is not omniscient. My idea here is to allow FDW users to supply transformation code at these spots. (If you want to do it in core PG instead of the FDW then I'm thinking you should probably use a view over top of the foreign table..). We can't know in advance what to preserve and what to jettison. We can't even know which server is responsible for doing the transformation. What's nice is that this all falls under what an FDW can do *already*, if it wants (and, actually, it could do it on the table-level technically too, if the FDW supports that, but schema? database? these things don't make sense...). Not to you yet, but I've seen deployed applications with exactly these requirements. For the IMPORT bit, it should just be doing whatever the defaults are unless you've set some different defaults for a given foreign server (also something which could be set using our existing system...). ALTER FOREIGN TABLE foo ADD (mapping '{ datetime: text, inbound: IDENTITY, outbound: IDENTITY }') I'm very much against having two different command languages where one is used when convenient. I did not suggest that we use two different ways to specify this. I did sketch out one path--adding a bunch of SQL grammar--which I made it explicitly clear we shouldn't tread. Apparently, I wasn't quite explicit enough. If we wanted to do this, we should build a full-spec mapping from whatever JSON language you come up with for our utility commands to what we actually implement in the grammar. Excellent plan. Cheers, David. -- David Fetter da...@fetter.org http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fet...@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! 
Consider donating to Postgres: http://www.postgresql.org/about/donate -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Tue, May 27, 2014 at 12:57 PM, Rahila Syed rahilasyed...@gmail.com wrote: Hello All, 0001-CompressBackupBlock_snappy_lz4_pglz extends patch on compression of full page writes to include LZ4 and Snappy . Changes include making compress_backup_block GUC from boolean to enum. Value of the GUC can be OFF, pglz, snappy or lz4 which can be used to turn off compression or set the desired compression algorithm. 0002-Support_snappy_lz4 adds support for LZ4 and Snappy in PostgreSQL. It uses Andres’s patch for getting Makefiles working and has a few wrappers to make the function calls to LZ4 and Snappy compression functions and handle varlena datatypes. Patch Courtesy: Pavan Deolasee Thanks for extending and revising the FPW-compress patch! Could you add your patch into next CF? Also, compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you because I don't want to add new GUC very similar to the existing one. But could you imagine the case where full_page_writes = off. Even in this case, FPW is forcibly written only during base backup. Such FPW also should be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPW, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPW and merge them, though. Tests use JDBC runner TPC-C benchmark to measure the amount of WAL compression ,tps and response time in each of the scenarios viz . Compression = OFF , pglz, LZ4 , snappy ,FPW=off Isn't it worth measuring the recovery performance for each compression algorithm? Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
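For concreteness, the configuration the patch describes would look roughly like this; this is a sketch of the proposed GUC, not an option that exists in core:

    # Proposed by the patch: choose the FPW compression algorithm,
    # or disable compression entirely.
    compress_backup_block = lz4     # one of: off, pglz, snappy, lz4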
Re: [HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build
On Mon, May 26, 2014 at 09:50:42PM +0900, Michael Paquier wrote:

x86_64-w64-mingw32-gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/interfaces/libpq -I../../../src/include -I/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/include -I../pgsql/src/include/port/win32 -DEXEC_BACKEND -I/c/prog/3p64/include/libxml2 -I/c/prog/3p64/include -I/c/prog/3p64/openssl/include -I/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/include/port/win32 -c -o parallel.o /home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/bin/pg_dump/parallel.c

c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/bin/pg_dump/parallel.c: In function 'pgpipe':
c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/bin/pg_dump/parallel.c:1332:2: warning: overflow in implicit constant conversion [-Woverflow]
  handles[0] = handles[1] = INVALID_SOCKET;
  ^
c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/bin/pg_dump/parallel.c:1386:3: warning: overflow in implicit constant conversion [-Woverflow]
  handles[1] = INVALID_SOCKET;
  ^

In mingw-w64, INVALID_SOCKET is defined as ~0: http://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/include/psdk_inc/_socket_types.h Is this overflow caused because INVALID_SOCKET corresponds to a 64-bit value in mingw-w64? Looking at the code, this has existed since 9.3.

I think this is caused because the variable is not defined as SOCKET. The attached patch fixes this. This should prevent the warning.

-- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. +

diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
new file mode 100644
index caedbb8..cd6ef7a
*** a/src/bin/pg_dump/parallel.c
--- b/src/bin/pg_dump/parallel.c
***************
*** 36,42 ****
  #ifdef WIN32
  static unsigned int tMasterThreadId = 0;
  static HANDLE termEvent = INVALID_HANDLE_VALUE;
! static int pgpipe(int handles[2]);
  static int piperead(int s, char *buf, int len);

  /*
--- 36,42 ----
  #ifdef WIN32
  static unsigned int tMasterThreadId = 0;
  static HANDLE termEvent = INVALID_HANDLE_VALUE;
! static int pgpipe(SOCKET handles[2]);
  static int piperead(int s, char *buf, int len);

  /*
*************** readMessageFromPipe(int fd)
*** 1323,1329 ****
   * with recv/send.
   */
  static int
! pgpipe(int handles[2])
  {
  	SOCKET		s;
  	struct sockaddr_in serv_addr;
--- 1323,1329 ----
   * with recv/send.
   */
  static int
! pgpipe(SOCKET handles[2])
  {
  	SOCKET		s;
  	struct sockaddr_in serv_addr;

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On 28 May 2014 15:34, Fujii Masao masao.fu...@gmail.com wrote: Also, compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you because I don't want to add new GUC very similar to the existing one. But could you imagine the case where full_page_writes = off. Even in this case, FPW is forcibly written only during base backup. Such FPW also should be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPW, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPW and merge them, though.

I'd prefer a new name altogether:

torn_page_protection = 'full_page_writes'
torn_page_protection = 'compressed_full_page_writes'
torn_page_protection = 'none'

This allows us to add new techniques later, like

torn_page_protection = 'background_FPWs'

or

torn_page_protection = 'double_buffering'

when/if we add those new techniques.

-- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] pg_sleep() doesn't work well with recovery conflict interrupts.
Hi,

Since a64ca63e59c11d8fe6db24eee3d82b61db7c2c83 pg_sleep() uses WaitLatch() to wait. That's fine in itself. But procsignal_sigusr1_handler, which is used e.g. when resolving recovery conflicts, doesn't unconditionally do a SetLatch(). That means that we'll currently not be able to cancel conflicting backends during recovery for 10min. Now, I don't think that'll happen too often in practice, but it's still annoying.

As an alternative to doing the PG_TRY/save set_latch_on_sigusr1/set set_latch_on_sigusr1/PG_CATCH/reset set_latch_on_sigusr1/ dance in pg_sleep() we could also have RecoveryConflictInterrupt() do an unconditional SetLatch()?

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
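A minimal sketch of that second alternative, assuming the SIGUSR1-driven conflict handler simply always wakes the process latch (illustrative only; the real handler does considerably more per-reason bookkeeping):

    static void
    RecoveryConflictInterrupt(ProcSignalReason reason)
    {
        /* ... existing per-reason conflict bookkeeping ... */

        /* Unconditional wake-up, so WaitLatch()-based waits such as
         * pg_sleep's return promptly and re-check for interrupts.
         * SetLatch() is safe to call from a signal handler. */
        SetLatch(&MyProc->procLatch);
    }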
Re: [HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build
On Wed, May 28, 2014 at 7:38 AM, Bruce Momjian br...@momjian.us wrote: On Mon, May 26, 2014 at 09:50:42PM +0900, Michael Paquier wrote: x86_64-w64-mingw32-gcc -O2 -Wall -Wmissing-prototypes -Wpointer-arith -Wdeclaration-after-statement -Wendif-labels -Wmissing-format-attribute -Wformat-security -fno-strict-aliasing -fwrapv -fexcess-precision=standard -g -I/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/interfaces/libpq -I../../../src/include -I/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/include -I../pgsql/src/include/port/win32 -DEXEC_BACKEND -I/c/prog/3p64/include/libxml2 -I/c/prog/3p64/include -I/c/prog/3p64/openssl/include -I/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/include/port/win32 -c -o parallel.o /home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/bin/pg_dump/parallel.c c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/bin/pg_dump/parallel.c: In function 'pgpipe': c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/bin/pg_dump/parallel.c:1332:2: warning: overflow in implicit constant conversion [-Woverflow] handles[0] = handles[1] = INVALID_SOCKET; ^ c:/mingw/msys/1.0/home/pgrunner/bf/root/HEAD/pgsql.5100/../pgsql/src/bin/pg_dump/parallel.c:1386:3: warning: overflow in implicit constant conversion [-Woverflow] handles[1] = INVALID_SOCKET; ^ In mingw-w64, SOCKET_INVALID is defined as ~0: http://sourceforge.net/p/mingw-w64/mingw-w64/ci/master/tree/mingw-w64-headers/include/psdk_inc/_socket_types.h Is this overflow caused because SOCKET_INVALID corresponds to a 64b value in mingw-w64? Looking at the code this exists since 9.3. I think this is caused because the variable is not defined as SOCKET. The attached patch fixes this. This should prevent the warning. Wouldn't it be better to use pgsocket rather than SOCKET? Cheers, Jeff
Re: [HACKERS] pg_stat directory and pg_stat_statements
On Wed, May 28, 2014 at 7:01 AM, Fujii Masao masao.fu...@gmail.com wrote: But pg_stat_statements file is saved under $PGDATA/global yet. Is this intentional or just oversight? I think it's an oversight. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
Tom Lane t...@sss.pgh.pa.us writes: We don't store dependency information for function bodies, so there's no way to do this except by reparsing everything in sight. A larger issue with the idea is that a function might fail reparsing for reasons having nothing to do with the proposed ALTER TABLE. For instance, it's not at all unusual for functions to contain references to tables that don't normally exist, but are created when the function is to be called (or maybe even by the function itself). Because of this problem, reparsing, in the sense of detecting semantic rather than purely syntactic problems in a function body, is something that we don't actually do *at all*, ever, except when the function is actually executed. (This is part of the reason why there's no dependency info.) Pavel Stehule has made some efforts towards improving that situation for plpgsql functions: https://commitfest.postgresql.org/action/patch_view?id=884 but that patch remains pretty controversial and may never get committed. Even if it does get in, it wouldn't move the goalposts for any other PL. OK, forget functions, I now realize it's not feasible to consider. Can we get back to re-defining views at least? -- Alex -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
ash wrote: Tom Lane t...@sss.pgh.pa.us writes: We don't store dependency information for function bodies, so there's no way to do this except by reparsing everything in sight. OK, forget functions, I now realize it's not feasible to consider. Can we get back to re-defining views at least?

Hi Alex,

I think it's reasonable to try and fix the problems for views (and other objects -- there are other things that can depend on table definitions; composite types come to mind) and ignore function bodies, since you can already get into trouble by using ALTER TABLE today and it's known to be an unsolvable problem.

Now -- do we need to do anything about tables used as return types or argument types for functions?

alvherre=# create table qux (a int, b text);
CREATE TABLE
alvherre=# create or replace function test_qux(a qux) returns void language plpgsql as $$ begin raise notice 'the qux we got is %', $1; end; $$;
CREATE FUNCTION
alvherre=# insert into qux values (1, 'one');
INSERT 0 1
alvherre=# select * from qux, test_qux(qux.*);
NOTICE:  the qux we got is (1,one)
 a |  b  | test_qux
---+-----+----------
 1 | one |
(1 fila)

alvherre=# alter table qux add column c timestamptz;
ALTER TABLE
alvherre=# update qux set c = now();
UPDATE 1
alvherre=# select * from qux, test_qux(qux.*);
NOTICE:  the qux we got is (1,one,)
 a |  b  |               c               | test_qux
---+-----+-------------------------------+----------
 1 | one | 2014-05-28 12:08:28.210895-04 |
(1 fila)

Notice how the NOTICE has a final comma, meaning the tuple descriptor is aware that there is a third column -- but the value in the table is not null per the UPDATE, so the fact that there's nothing after the comma means this is not being handled correctly. If I close the session and start a fresh one, the result is saner:

alvherre=# select * from qux, test_qux(qux.*);
NOTICE:  the qux we got is (1,one,2014-05-28 12:08:28.210895-04)
 a |  b  |               c               | test_qux
---+-----+-------------------------------+----------
 1 | one | 2014-05-28 12:08:28.210895-04 |
(1 fila)

Maybe we're missing a function cache invalidation or something like that.

-- Álvaro Herrera http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build
Bruce Momjian br...@momjian.us writes: I think this is caused because the variable is not defined as SOCKET. The attached patch fixes this. This should prevent the warning. Surely that's just going to move the errors somewhere else. The call site still expects the argument to be int[]. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] libpq: PQexec may block indefinitly
Greetings. I have an application which uses libpq for interaction with a remote PostgreSQL 9.2 server. Client and server nodes are running Linux and the connection is established using TCPv4. The client application has some small fault-tolerance features, which are activated when server-related problems are encountered. One day some bad things happened with network layer hardware and, long story short, the host with the PostgreSQL server got isolated: TCP messages routed to the server node were NOT delivered or acknowledged in any way. None of the fault-tolerance features were triggered.

The critical network problem was resolved and I started to investigate why the clients had become fully inoperable. I have successfully reproduced the problem in a laboratory environment. These iptables commands should be run on the server node after some period of client-server interaction:

# iptables -A OUTPUT -p tcp --sport 5432 -j DROP
# iptables -A INPUT -p tcp --dport 5432 -j DROP

After this my client blocks in libpq code, according to the debugger. I took a glimpse over the master branch of the libpq sources and some questions arose. Namely:

1. The connection to the PostgreSQL server is made without an option to specify SO_RCVTIMEO and SO_SNDTIMEO. Why is that? Is setting socket timeouts considered harmful?

2. PQexec ultimately leads to pqWait, which after some function calls lands in pqSocketCheck and pqSocketPoll. These two functions have a parameter end_time. It is set to (-1) for the PQexec scenario, which leads to an infinite poll timeout in pqSocketPoll. Is it possible to implement a configurable timeout for PQexec calls? Are there existing features which should be used to handle situations like this?

Currently, I have changed the Linux kernel TCPv4 stack counters responsible for retransmission, so the OS actually closes the socket after some period. This is detected by pqSocketPoll's poll and libpq handles the situation correctly: an error is reported to my application. But it's just a workaround.

So, this infinite poll situation looks like an imperfection to me and I think it should be fixed. Waiting for your comments: is it a bug or a feature?

With regards, Dmitry Samonenko
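One way an application can bound these waits today, without any libpq changes, is to use the asynchronous API and poll the socket with its own timeout. A minimal sketch, error handling mostly elided; all of the libpq calls shown (PQsendQuery, PQsocket, PQconsumeInput, PQisBusy, PQgetResult) are standard:

    #include <poll.h>
    #include <libpq-fe.h>

    /* Returns NULL on send failure, timeout, or connection trouble;
     * the caller should then check PQstatus() and reset or close. */
    PGresult *
    exec_with_timeout(PGconn *conn, const char *sql, int timeout_ms)
    {
        if (!PQsendQuery(conn, sql))
            return NULL;

        while (PQisBusy(conn))
        {
            struct pollfd pfd = { PQsocket(conn), POLLIN, 0 };

            if (poll(&pfd, 1, timeout_ms) <= 0)
                return NULL;        /* timed out or poll error */
            if (!PQconsumeInput(conn))
                return NULL;        /* connection trouble */
        }
        return PQgetResult(conn);   /* caller must drain remaining results */
    }

This does not cover the blocking paths inside PQexec itself, which is the poster's point, but it avoids the indefinite block for query execution.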
Re: [HACKERS] pg_stat directory and pg_stat_statements
On Thu, May 29, 2014 at 12:37 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, May 28, 2014 at 7:01 AM, Fujii Masao masao.fu...@gmail.com wrote: But pg_stat_statements file is saved under $PGDATA/global yet. Is this intentional or just oversight? I think it's an oversight.

OK, patch attached.

I'm afraid that it's not okay to change the file layout in $PGDATA at this beta1 stage, because that change basically seems to need initdb; otherwise something like a "no such file or directory" error can happen. But in this case what we need to change is only the location of the pg_stat_statements permanent stats file. So, without initdb, the server will not be able to find the pg_stat_statements stats file, but this is not so harmful. The only problem is that the pg_stat_statements stats which were collected in the past would disappear. OTOH, the server can keep running successfully from then on and no critical data will disappear. Therefore I think we can commit this patch even at beta1. Thoughts?

Regards, -- Fujii Masao

From 8f3281566399bbb088b826b8fe9c3baa5027e5da Mon Sep 17 00:00:00 2001
From: MasaoFujii masao.fu...@gmail.com
Date: Thu, 29 May 2014 02:14:34 +0900
Subject: [PATCH] Save pg_stat_statements statistics file into
 $PGDATA/pg_stat directory at shutdown.

187492b6c2e8cafc5b39063ca3b67846e8155d24 changed pgstat.c so that the stats files were saved into $PGDATA/pg_stat directory at shutdown. But it accidentally forgot to change the location of the pg_stat_statements permanent stats file. This commit fixes pg_stat_statements so that its stats file is also saved into $PGDATA/pg_stat at shutdown.
---
 contrib/pg_stat_statements/pg_stat_statements.c | 2 +-
 src/backend/postmaster/pgstat.c                 | 8 --------
 src/include/pgstat.h                            | 8 ++++++++
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 32d16cc..a3e8c59 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -80,7 +80,7 @@ PG_MODULE_MAGIC;

 /* Location of permanent stats file (valid when database is shut down) */
-#define PGSS_DUMP_FILE	"global/pg_stat_statements.stat"
+#define PGSS_DUMP_FILE	PGSTAT_STAT_PERMANENT_DIRECTORY "/pg_stat_statements.stat"

 /*
  * Location of external query text file. We don't keep it in the core
diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index f864816..3ab1428 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -68,14 +68,6 @@

 /* ----------
- * Paths for the statistics files (relative to installation's $PGDATA).
- * ----------
- */
-#define PGSTAT_STAT_PERMANENT_DIRECTORY	"pg_stat"
-#define PGSTAT_STAT_PERMANENT_FILENAME	"pg_stat/global.stat"
-#define PGSTAT_STAT_PERMANENT_TMPFILE	"pg_stat/global.tmp"
-
-/* ----------
  * Timer definitions.
  * ----------
  */
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index d9de09f..0892533 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -20,6 +20,14 @@
 #include "utils/relcache.h"

+/* ----------
+ * Paths for the statistics files (relative to installation's $PGDATA).
+ * ----------
+ */
+#define PGSTAT_STAT_PERMANENT_DIRECTORY	"pg_stat"
+#define PGSTAT_STAT_PERMANENT_FILENAME	"pg_stat/global.stat"
+#define PGSTAT_STAT_PERMANENT_TMPFILE	"pg_stat/global.tmp"
+
 /* Default directory to store temporary statistics data in */
 #define PG_STAT_TMP_DIR		"pg_stat_tmp"
--
1.7.12.4 (Apple Git-37)

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] IMPORT FOREIGN SCHEMA statement
* David Fetter (da...@fetter.org) wrote: On Tue, May 27, 2014 at 09:41:06AM -0400, Stephen Frost wrote: * David Fetter (da...@fetter.org) wrote: - We make type mappings settable at the level of: - FDW - Instance (a.k.a. cluster) - Database - Schema - Table - Column While I like the general idea, you seem to be making this appear much more complex than it actually is. For example, I see no value in a table-level uint4 - int8 definition. You chose a particularly silly example, so I have to assume it's a straw man. ... I used your example from 20140525194118.gb32...@fetter.org and your suggestion that we support that on a per-table basis. If that combination is silly then let's not support it. My point was that since we don't know what things are relevant to preserve and transform here in the design stage, we need to leave them settable by people who have needs we don't know about, i.e. the users of the feature. This is not relevant as far as I can tell. Users can already make changes to the definitions, or create them however they want the first time by using CREATE FOREIGN TABLE. The point of IMPORT is that it should be for bulk operations- if you have a lot of very specific transformations, then use CREATE instead. I don't see value in rebuilding the flexibility of what a script of CREATEs gives you into a single IMPORT command. The FDW will need to know how to do whatever conversions we're talking about also, right? No. The writer of the FDW is not the same person as the user of the FDW, and the former is not omniscient. My idea here is to allow FDW users to supply transformation code at these spots. Then we're not talking about something which should be IMPORT's job, or at least it goes far beyond it and that's what we're discussing here. This sounds a bit like you're looking to rebuild what ETLs provide in a streaming fashion- and I'm all for that as an interesting project that isn't about adding IMPORT. I still think you could use views for this, or ETL, if you really want to implement it using core instead of the FDW. (If you want to do it in core PG instead of the FDW then I'm thinking you should probably use a view over top of the foreign table..). We can't know in advance what to preserve and what to jettison. We can't even know which server is responsible for doing the transformation. Either side could handle the transformation using views, or there could be transformations on both sides. What's nice is that this all falls under what an FDW can do *already*, if it wants (and, actually, it could do it on the table-level technically too, if the FDW supports that, but schema? database? these things don't make sense...). Not to you yet, but I've seen deployed applications with exactly these requirements. I don't view IMPORT as being part of a deployed application. It's a way of simplifying the process of setting up FDWs, not an ETL mechanism. For the IMPORT bit, it should just be doing whatever the defaults are unless you've set some different defaults for a given foreign server (also something which could be set using our existing system...). ALTER FOREIGN TABLE foo ADD (mapping '{ datetime: text, inbound: IDENTITY, outbound: IDENTITY }') I'm very much against having two different command languages where one is used when convenient. I did not suggest that we use two different ways to specify this. I did sketch out one path--adding a bunch of SQL grammar--which I made it explicitly clear we shouldn't tread. Apparently, I wasn't quite explicit enough. 
What you missed here is that I'd find it objectionable to have some portion of the system use a JSON-based grammar while the rest is an SQL grammar. I'm not entirely thrilled with the jsquery approach for that reason, but at least that's a very contained case which is about a logic expression (and we have logic expressions already in SQL). That's not what you're describing here.

If we wanted to do this, we should build a full-spec mapping from whatever JSON language you come up with for our utility commands to what we actually implement in the grammar. Excellent plan.

Go for it. From here, I don't see anything stopping you from making a wrapper around PG which converts your JSON syntax into SQL (indeed, people have already done this...). I wouldn't get your hopes up about it ever being part of core PG though, and it certainly won't be until it's as complete and capable as the existing SQL grammar.

Thanks, Stephen signature.asc Description: Digital signature
Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?
* ash (a...@commandprompt.com) wrote: OK, forget functions, I now realize it's not feasible to consider. I never meant to imply that it was, but rather to point out that we might have users who actually want to get an error when they're changing a type definition which goes beyond the scope of the explicit action (and therefore could very well have more side-effects than they realize), rather than just doing it for them. Can we get back to re-defining views at least? I'm still not convinced you'll be able to do it in a sensible and reliable way, but you're certainly welcome to continue exploring. Thanks, Stephen
Re: [HACKERS] Bison 3.0 updates
On 5/21/14, 12:29 PM, Tom Lane wrote: Vik Fearing vik.fear...@dalibo.com writes: I'm getting some more of these, including some I thought you had fixed. Bison 3.0.2 on current head. I didn't do anything to suppress those warnings: gram.y:172.1-13: warning: deprecated directive, use ‘%name-prefix’ [-Wdeprecated] %name-prefix="base_yy" ^ because it's hard to see how that's anything but a bison bug. What they want is that you use %name-prefix "base_yy" instead of %name-prefix="base_yy". That makes the warning go away. The %something=something syntax is not documented anywhere, so it looks like it worked more or less by accident. We don't use it anywhere else either (e.g. %expect 0, not %expect=0). -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bison 3.0 updates
Peter Eisentraut pete...@gmx.net writes: What they want is that you use %name-prefix "base_yy" instead of %name-prefix="base_yy". That makes the warning go away. Oh really!? The %something=something syntax is not documented anywhere, so it looks like it worked more or less by accident. We don't use it anywhere else either (e.g. %expect 0, not %expect=0). Hah. That's probably my fault, somewhere way back when. I'll fix it, unless you're already on it. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Bison 3.0 updates
On 5/28/14, 2:43 PM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: What they want is that you use %name-prefix "base_yy" instead of %name-prefix="base_yy". That makes the warning go away. Oh really!? The %something=something syntax is not documented anywhere, so it looks like it worked more or less by accident. We don't use it anywhere else either (e.g. %expect 0, not %expect=0). Hah. That's probably my fault, somewhere way back when. I'll fix it, unless you're already on it. Go ahead. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] jsonb failed assertions
On 05/28/2014 04:13 AM, Peter Geoghegan wrote: On Wed, May 21, 2014 at 4:53 AM, Heikki Linnakangas hlinnakan...@vmware.com wrote: Hmm. The patch looks correct as far as it goes. But that function is still a bit funny. When it compares two identical arrays (or objects), and reaches the WJB_END_ARRAY token, it will still fall into the code that checks what the va and vb types are, and compares the last scalar values in that array again. That's wrong, and will fail if the compiler decides to clobber the local va or vb variables between iterations of the do-while loop, because JsonbIteratorNext() does not set the value when returning WJB_END_ARRAY. That's an obsolete assumption that once actually applied during development. Attached revision also adds handling for that case. Thanks, applied. - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
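To make the failure mode concrete, here is a small self-contained sketch (all names invented; an analogy to the jsonb comparison loop, not the actual PostgreSQL code) of an iterator that, like JsonbIteratorNext(), leaves the output value untouched for end-of-container tokens. The comparison loop must therefore look at the values only when a value-carrying token was returned:

#include <assert.h>
#include <stdio.h>

typedef enum { TOK_VALUE, TOK_END_ARRAY, TOK_DONE } Token;

typedef struct
{
    const Token *toks;
    const int  *vals;
    int         pos;
} Iter;

/* Like JsonbIteratorNext(): *val is only stored for TOK_VALUE */
static Token
next_token(Iter *it, int *val)
{
    Token   t = it->toks[it->pos];

    if (t == TOK_VALUE)
        *val = it->vals[it->pos];
    it->pos++;
    return t;
}

static int
compare_streams(Iter *a, Iter *b)
{
    Token   ta, tb;
    int     va = 0, vb = 0;

    do
    {
        ta = next_token(a, &va);
        tb = next_token(b, &vb);
        if (ta != tb)
            return (ta < tb) ? -1 : 1;

        /*
         * Guard on the token type: for TOK_END_ARRAY, va and vb still
         * hold whatever the previous iteration left there, so comparing
         * them would re-compare the last scalar (or read clobbered
         * values) -- exactly the bug described above.
         */
        if (ta == TOK_VALUE && va != vb)
            return (va < vb) ? -1 : 1;
    } while (ta != TOK_DONE);

    return 0;                   /* streams identical */
}

int
main(void)
{
    Token   toks[] = {TOK_VALUE, TOK_VALUE, TOK_END_ARRAY, TOK_DONE};
    int     vals[] = {1, 2, 0, 0};
    Iter    a = {toks, vals, 0};
    Iter    b = {toks, vals, 0};

    assert(compare_streams(&a, &b) == 0);
    printf("equal\n");
    return 0;
}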
Re: [HACKERS] pg_stat directory and pg_stat_statements
On 28.5.2014 19:52, Fujii Masao wrote: On Thu, May 29, 2014 at 12:37 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, May 28, 2014 at 7:01 AM, Fujii Masao masao.fu...@gmail.com wrote: But the pg_stat_statements file is still saved under $PGDATA/global. Is this intentional or just an oversight? I think it's an oversight. OK, patch attached. I'm afraid that it's not okay to change the file layout in $PGDATA at this beta1 stage, because that change basically seems to need an initdb. Otherwise something like a "no such file or directory" error can happen. But in this case what we need to change is only the location of the pg_stat_statements permanent stats file. So, without an initdb, the server will not be able to find the pg_stat_statements stats file, but this is not so harmful. The only problem is that the pg_stat_statements stats collected in the past would disappear. OTOH, the server can keep running successfully from then on, and no critical data will disappear. Therefore I think we can commit this patch even at beta1. Thoughts? For HEAD, probably yes. But what about backpatching 9.3? Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Re: popen and pclose redefinitions causing many warning in Windows build
On Wed, May 28, 2014 at 12:29:28PM -0400, Tom Lane wrote: Bruce Momjian br...@momjian.us writes: I think this is caused because the variable is not defined as SOCKET. The attached patch fixes this. This should prevent the warning. Surely that's just going to move the errors somewhere else. The call site still expects the argument to be int[]. Ah, yes, you are right. This is a similar problem I had with libpq where PQsocket() had to return an int. Attached is an updated patch which follows my previous coding of checking for PGINVALID_SOCKET, and if not equal, assigns the value to an integer handle. I would also like to rename variable 's' to 'listen_sock', but that is not in the patch, for clarity reasons. Should this be held for 9.5? I think it is only warning removal. On the other hand, portability is what we do during beta testing. -- Bruce Momjian br...@momjian.ushttp://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c new file mode 100644 index caedbb8..4efa8fb *** a/src/bin/pg_dump/parallel.c --- b/src/bin/pg_dump/parallel.c *** readMessageFromPipe(int fd) *** 1320,1337 /* * This is a replacement version of pipe for Win32 which allows returned * handles to be used in select(). Note that read/write calls must be replaced ! * with recv/send. */ static int pgpipe(int handles[2]) { ! SOCKET s; struct sockaddr_in serv_addr; int len = sizeof(serv_addr); handles[0] = handles[1] = INVALID_SOCKET; ! if ((s = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET) { write_msg(modulename, pgpipe: could not create socket: error code %d\n, WSAGetLastError()); --- 1320,1342 /* * This is a replacement version of pipe for Win32 which allows returned * handles to be used in select(). Note that read/write calls must be replaced ! * with recv/send. handles have to be integers so we check for errors then ! * cast to integers. */ static int pgpipe(int handles[2]) { ! pgsocket s, tmp_sock; struct sockaddr_in serv_addr; int len = sizeof(serv_addr); + /* We have to use the Unix socket definition here. */ handles[0] = handles[1] = INVALID_SOCKET; ! /* ! * setup listen socket ! */ ! if ((s = socket(AF_INET, SOCK_STREAM, 0)) == PGINVALID_SOCKET) { write_msg(modulename, pgpipe: could not create socket: error code %d\n, WSAGetLastError()); *** pgpipe(int handles[2]) *** 1363,1375 closesocket(s); return -1; } ! if ((handles[1] = socket(AF_INET, SOCK_STREAM, 0)) == INVALID_SOCKET) { write_msg(modulename, pgpipe: could not create second socket: error code %d\n, WSAGetLastError()); closesocket(s); return -1; } if (connect(handles[1], (SOCKADDR *) serv_addr, len) == SOCKET_ERROR) { --- 1368,1385 closesocket(s); return -1; } ! ! /* ! * setup pipe handles ! */ ! if ((tmp_sock = socket(AF_INET, SOCK_STREAM, 0)) == PGINVALID_SOCKET) { write_msg(modulename, pgpipe: could not create second socket: error code %d\n, WSAGetLastError()); closesocket(s); return -1; } + handles[1] = (int) tmp_sock; if (connect(handles[1], (SOCKADDR *) serv_addr, len) == SOCKET_ERROR) { *** pgpipe(int handles[2]) *** 1378,1384 closesocket(s); return -1; } ! if ((handles[0] = accept(s, (SOCKADDR *) serv_addr, len)) == INVALID_SOCKET) { write_msg(modulename, pgpipe: could not accept connection: error code %d\n, WSAGetLastError()); --- 1388,1394 closesocket(s); return -1; } ! 
if ((tmp_sock = accept(s, (SOCKADDR *) serv_addr, len)) == PGINVALID_SOCKET) { write_msg(modulename, pgpipe: could not accept connection: error code %d\n, WSAGetLastError()); *** pgpipe(int handles[2]) *** 1387,1392 --- 1397,1404 closesocket(s); return -1; } + handles[0] = (int) tmp_sock; + closesocket(s); return 0; } -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
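For readers who want the shape of the technique without the diff noise, here is a stripped-down, self-contained sketch of the pgpipe() idea in POSIX spelling (int descriptors and close(); the Windows code in the patch uses SOCKET, closesocket(), and the INVALID_SOCKET/SOCKET_ERROR checks instead). Error handling is abbreviated; this is illustrative, not the pg_dump code:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Emulate pipe() with a loopback TCP socket pair, so the returned
 * descriptors can be used in select(); reads/writes become recv/send. */
static int
socketpipe(int handles[2])
{
    int         listener;
    struct sockaddr_in addr;
    socklen_t   len = sizeof(addr);

    if ((listener = socket(AF_INET, SOCK_STREAM, 0)) < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;          /* let the kernel pick a port */

    if (bind(listener, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(listener, 1) < 0 ||
        getsockname(listener, (struct sockaddr *) &addr, &len) < 0)
    {
        close(listener);
        return -1;
    }

    /* connect one end ... */
    if ((handles[1] = socket(AF_INET, SOCK_STREAM, 0)) < 0 ||
        connect(handles[1], (struct sockaddr *) &addr, len) < 0)
    {
        close(listener);
        return -1;
    }

    /* ... and accept the other */
    if ((handles[0] = accept(listener, NULL, NULL)) < 0)
    {
        close(handles[1]);
        close(listener);
        return -1;
    }

    close(listener);            /* the listener is no longer needed */
    return 0;
}

int
main(void)
{
    int     fds[2];
    char    buf[6];

    if (socketpipe(fds) != 0)
        return 1;
    send(fds[1], "hello", 6, 0);        /* write end, like write() */
    recv(fds[0], buf, sizeof(buf), 0);  /* read end, like read() */
    printf("%s\n", buf);
    close(fds[0]);
    close(fds[1]);
    return 0;
}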
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On 05/28/2014 11:52 PM, John Lumby wrote: The patch is attached. It is based on a clone of today's 9.4dev source. I have noticed that this source is (not surprisingly) quite a moving target at present, meaning that this patch becomes stale quite quickly. So although this copy is fine for reviewing, it will quite probably soon no longer apply to the current source tree. As mentioned before, if anyone wishes to try this feature out on 9.3.4, I will be making a patch for that soon which I can supply on request. Wow, that's a huge patch. I took a very brief look, focusing on the basic design and ignoring style and other minor things for now: The patch seems to assume that you can put the aiocb struct in shared memory, initiate an asynchronous I/O request from one process, and wait for its completion from another process. I'm pretty surprised if that works on any platform. How portable is POSIX aio nowadays? Googling around, it still seems that on Linux, it's implemented using threads. Does the thread-emulation implementation cause problems with the rest of the backend, which assumes that there is only a single thread? In any case, I think we'll want to encapsulate the AIO implementation behind some kind of an API, to allow other implementations to co-exist. Benchmarks? - Heikki -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
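For reference, the POSIX AIO calls in question look roughly like this (a minimal sketch under stated assumptions: a readable file named 'datafile' exists, and on Linux the program links with -lrt). Here a single process both initiates and polls the request; the open question above is whether an aiocb initiated by one process may legally be polled from a different process when it lives in shared memory:

#include <aio.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

int
main(void)
{
    static char     buf[8192];
    struct aiocb    cb;
    int             fd = open("datafile", O_RDONLY);

    if (fd < 0)
        return 1;

    memset(&cb, 0, sizeof(cb));
    cb.aio_fildes = fd;
    cb.aio_buf = buf;
    cb.aio_nbytes = sizeof(buf);
    cb.aio_offset = 0;

    if (aio_read(&cb) != 0)     /* initiate; returns immediately */
        return 1;

    /* Poll for completion; real code would do useful work meanwhile. */
    while (aio_error(&cb) == EINPROGRESS)
        ;

    printf("read %zd bytes\n", aio_return(&cb));
    return 0;
}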
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Tue, May 27, 2014 at 3:17 PM, John Lumby johnlu...@hotmail.com wrote: Below I am pasting the README we have written for this new functionality which mentions some of the measurements, advantages (and disadvantages) and we welcome all and any comments on this. I think that this is likely to be a useful area to work on, but I wonder: can you suggest a useful test-case or benchmark to show the advantages of the patch you posted? You mention a testcase already, but it's a little short on details. I think it's always a good idea to start with that when pursuing a performance feature. Have you thought about things like specialized prefetching for nested loop joins? -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] SP-GiST bug.
Teodor Sigaev teo...@sigaev.ru writes:

create table xxx (t text);
insert into xxx select 'x' || v::text from generate_series(1, 291) as v;
insert into xxx values ('');
create index xxxidx on xxx using spgist (t);

Fun! And postgres will eat memory forever. It seems to me that checkAllTheSame wrongly decides that all tuples are the same. All tuples except the tuple to be inserted have the same character 'x'; only the new tuple has '\0'. I don't think that checkAllTheSame is broken, but it probably would be after this patch --- ISTM you're breaking the corner case described in the header comment for checkAllTheSame, where we need to force splitting of a set of existing tuples that the opclass can't distinguish. What actually is broken, I think, is the spgist text opclass, because it's using a zero node label for two different situations. The scenario we're looking at here is: 1. We call picksplit with all 292 of these entries. Not surprisingly, it puts the first 291 into a single node with label 'x' and the last one into another node with label '\0'. 2. Because this is too many to fit on one index page, checkAllTheSame decides it had better create an allTheSame tuple for the first 291 and then try again to insert the empty string into that tuple. While you could argue whether this is *necessary*, it is not *incorrect*, so ISTM the opclass had better cope with it. 3. We call spg_text_choose with the allTheSame tuple and the empty-string value to be inserted. It finds the empty-string value doesn't match (since all the nodes have label 'x') so it requests a split:

out->resultType = spgSplitTuple;
out->result.splitTuple.prefixHasPrefix = in->hasPrefix;
out->result.splitTuple.prefixPrefixDatum = in->prefixDatum;
out->result.splitTuple.nodeLabel = UInt8GetDatum('\0');
out->result.splitTuple.postfixHasPrefix = false;

After splitting, we have a new upper tuple with a single node with label '\0', pointing to a new allTheSame tuple containing the original 291 entries. 4. We call spg_text_choose with the new upper tuple and the empty-string value to be inserted. It looks for a node labeled '\0', and finds it, so it reports that we should descend into that node --- ie, down to the new allTheSame tuple. 5. We call spg_text_choose with the new allTheSame tuple and the empty-string value to be inserted. This is exactly like the situation at step 3, so we're in an infinite loop. It doesn't look to me like the core code has done anything wrong here; it just did what the opclass requested. Rather, the problem is that we're using the '\0' node label both to represent a no-op label descent and to represent that we're past the end of the string. I'm afraid there may not be any way to fix this without breaking on-disk compatibility for spgist text indexes :-(. It seems like we have to distinguish the case of a zero-length string from that of a dummy label for a pushed-down allTheSame tuple. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] CREATE REPLICATION SLOT fails on a timeout
Hi, On 2014-05-17 01:34:25 +0200, Andres Freund wrote: On 2014-05-16 17:02:33 -0400, Steve Singer wrote: I don't think that's going to cut it though. The creation can take longer than whatever wal_sender_timeout is set to (when there are lots of long-running transactions). I think checking whether last_reply_timestamp <= 0 during timeout checking is more robust. That makes sense. A patch that does that is attached. I've attached a heavily revised version of that patch. The original didn't touch all the necessary places... It survives creation of logical replication slots under 'intensive pressure', with wal_sender_timeout=10ms. Thanks for the bug report! Pushed a fix for it. I am pretty sure it will, but could you still test that it fixes your problem? Thanks! Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
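The shape of the guard being described, as a generic self-contained sketch (invented names; not the committed walsender code): a timeout that does not start ticking until the first client reply has arrived, so that long-running slot creation cannot trip it:

#include <stdbool.h>
#include <time.h>

/*
 * last_reply == 0 means "no reply received yet"; in that state the
 * timeout must not fire, because the client cannot reply while the
 * server is still busy creating the slot.
 */
static bool
timeout_expired(time_t last_reply, time_t now, int timeout_secs)
{
    if (timeout_secs <= 0)      /* timeout disabled */
        return false;
    if (last_reply <= 0)        /* nothing received yet: don't time out */
        return false;
    return (now - last_reply) > timeout_secs;
}

int
main(void)
{
    time_t  now = time(NULL);

    /* no reply yet => never expires, however long we wait */
    return timeout_expired(0, now + 3600, 30) ? 1 : 0;
}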
Re: [HACKERS] json casts
On 05/27/2014 07:25 PM, Andrew Dunstan wrote: On 05/27/2014 07:17 PM, Tom Lane wrote: Stephen Frost sfr...@snowman.net writes: * Andrew Dunstan (and...@dunslane.net) wrote: Given that this would be a hard coded behaviour change, is it too late to do this for 9.4? No, for my 2c. If we do it by adding casts then it'd require an initdb, so I'd vote against that for 9.4. If we just change behavior in json.c then that objection doesn't apply, so I wouldn't complain. I wasn't proposing to add a cast, just to allow users to add one if they wanted. But I'm quite happy to go with special-casing timestamps, and leave the larger question for another time. Here's a draft patch. I'm still checking to see if there are other places that need to be fixed, but I think this has the main one. cheers andrew diff --git a/src/backend/utils/adt/json.c b/src/backend/utils/adt/json.c index a7364f3..d262bda 100644 --- a/src/backend/utils/adt/json.c +++ b/src/backend/utils/adt/json.c @@ -24,6 +24,7 @@ #include parser/parse_coerce.h #include utils/array.h #include utils/builtins.h +#include utils/formatting.h #include utils/lsyscache.h #include utils/json.h #include utils/jsonapi.h @@ -53,6 +54,8 @@ typedef enum /* type categories for datum_to_json */ JSONTYPE_NULL,/* null, so we didn't bother to identify */ JSONTYPE_BOOL,/* boolean (built-in types only) */ JSONTYPE_NUMERIC, /* numeric (ditto) */ + JSONTYPE_TIMESTAMP, /* we use special formatting for timestamp */ + JSONTYPE_TIMESTAMPTZ, /* ... and timestamptz */ JSONTYPE_JSON,/* JSON itself (and JSONB) */ JSONTYPE_ARRAY,/* array */ JSONTYPE_COMPOSITE, /* composite */ @@ -60,6 +63,13 @@ typedef enum /* type categories for datum_to_json */ JSONTYPE_OTHER/* all else */ } JsonTypeCategory; +/* + * to_char formats to turn timestamps and timpstamptzs into json strings + * that are ISO 8601 compliant + */ +#define TS_ISO8601_FMT \\\-MM-DD\T\HH24:MI:SS.US\\\ +#define TSTZ_ISO8601_FMT \\\-MM-DD\T\HH24:MI:SS.USOF\\\ + static inline void json_lex(JsonLexContext *lex); static inline void json_lex_string(JsonLexContext *lex); static inline void json_lex_number(JsonLexContext *lex, char *s, bool *num_err); @@ -1262,6 +1272,14 @@ json_categorize_type(Oid typoid, *tcategory = JSONTYPE_NUMERIC; break; + case TIMESTAMPOID: + *tcategory = JSONTYPE_TIMESTAMP; + break; + + case TIMESTAMPTZOID: + *tcategory = JSONTYPE_TIMESTAMPTZ; + break; + case JSONOID: case JSONBOID: *tcategory = JSONTYPE_JSON; @@ -1375,6 +1393,29 @@ datum_to_json(Datum val, bool is_null, StringInfo result, } pfree(outputstr); break; + case JSONTYPE_TIMESTAMP: + /* + * The timestamp format used here provides for quoting the string, + * so no escaping is required. 
+ */ + jsontext = DatumGetTextP( +DirectFunctionCall2(timestamp_to_char, val, + CStringGetTextDatum(TS_ISO8601_FMT))); + outputstr = text_to_cstring(jsontext); + appendStringInfoString(result, outputstr); + pfree(outputstr); + pfree(jsontext); + break; + case JSONTYPE_TIMESTAMPTZ: + /* same comment as for timestamp above */ + jsontext = DatumGetTextP( +DirectFunctionCall2(timestamptz_to_char, val, + CStringGetTextDatum(TSTZ_ISO8601_FMT))); + outputstr = text_to_cstring(jsontext); + appendStringInfoString(result, outputstr); + pfree(outputstr); + pfree(jsontext); + break; case JSONTYPE_JSON: /* JSON and JSONB output will already be escaped */ outputstr = OidOutputFunctionCall(outfuncoid, val); diff --git a/src/test/regress/expected/json.out b/src/test/regress/expected/json.out index 9f08676..c4dc8b0 100644 --- a/src/test/regress/expected/json.out +++ b/src/test/regress/expected/json.out @@ -403,6 +403,29 @@ SELECT row_to_json(row((select array_agg(x) as d from generate_series(5,10) x)), {f1:[5,6,7,8,9,10]} (1 row) +-- to_json, timestamps +select to_json(timestamp '2014-05-28 12:22:35.614298'); + to_json +-- + 2014-05-28T12:22:35.614298 +(1 row) + +BEGIN; +SET LOCAL TIME ZONE 10.5; +select to_json(timestamptz '2014-05-28 12:22:35.614298-04'); + to_json + + 2014-05-29T02:52:35.614298+10:30 +(1 row) + +SET LOCAL TIME ZONE -8; +select to_json(timestamptz '2014-05-28 12:22:35.614298-04'); + to_json +- + 2014-05-28T08:22:35.614298-08 +(1 row) + +COMMIT; --json_agg SELECT json_agg(q) FROM ( SELECT $$a$$ || x AS b, y AS c, diff --git a/src/test/regress/expected/json_1.out b/src/test/regress/expected/json_1.out index 13f7687..629e98e 100644 --- a/src/test/regress/expected/json_1.out +++ b/src/test/regress/expected/json_1.out @@ -403,6 +403,29 @@ SELECT row_to_json(row((select array_agg(x) as d from generate_series(5,10) x)), {f1:[5,6,7,8,9,10]} (1 row) +-- to_json, timestamps
Re: [HACKERS] replication protocol documentation inconsistencies
Hi, On 2014-05-21 07:29:53 -0400, Peter Eisentraut wrote: Looking at http://www.postgresql.org/docs/devel/static/protocol-replication.html under START_REPLICATION it goes: The payload of each CopyData message from server to the client contains a message of one of the following formats: If a slot's name is provided via slotname, it will be updated as replication progresses so that the server knows which WAL segments - and, if hot_standby_feedback is on, which transactions - are still needed by the standby. XLogData (B) ... Primary keepalive message (B) ... That second paragraph was inserted recently and doesn't make sense there. It should be moved somewhere else. Hm. I am not sure why it doesn't make sense there? It's about the SLOT $slotname parameter to START_REPLICATION? More generally, it is weird that the message formats are described there, even though the rest of the protocol documentation only mentions the messages by name and then describes the formats later (http://www.postgresql.org/docs/devel/static/protocol-message-types.html and http://www.postgresql.org/docs/devel/static/protocol-message-formats.html). For example, the meaning of the (B) code isn't described until two sections later. I am not sure that makes sense. These messages cannot be sent as toplevel messages - they're just describing the contents of the CopyBoth stream after START_REPLICATION has begun. It seems weird to add these 'subprotocol' messages to the toplevel protocol description. I think the description of the details of these messages should also be moved there. The CopyBothResponse, which is also used for replication only, is also listed among the normal message formats. I think that's different because CopyBoth is a toplevel protocol issue. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
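For orientation, the 'subprotocol' messages being discussed are payloads carried inside CopyData, conceptually laid out like this (a sketch only: the wire format is big-endian and packed, not a literal C struct, and the field names here are descriptive rather than official):

#include <stdint.h>

/* XLogData ('w') CopyData payload, as documented for START_REPLICATION */
typedef struct
{
    char    msgtype;        /* 'w' */
    int64_t dataStart;      /* WAL position of this block */
    int64_t walEnd;         /* current end of WAL on the server */
    int64_t sendTime;       /* server timestamp */
    /* ... WAL data bytes follow ... */
} XLogDataMsg;

/* Primary keepalive ('k') CopyData payload */
typedef struct
{
    char    msgtype;        /* 'k' */
    int64_t walEnd;
    int64_t sendTime;
    char    replyRequested; /* 1 = client should reply as soon as possible */
} PrimaryKeepaliveMsg;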
Re: [HACKERS] [COMMITTERS] pgsql: Fix bogus %name-prefix option syntax in all our Bison files.
Andres Freund and...@2ndquadrant.com writes: On 2014-05-28 22:55:28 +0200, Andres Freund wrote: On 2014-05-28 19:42:35 +0000, Tom Lane wrote: Fix bogus %name-prefix option syntax in all our Bison files. Are you sure about this? When I first saw those warnings after debian unstable got bison 3.0, I read the release notes and interpreted them differently: by accident, *only* the = syntax worked for a long time. Then somewhere around 2.8 they added the syntax without =. That means that < 2.8 versions are likely not to work anymore. According to git tag --contains, the syntax without = was added in 2.4 (not 2.8 as I'd remembered), which was released 2008-11-02. It's been warning since 3.0, which was released 2013-07-25. Yeah, that's what the buildfarm is showing: members with bison 2.3 or less are failing :-(. It's imo not realistic to rely on bison >= 2.4, at least not in the backbranches. Pretty damn annoying. We'll have to live with those warnings for a couple of years. Agreed; even relatively modern platforms such as OS X 10.9 are still shipping 2.3, or maybe even lower. Considering that up to now our benchmark requirement was bison 1.875, requiring 2.4 is a pretty big jump just to get rid of a warning. I guess we have to revert this, and IMO we should also lobby the Bison people to not emit the deprecation warnings yet. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Fix bogus %name-prefix option syntax in all our Bison files.
On 2014-05-28 18:52:22 -0400, Tom Lane wrote: I guess we have to revert this Looks like it. and IMO we should also lobby the Bison people to not emit the deprecation warnings yet. That's a good idea. What I've been thinking about is to add -Wno-deprecated to the bison rule in the interim. Maybe after a configure test for the option. All deprecation warnings so far seem to be pretty unhelpful. Btw, the bison release process and documentation suck. Majorly. The most efficient way to learn about changes seems to be to look at the git repository. Andres -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] [COMMITTERS] pgsql: Fix bogus %name-prefix option syntax in all our Bison files.
Andres Freund and...@2ndquadrant.com writes: On 2014-05-28 18:52:22 -0400, Tom Lane wrote: and IMO we should also lobby the Bison people to not emit the deprecation warnings yet. That's a good idea. What i've been thinking about is to add -Wno-deprecated to the bison rule in the interim. Maybe after a configure test for the option. All deprecation warnings so far seem to be pretty unhelpful. Meh. If we just hide them permanently, we're likely to be blindsided somewhere down the road when they turn a deprecation into an error. What I was wondering about was if we could modify the .y files when building with a pre-2.4 bison. It'd be easy enough to fix this with sed, say. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Re: [COMMITTERS] pgsql: Fix bogus %name-prefix option syntax in all our Bison files.
On 2014-05-28 19:12:44 -0400, Tom Lane wrote: Andres Freund and...@2ndquadrant.com writes: On 2014-05-28 18:52:22 -0400, Tom Lane wrote: and IMO we should also lobby the Bison people to not emit the deprecation warnings yet. That's a good idea. What I've been thinking about is to add -Wno-deprecated to the bison rule in the interim. Maybe after a configure test for the option. All deprecation warnings so far seem to be pretty unhelpful. Meh. If we just hide them permanently, we're likely to be blindsided somewhere down the road when they turn a deprecation into an error. I think some bleeding-edge buildfarm animal will warn us soon enough. It's not as if we're able to do much about the deprecations as is without breaking with older releases. What I was wondering about was if we could modify the .y files when building with a pre-2.4 bison. It'd be easy enough to fix this with sed, say. .oO(m4). Should be doable and might actually be interesting for a couple of other things. I think I'll just stick BISONFLAGS += -Wno-deprecated in my Makefile.custom for now. I.e. I am not volunteering. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Compression of full-page-writes
On Wed, May 28, 2014 at 04:04:13PM +0100, Simon Riggs wrote: On 28 May 2014 15:34, Fujii Masao masao.fu...@gmail.com wrote: Also, the compress_backup_block GUC needs to be merged with full_page_writes. Basically I agree with you, because I don't want to add a new GUC very similar to an existing one. But consider the case where full_page_writes = off. Even in this case, FPWs are forcibly written only during base backup. Should such FPWs also be compressed? Which compression algorithm should be used? If we want to choose the algorithm for such FPWs, we would not be able to merge those two GUCs. IMO it's OK to always use the best compression algorithm for such FPWs and merge them, though. I'd prefer a new name altogether:

torn_page_protection = 'full_page_writes'
torn_page_protection = 'compressed_full_page_writes'
torn_page_protection = 'none'

This allows us to add new techniques later, like

torn_page_protection = 'background_FPWs'

or

torn_page_protection = 'double_buffering'

when/if we add those new techniques. Uh, how would that work if you want to compress the background_FPWs? Use compressed_background_FPWs? -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + Everyone has their own god. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Wed, May 28, 2014 at 6:51 PM, Peter Geoghegan p...@heroku.com wrote: Have you thought about things like specialized prefetching for nested loop joins? Currently, such a thing would need some non-trivial changes to the execution nodes, I believe. For nestloop, correct me if I'm wrong, but index scan nodes don't have visibility of the next tuple to be searched for. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Extended Prefetching using Asynchronous IO - proposal and patch
On Wed, May 28, 2014 at 5:59 PM, Claudio Freire klaussfre...@gmail.com wrote: For nestloop, correct me if I'm wrong, but index scan nodes don't have visibility of the next tuple to be searched for. Nested loop joins are considered a particularly compelling case for prefetching, actually. -- Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
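As a point of reference for what prefetching means at the lowest level here: the prefetch PostgreSQL already does for bitmap heap scans (under effective_io_concurrency) boils down to advising the kernel about blocks it will want soon, and a nestloop prefetch scheme would issue the same kind of hint for blocks the next outer tuples are expected to probe. A minimal sketch of that primitive (posix_fadvise() is a real call; the wrapper and its names are invented):

#include <sys/types.h>
#include <fcntl.h>

/*
 * Hint that a block will be read soon, without blocking the caller.
 * The kernel may start the read in the background.
 */
void
hint_block(int fd, off_t blkno, off_t blcksz)
{
    (void) posix_fadvise(fd, blkno * blcksz, blcksz, POSIX_FADV_WILLNEED);
}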
[HACKERS] Avoiding re-creation of uuid_t state with OSSP UUID
While mucking around with contrib/uuid-ossp, I realized that its usage of the OSSP UUID library is really rather broken: it creates and destroys a uuid_t object for each call of the UUID creation functions. This is not the way you're supposed to use that library. The uuid_t object is meant to hold persistent state such as the system MAC address, so really we ought to make such an object once per session and re-use it, as per the attached proposed patch. Doing it like this has multiple benefits: * saving the cycles needed to create and destroy a uuid_t object, notably including kernel calls to find out the system's MAC address. On my machine, this patch reduces the runtime of uuid_generate_v1() from about 16 microseconds to about 3. * reducing the amount of entropy we draw from /dev/urandom. strace'ing shows that at least with RHEL6's version of OSSP UUID, each uuid_create call reads 2 bytes from /dev/urandom, thus depleting the available entropy system-wide. * enabling the code to actually guarantee that successive V1-style UUIDs are distinct. OSSP remembers the last timestamp it read from the kernel, and if the new one is the same, it increments the saved clock sequence value to guarantee a distinct result. But that logic all depends on re-using a cached uuid_t! Without that, if successive gettimeofday calls produce the same result, we're at risk of producing duplicate UUIDs if the clock sequence values are the same. uuid_create does initialize the clock sequence value to a random number, but since only 14 bits are available for it in the V1 UUID format, the chance of collision is 1 in 16K. Now admittedly you'd need a machine noticeably faster than mine to get duplicate timestamps with the existing code, but we're not that far away from having it happen. So I think the existing code is not only rather broken, but its effects on the system entropy pool are not far short of being a security bug. Accordingly, I'd like to not only apply this to HEAD but back-patch it. Comments? regards, tom lane diff --git a/contrib/uuid-ossp/uuid-ossp.c b/contrib/uuid-ossp/uuid-ossp.c index 88da168..9e9905b 100644 *** a/contrib/uuid-ossp/uuid-ossp.c --- b/contrib/uuid-ossp/uuid-ossp.c *** pguuid_complain(uuid_rc_t rc) *** 141,146 --- 141,182 errmsg(OSSP uuid library failure: error code %d, rc))); } + /* + * We create a uuid_t object just once per session and re-use it for all + * operations in this module. OSSP UUID caches the system MAC address and + * other state in this object. Reusing the object has a number of benefits: + * saving the cycles needed to fetch the system MAC address over and over, + * reducing the amount of entropy we draw from /dev/urandom, and providing a + * positive guarantee that successive generated V1-style UUIDs don't collide. + * (On a machine fast enough to generate multiple UUIDs per microsecond, + * or whatever the system's wall-clock resolution is, we'd otherwise risk + * collisions whenever random initialization of the uuid_t's clock sequence + * value chanced to produce duplicates.) + * + * However: when we're doing V3 or V5 UUID creation, uuid_make needs two + * uuid_t objects, one holding the namespace UUID and one for the result. + * It's unspecified whether it's safe to use the same uuid_t for both cases, + * so let's cache a second uuid_t for use as the namespace holder object. 
+ */ + static uuid_t * + get_cached_uuid_t(int which) + { + static uuid_t *cached_uuid[2] = {NULL, NULL}; + + if (cached_uuid[which] == NULL) + { + uuid_rc_t rc; + + rc = uuid_create(cached_uuid[which]); + if (rc != UUID_RC_OK) + { + cached_uuid[which] = NULL; + pguuid_complain(rc); + } + } + return cached_uuid[which]; + } + static char * uuid_to_string(const uuid_t *uuid) { *** string_to_uuid(const char *str, uuid_t * *** 171,190 static Datum special_uuid_value(const char *name) { ! uuid_t *uuid; char *str; uuid_rc_t rc; - rc = uuid_create(uuid); - if (rc != UUID_RC_OK) - pguuid_complain(rc); rc = uuid_load(uuid, name); if (rc != UUID_RC_OK) pguuid_complain(rc); str = uuid_to_string(uuid); - rc = uuid_destroy(uuid); - if (rc != UUID_RC_OK) - pguuid_complain(rc); return DirectFunctionCall1(uuid_in, CStringGetDatum(str)); } --- 207,220 static Datum special_uuid_value(const char *name) { ! uuid_t *uuid = get_cached_uuid_t(0); char *str; uuid_rc_t rc; rc = uuid_load(uuid, name); if (rc != UUID_RC_OK) pguuid_complain(rc); str = uuid_to_string(uuid); return DirectFunctionCall1(uuid_in, CStringGetDatum(str)); } *** special_uuid_value(const char *name) *** 193,212 static Datum uuid_generate_internal(int mode, const uuid_t *ns, const char *name, int len) { ! uuid_t *uuid; char *str; uuid_rc_t rc; - rc = uuid_create(uuid); - if (rc != UUID_RC_OK)
Re: [HACKERS] SQL access to database attributes
On 05/26/2014 08:19 PM, Vik Fearing wrote: On 05/26/2014 07:10 AM, Stephen Frost wrote: * Tom Lane (t...@sss.pgh.pa.us) wrote: I don't really object to doing an unlocked check for another such database, but I'm not convinced that additional locking to try to prevent a race is worth its keep. +1 on the nannyism, and +1 to ignoring the race. Okay, I'll submit a new patch with racy nannyism and some grammar changes that Tom and I have been discussing in private. Attached are two patches. The first is a refactoring of the createdb/alterdb grammars mostly by Tom which makes all of the options non-keywords that don't otherwise need to be. Not only does this remove the two unreserved keywords I had added (ALLOW and CONNECTIONS) but also removes two existing ones (LC_COLLATE and LC_TYPE), reducing gram.o by about half a percent by Tom's calculations. That's much better than increasing it like my original patch did. The problem is we foolishly adopted a two-word option name (CONNECTION LIMIT) which complicates the grammar. My aping of that for IS TEMPLATE and ALLOW CONNECTIONS only aggravated the situation. And so I changed all the documentation (and pg_dumpall etc) to use CONNECTION_LIMIT instead. We might hopefully one day deprecate the with-space version so the sooner the documentation recommends the without-space version, the better. The old syntax is of course still valid. I also changed the documentation to say connection_limit instead of connlimit. Documentation is for humans, something like connlimit (and later as we'll see allowconn) is for programmers. It also indirectly reminds us that we should not add another multi-word option like I initially did. Now that anything goes grammar-wise, the elog for unknown option is now ereport. I am hoping this gets backpatched. --- The second patch adds my original functionality, except this time the syntax is IS_TEMPLATE and ALLOW_CONNECTIONS (both one word, neither being keywords). It also changes all the catalog updates we used to do because we didn't have this patch. As for the nannyism, the point was to find another database having datallowconn=true but since we have to be connected to a database to issue this command, the simplest is just to disallow the current database (credit Andres on IRC) so that's what I've done as explained with this in-code comment: /* * In order to avoid getting locked out and having to go through standalone * mode, we refuse to disallow connections on the database we're currently * connected to. Lockout can still happen with concurrent sessions but the * likeliness of that is not high enough to worry about. */ I also changed the C variable connlimit to dbconnlimit in AlterDatabase() to be consistent with its analog in createdb(). --- The third and non-existent patch is about regression tests. Tom put in gram.y that we should have some now that the grammar doesn't regulate it, and I wanted some anyway in my original patch; but I don't know what they should look like or where they should go so I'm asking for help on that. -- Vik *** a/contrib/sepgsql/expected/alter.out --- b/contrib/sepgsql/expected/alter.out *** *** 110,116 SET search_path = regtest_schema, regtest_schema_2, public; -- -- misc ALTER commands -- ! 
ALTER DATABASE regtest_sepgsql_test_database CONNECTION LIMIT 999; LOG: SELinux: allowed { setattr } scontext=unconfined_u:unconfined_r:unconfined_t:s0 tcontext=unconfined_u:object_r:sepgsql_db_t:s0 tclass=db_database name=regtest_sepgsql_test_database ALTER DATABASE regtest_sepgsql_test_database SET search_path TO regtest_schema, public; -- not supported yet ALTER TABLE regtest_table ADD COLUMN d float; --- 110,116 -- -- misc ALTER commands -- ! ALTER DATABASE regtest_sepgsql_test_database CONNECTION_LIMIT 999; LOG: SELinux: allowed { setattr } scontext=unconfined_u:unconfined_r:unconfined_t:s0 tcontext=unconfined_u:object_r:sepgsql_db_t:s0 tclass=db_database name=regtest_sepgsql_test_database ALTER DATABASE regtest_sepgsql_test_database SET search_path TO regtest_schema, public; -- not supported yet ALTER TABLE regtest_table ADD COLUMN d float; *** a/contrib/sepgsql/sql/alter.sql --- b/contrib/sepgsql/sql/alter.sql *** *** 85,91 SET search_path = regtest_schema, regtest_schema_2, public; -- -- misc ALTER commands -- ! ALTER DATABASE regtest_sepgsql_test_database CONNECTION LIMIT 999; ALTER DATABASE regtest_sepgsql_test_database SET search_path TO regtest_schema, public; -- not supported yet ALTER TABLE regtest_table ADD COLUMN d float; --- 85,91 -- -- misc ALTER commands -- ! ALTER DATABASE regtest_sepgsql_test_database CONNECTION_LIMIT 999; ALTER DATABASE regtest_sepgsql_test_database SET search_path TO regtest_schema, public; -- not supported yet ALTER TABLE regtest_table ADD COLUMN d float; *** a/doc/src/sgml/ref/alter_database.sgml ---
[HACKERS] Ancient bug in formatting.c/to_char()
Consider this example: [local]/postgres=# SELECT to_char(1e9::float8,'9D9'); to_char -- 10.0 (1 row) [local]/postgres=# SELECT to_char(1e20::float8,'9D9'); to_char 1 (1 row) [local]/postgres=# SELECT to_char(1e20,'9D9'); to_char -- 1.0 (1 row) There is access to uninitialized memory here. #define DEBUG_TO_FROM_CHAR shows this to be the case (garbage is printed to stdout). Valgrind brought this to my attention. I propose the attached patch as a fix for this issue. The direct cause appears to be that float8_to_char() feels entitled to truncate Number description post-digits, while that doesn't happen within numeric_to_char(), say. This is very old code, from the original to_char() patch back in 2000, and the interactions here are not obvious. I think that that should be okay to not do this truncation, since the value of MAXDOUBLEWIDTH was greatly increased in 2007 as part of a round of fixes to bugs of similar vintage. There is still a defensive measure against over-sized input (we bail), but that ought to be unnecessary, strictly speaking. -- Peter Geoghegan *** a/src/backend/utils/adt/formatting.c --- b/src/backend/utils/adt/formatting.c *** *** 59,64 --- 59,65 * --- */ + /* XXX: DEBUG_TO_FROM_CHAR may access uninitialized memory */ #ifdef DEBUG_TO_FROM_CHAR #define DEBUG_elog_output DEBUG3 #endif *** float8_to_char(PG_FUNCTION_ARGS) *** 5427,5440 Num.pre += Num.multi; } orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1); ! len = snprintf(orgnum, MAXDOUBLEWIDTH + 1, %.0f, fabs(val)); ! if (Num.pre len) ! plen = Num.pre - len; ! if (len = DBL_DIG) ! Num.post = 0; ! else if (Num.post + len DBL_DIG) ! Num.post = DBL_DIG - len; ! snprintf(orgnum, MAXDOUBLEWIDTH + 1, %.*f, Num.post, val); if (*orgnum == '-') { /* 0 */ --- 5428,5439 Num.pre += Num.multi; } orgnum = (char *) palloc(MAXDOUBLEWIDTH + 1); ! ! len = snprintf(orgnum, MAXDOUBLEWIDTH + 1, %.*f, Num.post, val); ! ! /* Defensive */ ! if (len = MAXDOUBLEWIDTH + 1) ! elog(ERROR, double precision for character conversion is too wide); if (*orgnum == '-') { /* 0 */ *** a/src/test/regress/expected/window.out --- b/src/test/regress/expected/window.out *** SELECT to_char(SUM(n::float8) OVER (ORDE *** 1765,1771 FROM (VALUES(1,1e20),(2,1)) n(i,n); to_char -- ! 1 1.0 (2 rows) --- 1765,1771 FROM (VALUES(1,1e20),(2,1)) n(i,n); to_char -- ! 1.0 1.0 (2 rows) -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat directory and pg_stat_statements
On Thu, May 29, 2014 at 4:55 AM, Tomas Vondra t...@fuzzy.cz wrote: On 28.5.2014 19:52, Fujii Masao wrote: On Thu, May 29, 2014 at 12:37 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, May 28, 2014 at 7:01 AM, Fujii Masao masao.fu...@gmail.com wrote: But the pg_stat_statements file is still saved under $PGDATA/global. Is this intentional or just an oversight? I think it's an oversight. OK, patch attached. I'm afraid that it's not okay to change the file layout in $PGDATA at this beta1 stage, because that change basically seems to need an initdb. Otherwise something like a "no such file or directory" error can happen. But in this case what we need to change is only the location of the pg_stat_statements permanent stats file. So, without an initdb, the server will not be able to find the pg_stat_statements stats file, but this is not so harmful. The only problem is that the pg_stat_statements stats collected in the past would disappear. OTOH, the server can keep running successfully from then on, and no critical data will disappear. Therefore I think we can commit this patch even at beta1. Thoughts? For HEAD, probably yes. But what about backpatching 9.3? I think no. So we should not backpatch this to 9.3. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Odd uuid-ossp behavior on smew and shearwater
Buildfarm critters smew and shearwater are reporting regression test failures that suggest that the UUID library can't get a system MAC address: http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=smew&dt=2014-05-28%2023%3A38%3A28 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=shearwater&dt=2014-05-29%2000%3A24%3A32 I've just committed regression test adjustments to prevent that from being a failure case, but I am confused about why it's happening. I wouldn't be surprised at not getting a MAC address on a machine that lacks any internet connection, but that surely can't describe the buildfarm environment. Are you curious enough to poke into it and see what's going on? It might be useful to strace a backend that's trying to execute uuid_generate_v1() and see what the kernel interaction looks like exactly. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_stat directory and pg_stat_statements
On Thu, May 29, 2014 at 8:52 AM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, May 29, 2014 at 4:55 AM, Tomas Vondra t...@fuzzy.cz wrote: On 28.5.2014 19:52, Fujii Masao wrote: On Thu, May 29, 2014 at 12:37 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, May 28, 2014 at 7:01 AM, Fujii Masao masao.fu...@gmail.com wrote: But the pg_stat_statements file is still saved under $PGDATA/global. Is this intentional or just an oversight? I think it's an oversight. OK, patch attached. I'm afraid that it's not okay to change the file layout in $PGDATA at this beta1 stage, because that change basically seems to need an initdb. Otherwise something like a "no such file or directory" error can happen. But in this case what we need to change is only the location of the pg_stat_statements permanent stats file. So, without an initdb, the server will not be able to find the pg_stat_statements stats file, but this is not so harmful. The only problem is that the pg_stat_statements stats collected in the past would disappear. OTOH, the server can keep running successfully from then on, and no critical data will disappear. Therefore I think we can commit this patch even at beta1. Thoughts? For HEAD, probably yes. But what about backpatching 9.3? I think no. So we should not backpatch this to 9.3. Just curious. Will it work in an upgrade scenario? -- Thanks Regards, Ashesh Vashi Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Proposing pg_hibernate
On Wed, May 28, 2014 at 5:30 PM, Gurjeet Singh gurj...@singh.im wrote: On Wed, May 28, 2014 at 2:15 AM, Amit Kapila amit.kapil...@gmail.com wrote: How about the cases when shared buffers already contain some data: a. Before Readers start filling shared buffers, if this cluster wishes to join replication as a slave and receive the data from master, then this utility might need to evict some buffers filled during startup phase. A cluster that wishes to be a replication standby would do so while it's in the startup phase. The BlockReaders are launched immediately on the cluster reaching a consistent state, at which point, I presume, in most cases most of the buffers would be unoccupied. Even to reach a consistent state, it might need to get records from the master (for example, to get to the STANDBY_SNAPSHOT_READY state). Hence BlockReaders might evict the occupied buffers, which may be a small fraction of total shared_buffers. Yes, but I think it still depends on how much redo replay happens on different pages. b. As soon as the server completes startup (reached consistent point), it allows new connections which can also use some shared buffers before Reader process could use shared buffers or are you planing to change the time when users can connect to database. The BlockReaders are launched immediately after the cluster reaches a consistent state, that is, just about when it is ready to accept connections. So yes, there is a possibility that the I/O caused by the BlockReaders may affect the performance of queries executed right at cluster startup. But given that the performance of those queries was anyway going to be low (because of empty shared buffers), and that BlockReaders tend to cause sequential reads, and that by default there's only one BlockReader active at a time, I think this won't be a concern in most cases. By the time the shared buffers start getting filled up, the buffer replacement strategy will evict any buffers populated by BlockReaders if they are not used by the normal queries. Even BlockReaders might need to evict buffers filled by user queries or by themselves, in which case there is a chance of contention, but again these are all quite rare scenarios. I am not sure if replacing shared buffer contents in such cases can always be considered useful. IMHO, all of these caveats would affect a very small fraction of use-cases and are eclipsed by the benefits this extension provides in normal cases. I agree with you that there are only a few corner cases where evicting shared buffers by this utility would do harm, but I was wondering if we could avoid even those, say by having it use only available free buffers. I think currently there is no such interface, and inventing a new interface for this case doesn't seem reasonable unless we see some other use case for such an interface. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
Re: [HACKERS] libpq: PQexec may block indefinitly
On Mon, May 26, 2014 at 1:34 PM, Dmitry Samonenko shreddingw...@gmail.com wrote: 1. A connection to the PostgreSQL server is made without an option to specify SO_RCVTIMEO and SO_SNDTIMEO. Why is that? Is setting socket timeouts considered harmful? 2. PQexec ultimately leads to PQwait, which after some function calls lands in pqSocketCheck and pqSocketPoll. These two functions take an end_time parameter, which is set to -1 in the PQexec scenario, and that leads to an infinite poll timeout in pqSocketPoll. Is it possible to implement a configurable timeout for PQexec calls? Are there existing features which should be used to handle situations like this? Have you tried using the cancel functionality: http://www.postgresql.org/docs/9.4/static/libpq-cancel.html Currently, I have changed the Linux kernel tcp4 stack counters responsible for retransmission, so the OS actually closes the socket after some period. This is detected by pqSocketPoll's poll() and libpq handles the situation correctly - an error is reported to my application. But it's just a workaround. There are certain TCP parameters which can be configured for connections: tcp_keepalives_idle, tcp_keepalives_interval, tcp_keepalives_count. With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com
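To illustrate both suggestions concretely (a sketch with a hypothetical host and database name; the keepalives* options in the connection string are libpq's client-side counterparts of those server GUCs, and PQgetCancel()/PQcancel() are the cancel API linked above):

#include <stdio.h>
#include <libpq-fe.h>

int
main(void)
{
    /* Ask the OS to detect a dead peer instead of letting poll() block
     * forever: probe after 30s idle, every 10s, give up after 3 probes. */
    PGconn     *conn = PQconnectdb(
        "host=db.example.com dbname=testdb "
        "keepalives=1 keepalives_idle=30 "
        "keepalives_interval=10 keepalives_count=3");

    if (PQstatus(conn) != CONNECTION_OK)
    {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        return 1;
    }

    /* Separately, a query stuck in PQexec() can be interrupted from
     * another thread (or a signal handler) with the cancel API. */
    PGcancel   *cancel = PQgetCancel(conn);
    char        errbuf[256];

    if (cancel != NULL && !PQcancel(cancel, errbuf, sizeof(errbuf)))
        fprintf(stderr, "cancel request failed: %s\n", errbuf);

    PQfreeCancel(cancel);
    PQfinish(conn);
    return 0;
}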
Re: [HACKERS] pg_stat directory and pg_stat_statements
On Thu, May 29, 2014 at 1:01 PM, Ashesh Vashi ashesh.va...@enterprisedb.com wrote: On Thu, May 29, 2014 at 8:52 AM, Fujii Masao masao.fu...@gmail.com wrote: On Thu, May 29, 2014 at 4:55 AM, Tomas Vondra t...@fuzzy.cz wrote: On 28.5.2014 19:52, Fujii Masao wrote: On Thu, May 29, 2014 at 12:37 AM, Peter Geoghegan p...@heroku.com wrote: On Wed, May 28, 2014 at 7:01 AM, Fujii Masao masao.fu...@gmail.com wrote: But the pg_stat_statements file is still saved under $PGDATA/global. Is this intentional or just an oversight? I think it's an oversight. OK, patch attached. I'm afraid that it's not okay to change the file layout in $PGDATA at this beta1 stage, because that change basically seems to need an initdb. Otherwise something like a "no such file or directory" error can happen. But in this case what we need to change is only the location of the pg_stat_statements permanent stats file. So, without an initdb, the server will not be able to find the pg_stat_statements stats file, but this is not so harmful. The only problem is that the pg_stat_statements stats collected in the past would disappear. OTOH, the server can keep running successfully from then on, and no critical data will disappear. Therefore I think we can commit this patch even at beta1. Thoughts? For HEAD, probably yes. But what about backpatching 9.3? I think no. So we should not backpatch this to 9.3. Just curious. Will it work in an upgrade scenario? You're concerned about the scenario using pg_upgrade? I'm not sure about the details of pg_upgrade. But if this didn't work properly, we would have gotten trouble reports when 9.3 (in which the pg_stat directory was introduced) was released, and I've not heard of any such trouble. Regards, -- Fujii Masao -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers