Re: [HACKERS] 9.4 regression

2013-08-09 Thread Andres Freund
On 2013-08-08 22:58:42 -0500, Jon Nelson wrote:
 On Thu, Aug 8, 2013 at 9:27 PM, Andres Freund and...@2ndquadrant.com wrote:
  On 2013-08-08 16:12:06 -0500, Jon Nelson wrote:
 ...
 
  At this point I'm convinced that the issue is a pathological case in
  ext4. The performance impact disappears as soon as the unwritten
  extent(s) are written to with real data. Thus, even though allocating
  files with posix_fallocate is - frequently - orders of magnitude
  quicker than doing it with write(2), the subsequent re-write can be
  more expensive.  At least, that's what I'm gathering from the various
  threads.
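The two allocation strategies under discussion can be sketched with a small stdlib experiment (Linux only; this is an illustration of the general idea, not PostgreSQL's actual WAL-file code):

```python
import os
import tempfile

SIZE = 16 * 1024 * 1024  # 16 MB, roughly a WAL-segment-sized file

# Strategy 1: posix_fallocate() reserves the extents without writing
# data; on ext4 this leaves them as "unwritten" extents, the case the
# thread identifies as pathological on first rewrite.
fd1, path1 = tempfile.mkstemp()
os.posix_fallocate(fd1, 0, SIZE)

# Strategy 2: the traditional write(2) loop of zeros; slower to
# allocate, but the extents hold real data afterwards.
fd2, path2 = tempfile.mkstemp()
zeros = b"\0" * 8192
for _ in range(SIZE // 8192):
    os.write(fd2, zeros)

# Both files end up with the same apparent size; the difference is in
# the on-disk extent state, not the file metadata.
same_size = os.fstat(fd1).st_size == os.fstat(fd2).st_size == SIZE

for fd, path in ((fd1, path1), (fd2, path2)):
    os.close(fd)
    os.unlink(path)
```

Timing the two loops, and then timing a rewrite pass over each file, is one way to reproduce the asymmetry described above.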
 
 
   Why this issue didn't crop up in earlier testing and why I
  can't seem to make test_fallocate do it (even when I modify
  test_fallocate to write to the newly-allocated file in a mostly-random
  fashion) has me baffled.
 
  It might be kernel version specific and concurrency seems to play a
  role. If you reproduce the problem, could you run a perf record -ga to
  collect a systemwide profile?
 
 Finally, an excuse to learn how to use 'perf'! I'll try to provide
 that info when I am able.

Running perf record as above during the first minute and then doing a
perf report > somefile (redirected to get the noninteractive version)
should get you started.

  There's some more things to test:
  - is the slowdown dependent on the scale? I.e is it visible with -j 1 -c
1?
 
 scale=1 (-j 1 -c 1):
 with fallocate: 685 tps
 without: 727
 
 scale=20
 with fallocate: 129
 without: 402
 
 scale=40
 with fallocate: 163
 without: 511

Ok, so there's some clear correlation with the amount of writers.

  - Does it also occur in synchronous_commit=off configurations? Those
don't fdatasync() from so many backends; that might play a role.
 
 With synchronous_commit=off, the performance is vastly improved.
 Interestingly, the fallocate case is (immaterially) faster than the
 non-fallocate case:   3766tps vs 3700tps.

That's interesting because in the synchronous_commit=off case most of
the writing and syncing should be done by the wal writer. So there's
another hint that there's some scalability issue at play,
presumably in the kernel.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command

2013-08-09 Thread Vik Fearing
On 08/08/2013 02:26 PM, Fabien COELHO wrote:

 As part of routine maintenance monitoring, it is interesting for us to
 have statistics on the CLUSTER command (timestamp of last run, and
 number of runs since stat reset) like we have for (auto)ANALYZE and
 (auto)VACUUM.  Patch against today's HEAD attached.

 I would add this to the next commitfest but I seem to be unable to log
 in with my community account (I can log in to the wiki).  Help
 appreciated.

 Done.


Thank you, but it seems you've duplicated the title from the other patch
(and thanks for adding that one, too!).

https://commitfest.postgresql.org/action/patch_view?id=1190

Vik




Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command

2013-08-09 Thread Fabien COELHO



Thank you, but it seems you've duplicated the title from the other patch
(and thanks for adding that one, too!).


Indeed, possibly a wrong copy paste. Fixed.

--
Fabien.




Re: [HACKERS] question about HTTP API

2013-08-09 Thread Andrew Tipton
On Fri, Aug 9, 2013 at 2:44 AM, Szymon Guz mabew...@gmail.com wrote:


 Do we have any attempts of implementation the HTTP server described at
 http://wiki.postgresql.org/wiki/HTTP_API?

 It seems like there are design ideas only. Are there any ideas about
 implementation like using some existing http servers or writing everything
 from scratch?


I recently threw together a quick-and-dirty prototype of this idea.  It was
an external tool which used the libmicrohttpd library to accept incoming
requests, convert them to a SQL query (which called a stored procedure),
and return the query results.  (It allowed *any* content-type to be
returned, not just JSON.)  I only got as far as handling GET requests.  The
code is available here:

  http://code.malloclabs.com/pghttpd.v1

I'm also aware of an nginx module (ngx_postgres) that lets you transform
requests into queries against a Postgres database, but it isn't generic
-- you have to configure it for each URL that should be handled.
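As a rough sketch of the request-to-query mapping such a gateway performs (the function, the `http_get_*` procedure naming, and the URL scheme here are hypothetical illustrations, not taken from the pghttpd code):

```python
from urllib.parse import urlsplit, parse_qsl

def request_to_sql(method: str, url: str) -> tuple[str, list]:
    """Translate an HTTP request into a parameterized call to a
    hypothetical stored procedure, derived from the URL path."""
    parts = urlsplit(url)
    # e.g. GET /customers -> stored procedure http_get_customers()
    proc = "http_" + method.lower() + "_" + parts.path.strip("/").replace("/", "_")
    params = parse_qsl(parts.query)
    placeholders = ", ".join("%s" for _ in params)
    sql = "SELECT * FROM {}({})".format(proc, placeholders)
    # Query-string values become positional parameters, never inlined
    # into the SQL text, so the mapping stays injection-safe.
    return sql, [value for _, value in params]

sql, args = request_to_sql("GET", "/customers?region=EU&limit=10")
# sql  == "SELECT * FROM http_get_customers(%s, %s)"
# args == ["EU", "10"]
```

The stored procedure itself then controls the content-type and body returned, which is what allows arbitrary content rather than only JSON.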


Regards,
Andrew Tipton


Re: [HACKERS] question about HTTP API

2013-08-09 Thread Andrew Tipton
On Fri, Aug 9, 2013 at 3:44 AM, Josh Berkus j...@agliodbs.com wrote:

 Well, there's HTSQL: http://htsql.org/

 Other than that, no.  I was thinking of creating a general tool as a
 custom background worker, which would take stored procedure calls and
 pass them through to PostgreSQL, returning results as JSON.  Mainly
 because I need it for a project.  However, this wouldn't accept any query.


I'm actually in the process of writing an HTTP server that lives inside
Postgres.  I hope to have a WIP patch ready in the next week or two, and
then (if all goes well) submit it for CF2.  [There are a few impediments to
doing this as an extension module, which I shall detail as part of the WIP
patch...]

Why integrate a webserver with Postgres?  Well, the trend that we're seeing
in web development is to push much of the display logic into client-side
Javascript frameworks and expose the database through an HTTP API.  (Good
examples of this are Parse and Meteor.)  CouchDB can even host the
application's static content alongside the data.   As a result, many
applications don't need any middle-tier Python/Ruby/Java framework at all.

One of my goals is to allow Postgres to directly serve HTTP requests and
return arbitrary content (e.g. text/html in addition to JSON) directly to
end-user browsers.  With the JSON datatype and PL/v8, code can even be
re-used on both client and server.  Getting rid of an entire middle tier
would make small-scale application development dramatically easier.


Regards,
Andrew Tipton


Re: [HACKERS] question about HTTP API

2013-08-09 Thread Greg Stark
On Fri, Aug 9, 2013 at 9:21 AM, Andrew Tipton and...@kiwidrew.com wrote:

 I recently threw together a quick-and-dirty prototype of this idea.  It
 was an external tool which used the libmicrohttpd library to accept
 incoming requests, convert them to a SQL query (which called a stored
 procedure), and return the query results.  (It allowed *any* content-type
 to be returned, not just JSON.)  I only got as far as handling GET
 requests.  The code is available here:


I looked at the wiki and thought it had a lot of good ideas but also a lot
of good questions. Do you have any idea how to tackle the session problem?

Postgres has always assumed session == backend == connection. 2PC prepared
transactions are the one main break in this model and they can take a lot
of short cuts because they know there will be no more operations in the
transaction aside from commit or rollback.

A decent HTTP RPC layer will need to have some way of creating a session
and issuing multiple requests on that session. That session will need to be
stored and available for future requests. The obvious concern is state
like the current database, current role, gucs, and prepared queries. But
even if you're prepared to discard those for a stateless interface the
performance issues of not having a relcache built will be pretty severe.

I suspect this is something better built into something like pgbouncer
which already has to deal with multiplexing many clients onto a single
connection.
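The session state in question can be made concrete with a toy sketch (the class and its fields are hypothetical; a real implementation would also manage connections, expiry, and authentication):

```python
import secrets

class SessionPool:
    """Toy illustration of the per-session state an HTTP RPC layer
    would need to keep between stateless requests: current database,
    current role, and GUC settings. Connection multiplexing itself
    (the pgbouncer-style part) is deliberately omitted."""

    def __init__(self):
        self._sessions = {}

    def create(self, database: str, role: str) -> str:
        # An opaque token stands in for the HTTP-level session id.
        token = secrets.token_hex(16)
        self._sessions[token] = {"database": database,
                                 "role": role,
                                 "gucs": {}}
        return token

    def set_guc(self, token: str, name: str, value: str) -> None:
        self._sessions[token]["gucs"][name] = value

    def state(self, token: str) -> dict:
        return self._sessions[token]

pool = SessionPool()
t = pool.create("appdb", "web_user")
pool.set_guc(t, "statement_timeout", "5s")
```

Prepared statements and relcache contents are the harder part: unlike the values above, they live inside a specific backend, which is exactly why session == backend is baked into the current model.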


-- 
greg


Re: [HACKERS] question about HTTP API

2013-08-09 Thread Andrew Tipton
On 9 Aug 2013 17:03, Greg Stark st...@mit.edu wrote:
 I looked at the wiki and thought it had a lot of good ideas but also a
lot of good questions. do you have any idea how to tackle the session
problem?
 [...]
 A decent HTTP RPC layer will need to have some way of creating a session
and issuing multiple requests on that session. That session will need to be
a stored and available for future requests. The obvious concern is state
like the current database, current role, gucs, and prepared queries. But
even if you're prepared to discard those for a stateless interface the
performance issues of not having a relcache built will be pretty severe.

The performance certainly will be poor to start with, yes.  Sessions and
HTTP simply don't go together, and so I think we need to accept that each
request is going to be stateless.  (We could use Websockets, and pass the
socket to libpq, but that hardly counts as an HTTP API.)

For my patch, I plan to use pre-forked bgworkers which have already
connected to the backend, so that populating the relcache and other process
startup costs don't impact on the HTTP response time.  (This still means
queries are being planned and function code is being compiled for each
request, of course...)

This is going to be a very long series of patches, but IMHO we have to
start somewhere!  For some applications, performance is far less important
than ease-of-use and ease-of-deployment.

Regards,
Andrew Tipton


Re: [HACKERS] Proposal for XML Schema Validation

2013-08-09 Thread Bisen Vikrantsingh Mohansingh MT2012036
Hi Craig Ringer,

Yes, you are right indeed. I have tried to answer your questions in the three
sections below.

(A) XML Schema update

A user may wish to update a schema in the future, so we will provide one more
function:

UPDATE_XML_SCHEMA(URL OF SCHEMA, NAMESPACE, NEW CONTENT OF .XSD)

Here we are treating URL OF SCHEMA as a primary key, and we will overwrite the
content of the .xsd field; there is no provision for altering a selected
portion of the xsd in place.

Whenever an update to the schema happens, before committing the change we will
run a small algorithm:
 1. for each table in which this schema is used:
 2.     if (!validate xml document for each row)
 3.         abort/exit with error;
 4. commit

If the user modifies the schema only by adding some extra optional tags, there
won't be any error. Errors will arise in cases such as adding new
compulsory/required tags, or when the datatype of a certain tag in the .xsd is
modified. This is beyond our control; the obvious solution suggested by you
could be used: the user manually goes through the rows that violate the schema
(for simplicity, we will mention the numbers of the violating rows in our
error message) and modifies or deletes them as required.


(A similar case: suppose you have a table t(a,b,c) with a lot of data, and
later on you want to add a primary key constraint to column 'a'. If the data
in column 'a' is not unique the command may fail, and the user has to handle
the situation manually, perhaps by deleting or modifying the offending rows.)
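The revalidation pass sketched in (A) amounts to collecting the numbers of the rows that fail the schema check. A minimal illustration, where `validate` is a stub standing in for the real libxml2 schema validation:

```python
def violating_rows(rows, validate):
    """Return the 1-based numbers of rows whose xml value fails
    validation, as the proposed error message would report them."""
    return [i for i, xml_doc in enumerate(rows, start=1)
            if not validate(xml_doc)]

# Stub validator: require a <course> element, mimicking a tag that the
# updated schema makes compulsory. A real check would use an XSD.
validate = lambda doc: "<course>" in doc

rows = ["<student><course>DB101</course></student>",
        "<student></student>"]
bad = violating_rows(rows, validate)  # -> [2]
```

If `bad` is non-empty the schema update would abort, reporting those row numbers; otherwise it commits.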


(B) Alter Table

   The sole purpose of using the keyword USE_SCHEMA is to mimic Oracle (I
found this type of syntax somewhere on the Oracle site). I may not be correct,
but a check constraint is typically used only to limit value ranges, so it may
be better to introduce a new, meaningful keyword; otherwise there is no
problem embedding this feature in CHECK().

(C)

Yes, there are memory-management issues with libxml, as mentioned at
http://wiki.postgresql.org/wiki/XML_Support#Implementation_Issues
It is also mentioned there that the issue can be resolved (how, we don't know
yet!).



Thanks,
Vikrantsingh  Pridhvi
IIIT Bangalore

From: Craig Ringer cr...@2ndquadrant.com
Sent: Friday, August 09, 2013 8:27 AM
To: Kodamasimham Pridhvi (MT2012066)
Cc: pgsql-hackers@postgresql.org; Bisen Vikrantsingh Mohansingh MT2012036
Subject: Re: [HACKERS] Proposal for XML Schema Validation

On 08/09/2013 12:39 AM, Kodamasimham Pridhvi (MT2012066) wrote:

 Objective: To Add XML Schema validation and xmlvalidate functions (SQL:2008)

 Description:
 We’ve gone through current support of xml in postgreSQL and found that there 
 is a check for well-formedness of xml document while inserting and updating. 
 We want to extend this feature by adding xml schema validation.
We will be providing user with DDL commands for creating and deleting 
 XML Schema, also provision of associating xml schema with table while 
 creation of new table or while altering table structure, we are planning to 
 use libxml2 library. Proposed syntax is given below.

The first thing that comes to mind here is what if the user wants to
update/replace the schema ? How would you handle re-validating the fields?

Sure, updating XML schemas is not a great idea, but it doesn't stop
people doing it. It might be reasonable to say if you want to do this
you have to drop the dependent constraints, drop the schema, re-create
the schema and re-create the schema constraints though.

Why extend the create table / alter table syntax with USE_SCHEMA? Is
there a compatibility/standards reason to do this? If not, what
advantage does this provide over using a suitable CHECK constraint?

IIRC there were some memory management issues with libxml2 in Pg. Anyone
remember anything about that?

--
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proposal for XML Schema Validation

2013-08-09 Thread Bisen Vikrantsingh Mohansingh MT2012036
Hi ,

In support of the proposal, I'm extending it with a use-case scenario.

Consider a table student(id int, profile xml) where you store a student's id
and profile in XML format. Let the XML format for the profile be as follows.

profile.xml ===
<student>
  <name>
    <firstname>xyz</firstname>
    <lastname>abc</lastname>
  </name>
  <age>20</age>
  <course>DB101</course>
  <address>
    <city>bangalore</city>
    <pincode>560100</pincode>
  </address>
</student>


There may be a situation where the course tag is missing from the XML file.
Inserting this file will not prompt any error message (even though it is one
of the important fields), because the current version of PostgreSQL only
checks the XML document for well-formedness; there is no provision for
validation against an XML schema.

   Initially the user has to register the schema using our function
REGISTER_XML_SCHEMA(URL OF SCHEMA, NAMESPACE, CONTENT OF SCHEMA .XSD FILE),
where the user provides the URI of the schema, a namespace (to avoid namespace
clashes), and the complete schema as text. While creating the table, the user
must associate the schema with the column as below:

Create table tablename (col_name datatype, col_name xml
USE_SCHEMA 'URL OF SCHEMA');


eg: Create table student (id int, profile xml USE_SCHEMA
'http://www.example.com/profile');

 So we are providing a facility to register an XML schema against a column of
datatype xml in a table, so that whenever an insertion into that column
happens, the query processor (or whichever unit is involved in the process)
will first call our function validate_xml_schema() (as proposed in the
previous mail), which will decide whether to proceed with or abort the
insertion.

  In this way we can ensure that all the documents in the xml column are
valid against the schema.
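The gap in the use case above, a well-formed document that is nonetheless missing a required element, can be illustrated with a small stdlib check (the required-tag list here is an assumption standing in for a real XSD, and the function is illustrative, not the proposed C implementation):

```python
import xml.etree.ElementTree as ET

# Stand-in for an XML schema: element paths that must be present.
REQUIRED = ["name/firstname", "age", "course"]

def missing_required(profile_xml: str) -> list:
    """Parsing succeeds for any well-formed document, which is all
    PostgreSQL checks today; this extra pass reports required
    elements that are absent, which is what schema validation adds."""
    root = ET.fromstring(profile_xml)
    return [path for path in REQUIRED if root.find(path) is None]

doc = ("<student><name><firstname>xyz</firstname></name>"
       "<age>20</age></student>")
missing_required(doc)  # -> ['course']: well-formed, yet invalid
```

A schema-validating insert would reject `doc`, whereas the current well-formedness check accepts it.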


Thanks,
Vikrantsingh

From: Kodamasimham Pridhvi (MT2012066)
Sent: Thursday, August 08, 2013 10:09 PM
To: pgsql-hackers@postgresql.org
Cc: Bisen Vikrantsingh Mohansingh MT2012036
Subject: RE: [HACKERS] Proposal for XML Schema Validation

Objective: To Add XML Schema validation and xmlvalidate functions (SQL:2008)

Description:
We’ve gone through the current support for XML in PostgreSQL and found that
there is a check for well-formedness of the XML document while inserting and
updating. We want to extend this feature by adding XML schema validation.
 We will be providing user with DDL commands for creating and deleting 
XML Schema, also provision of associating xml schema with table while creation 
of new table or while altering table structure, we are planning to use libxml2 
library. Proposed syntax is given below.

1. End user perspective:

1.1 DDL
1.1.1   Register xmlschema
   Syntax
 REGISTER_XML_SCHEMA(URL OF SCHEMA, NAMESPACE, CONTENT OF SCHEMA .XSD FILE)
   We will save this information into the system catalog

1.1.2   Delete xmlschema
  Syntax
DELETE_XML_SCHEMA(  “URL OF SCHEMA”)

1.1.3   Modification to the Create Table command
Syntax
Create table tablename (col_name datatype, col_name xml
USE_SCHEMA 'URL OF SCHEMA')
 We will keep a flag in the catalog for XML schema validation for each
table. If an XML schema is specified, then on every insert/update SQL query we
will call validate_xml_schema() [currently the built-in xml_is_well_formed()
is called while inserting/updating; we can place our function call right next
to it]


1.1.4   Similarly for Alter Table commands


2. Developer perspective

2.1. C-function for validation of an XML document

2.1.1 Validating XML

Syntax
   int validate_xml_schema(char xml[], char xml_schema[])
This function will return 0 if validation succeeds, else a respective error
code (which we will define later).
We are planning to use libxml2.
This function will be called on every insert/update SQL query.



From: Andrew Dunstan and...@dunslane.net
Sent: Thursday, August 08, 2013 6:54 PM
To: Kodamasimham Pridhvi (MT2012066)
Cc: pgsql-hackers@postgresql.org; Bisen Vikrantsingh Mohansingh MT2012036; 
r...@iiitb.ac.in
Subject: Re: [HACKERS] Proposal for XML Schema Validation

On 08/08/2013 12:42 AM, Kodamasimham Pridhvi (MT2012066) wrote:
 Hello pgsql-hackers  ,
 With reference to "Add XML Schema validation and xmlvalidate
 functions (SQL:2008)" in the ToDo list, we have gone through the pgsql
 mailing list but we didn't find any significant work in this area, so we
 are proposing our own model for XML schema validation. Please kindly go
 through it and let us know how we can improve it. Please find the
 attached proposal document.






Please post your proposal as text, not as a PDF attachment. That's what
is preferred on this mailing list.

cheers

andrew


Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-08-09 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Sun, Aug 4, 2013 at 4:26 PM, Dimitri Fontaine dimi...@2ndquadrant.fr 
 wrote:
  What I'm yet unsure about is that there's a consensus that the use cases
  are worthy of a new shared catalog in the system. Also I didn't look how
  hard it is to actually provide for it.
 
 A new shared catalog wouldn't actually help, because the actual
 procedure to be run has to live in pg_proc, which is not shared.  And
 that has references to all sorts of other things (like pg_language)
 that aren't shared either.

A shared catalog which defined which *database* to run the trigger in,
with a way to fire off a new backend worker in that database and tell it
to run the trigger, might be interesting and would deal with the issue
that the trigger would behave differently depending on the database
connected to.  That would bring along other issues, of course, but it
seemed an interesting enough idea to mention.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] mvcc catalo gsnapshots and TopTransactionContext

2013-08-09 Thread Robert Haas
On Thu, Aug 8, 2013 at 1:28 PM, Andres Freund and...@2ndquadrant.com wrote:
 Well. It isn't. At least not in general. The specific case triggered
 here though are cache invalidations being processed which can lead to
 the catalog being read (pretty crummy, but not easy to get rid
 of). That's actually safe since before we process the invalidations we
 have done:
 1) CurrentTransactionState->state = TRANS_ABORT
 2) RecordTransactionAbort(), marking the transaction as aborted in the
   clog
 3) marked subxacts as aborted
 4) ProcArrayEndTransaction() (for toplevel ones)

 Due to these, any tqual stuff will treat the current (sub-)xact and its
 children as aborted. So the catalog lookups will use the catalog in a
 sensible state.

 Now, one could argue that it's certainly not safe for anything but
 xact.c itself to play such games. And would be pretty damn right. We
 could add some flag to signal catcache.c to temporarily use
 Assert(IsTransactionBlock()) instead of IsTransactionStmt(), but that
 seems overly complex. I think the danger of code doing stuff in an
 aborted transaction isn't that big.

 This certainly deserves a good comment...

Do you want to propose something?

I basically don't have a very good feeling about this.  Processing
invalidations should invalidate stuff, not try to reread it.  You may
be right that our MVCC snapshot is OK at the point invalidations are
processed, but we don't know what caused the transaction to abort in
the first place.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-08-09 Thread Dimitri Fontaine
Stephen Frost sfr...@snowman.net writes:
 A shared catalog which defined which *database* to run the trigger in,
 with a way to fire off a new backend worker in that database and tell it
 to run the trigger, might be interesting and would deal with the issue
 that the trigger would behave differently depending on the database
 connected to.  That would bring along other issues, of course, but it
 seemed an interesting enough idea to mention.

Yeah, I like that approach. The only drawback is that it requires having
PL/Proxy in core first, or something like Foreign Functions.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] CREATE EVENT TRIGGER syntax

2013-08-09 Thread Robert Haas
On Mon, Aug 5, 2013 at 4:53 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:
 Bruce Momjian br...@momjian.us writes:
 So do we want to keep that AND in the 9.3beta and 9.4 documentation?

 The grammar as in gram.y still allows the AND form, and I think we're
 used to maintain documentation that matches the code here. So I think it
 makes sense to remove both capabilities as we failed to deliver any
 other filter.

 But if we wanted to clean that up, what about having the grammar check for
 the only item we support rather than waiting until CreateEventTrigger()
 to ereport a syntax error?

I have found that it's generally better to recognize such errors in
the post-parse phase rather than during parsing.  When you start
adding more options, that tends to quickly become the only workable
option anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal for XML Schema Validation

2013-08-09 Thread Craig Ringer
On 08/09/2013 05:55 PM, Kodamasimham Pridhvi (MT2012066) wrote:

 (B) Alter Table
 
   The sole purpose of making use of the keyword USE_SCHEMA is to mimic Oracle
 (somewhere on the Oracle site I found this type of syntax)

Well, there's certainly precedent for that - see to_char, the various
different BEGIN permutations, etc.

I would suggest doing that as a second separate step though. First
produce a function based interface that can be tried and tested without
the need to mess with the syntax and the parser. Then once that's in
good shape propose a patch that adds the compatibility syntax.

Among other things, if you're not adding new syntax you're more likely
to be able to prototype this as an extension.

I'm very far from being an expert in getting patches into Pg, though, so
please don't just take my word for it.

 I may not be correct, but a check constraint is only used to
  limit value ranges.

A CHECK constraint can be any logic that refers only to the current row.

Using it with non-immutable (stable/volatile) functions isn't prevented,
but is also not a great idea, so updating an xsd would be a concern, but
it'd otherwise be fine.

 Yes, there are memory management related issues with libxml, as mentioned at
 the link below:
 http://wiki.postgresql.org/wiki/XML_Support#Implementation_Issues
 It is also mentioned there that this issue can be resolved (how? don't know!).

Well, if you're planning on relying on libxml in core (and it'll have to
be in core if you're adding new syntax) then you'll need a solid, well
researched answer to that one or an alternative XML library that's
portable and doesn't have those issues.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] mvcc catalog snapshots and TopTransactionContext

2013-08-09 Thread Andres Freund
On 2013-08-09 09:05:51 -0400, Robert Haas wrote:
 On Thu, Aug 8, 2013 at 1:28 PM, Andres Freund and...@2ndquadrant.com wrote:
  Well. It isn't. At least not in general. The specific case triggered
  here though are cache invalidations being processed which can lead to
  the catalog being read (pretty crummy, but not easy to get rid
  of). That's actually safe since before we process the invalidations we
  have done:
  1) CurrentTransactionState->state = TRANS_ABORT
  2) RecordTransactionAbort(), marking the transaction as aborted in the
 clog
  3) marked subxacts as aborted
  4) ProcArrayEndTransaction() (for toplevel ones)
 
  Due to these, any tqual stuff will treat the current (sub-)xact and its
  children as aborted. So the catalog lookups will use the catalog in a
  sensible state.
 
  Now, one could argue that it's certainly not safe for anything but
  xact.c itself to play such games. And would be pretty damn right. We
  could add some flag to signal catcache.c to temporarily use
  Assert(IsTransactionBlock()) instead of IsTransactionStmt() but that
  seems overly complex. I think the danger of code doing stuff in an
  aborted transaction isn't that big.
 
  This certainly deserves a good comment...
 
 Do you want to propose something?

I can, but it will have to wait a couple of days. I am only still online
because my holiday plans didn't 100% work out and got delayed by a
day...

 I basically don't have a very good feeling about this.  Processing
 invalidations should invalidate stuff, not try to reread it.

I agree. But fixing that seems to require a good amount of surgery. The
problem is that currently we cannot know for sure some index doesn't
still use the index support infrastructure :(.

 You may
 be right that our MVCC snapshot is OK at the point invalidations are
 processed, but we don't know what caused the transaction to abort in
 the first place.

Well, we know it has been an ERROR and nothing worse. And that it
successfully longjmp'ed up the way to postgres.c or one of the PLs.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proposal for XML Schema Validation

2013-08-09 Thread Tom Lane
Craig Ringer cr...@2ndquadrant.com writes:
 On 08/09/2013 05:55 PM, Kodamasimham Pridhvi (MT2012066) wrote:
  The sole purpose of making use of the keyword USE_SCHEMA is to mimic Oracle
  (somewhere on the Oracle site I found this type of syntax)

 Well, there's certainly precedent for that - see to_char, the various
 different BEGIN permutations, etc.

 I would suggest doing that as a second separate step though. First
 produce a function based interface that can be tried and tested without
 the need to mess with the syntax and the parser. Then once that's in
 good shape propose a patch that adds the compatibility syntax.

TBH I think any such syntax would be rejected.  We have enough trouble
dealing with the SQL standards committee's creative ideas about weird
syntax with unnecessary keywords.  Oracle compatibility is not going
to be enough of an argument for inventing another keyword.  Especially
not if it has to be reserved, which seems rather likely given where
you're proposing to put it.

Having to add another catalog column for the sole use of this feature is
another thing that's unlikely to fly.  (A general rule of thumb is that
if a proposed feature imposes overhead on everybody, whether they ever
use that feature or not, it had better be something that a pretty large
percentage of people *will* use.  I doubt this meets that standard.)

So if you can do it along the lines of CHECK(xml_validates(xml_col_name,
'schema name')), I would strongly urge you to pursue that path.
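
A constraint along those lines might look like the following sketch, where
xml_validates() stands for the function being proposed in this thread (it
does not exist in PostgreSQL today):

```sql
-- Hypothetical: xml_validates(doc, schema_name) is the proposed
-- validation function, not an existing PostgreSQL function.
CREATE TABLE invoices (
    id  serial PRIMARY KEY,
    doc xml,
    CONSTRAINT doc_matches_schema
        CHECK (xml_validates(doc, 'invoice_schema'))
);
```

This keeps the feature out of the grammar entirely: no new keyword, no new
catalog column, and the constraint can be added or dropped with the
existing ALTER TABLE machinery.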

regards, tom lane




Re: ALTER SYSTEM SET command to change postgresql.conf parameters (RE: [HACKERS] Proposal for Allow postgresql.conf values to be changed via SQL [review])

2013-08-09 Thread David Fetter
On Fri, Aug 09, 2013 at 03:08:45PM +0200, Dimitri Fontaine wrote:
 Stephen Frost sfr...@snowman.net writes:
  A shared catalog which defined which *database* to run the trigger
  in, with a way to fire off a new backend worker in that database
  and tell it to run the trigger, might be interesting and would
  deal with the issue that the trigger would behave differently
  depending on the database connected to.  That would bring along
  other issues, of course, but it seemed an interesting enough idea
  to mention.
 
 Yeah, I like that approach. The only drawback is that it requires
 having PLproxy in core first, or something like Foreign Functions or
 something.

SQL/MED does define such an API.  Whether we find it useful enough to
make it the default way of doing things is a separate matter.  I'll do
some research.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] mvcc catalog snapshots and TopTransactionContext

2013-08-09 Thread Noah Misch
On Tue, Aug 06, 2013 at 09:06:59AM +0200, Andres Freund wrote:
 On 2013-08-05 13:09:31 -0400, Noah Misch wrote:
  When we call AtEOSubXact_Inval() or AtEOXact_Inval() with a relation still
  open, we can potentially get a relcache rebuild and therefore a syscache
  lookup as shown above.  CommitSubTransaction() is also potentially affected,
  though I don't have an SQL-level test case for that.  It calls
  CommandCounterIncrement() after moving to TRANS_COMMIT.  That CCI had better
  find no invalidations of open relations, or we'll make syscache lookups.
  (In simple tests, any necessary invalidations tend to happen at the CCI in
  CommitTransactionCommand(), so the later CCI does in fact find nothing to do.
  I have little confidence that should be counted upon, though.)
 
  How might we best rearrange things to avoid these hazards?
 
 Ok. After a good bit of code reading, I think this isn't an actual bug
 but an overzealous Assert(). I think it should be
 Assert(IsTransactionBlock()) not Assert(IsTransactionState());

IsTransactionBlock() is for higher-level things that care about actual use of
BEGIN.  It's false in the middle of executing a single-statement transaction,
but that's of course a perfectly valid time for syscache lookups.

 The reason for that is that when we do the AtEO(Sub)?Xact_Inval(), we've
 already done a RecordTransactionAbort(true|false) and
  CurrentTransactionState->state = TRANS_ABORT. So the visibility routines
 have enough information to consider rows created by the aborted
 transaction as invisible.
 
 I am not really happy with the RelationReloadIndexInfo()s in
 RelationClearRelation() when we're in an aborted state, especially as
 the comment surrounding them are clearly out of date, but I don't see a
 bug there anymore.

Interesting.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] confusing error message

2013-08-09 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 This is all sufficiently bizarre that I don't know if there's an
 easy explanation.

It occurs to me that users, when faced with complex error messages, are
very likely to go to their favorite search engine with them, and that
rarely leads them to any documentation on the subject; instead they get
mailing list archives or Q-and-A websites.

Perhaps we should add an area to our documentation which provides more
information about the specific error messages which PostgreSQL returns?
That's not a terribly exciting bit of documentation to write, but
perhaps it would be very useful for our users.

Thoughts?

Thanks,

Stephen




Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
 Thoughts?  In particular, anyone want to bikeshed on the message wording?

Looks like a good idea to me and the wording looks fine to me.

 Does this rise to the level of a usability bug that ought to be
 back-patched?  As I said, we've seen this type of thinko multiple
 times before.

For this, I'd say to not back-patch it; we seem to have had enough fun
with changing error messaging in back branches already lately (eg:
the recent autovacuum changes..).

Thanks,

Stephen




Re: [HACKERS] question about HTTP API

2013-08-09 Thread Josh Berkus

 For my patch, I plan to use pre-forked bgworkers which have already
 connected to the backend, so that populating the relcache and other process
 startup costs don't impact on the HTTP response time.  (This still means
 queries are being planned and function code is being compiled for each
 request, of course...)
 
 This is going to be a very long series of patches, but IMHO we have to
 start somewhere!  For some applications, performance is far less important
 than ease-of-use and ease-of-deployment.

Agreed.  Too bad you can't do this as an extension, it would allow you
to rev releases a lot faster than once a year.

Actually, maybe you should look at what is the minimum patch required
to enable a webserver extension, with the idea that most of the
webserver code would still live outside the core?  That way you could
continue to develop it a lot faster.

Also, if all aspects of the web services model (management of sessions,
security, etc.) need to be a core PostgreSQL patch, you're in for a
really long set of arguments since there's no one best way to do these
things.  Keeping the web services engine outside the core would let you
not have those arguments on this list, which otherwise would likely
cause the feature to miss 9.4.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Tom Lane
Stephen Frost sfr...@snowman.net writes:
 * Tom Lane (t...@sss.pgh.pa.us) wrote:
 Does this rise to the level of a usability bug that ought to be
 back-patched?  As I said, we've seen this type of thinko multiple
 times before.

 For this, I'd say to not back-patch it; we seem to have had enough fun
 with changing error messaging in back branches already lately (eg:
 the recent autovacuum changes..).

Well, since these would be new messages, they'd just not get translated
(until the translators got around to them).  Seems like the worst case
scenario is that someone who didn't understand the English version would
remain as clueless as before, which isn't much of a downside.

regards, tom lane




Re: [HACKERS] pg_dump and schema names

2013-08-09 Thread Bruce Momjian
On Fri, Aug  9, 2013 at 01:48:43AM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  pg_dump goes to great lengths not to hard-code the schema name into
  commands like CREATE TABLE, instead setting the search_path before
  creating the table;  these commands:
 
  CREATE SCHEMA xx;
  CREATE TABLE xx.test(x int);
 
  generates this output:
 
  SET search_path = xx, pg_catalog;
  CREATE TABLE test (
  x integer
  );
 
  If you dump a schema and want to reload it into another schema, you
  should only need to update that one search_path line.  However, later in
  the dump file, we hardcode the schema name for setting the object owner:
 
  ALTER TABLE xx.test OWNER TO postgres;
 
  Could we use search_path here to avoid the schema designation?
 
 Perhaps, but that's not likely to reduce the number of places you have to
 edit, unless your dump is only one schema anyway.
 
 The practical difficulties involved can be seen by reading the comments
 and code for _getObjectDescription().

Yes, I looked at that.  Seems _getObjectDescription() is only called
from _printTocEntry(), and that function has a call to
_selectOutputSchema() at the top, so we already know we have search_path
set to the proper schema.

The attached patch removes the unnecessary schema qualification for
ALTER OWNER, and the attached dump file shows a two-schema dump that
restores just fine.

Basically, if we are going to use search_path to avoid schema
specification, we should do it in ALTER OWNER too.
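
For illustration (taken from the attached dump), with the patch applied the
per-table section relies entirely on the previously emitted search_path, so
relocating a dump to a different schema needs only the SET lines edited:

```sql
SET search_path = xx, pg_catalog;   -- edit this line to restore elsewhere

CREATE TABLE test (
    x integer
);

ALTER TABLE test OWNER TO postgres; -- no longer "ALTER TABLE xx.test ..."
```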

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
new file mode 100644
index cd7669b..0a79489
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*** _getObjectDescription(PQExpBuffer buf, T
*** 2906,2913 
  		strcmp(type, "TEXT SEARCH CONFIGURATION") == 0)
  	{
  		appendPQExpBuffer(buf, "%s ", type);
- 		if (te->namespace && te->namespace[0])	/* is null pre-7.3 */
- 			appendPQExpBuffer(buf, "%s.", fmtId(te->namespace));
  
  		/*
  		 * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
--- 2906,2911 
--
-- PostgreSQL database dump
--

SET statement_timeout = 0;
SET lock_timeout = 0;
SET client_encoding = 'UTF8';
SET standard_conforming_strings = on;
SET check_function_bodies = false;
SET client_min_messages = warning;

--
-- Name: xx; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA xx;


ALTER SCHEMA xx OWNER TO postgres;

--
-- Name: yy; Type: SCHEMA; Schema: -; Owner: postgres
--

CREATE SCHEMA yy;


ALTER SCHEMA yy OWNER TO postgres;

--
-- Name: plpgsql; Type: EXTENSION; Schema: -; Owner: 
--

CREATE EXTENSION IF NOT EXISTS plpgsql WITH SCHEMA pg_catalog;


--
-- Name: EXTENSION plpgsql; Type: COMMENT; Schema: -; Owner: 
--

COMMENT ON EXTENSION plpgsql IS 'PL/pgSQL procedural language';


SET search_path = xx, pg_catalog;

SET default_tablespace = '';

SET default_with_oids = false;

--
-- Name: test2; Type: TABLE; Schema: xx; Owner: postgres; Tablespace: 
--

CREATE TABLE test2 (
x integer
);


ALTER TABLE test2 OWNER TO postgres;

SET search_path = yy, pg_catalog;

--
-- Name: zz; Type: TABLE; Schema: yy; Owner: postgres; Tablespace: 
--

CREATE TABLE zz (
x integer
);


ALTER TABLE zz OWNER TO postgres;

SET search_path = xx, pg_catalog;

--
-- Data for Name: test2; Type: TABLE DATA; Schema: xx; Owner: postgres
--

COPY test2 (x) FROM stdin;
\.


SET search_path = yy, pg_catalog;

--
-- Data for Name: zz; Type: TABLE DATA; Schema: yy; Owner: postgres
--

COPY zz (x) FROM stdin;
\.


--
-- Name: public; Type: ACL; Schema: -; Owner: postgres
--

REVOKE ALL ON SCHEMA public FROM PUBLIC;
REVOKE ALL ON SCHEMA public FROM postgres;
GRANT ALL ON SCHEMA public TO postgres;
GRANT ALL ON SCHEMA public TO PUBLIC;


--
-- PostgreSQL database dump complete
--




Re: [HACKERS] question about HTTP API

2013-08-09 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Agreed.  Too bad you can't do this as an extension, it would allow you
 to rev releases a lot faster than once a year.

 Actually, maybe you should look at what is the minimum patch required
 to enable a webserver extension, with the idea that most of the
 webserver code would still live outside the core?  That way you could
 continue to develop it a lot faster.

+1.  I think for reasons such as security, a lot of people would rather
*not* see any such thing in core anyway, independent of development
issues.  It's also far from clear that there is only one desirable
behavior of this sort, so a design path that offers the possibility
of multiple webserver implementations as separate extensions seems
attractive.

regards, tom lane




Re: [HACKERS] pg_dump and schema names

2013-08-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Fri, Aug  9, 2013 at 01:48:43AM -0400, Tom Lane wrote:
 The practical difficulties involved can be seen by reading the comments
 and code for _getObjectDescription().

  Yes, I looked at that.  Seems _getObjectDescription() is only called
 from _printTocEntry(), and that function has a call to
 _selectOutputSchema() at the top, so we already know we have search_path
 set to the proper schema.

 The attached patch removes the unnecessary schema qualification for
  ALTER OWNER, and the attached dump file shows a two-schema dump that
 restores just fine.

This really requires more than no attention to the comments, especially
since you just removed the only apparent reason for _getObjectDescription
to make a distinction between objects whose name includes a schema and
those that don't.

regards, tom lane




Re: [HACKERS] confusing error message

2013-08-09 Thread Craig Ringer
On 08/10/2013 12:09 AM, Stephen Frost wrote:
 Perhaps we should add an area to our documentation which provides
 more information about the specific error messages which PostgreSQL
 returns? That's not a terribly exciting bit of documentation to
 write, but perhaps it would be very useful for our users.

PG00204 Error reading control file

More seriously, with interpolated strings for relation names etc it can
be hard to know which chunks to search for, and search engines aren't
always good at having the whole message thrown at them.

I'm not actually proposing in-text message identifiers ... I'm sure
enough people have ignored all sanity and reason and parsed message
strings that this would cause breakage all over the place. It'd also be
a nightmare for translators and would easily be confused for an SQLSTATE.

A separate field in the structured messages would be saner, but of
course most clients would not then display it so the user would never
know it was there, much the same way few people seem to realise you can
get the function, source file and line from error messages already.
After all, how many people do you think run with VERBOSITY=verbose in psql?
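
For reference, those extra fields are already being sent; in psql they only
show up once verbosity is raised:

```sql
\set VERBOSITY verbose
SELECT 1/0;
-- With verbose output, psql shows the SQLSTATE plus the reporting
-- function, source file, and line number in addition to the message.
```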

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_dump and schema names

2013-08-09 Thread Bruce Momjian
On Fri, Aug  9, 2013 at 12:53:20PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Fri, Aug  9, 2013 at 01:48:43AM -0400, Tom Lane wrote:
  The practical difficulties involved can be seen by reading the comments
  and code for _getObjectDescription().
 
   Yes, I looked at that.  Seems _getObjectDescription() is only called
  from _printTocEntry(), and that function has a call to
  _selectOutputSchema() at the top, so we already know we have search_path
  set to the proper schema.
 
  The attached patch removes the unnecessary schema qualification for
  ALTER OWNER, and the attached dump file show a two-schema dump that
  restores just fine.
 
 This really requires more than no attention to the comments, especially
 since you just removed the only apparent reason for _getObjectDescription
 to make a distinction between objects whose name includes a schema and
 those that don't.

I am confused.  Are you saying I didn't read the comments, or that I can
now merge the schema-qualified and non-schema-qualified object sections? 

Also, this seems like dead code as there is no test for INDEX in the
if() block it exists in:

/*
 * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
 * into te->tag for an index. This check is heuristic, so make its
 * scope as narrow as possible.
 */
if (AH->version < K_VERS_1_7 &&
    te->tag[0] == '"' &&
    te->tag[strlen(te->tag) - 1] == '"' &&
    strcmp(type, "INDEX") == 0)
    appendPQExpBuffer(buf, "%s", te->tag);
else

Please advise.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] confusing error message

2013-08-09 Thread Alvaro Herrera
I seem to remember somebody proposed an errurl() macro so that we could
add URLs to certain particularly confusing error reports.

[searches the archives]  Bah, that was me, and some other ideas were
proposed:
http://www.postgresql.org/message-id/48ca9d5f.6060...@esilo.com

-- 
Álvaro Herrera  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proposal for XML Schema Validation

2013-08-09 Thread Bisen Vikrantsingh Mohansingh MT2012036
Hi Craig Ringer,

yeah, you are right indeed. I tried to answer your questions in three
sections below.

(A) XML Schema update

The user may wish to update the schema in future, so we will provide one
more function:

UPDATE_XML_SCHEMA(URL OF SCHEMA, NAMESPACE, NEW CONTENT OF .XSD)

Here we are assuming the URL OF SCHEMA is the primary key, and we will
overwrite the content of the .xsd field; there is no provision for altering
a selective portion of the xsd in place.

Whenever an update to the schema happens, before committing the changes we
will run a small algorithm:
 1. For all tables in which this schema is used
 2.if(!validate xml document for each row)
 3.   abort/exit with error;
 4. commit

If the user modifies the schema by adding some extra optional tags then
there won't be any error; errors will arise in cases such as adding new
compulsory/required tags, or when the datatype in the .xsd for a certain
tag is modified. This is beyond our control, and the obvious solution
suggested by you could be used: the user will manually go through the rows
which violate the schema (for simplicity, we will mention the row numbers
which violate the schema in our error message) and do modification/deletion
as required.


(
A similar case: suppose you have a table t(a,b,c) with a lot of data, and
later on you want to add a primary key constraint to column 'a'. If the
data in column 'a' is not unique then this will fail, and the user has to
handle the situation manually, perhaps by deleting or modifying the
offending rows.
)
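
That analogous case can be seen directly with a toy table containing
duplicate data:

```sql
CREATE TABLE t (a int, b int, c int);
INSERT INTO t VALUES (1, 2, 3), (1, 4, 5);  -- column "a" is not unique
ALTER TABLE t ADD PRIMARY KEY (a);
-- fails ("could not create unique index"); the user must first fix the
-- duplicate rows, just as schema-violating XML rows would need fixing
```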


(B) Alter Table

   The sole purpose of making use of the keyword USE_SCHEMA is to mimic
Oracle (somewhere on the Oracle site I found this type of syntax). I may
not be correct, but a check constraint is only used to limit value ranges.
So it is better to introduce a new meaningful keyword, or else there is no
problem embedding this feature with CHECK().

(C)

Yes, there are memory management related issues with libxml, as mentioned
at the link below:
http://wiki.postgresql.org/wiki/XML_Support#Implementation_Issues
It is also mentioned there that this issue can be resolved (how? don't know!).



Thanks,
Vikrantsingh  Pridhvi
IIIT Bangalore

From: Craig Ringer cr...@2ndquadrant.com
Sent: Friday, August 09, 2013 8:27 AM
To: Kodamasimham Pridhvi (MT2012066)
Cc: pgsql-hackers@postgresql.org; Bisen Vikrantsingh Mohansingh MT2012036
Subject: Re: [HACKERS] Proposal for XML Schema Validation

On 08/09/2013 12:39 AM, Kodamasimham Pridhvi (MT2012066) wrote:

 Objective: To Add XML Schema validation and xmlvalidate functions (SQL:2008)

 Description:
 We've gone through the current support of XML in PostgreSQL and found that
 there is a check for well-formedness of an xml document while inserting
 and updating. We want to extend this feature by adding xml schema validation.
We will be providing the user with DDL commands for creating and
 deleting an XML Schema, also provision for associating an xml schema with
 a table while creating a new table or while altering the table structure;
 we are planning to use the libxml2 library. Proposed syntax is given below.

The first thing that comes to mind here is: what if the user wants to
update/replace the schema? How would you handle re-validating the fields?

Sure, updating XML schemas is not a great idea, but it doesn't stop
people doing it. It might be reasonable to say if you want to do this
you have to drop the dependent constraints, drop the schema, re-create
the schema and re-create the schema constraints though.

Why extend the create table / alter table syntax with USE_SCHEMA? Is
there a compatibility/standards reason to do this? If not, what
advantage does this provide over using a suitable CHECK constraint?

IIRC there were some memory management issues with libxml2 in Pg. Anyone
remember anything about that?

--
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Josh Berkus
Tom,

 I thought about trying to leave similar breadcrumbs if the logging
 parameters are changed while the postmaster is running, but it would add a
 fair amount of complication to the patch, and I'm not sure there's a lot
 of value in it.  On-the-fly logging parameter changes don't happen without
 active DBA involvement, so it's a lot harder to credit that somebody would
 not be expecting the data to start going somewhere else.

Well, I think doing that ALSO would be worthwhile for the TODO list.
I've often wished, for example, that when we switch log_directory the
*last* message in the old log file would be "reloading postgresql with new
configuration" or something similar, so that I would know to look for a
new log file somewhere else.  If you are, for example, logging only
errors, you wouldn't necessarily realize that logging to the file you're
tailing/monitoring has stopped.

The active DBA involvement argument doesn't hold much water given the
many avenues for someone to accidentally introduce a configuration
change they didn't intend.

However, I also realize that the complexity of this feature's
implementation would likely eclipse its usefulness.  As such, I'd like
to put it on the TODO list for some future occasion when we need to mess
with log-switching code *anyway* and can include this.

 
 Thoughts?  In particular, anyone want to bikeshed on the message wording?
 
 Does this rise to the level of a usability bug that ought to be
 back-patched?  As I said, we've seen this type of thinko multiple
 times before.

Hmmm.  On the one hand, I can't see the harm in it.  On the other hand,
I'm reluctant to introduce non-critical behavior changes into
backbranches no matter how minor.  What if we just put this in 9.3 and up?


-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] pg_dump and schema names

2013-08-09 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 On Fri, Aug  9, 2013 at 12:53:20PM -0400, Tom Lane wrote:
 This really requires more than no attention to the comments, especially
 since you just removed the only apparent reason for _getObjectDescription
 to make a distinction between objects whose name includes a schema and
 those that don't.

 I am confused.  Are you saying I didn't read the comments, or that I can
 now merge the schema-qualified and non-schema-qualified object sections? 

Well, it's certainly not immediately obvious why we shouldn't merge them.
But I would have expected the function's header comment to now explain
that the output is intentionally not schema-qualified and assumes that the
search path is set for the object's schema if any.

 Also, this seems like dead code as there is no test for INDEX in the
 if() block it exists in:

 /*
  * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
  * into te->tag for an index. This check is heuristic, so make its
  * scope as narrow as possible.
  */
 if (AH->version < K_VERS_1_7 &&
     te->tag[0] == '"' &&
     te->tag[strlen(te->tag) - 1] == '"' &&
     strcmp(type, "INDEX") == 0)
     appendPQExpBuffer(buf, "%s", te->tag);
 else

Huh, yeah it is dead code, since _printTocEntry doesn't call this function
for INDEX objects.  And anyway I doubt anybody still cares about reading
7.2-era archive files.  No objection to removing that.

regards, tom lane




Re: [HACKERS] mvcc catalog snapshots and TopTransactionContext

2013-08-09 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2013-08-08 09:27:24 -0400, Robert Haas wrote:
 How can it be safe to try to read catalogs if the transaction is aborted?

 Well. It isn't. At least not in general. The specific case triggered
 here though are cache invalidations being processed which can lead to
 the catalog being read (pretty crummy, but not easy to get rid
 of). That's actually safe since before we process the invalidations we
 have done:
 1) CurrentTransactionState->state = TRANS_ABORT
 2) RecordTransactionAbort(), marking the transaction as aborted in the
   clog
 3) marked subxacts as aborted
 4) ProcArrayEndTransaction() (for toplevel ones)

 Due to these any tqual stuff will treat the current (sub-)xact and its
 children as aborted. So the catalog lookups will use the catalog in a
 sensible state.

I don't have any faith in this argument.  You might be right that we'll
correctly see our own output rows as aborted, but that's barely the tip
of the iceberg of risk here.  Is it safe to take new locks in an aborted
transaction?  (What if we're already past the lock-release point in
the abort sequence?)  For that matter, given that we don't know what
exactly caused the transaction abort, how safe is it to do anything at
all --- we might for instance be nearly out of memory.  If the catalog
reading attempt itself fails, won't we be in an infinite loop of
transaction aborts?  I could probably think of ten more risks if
I spent a few more minutes at it.

Cache invalidation during abort should *not* lead to any attempt to
immediately revalidate the cache.  No amount of excuses will make that
okay.  I have not looked to see just what the path of control is in this
particular case, but we need to fix it, not paper over it.

regards, tom lane




Re: [HACKERS] pg_dump and schema names

2013-08-09 Thread Bruce Momjian
On Fri, Aug  9, 2013 at 01:39:35PM -0400, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  On Fri, Aug  9, 2013 at 12:53:20PM -0400, Tom Lane wrote:
  This really requires more than no attention to the comments, especially
  since you just removed the only apparent reason for _getObjectDescription
  to make a distinction between objects whose name includes a schema and
  those that don't.
 
  I am confused.  Are you saying I didn't read the comments, or that I can
  now merge the schema-qualified and non-schema-qualified object sections? 
 
 Well, it's certainly not immediately obvious why we shouldn't merge them.
 But I would have expected the function's header comment to now explain
 that the output is intentionally not schema-qualified and assumes that the
 search path is set for the object's schema if any.

OK, done with the attached patch.  The dump output is unchanged.

  Also, this seems like dead code as there is no test for INDEX in the
  if() block it exists in:
 
  /*
   * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
   * into te->tag for an index. This check is heuristic, so make its
   * scope as narrow as possible.
   */
  if (AH->version < K_VERS_1_7 &&
  te->tag[0] == '"' &&
  te->tag[strlen(te->tag) - 1] == '"' &&
  strcmp(type, "INDEX") == 0)
  appendPQExpBuffer(buf, "%s", te->tag);
  else
 
 Huh, yeah it is dead code, since _printTocEntry doesn't call this function
 for INDEX objects.  And anyway I doubt anybody still cares about reading
 7.2-era archive files.  No objection to removing that.

Removed.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/pg_dump/pg_backup_archiver.c b/src/bin/pg_dump/pg_backup_archiver.c
new file mode 100644
index cd7669b..5204ceb
*** a/src/bin/pg_dump/pg_backup_archiver.c
--- b/src/bin/pg_dump/pg_backup_archiver.c
*** _selectTablespace(ArchiveHandle *AH, con
*** 2879,2889 
  /*
   * Extract an object description for a TOC entry, and append it to buf.
   *
!  * This is not quite as general as it may seem, since it really only
!  * handles constructing the right thing to put into ALTER ... OWNER TO.
!  *
!  * The whole thing is pretty grotty, but we are kind of stuck since the
!  * information used is all that's available in older dump files.
   */
  static void
  _getObjectDescription(PQExpBuffer buf, TocEntry *te, ArchiveHandle *AH)
--- 2879,2885 
  /*
   * Extract an object description for a TOC entry, and append it to buf.
   *
!  * This is used for ALTER ... OWNER TO.
   */
  static void
  _getObjectDescription(PQExpBuffer buf, TocEntry *te, ArchiveHandle *AH)
*** _getObjectDescription(PQExpBuffer buf, T
*** 2895,2901 
  		strcmp(type, "MATERIALIZED VIEW") == 0)
  		type = "TABLE";
  
! 	/* objects named by a schema and name */
  	if (strcmp(type, "COLLATION") == 0 ||
  		strcmp(type, "CONVERSION") == 0 ||
  		strcmp(type, "DOMAIN") == 0 ||
--- 2891,2897 
  		strcmp(type, "MATERIALIZED VIEW") == 0)
  		type = "TABLE";
  
! 	/* objects that don't require special decoration */
  	if (strcmp(type, "COLLATION") == 0 ||
  		strcmp(type, "CONVERSION") == 0 ||
  		strcmp(type, "DOMAIN") == 0 ||
*** _getObjectDescription(PQExpBuffer buf, T
*** 2903,2937 
  		strcmp(type, "TYPE") == 0 ||
  		strcmp(type, "FOREIGN TABLE") == 0 ||
  		strcmp(type, "TEXT SEARCH DICTIONARY") == 0 ||
! 		strcmp(type, "TEXT SEARCH CONFIGURATION") == 0)
! 	{
! 		appendPQExpBuffer(buf, "%s ", type);
! 		if (te->namespace && te->namespace[0])	/* is null pre-7.3 */
! 			appendPQExpBuffer(buf, "%s.", fmtId(te->namespace));
! 
! 		/*
! 		 * Pre-7.3 pg_dump would sometimes (not always) put a fmtId'd name
! 		 * into te->tag for an index. This check is heuristic, so make its
! 		 * scope as narrow as possible.
! 		 */
! 		if (AH->version < K_VERS_1_7 &&
! 			te->tag[0] == '"' &&
! 			te->tag[strlen(te->tag) - 1] == '"' &&
! 			strcmp(type, "INDEX") == 0)
! 			appendPQExpBuffer(buf, "%s", te->tag);
! 		else
! 			appendPQExpBuffer(buf, "%s", fmtId(te->tag));
! 		return;
! 	}
! 
! 	/* objects named by just a name */
! 	if (strcmp(type, "DATABASE") == 0 ||
  		strcmp(type, "PROCEDURAL LANGUAGE") == 0 ||
  		strcmp(type, "SCHEMA") == 0 ||
  		strcmp(type, "FOREIGN DATA WRAPPER") == 0 ||
  		strcmp(type, "SERVER") == 0 ||
  		strcmp(type, "USER MAPPING") == 0)
  	{
  		appendPQExpBuffer(buf, "%s %s", type, fmtId(te->tag));
  		return;
  	}
--- 2899,2914 
  		strcmp(type, "TYPE") == 0 ||
  		strcmp(type, "FOREIGN TABLE") == 0 ||
  		strcmp(type, "TEXT SEARCH DICTIONARY") == 0 ||
! 		strcmp(type, "TEXT SEARCH CONFIGURATION") == 0 ||
! 		/* non-schema-specified objects */
! 		strcmp(type, "DATABASE") == 0 ||
  		strcmp(type, "PROCEDURAL LANGUAGE") == 0 ||
  		strcmp(type, "SCHEMA") == 0 ||
  		strcmp(type, "FOREIGN DATA WRAPPER") == 0 ||
  		strcmp(type, "SERVER") == 0 ||

Re: [HACKERS] confusing error message

2013-08-09 Thread Stephen Frost
* Craig Ringer (cr...@2ndquadrant.com) wrote:
 More seriously, with interpolated strings for relation names etc it can
 be hard to know which chunks to search for, and search engines aren't
 always good at having the whole message thrown at them.

It's not perfect, but if the searches are getting to mailing list
archive messages then it's clearly possible for us to do better.  Having
an embedded code or URL or what-have-you might be an option too, but we
need the documentation first in any case, so we could do these other
things later..

Thanks,

Stephen




Re: [HACKERS] pg_dump and schema names

2013-08-09 Thread Bruce Momjian
On Fri, Aug  9, 2013 at 02:15:31PM -0400, Bruce Momjian wrote:
 On Fri, Aug  9, 2013 at 01:39:35PM -0400, Tom Lane wrote:
  Bruce Momjian br...@momjian.us writes:
   On Fri, Aug  9, 2013 at 12:53:20PM -0400, Tom Lane wrote:
   This really requires more than no attention to the comments, especially
   since you just removed the only apparent reason for _getObjectDescription
   to make a distinction between objects whose name includes a schema and
   those that don't.
  
   I am confused.  Are you saying I didn't read the comments, or that I can
   now merge the schema-qualified and non-schema-qualified object sections? 
  
  Well, it's certainly not immediately obvious why we shouldn't merge them.
  But I would have expected the function's header comment to now explain
  that the output is intentionally not schema-qualified and assumes that the
  search path is set for the object's schema if any.
 
 OK, done with the attached patch.  The dump output is unchanged.

To be honest, I never got to modifying the comments because I expected
someone to say the patch was wrong.  I also didn't expect to find dead
code in there too.

-- 
  Bruce Momjian  br...@momjian.us  http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +




Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Stephen Frost
Josh, Tom,

* Josh Berkus (j...@agliodbs.com) wrote:
  Does this rise to the level of a usability bug that ought to be
  back-patched?  As I said, we've seen this type of thinko multiple
  times before.
 
 Hmmm.  On the one hand, I can't see the harm in it.  On the other hand,
 I'm reluctant to introduce non-critical behavior changes into
 backbranches no matter how minor.  What if we just put this in 9.3 and up?

I'd be fine w/ this going into 9.3.  What was perhaps not entirely clear
from my last mail is that I was complaining about the autovacuum changes
in 9.2.3 (iirc?) which caused it to be more chatty which likely
surprised some poor DBAs.

Thanks,

Stephen




Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command

2013-08-09 Thread Stefan Kaltenbrunner
On 08/09/2013 12:02 AM, Vik Fearing wrote:
 On 08/08/2013 07:57 PM, Stefan Kaltenbrunner wrote:
 
 On 08/08/2013 01:52 PM, Vik Fearing wrote:
 I would add this to the next commitfest but I seem to be unable to log
 in with my community account (I can log in to the wiki).  Help appreciated.
 would be a bit easier to diagnose if we knew your community account name
 
 Sorry, it's glaucous.

hmm, looks like your account may be affected by one of the buglets
introduced (and fixed shortly afterwards) during the migration of the
main infrastructure to Debian wheezy - please try logging in to the main
website and change your password at least once. That should make it work
again for the commitfest app...


Stefan




Re: [HACKERS] mvcc catalo gsnapshots and TopTransactionContext

2013-08-09 Thread Noah Misch
On Fri, Aug 09, 2013 at 02:11:46PM -0400, Tom Lane wrote:
 Cache invalidation during abort should *not* lead to any attempt to
 immediately revalidate the cache.  No amount of excuses will make that
 okay.  I have not looked to see just what the path of control is in this
 particular case, but we need to fix it, not paper over it.

+1.  What if (sub)transaction end only manipulated the local invalidation
message queue for later processing?  Actual processing would happen after
CleanupSubTransaction() returns control to the owning xact, or at the start of
the next transaction for a top-level ending.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Tom,
 I thought about trying to leave similar breadcrumbs if the logging
 parameters are changed while the postmaster is running, but it would add a
 fair amount of complication to the patch, and I'm not sure there's a lot
 of value in it.  On-the-fly logging parameter changes don't happen without
 active DBA involvement, so it's a lot harder to credit that somebody would
 not be expecting the data to start going somewhere else.

 Well, I think doing that ALSO would be worthwhile for the TODO list.
 I've often wished, for example, that if we switch log_directory the
 *last* message in the old log file be reloading postgresql with new
 configuration or something similar, so that I would know to look for a
 new log file somewhere else.  If you are, for example, logging only
 errors, you wouldn't necessarily realize that logging on the file you're
 tailing/monitoring has stopped.

In principle I see the risk, but I don't think I've ever seen an actual
report of someone getting confused this way by an on-the-fly logging
parameter change.  Whereas there are numerous examples in the archives
of people not realizing that pg_ctl -l foo doesn't necessarily mean
that all the useful log output is in file foo.  (Usually it's because
they're using a logging setup chosen by some packager, not by themselves.)
So I'm feeling that what you're suggesting is solving a different and
far less pressing problem than what I'm on about.

I did think a little bit about how to do it.  For parameters that affect
where the logging collector writes data (probably only log_directory is
worth special handling), it would be quite easy to make the got_SIGHUP
code segment in syslogger.c emit a log message just before switching the
active value.  However, if you want something similar for log_destination,
syslog_facility, syslog_ident, or event_source, it's far more problematic
because those settings affect the behavior of individual processes, and
so there's no unique point where we're switching from one log target to
another.  We could have the postmaster report a value change but it's
likely that that message would not appear very close to the end of the
messages directed to the old target.

Another point here is that if you're tailing the current log file,
a plain old rotation (no parameter change anywhere) would also cut
you off without obvious notice.  Maybe we shouldn't think about this
as a change of parameter problem, but just say it's worth emitting
an end of log file message anytime we're about to change to a new log
file, no matter the reason.  But again, that's specific to the logging
collector case and doesn't help with any other log target.  I also note
that I'm not familiar with any system logging tool that emits such
messages.

regards, tom lane




Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Noah Misch
On Thu, Aug 08, 2013 at 10:32:17PM -0400, Tom Lane wrote:
 The attached patch is motivated by
 http://www.postgresql.org/message-id/cajyqwwryt9rmbzs-sh6ucr1otg4joxqkdf-fkoyp6pv12t0...@mail.gmail.com

 This patch arranges to emit a hint message when/if we switch away from
 logging to the original postmaster stderr during startup.  There are two
 cases to cover: we're still using LOG_DESTINATION_STDERR but redirecting
 stderr to a syslogger process, or we stop writing to stderr altogether,
 presumably in favor of going to syslog or something.

At LOG level, this feels a bit chatty: it's a 100% increase in startup-time
messages if you count both the main message and the HINT.  I can't think of
another program with configuration-directed logging that does this on every
startup.  Makes perfect sense to me at DEBUG1, though, and that would have
been enough for the situation you cite above.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com




Re: [HACKERS] [PATCH] Statistics collection for CLUSTER command

2013-08-09 Thread Vik Fearing
On 08/09/2013 10:37 PM, Stefan Kaltenbrunner wrote:
 On 08/08/2013 01:52 PM, Vik Fearing wrote:
 I would add this to the next commitfest but I seem to be unable to log
 in with my community account (I can log in to the wiki).  Help appreciated.
 hmm, looks like your account may be affected by one of the buglets
 introduced (and fixed shortly afterwards) during the migration of the
 main infrastructure to Debian wheezy - please try logging in to the main
 website and change your password at least once. That should make it work
 again for the commitfest app...

That worked.  Thank you.

Vik




Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Thu, Aug 08, 2013 at 10:32:17PM -0400, Tom Lane wrote:
 This patch arranges to emit a hint message when/if we switch away from
 logging to the original postmaster stderr during startup.  There are two
 cases to cover: we're still using LOG_DESTINATION_STDERR but redirecting
 stderr to a syslogger process, or we stop writing to stderr altogether,
 presumably in favor of going to syslog or something.

 At LOG level, this feels a bit chatty: it's a 100% increase in startup-time
 messages if you count both the main message and the HINT.  I can't think of
 another program with configuration-directed logging that does this on every
 startup.  Makes perfect sense to me at DEBUG1, though, and that would have
 been enough for the situation you cite above.

Hm.  That would be enough for users who think to increase log_min_messages
while trying to resolve their problem.  But what I'm mainly worried about
here is people who are relative novices, so I don't have a lot of
confidence that the patch would still help them if we made the message not
appear at default logging verbosity.

Also, I'm not sure that the chattiness argument is relevant, because no
message will be emitted at all unless you're switching to some log target
different from the postmaster's initial stderr.  So the message won't show
up in the official log target files, only in an arguably vestigial
startup-time-messages-only file.

Does that ameliorate your concern, or do you still want it to be DEBUG1?
I'd be happier with DEBUG1 than with no message at all, but I don't think
it's going to help as many people at DEBUG1 as it would at LOG level.

regards, tom lane




Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Josh Berkus
On 08/09/2013 03:40 PM, Tom Lane wrote:
 In principle I see the risk, but I don't think I've ever seen an actual
 report of someone getting confused this way by an on-the-fly logging
 parameter change.  Whereas there are numerous examples in the archives
 of people not realizing that pg_ctl -l foo doesn't necessarily mean
 that all the useful log output is in file foo.  (Usually it's because
 they're using a logging setup chosen by some packager, not by themselves.)
 So I'm feeling that what you're suggesting is solving a different and
 far less pressing problem than what I'm on about.

No question.  That's why I suggested it as a TODO item instead of anyone
working on it right now.

The main benefit I see for this is security, especially with ALTER
SYSTEM SET pending.   Switching log destinations is a good way to cover
your tracks if you have some kind of temporary superuser access (for
example, by exploiting a SECURITY DEFINER function).  If the switch were
recorded somewhere other than the new log location, it would provide a
little more sleuthing information for later auditors.

 I did think a little bit about how to do it.  For parameters that affect
 where the logging collector writes data (probably only log_directory is
 worth special handling), it would be quite easy to make the got_SIGHUP
 code segment in syslogger.c emit a log message just before switching the
 active value.  However, if you want something similar for log_destination,
 syslog_facility, syslog_ident, or event_source, it's far more problematic
 because those settings affect the behavior of individual processes, and
 so there's no unique point where we're switching from one log target to
 another.  We could have the postmaster report a value change but it's
 likely that that message would not appear very close to the end of the
 messages directed to the old target.

But wait, there's more complications: what if you're switching
log_directory because the disk on the old log location is full?  Then we
*can't* emit a log entry on switch, because we can't write it.

Like I said, complicated out of proportion to be benefit, at least right
now.

 Another point here is that if you're tailing the current log file,
 a plain old rotation (no parameter change anywhere) would also cut
 you off without obvious notice. 

Yeah, I'm not concerned about that, since the location of the new log is
predictable.

 Maybe we shouldn't think about this
 as a change of parameter problem, but just say it's worth emitting
 an end of log file message anytime we're about to change to a new log
 file, no matter the reason.  But again, that's specific to the logging
 collector case and doesn't help with any other log target.  I also note
 that I'm not familiar with any system logging tool that emits such
 messages.

I'm not familiar with other system logging tools.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] confusing error message

2013-08-09 Thread Craig Ringer
On 08/10/2013 02:43 AM, Stephen Frost wrote:
 * Craig Ringer (cr...@2ndquadrant.com) wrote:
 More seriously, with interpolated strings for relation names etc
 it can be hard to know which chunks to search for, and search
 engines aren't always good at having the whole message thrown at
 them.
 
 It's not perfect, but if the searches are getting to mailing list 
 archive messages then it's clearly possible for us to do better.
 Having an embedded code or URL or what-have-you might be an option
 too, but we need the documentation first in any case, so we could
 do these other things later..

Well said; you're quite right.

I complain myself that a focus on perfection can prevent progress.
Let's start with helpful.

My first two would be:

No pg_hba.conf entry for ...

and

fe_sendauth: no password supplied

as both seem to confuse new users endlessly.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




[HACKERS] killing pg_dump leaves backend process

2013-08-09 Thread Tatsuo Ishii
I noticed pg_dump does not exit gracefully when killed.

start pg_dump
kill pg_dump by ctrl-c
ps x

27246 ?        Ds    96:02 postgres: t-ishii dbt3 [local] COPY
29920 ?        S      0:00 sshd: ishii@pts/5
29921 pts/5    Ss     0:00 -bash
30172 ?        Ss     0:00 postgres: t-ishii dbt3 [local] LOCK TABLE waiting

As you can see, after killing pg_dump, a backend process is (LOCK
TABLE waiting) left behind. I think this could be easily fixed by
adding signal handler to pg_dump so that it catches the signal and
issues a query cancel request.

Thoughts?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] mvcc catalo gsnapshots and TopTransactionContext

2013-08-09 Thread Andres Freund
On 2013-08-09 14:11:46 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2013-08-08 09:27:24 -0400, Robert Haas wrote:
  How can it be safe to try to read catalogs if the transaction is aborted?
 
  Well. It isn't. At least not in general. The specific case triggered
  here though are cache invalidations being processed which can lead to
  the catalog being read (pretty crummy, but not easy to get rid
  of). That's actually safe since before we process the invalidations we
  have done:
  1) CurrentTransactionState->state = TRANS_ABORT
  2) RecordTransactionAbort(), marking the transaction as aborted in the
clog
  3) marked subxacts as aborted
  4) ProcArrayEndTransaction() (for toplevel ones)
 
  Due to these any tqual stuff will treat the current (sub-)xact and its
  children as aborted. So the catalog lookups will use the catalog in a
  sensible state.
 
 I don't have any faith in this argument.  You might be right that we'll
 correctly see our own output rows as aborted, but that's barely the tip
 of the iceberg of risk here.  Is it safe to take new locks in an aborted
 transaction?  (What if we're already past the lock-release point in
 the abort sequence?)

Don't get me wrong. I find the idea of doing catalog lookup during abort
horrid. But it's been that way for at least 10 years (I checked 7.4), so
it has at least some semblance of working.
Today we do a good bit less than back then, for one we don't do a full
cache reload during abort anymore, just for the index support
infrastructure. Also, you've reduced the amount of lookups a bit with the
relmapper introduction.

 For that matter, given that we don't know what
 exactly caused the transaction abort, how safe is it to do anything at
 all --- we might for instance be nearly out of memory.  If the catalog
 reading attempt itself fails, won't we be in an infinite loop of
 transaction aborts?

Looks like that's possible, yes. There seem to be quite some other
opportunities for this to happen if you look at the amount of work done
in AbortSubTransaction(). I guess it rarely happens because we
previously release some memory...

 I could probably think of ten more risks if I spent a few more minutes
 at it.

No need to convince me here. I neither could believe we were doing this,
nor figure out why it even works for the first hour of looking at it.

 Cache invalidation during abort should *not* lead to any attempt to
 immediately revalidate the cache.  No amount of excuses will make that
 okay.  I have not looked to see just what the path of control is in this
 particular case, but we need to fix it, not paper over it.

I agree, although that's easier said than done in the case of
subtransactions. The problem we have there is that it's perfectly valid
to still have references to a relation from the outer, not aborted,
transaction. Those need to be valid for anybody looking at the relcache
entry after we've processed the ROLLBACK TO/...

I guess the fix is something like we do in the commit case, where we
transfer invalidations to the parent transaction. If we then process
local invalidations *after* we've cleaned up the subtransaction
completely we should be fine. We already do an implicit
CommandCounterIncrement() in CommitSubTransaction()...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] killing pg_dump leaves backend process

2013-08-09 Thread Tom Lane
Tatsuo Ishii is...@postgresql.org writes:
 I noticed pg_dump does not exit gracefully when killed.
 start pg_dump
 kill pg_dump by ctrl-c
 ps x

 27246 ?        Ds    96:02 postgres: t-ishii dbt3 [local] COPY
 29920 ?        S      0:00 sshd: ishii@pts/5
 29921 pts/5    Ss     0:00 -bash
 30172 ?        Ss     0:00 postgres: t-ishii dbt3 [local] LOCK TABLE waiting

 As you can see, after killing pg_dump, a backend process is (LOCK
 TABLE waiting) left behind. I think this could be easily fixed by
 adding signal handler to pg_dump so that it catches the signal and
 issues a query cancel request.

If we think that's a problem (which I'm not convinced of) then pg_dump
is the wrong place to fix it.  Any other client would behave the same
if it were killed while waiting for some backend query.  So the right
fix would involve figuring out a way for the backend to kill itself
if the client connection goes away while it's waiting.

regards, tom lane




Re: [HACKERS] Proposal: leave a hint when switching logging away from stderr

2013-08-09 Thread Noah Misch
On Fri, Aug 09, 2013 at 06:59:13PM -0400, Tom Lane wrote:
 Noah Misch n...@leadboat.com writes:
  On Thu, Aug 08, 2013 at 10:32:17PM -0400, Tom Lane wrote:
  This patch arranges to emit a hint message when/if we switch away from
  logging to the original postmaster stderr during startup.  There are two
  cases to cover: we're still using LOG_DESTINATION_STDERR but redirecting
  stderr to a syslogger process, or we stop writing to stderr altogether,
  presumably in favor of going to syslog or something.
 
  At LOG level, this feels a bit chatty: it's a 100% increase in startup-time
  messages if you count both the main message and the HINT.  I can't think of
  another program with configuration-directed logging that does this on every
  startup.  Makes perfect sense to me at DEBUG1, though, and that would have
  been enough for the situation you cite above.
 
 Hm.  That would be enough for users who think to increase log_min_messages
 while trying to resolve their problem.  But what I'm mainly worried about
 here is people who are relative novices, so I don't have a lot of
 confidence that the patch would still help them if we made the message not
 appear at default logging verbosity.
 
 Also, I'm not sure that the chattiness argument is relevant, because no
 message will be emitted at all unless you're switching to some log target
 different from the postmaster's initial stderr.  So the message won't show
 up in the official log target files, only in an arguably vestigial
 startup-time-messages-only file.

Perhaps the chatter would most affect use, typically casual, of pg_ctl without
-l or similar.

 Does that ameliorate your concern, or do you still want it to be DEBUG1?
 I'd be happier with DEBUG1 than with no message at all, but I don't think
 it's going to help as many people at DEBUG1 as it would at LOG level.

I think of the implicit sequence messages we moved from NOTICE to DEBUG1
somewhat recently.  No doubt those messages had helped at times, but they
didn't quite carry their weight at NOTICE.  My gut prediction is that this
will fall in that same utility range.  But you make a valid point about noise
in the startup log being easier to discount.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com

