[HACKERS] Comment typo in xlog.c

2010-12-06 Thread Fujii Masao
Hi,

I found two typos in xlog.c. Could you apply the attached patch?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


typo.patch
Description: Binary data



[HACKERS] wal_sender_delay is still required?

2010-12-06 Thread Fujii Masao
Hi,

Walsender doesn't need the periodic wakeups anymore, thanks to
the latch feature. So wal_sender_delay is basically useless now.
How about dropping wal_sender_delay or increasing the default
value?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 Why is there a variadic replace() in this patch at all?  It seems just
 about entirely unrelated to the stated purpose of the patch, as well
 as being of dubious usefulness.

It used not to be exposed at the SQL level; it was just an internal loop
in pg_execute_sql_file() when using the placeholders-enabled
variant. Then Itagaki wanted me to expose the internals so that he could
basically implement the logic in SQL directly.  It seems like we went a
step too far in exposing this facility too.  Agreed on removing it at
the SQL level.

I assume you want me to prepare a patch; I'm not sure I'll be able
to send it to the list before Thursday --- unless I get around the git
network errors at pgday.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr



Re: [HACKERS] SQL/MED - file_fdw

2010-12-06 Thread Hitoshi Harada
2010/11/25 Shigeru HANADA han...@metrosystems.co.jp:
 Hi, hackers,

 Attached is a patch that adds file_fdw, FDW which reads records from
 files on the server side, as a contrib module.  This patch is based on
 SQL/MED core functionality patch.

 [SQL/MED - core functionality]
 http://archives.postgresql.org/pgsql-hackers/2010-11/msg01698.php

 File_fdw can be installed with steps similar to other contrib
 modules, and you can create the FDW with the script:
    $SHAREDIR/contrib/file_fdw.sql
 Note that you need to create file_fdw for each database.

 Documentation for file_fdw is included in the patch, although the
 contents might not be enough.

 Any comments and questions are welcome.

I think it would be better to add an encoding option to FileFdwOption. In
the patch the encoding of the file is assumed to be client_encoding, but
we may want to SELECT from a differently-encoded CSV file in a query.

Apart from the FDW issue, I've been thinking that client_encoding is not
appropriate for COPY in any way. client_encoding is the encoding of
the statement the client sends, not of the COPY target, which is on the
server's filesystem. Adding an encoding option to COPY would eliminate the
allowEncodingChanges option from the JDBC driver.

Regards,



-- 
Hitoshi Harada



Re: [HACKERS] Comment typo in xlog.c

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 09:01, Fujii Masao wrote:

I found two typos in xlog.c. Could you apply the attached patch?


Applied.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Per-column collation

2010-12-06 Thread Itagaki Takahiro
On Sun, Dec 5, 2010 at 01:04, Peter Eisentraut pete...@gmx.net wrote:
 Here is an updated patch to address the issues discussed during this
 commitfest.

Here are comments and questions after I tested the latest patch:

 Issues 
* initdb itself seems to succeed, but it prints "could not determine
encoding for locale" messages for any combination of encoding=utf8/eucjp
and locale=ja_JP.utf8/ja_JP.eucjp/C. Is this expected behavior?

creating collations ...initdb: locale name has non-ASCII characters,
skipped: bokmål
initdb: locale name has non-ASCII characters, skipped: français
could not determine encoding for locale hy_AM.armscii8: codeset is ARMSCII-8
... (a dozen lines) ...
could not determine encoding for locale vi_VN.tcvn: codeset is TCVN5712-1
ok


* contrib/citext raises an encoding error when COLLATE is specified,
even if the collation is the same as the database default.
We might need some special treatment for the C locale.
=# SHOW lc_collate;  ==> C
=# SELECT ('A'::citext) = ('a'::citext);  ==> false
=# SELECT ('A'::citext) = ('a'::citext) COLLATE C;
ERROR:  invalid multibyte character for locale
HINT:  The server's LC_CTYPE locale is probably incompatible with the
database encoding.

* pg_dump would generate unportable files for different platforms,
because collation names differ across platforms.

 Source codes 
* PG_GETARG_COLLATION might be a better name than PG_GET_COLLATION.

* What is the difference between InvalidOid and DEFAULT_COLLATION_OID
for collation OIDs? The patch replaces DirectFunctionCall with
DirectFunctionCallC in some places, but we could shrink the diff size
if we could use InvalidOid instead of DEFAULT_COLLATION_OID.

* I still think explicitly passing collations from function to function
is horrible, because we might forget it in some places, and almost no
existing third-party module will work.  Is it possible to make it a
global variable, and push/pop the state when it changes? Sorry if I'm
missing something, but I think we could treat the collation setting
like the GUC settings.
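
For illustration, a minimal sketch of the push/pop idea; all names here
are invented for this email, not code from the patch (Oid and InvalidOid
as in the backend's postgres_ext.h):

/* Hypothetical GUC-like global collation state (names made up). */
static Oid current_collation = InvalidOid;	/* InvalidOid = use default */

/* Switch to a new collation, returning the old one for later restore. */
static Oid
push_collation(Oid collation)
{
	Oid		old = current_collation;

	current_collation = collation;
	return old;
}

static void
pop_collation(Oid saved)
{
	current_collation = saved;
}

Functions that need the collation would then consult current_collation
instead of taking an extra argument, much as they consult GUC variables
today.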

-- 
Itagaki Takahiro



Re: [HACKERS] [BUGS] BUG #5662: Incomplete view

2010-12-06 Thread Peter Eisentraut
On Sun, 2010-09-19 at 14:28 -0400, Tom Lane wrote:
 Or maybe we could implement that function, call it like this
 
CAST((pg_sequence_parameters(c.oid)).max_value AS
 cardinal_number) AS maximum_value,
 
 and plan on optimizing the view when we get LATERAL.

Here is an implementation of that.

I'm not exactly sure if the accesses to the sequence are correctly
locked/unlocked, but it appears to work.

I also revised the definition of the info schema view slightly, after
juggling several more recent SQL standard drafts.

diff --git a/doc/src/sgml/information_schema.sgml b/doc/src/sgml/information_schema.sgml
index 9d30949..0c1cb5d 100644
--- a/doc/src/sgml/information_schema.sgml
+++ b/doc/src/sgml/information_schema.sgml
@@ -4013,31 +4013,42 @@ ORDER BY c.ordinal_position;
      </row>
 
      <row>
-      <entry><literal>maximum_value</literal></entry>
-      <entry><type>cardinal_number</type></entry>
-      <entry>Not yet implemented</entry>
+      <entry><literal>start_value</literal></entry>
+      <entry><type>character_data</type></entry>
+      <entry>The start value of the sequence</entry>
      </row>
 
      <row>
       <entry><literal>minimum_value</literal></entry>
-      <entry><type>cardinal_number</type></entry>
-      <entry>Not yet implemented</entry>
+      <entry><type>character_data</type></entry>
+      <entry>The minimum value of the sequence</entry>
+     </row>
+
+     <row>
+      <entry><literal>maximum_value</literal></entry>
+      <entry><type>character_data</type></entry>
+      <entry>The maximum value of the sequence</entry>
      </row>
 
      <row>
       <entry><literal>increment</literal></entry>
-      <entry><type>cardinal_number</type></entry>
-      <entry>Not yet implemented</entry>
+      <entry><type>character_data</type></entry>
+      <entry>The increment of the sequence</entry>
      </row>
 
      <row>
       <entry><literal>cycle_option</literal></entry>
       <entry><type>yes_or_no</type></entry>
-      <entry>Not yet implemented</entry>
+      <entry><literal>YES</literal> if the sequence cycles, else <literal>NO</literal></entry>
      </row>
     </tbody>
    </tgroup>
   </table>
+
+  <para>
+   Note that in accordance with the SQL standard, the start, minimum,
+   maximum, and increment values are returned as character strings.
+  </para>
  </sect1>
 
  <sect1 id="infoschema-sql-features">
diff --git a/src/backend/catalog/information_schema.sql b/src/backend/catalog/information_schema.sql
index 8d9790d..1c2bd85 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -1430,16 +1430,17 @@ CREATE VIEW sequences AS
            CAST(64 AS cardinal_number) AS numeric_precision,
            CAST(2 AS cardinal_number) AS numeric_precision_radix,
            CAST(0 AS cardinal_number) AS numeric_scale,
-           CAST(null AS cardinal_number) AS maximum_value, -- FIXME
-           CAST(null AS cardinal_number) AS minimum_value, -- FIXME
-           CAST(null AS cardinal_number) AS increment, -- FIXME
-           CAST(null AS yes_or_no) AS cycle_option    -- FIXME
+           CAST((pg_sequence_parameters(c.oid)).start_value AS character_data) AS start_value,
+           CAST((pg_sequence_parameters(c.oid)).minimum_value AS character_data) AS minimum_value,
+           CAST((pg_sequence_parameters(c.oid)).maximum_value AS character_data) AS maximum_value,
+           CAST((pg_sequence_parameters(c.oid)).increment AS character_data) AS increment,
+           CAST(CASE WHEN (pg_sequence_parameters(c.oid)).cycle_option THEN 'YES' ELSE 'NO' END AS yes_or_no) AS cycle_option
     FROM pg_namespace nc, pg_class c
     WHERE c.relnamespace = nc.oid
           AND c.relkind = 'S'
           AND (NOT pg_is_other_temp_schema(nc.oid))
           AND (pg_has_role(c.relowner, 'USAGE')
-               OR has_table_privilege(c.oid, 'SELECT, UPDATE') );
+               OR has_sequence_privilege(c.oid, 'SELECT, UPDATE, USAGE') );
 
 GRANT SELECT ON sequences TO PUBLIC;
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index bb8ebce..0070bb0 100644
--- a/src/backend/commands/sequence.c
+++ b/src/backend/commands/sequence.c
@@ -24,6 +24,7 @@
 #include "commands/defrem.h"
 #include "commands/sequence.h"
 #include "commands/tablecmds.h"
+#include "funcapi.h"
 #include "miscadmin.h"
 #include "nodes/makefuncs.h"
 #include "storage/bufmgr.h"
@@ -1420,6 +1421,56 @@ process_owned_by(Relation seqrel, List *owned_by)
 }
 
 
+/*
+ * Return sequence parameters, for use by information schema
+ */
+Datum
+pg_sequence_parameters(PG_FUNCTION_ARGS)
+{
+	Oid			relid = PG_GETARG_OID(0);
+	TupleDesc	tupdesc;
+	Datum		values[5];
+	bool		isnull[5];
+	SeqTable	elm;
+	Relation	seqrel;
+	Buffer		buf;
+	Form_pg_sequence seq;
+
+	/* open and AccessShareLock sequence */
+	init_sequence(relid, &elm, &seqrel);
+
+	if (pg_class_aclcheck(relid, GetUserId(), ACL_SELECT | ACL_UPDATE | ACL_USAGE) != ACLCHECK_OK)
+		ereport(ERROR,
+				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				 errmsg("permission denied for sequence %s",
+						RelationGetRelationName(seqrel))));
+
+	tupdesc = CreateTemplateTupleDesc(5, false);
+	

Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 2:29 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.12.2010 02:55, Robert Haas wrote:

On Sun, Dec 5, 2010 at 1:28 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:

 I'm wondering if we should reconsider the pass-it-through-the-client
 approach, because if we could make that work it would be more general and
 it wouldn't need any special privileges.  The trick seems to be to apply
 sufficient sanity testing to the snapshot proposed to be installed in
 the subsidiary transaction.  I think the requirements would basically be
 (1) xmin <= any listed XIDs < xmax
 (2) xmin not so old as to cause GlobalXmin to decrease
 (3) xmax not beyond current XID counter
 (4) XID list includes all still-running XIDs in the given range

 Thoughts?

 I think this is too ugly to live.  I really think it's a very bad idea
 for database clients to need to explicitly know anywhere near this
 many details about how the server represents snapshots.  It's not
 impossible we might want to change this in the future, and even if we
 don't, it seems to me to be exposing a whole lot of unnecessary
 internal grottiness.

 The client doesn't need to know anything about the snapshot blob that the
 server gives it. It just needs to pass it back to the server through the
 other connection. To the client, it's just an opaque chunk of bytes.

I suppose that would work, but I still think it's a bad idea.  We made
this mistake with expression trees.  Any oversight in the code that
validates the chunk of bytes when it (or a modified version) is sent
back to the server turns into a security hole.  I think it's a whole
lot simpler and cleaner to keep the representation details private to
the server.
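
For concreteness, the sanity tests Tom lists above might look roughly
like the sketch below.  This is hypothetical illustration, not code from
any patch: the type names are simplified and XID wraparound is ignored.

/* Hypothetical validation of a client-supplied snapshot. */
typedef unsigned int TransactionId;

typedef struct
{
	TransactionId	xmin;
	TransactionId	xmax;
	int				nxip;	/* number of in-progress XIDs listed */
	TransactionId  *xip;	/* the in-progress XIDs */
} ProposedSnapshot;

static int
snapshot_is_sane(const ProposedSnapshot *snap,
				 TransactionId global_xmin,	/* oldest xmin held by anyone */
				 TransactionId next_xid)	/* current XID counter */
{
	int		i;

	/* (2) xmin must not be so old that GlobalXmin would go backwards */
	if (snap->xmin < global_xmin)
		return 0;
	/* (3) xmax must not be beyond the current XID counter */
	if (snap->xmax > next_xid)
		return 0;
	/* (1) every listed XID must satisfy xmin <= xid < xmax */
	for (i = 0; i < snap->nxip; i++)
		if (snap->xip[i] < snap->xmin || snap->xip[i] >= snap->xmax)
			return 0;
	/* (4) verifying that all still-running XIDs in the range are
	 * listed would require consulting the proc array; omitted here. */
	return 1;
}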

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] SQL/MED - file_fdw

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 5:48 AM, Hitoshi Harada umi.tan...@gmail.com wrote:
 I think it is better to add encoding option to FileFdwOption. In the
 patch the encoding of file is assumed as client_encoding, but we may
 want to SELECT from different-encoded csv in a query.

 Apart from the issue with fdw, I've been thinking client_encoding for
 COPY is not appropriate in any way. client_encoding is the encoding of
 the statement the client sends, not the COPY target which is on the
 server's filesystem. Adding encoding option to COPY will eliminate
 allowEncodingChanges option from JDBC driver.

Yeah, this point has been raised before, and I agree with it.  I
haven't heard anyone who speaks a European language complain about
this, but it seems to keep coming up for Japanese speakers.  I am
guessing that means that using multiple encodings is fairly common in
Japan.  I typically don't run into anything other than UTF-8 and
Latin-1, which are mostly compatible especially if you're an English
speaker, but if it weren't for that happy coincidence I think this
would be quite annoying.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Suggesting a libpq addition

2010-12-06 Thread Merlin Moncure
On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander mag...@hagander.net wrote:
 On Sun, Dec 5, 2010 at 10:22, Marc Balmer m...@msys.ch wrote:
 I am suggesting adding a function to libpq:

 PGresult *PQvexec(PGconn *conn, const char *fmt, ...);

 It behaves similar to PQexec, but it allows for printf style varargs and

 How is that not a horrible idea, compared to using PQexecParams()? You
 have to remember to do all your escaping and things manually, whereas
 PQexecParams() does it automatically.

It's only horrible if you stick to printf-style formatting and you are
using string techniques to inject parameters into the query.  Non-
parameterized queries should obviously be discouraged.  However, it's
entirely possible to wrap the parameterized interfaces with a vararg
interface (I should know, because we did exactly that) :-).  This
gives you the best of both worlds, easy coding without sacrificing
safety.  You might not remember the libpqtypes proposal, but libpq was
specifically extended with callbacks so that libpqtypes could exist
after the community determined that libpqtypes was too big of a change
to the libpq library.  I think ultimately this should be revisited,
with libpqtypes going in core or something even richer...I've been
thinking for a while that postgres types should be abstracted out of
the backend into a library that both client and server depend on.

With libpqtypes, we decided to use postgres-style format markers:
PQexecf(conn, "select %int4 + %int8", an_int, a_bigint);

Everything is schema qualified, so that user types are supported (of
course, this requires implementing handling on the client).

Data is routed through the binary protocol, with all the byte swapping
etc. handled by the library.  No escaping necessary.  We also added
full support for arrays and composites, which are a nightmare to deal
with over straight libpq, and various other niceties like thread-safe
error handling.
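
For reference, a typical libpqtypes round trip looks something like the
sketch below.  This is written from memory of the libpqtypes API, so
treat the details as assumptions and check the libpqtypes documentation
for the exact signatures:

/* Sketch of libpqtypes usage; see http://libpqtypes.esilo.com/ */
#include <libpq-fe.h>
#include <libpqtypes.h>

static void
example(PGconn *conn)
{
	PGresult   *res;
	PGint8		sum = 0;

	PQinitTypes(conn);			/* install the libpqtypes handlers */

	/* parameters travel over the binary protocol; no escaping needed */
	res = PQexecf(conn, "select %int4 + %int8", 40, (PGint8) 2);
	if (res != NULL)
	{
		PQgetf(res, 0, "%int8", 0, &sum);	/* field 0 of tuple 0 */
		PQclear(res);
	}
}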

merlin



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 14:57, Robert Haas wrote:

On Mon, Dec 6, 2010 at 2:29 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

The client doesn't need to know anything about the snapshot blob that the
server gives it. It just needs to pass it back to the server through the
other connection. To the client, it's just an opaque chunk of bytes.


I suppose that would work, but I still think it's a bad idea.  We made
this mistake with expression trees.  Any oversight in the code that
validates the chunk of bytes when it (or a modified version) is sent
back to the server turns into a security hole.


True, but a snapshot is a lot simpler than an expression tree. It's 
pretty much impossible to plug all the holes in the expression-tree 
reading functions, and keep them hole-free in the future. The expression 
tree format is constantly in flux. A snapshot, however, is a fairly 
isolated small data structure that rarely changes.



 I think it's a whole
lot simpler and cleaner to keep the representation details private to
the server.


Well, then you need some sort of cross-backend communication, which is 
always a bit clumsy.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] knngist - 0.8

2010-12-06 Thread Oleg Bartunov

On Sat, 4 Dec 2010, Greg Stark wrote:


On Sat, Dec 4, 2010 at 6:07 PM, Oleg Bartunov o...@sai.msu.su wrote:

We'll sync contrib/btree_gist with the current API. Also, we're thinking
about distance for ltree, boxes, circles. I'm not sure about polygons,
though.
So, we'll have a rather complete set of knn-fied data types.


I kind of assumed the natural client for KNN-gist was the tsearch full
text search indexes handling sorting by relevance. For example if I
search for "Postgres DBA" I should find documents where those words
appear adjacent first and documents where the two words appear far
apart in the document sorted further down. Is that not on the list of
operators supported or planned to be supported?


We'll start thinking about this once we know how to store coordinate
information in the index :)


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: o...@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.12.2010 14:57, Robert Haas wrote:

 On Mon, Dec 6, 2010 at 2:29 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com  wrote:

 The client doesn't need to know anything about the snapshot blob that the
 server gives it. It just needs to pass it back to the server through the
 other connection. To the client, it's just an opaque chunk of bytes.

 I suppose that would work, but I still think it's a bad idea.  We made
 this mistake with expression trees.  Any oversight in the code that
 validates the chunk of bytes when it (or a modified version) is sent
 back to the server turns into a security hole.

 True, but a snapshot is a lot simpler than an expression tree. It's pretty
 much impossible to plug all the holes in the expression-tree reading
 functions, and keep them hole-free in the future. The expression tree format
 is constantly in flux. A snapshot, however, is a fairly isolated small data
 structure that rarely changes.

I guess.  It still seems far too much like exposing the server's guts
for my taste.  It might not be as bad as the expression tree stuff,
but there's nothing particularly good about it either.

  I think it's a whole
 lot simpler and cleaner to keep the representation details private to
 the server.

 Well, then you need some sort of cross-backend communication, which is
 always a bit clumsy.

A temp file seems quite sufficient, and not at all difficult.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 08:51, Fujii Masao wrote:

On Mon, Dec 6, 2010 at 3:42 PM, Fujii Masao <masao.fu...@gmail.com> wrote:

On Fri, Oct 15, 2010 at 9:41 PM, Fujii Masao <masao.fu...@gmail.com> wrote:

The timeout doesn't conflict with 'wait-forever'. Even if you choose
'wait-forever' (i.e., you set allow_standalone_master to false), the
master should detect a standby crash as soon as possible by using the
timeout. For example, imagine that max_wal_senders is set to one and
the master cannot detect the standby crash because of the absence of a
timeout. In this case, even if you start a new standby, it will not be
able to connect to the master since there is no free walsender slot.
As a result, the master actually waits forever.


It occurred to me that the timeout would be required even for
asynchronous streaming replication. So, how about implementing the
replication timeout feature before synchronous replication itself?


Here is the patch. This is one of the features required for synchronous
replication, so I added it to the current CF as a part of synchronous
replication.


Hmm, that's actually a quite different timeout than what's required for 
synchronous replication. In synchronous replication, you need to get an 
acknowledgment within a timeout. This patch only puts a timeout on how 
long we wait to have enough room in the TCP send buffer. That doesn't 
seem all that useful.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Suggesting a libpq addition

2010-12-06 Thread Marc Balmer
On 06.12.10 15:37, Merlin Moncure wrote:
 On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander mag...@hagander.net wrote:
 On Sun, Dec 5, 2010 at 10:22, Marc Balmer m...@msys.ch wrote:
 I am suggesting adding a function to libpq:

 PGresult *PQvexec(PGconn *conn, const char *fmt, ...);

 It behaves similar to PQexec, but it allows for printf style varargs and

 How is that not a horrible idea, compared to using PQexecParams()? You
 have to remember to do all your escaping and things manually, whereas
 PQexecParams() does it automatically.
 
 It's only horrible if you stick to printf style formatting and you are
 using string techniques to inject parameters into the query.  Non
 parameterized queries should obviously be discouraged.  However, it's
 entirely possible to wrap the parameterized interfaces with vararg
 interface (I should know, because we did exactly that) :-).  This
 gives you the best of both worlds, easy coding without sacrificing
 safety.  You might not remember the libpqtypes proposal, but libpq was
 specifically extended with callbacks so that libpqtypes could exist
 after the community determined that libpqtypes was too big of a change
 to the libpq library.  I think ultimately this should be revisited,
 with libpqtypes going in core or something even richer...I've been
 thinking for a while that postgres types should be abstracted out of
 the backend into a library that both client and server depend on.
 
 With libpqtypes, we decided to use postgres style format markers:
 PQexecf(conn, "select %int4 + %int8", an_int, a_bigint);
 
 Everything is schema qualified, so that user types are supported (of
 course, this requires implementing handling on the client).
 
 Data routed through the binary protocol, with all the byte swapping
 etc handled by the library.  No escaping necessary.  We also added
 full support for arrays and composites, which are a nightmare to deal
 with over straight libpq, and various other niceties like thread safe
 error handling.

That would be a *HUGE* piece of software compared to the relatively small
thing I am suggesting...

As for escaping (or not escaping) of string arguments, that can be seen
as a bug or a feature.  I do not want automatic escaping of string
arguments in all cases; e.g., I might want to construct an SQL statement
with dynamic parts like "WHERE x<y" or "AND a = b".

hypothetical example:

filter = "WHERE name like 'Balmer%'";
if (sort == SORT_DESC)
        sort = " ORDER BY name DESCENDING";

PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort);

So what I am aiming at right now is a PQvexec() function that basically
has printf() like semantics, but adds an additional token to the format
string (printf uses %s and %b to produce strings.) I am thinking of
adding %S and %B, which produce strings that are escaped.

That would be a small function, and reasonably safe.  Or rather, the
safety is in the hands of the programmer.
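
(For illustration, a rough sketch of what such a PQvexec() could look
like; this function does not exist in libpq, and the sketch does none of
the overflow or error checking a real implementation would need:)

#include <stdarg.h>
#include <stdio.h>
#include <string.h>
#include <libpq-fe.h>

#define QBUFSZ 8192

PGresult *
PQvexec(PGconn *conn, const char *fmt, ...)
{
	char		buf[QBUFSZ];
	size_t		n = 0;
	const char *p;
	va_list		ap;

	va_start(ap, fmt);
	for (p = fmt; *p != '\0' && n < QBUFSZ - 1; p++)
	{
		if (p[0] == '%' && p[1] == 's')		/* raw string, caller's risk */
		{
			const char *s = va_arg(ap, const char *);

			n += snprintf(buf + n, QBUFSZ - n, "%s", s);
			p++;
		}
		else if (p[0] == '%' && p[1] == 'S')	/* escaped string literal */
		{
			const char *s = va_arg(ap, const char *);

			/* caller must ensure QBUFSZ - n >= 2 * strlen(s) + 1 */
			n += PQescapeStringConn(conn, buf + n, s, strlen(s), NULL);
			p++;
		}
		else
			buf[n++] = *p;
	}
	buf[n] = '\0';
	va_end(ap);
	return PQexec(conn, buf);
}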




Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 15:53, Robert Haas wrote:

I guess.  It still seems far too much like exposing the server's guts
for my taste.  It might not be as bad as the expression tree stuff,
but there's nothing particularly good about it either.


Note that we already have txid_current_snapshot() function, which 
exposes all that.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] wal_sender_delay is still required?

2010-12-06 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 Walsender doesn't need the periodic wakeups anymore, thanks to
 the latch feature. So wal_sender_delay is basically useless now.
 How about dropping wal_sender_delay or increasing the default
 value?

If we don't need it, we should remove it.

regards, tom lane



Re: [HACKERS] Suggesting a libpq addition

2010-12-06 Thread Merlin Moncure
On Mon, Dec 6, 2010 at 9:55 AM, Marc Balmer m...@msys.ch wrote:
 On 06.12.10 15:37, Merlin Moncure wrote:
 On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander mag...@hagander.net wrote:
 On Sun, Dec 5, 2010 at 10:22, Marc Balmer m...@msys.ch wrote:
 I am suggesting adding a function to libpq:

 PGresult *PQvexec(PGconn *conn, const char *fmt, ...);

 It behaves similar to PQexec, but it allows for printf style varargs and

 How is that not a horrible idea, compared to using PQexecParams()? You
 have to remember to do all your escaping and things manually, whereas
 PQexecParams() does it automatically.

 It's only horrible if you stick to printf style formatting and you are
 using string techniques to inject parameters into the query.  Non
 parameterized queries should obviously be discouraged.  However, it's
 entirely possible to wrap the parameterized interfaces with vararg
 interface (I should know, because we did exactly that) :-).  This
 gives you the best of both worlds, easy coding without sacrificing
 safety.  You might not remember the libpqtypes proposal, but libpq was
 specifically extended with callbacks so that libpqtypes could exist
 after the community determined that libpqtypes was too big of a change
 to the libpq library.  I think ultimately this should be revisited,
 with libpqtypes going in core or something even richer...I've been
 thinking for a while that postgres types should be abstracted out of
 the backend into a library that both client and server depend on.

 With libpqtypes, we decided to use postgres style format markers:
 PQexecf(conn, "select %int4 + %int8", an_int, a_bigint);

 Everything is schema qualified, so that user types are supported (of
 course, this requires implementing handling on the client).

 Data routed through the binary protocol, with all the byte swapping
 etc handled by the library.  No escaping necessary.  We also added
 full support for arrays and composites, which are a nightmare to deal
 with over straight libpq, and various other niceties like thread safe
 error handling.

 That would be a *HUGE* piece of software compared to the relatively small
 thing I am suggesting...

well, it's already written. All you would have to do is compile it.

 As for escaping (or not escaping) of string arguments, that can be seen
 as a bug or a feature.  I do not want automatic escaping of string
 arguments in all cases; e.g., I might want to construct an SQL statement
 with dynamic parts like "WHERE x<y" or "AND a = b".

libpqtypes doesn't escape at all.  It uses the internal parameterized
interfaces that don't require it.  For particular types, like bytea
and timestamps, this is much faster because we use the binary wire
format.  Less load on the client and the server.

 hypothetical example:

 filter = "WHERE name like 'Balmer%'";
 if (sort == SORT_DESC)
        sort = " ORDER BY name DESCENDING";

 PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort);

 So what I am aiming at right now is a PQvexec() function that basically
 has printf() like semantics, but adds an additional token to the format
 string (printf uses %s and %b to produce strings.) I am thinking of
 adding %S and %B, which produce strings that are escaped.

 That would be a small function, and reasonably safe.  Or rather, the
 safety is in the hands of the programmer.

What you are suggesting doesn't provide a lot of value over sprintf'ing
the query first, then exec'ing it.  You can do what you are suggesting
yourself, wrapping PQexec:

A hypothetical wrapper would be implemented something like:
va_list ap;
char buf[BUFSZ];
va_start(ap, query);
vsnprintf(buf, BUFSZ, query, ap);
va_end(ap);
return PQexec(conn, buf);

This is a bad idea (security, escaping, performance)...we wrote a
faster, safer way to do it, with richer type support.  Or you can do
it yourself.

merlin



Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread Tom Lane
Dimitri Fontaine dimi...@2ndquadrant.fr writes:
 Tom Lane t...@sss.pgh.pa.us writes:
 Why is there a variadic replace() in this patch at all?  It seems just
 about entirely unrelated to the stated purpose of the patch, as well
 as being of dubious usefulness.

 It used not to be exposed at the SQL level; it was just an internal loop
 in pg_execute_sql_file() when using the placeholders-enabled
 variant. Then Itagaki wanted me to expose the internals so that he could
 basically implement the logic in SQL directly.  It seems like we went a
 step too far in exposing this facility too.  Agreed on removing it at
 the SQL level.

Well, actually, my next question was going to be about removing the
variadic substitution in pg_execute_string too.  It's not apparent to me
that that function should have a rather lame substitution mechanism
hard-wired into it, when you can do the same thing with replace() in
front of it.

On the whole I'd prefer not to have any substitution functionality
hard-wired into pg_execute_file either, though I can see the argument
that it's necessary for practical use.  Basically I'm concerned that
replace-equivalent behavior is not going to be satisfactory over the
long run: I think eventually we're going to need to think about
quoting/escaping behavior.  So I think it's a bad idea to expose the
assumption that it'll be done that way at the SQL level.

regards, tom lane



Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 9:54 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 This occurred to me that the timeout would be required even for
 asynchronous streaming replication. So, how about implementing the
 replication timeout feature before synchronous replication itself?

 Here is the patch. This is one of features required for synchronous
 replication, so I added this into current CF as a part of synchronous
 replication.

 Hmm, that's actually a quite different timeout than what's required for
 synchronous replication. In synchronous replication, you need to get an
 acknowledgment within a timeout. This patch only puts a timeout on how long
 we wait to have enough room in the TCP send buffer. That doesn't seem all
 that useful.

Yeah.  If we rely on the TCP send buffer filling up, then the amount
of time the master takes to notice a dead standby is going to be hard
for the user to predict.  I think the standby ought to send some sort
of heartbeat and the master should declare the standby dead if it
doesn't see a heartbeat soon enough.  Maybe the heartbeat could even
include the receive/fsync/replay LSNs, so that sync rep can use the
same machinery but with more aggressive policies about when they must
be sent.
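
(For illustration, such a heartbeat message might carry something like
the following; this is a hypothetical sketch, not an actual wire format:)

#include <stdint.h>

/* Hypothetical standby-to-master heartbeat payload. */
typedef struct
{
	uint64_t	receive_lsn;	/* last WAL location received */
	uint64_t	fsync_lsn;		/* last WAL location fsync'd to disk */
	uint64_t	replay_lsn;		/* last WAL location replayed */
	int64_t		send_time;		/* standby timestamp, for dead-standby
								 * detection on the master */
} StandbyHeartbeat;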

I also can't help noticing that this approach requires drilling a hole
through the abstraction stack.  We just invented latches; if the API
is going to have to change every time someone wants to implement a
feature, we've built ourselves an awfully porous abstraction layer.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 06.12.2010 15:53, Robert Haas wrote:

 I guess.  It still seems far too much like exposing the server's guts
 for my taste.  It might not be as bad as the expression tree stuff,
 but there's nothing particularly good about it either.

 Note that we already have txid_current_snapshot() function, which exposes
 all that.

Fair enough, and I think that's actually useful for Slony &c.  But I
don't think we should shy away from providing a cleaner API here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] FK's to refer to rows in inheritance child

2010-12-06 Thread Alvaro Herrera
Excerpts from Andrew Dunstan's message of Sun Dec 05 14:41:20 -0300 2010:

 What I would like to see is people publishing the location of 
 development repos so that they can be pulled from or merged, especially 
 for any large patch.

Yes, this is pretty useful.  Dimitri published his repos for the
extension stuff which I followed for a while, made some smallish changes
and sent them back, etc.  Very time-saving.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Suggesting a libpq addition

2010-12-06 Thread Kenneth Marshall
On Mon, Dec 06, 2010 at 10:14:55AM -0500, Merlin Moncure wrote:
 On Mon, Dec 6, 2010 at 9:55 AM, Marc Balmer m...@msys.ch wrote:
  On 06.12.10 15:37, Merlin Moncure wrote:
  On Sun, Dec 5, 2010 at 5:10 AM, Magnus Hagander mag...@hagander.net 
  wrote:
  On Sun, Dec 5, 2010 at 10:22, Marc Balmer m...@msys.ch wrote:
  I am suggesting adding a function to libpq:
 
  PGresult *PQvexec(PGconn *conn, const char *fmt, ...);
 
  It behaves similar to PQexec, but it allows for printf style varargs and
 
  How is that not a horrible idea, compared to using PQexecParams()? You
  have to remember to do all your escaping and things manually, whereas
  PQexecParams() does it automatically.
 
  It's only horrible if you stick to printf style formatting and you are
  using string techniques to inject parameters into the query.  Non
  parameterized queries should obviously be discouraged.  However, it's
  entirely possible to wrap the parameterized interfaces with vararg
  interface (I should know, because we did exactly that) :-).  This
  gives you the best of both worlds, easy coding without sacrificing
  safety.  You might not remember the libpqtypes proposal, but libpq was
  specifically extended with callbacks so that libpqtypes could exist
  after the community determined that libpqtypes was too big of a change
  to the libpq library.  I think ultimately this should be revisited,
  with libpqtypes going in core or something even richer...I've been
  thinking for a while that postgres types should be abstracted out of
  the backend into a library that both client and server depend on.
 
  With libpqtypes, we decided to use postgres style format markers:
  PQexecf(conn, "select %int4 + %int8", an_int, a_bigint);
 
  Everything is schema qualified, so that user types are supported (of
  course, this requires implementing handling on the client).
 
  Data is routed through the binary protocol, with all the byte swapping
  etc. handled by the library.  No escaping necessary.  We also added
  full support for arrays and composites, which are a nightmare to deal
  with over straight libpq, and various other niceties like thread-safe
  error handling.
 
  That would be a *HUGE* piece of software compared to the relatively small
  thing I am suggesting...
 
 well, it's already written. All you would have to do is compile it.
 
  As for escaping (or not escaping) of string arguments, that can be seen
  as a bug or a feature.  I do not want automatic escaping of string
  arguments in all cases; e.g., I might want to construct an SQL statement
  with dynamic parts like "WHERE x<y" or "AND a = b".
 
 libpqtypes doesn't escape at all.  It uses the internal parameterized
 interfaces that don't require it.  For particular types, like bytea
 and timestamps, this is much faster because we use the binary wire
 format.  Less load on the client and the server.
 
  hypothetical example:
 
  filter = "WHERE name like 'Balmer%'";
  if (sort == SORT_DESC)
         sort = " ORDER BY name DESCENDING";
 
  PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort);
 
  So what I am aiming at right now is a PQvexec() function that basically
  has printf() like semantics, but adds an additional token to the format
  string (printf uses %s and %b to produce strings.) I am thinking of
  adding %S and %B, which produce strings that are escaped.
 
  That would be a small function, and reasonably safe.  Or rather, the
  safety is in the hands of the programmer.
 
 What you are suggesting doesn't provide a lot of value over sprintf'ing
 the query first, then exec'ing it.  You can do what you are suggesting
 yourself, wrapping PQexec:
 
 A hypothetical wrapper would be implemented something like:
 va_list ap;
 char buf[BUFSZ];
 va_start(ap, query);
 vsnprintf(buf, BUFSZ, query, ap);
 va_end(ap);
 return PQexec(conn, buf);
 
 This is a bad idea (security, escaping, performance)...we wrote a
 faster, safer way to do it, with richer type support.  Or you can do
 it yourself.
 
 merlin
 

I have used the libpqtypes library and it is very easy to use.

+1 for adding it or something like it to the PostgreSQL core.
I have people who will try and roll their own because it does
not come with the core. While it is a hoot to see what reinventing
the wheel produces, it is also prone to mistakes.

Regards,
Ken



Re: [HACKERS] Suggesting a libpq addition

2010-12-06 Thread Andrew Chernow




That would be a *HUGE* piece of software compared to the relatively small
thing I am suggesting...



Sometimes complex and large solutions are required for the simplest of 
ideas.  I believe this is one of those cases.  You can't solve the 
printf style PQexec properly by merely implementing a sprintf wrapper.



As for escaping (or not escaping) of string arguments, that can be seen
as a bug or a feature.  I do not want automatic escaping of string
arguments in all cases; e.g., I might want to construct an SQL statement
with dynamic parts like "WHERE x<y" or "AND a = b".

hypothetical example:

 filter = "WHERE name like 'Balmer%'";
 if (sort == SORT_DESC)
     sort = " ORDER BY name DESCENDING";

 PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort);

So what I am aiming at right now is a PQvexec() function that basically
has printf() like semantics, but adds an additional token to the format
string (printf uses %s and %b to produce strings.) I am thinking of
adding %S and %B, which produce strings that are escaped.



This suffers from becoming cryptic over time; see Tom Lane's comments
back in 2007 on this
(http://archives.postgresql.org/pgsql-hackers/2007-12/msg00362.php).
libpqtypes uses the human-readable %schema.typename (schema is optional)
to specify format specifiers.  There is no learning curve or ambiguity:
if you want a point then use %point, or %my_type, etc.  libpqtypes
allows you to register aliases (PQregisterSubClasses) so that you can
map %text to %s to make it feel more like C.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 10:22 AM, Robert Haas wrote:

On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 06.12.2010 15:53, Robert Haas wrote:

I guess.  It still seems far too much like exposing the server's guts
for my taste.  It might not be as bad as the expression tree stuff,
but there's nothing particularly good about it either.

Note that we already have txid_current_snapshot() function, which exposes
all that.

Fair enough, and I think that's actually useful for Slony &c.  But I
don't think we should shy away from providing a cleaner API here.



Just don't let the perfect get in the way of the good :P

cheers

andrew



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 10:35 AM, Andrew Dunstan and...@dunslane.net wrote:
 On 12/06/2010 10:22 AM, Robert Haas wrote:

 On Mon, Dec 6, 2010 at 9:58 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com  wrote:

 On 06.12.2010 15:53, Robert Haas wrote:

 I guess.  It still seems far too much like exposing the server's guts
 for my taste.  It might not be as bad as the expression tree stuff,
 but there's nothing particularly good about it either.

 Note that we already have txid_current_snapshot() function, which exposes
 all that.

 Fair enough, and I think that's actually useful for Slony &c.  But I
 don't think we should shy away from providing a cleaner API here.


 Just don't let the perfect get in the way of the good :P

I'll keep that in mind.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Well, then you need some sort of cross-backend communication, which is
 always a bit clumsy.

 A temp file seems quite sufficient, and not at all difficult.

"Not at all difficult" is nonsense.  To do that, you need to invent some
mechanism for sender and receivers to identify which temp file they want
to use, and you need to think of some way to clean up the files when the
client forgets to tell you to do so.  That's going to be at least as
ugly as anything else.  And I think it's unproven that this approach
would be security-hole-free either.  For instance, what about some other
session overwriting pg_dump's snapshot temp file?

regards, tom lane



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 10:40 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 Well, then you need some sort of cross-backend communication, which is
 always a bit clumsy.

 A temp file seems quite sufficient, and not at all difficult.

 "Not at all difficult" is nonsense.  To do that, you need to invent some
 mechanism for sender and receivers to identify which temp file they want
 to use,

Why is this even remotely hard?  That's the whole point of having the
publish operation return a token.  The token either is, or uniquely
identifies, the file name.

 and you need to think of some way to clean up the files when the
 client forgets to tell you to do so.  That's going to be at least as
 ugly as anything else.

Backends don't forget to call their end-of-transaction hooks, do they?
 They might crash, but we already have code to remove temp files on
server restart.  At most it would need minor adjustment.

  And I think it's unproven that this approach
 would be security-hole-free either.  For instance, what about some other
 session overwriting pg_dump's snapshot temp file?

Why would this be any different from any other temp file?  We surely
must have a mechanism in place to ensure that the temporary files used
by sorts or hash joins don't get overwritten by some other session, or
the system would be totally unstable.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 10:40 AM, Tom Lane wrote:

Robert Haas <robertmh...@gmail.com> writes:

On Mon, Dec 6, 2010 at 9:45 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

Well, then you need some sort of cross-backend communication, which is
always a bit clumsy.

A temp file seems quite sufficient, and not at all difficult.

"Not at all difficult" is nonsense.  To do that, you need to invent some
mechanism for sender and receivers to identify which temp file they want
to use, and you need to think of some way to clean up the files when the
client forgets to tell you to do so.  That's going to be at least as
ugly as anything else.  And I think it's unproven that this approach
would be security-hole-free either.  For instance, what about some other
session overwriting pg_dump's snapshot temp file?




Yeah. I'm still not convinced that using shared memory is a bad way to 
pass these around. Surely we're not talking about large numbers of them. 
What am I missing here?


cheers

andrew



Re: [HACKERS] Suggesting a libpq addition

2010-12-06 Thread Dmitriy Igrishin
IMO, it would be better to implement some utility functions to
make it easy to construct arrays dynamically for PQexecParams
and PQexecPrepared. This seems to me a more universal solution,
and it is useful both for high-level libpq-library authors and for
those who like to use libpq directly.
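
(For illustration, the kind of helper being suggested might look like
the sketch below; all names here are invented, not an existing API:)

#include <stdlib.h>
#include <string.h>

/* Hypothetical growable parameter list for PQexecParams. */
typedef struct
{
	int		nparams;
	int		cap;
	char  **values;		/* text-format values, or NULL for SQL NULL */
} ParamList;

static void
param_add(ParamList *pl, const char *value)
{
	if (pl->nparams == pl->cap)
	{
		pl->cap = pl->cap > 0 ? pl->cap * 2 : 8;
		pl->values = realloc(pl->values, pl->cap * sizeof(char *));
	}
	pl->values[pl->nparams++] = value ? strdup(value) : NULL;
}

After filling it, pl->nparams and pl->values would be passed straight to
PQexecParams() as its nParams and paramValues arguments.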

2010/12/6 Andrew Chernow a...@esilo.com



 That would be a *HUGE* piece of software compared to the relatively small
 thing I am suggesting...


 Sometimes complex and large solutions are required for the simplest of
 ideas.  I believe this is one of those cases.  You can't solve the printf
 style PQexec properly by merely implementing a sprintf wrapper.


  As for escaping (or not escaping) of string arguments, that can be seen
 as a bug or a feature.  I do not want automatic escaping of string
 arguments in all cases; e.g., I might want to construct an SQL statement
 with dynamic parts like "WHERE x<y" or "AND a = b".

 hypothetical example:

 filter = "WHERE name like 'Balmer%'";
 if (sort == SORT_DESC)
        sort = " ORDER BY name DESCENDING";

 PQvexec(conn, "SELECT name, nr, id FROM address %s%s", filter, sort);

 So what I am aiming at right now is a PQvexec() function that basically
 has printf() like semantics, but adds an additional token to the format
 string (printf uses %s and %b to produce strings.) I am thinking of
 adding %S and %B, which produce strings that are escaped.


 This suffers from becoming cryptic over time, see Tom Lane's comments back
 in 2007 on this (
 http://archives.postgresql.org/pgsql-hackers/2007-12/msg00362.php).
 libpqtypes uses the human readable %schema.typename (schema is optional) to
 specify format specifiers.  There is no learning curve or ambiguity: if you
 want a point then use %point, or %my_type, etc.  libpqtypes allows you to
 register aliases (PQregisterSubClasses) so that you can map %text to %s to
 make it feel more like C.


 --
 Andrew Chernow
 eSilo, LLC
 every bit counts
 http://www.esilo.com/





-- 
// Dmitriy.


Re: [HACKERS] Suggesting a libpq addition

2010-12-06 Thread Andrew Chernow

On 12/6/2010 11:40 AM, Dmitriy Igrishin wrote:

IMO, it would be better to implement some utility functions to
make it easy to construct arrays dynamically for PQexecParams
and PQexecPrepared. This seems to me a more universal solution,
and it is useful both for high-level libpq-library authors and for
those who like to use libpq directly.



Hmm, your idea isn't better; it is identical to what libpqtypes already 
does :)

http://libpqtypes.esilo.com/browse_source.html?file=exec.c

We wrap PQexecParams and friends.  You are coding libpq.  We expended 
much effort to provide the same result interface (PGresult), including 
handling composites and arrays.  You getf composites and arrays as 
PGresults, where a composite is a single-tuple, multiple-field result, an 
array is a multiple-tuple, single-field result, and composite arrays are 
multiple tuples and multiple fields.  We've just made a more formal set 
of utility functions, typically called an API, in an attempt to match 
the coding standards of the postgresql project.


There is no libpq param interface like results, so we added the PGparam 
stuff. This allows you to pack parameters (PQputf) and then execute it.


--
Andrew Chernow
eSilo, LLC
every bit counts
http://www.esilo.com/



Re: [HACKERS] Suggesting a libpq addition

2010-12-06 Thread Dmitriy Igrishin
2010/12/6 Andrew Chernow a...@esilo.com

 On 12/6/2010 11:40 AM, Dmitriy Igrishin wrote:

 IMO, it would be better to implement some utility functions to
 make it easy to construct arrays dynamically for PQexecParams
 and PQexecPrepared. This seems to me a more universal solution,
 and it is useful both for high-level libpq-library authors and for
 those who like to use libpq directly.


 Hmm, your idea isn't better, it is identical to what libpqtypes already
 does :)
 http://libpqtypes.esilo.com/browse_source.html?file=exec.c

Actually I don't need this functionality :-). I've implemented a library in
C++ which makes many things easy, including auto memory management, type
conversion and binary transfers...
But I believe that including the proposed utility functions would be better
than a printf-like addition... although both of these are excess.


 We wrap PQexecParams and friends.  You are coding libpq.  We expended much
 effort to provide the same result interface (PGresult), including handling
 composites and arrays.  You getf composites and arrays as PGresults, where a
 composite is a single-tuple, multiple-field result, an array is a multiple-
 tuple, single-field result, and composite arrays are multiple tuples and
 multiple fields.  We've just made a more formal set of utility functions,
 typically called an API, in an attempt to match the coding standards of the
 postgresql project.

There is no libpq param interface like results, so we added the PGparam stuff.
 This allows you to pack parameters (PQputf) and then execute it.

So, let libpq not be bloated. Let libpq remain a low-level library for
projects like libpqtypes, pqxx and so on (my library too) ;-)




 --
 Andrew Chernow
 eSilo, LLC
 every bit counts
 http://www.esilo.com/




-- 
// Dmitriy.


Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields

2010-12-06 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Attached is a patch that allows CopyReadAttributesText() and 
 CopyReadAttributesCSV() to read arbitrary numbers of attributes. 
 Underflowing attributes are recorded as null, and space is made for 
 overflowing attributes on a line.

Why are you still passing nfields as a separate parameter instead of
relying on the value you added to the struct?  That can't do anything
except cause confusion, especially once the two values diverge due to a
previous array-expansion.  Also, why did you change the setup code to
not compute nfields in binary mode?  That seems at best an unnecessary
change, and at worst a breakage of the binary path --- did you test it?

Also please be a little more careful with the formatting.  This for
instance is pretty sloppy:

!  * strings.  cstate->raw_fields[k] is set to point to the k'th attribute 
!  * string, * or NULL when the input matches the null marker string.  

and there seem to be some gratuitous whitespace changes as well.

regards, tom lane



Re: [HACKERS] serializable read only deferrable

2010-12-06 Thread Kevin Grittner
I wrote:
 Tom Lane  wrote:
 
 I assume this would have to be a hard definition of READ ONLY,
 not the rather squishy definition we use now?
 
 I'm excluding temporary tables from SSI on the grounds that they
 are only read and written by a single transaction and therefore
 can't be a source of rw-dependencies, and I'm excluding system
 tables on the grounds that they don't follow normal snapshot
 isolation rules. Hint bit rewrites are not an issue for SSI.  Are
 there any other squishy aspects I might need to consider?
 
I reviewed the documentation and played around with this a bit and
can't find any areas where the current PostgreSQL implementation of
READ ONLY is incompatible with what is needed for the SSI
optimizations where it is used.  There are a large number of tests
which exercise this, and they're all passing.
 
Did you have something in particular in mind which I should check? 
An example of code you think might break would be ideal, but
anything more concrete than the word "squishy" would be welcome.
 
Any thoughts on the original question about what to use as a
heavyweight lock to support the subject feature?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FK's to refer to rows in inheritance child

2010-12-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Dec 5, 2010 at 12:41 PM, Andrew Dunstan and...@dunslane.net wrote:
 Well, ISTM that amounts to not having official topic branches :-) I agree
 that this is supposed to be one of git's strengths (or more exactly a
 strength of distributed SCM's generally).  I don't really see any great
 value in sanctifying a particular topic branch with some official status.

 I think the value in an official topic branch would be to allow formal
 incremental commit of large patches.  In other words, we could decide
 that a commit to the official topic branch must meet the same
 standards of quality normally applied to a commit to the master
 branch, and must go through the same process.  It would be understood
 that the topic branch would eventually be merged (with or without
 squash) back into the master branch, but that objections were to be
 raised as pieces were committed to the topic branch, not at merge
 time.  The merge itself would require consensus as to timing, but we'd
 agree to take a dim view of I haven't reviewed anything that's been
 going on here for the last six months but now hate all of it.

Topic branches defined that way seem like a pretty bad idea from here.
They would save no effort at all for committers, because if you're not
allowed to object to something after it's gone into a topic branch, then
it's just like master in terms of having to keep up with changes as they
happen.  Moreover, we'd have to keep them in pretty close sync with
master --- otherwise what happens when you discover that some long-ago
change on master breaks the topic branch?  So AFAICS this would just
increase the amount of keeping-branches-in-sync dogwork without any
offsetting advantage.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Yeah. I'm still not convinced that using shared memory is a bad way to 
 pass these around. Surely we're not talking about large numbers of them. 
 What am I missing here?

They're not of a very predictable size.

Robert's idea of publish() returning a temp file identifier, which then
gets removed at transaction end, might work all right.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 Why not just say give me the snapshot currently held by process ?

There's not a unique snapshot held by a particular process.  Also, we
don't want to expend the overhead to fully publish every snapshot.
I think it's really necessary that the sending process take some
deliberate action to publish a snapshot.

 And please, not temp files if possible.

Barring the cleanup issue, I don't see why not.  This is a relatively
low-usage feature, I think, so I wouldn't be much in favor of dedicating
shmem to it even if the space requirement were predictable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-12-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 One possible way to make an improvement in this area would be to
 move the responsibility for accepting connections out of the
 postmaster.  Instead, you'd have a group of children that would all
 call accept() on the socket, and the OS would arbitrarily pick one to
 receive each new incoming connection.  The postmaster would just be
 responsible for making sure that there were enough children hanging
 around.  You could in fact make this change without doing anything
 else, in which case it wouldn't save any work but would possibly
 reduce connection latency a bit since more of the work could be done
 before the connection actually arrived.

This seems like potentially a good idea independent of anything else,
just to reduce connection latency: fork() (not to mention exec() on
Windows) now happens before, not after, receipt of the connection request.
However, I see a couple of stumbling blocks:

1. Does accept() work that way everywhere (Windows, I'm looking at you)

2. What do you do when max_connections is exceeded, and you don't have
anybody at all listening on the socket?  Right now we are at least able
to send back an error message explaining the problem.

Another issue that would require some thought is what algorithm the
postmaster uses for deciding to spawn new children.  But that doesn't
sound like a potential showstopper.
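
To make the idea concrete, here is a minimal, hypothetical sketch of the
pre-fork accept() pattern with plain POSIX sockets (the worker count and
port are placeholders, and nearly all error handling is elided):

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <sys/socket.h>
    #include <netinet/in.h>

    #define NUM_WORKERS 4

    static void
    worker(int listen_fd)
    {
        for (;;)
        {
            /* all children block here; the kernel hands each new
             * connection to exactly one of them */
            int conn = accept(listen_fd, NULL, NULL);

            if (conn < 0)
                continue;
            /* ... authenticate and serve the session here ... */
            close(conn);
        }
    }

    int
    main(void)
    {
        int                 fd = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in  addr;
        int                 i;

        memset(&addr, 0, sizeof(addr));
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(5433);    /* placeholder port */

        if (fd < 0 ||
            bind(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
            listen(fd, 64) < 0)
        {
            perror("listen");
            return 1;
        }

        for (i = 0; i < NUM_WORKERS; i++)
            if (fork() == 0)
            {
                worker(fd);     /* children inherit the listening socket */
                _exit(0);
            }

        /* a real postmaster would watch the children and re-fork to
         * keep NUM_WORKERS spares available */
        for (;;)
            pause();
    }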

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Sun, Dec 5, 2010 at 6:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Why is there a variadic replace() in this patch at all?  It seems just
 about entirely unrelated to the stated purpose of the patch, as well
 as being of dubious usefulness.  When would it be superior to
replace(replace(orig, from1, to1), from2, to2), ...

 An iterated replacement has different semantics from a simultaneous
 replace - replacing N placeholders with values simultaneously means
 you don't need to worry about the case where one of the replacement
 strings contains something that looks like a placeholder.

Good point, but what the patch implements is in fact iterated
replacement ... or at least it looked that way in a quick once-over.
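
To illustrate the semantic difference with a standalone sketch (a
hypothetical helper, not code from the patch): scan the input once, and
never rescan emitted replacement text.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* At each position the first matching "from" wins, and the emitted
     * "to" text is never rescanned. */
    static char *
    replace_simultaneous(const char *src, int n,
                         const char *from[], const char *to[])
    {
        size_t  cap = strlen(src) + 1;
        size_t  len = 0;
        char   *out = malloc(cap);

        while (*src != '\0')
        {
            const char *rep = NULL;
            size_t      skip = 1;
            const char *piece;
            size_t      plen;
            int         i;

            for (i = 0; i < n; i++)
            {
                size_t  flen = strlen(from[i]);

                if (flen > 0 && strncmp(src, from[i], flen) == 0)
                {
                    rep = to[i];
                    skip = flen;
                    break;
                }
            }
            piece = rep ? rep : src;
            plen = rep ? strlen(rep) : 1;
            while (len + plen + 1 > cap)
                cap *= 2;
            out = realloc(out, cap);
            memcpy(out + len, piece, plen);
            len += plen;
            src += skip;
        }
        out[len] = '\0';
        return out;
    }

    int
    main(void)
    {
        const char *from[] = {"@a@", "@b@"};
        const char *to[]   = {"@b@", "x"};
        char       *s = replace_simultaneous("@a@ @b@", 2, from, to);

        /* prints "@b@ x"; iterated replace(replace(...)) yields "x x",
         * because the first pass's output gets rescanned */
        puts(s);
        free(s);
        return 0;
    }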

 I actually
 think a simultaneous replacement feature would be quite handy but I
 make no comment on whether it belongs as part of this patch.

My point is that the replacement stuff really really needs to be
factored out of the string-execution stuff, precisely because the
desired behavior is debatable.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-12-06 Thread Josh Berkus

On 12/06/2010 09:38 AM, Tom Lane wrote:

Another issue that would require some thought is what algorithm the
postmaster uses for deciding to spawn new children.  But that doesn't
sound like a potential showstopper.


We'd probably want a couple of different ones, optimized for different 
connection patterns.  Realistically.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 12:28 PM, Tom Lane wrote:

Andrew Dunstanand...@dunslane.net  writes:

Yeah. I'm still not convinced that using shared memory is a bad way to
pass these around. Surely we're not talking about large numbers of them.
What am I missing here?

They're not of a very predictable size.




Ah. Ok.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 12:38 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 One possible way to make an improvement in this area would be to
 move the responsibility for accepting connections out of the
 postmaster.  Instead, you'd have a group of children that would all
 call accept() on the socket, and the OS would arbitrarily pick one to
 receive each new incoming connection.  The postmaster would just be
 responsible for making sure that there were enough children hanging
 around.  You could in fact make this change without doing anything
 else, in which case it wouldn't save any work but would possibly
 reduce connection latency a bit since more of the work could be done
 before the connection actually arrived.

 This seems like potentially a good idea independent of anything else,
 just to reduce connection latency: fork() (not to mention exec() on
 Windows) now happens before not after receipt of the connection request.
 However, I see a couple of stumbling blocks:

 1. Does accept() work that way everywhere (Windows, I'm looking at you)

Not sure.  It might be useful to look at what Apache does, but I don't
have time to do that ATM.

 2. What do you do when max_connections is exceeded, and you don't have
 anybody at all listening on the socket?  Right now we are at least able
 to send back an error message explaining the problem.

Sending back an error message explaining the problem seems like a
non-negotiable requirement.  I'm not quite sure how to dance around
this.  Perhaps if max_connections is exhausted, the postmaster itself
joins the accept() queue and launches a dead-end backend for each new
connection.  Or perhaps we reserve one extra backend slot for a
probably-dead-end backend that will just sit there and mail rejection
notices; except that if it sees that a regular backend slot has opened
up it grabs it and turns itself into a regular backend.

 Another issue that would require some thought is what algorithm the
 postmaster uses for deciding to spawn new children.  But that doesn't
 sound like a potential showstopper.

The obvious algorithm would be to try to keep N spare workers around.
Any time the number of unconnected backends drops below N the
postmaster starts spawning new ones until it gets back up to N.  I
think the trick may not be the algorithm so much as finding a way to
make the signaling sufficiently robust and lightweight.  For example,
I bet having each child that gets a new connection signal() the
postmaster is a bad plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-12-06 Thread Josh Berkus



At some point Hackers should look at pg vs MySQL multi tenantry but it
is way tangential today.


My understanding is that our schemas work like MySQL databases; and
our databases are an even higher level of isolation.  No?


That's correct.  Drizzle is looking at implementing a feature like our 
databases called catalogs (per the SQL spec).


Let me stress that not everyone is happy with the MySQL multi-tenantry 
approach.  But it does make multi-tenancy on a scale which you seldom 
see with PG possible, even if it has problems.  It's worth seeing 
whether we can steal any of their optimization ideas without breaking PG.


I was specifically looking at the login model, which works around the 
issue that we have: namely that different login ROLEs can't share a 
connection pool.  In MySQL, they can share the built-in connection 
pool because role-switching effectively is a session variable. 
AFAICT, anyway.


For that matter, if anyone knows any other DB which does multi-tenant 
well/better, we should be looking at them too.


--
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation

2010-12-06 Thread David E. Wheeler
On Dec 6, 2010, at 4:06 AM, Itagaki Takahiro wrote:

 * contrib/citext raises an encoding error when COLLATE is specified
 even if it is the same collation as the database default.
 We might need some special treatment for C locale.

I've been wondering if this patch will support case-insensitive collations. If 
so, then citext should probably be revised to use one.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread David E. Wheeler
On Dec 6, 2010, at 7:19 AM, Tom Lane wrote:

 On the whole I'd prefer not to have any substitution functionality
 hard-wired into pg_execute_file either, though I can see the argument
 that it's necessary for practical use.  Basically I'm concerned that
 replace-equivalent behavior is not going to be satisfactory over the
 long run: I think eventually we're going to need to think about
 quoting/escaping behavior.  So I think it's a bad idea to expose the
 assumption that it'll be done that way at the SQL level.

+1

I suspect that, for the purposes of the extensions patch, if CREATE EXTENSION 
could be modified to handle setting the schema itself, without requiring that 
the file have this magic line:

   SET search_path = @extschema@;

Then there would be no need for substitutions at all.

Best,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 12:57 PM, Josh Berkus j...@agliodbs.com wrote:
 At some point Hackers should look at pg vs MySQL multi tenantry but it
 is way tangential today.

 My understanding is that our schemas work like MySQL databases; and
 our databases are an even higher level of isolation.  No?

 That's correct.  Drizzle is looking at implementing a feature like our
 databases called catalogs (per the SQL spec).

 Let me stress that not everyone is happy with the MySQL multi-tenantry
 approach.  But it does make multi-tenancy on a scale which you seldom see
 with PG possible, even if it has problems.  It's worth seeing whether we can
 steal any of their optimization ideas without breaking PG.

Please make sure to articulate what you think is wrong with our existing model.

 I was specifically looking at the login model, which works around the issue
 that we have: namely that different login ROLEs can't share a connection
 pool.  In MySQL, they can share the built-in connection pool because
 role-switching effectively is a session variable. AFAICT, anyway.

Please explain more precisely what is wrong with SET SESSION
AUTHORIZATION / SET ROLE.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 12:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sun, Dec 5, 2010 at 6:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Why is there a variadic replace() in this patch at all?  It seems just
 about entirely unrelated to the stated purpose of the patch, as well
 as being of dubious usefulness.  When would it be superior to
        replace(replace(orig, from1, to1), from2, to2), ...

 An iterated replacement has different semantics from a simultaneous
 replace - replacing N placeholders with values simultaneously means
 you don't need to worry about the case where one of the replacement
 strings contains something that looks like a placeholder.

 Good point, but what the patch implements is in fact iterated
 replacement ... or at least it looked that way in a quick once-over.

Oh.  Well, -1 from me for including that.

 I actually
 think a simultaneous replacement feature would be quite handy but I
 make no comment on whether it belongs as part of this patch.

 My point is that the replacement stuff really really needs to be
 factored out of the string-execution stuff, precisely because the
 desired behavior is debatable.

+1 for committing the uncontroversial parts separately.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 12:11 PM, Tom Lane wrote:

Andrew Dunstanand...@dunslane.net  writes:

Attached is a patch that allows CopyReadAttributesText() and
CopyReadAttributesCSV() to read arbitrary numbers of attributes.
Underflowing attributes are recorded as null, and space is made for
overflowing attributes on a line.

Why are you still passing nfields as a separate parameter instead of
relying on the value you added to the struct?  That can't do anything
except cause confusion, especially once the two values diverge due to a
previous array-expansion.


Good point. will fix.


   Also, why did you change the setup code to
not compute nfields in binary mode?  That seems at best an unnecessary
change, and at worst a breakage of the binary path --- did you test it?


AFAICT it's not used in binary mode at all. But I will double check.


Also please be a little more careful with the formatting.


Ok, will fix also. Thanks for the comments.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation

2010-12-06 Thread Pavel Stehule
2010/12/6 David E. Wheeler da...@kineticode.com:
 On Dec 6, 2010, at 4:06 AM, Itagaki Takahiro wrote:

 * contrib/citext raises an encoding error when COLLATE is specified
 even if it is the same collation as the database default.
 We might need some special treatment for C locale.

 I've been wondering if this patch will support case-insensitive collations. If 
 so, then citext should probably be revised to use one.

As far as I know, no. It supports only system-based collations.

Pavel


 Best,

 David


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 
 I'm still not convinced that using shared memory is a bad way to 
 pass these around. Surely we're not talking about large numbers
 of them.  What am I missing here?
 
 They're not of a very predictable size.
 
Surely you can predict that any snapshot is no larger than a fairly
small fixed portion plus sizeof(TransactionId) * MaxBackends?  So,
for example, if you're configured for 100 connections, you'd be
limited to something under 1kB, maximum?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-06 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 I reviewed the documentation and played around with this a bit and
 can't find any areas where the current PostgreSQL implementation of
 READ ONLY is incompatible with what is needed for the SSI
 optimizations where it is used.  There are a large number of tests
 which exercise this, and they're all passing.
 
 Did you have something in particular in mind which I should check? 

I did not, just thought it was a point that merited examination.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields

2010-12-06 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 12/06/2010 12:11 PM, Tom Lane wrote:
 Also, why did you change the setup code to
 not compute nfields in binary mode?  That seems at best an unnecessary
 change, and at worst a breakage of the binary path --- did you test it?

 AFAICT it's not used in binary mode at all. But I will double check.

Well, even if it is not used at the moment, it seems potentially of use
in that path.  So I'd vote for continuing to set it correctly, rather
than making it deliberately incorrect as this patch is going out of its
way to do.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 I'm still not convinced that using shared memory is a bad way to 
 pass these around. Surely we're not talking about large numbers
 of them.  What am I missing here?
 
 They're not of a very predictable size.
 
 Surely you can predict that any snapshot is no larger than a fairly
 small fixed portion plus sizeof(TransactionId) * MaxBackends?

No.  See subtransactions.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 
 Surely you can predict that any snapshot is no larger than a fairly
 small fixed portion plus sizeof(TransactionId) * MaxBackends?
 
 No.  See subtransactions.
 
Subtransactions are included in snapshots?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] allow COPY routines to read arbitrary numbers of fields

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 01:23 PM, Tom Lane wrote:

Andrew Dunstanand...@dunslane.net  writes:

On 12/06/2010 12:11 PM, Tom Lane wrote:

Also, why did you change the setup code to
not compute nfields in binary mode?  That seems at best an unnecessary
change, and at worst a breakage of the binary path --- did you test it?

AFAICT it's not used in binary mode at all. But I will double check.

Well, even if it is not used at the moment, it seems potentially of use
in that path.  So I'd vote for continuing to set it correctly, rather
than making it deliberately incorrect as this patch is going out of its
way to do.




Ok.

cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 On Dec 6, 2010, at 7:19 AM, Tom Lane wrote:
 On the whole I'd prefer not to have any substitution functionality
 hard-wired into pg_execute_file either, though I can see the argument
 that it's necessary for practical use.  Basically I'm concerned that
 replace-equivalent behavior is not going to be satisfactory over the
 long run: I think eventually we're going to need to think about
 quoting/escaping behavior.  So I think it's a bad idea to expose the
 assumption that it'll be done that way at the SQL level.

 +1

 I suspect that, for the purposes of the extensions patch, if CREATE EXTENSION 
 could be modified to handle setting the schema itself, without requiring that 
 the file have this magic line:

SET search_path = @extschema@;

 Then there would be no need for substitutions at all.

That's an interesting idea, but I'm not sure it's wise to design around
the assumption that we won't need substitutions ever.  What I was
thinking was that we should try to limit knowledge of the substitution
behavior to the extension definition files and the implementation of
CREATE EXTENSION itself.  I don't agree with exposing that information
at the SQL level.

(On the other hand, if we *could* avoid using any explicit
substitutions, it would certainly ease testing of extension files no?
They'd be sourceable into psql then.)

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 No.  See subtransactions.
 
 Subtransactions are included in snapshots?

Sure, see GetSnapshotData().  You could avoid it by setting
suboverflowed, but that comes at a nontrivial performance cost.
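
For reference, the relevant fields, abridged from snapshot.h with
paraphrased comments: each backend can contribute up to
PGPROC_MAX_CACHED_SUBXIDS (64) entries to subxip[] on top of its one
entry in xip[], so a snapshot is bounded, but by roughly
65 * MaxBackends XIDs rather than MaxBackends.

    typedef struct SnapshotData
    {
        TransactionId   xmin;           /* all XIDs < xmin are finished */
        TransactionId   xmax;           /* all XIDs >= xmax count as running */
        TransactionId  *xip;            /* in-progress top-level XIDs */
        uint32          xcnt;           /* # of entries in xip[] */
        TransactionId  *subxip;         /* in-progress subxact XIDs */
        int32           subxcnt;        /* # of entries in subxip[] */
        bool            suboverflowed;  /* subxip[] overflowed; callers must
                                         * fall back to pg_subtrans */
        /* ... */
    } SnapshotData;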

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread David E. Wheeler
On Dec 6, 2010, at 10:43 AM, Tom Lane wrote:

 That's an interesting idea, but I'm not sure it's wise to design around
 the assumption that we won't need substitutions ever.  What I was
 thinking was that we should try to limit knowledge of the substitution
 behavior to the extension definition files and the implementation of
 CREATE EXTENSION itself.  I don't agree with exposing that information
 at the SQL level.
 
 (On the other hand, if we *could* avoid using any explicit
 substitutions, it would certainly ease testing of extension files no?
 They'd be sourceable into psql then.)

Yes. And extension authors would not have to remember to include the magic line 
(which at any rate would break extensions for earlier versions of PostgreSQL).

Best,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Kevin Grittner
Tom Lane t...@sss.pgh.pa.us wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Tom Lane t...@sss.pgh.pa.us wrote:
 No.  See subtransactions.
 
 Subtransactions are included in snapshots?
 
 Sure, see GetSnapshotData().  You could avoid it by setting
 suboverflowed, but that comes at a nontrivial performance cost.
 
Yeah, sorry for blurting like that before I checked.  I was somewhat
panicked that I'd missed something important for SSI, because my
XidIsConcurrent check just uses xmin, xmax, and xip; I was afraid
what I have would fall down in the face of subtransactions.  But on
review I found that I'd thought that through and (discussion is in
the archives) I always wanted to associate the locks and conflicts
with the top level transaction; so that was already identified
before checking for overlap, and it was therefore more efficient to
just check that.
 
Sorry for the senior moment.  :-/
 
Perhaps a line or two of comments about that in the SSI patch would
be a good idea.  And maybe some tests involving subtransactions
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 On Dec 6, 2010, at 10:43 AM, Tom Lane wrote:
 (On the other hand, if we *could* avoid using any explicit
 substitutions, it would certainly ease testing of extension files no?
 They'd be sourceable into psql then.)

 Yes. And extension authors would not have to remember to include the magic 
 line (which at any rate would break extensions for earlier versions of 
 PostgreSQL).

Well, I don't put any stock in the idea that it's important for existing
module .sql files to be usable as-is as extension definition files.  If
it happens to fall out that way, fine, but we shouldn't give up anything
else to get that.  Letting extension files be directly sourceable in
psql is probably worth a bit more, but I'm not sure how much.  The
argument that forgetting to include a magic source_path command would
make CREATE EXTENSION behave surprisingly seems to have a good deal of
merit though, certainly enough to justify having CREATE EXTENSION take
care of that internally if at all possible.

The real question in my mind is whether there are any other known or
foreseeable cases where we would need to have substitution capability
and there's not another good way to handle it.  I haven't been paying
real close attention to the threads about this patch --- do we have any
specific use-cases in mind for substitution, besides this one?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread David E. Wheeler
On Dec 6, 2010, at 11:12 AM, Tom Lane wrote:

 Well, I don't put any stock in the idea that it's important for existing
 module .sql files to be usable as-is as extension definition files.  If
 it happens to fall out that way, fine, but we shouldn't give up anything
 else to get that.

I agree, but I don't think we have to lose anything.

  Letting extension files be directly sourceable in
 psql is probably worth a bit more, but I'm not sure how much.  The
 argument that forgetting to include a magic source_path command would
 make CREATE EXTENSION behave surprisingly seems to have a good deal of
 merit though, certainly enough to justify having CREATE EXTENSION take
 care of that internally if at all possible.

Yes.

The other question I have, though, is how important is it to have extensions 
live in a particular schema since there seems to be no advantage to doing so. 
With the current patch, I can put extension foo in schema bar, but I can't 
put any other extension named foo in any other schema. It's in schema bar 
but is at the same time global. That doesn't make much sense to me.

Best,

David
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation

2010-12-06 Thread Peter Eisentraut
On mån, 2010-12-06 at 21:06 +0900, Itagaki Takahiro wrote:
 On Sun, Dec 5, 2010 at 01:04, Peter Eisentraut pete...@gmx.net wrote:
  Here is an updated patch to address the issues discussed during this
  commitfest.
 
 Here are comments and questions after I tested the latest patch:
 
  Issues 
 * initdb itself seems to have succeeded, but it says could not determine
 encoding for locale messages for any combination of encoding=utf8/eucjp
 and locale=ja_JP.utf8/ja_JP.eucjp/C. Is it an expected behavior?
 
 creating collations ...initdb: locale name has non-ASCII characters,
 skipped: bokmål
 initdb: locale name has non-ASCII characters, skipped: français
 could not determine encoding for locale hy_AM.armscii8: codeset is 
 ARMSCII-8
 ... (a dozen of lines) ...
 could not determine encoding for locale vi_VN.tcvn: codeset is TCVN5712-1
 ok
 

What this does is take the output of locale -a and populate the
pg_collation catalog with the locales it finds.  When it finds an
operating system locale that uses an encoding that is not recognized,
you will see this warning.

I understand that that would probably annoy users.  We could hide the
warning and silently skip those locales.  But then could that hide
genuine configuration problems?

 * contrib/citext raises an encoding error when COLLATE is specified
 even if it is the same collation as the database default.
 We might need some special treatment for C locale.
 =# SHOW lc_collate;  == C
 =# SELECT ('A'::citext) = ('a'::citext);  == false
 =# SELECT ('A'::citext) = ('a'::citext) COLLATE C;
 ERROR:  invalid multibyte character for locale
 HINT:  The server's LC_CTYPE locale is probably incompatible with the
 database encoding.

OK, I can reproduce that.  That's fallout from the lc_ctype_is_c()
optimization that I removed, as explained in another email.  I'll have
to think about that again.

 * pg_dump would generate unportable files for different platforms
 because collation names

pg_dump can already produce unportable files for a number of other
reasons, including per-database locale, tablespaces, OS-dependent
configuration settings.

The way I imagine this working is that someone who wants to design a
genuinely portable application using this feature would create their own
collation based on the existing, OS-specific collation (using a
to-be-added CREATE COLLATION command).  As mentioned earlier, however,
we can't actually solve the problem that the OS locales may not behave
the same across systems.

  Source codes 
 * PG_GETARG_COLLATION might be a better name rather than PG_GET_COLLATION.

It's not the collation of a function argument, it's the collation of a
function call.  (You could conceivably also fetch the collation of a
function argument, but that isn't used in any way.)
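
A sketch of how a C function would consume the call collation under this
scheme (names follow the patch as posted; treat the signatures as
approximate):

    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/builtins.h"     /* varstr_cmp() */

    Datum
    my_text_lt(PG_FUNCTION_ARGS)
    {
        text   *a = PG_GETARG_TEXT_PP(0);
        text   *b = PG_GETARG_TEXT_PP(1);
        Oid     collid = PG_GET_COLLATION();    /* collation of this call */
        int     cmp;

        cmp = varstr_cmp(VARDATA_ANY(a), VARSIZE_ANY_EXHDR(a),
                         VARDATA_ANY(b), VARSIZE_ANY_EXHDR(b),
                         collid);
        PG_RETURN_BOOL(cmp < 0);
    }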

 * What is the different between InvalidOid and DEFAULT_COLLATION_OID
 for collation oids? The patch replaces DirectFunctionCall to
 DirectFunctionCallC in some places, but we could shrink the diff size
 if we can use InvalidOid instead of DEFAULT_COLLATION_OID,

Think of DEFAULT_COLLATION_OID as analogous to UNKNOWNOID.  A long time
ago we used InvalidOid for all kinds of types, including unknown,
pseudotypes, cstring, and no type at all.  The reason we changed this
was that this masked errors and made processing of the unknown type
difficult/impossible.  I know this makes the code bigger, but it's
necessary.  I originally coded the patch using InvalidOid for
everything, but that wasn't very robust.

This also ties into the next question ...

 * I still think explicitly passing collations from function to function
 is horrible because we might forget it in some places, and almost no
 existing third-party module will work.  Is it possible to make it a global
 variable, and push/pop the state when changed? Sorry if I'm missing
 something, but I think we could treat the collation setting like GUC settings.

A collation is a property of a datum or an expression.  You might as
well argue that we don't keep track of types of expressions and instead
store it globally.  Doesn't make sense.

Extensions are not required to support collations.  Those that might
want to will usually end up calling one of the locale-enabled functions
such as varstr_cmp(), and there the function prototype will ensure that
specifying a collation cannot be missed.

Additionally, the distinction of InvalidOid and DEFAULT_COLLATION_OID
does a great deal to ensure that in case a collation is unspecified or
missing in some new code, you will get a proper error message instead of
unspecified behavior.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation

2010-12-06 Thread Peter Eisentraut
On mån, 2010-12-06 at 10:01 -0800, David E. Wheeler wrote:
 I've been wondering if this patch will support case-insensitve
 collations. If so, then citext should probably be revised to use one.

This has been touched upon several times during the discussions on past
patches.

Essentially, the current patch only arranges that you can specify a sort
order for data.  The system always breaks ties using a binary
comparison.  This could conceivably be changed, but it's a separate
problem.  Some of the necessary investigation work has presumably
already been done in the context of citext.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 The other question I have, though, is how important is it to have extensions 
 live in a particular schema since there seems to be no advantage to doing so. 
 With the current patch, I can put extension foo in schema bar, but I 
 can't put any other extension named foo in any other schema. It's in schema 
 bar but is at the same time global. That doesn't make much sense to me.

There's a difference between whether an extension as such is considered
to belong to a schema and whether its contained objects do.  We can't
really avoid the fact that functions, operators, etc must be assigned to
some particular schema.  It seems not particularly important that
extension names be schema-qualified, though --- the use-case for having
two different extensions named foo installed simultaneously seems
pretty darn small.  On the other hand, if we were enforcing that all
objects contained in an extension belong to the same schema, it'd make
logistical sense to consider that the extension itself belongs to that
schema as well.  But last I heard we didn't want to enforce such a
restriction.

I believe what the search_path substitution is actually about is to
provide a convenient shorthand for the case that all the contained
objects do indeed live in one schema, and you'd like to be able to
select that schema at CREATE EXTENSION time.  Which seems like a useful
feature for a common case.  We've certainly heard multiple complaints
about the fact that you can't do that easily now.

BTW, I did think of a case where substitution solves a problem we don't
presently have any other solution for: referring to the target schema
within the definition of a contained object.  As an example, you might
wish to attach SET search_path = @target_schema@ to the definition of
a SQL function in an extension, to prevent search-path-related security
issues in the use of the function.  Without substitution you'll be
reduced to hard-wiring the name of the target schema.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation

2010-12-06 Thread David E. Wheeler
On Dec 6, 2010, at 11:29 AM, Peter Eisentraut wrote:

 This has been touched upon several times during the discussions on past
 patches.
 
 Essentially, the current patch only arranges that you can specify a sort
 order for data.  The system always breaks ties using a binary
 comparison.  This could conceivably be changed, but it's a separate
 problem.  Some of the necessary investigation work has presumably
 already been done in the context of citext.

Okay, thanks, good to know.

Best,

David


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Label switcher function

2010-12-06 Thread Robert Haas
2010/11/25 KaiGai Kohei kai...@ak.jp.nec.com:
 The attached patch is a revised one.

 It provides two hooks; one informs core PG whether the supplied
 function needs to be hooked or not, and the other is an actual hook on
 prepare, start, end and abort of function invocations.

  typedef bool (*needs_function_call_type)(Oid fn_oid);

  typedef void (*function_call_type)(FunctionCallEventType event,
                                     FmgrInfo *flinfo, Datum *private);

 The hook prototype was a bit modified since the suggestion from
 Robert. Because FmgrInfo structure contain OID of the function,
 it might be redundant to deliver OID of the function individually.

 Rest of parts are revised according to the comment.

 I also fixed up source code comments which might become incorrect.

FCET_PREPARE looks completely unnecessary to me.  Any necessary
one-time work can easily be done at FCET_START time, assuming that the
private-data field is initialized to (Datum) 0.

I'm fairly certain that the following is not portable:

+   ObjectAddress   object = { .classId = ProcedureRelationId,
+  .objectId = fn_oid,
+  .objectSubId = 0 };
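
The portable pre-C99 spelling would be plain member assignments:

    ObjectAddress object;

    object.classId = ProcedureRelationId;
    object.objectId = fn_oid;
    object.objectSubId = 0;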

I'd suggest renaming needs_function_call_type and function_call_type
to needs_fmgr_hook_type and fmgr_hook_type.
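
For illustration, a hedged sketch of a module using the hooks with the
signatures quoted above; the FCET_* values are from the patch, while
is_interesting_function(), switch_context() and restore_context() are
hypothetical placeholders:

    static bool
    my_needs_fmgr_hook(Oid fn_oid)
    {
        /* only ask to intercept functions we actually care about */
        return is_interesting_function(fn_oid);
    }

    static void
    my_fmgr_hook(FunctionCallEventType event,
                 FmgrInfo *flinfo, Datum *private)
    {
        switch (event)
        {
            case FCET_START:
                /* stash the old state in *private, switch to the new one */
                *private = switch_context(flinfo->fn_oid);
                break;
            case FCET_END:
            case FCET_ABORT:
                restore_context(*private);
                break;
            default:
                break;
        }
    }

With *private initialized to (Datum) 0 by the caller, any one-time setup
fits in the FCET_START case, which is why FCET_PREPARE has nothing left
to do.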

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] profiling connection overhead

2010-12-06 Thread Josh Berkus

 Please explain more precisely what is wrong with SET SESSION
 AUTHORIZATION / SET ROLE.

1) Session GUCs do not change with a SET ROLE (this is a TODO I haven't
had any time to work on)

2) Users can always issue their own SET ROLE and then hack into other
users' data.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Per-column collation

2010-12-06 Thread Alexandre Riveira

Please

It would be very important to us that LIKE worked with the Brazilian
collation, possibly case-insensitive and accent-insensitive.


Thanks

Alexandre Riveira
Brazil

Peter Eisentraut wrote:

On mån, 2010-12-06 at 10:01 -0800, David E. Wheeler wrote:
  

I've been wondering if this patch will support case-insensitive
collations. If so, then citext should probably be revised to use one.



This has been touch upon several times during the discussions on past
patches.

Essentially, the current patch only arranges that you can specify a sort
order for data.  The system always breaks ties using a binary
comparison.  This could conceivably be changed, but it's a separate
problem.  Some of the necessary investigation work has presumably
already been done in the context of citext.



  



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_execute_from_file review

2010-12-06 Thread David E. Wheeler
On Dec 6, 2010, at 11:36 AM, Tom Lane wrote:

 There's a difference between whether an extension as such is considered
 to belong to a schema and whether its contained objects do.  We can't
 really avoid the fact that functions, operators, etc must be assigned to
 some particular schema.  

Right, of course.

 It seems not particularly important that
 extension names be schema-qualified, though --- the use-case for having
 two different extensions named foo installed simultaneously seems
 pretty darn small.  On the other hand, if we were enforcing that all
 objects contained in an extension belong to the same schema, it'd make
 logistical sense to consider that the extension itself belongs to that
 schema as well.  But last I heard we didn't want to enforce such a
 restriction.

Okay.

 I believe what the search_path substitution is actually about is to
 provide a convenient shorthand for the case that all the contained
 objects do indeed live in one schema, and you'd like to be able to
 select that schema at CREATE EXTENSION time.  Which seems like a useful
 feature for a common case.  We've certainly heard multiple complaints
 about the fact that you can't do that easily now.

Yes, it *is* useful. But what happens if I have 

  SET search_path = whatever;

in my extension install script, and someone executes CREATE EXTENSION foo WITH 
SCHEMA bar? Surprise! Everything ends up in whatever, not in bar.

 BTW, I did think of a case where substitution solves a problem we don't
 presently have any other solution for: referring to the target schema
 within the definition of a contained object.  As an example, you might
 wish to attach SET search_path = @target_schema@ to the definition of
 a SQL function in an extension, to prevent search-path-related security
 issues in the use of the function.  Without substitution you'll be
 reduced to hard-wiring the name of the target schema.

You lost me. :-(

David



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-06 Thread Florian Pflug
On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
 The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE under
 SSI would be to have each non-read-only serializable transaction
 acquire a heavyweight lock which can coexist with other locks at the
 same level (SHARE looks good) on some common object and hold that for
 the duration of the transaction, while a SERIALIZABLE READ ONLY
 DEFERRABLE transaction would need to acquire a conflicting lock
 (EXCLUSIVE looks good) before it could acquire a snapshot, and
 release the lock immediately after acquiring the snapshot.

Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to acquire the lock, 
no other transaction would be allowed to start until the SERIALIZABLE READ ONLY 
DEFERRABLE transaction has been able to acquire its snapshot. For pg_dump's 
purposes at least, that seems undesirable, since a single long-running 
transaction at the time you start pg_dump would effectively DoS your system until 
the long-running transaction finishes.

The alternative seems to be to drop the guarantee that a SERIALIZABLE READ ONLY 
DEFERRABLE won't be starved forever by a stream of overlapping non-READ ONLY 
transactions. Then a flag in the proc array that marks non-READ ONLY 
transactions should be sufficient, plus a wait-and-retry loop to take snapshots 
for SERIALIZABLE READ ONLY DEFERRABLE transactions.

best regards,
Florian Pflug
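
For reference, a rough sketch of the quoted scheme in terms of the
existing lock manager (the advisory lock tag is an arbitrary
placeholder; uses storage/lock.h and utils/snapmgr.h):

    LOCKTAG     tag;
    Snapshot    snapshot;

    SET_LOCKTAG_ADVISORY(tag, MyDatabaseId, 0, 0, 1);

    /* each non-read-only serializable xact, held for its duration: */
    LockAcquire(&tag, ShareLock, false, false);

    /* SERIALIZABLE READ ONLY DEFERRABLE, around snapshot acquisition: */
    LockAcquire(&tag, ExclusiveLock, false, false);
    snapshot = GetTransactionSnapshot();
    LockRelease(&tag, ExclusiveLock, false);

Since the lock manager queues new conflicting requests behind existing
waiters, a queued ExclusiveLock request blocks every later ShareLock
request, which is exactly the stall described above.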


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread marcin mank
On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 IIRC, in old discussions of this problem we first considered allowing
 clients to pull down an explicit representation of their snapshot (which
 actually is an existing feature now, txid_current_snapshot()) and then
 upload that again to become the active snapshot in another connection.

Could a hot standby use such a snapshot representation? I.e. same
snapshot on the master and the standby?

Greetings
Marcin Mańk

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 Regardless, I'm now leaning heavily toward the idea of avoiding 
 open_datasync by default given this bug, and backpatching that change to 
 at least 8.4.  I'll do some more database-level performance tests here 
 just as a final sanity check on that.  My gut feel is now that we'll 
 eventually be taking something like Marti's patch, adding some more 
 documentation around it, and applying that to HEAD as well as some 
 number of back branches.

I think we have got consensus that (1) open_datasync should not be the
default on Linux, and (2) this change needs to be back-patched.  What
is not clear to me is whether we have consensus to change the option
preference order globally, or restrict the change to just be effective
on Linux.  The various testing that's been reported so far is all for
Linux and thus doesn't directly address the question of whether other
kernels will have similar performance properties.  However, it seems
reasonable to me to suppose that open_datasync could only be a win in
very restricted scenarios and thus shouldn't be a preferred default.
Also, I dread trying to document the behavior if the preference order
becomes platform-dependent.

With the holidays fast approaching, our window to do something about
this in a timely fashion grows short.  If we don't schedule update
releases to be made this week, I think we're looking at not getting the
updates out till after New Year's.  Do we want to wait that long?  Is
anyone actually planning to do performance testing that would prove
anything about non-Linux platforms?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 21:48, marcin mank wrote:

On Sun, Dec 5, 2010 at 7:28 PM, Tom Lanet...@sss.pgh.pa.us  wrote:

IIRC, in old discussions of this problem we first considered allowing
clients to pull down an explicit representation of their snapshot (which
actually is an existing feature now, txid_current_snapshot()) and then
upload that again to become the active snapshot in another connection.


Could a hot standby use such a snapshot representation? I.e. same
snapshot on the master and the standby?


Hmm, I suppose it could. That's an interesting idea: you could run 
parallel pg_dump or something else against the master and/or multiple hot 
standby servers, all working on the same snapshot.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
marcin mank marcin.m...@gmail.com writes:
 On Sun, Dec 5, 2010 at 7:28 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 IIRC, in old discussions of this problem we first considered allowing
 clients to pull down an explicit representation of their snapshot (which
 actually is an existing feature now, txid_current_snapshot()) and then
 upload that again to become the active snapshot in another connection.

 Could a hot standby use such a snapshot representation? I.e. same
 snapshot on the master and the standby?

Hm, that's a good question.  It seems like it's at least possibly
workable, but I'm not sure if there are any showstoppers.  The other
proposal of publish-a-snapshot would presumably NOT support this, since
we'd not want to ship the snapshot temp files down the WAL stream.

However, if you were doing something like parallel pg_dump you could
just run the parent and child instances all against the slave, so the
pg_dump scenario doesn't seem to offer much of a supporting use-case for
worrying about this.  When would you really need to be able to do it?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-06 Thread Kevin Grittner
Florian Pflug f...@phlo.org wrote:
 On Dec5, 2010, at 16:11 , Kevin Grittner wrote:
 The simple way to implement SERIALIZABLE READ ONLY DEFERRABLE
 under SSI would be to have each non-read-only serializable
 transaction acquire a heavyweight lock which can coexist with
 other locks at the same level (SHARE looks good) on some common
 object and hold that for the duration of the transaction, while a
 SERIALIZABLE READ ONLY DEFERRABLE transaction would need to
 acquire a conflicting lock (EXCLUSIVE looks good) before it could
 acquire a snapshot, and release the lock immediately after
 acquiring the snapshot.
 
 Hm, so once a SERIALIZABLE READ ONLY DEFERRABLE is waiting to
 acquire the lock, no other transaction would be allowed to start
 until the SERIALIZABLE READ ONLY DEFERRABLE transaction has been
 able to acquire its snapshot. For pg_dump's purposes at least,
 that seems undesirable, since a single long-running transaction at
 the time you start pg_dump would effectively DoS your system until
 the long-running transaction finishes.
 
Well, when you put it that way, it sounds pretty grim.  :-(  Since
one of the bragging points of SSI is that it doesn't introduce any
blocking beyond current snapshot isolation, I don't want to do
something here which blocks anything except the transaction which
has explicitly requested the DEFERRABLE property.  I guess that,
simple as that technique might be, it just isn't a good idea.
 
 The alternative seems to be to drop the guarantee that a
 SERIALIZABLE READ ONLY DEFERRABLE won't be starved forever by a
 stream of overlapping non-READ ONLY transactions. Then a flag in
 the proc array that marks non-READ ONLY transactions should be
 sufficient, plus a wait-and-retry loop to take snapshots for
 SERIALIZABLE READ ONLY DEFERRABLE transactions.
 
If I can find a way to pause an active process, I already have
functions in which I maintain the count of active SERIALIZABLE READ
WRITE transactions as they begin and end, so I could release pending
DEFERRABLE transactions when the count hits zero without any
separate loop.
more complex checking which could allow the deferrable process to
start sooner in some circumstances.  The simple solution with the
heavyweight lock would not have been a good path to that.
 
What would be the correct way for a process to put itself to sleep,
and for another process to later wake it up?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-06 Thread Heikki Linnakangas

On 06.12.2010 22:53, Kevin Grittner wrote:

What would be the correct way for a process to put itself to sleep,
and for another process to later wake it up?


See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code.
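
A minimal sketch of the latch flavor, assuming a Latch placed in shared
memory for the waiting backend (the latch code is brand new, so treat
the exact signatures as approximate; ready_to_proceed() is a
hypothetical predicate):

    /* once, when the shared state is created: */
    InitSharedLatch(&sharedState->latch);

    /* the waiting backend: */
    OwnLatch(&sharedState->latch);
    for (;;)
    {
        ResetLatch(&sharedState->latch);
        if (ready_to_proceed())
            break;
        WaitLatch(&sharedState->latch, -1L);    /* sleep until SetLatch */
    }
    DisownLatch(&sharedState->latch);

    /* the waker, from another backend: */
    SetLatch(&sharedState->latch);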

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] serializable read only deferrable

2010-12-06 Thread Kevin Grittner
Heikki Linnakangas heikki.linnakan...@enterprisedb.com wrote:
 On 06.12.2010 22:53, Kevin Grittner wrote:
 What would be the correct way for a process to put itself to
 sleep, and for another process to later wake it up?
 
 See ProcWaitSignal/ProcSendSignal. Or the new 'latch' code.
 
Is there a reason to prefer one over the other?
 
-Kevin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Josh Berkus

 However, if you were doing something like parallel pg_dump you could
 just run the parent and child instances all against the slave, so the
 pg_dump scenario doesn't seem to offer much of a supporting use-case for
 worrying about this.  When would you really need to be able to do it?

If you had several standbys, you could distribute the work of the
pg_dump among them.  This would be a huge speedup for a large database,
potentially, thanks to parallelization of I/O and network.  Imagine
doing a pg_dump of a 300GB database in 10min.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Greg Smith

Tom Lane wrote:

The various testing that's been reported so far is all for
Linux and thus doesn't directly address the question of whether other
kernels will have similar performance properties.


Survey of some popular platforms:

Linux:  don't want O_DIRECT by default for reliability reasons, and 
there's no clear performance win in the default config with small 
wal_buffers


Solaris:  O_DIRECT doesn't work; there's another API that support has never 
been added for; see 
http://blogs.sun.com/jkshah/entry/postgresql_wal_sync_method_and


Windows:  Small reported gains for O_DIRECT, i.e. 10%, at 
http://archives.postgresql.org/pgsql-hackers/2007-03/msg01615.php


FreeBSD:  It probably works there, but I've never seen good performance 
tests of it on this platform.


Mac OS X:  Like Solaris, there's a similar mechanism but it's not 
O_DIRECT; see 
http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag 
for notes about the F_NOCACHE feature used.  Same basic situation as 
Solaris; there's an API, but PostgreSQL doesn't use it yet.


So my guess is that some small percentage of Windows users might notice 
a change here, and some testing on FreeBSD would be useful too.  That's 
about it for platforms that I think anybody needs to worry about.


--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books



[HACKERS] the number of file descriptors when using POSIX semaphore

2010-12-06 Thread flyusa2010 fly
Hi, folks,

in src/template/darwin:

# Select appropriate semaphore support. Darwin 6.0 (Mac OS X 10.2) and up
# support System V semaphores; before that we have to use POSIX semaphores,
# which are less good for our purposes because they eat a file descriptor
# per backend per max_connection slot.

To my understanding, the number of descriptors created by POSIX semaphores
would be the number of actual clients times max_connections.
However, when I monitor the number of open files using sysctl, I find that
kern.num_files doesn't match the result calculated from that formula...
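
For instance, with max_connections = 100 and 10 clients connected, that
reading predicts on the order of

    10 backends * 100 semaphores = 1000 descriptors

plus whatever the postmaster, auxiliary processes, and ordinary data and
library files already account for.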

So, what would the number of file descriptors be when using POSIX
semaphores?

Thanks!


Re: [HACKERS] WIP patch for parallel pg_dump

2010-12-06 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 However, if you were doing something like parallel pg_dump you could
 just run the parent and child instances all against the slave, so the
 pg_dump scenario doesn't seem to offer much of a supporting use-case for
 worrying about this.  When would you really need to be able to do it?

 If you had several standbys, you could distribute the work of the
 pg_dump among them.  This would be a huge speedup for a large database,
 potentially, thanks to parallelization of I/O and network.  Imagine
 doing a pg_dump of a 300GB database in 10min.

That does sound kind of attractive.  But to do that I think we'd have to
go with the pass-the-snapshot-through-the-client approach.  Shipping
internal snapshot files through the WAL stream doesn't seem attractive
to me.
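
Concretely, I'd envision the client-side flow looking something like this
(a sketch; pg_export_snapshot() and SET TRANSACTION SNAPSHOT are
hypothetical interfaces, named here only for illustration, and error
checking is omitted):

    #include <stdio.h>
    #include "libpq-fe.h"

    /* parent session publishes its snapshot */
    PGconn     *parent = PQconnectdb("dbname=test");
    PGresult   *res;
    char        setcmd[128];

    PQexec(parent, "BEGIN ISOLATION LEVEL SERIALIZABLE");
    res = PQexec(parent, "SELECT pg_export_snapshot()");
    snprintf(setcmd, sizeof(setcmd),
             "SET TRANSACTION SNAPSHOT '%s'", PQgetvalue(res, 0, 0));
    PQclear(res);

    /* each worker adopts that snapshot before it starts dumping */
    PGconn     *worker = PQconnectdb("dbname=test");
    PQexec(worker, "BEGIN ISOLATION LEVEL SERIALIZABLE");
    PQexec(worker, setcmd);

The snapshot identifier stays an opaque token to the client; it never has
to interpret the contents.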

While I see Robert's point about preferring not to expose the snapshot
contents to clients, I don't think it outweighs all other considerations
here; and every other one is pointing to doing it the other way.

regards, tom lane



Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+

2010-12-06 Thread Josh Berkus
On 12/5/10 2:12 PM, Greg Smith wrote:
 Josh Berkus wrote:
 I modified test_fsync in two ways to run this; first, to make it support
 O_DIRECT, and second to make it run in the *current* directory.
 
 Patch please?  I agree with the latter change; what test_fsync does is
 surprising.

Attached.

Making it support O_DIRECT would be possible but more complex; I don't
see the point unless we think we're going to have open_sync_with_odirect
as a separate option.

 I suggested a while ago that we refactor test_fsync to share a common set
 of source code with the database itself for detecting things related to
 wal_sync_method, perhaps just extract that whole set of DEFINE macro
 logic to somewhere else.  That happened at a bad time in the development
 cycle (right before a freeze) and nobody ever got back to the idea
 afterwards.  If this code is getting touched, and it's clear it is in
 some direction, I'd like to see things change so it's not possible for
 the two to diverge again afterwards.

I don't quite follow you.  Maybe nobody else did last time, either.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com
diff --git a/src/tools/fsync/test_fsync.c b/src/tools/fsync/test_fsync.c
index 28c2119..12a83e1 100644
*** a/src/tools/fsync/test_fsync.c
--- b/src/tools/fsync/test_fsync.c
***
*** 23,34 
  #include string.h
  
  
! #ifdef WIN32
  #define FSYNC_FILENAME	./test_fsync.out
- #else
- /* /tmp might be a memory file system */
- #define FSYNC_FILENAME	/var/tmp/test_fsync.out
- #endif
  
  #define WRITE_SIZE	(8 * 1024)	/* 8k */
  
--- 23,32 
  #include string.h
  
  
! /* put the temp files in the local directory
!this is a change from older versions which used
!/var/tmp */
  #define FSYNC_FILENAME	./test_fsync.out
  
  #define WRITE_SIZE	(8 * 1024)	/* 8k */
  



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Steve Singer

On 10-12-06 06:56 PM, Greg Smith wrote:

Tom Lane wrote:

The various testing that's been reported so far is all for
Linux and thus doesn't directly address the question of whether other
kernels will have similar performance properties.


Survey of some popular platforms:



snip


So my guess is that some small percentage of Windows users might notice
a change here, and some testing on FreeBSD would be useful too. That's
about it for platforms that I think anybody needs to worry about.


If you tell me which options to pgbench and which .conf file settings 
you'd like to see I can probably arrange to run some tests on AIX.






--
Greg Smith   2ndQuadrant US   g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support   www.2ndQuadrant.us
PostgreSQL 9.0 High Performance: http://www.2ndQuadrant.com/books






Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Tom Lane
Greg Smith g...@2ndquadrant.com writes:
 So my guess is that some small percentage of Windows users might notice 
 a change here, and some testing on FreeBSD would be useful too.  That's 
 about it for platforms that I think anybody needs to worry about.

To my mind, O_DIRECT is not really the key issue here; it's whether to
prefer O_DSYNC or fdatasync.  I looked back in the archives, and I think
that the main reason we prefer O_DSYNC when available is the results
I got here:

http://archives.postgresql.org/pgsql-hackers/2001-03/msg00381.php

which demonstrated a performance benefit on HPUX 10.20, though with a
test tool much more primitive than test_fsync.  I still have that
machine, although the disk that was in it at the time died a while back.
What's in there now is a Seagate ST336607LW spinning at 10,000 RPM (166
rev/sec), and today I get numbers like this from test_fsync:

Simple write:
8k write  28331.020/second

Compare file sync methods using one write:
open_datasync 8k write  161.190/second
open_sync 8k write  156.478/second
8k write, fdatasync  54.302/second
8k write, fsync  51.810/second

Compare file sync methods using two writes:
2 open_datasync 8k writes        81.702/second
2 open_sync 8k writes            80.172/second
8k write, 8k write, fdatasync    40.829/second
8k write, 8k write, fsync        39.836/second

Compare open_sync with different sizes:
open_sync 16k write              80.192/second
2 open_sync 8k writes            78.018/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
8k write, fsync, close   52.527/second
8k write, close, fsync   54.092/second

So *on that rather ancient platform* there's a measurable performance
benefit to O_DSYNC, but this seems to be largely because fdatasync is
stubbed to fsync in userspace rather than because fdatasync wouldn't
be a better idea in the abstract.  Also, a lot of the argument against
fsync at the time was that it forced the kernel to iterate through all
the buffers for the WAL file to see if any were dirty.  I would imagine
that modern kernels are a tad smarter about that; and even if they
aren't, the CPU speed versus disk speed tradeoff has changed enough
since 2001 that iterating through 16MB of buffers isn't as interesting
as it was then.

So to my mind, switching to the preference order fdatasync,
fsync_writethrough, fsync seems like the thing to do.  Since we assume
fsync is always available, that means that O_DSYNC/O_SYNC will not be
the defaults on any platform.
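
In terms of the defaults, that's just a different ladder (a sketch; cf.
the existing macros in src/include/access/xlogdefs.h, and assuming we
reuse the HAVE_FSYNC_WRITETHROUGH symbol the backend already has for
Windows and Darwin):

    #if defined(HAVE_FDATASYNC)
    #define DEFAULT_SYNC_METHOD     SYNC_METHOD_FDATASYNC
    #elif defined(HAVE_FSYNC_WRITETHROUGH)
    #define DEFAULT_SYNC_METHOD     SYNC_METHOD_FSYNC_WRITETHROUGH
    #else
    #define DEFAULT_SYNC_METHOD     SYNC_METHOD_FSYNC
    #endif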

regards, tom lane



Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+

2010-12-06 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Making it support O_DIRECT would be possible but more complex; I don't
 see the point unless we think we're going to have open_sync_with_odirect
 as a separate option.

Whether it's complex or not isn't really the issue.  The issue is that
what test_fsync is testing had better match what the backend does, or
people will be making choices based on not-comparable test results.
I think we should have test_fsync just automatically fold in O_DIRECT
the same way the backend does.
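
That is, something like this sketch, using the backend's PG_O_DIRECT
convention and degrading gracefully instead of aborting:

    #ifdef O_DIRECT
    #define PG_O_DIRECT     O_DIRECT
    #else
    #define PG_O_DIRECT     0
    #endif

    /* open for the open_sync/open_datasync tests; those are the only
     * methods the backend combines with O_DIRECT */
    static int
    open_sync_fd(const char *path, int sync_flag)
    {
        int     fd = open(path, O_RDWR | sync_flag | PG_O_DIRECT, 0);

        if (fd < 0 && PG_O_DIRECT != 0)
        {
            printf("(O_DIRECT not usable here, retrying without it)\n");
            fd = open(path, O_RDWR | sync_flag, 0);
        }
        return fd;
    }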

regards, tom lane



Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+

2010-12-06 Thread Andrew Dunstan



On 12/06/2010 08:38 PM, Tom Lane wrote:

Josh Berkusj...@agliodbs.com  writes:

Making it support O_DIRECT would be possible but more complex; I don't
see the point unless we think we're going to have open_sync_with_odirect
as a separate option.

Whether it's complex or not isn't really the issue.  The issue is that
what test_fsync is testing had better match what the backend does, or
people will be making choices based on not-comparable test results.
I think we should have test_fsync just automatically fold in O_DIRECT
the same way the backend does.




Indeed. We were quite confused for a while when we were dealing with 
this about a week ago, and my handwritten test program failed as 
expected but test_fsync didn't. Anything other than behaving just as the 
backend does violates POLA, in my view.


cheers

andrew



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Josh Berkus
Steve,

 If you tell me which options to pgbench and which .conf file settings
 you'd like to see I can probably arrange to run some tests on AIX.

Compile and run test_fsync in PGSRC/src/tools/fsync.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] Timeout for asynchronous replication Re: Timeout and wait-forever in sync rep

2010-12-06 Thread Fujii Masao
On Mon, Dec 6, 2010 at 11:54 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 Hmm, that's actually a quite different timeout than what's required for
 synchronous replication. In synchronous replication, you need to get an
 acknowledgment within a timeout. This patch only puts a timeout on how long
 we wait to have enough room in the TCP send buffer. That doesn't seem all
 that useful.

Yeah, I'm planning to implement that timeout for synchronous replication later.
Since I thought that we should implement the timeout for *asynchronous*
replication first and then extend it for synchronous replication, I created this
patch. This kind of timeout is required for asynchronous replication because
there is no acknowledgement from the standby in that mode.

Most of the patch implements the non-blocking send function and
changes walsender to use it instead of the existing blocking one.
This will be infrastructure for the synchronous-replication timeout.
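
The send path itself is just the standard non-blocking pattern, roughly
(a simplified sketch, assuming the socket has already been put into
non-blocking mode; the real patch ties into walsender's existing buffer
management and error reporting):

    #include <errno.h>
    #include <sys/select.h>
    #include <sys/socket.h>

    static int
    send_with_timeout(int sock, const char *buf, size_t len, int timeout_sec)
    {
        size_t  sent = 0;

        while (sent < len)
        {
            ssize_t n = send(sock, buf + sent, len - sent, 0);

            if (n >= 0)
            {
                sent += n;
                continue;
            }
            if (errno == EINTR)
                continue;
            if (errno != EAGAIN && errno != EWOULDBLOCK)
                return -1;          /* hard error */

            /* send buffer full: wait for room, up to the timeout */
            {
                fd_set          wfds;
                struct timeval  tv = {timeout_sec, 0};

                FD_ZERO(&wfds);
                FD_SET(sock, &wfds);
                if (select(sock + 1, NULL, &wfds, NULL, &tv) <= 0)
                    return -1;      /* timed out, or select() failed */
            }
        }
        return 0;
    }

(A connection that trickles just enough to keep passing select() restarts
the clock each round; tracking an absolute deadline would fix that.)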

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Josh Berkus

 Mac OS X:  Like Solaris, there's a similar mechanism but it's not
 O_DIRECT; see
 http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag
 for notes about the F_NOCACHE  feature used.  Same basic situation as
 Solaris; there's an API, but PostgreSQL doesn't use it yet.

Actually, on OS X 10.5.8, O_DSYNC and fdatasync aren't even available.
From my run, it looks like plain fsync might even be better than
open_sync.  Results from a MacBook:

Sidney-Stratton:fsync josh$ ./test_fsync
Loops = 1

Simple write:
8k write   2121.004/second

Compare file sync methods using one write:
(open_datasync unavailable)
open_sync 8k write 1993.833/second
(fdatasync unavailable)
8k write, fsync1878.154/second

Compare file sync methods using two writes:
(open_datasync unavailable)
2 open_sync 8k writes  1005.009/second
(fdatasync unavailable)
8k write, 8k write, fsync  1709.862/second

Compare open_sync with different sizes:
open_sync 16k write1728.803/second
2 open_sync 8k writes   969.416/second

Test if fsync on non-write file descriptor is honored:
(If the times are similar, fsync() can sync data written
on a different descriptor.)
8k write, fsync, close 1772.572/second
8k write, close, fsync 1939.897/second


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+

2010-12-06 Thread Josh Berkus

 Whether it's complex or not isn't really the issue.  The issue is that
 what test_fsync is testing had better match what the backend does, or
 people will be making choices based on not-comparable test results.
 I think we should have test_fsync just automatically fold in O_DIRECT
 the same way the backend does.

OK, patch coming then.  Right now test_fsync aborts when O_DIRECT fails.
 What should I have it do instead?

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus j...@agliodbs.com wrote:

 Mac OS X:  Like Solaris, there's a similar mechanism but it's not
 O_DIRECT; see
 http://stackoverflow.com/questions/2299402/how-does-one-do-raw-io-on-mac-os-x-ie-equivalent-to-linuxs-o-direct-flag
 for notes about the F_NOCACHE  feature used.  Same basic situation as
 Solaris; there's an API, but PostgreSQL doesn't use it yet.

 Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available.
 From my run, it looks like even so regular fsync might be better than
 open_sync.

But I think you need to use fsync_writethrough if you actually want durability.
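
On that platform fsync_writethrough boils down to the F_FULLFSYNC fcntl,
which forces the data through the drive's write cache as well -- roughly
(cf. pg_fsync_writethrough in the backend):

    #include <fcntl.h>

    static int
    my_fsync_writethrough(int fd)
    {
        return (fcntl(fd, F_FULLFSYNC, 0) < 0) ? -1 : 0;
    }

The cheap-looking numbers above reflect fsync() returning once the data
merely reaches the drive's cache.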

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] profiling connection overhead

2010-12-06 Thread Robert Haas
On Mon, Dec 6, 2010 at 2:47 PM, Josh Berkus j...@agliodbs.com wrote:

 Please explain more precisely what is wrong with SET SESSION
 AUTHORIZATION / SET ROLE.

 1) Session GUCs do not change with a SET ROLE (this is a TODO I haven't
 had any time to work on)

 2) Users can always issue their own SET ROLE and then hack into other
 users' data.

Makes sense.  It would be nice to fix those issues, independent of
anything else.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus j...@agliodbs.com wrote:
 Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available.
 From my run, it looks like even so regular fsync might be better than
 open_sync.

 But I think you need to use fsync_writethrough if you actually want 
 durability.

Yeah.  Unless your laptop contains an SSD, those numbers are garbage on
their face.  So that's another problem with test_fsync: it omits
fsync_writethrough.

regards, tom lane



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Josh Berkus
On 12/6/10 6:10 PM, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Dec 6, 2010 at 9:04 PM, Josh Berkus j...@agliodbs.com wrote:
 Actually, on OSX 10.5.8, o_dsync and fdatasync aren't even available.
 From my run, it looks like even so regular fsync might be better than
 open_sync.
 
 But I think you need to use fsync_writethrough if you actually want 
 durability.
 
 Yeah.  Unless your laptop contains an SSD, those numbers are garbage on
 their face.  So that's another problem with test_fsync: it omits
 fsync_writethrough.

Yeah, the issue with test_fsync appears to be that it's designed to work
without OS-specific switches no matter what, not to accurately reflect
how we access WAL.

I'll see if I can do better.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] [PATCH] Revert default wal_sync_method to fdatasync on Linux 2.6.33+

2010-12-06 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 OK, patch coming then.  Right now test_fsync aborts when O_DIRECT fails.
  What should I have it do instead?

Report that it fails, and keep testing the other methods.

regards, tom lane



Re: [HACKERS] Spread checkpoint sync

2010-12-06 Thread Alvaro Herrera
Excerpts from Greg Smith's message of dom dic 05 20:02:48 -0300 2010:

 What ends up happening if you push toward fully sync I/O is the design 
 you see in some other databases, where you need multiple writer 
 processes.  Then requests for new pages can continue to allocate as 
 needed, while keeping any one write from blocking things.  That's one 
 sort of a way to simulate asynchronous I/O, and you can substitute true 
 async I/O instead in many of those implementations.  We didn't have much 
 luck with portability on async I/O when that was last experimented with, 
 and having multiple background writer processes seems like overkill; 
 that whole direction worries me.

Why would multiple bgwriter processes worry you?

Of course, it wouldn't work to have multiple processes trying to execute
a checkpoint simultaneously, but what if we separated the tasks so that
one process is in charge of checkpoints, and another one is in charge of
the LRU scan?

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] We really ought to do something about O_DIRECT and data=journalled on ext4

2010-12-06 Thread Josh Berkus
All,

Geirth's results from his FreeBSD 7.1 server using 8.4's test_fsync:

Simple write timing:
write                    0.007081

Compare fsync times on write() and non-write() descriptor:
If the times are similar, fsync() can sync data written
on a different descriptor.
write, fsync, close  5.937933
write, close, fsync  8.056394

Compare one o_sync write to two:
one 16k o_sync write     7.366927
two 8k o_sync writes    15.299300

Compare file sync methods with one 8k write:
(o_dsync unavailable)
open o_sync, write   7.512682
(fdatasync unavailable)
write, fsync 5.856480

Compare file sync methods with two 8k writes:
(o_dsync unavailable)
open o_sync, write  15.472910
(fdatasync unavailable)
write, fsync 5.880319


... again, open_sync does not look very impressive.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] profiling connection overhead

2010-12-06 Thread Alvaro Herrera
Excerpts from Robert Haas's message of lun dic 06 23:09:56 -0300 2010:
 On Mon, Dec 6, 2010 at 2:47 PM, Josh Berkus j...@agliodbs.com wrote:
 
  Please explain more precisely what is wrong with SET SESSION
  AUTHORIZATION / SET ROLE.
 
  1) Session GUCs do not change with a SET ROLE (this is a TODO I haven't
  had any time to work on)
 
  2) Users can always issue their own SET ROLE and then hack into other
  users' data.
 
 Makes sense.  It would be nice to fix those issues, independent of
 anything else.

It seems plausible to fix the first one, but how would you fix the
second one?  You either allow SET ROLE (which you need, to support the
pooler changing authorization), or you don't.  There doesn't seem to be
a usable middle ground.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] wal_sender_delay is still required?

2010-12-06 Thread Fujii Masao
On Tue, Dec 7, 2010 at 12:08 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Fujii Masao masao.fu...@gmail.com writes:
 Walsender doesn't need the periodic wakeups anymore, thanks to
 the latch feature. So wal_sender_delay is basically useless now.
 How about dropping wal_sender_delay or increasing the default
 value?

 If we don't need it, we should remove it.

The attached patch removes wal_sender_delay and instead uses a hard-coded
10 seconds as the delay between activity rounds for walsender.

One problem with the patch is that it takes longer (at most 10s) to
detect the unexpected death of the postmaster (by calling PostmasterIsAlive()).
This is OK with me. But does anyone want a configurable delay so that
this can be detected sooner?
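
For reference, the activity loop ends up shaped roughly like this (a
sketch, not the patch itself; the WaitLatch timeout unit -- microseconds
here -- is an assumption about the latch API):

    for (;;)
    {
        ResetLatch(&MyWalSnd->latch);

        if (!PostmasterIsAlive(true))
            exit(1);                /* now noticed up to ~10s late */

        /* ... send whatever WAL has become available ... */

        /* sleep until new WAL arrival sets the latch, or 10s elapse */
        WaitLatch(&MyWalSnd->latch, 10000000L);
    }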

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


drop_wal_sender_delay_v1.patch
Description: Binary data



Re: [HACKERS] Per-column collation

2010-12-06 Thread Itagaki Takahiro
On Sun, Dec 5, 2010 at 01:04, Peter Eisentraut pete...@gmx.net wrote:
 Here is an updated patch to address the issues discussed during this
 commitfest.

I found another issue in the patch: ILIKE in a WHERE clause doesn't work.
It was surprising because LIKE in a WHERE clause and ILIKE in a SELECT
list both work as expected:
 - SELECT * FROM pg_class WHERE relname LIKE 'pg%';
 - SELECT relname ILIKE 'pg%' FROM pg_class;


postgres=# SELECT name, setting FROM pg_settings
 WHERE name IN ('lc_ctype', 'lc_collate', 'server_encoding');
  name   | setting
-+-
 lc_collate  | C
 lc_ctype| C
 server_encoding | UTF8
(3 rows)

postgres=# SELECT * FROM pg_class WHERE relname ILIKE 'pg%';
ERROR:  no collation was derived


-- 
Itagaki Takahiro



  1   2   >