Re: [HACKERS] add label to enum syntax

2010-10-26 Thread Dean Rasheed
On 25 October 2010 21:31, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 LABEL is already an unreserved keyword, and I'm pretty sure that's all
 we'll need.

 The only reason it's a keyword is the SECURITY LABEL patch that went
 in a month or so ago; which is syntax that might still be thought
 better of before it gets to a release.

 But I seem to be in the minority, so I'll shut up now.

                        regards, tom lane


In mathematics (and I think also computer science), the term
conventionally used to refer to the things in an enumeration is
"element", so how about ADD ELEMENT?

The label is just one of the ways of identifying the element, and the
value is the element's OID. The thing you're adding is an element, with
both a label and a value.
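
For concreteness, the competing spellings would look something like this
(illustrative only; the type and value names are invented, and neither form
is committed):

    ALTER TYPE mood ADD LABEL 'sad';    -- reuses the existing unreserved keyword
    ALTER TYPE mood ADD ELEMENT 'sad';  -- would require a new ELEMENT keyword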

Regards,
Dean






Re: [HACKERS] Composite Types and Function Parameters

2010-10-26 Thread Greg
Hi Merlin, I completely forgot about hstore! I'll give it a go. Thanks!






From: Merlin Moncure mmonc...@gmail.com
To: Greg grigo...@yahoo.co.uk
Cc: Pavel Stehule pavel.steh...@gmail.com; pgsql-hackers@postgresql.org
Sent: Mon, 25 October, 2010 23:52:55
Subject: Re: [HACKERS] Composite Types and Function Parameters

On Mon, Oct 25, 2010 at 6:38 PM, Greg grigo...@yahoo.co.uk wrote:

 Hi Pavel, thanks! Yeah, that's what I thought. I have to have a custom type
 or a very ugly looking solution for passing the params then.

 To the Postgres dev team: if anyone involved in Postgres development is
 reading this, just a feature suggestion: allow an array that can accept a
 combination of any data types to be passed to a function, for example:
   // declare
   create function TEST ( anytypearray[] ) ...
   // calling
   perform TEST (array[bool, int, etc.] )
 This would make such a nice addition to Postgres, although it may be
 complicated to achieve.

probably hstore would be more appropriate for something like that.
You can also declare functions taking composite arrays, anyarray,
variadic array, and variadic any, although the latter requires
function implementation in C to get the most out of it.
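
Roughly, the hstore route looks like this (a sketch assuming contrib/hstore
is installed; the function and key names are invented):

    -- Pack heterogeneous parameters into one hstore value; everything is
    -- text inside, so the function casts each value back as needed.
    CREATE FUNCTION test(params hstore) RETURNS void AS $$
    DECLARE
        flag boolean := (params -> 'flag')::boolean;
        cnt  integer := (params -> 'cnt')::integer;
    BEGIN
        RAISE NOTICE 'flag = %, cnt = %', flag, cnt;
    END;
    $$ LANGUAGE plpgsql;

    SELECT test('flag => true, cnt => 42'::hstore);

Anything that needs true mixed-type behavior without going through text still
points at variadic "any" and a C-level implementation, as noted above.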

merlin



  

Re: [HACKERS] Extensible executor nodes for preparation of SQL/MED

2010-10-26 Thread Itagaki Takahiro
Here is a WIP patch for extensible executor nodes.

It replaces the large switch blocks with function-pointer calls (renamed to
PlanProcs from VTable). It gives a small performance win (please test it!)
and encapsulates some details of the executor nodes, but it is not perfect.

The infrastructure might be used by SQL/MED, but then not only PlanState
but also Plan and Path would need to be customizable. A complete cleanup
would be difficult, but I'm trying to find common ground between the existing
Postgres implementation and an extensible planner and executor.
Comments and suggestions welcome.


On Tue, Oct 26, 2010 at 12:21 PM, Itagaki Takahiro
itagaki.takah...@gmail.com wrote:
 I wasn't aiming for performance, but there is a small but measurable win
 if the indirections are avoided. We might not always need to copy
 a whole vtable into the planstate; ExecProcNode alone might be enough.
 I'll continue the research.

 24957.767 ms : master (a big switch)
 25059.838 ms : two indirections (planstate->plan->vtable->fn)
 24819.298 ms : one indirection (planstate->plan->vtable.fn)
 24118.436 ms : direct call (planstate->vtable.fn)

 So, the major benefits of the change would be performance and code refactoring.
 Does anyone have comments about the functionality? It might also be
 used by SQL/MED and executor hooks, but I have no specific idea yet.


-- 
Itagaki Takahiro


extensible_execnodes-20101026.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Tab completion for view triggers in psql

2010-10-26 Thread Dean Rasheed
On 25 October 2010 21:01, David Fetter da...@fetter.org wrote:
 Folks,

 Please find attached patch for $subject :)


Thanks for looking at this. I forgot about tab completion.

I think that the change to ALTER TRIGGER is not necessary. AFAICT it
works OK unmodified. In fact, the modified code here:

*** 971,977  psql_completion(char *text, int start, int end)
  	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
  			 pg_strcasecmp(prev3_wd, "TRIGGER") == 0 &&
  			 pg_strcasecmp(prev_wd, "ON") == 0)
! 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
  
  	/* ALTER TRIGGER <name> ON <name> */
  	else if (pg_strcasecmp(prev4_wd, "TRIGGER") == 0 &&
--- 1055,1061 
  	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
  			 pg_strcasecmp(prev3_wd, "TRIGGER") == 0 &&
  			 pg_strcasecmp(prev_wd, "ON") == 0)
! 		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_writeables, NULL);
  
  	/* ALTER TRIGGER <name> ON <name> */
  	else if (pg_strcasecmp(prev4_wd, "TRIGGER") == 0 &&

appears to be unreachable, because it is preceded by

	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
			 pg_strcasecmp(prev3_wd, "TRIGGER") == 0)
	{
		completion_info_charp = prev2_wd;
		COMPLETE_WITH_QUERY(Query_for_list_of_tables_for_trigger);
	}

which works for tables and views, and makes the next elseif
impossible to satisfy. So I think that block could just be deleted,
right?

Regards,
Dean


 Cheers,
 David.
 --
 David Fetter da...@fetter.org http://fetter.org/
 Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
 Skype: davidfetter      XMPP: david.fet...@gmail.com
 iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

 Remember to vote!
 Consider donating to Postgres: http://www.postgresql.org/about/donate







Re: [HACKERS] Extensions, this time with a patch

2010-10-26 Thread Dimitri Fontaine

Le 25 oct. 2010 à 17:26, Alvaro Herrera a écrit :
 Ah, some reading of the patch reveals that the script defaults to the
 control file name, but it can be overridden.

Yes, that's new in v10. In v11 I've kept that and removed the name property in 
the control file, so that we have:

cat contrib/intarray/intarray.control.in 
# intarray
comment = 'one-dimensional arrays of integers: functions, operators, index 
support'
version = 'EXTVERSION'
script = '_int.sql'
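
With that in place, installing the extension into a database is meant to
come down to the single command the patch implements (sketch):

    CREATE EXTENSION intarray;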


 I noticed that you're using ExclusiveLock when creating an extension,
 citing the handling of the global variable create_extension for this.
 There are two problems here: one is that you're releasing the lock way
 too early: if you wanted this to be effective, you'd need to hold on to
 the lock until after you've registered the extension.
 
 The other is that there is no need for this at all, because this backend
 cannot be concurrently running another CREATE EXTENSION command, and
 this is a backend-local variable.  So there's no point.

I wanted to protect against another backend trying to create the same extension
at the same time, so the critical path is the insertion into the catalog. I now
see that I failed to include the duplicate-object check in the critical path
when I added it later.

Do you confirm that protecting the insertion into the catalog is not worth
special locking? Getting the locking right requires more thought than I put in,
but if you say so, I'd better remove it...

 Why palloc create_extension every time?  Isn't it better to initialize
 it properly and have a boolean value telling whether it's to be used?
 Also, if an extension fails partway through creation, the var will be
 left set.  I think you need a PG_TRY block to reset it.

Good catches. I'm still uneasy about which memory context each allocation
belongs to, hence the palloc()ing here.

 (I find the repeated coding pattern that tests create_extension for
 NULL-ness before calling recordDependencyOn a bit awkward; maybe hide it
 in a function or macro?  But then maybe that's just me.  Also, why
 palloc it?  Seems better to have it static.  Notice your new calls to
 recordDependencyOn are the only ones with operands not using the 
 operator.)

In fact the goal of the test is to check whether we're in the code path for
CREATE EXTENSION, rather than pointer validity per se. I'll make it static,
too, with a bool to determine the code path.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

PS: I hope to be able to send this email, but uploading the git repo will be
difficult from the wifi here, at best. Will send patches if email works.




Re: [HACKERS] SQL/MED with simple wrappers

2010-10-26 Thread Shigeru HANADA
Thanks for your comments.

On Mon, 25 Oct 2010 15:05:51 +0200
Pavel Stehule pavel.steh...@gmail.com wrote:
  4) List of foreign connections
  Users (especially DBAs?) might want to see list of foreign connections.
  Currently postgresql_fdw provides its own connection list via
  postgresql_fdw_connections view.  Common view such as
  pg_foreign_connections would be needed?  If so, function which returns
  list of active connections would be necessary in FDW API.
 
 
 + list of foreign tables?

I've implemented that functionality in several places.

1) The \det psql command shows the list of foreign tables in a format like
   \dew and \des.
2) The pg_foreign_tables catalog shows pairs of OIDs (relation oid and
   server oid) and options in raw format.
3) The information_schema views foreign_tables and foreign_table_options
   show information about foreign tables in SQL-standard format.

Here are the details of the \det psql command.

The \det psql command (following the naming of \des/\dew) shows the list of
foreign tables, and \det with a pattern shows the list of foreign tables
whose names match the pattern.  For example, with file_fdw:

postgres=# \det
   List of foreign tables
    Table     |   Server
--------------+-------------
 csv_accounts | file_server
 csv_branches | file_server
 csv_history  | file_server
 csv_tellers  | file_server
(4 rows)

Adding the + suffix shows per-table generic options too.

postgres=# \det+
                                   List of foreign tables
    Table     |   Server    |                             Options
--------------+-------------+-----------------------------------------------------------------
 csv_accounts | file_server | {format=csv,filename=/home/hanada/DB/CSV/pgbench_accounts.csv}
 csv_branches | file_server | {format=csv,filename=/home/hanada/DB/CSV/pgbench_branches.csv}
 csv_history  | file_server | {format=csv,filename=/home/hanada/DB/CSV/pgbench_history.csv}
 csv_tellers  | file_server | {format=csv,filename=/home/hanada/DB/CSV/pgbench_tellers.csv}
(4 rows)

I have chosen the \det+ command to show per-table options because \d+
already has many columns and it seems difficult to add such long
values there.

In addition to \det, a \dec command would be necessary to show per-column
options along with the columns.  It hasn't been implemented yet, though.
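
For reference, the foreign tables in the \det examples above would be defined
with DDL along these lines (a sketch using the SQL/MED syntax under
development; the column list is abbreviated):

    CREATE SERVER file_server FOREIGN DATA WRAPPER file_fdw;

    CREATE FOREIGN TABLE csv_accounts (
        aid      integer,
        bid      integer,
        abalance integer,
        filler   char(84)
    ) SERVER file_server
      OPTIONS (format 'csv', filename '/home/hanada/DB/CSV/pgbench_accounts.csv');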

  5) Routine mapping
  If a function in local query can be evaluated on the remote side in
  same semantics, it seems efficient to push the function down to the
  remote side.  But how can we know whether the function can be pushed
  down or not?  For such purpose, SQL/MED standard defines routine
  mapping.  Should we implement routine mapping?
 
 
 is it related to aggregate functions? If yes, this it can be really
 significant help

Yes.  I was thinking only about normal functions in the original post,
though.  To push an aggregate function down to the remote side, the FDW
would need an additional planner hook to merge the Aggregate node into the
ForeignScan node.  Such a planner hook might be able to handle Order or
Limit nodes too.

  7) Using cursor in postgresql_fdw
  postgresql_fdw fetches all of the result tuples from the foreign
  server in the first pgIterate() call.  It could cause out-of-memory if
  the result set was huge.  If libpq supports protocol-level cursor,
  postgresql_fdw will be able to divide result set into some sets and
  lower the usage of memory.  Or should we use declare implicit cursor
  with DECLARE statement?  One connection can be used by multiple
  ForeignScan nodes in a local query alternately.  An issue is that
  cursor requires implicit transaction block.  Is it OK to start
  transaction automatically?
 
 I don't know why DECLARE statement is problem? Can you explain it, please.

The most serious issue would be that SQL-level cursors require an
explicit transaction block.  To use SQL-level cursors with connections
shared between multiple ForeignScans, postgresql_fdw needs to manage
transaction state per connection.  In particular, recovering from errors
would make the code complex.  Or we could take the easiest way and simply
discard the connection on error.

I'll try to implement a cursor version of postgresql_fdw experimentally
to make this issue clearer.
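
For illustration, the remote-side traffic a cursor-based postgresql_fdw would
generate looks roughly like this (sketch; the cursor and table names are
invented), which is why the explicit transaction block cannot be avoided:

    BEGIN;
    DECLARE fdw_cursor CURSOR FOR
        SELECT aid, abalance FROM pgbench_accounts;
    FETCH 1000 FROM fdw_cursor;    -- repeated until no more rows come back
    CLOSE fdw_cursor;
    COMMIT;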

Regards,
--
Shigeru Hanada




Re: [HACKERS] Tab completion for view triggers in psql

2010-10-26 Thread David Fetter
On Tue, Oct 26, 2010 at 12:35:13PM +0100, Dean Rasheed wrote:
 On 25 October 2010 21:01, David Fetter da...@fetter.org wrote:
  Folks,
 
  Please find attached patch for $subject :)
 
 
 Thanks for looking at this. I forgot about tab completion.
 
 I think that the change to ALTER TRIGGER is not necessary. AFAICT it
 works OK unmodified. In fact, the modified code here:
 
 *** 971,977  psql_completion(char *text, int start, int end)
  	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
  			 pg_strcasecmp(prev3_wd, "TRIGGER") == 0 &&
  			 pg_strcasecmp(prev_wd, "ON") == 0)
  !		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
  
  	/* ALTER TRIGGER <name> ON <name> */
  	else if (pg_strcasecmp(prev4_wd, "TRIGGER") == 0 &&
  --- 1055,1061 
  	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
  			 pg_strcasecmp(prev3_wd, "TRIGGER") == 0 &&
  			 pg_strcasecmp(prev_wd, "ON") == 0)
  !		COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_writeables, NULL);
  
  	/* ALTER TRIGGER <name> ON <name> */
  	else if (pg_strcasecmp(prev4_wd, "TRIGGER") == 0 &&
 
 appears to be unreachable, because it is preceded by
 
 	else if (pg_strcasecmp(prev4_wd, "ALTER") == 0 &&
 			 pg_strcasecmp(prev3_wd, "TRIGGER") == 0)
 	{
 		completion_info_charp = prev2_wd;
 		COMPLETE_WITH_QUERY(Query_for_list_of_tables_for_trigger);
 	}

It is indeed unreachable.

 which works for tables and views, and makes the next elseif
 impossible to satisfy.  So I think that block could just be deleted,
 right?

Yes.  Good catch.  New patch attached :)

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
*** a/src/bin/psql/tab-complete.c
--- b/src/bin/psql/tab-complete.c
***
*** 303,308  static const SchemaQuery Query_for_list_of_tables = {
--- 303,375 
NULL
  };
  
+ /* The bit masks for the following three functions come from
+  * src/include/catalog/pg_trigger.h.
+  */
+ static const SchemaQuery Query_for_list_of_insertables = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind = 'r' OR (c.relkind = 'v' AND c.relhastriggers AND EXISTS "
+   "(SELECT 1 FROM pg_catalog.pg_trigger t WHERE t.tgrelid = c.oid AND t.tgtype | (1 << 2) = t.tgtype))",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   /* namespace */
+   "c.relnamespace",
+   /* result */
+   "pg_catalog.quote_ident(c.relname)",
+   /* qualresult */
+   NULL
+ };
+ 
+ static const SchemaQuery Query_for_list_of_deleteables = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind = 'r' OR (c.relkind = 'v' AND c.relhastriggers AND EXISTS "
+   "(SELECT 1 FROM pg_catalog.pg_trigger t WHERE t.tgrelid = c.oid AND t.tgtype | (1 << 3) = t.tgtype))",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   /* namespace */
+   "c.relnamespace",
+   /* result */
+   "pg_catalog.quote_ident(c.relname)",
+   /* qualresult */
+   NULL
+ };
+ 
+ static const SchemaQuery Query_for_list_of_updateables = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind = 'r' OR (c.relkind = 'v' AND c.relhastriggers AND EXISTS "
+   "(SELECT 1 FROM pg_catalog.pg_trigger t WHERE t.tgrelid = c.oid AND t.tgtype | (1 << 4) = t.tgtype))",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   /* namespace */
+   "c.relnamespace",
+   /* result */
+   "pg_catalog.quote_ident(c.relname)",
+   /* qualresult */
+   NULL
+ };
+ 
+ static const SchemaQuery Query_for_list_of_writeables = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind = 'r' OR (c.relkind = 'v' AND c.relhastriggers AND EXISTS "
+   "(SELECT 1 FROM pg_catalog.pg_trigger t WHERE t.tgrelid = c.oid AND (t.tgtype & ( (1 << 2) | (1 << 3) | (1 << 4)))::bool)",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   /* namespace */
+   "c.relnamespace",
+   /* result */
+   "pg_catalog.quote_ident(c.relname)",
+   /* qualresult */
+   NULL
+ };
+ 
  static const SchemaQuery Query_for_list_of_tisv = {
  	/* catname */
  	"pg_catalog.pg_class c",
***
*** 333,338  static const SchemaQuery Query_for_list_of_tsv = {
--- 400,420 
  	NULL
  };
  
+ static const SchemaQuery Query_for_list_of_tv = {
+   /* catname */
+   "pg_catalog.pg_class c",
+   /* selcondition */
+   "c.relkind IN ('r', 'v')",
+   /* viscondition */
+   "pg_catalog.pg_table_is_visible(c.oid)",
+   

Re: [HACKERS] Range Types, discrete and/or continuous

2010-10-26 Thread Hitoshi Harada
2010/10/26 Robert Haas robertmh...@gmail.com:
 On Mon, Oct 25, 2010 at 8:13 PM, Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2010-10-25 at 18:03 -0400, Robert Haas wrote:
 Hmm.  Do you have some concrete examples of cases where a range type
 might want to do some representational optimization?

 Let's say for instance you want to keep an timestamp range in 16 bytes.
 You could have an 8-byte timestamp, a 7-byte integer that represents the
 offset from that timestamp in microseconds, and one byte for flags (e.g.
 NULL or infinite boundaries, etc.). I'm not sure that you can make that
 representation work in a generic way.

 See, that gets complicated, because now you're restricting the range
 of values that can be expressed by the range type to something less
 than the natural range of the data type.  I am not sure the value of
 supporting that is sufficient to justify the amount of extra code that
 will be required to make it work.  I'd say for a first version, nail
 down the representation.  Perhaps in a future version you could have
 compress/uncompress methods sort of like GIST, but for a first cut it
 seems highly desirable to be able to say something like:

 CREATE TYPE int_range AS RANGE (BASETYPE = int);

 I hear you complaining that we don't know the values you've called
 dtype, cmpfunc, addfunc, and subfunc.  It seems pretty reasonable to
 extract cmpfunc, if unspecified, from the default btree opclass for
 the data type.  For the rest, I'm inclined to propose that we support
 something like:

 ALTER TYPE timestamptz
    ADD INTERFACE increment timestamp_pl_interval(timestamptz, interval),
    ADD INTERFACE decrement timestamp_mi_interval(timestamptz, interval);

 or

 ALTER TYPE integer
   ADD INTERFACE increment int4pl (integer, integer),
   ADD INTERFACE decrement int4mi (integer, integer),
   ADD VALUE addative_unit 1::integer,
   ADD VALUE addative_identity 0::integer;

 IIRC, Window functions need this information too, so there's value in
 associating it with the base type, even if we want to allow users to
 override it when creating ranges.

Yeah, window functions will require knowing "how to add or subtract a value"
of the ORDER BY expression's data type to find the window frame boundary when
we want to support the RANGE BETWEEN frame option. In the previous patch I
tried to retrieve that information by hard-coding '+'/'-' lookups in
pg_operator, but we found out that we need a more general way, like the
above. This will help. But I don't have a blueprint of the catalog
format for it yet, between KNN GiST and range types.
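
As an illustration, this is the kind of frame clause that needs that
knowledge, and that cannot be supported yet for non-integer ORDER BY types
(a would-be query; the table and column names are invented):

    SELECT ts,
           avg(val) OVER (ORDER BY ts
                          RANGE BETWEEN interval '1 hour' PRECEDING
                                AND CURRENT ROW) AS hourly_avg
    FROM measurements;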

Regards,


-- 
Hitoshi Harada



Re: [HACKERS] add label to enum syntax

2010-10-26 Thread Andrew Dunstan



On 10/26/2010 03:02 AM, Dean Rasheed wrote:

In mathematics (and I think also computer science), the term
conventionally used to refer to the things in an enumeration is
"element", so how about ADD ELEMENT?


Unlike the other suggestions, ELEMENT is not currently a keyword. That 
doesn't rule it out entirely, but it's a factor worth consideration.



The label is just one of the ways of identifying the element, and the
value is element's OID. The thing you're adding is an element, with
both a label and a value.



No, I think that's the wrong way of thinking about it entirely. The 
label *is* the value and the OID is simply an implementation detail, 
which for the most part we keep completely hidden from the user. We 
could have implemented enums in ways that did not involve OIDs at all, 
with identical semantics.


Notwithstanding the above, I don't think ELEMENT would be a very bad choice.

cheers

andrew



[HACKERS] Rollback sequence reset on TRUNCATE rollback patch

2010-10-26 Thread Steve Singer
The attached patch modifies TRUNCATE ... RESTART IDENTITY so that if the
transaction rolls back, the restart of the sequence is also rolled back.


It follows the general outline discussed at
http://archives.postgresql.org/pgsql-hackers/2008-05/msg00550.php of
assigning a new relfilenode to the sequence.
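
A quick way to see the behavior this changes (sketch; the table and data are
made up):

    CREATE TABLE foo (id serial PRIMARY KEY, t text);
    INSERT INTO foo (t) SELECT 'x' FROM generate_series(1, 100);

    BEGIN;
    TRUNCATE TABLE foo RESTART IDENTITY;
    ROLLBACK;

    -- Without the patch the sequence stays reset to 1 despite the rollback,
    -- inviting duplicate-key failures; with the patch last_value is back at 100.
    SELECT last_value FROM foo_id_seq;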



I will add this to the next commitfest.


Steve
diff --git a/doc/src/sgml/ref/truncate.sgml b/doc/src/sgml/ref/truncate.sgml
index f32d255..137eade 100644
*** a/doc/src/sgml/ref/truncate.sgml
--- b/doc/src/sgml/ref/truncate.sgml
*** TRUNCATE [ TABLE ] [ ONLY ] replaceable
*** 159,190 
 transaction does not commit.
/para
  
-   <warning>
-    <para>
-     Any <command>ALTER SEQUENCE RESTART</> operations performed as a
-     consequence of using the <literal>RESTART IDENTITY</> option are
-     nontransactional and will not be rolled back on failure.  To minimize
-     the risk, these operations are performed only after all the rest of
-     <command>TRUNCATE</>'s work is done.  However, there is still a risk
-     if <command>TRUNCATE</> is performed inside a transaction block that is
-     aborted afterwards.  For example, consider
  
- <programlisting>
- BEGIN;
- TRUNCATE TABLE foo RESTART IDENTITY;
- COPY foo FROM ...;
- COMMIT;
- </programlisting>
- 
-     If the <command>COPY</> fails partway through, the table data
-     rolls back correctly, but the sequences will be left with values
-     that are probably smaller than they had before, possibly leading
-     to duplicate-key failures or other problems in later transactions.
-     If this is likely to be a problem, it's best to avoid using
-     <literal>RESTART IDENTITY</>, and accept that the new contents of
-     the table will have higher serial numbers than the old.
-    </para>
-   </warning>
   /refsect1
  
   refsect1
--- 159,165 
diff --git a/src/backend/commands/sequence.c b/src/backend/commands/sequence.c
index 04b0c71..4fb9093 100644
*** a/src/backend/commands/sequence.c
--- b/src/backend/commands/sequence.c
***
*** 35,41 
  #include "utils/lsyscache.h"
  #include "utils/resowner.h"
  #include "utils/syscache.h"
! 
  
  /*
   * We don't want to log each fetching of a value from a sequence,
--- 35,41 
  #include "utils/lsyscache.h"
  #include "utils/resowner.h"
  #include "utils/syscache.h"
! #include "utils/snapmgr.h"
  
  /*
   * We don't want to log each fetching of a value from a sequence,
*** static void init_params(List *options, b
*** 96,101 
--- 96,104 
  static void do_setval(Oid relid, int64 next, bool iscalled);
  static void process_owned_by(Relation seqrel, List *owned_by);
  
+ static void init_seq_relation(Relation rel,TupleDesc tupDesc,Datum * value,
+ 			  bool * null,List * owned_by);
+ 
  
  /*
   * DefineSequence
*** DefineSequence(CreateSeqStmt *seq)
*** 109,119 
  	CreateStmt *stmt = makeNode(CreateStmt);
  	Oid			seqoid;
  	Relation	rel;
! 	Buffer		buf;
! 	Page		page;
! 	sequence_magic *sm;
! 	HeapTuple	tuple;
! 	TupleDesc	tupDesc;
  	Datum		value[SEQ_COL_LASTCOL];
  	bool		null[SEQ_COL_LASTCOL];
  	int			i;
--- 112,118 
  	CreateStmt *stmt = makeNode(CreateStmt);
  	Oid			seqoid;
  	Relation	rel;
! 	TupleDesc tupDesc;
  	Datum		value[SEQ_COL_LASTCOL];
  	bool		null[SEQ_COL_LASTCOL];
  	int			i;
*** DefineSequence(CreateSeqStmt *seq)
*** 210,217 
  
  	rel = heap_open(seqoid, AccessExclusiveLock);
  	tupDesc = RelationGetDescr(rel);
  
! 	/* Initialize first page of relation with special magic number */
  
  	buf = ReadBuffer(rel, P_NEW);
  	Assert(BufferGetBlockNumber(buf) == 0);
--- 209,293 
  
  	rel = heap_open(seqoid, AccessExclusiveLock);
  	tupDesc = RelationGetDescr(rel);
+ 	init_seq_relation(rel, tupDesc,value,null,owned_by);
+ 	heap_close(rel, NoLock);
+ }
  
! /**
!  * Resets the relation used by a sequence.
!  *
!  * The sequence is reset to its initial values,
!  * the old sequence is left in place in case the
!  * transaction rolls back.
!  */
! void ResetSequenceRelation(Oid seq_relid,List * options)
! {
! 	Relation seq_rel = relation_open(seq_relid,AccessExclusiveLock);
! 	SeqTable elm;
! 	Page page;
! 	Form_pg_sequence seq;
! 	Buffer buf;
! 	TupleDesc tupDesc;
! 	sequence_magic * sm;
! 	HeapTupleData tuple;
! 	ItemId lp;
! 	Datum * values;
! 	bool * isnull;
! 
! 	/**
! 	 * Read the old sequence.
! 	 *
! 	 */
! 	init_sequence(seq_relid,&elm,&seq_rel);
! 	seq = read_info(elm,seq_rel,&buf);
! 	page = BufferGetPage(buf);
! 	sm = (sequence_magic *) PageGetSpecialPointer(page);
! 
! 	if (sm->magic != SEQ_MAGIC)
! 		elog(ERROR, "bad magic number in sequence \"%s\": %08X",
! 			 RelationGetRelationName(seq_rel), sm->magic);
! 
! 	lp = PageGetItemId(page, FirstOffsetNumber);
! 	Assert(ItemIdIsNormal(lp));
! 
! 	tuple.t_data = (HeapTupleHeader) PageGetItem(page, lp);
! 	tupDesc = RelationGetDescr(seq_rel);
! 	values=palloc(sizeof(Datum)*tupDesc->natts);
! 	isnull=palloc(sizeof(bool)*tupDesc->natts);
! 	

Re: [HACKERS] add label to enum syntax

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 9:54 AM, Andrew Dunstan and...@dunslane.net wrote:


 On 10/26/2010 03:02 AM, Dean Rasheed wrote:

 In mathematics (and I think also computer science), the term
 conventionally used to refer to the things in an enumeration is
 "element", so how about ADD ELEMENT?

 Unlike the other suggestions, ELEMENT is not currently a keyword. That
 doesn't rule it out entirely, but it's a factor worth consideration.

 The label is just one of the ways of identifying the element, and the
 value is element's OID. The thing you're adding is an element, with
 both a label and a value.


 No, I think that's the wrong way of thinking about it entirely. The label
 *is* the value and the OID is simply an implementation detail, which for the
 most part we keep completely hidden from the user. We could have implemented
 enums in ways that did not involve OIDs at all, with identical semantics.

 Notwithstanding the above, I don't think ELEMENT would be a very bad choice.

I still think we should just go for LABEL and be done with it.  But
y'all can ignore me if you want...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] add label to enum syntax

2010-10-26 Thread Alvaro Herrera
Excerpts from Andrew Dunstan's message of mar oct 26 10:54:59 -0300 2010:

 On 10/26/2010 03:02 AM, Dean Rasheed wrote:
  In mathematics (and I think also computer science), the term
  conventionally used to refer to the things in an enumeration is
  "element", so how about ADD ELEMENT?
 
 Unlike the other suggestions, ELEMENT is not currently a keyword. That 
 doesn't rule it out entirely, but it's a factor worth consideration.

It can be added as an unreserved keyword, as in the attached patch.

I also like ELEMENT better than the other suggestions, so I'm gonna
commit this unless there are objections.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


0001-Change-syntax-to-add-a-new-enum-value-to-ALTER-TYPE-.patch
Description: Binary data



Re: [HACKERS] add label to enum syntax

2010-10-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Andrew Dunstan's message of mar oct 26 10:54:59 -0300 2010:
 Unlike the other suggestions, ELEMENT is not currently a keyword. That 
 doesn't rule it out entirely, but it's a factor worth consideration.

 It can be added as an unreserved keyword, as in the attached patch.

 I also like ELEMENT better than the other suggestions, so I'm gonna
 commit this unless there are objections.

I definitely vote *against* adding a new keyword for this, unreserved or
otherwise.  Every keyword we add bloats the bison parser by some
fractional amount, costing performance across the board.

While I'm not very thrilled with LABEL, it at least has the virtue that
we already paid the price for it.

regards, tom lane



Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread Jeff Janes
On Mon, Oct 25, 2010 at 6:31 PM, Robert Haas robertmh...@gmail.com wrote:
 On Fri, Oct 22, 2010 at 3:08 PM, fazool mein fazoolm...@gmail.com wrote:
 I'm writing a function that will read data from the buffer in xlog (i.e.
 from XLogCtl-pages and XLogCtl-xlblocks). I want to make sure that I am
 doing it correctly.
 For reading from the buffer, do I need to lock WALInsertLock or
 WALWriteLock? Also, can you explain a bit the usage of 'LW_SHARED'. Can we
 use it for read purposes?

 Holding WALInsertLock in shared mode prevents other processes from
 inserting WAL, or in other words it keeps the end position from
 moving, while holding WALWriteLock in shared mode prevents other
 processes from writing the WAL from the buffers out to the operating
 system, or in other words it keeps the start position from moving.
 So you could probably take WALInsertLock in shared mode, figure out
 the current end of WAL position, release the lock;

Once you release the WALInsertLock, someone else can grab it and
overwrite the part of the buffer you think you are reading.
So I think you have to hold WALInsertLock throughout the duration of
the operation.

Of course it couldn't be overwritten if the WAL record itself has not
yet been written from the buffer to the OS/disk.  But since you are not yet
holding the WALWriteLock, that could be happening at any time.

 then take
 WALWriteLock in shared mode, read any WAL before the end of WAL
 position, and release the lock.  But note that this wouldn't guarantee
 that you read all WAL as it's generated

I don't think that holding WALWriteLock accomplishes much.  It
prevents part of the buffer from being written out to OS/disk, and
thus becoming eligible for being overwritten in the buffer, but the
WALInsertLock prevents it from actually being overwritten.  And what
if the part of the buffer you want to read was already eligible for
overwriting but not yet actually overwritten?  WALWriteLock won't
allow you to safely access it, but WALInsertLock will (assuming you
have a safe way to identify the record in the first place).  For
either case, holding it in shared mode would be sufficient.


Jeff



Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread Alvaro Herrera
Excerpts from Jeff Janes's message of mar oct 26 12:22:38 -0300 2010:

 I don't think that holding WALWriteLock accomplishes much.  It
 prevents part of the buffer from being written out to OS/disk, and
 thus becoming eligible for being overwritten in the buffer, but the
 WALInsertLock prevents it from actually being overwritten.  And what
 if the part of the buffer you want to read was already eligible for
 overwriting but not yet actually overwritten?  WALWriteLock won't
 allow you to safely access it, but WALInsertLock will (assuming you
 have a safe way to identify the record in the first place).  For
 either case, holding it in shared mode would be sufficient.

And horrible for performance, I imagine.  Those locks are highly trafficked.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] add label to enum syntax

2010-10-26 Thread David E. Wheeler
On Oct 26, 2010, at 7:15 AM, Robert Haas wrote:

 Notwithstanding the above, I don't think ELEMENT would be a very bad choice.
 
 I still think we should just go for LABEL and be done with it.  But
 y'all can ignore me if you want...

+1

David




Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Jeff Janes's message of mar oct 26 12:22:38 -0300 2010:
 I don't think that holding WALWriteLock accomplishes much.  It
 prevents part of the buffer from being written out to OS/disk, and
 thus becoming eligible for being overwritten in the buffer, but the
 WALInsertLock prevents it from actually being overwritten.  And what
 if the part of the buffer you want to read was already eligible for
 overwriting but not yet actually overwritten?  WALWriteLock won't
 allow you to safely access it, but WALInsertLock will (assuming you
 have a safe way to identify the record in the first place).  For
 either case, holding it in shared mode would be sufficient.

 And horrible for performance, I imagine.  Those locks are highly trafficked.

I think you might actually need *both* locks to ensure the WAL buffers
aren't changing underneath you.  If you don't have the walwriter locked
out, it is free to change the state of a buffer from "dirty" to
"written" and then to "prepared to receive the next page of WAL".  If the
latter doesn't involve changing the content of the buffer today, it
still could tomorrow.

And on top of all that, there remains the problem that the piece of WAL
you want might already be gone from the buffers.

Might I suggest adopting the same technique walsender does, ie just read
the data back from disk?  There's a reason why we gave up trying to have
walsender read directly from the buffers.

regards, tom lane



Re: [HACKERS] foreign keys for array/period contains relationships

2010-10-26 Thread Peter Eisentraut
On mån, 2010-10-25 at 22:10 +0200, Pavel Stehule wrote:
 2010/10/25 Robert Haas robertmh...@gmail.com:
  Example #1: Foreign key side is an array, every member must match some
  PK.
 
  CREATE TABLE pk (a int PRIMARKY KEY, ...);
 
  CREATE TABLE fk (x int[] REFERENCES pk (a), ...);
 
 What about optimizations and planning?

Some work will be required to get the respective checking queries to do
the right thing fast, but it's solvable in principle.
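
The checking query for example #1 would be something along these lines (a
sketch over Robert's tables from above, with the nonexistent array REFERENCES
syntax left off; the naive form shown here is exactly what needs to be made
fast):

    CREATE TABLE pk (a int PRIMARY KEY);
    CREATE TABLE fk (x int[]);

    -- fk rows that would violate the constraint: some element has no match in pk
    SELECT f.*
    FROM fk f
    WHERE NOT (f.x <@ ARRAY(SELECT a FROM pk));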





Re: [HACKERS] foreign keys for array/period contains relationships

2010-10-26 Thread Peter Eisentraut
On mån, 2010-10-25 at 17:38 -0700, Jeff Davis wrote:
  Implementing the foreign key side of this merely requires the system
 to
  have some knowledge of the required contains operator, which it
 does
  in the array case, and something can surely be arranged for the
 range
  case.  The problem is you can't do cascading updates or deletes, but
 you
  could do on update/delete restrict, which is still useful.
 
 Why can't you do cascading updates/deletes?

Let's say you have

PK

1
2
3
4
5

FK

[1,2,3]
[3,4,5]
[4,4,4]

When you delete PK = 3, what do you  expect to happen?  OK, you might
decide to delete the first two rows from the FK table.  This might or
might not make sense in a particular case, but on delete cascade is an
option the user has to choose explicitly.  But I don't see what to do
about cascading an update when you, say, update PK 1 = 6.





Re: [HACKERS] Range Types, discrete and/or continuous

2010-10-26 Thread Jeff Davis
On Mon, 2010-10-25 at 21:07 -0400, Robert Haas wrote:
 See, that gets complicated, because now you're restricting the range
 of values that can be expressed by the range type to something less
 than the natural range of the data type.  I am not sure the value of
 supporting that is sufficient to justify the amount of extra code that
 will be required to make it work.  I'd say for a first version, nail
 down the representation.  Perhaps in a future version you could have
 compress/uncompress methods sort of like GIST,

OK, I can live with that. 

 ALTER TYPE timestamptz
 ADD INTERFACE increment timestamp_pl_interval(timestamptz, interval),
 ADD INTERFACE decrement timestamp_mi_interval(timestamptz, interval);

I think we chatted about this before. Sounds like a good idea to me
(except the name -- increment is not the same as plus).

However, this is orthogonal, I think. I can always ask the user to
specify everything when creating a Range Type, and then we can make them
default to use the interface functions later. Some, like plus might be
constant, but people certainly might want to specify alternate
comparators.

Regards,
Jeff Davis





Re: [HACKERS] foreign keys for array/period contains relationships

2010-10-26 Thread Peter Eisentraut
On mån, 2010-10-25 at 17:57 -0700, Greg Stark wrote:
 Well if you lock multiple records then it's not clear what operations
 you should conflict with. Removing any one of them wouldn't actually
 invalidate the foreign key reference unless you remove the last one.
 
 I always assumed this was why we require the unique constraint at all.

I did mention that you would need an exclusion constraint in some of the
cases, to get an effect analogous to unique constraints.




Re: [HACKERS] Range Types, discrete and/or continuous

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 1:26 PM, Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2010-10-25 at 21:07 -0400, Robert Haas wrote:
 See, that gets complicated, because now you're restricting the range
 of values that can be expressed by the range type to something less
 than the natural range of the data type.  I am not sure the value of
 supporting that is sufficient to justify the amount of extra code that
 will be required to make it work.  I'd say for a first version, nail
 down the representation.  Perhaps in a future version you could have
 compress/uncompress methods sort of like GIST,

 OK, I can live with that.

 ALTER TYPE timestamptz
     ADD INTERFACE increment timestamp_pl_interval(timestamptz, interval),
     ADD INTERFACE decrement timestamp_mi_interval(timestamptz, interval);

 I think we chatted about this before. Sounds like a good idea to me
 (except the name -- increment is not the same as plus).

 However, this is orthogonal, I think. I can always ask the user to
 specify everything when creating a Range Type, and then we can make them
 default to use the interface functions later. Some, like plus might be
 constant, but people certainly might want to specify alternate
 comparators.

If it were me, I would go design and implement the type interface part
first.   But it's not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Range Types, discrete and/or continuous

2010-10-26 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Oct 26, 2010 at 1:26 PM, Jeff Davis pg...@j-davis.com wrote:
 However, this is orthogonal, I think. I can always ask the user to
 specify everything when creating a Range Type, and then we can make them
 default to use the interface functions later. Some, like plus might be
 constant, but people certainly might want to specify alternate
 comparators.

 If it were me, I would go design and implement the type interface part
 first.   But it's not.

I agree with Jeff's plan: seems like taking a first cut at the higher
level is worthwhile, to make sure you know what you need from the
type-system interfaces.

FWIW, I don't agree with the proposed syntax.  We already have a
perfectly extensible CREATE TYPE syntax, so there is no reason to
implement this as an ALTER TYPE operation.  What's more, altering
existing datatype declarations is fraught with all kinds of fun
risks, as we were reminded with the recent enum patch.
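
To make that concrete, the CREATE TYPE route might look something like this
(purely a sketch extending Robert's earlier example; the option names are
invented and nothing here is settled):

    CREATE TYPE int_range AS RANGE (
        BASETYPE = integer,
        CMPFUNC  = btint4cmp,
        ADDFUNC  = int4pl,
        SUBFUNC  = int4mi
    );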

regards, tom lane



Re: [HACKERS] Simplifying replication

2010-10-26 Thread Josh Berkus

 What happens if max_wal_size is less than checkpoint_segments?
 Currently a checkpoint tries to leave WAL files which were generated
 from the prior ckpt start to current ckpt end. Because those WAL files
 are required for crash recovery. But we should delete some of them
 according to max_wal_size?

The idea is that max_wal_size would *replace* checkpoint_segments.  The
checkpoint_segments setting is baffling to most PG DBAs.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com



Re: [HACKERS] Range Types, discrete and/or continuous

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 1:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Oct 26, 2010 at 1:26 PM, Jeff Davis pg...@j-davis.com wrote:
 However, this is orthogonal, I think. I can always ask the user to
 specify everything when creating a Range Type, and then we can make them
 default to use the interface functions later. Some, like plus might be
 constant, but people certainly might want to specify alternate
 comparators.

 If it were me, I would go design and implement the type interface part
 first.   But it's not.

 I agree with Jeff's plan: seems like taking a first cut at the higher
 level is worthwhile, to make sure you know what you need from the
 type-system interfaces.

 FWIW, I don't agree with the proposed syntax.  We already have a
 perfectly extensible CREATE TYPE syntax, so there is no reason to
 implement this as an ALTER TYPE operation.  What's more, altering
 existing datatype declarations is fraught with all kinds of fun
 risks, as we were reminded with the recent enum patch.

Fair enough.  I'm not wedded to the syntax (or the order of development).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread fazool mein
 Might I suggest adopting the same technique walsender does, ie just read
 the data back from disk?  There's a reason why we gave up trying to have
 walsender read directly from the buffers.


That is exactly what I do not want to do, i.e. read from disk, as long as
the piece of WAL is available in the buffers. Can you please describe why
walsender reading directly from the buffers was given up? To avoid a lot of
locking?
The locking issue might not be a problem considering synchronous
replication. In synchronous replication, the primary will in any case wait for
the standby to send a confirmation before it can do more WAL inserts. Hence,
reading from buffers might be better in this case.

So, as I understand from the emails, we need to lock both WALWriteLock and
WALInsertLock in exclusive mode for reading from buffers. Agreed?

Thanks.


Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread Heikki Linnakangas

On 26.10.2010 21:03, fazool mein wrote:

Might I suggest adopting the same technique walsender does, ie just read
the data back from disk?  There's a reason why we gave up trying to have
walsender read directly from the buffers.


That is exactly what I do not want to do, i.e. read from disk, as long as
the piece of WAL is available in the buffers.


Why not? If the reason is performance, I'd like to see some performance 
numbers to show that it's worth the trouble. You could perhaps do a 
quick and dirty hack that doesn't do the locking 100% correctly first, 
and do some benchmarking on that to get a ballpark number of how much 
potential there is. Or run oprofile on the current walsender 
implementation to see how much time is currently spent reading WAL from 
the kernel buffers.



Can you please describe why
walsender reading directly from the buffers was given up? To avoid a lot of
locking?


To avoid locking yes, and complexity in general.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] security label error message

2010-10-26 Thread Peter Eisentraut
Isn't this error message logically backwards?

=# SECURITY LABEL ON SCHEMA public IS NULL;
ERROR:  22023: security label providers have been loaded





Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 2:13 PM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 26.10.2010 21:03, fazool mein wrote:

 Might I suggest adopting the same technique walsender does, ie just read
 the data back from disk?  There's a reason why we gave up trying to have
 walsender read directly from the buffers.

 That is exactly what I do not want to do, i.e. read from disk, as long as
 the piece of WAL is available in the buffers.

 Why not? If the reason is performance, I'd like to see some performance
 numbers to show that it's worth the trouble. You could perhaps do a quick
 and dirty hack that doesn't do the locking 100% correctly first, and do some
 benchmarking on that to get a ballpark number of how much potential there
 is. Or run oprofile on the current walsender implementation to see how much
 time is currently spent reading WAL from the kernel buffers.

 Can you please describe why
 walsender reading directly from the buffers was given up? To avoid a lot
 of
 locking?

 To avoid locking yes, and complexity in general.

And the fact that it might allow the standby to get ahead of the
master, leading to silent database corruption.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



[HACKERS] EOCF

2010-10-26 Thread David Fetter
Folks,

I just realized I hadn't closed out the commitfest earlier.  Have done
so.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] add label to enum syntax

2010-10-26 Thread Dean Rasheed
On 26 October 2010 17:04, David E. Wheeler da...@kineticode.com wrote:
 On Oct 26, 2010, at 7:15 AM, Robert Haas wrote:

 Notwithstanding the above, I don't think ELEMENT would be a very bad choice.

 I still think we should just go for LABEL and be done with it.  But
 y'all can ignore me if you want...

 +1


Well ELEMENT is a reserved keyword in SQL:2008, to support multisets,
so if we ever supported that feature...

But I don't feel strongly about this. I think the overall consensus so
far is in favour of LABEL.

Regards,
Dean



Re: [HACKERS] foreign keys for array/period contains relationships

2010-10-26 Thread Jeff Davis
On Tue, 2010-10-26 at 20:25 +0300, Peter Eisentraut wrote:
 Let's say you have
 
 PK
 
 1
 2
 3
 4
 5
 
 FK
 
 [1,2,3]
 [3,4,5]
 [4,4,4]
 
 When you delete PK = 3, what do you  expect to happen?  OK, you might
 decide to delete the first two rows from the FK table.  This might or
 might not make sense in a particular case, but on delete cascade is an
 option the user has to choose explicitly.

That's what I would expect.

 But I don't see what to do
 about cascading an update when you, say, update PK 1 = 6.

Intuitively, I would expect all 1's to be replaced by 6's in all arrays.
But I can now see why you would be hesitant to try to support that.
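
Spelled out as SQL, that expectation amounts to something like the following
(a sketch; there is no array_replace function, so it is written with
generate_subscripts):

    UPDATE fk
    SET x = ARRAY(SELECT CASE WHEN x[i] = 1 THEN 6 ELSE x[i] END
                  FROM generate_subscripts(x, 1) AS i
                  ORDER BY i)
    WHERE 1 = ANY (x);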

Regards,
Jeff Davis




Re: [HACKERS] foreign keys for array/period contains relationships

2010-10-26 Thread Jeff Davis
On Mon, 2010-10-25 at 17:57 -0700, Greg Stark wrote:
 On Mon, Oct 25, 2010 at 5:24 PM, Jeff Davis pg...@j-davis.com wrote:
  I think that's easier when the PK must contain the FK, because then you
  only need to lock one record. Even when you need to lock multiple
  records, it seems feasible, and is just an index lookup, right? Do you
  see a particular problem?
 
 Well if you lock multiple records then it's not clear what operations
 you should conflict with. Removing any one of them wouldn't actually
 invalidate the foreign key reference unless you remove the last one.

I didn't word my statement clearly. If the PK contains the FK, and you
have an Exclusion Constraint on the PK (as Peter suggested), then you
only need to lock one record. I think that logic would be pretty much
the same as a normal FK.

The case where you might need to lock multiple records is when the FK
contains the PK (case #1 in Peter's original email). But in that case,
you would still have a UNIQUE constraint on the PK (right?) and removing
any referenced element should cause a conflict.

Case #2 is the strange one, I think. It's not actually just an
adaptation of #1. #1 requires that all elements of the array have a
corresponding PK value; but #2 just requires that one of them does.
Peter, can you clarify case #2? Did you have a use case in mind?

Regards,
Jeff Davis




Re: [HACKERS] Extensible executor nodes for preparation of SQL/MED

2010-10-26 Thread Greg Stark
On Mon, Oct 25, 2010 at 8:28 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 But it might be a good change anyway from a performance standpoint,
 in case a call through a function pointer is faster than a big switch.
 Have you tried benchmarking it on common platforms?

I've always wondered why we didn't use function pointers. It seems
like it would make the code a lot easier to maintain with fewer places
that need to be updated every time we add a node.

But I always assumed a big part of the reason was performance.
Generally compilers hate optimizing code using function pointers. They
pretty much kill any inter-procedural optimization since it's very
hard to figure out what set of functions you might have called and
make any deductions about what side effects those functions might or
might not have had. Even at the chip level function pointers tend to
be slow since they cause full pipeline stalls where the processor has
no idea where the next instruction is coming from until it's finished
loading the data from the previous instruction.

On the other hand of course it's not obvious what algorithm gcc should
use to implement largish switch statements like these. It might be
doing a fairly large number of operations doing a binary search or
hash lookup to determine which branch to take.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] security label error message

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 2:20 PM, Peter Eisentraut pete...@gmx.net wrote:
 Isn't this error message logically backwards?

 =# SECURITY LABEL ON SCHEMA public IS NULL;
 ERROR:  22023: security label providers have been loaded

Ouch.  How embarrassing.  Fixed.
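
For reference, once the test is fixed the same statement should produce
something along these lines (a sketch; the exact wording may differ):

=# SECURITY LABEL ON SCHEMA public IS NULL;
ERROR:  no security label providers have been loaded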

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread fazool mein
On Tue, Oct 26, 2010 at 11:23 AM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Oct 26, 2010 at 2:13 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 
  Can you please describe why
  walsender reading directly from the buffers was given up? To avoid a lot
  of
  locking?
 
  To avoid locking yes, and complexity in general.

 And the fact that it might allow the standby to get ahead of the
 master, leading to silent database corruption.


I agree that the standby might get ahead, but this doesn't necessarily lead
to database corruption. Here, the interesting case is what happens when the
primary fails, which can lead to *either* of the following two cases:
1) The standby, due to some triggering mechanism, becomes the new primary.
In this case, even if the standby was ahead, it's fine.
2) The primary comes back as primary. In this case, the standby will connect
again to the primary. At this point, *if* somehow we are able to detect that
the standby is ahead, then we should abort the standby and create a standby
from scratch.

I agree with Heikki that going through all this trouble only makes sense if
there is a huge performance boost.


Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 2:57 PM, fazool mein fazoolm...@gmail.com wrote:

 On Tue, Oct 26, 2010 at 11:23 AM, Robert Haas robertmh...@gmail.com wrote:

 On Tue, Oct 26, 2010 at 2:13 PM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 
  Can you please describe why
  walsender reading directly from the buffers was given up? To avoid a
  lot
  of
  locking?
 
  To avoid locking yes, and complexity in general.

 And the fact that it might allow the standby to get ahead of the
 master, leading to silent database corruption.


 I agree that the standby might get ahead, but this doesn't necessarily lead
 to database corruption. Here, the interesting case is what happens when the
 primary fails, which can lead to *either* of the following two cases:
 1) The standby, due to some triggering mechanism, becomes the new primary.
 In this case, even if the standby was ahead, its fine.

True.

 2) The primary comes back as primary. In this case, the standby will connect
 again to the primary. At this point, *if* somehow we are able to detect that
 the standby is ahead, then we should abort the standby and create a standby
 from scratch.

Unless you set restart_after_crash=off, the master could
crash-and-restart before you can do anything about it.  But that
setting doesn't exist in the 9.0 branch.

 I agree with Heikki that going through all this trouble only makes sense if
 there is a huge performance boost.

There's probably quite a large performance boost in the sync rep case
from allowing the master and standby to fsync() in parallel, but first
we need to get something that works at all.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread Josh Berkus

 I agree that the standby might get ahead, but this doesn't necessarily
 lead to database corruption. Here, the interesting case is what happens
 when the primary fails, which can lead to *either* of the following two
 cases:
 1) The standby, due to some triggering mechanism, becomes the new
 primary. In this case, even if the standby was ahead, its fine.
 2) The primary comes back as primary. In this case, the standby will
 connect again to the primary. At this point, *if* somehow we are able to
 detect that the standby is ahead, then we should abort the standby and
 create a standby from scratch.

Yes.  And we weren't able to implement that for 9.0.  It's worth
revisiting for 9.1.  In fact, the issue of "is the standby ahead of the
master?" has come up repeatedly in potential failure scenarios; I think
we're going to need a fairly bulletproof method to determine this.
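
As a stopgap, the WAL positions can at least be compared by hand with
the 9.0 monitoring functions; a rough sketch, not the bulletproof check
described above:

-- on the master
SELECT pg_current_xlog_location();

-- on the standby
SELECT pg_last_xlog_receive_location(), pg_last_xlog_replay_location();

-- if the standby reports a location beyond anything the master has
-- written, the standby cannot simply be reattached to it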

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] xlog.c: WALInsertLock vs. WALWriteLock

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 3:00 PM, Josh Berkus j...@agliodbs.com wrote:

 I agree that the standby might get ahead, but this doesn't necessarily
 lead to database corruption. Here, the interesting case is what happens
 when the primary fails, which can lead to *either* of the following two
 cases:
 1) The standby, due to some triggering mechanism, becomes the new
 primary. In this case, even if the standby was ahead, its fine.
 2) The primary comes back as primary. In this case, the standby will
 connect again to the primary. At this point, *if* somehow we are able to
 detect that the standby is ahead, then we should abort the standby and
 create a standby from scratch.

 Yes.  And we weren't able to implement that for 9.0.  It's worth
 revisiting for 9.1.  In fact, the issue of is the standby ahead of the
 master has come up repeatedly in potential failure scenarios; I think
 we're going to need a fairly bulletproof method to determine this.

Agreed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] foreign keys for array/period contains relationships

2010-10-26 Thread Peter Eisentraut
On tis, 2010-10-26 at 11:53 -0700, Jeff Davis wrote:
 Case #2 is the strange one, I think. It's not actually just an
 adaptation of #1. #1 requires that all elements of the array have a
 corresponding PK value; but #2 just requires that one of them does.
 Peter, can you clarify case #2? Did you have a use case in mind?

[ That's the period references timestamp case. ]

You're right, it's probably not useful.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensible executor nodes for preparation of SQL/MED

2010-10-26 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Mon, Oct 25, 2010 at 8:28 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 But it might be a good change anyway from a performance standpoint,
 in case a call through a function pointer is faster than a big switch.
 Have you tried benchmarking it on common platforms?

 I've always wondered why we didn't use function pointers. It seems
 like it would make the code a lot easier to maintain with fewer places
 that need to be updated every time we add a node.

 But I always assumed a big part of the reason was performance.
 Generally compilers hate optimizing code using function pointers. They
 pretty much kill any inter-procedural optimization since it's very
 hard to figure out what set of functions you might have called and
 make any deductions about what side effects those functions might or
 might not have had. Even at the chip level function pointers tend to
 be slow since they cause full pipeline stalls where the processor has
 no idea where the next instruction is coming from until it's finished
 loading the data from the previous instruction.

At least in the case of the plan-node-related switches, the called
functions tend to be big and ugly enough that it's really hard to credit
that any meaningful inter-procedural optimization could happen.  The
compiler would have to assume they could have pretty much any side
effects anyway.

 On the other hand of course it's not obvious what algorithm gcc should
 use to implement largish switch statements like these. It might be
 doing a fairly large number of operations doing a binary search or
 hash lookup to determine which branch to take.

I confess to not having actually examined the assembly code anytime
recently, but I'd always supposed it would be an array-lookup anytime
the set of case labels was reasonably dense, which it should be in these
cases.  So I'm not sure I believe the pipeline stall argument either.
Any way you slice it, there's going to be a call to a function that
is going to be really really hard to predict in advance --- unless of
course you rely on statistics like "where did I jump to the last few
times I was here?", which I think modern CPUs do have the ability to
do.

But this is all just arm-waving of course.  Benchmarks would be a lot
more convincing.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ask for review of MERGE

2010-10-26 Thread Simon Riggs
On Mon, 2010-10-25 at 16:58 -0400, Robert Haas wrote:
 On Mon, Oct 25, 2010 at 4:10 PM, Greg Stark gsst...@mit.edu wrote:
  On Mon, Oct 25, 2010 at 12:40 PM, Robert Haas robertmh...@gmail.com wrote:
  Now, as Greg says, that might be what some people want, but it's
  certainly monumentally unserializable.
 
  To be clear when I said it's what people want what I meant was that in
  the common cases it's doing exactly what people want. As opposed to
  getting closer to what people want in general but not quite hitting
  the mark in the common cases.
 
  Just as an example I think it's important that in the simplest case,
  upsert of a single record, it be 100% guaranteed to do the naive
  upsert. If two users are doing the merge of a single key at the same
  time one of them had better insert and one of them had better update
  or else users are going to be monumentally surprised.
 
 Hmm, so let's think about that case.
 
 The first merge comes along and finds no match so it fires the NOT
 MATCHED rule, which inserts a tuple.  The second merge comes along and
 finds no match, so it also fires the NOT MATCHED rule and tries to
 insert a tuple.  But upon consulting the PRIMARY KEY index it finds
 that an in-doubt tuple exists so it goes to sleep waiting for the
 first transaction to commit or abort.  If the first transaction
 commits it then decides that the jig is up and fails.  We could
 (maybe) fix this by doing something similar to what EPQ does for
 updates: when the first transaction commits, instead of redoing the
 insert, we back up and recheck whether the new tuple would have
 matched the join clause and, if so, we instead fire the MATCHED action
 on the updated tuple.  If not, we fire NOT MATCHED anyway.  I'm not
 sure how hard that would be, or whether it would introduce any other
 nasty anomalies in more complex cases.
 
 Alternatively, we could introduce an UPSERT or REPLACE statement
 intended to handle exactly this case and leave MERGE for more complex
 situations.  It's pretty easy to imagine what the coding of that
 should look like: if we encounter an in-doubt tuple in we wait on its
 xmin.  If the transaction aborts, we insert.  If it commits, and we're
 in READ COMMITTED mode, we update it; but if we're in REPEATABLE READ
 or SERIALIZABLE mode, we abort with a serialization error.  That's a
 lot simpler to understand and reason about than MERGE in its full
 generality.
 
 I think it's pretty much hopeless to think that MERGE is going to work
 in complex concurrent scenarios without creating serialization
 anomalies, or at least rollbacks.  I think that's baked into the
 nature of what the statement does.  To simulate MERGE, you need to
 read from the target table and then do writes that depend on what you
 read.  If you do that with the commands that are available today,
 you're going to get serialization anomalies and/or rollbacks under
 concurrency.  The mere fact of that logic being inside the database
 rather than outside isn't going to make that go away.  Now sometimes,
 as with exclusion constraints, you can play games with dirty snapshots
 to get the semantics you want, but whether that's possible in a
 particular case depends on the details of the operation being
 performed, and here I think it can't be done.  Some operations are
 *fundamentally* unserializable.

I agree with your analysis. Let me review...

There is a case that locking alone won't resolve, however that locking
works. The transaction history for that is:

T1: takes snapshot
T2: INSERT row1
T2: COMMIT;
T1: attempts to determine if MATCHED or NOT MATCHED. 

The answer is neither of those two answers. If we say it is NOT MATCHED
then we will just fail on any INSERT, or if we say it is MATCHED then
technically we can't see the row so the UPDATE should fail. The COMMIT
of T2 releases any locks that would have helped resolve this, and note
that even if T1 attempts to lock that row as early as possible, only a
table level lock would prevent T2 from doing that action.
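
For concreteness, the single-row upsert everyone has in mind here would
look something like the following under the proposed MERGE syntax; the
table and values are invented for illustration:

MERGE INTO counters c
    USING (VALUES (1, 10)) AS s (id, val) ON c.id = s.id
    WHEN MATCHED THEN UPDATE SET val = s.val
    WHEN NOT MATCHED THEN INSERT (id, val) VALUES (s.id, s.val);

If two sessions run this concurrently for the same id, they can end up
in exactly the situation described above.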

Two options for resolution are

1) Throw SERIALIZABLE error

2) Implement something similar to EvalPlanQual
As you say, we already resolve this situation for concurrent updates by
following the update chain from a row that is visible to the latest row.
For MERGE the semantics need to be subtly different here: we need to
follow the update chain to the latest row, but from a row that we
*can't* see.

MERGE is still very useful without the need for 2), though it fails in
some cases for concurrent use. The failure rate would increase with the
number of concurrent MERGEs and/or the number of rows in the source
table. Those errors are no more serious than are possible now.

So IMHO we should implement 1) now and come back later to implement 2).
Given that right now there may be other issues with 2), it seems unsafe
to rely on the assumption that we'll fix them by the end of the release.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services

Re: [HACKERS] Simplifying replication

2010-10-26 Thread Simon Riggs
On Thu, 2010-10-21 at 20:57 -0400, Robert Haas wrote:

 Very true.  But the lack of a -1 setting for wal_keep_segments means
 that if you would like to take a backup without archiving, you must
 set wal_keep_segments to a value greater than or equal to the rate at
 which you generate WAL segments multiplied by the time it takes you to
 run a backup.  If that doesn't qualify as requiring arcane knowledge,
 I'm mystified as to what ever could.

People are missing the point here:

You have to put the WAL files *somewhere* while you do the base backup.
PostgreSQL can't itself work out where that is, nor can it work out
ahead of time how big it will need to be, since it is up to you how you
do your base backup. Setting a parameter to -1 doesn't make the problem
go away; it just pretends the problem doesn't exist and hopes you never
hit it, but it screws you badly if you do hit the wall.

My view is that that is irresponsible, even if I share people's wish
that the problem did not exist.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ask for review of MERGE

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 8:10 AM, Simon Riggs si...@2ndquadrant.com wrote:
 I agree with your analysis. Let me review...
 [review]

Sounds like we're on the same page.

 Two options for resolution are

 1) Throw SERIALIZABLE error

 2) Implement something similar to EvalPlanQual
 As you say, we already resolve this situation for concurrent updates by
 following the update chain from a row that is visible to the latest row.
 For MERGE the semantics need to be subtely different here: we need to
 follow the update chain to the latest row, but from a row that we
 *can't* see.

 MERGE is still very useful without the need for 2), though fails in some
 cases for concurrent use. The failure rate would increase as the number
 of concurrent MERGErs and/or number of rows in source table. Those
 errors are no more serious than are possible now.

 So IMHO we should implement 1) now and come back later to implement 2).
 Given right now there may be other issues with 2) it seems unsafe to
 rely on the assumption that we'll fix them by end of release.

Yeah.  In fact, I'm not sure we're ever going to want to implement #2
- I think that needs more study to determine whether there's even
something there that makes sense to implement at all.  But certainly I
wouldn't count on it happening for 9.1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] crash in plancache with subtransactions

2010-10-26 Thread Tom Lane
I wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 One simple idea is to keep a flag along with the executor state to 
 indicate that the executor state is currently in use. Set it just before 
 calling ExecEvalExpr, and reset afterwards. If the flag is already set 
 in the beginning of exec_eval_simple_expr, we have recursed, and must 
 create a new executor state.

 Yeah, the same thought occurred to me in the shower this morning.
 I'm concerned about possible memory leakage during repeated recursion,
 but maybe that can be dealt with.

I spent some time poking at this today, and developed the attached
patch, which gets rid of all the weird assumptions associated with
simple expressions in plpgsql, in favor of just doing another
ExecInitExpr per expression in each call of the function.  While this is
a whole lot cleaner than what we have, I'm afraid that it's unacceptably
slow.  I'm seeing an overall slowdown of 2X to 3X on function execution
with examples like:

create or replace function speedtest10(x float8) returns float8 as $$
declare
  z float8 := x;
begin
  z := z * 2 + 1;
  z := z * 2 + 1;
  z := z * 2 + 1;
  z := z * 2 + 1;
  z := z * 2 + 1;
  z := z * 2 + 1;
  z := z * 2 + 1;
  z := z * 2 + 1;
  z := z * 2 + 1;
  z := z * 2 + 1;
  return z;
end
$$
language plpgsql immutable;

Now, this is about the worst case for the patch.  This function's
runtime depends almost entirely on the speed of simple expressions,
and because there's no internal looping, we only get to use the result
of each ExecInitExpr once per function call.  So probably typical use
cases wouldn't be quite so bad; but still it seems like we can't go this
route.  We need to be able to use the ExecInitExpr results across
successive calls one way or another.

I'll look into creating an in-use flag next.

regards, tom lane

diff --git a/src/pl/plpgsql/src/gram.y b/src/pl/plpgsql/src/gram.y
index a28c6707e4670be17ed1c947d383f1d11b9c90c5..0963efe1b5b5e7e974a682d2a7f71d6427180e9b 100644
*** a/src/pl/plpgsql/src/gram.y
--- b/src/pl/plpgsql/src/gram.y
*** decl_statement	: decl_varname decl_const
*** 490,495 
--- 490,496 
  		curname_def = palloc0(sizeof(PLpgSQL_expr));
  
  		curname_def->dtype = PLPGSQL_DTYPE_EXPR;
+ 		curname_def->expr_num = plpgsql_nExprs++;
  		strcpy(buf, "SELECT ");
  		cp1 = new->refname;
  		cp2 = buf + strlen(buf);
*** read_sql_construct(int until,
*** 2277,2282 
--- 2278,2284 
  	expr->query			= pstrdup(ds.data);
  	expr->plan			= NULL;
  	expr->paramnos		= NULL;
+ 	expr->expr_num		= plpgsql_nExprs++;
  	expr->ns			= plpgsql_ns_top();
  	pfree(ds.data);
  
*** make_execsql_stmt(int firsttoken, int lo
*** 2476,2481 
--- 2478,2484 
  	expr->query			= pstrdup(ds.data);
  	expr->plan			= NULL;
  	expr->paramnos		= NULL;
+ 	expr->expr_num		= plpgsql_nExprs++;
  	expr->ns			= plpgsql_ns_top();
  	pfree(ds.data);
  
diff --git a/src/pl/plpgsql/src/pl_comp.c b/src/pl/plpgsql/src/pl_comp.c
index 3ddcf3b5a595e0847627cc10c4084258f44cc352..4feb14d2bec87de5d8a66071ab3a66da3c2a2d76 100644
*** a/src/pl/plpgsql/src/pl_comp.c
--- b/src/pl/plpgsql/src/pl_comp.c
***
*** 37,42 
--- 37,47 
  
  /* --
   * Our own local and global variables
+  *
+  * Ideally these would live in some dynamically-allocated structure, but
+  * it's unpleasant to pass such a thing around in a Bison parser.  As long
+  * as plpgsql function compilation isn't re-entrant, it's okay to use
+  * static storage for these.
   * --
   */
  PLpgSQL_stmt_block *plpgsql_parse_result;
*** int			plpgsql_nDatums;
*** 46,51 
--- 51,58 
  PLpgSQL_datum **plpgsql_Datums;
  static int	datums_last = 0;
  
+ int			plpgsql_nExprs;
+ 
  char	   *plpgsql_error_funcname;
  bool		plpgsql_DumpExecTree = false;
  bool		plpgsql_check_syntax = false;
*** do_compile(FunctionCallInfo fcinfo,
*** 367,372 
--- 374,381 
  	 sizeof(PLpgSQL_datum *) * datums_alloc);
  	datums_last = 0;
  
+ 	plpgsql_nExprs = 0;
+ 
  	switch (is_trigger)
  	{
  		case false:
*** do_compile(FunctionCallInfo fcinfo,
*** 693,698 
--- 702,708 
  	function->datums = palloc(sizeof(PLpgSQL_datum *) * plpgsql_nDatums);
  	for (i = 0; i < plpgsql_nDatums; i++)
  		function->datums[i] = plpgsql_Datums[i];
+ 	function->nexprs = plpgsql_nExprs;
  
  	/* Debug dump for completed functions */
  	if (plpgsql_DumpExecTree)
*** plpgsql_compile_inline(char *proc_source
*** 788,793 
--- 798,804 
  	plpgsql_nDatums = 0;
  	plpgsql_Datums = palloc(sizeof(PLpgSQL_datum *) * datums_alloc);
  	datums_last = 0;
+ 	plpgsql_nExprs = 0;
  
  	/* Set up as though in a function returning VOID */
  	function->fn_rettype = VOIDOID;
*** plpgsql_compile_inline(char *proc_source
*** 838,843 
--- 849,855 
  	function->datums = palloc(sizeof(PLpgSQL_datum *) * 

Re: [HACKERS] ask for review of MERGE

2010-10-26 Thread Simon Riggs
On Tue, 2010-10-26 at 16:08 -0400, Robert Haas wrote:
 On Tue, Oct 26, 2010 at 8:10 AM, Simon Riggs si...@2ndquadrant.com wrote:
  I agree with your analysis. Let me review...
  [review]
 
 Sounds like we're on the same page.
 
  Two options for resolution are
 
  1) Throw SERIALIZABLE error
 
  2) Implement something similar to EvalPlanQual
  As you say, we already resolve this situation for concurrent updates by
  following the update chain from a row that is visible to the latest row.
  For MERGE the semantics need to be subtely different here: we need to
  follow the update chain to the latest row, but from a row that we
  *can't* see.
 
  MERGE is still very useful without the need for 2), though fails in some
  cases for concurrent use. The failure rate would increase as the number
  of concurrent MERGErs and/or number of rows in source table. Those
  errors are no more serious than are possible now.
 
  So IMHO we should implement 1) now and come back later to implement 2).
  Given right now there may be other issues with 2) it seems unsafe to
  rely on the assumption that we'll fix them by end of release.
 
 Yeah.  In fact, I'm not sure we're ever going to want to implement #2
 - I think that needs more study to determine whether there's even
 something there that makes sense to implement at all.  But certainly I
 wouldn't count on it happening for 9.1.

2) sounds weird, until you realise it is exactly how you would need to
code a PL/pgSQL procedure to do the equivalent of MERGE. Not doing it
just makes no sense in the longer term. I agree it will take a while to
think it through in sufficient detail.

In the meantime it's a good argument for the ELSE construct at the end
of the WHEN clauses, so we can do something other than skip a row or
throw an ERROR.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/books/
 PostgreSQL Development, 24x7 Support, Training and Services
 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Simplifying replication

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 8:27 AM, Simon Riggs si...@2ndquadrant.com wrote:
 On Thu, 2010-10-21 at 20:57 -0400, Robert Haas wrote:

 Very true.  But the lack of a -1 setting for wal_keep_segments means
 that if you would like to take a backup without archiving, you must
 set wal_keep_segments to a value greater than or equal to the rate at
 which you generate WAL segments multiplied by the time it takes you to
 run a backup.  If that doesn't qualify as requiring arcane knowledge,
 I'm mystified as to what ever could.

 People are missing the point here:

 You have to put the WAL files *somewhere* while you do the base backup.
 PostgreSQL can't itself work out where that is, nor can it work out
 ahead of time how big it will need to be, since it is up to you how you
 do your base backup. Setting a parameter to -1 doesn't make the problem
 go away, it just pretends and hopes it doesn't exist, but screws you
 badly if you do hit the wall.

If you set wal_keep_segments=0, archive_mode=on, and
archive_command=something, you might run out of disk space.

If you set wal_keep_segments=-1, you might run out of disk space.

Are you any more screwed in the second case than you are in the first
case?  Why?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: Don't close stdin, don't leak file descriptor with ON_ERROR_STOP

2010-10-26 Thread Robert Haas
On Wed, Oct 20, 2010 at 5:54 PM, Marti Raudsepp ma...@juffo.org wrote:
 Here's the second patch from my coccicheck run. Originally it flagged
 the fact that the opened file in psql's process_file() wasn't being
 closed in the ON_ERROR_STOP path, but there seem to be two more
 unintended behaviors here.

 (1) In the error path, the value of pset.inputfile wasn't being
 properly restored. The caller does free(fname) on line 786, so
 pset.inputfile would point to freed memory.

 (2) The more significant issue is that stdin *was* closed in the
 success return path. So when you run a script with two \i - lines,
 the first \q would close stdin and the next one would fail with:
    psql:-:0: could not read from input file: Bad file descriptor

 In fact, this means that stdin was being accessed after being
 fclose()d, which is undefined behavior per ANSI C, though it seems to
 work just fine on Linux.

 The new behavior requires the same number of \q's as the number of
 executions of '-', because stdin is never closed.

Thanks, committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] add label to enum syntax

2010-10-26 Thread Alvaro Herrera
Excerpts from Dean Rasheed's message of mar oct 26 15:46:56 -0300 2010:
 On 26 October 2010 17:04, David E. Wheeler da...@kineticode.com wrote:
  On Oct 26, 2010, at 7:15 AM, Robert Haas wrote:
 
  Notwithstanding the above, I don't think ELEMENT would be a very bad 
  choice.
 
  I still think we should just go for LABEL and be done with it.  But
  y'all can ignore me if you want...
 
  +1
 
 Well ELEMENT is a reserved keyword in SQL:2008, to support multisets,
 so if we ever supported that feature...

Hah!

Well, here's a patch for LABEL in any case.  If we're going to have to
reserve ELEMENT in the future then there doesn't seem to be much point
in not choosing that one though.  Should we take a poll?
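
For the record, with the attached patch usage would presumably look
something like this (the enum and its labels are invented, and the
exact BEFORE/AFTER spelling is whatever the patch settles on):

CREATE TYPE mood AS ENUM ('sad', 'ok');
ALTER TYPE mood ADD LABEL 'happy' AFTER 'ok';
ALTER TYPE mood ADD LABEL 'ecstatic';  -- appended at the end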

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


0001-Change-syntax-to-add-a-new-enum-value-to-ALTER-TYPE-.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] O_DSYNC broken on MacOS X?

2010-10-26 Thread Robert Haas
On Mon, Oct 25, 2010 at 12:51 PM, Peter Eisentraut pete...@gmx.net wrote:
 On mån, 2010-10-25 at 09:33 -0400, Robert Haas wrote:
 It seems we're still missing some relevant details, because hdparm
 doesn't seem to work on SCSI devices.  Is sdparm the right utility in
 that case?  Does anyone know what the correct incantations look like?

 Search the sdparm man page for "Writeback Cache".  It has detailed
 examples.

Here's a patch.  This adds a few more details about sdparm and makes
it clear that it applies to both FreeBSD and Linux.  But, perhaps more
significantly, it rearranges what is currently a fairly long paragraph
into a bulleted list, which I think is more readable.  Comments?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


wal-reliability-list.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] add label to enum syntax

2010-10-26 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 Excerpts from Dean Rasheed's message of mar oct 26 15:46:56 -0300 2010:
 Well ELEMENT is a reserved keyword in SQL:2008, to support multisets,
 so if we ever supported that feature...

 Hah!

Hmmm ... I dug through SQL:2008, and so far as I can find, the only use
of ELEMENT as a keyword is for <multiset element reference>, which
is defined as returning the sole element of a multiset of one element:

<multiset element reference> ::=
    ELEMENT <left paren> <multiset value expression> <right paren>

This is stated to be equivalent to

( SELECT M.E FROM UNNEST (mve) AS M(E) )

AFAICS, if we were to implement this, we'd do it as an ordinary function
named element(), just like unnest() is an ordinary function in our
implementation.  Reserving a common word for such a tiny notational
saving would be stupid.
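
A minimal sketch of that approach, using arrays in place of true
multisets (untested, and it glosses over what should happen when the
input has anything other than exactly one element):

CREATE FUNCTION element(anyarray) RETURNS anyelement
    LANGUAGE sql AS $$ SELECT m.e FROM unnest($1) AS m(e) $$;

SELECT element(ARRAY[42]);  -- returns 42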

Of course, it's possible that in future versions the committee might
extend ELEMENT() in ways that we can't duplicate as a simple function.
But that's all hypothetical --- you could as well argue that they might
decide to reserve any other word, too.


But ... having said all that, I have to agree that ELEMENT seems
preferable to LABEL if we ignore micro-considerations of parser
efficiency --- I still think LABEL is a pretty poor choice of word
here.  Personally I'd still take VALUE as being my first choice though.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] security hook on authorization

2010-10-26 Thread Robert Haas
On Mon, Oct 25, 2010 at 9:45 PM, Robert Haas robertmh...@gmail.com wrote:
 Oh.  You know, I am realizing that I misread this patch.  This hook is
 actually after authentication has been done; it's merely before we've
 told the client what happened.  So maybe this is committable as-is,
 modulo some work on the comments.

Committed, with some work on the comments.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] max_wal_senders must die

2010-10-26 Thread Robert Haas
On Thu, Oct 21, 2010 at 8:33 PM, Bruce Momjian br...@momjian.us wrote:
 Robert Haas wrote:
 On Thu, Oct 21, 2010 at 4:21 PM, Josh Berkus j...@agliodbs.com wrote:
  On 10/20/10 6:54 PM, Robert Haas wrote:
  I find it impossible to believe that's
  a good decision, and IMHO we should be focusing on how to make the
  parameters PGC_SIGHUP rather than PGC_POSTMASTER, which would give us
  most of the same benefits without throwing away hard-won performance.
 
  I'd be happy to accept that.  Is it possible, though?

 I sketched an outline of the problem AIUI here:

 http://archives.postgresql.org/pgsql-hackers/2010-10/msg01348.php

 I think it's possible; I'm not quite sure how hard it is.
 Unfortunately, I've not had as much PG-hacking time lately as I'd
 like...

 Have we documented these TODOs?

I have not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Simplifying replication

2010-10-26 Thread Josh Berkus

 If you set wal_keep_segments=0, archive_mode=on, and
 archive_command=something, you might run out of disk space.
 
 If you set wal_keep_segments=-1, you might run out of disk space.
 
 Are you any more screwed in the second case than you are in the first
 case?

It is the same to the user either way.  In either case you have to
change some settings and restart the master.

Well, for the archive case, you could conceivably mass-delete the
archive files.

-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Simplifying replication

2010-10-26 Thread Robert Haas
On Tue, Oct 26, 2010 at 9:59 PM, Josh Berkus j...@agliodbs.com wrote:

 If you set wal_keep_segments=0, archive_mode=on, and
 archive_command=something, you might run out of disk space.

 If you set wal_keep_segments=-1, you might run out of disk space.

 Are you any more screwed in the second case than you are in the first
 case?

 It is the same to the user either way.  In either case you have to
 change some settings and restart the master.

Except that changing wal_keep_segments doesn't require restarting the master.

The point of allowing -1 was to allow someone to set it to that value
temporarily, to be able to do a hot backup without having to guess how
large to set it.  If you don't have enough disk space for a backup to
complete, you're kind of hosed either way.
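
Since the parameter is already sighup, the temporary-override workflow
would be roughly the following (a sketch; the -1 value itself is only a
proposal at this point):

-- 1. edit postgresql.conf: wal_keep_segments = -1
SELECT pg_reload_conf();
SHOW wal_keep_segments;
-- 2. take the base backup
-- 3. restore the old value in postgresql.conf and reload again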

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers