Re: [PATCHES] [GENERAL] ISO week dates

2007-02-15 Thread Bruce Momjian

Followup --- something weird is going on.  I am seeing _random_ failures
of the regression tests here in that same place, and the build farm
seems to fail in the same place, but with different row counts.

I am heading to bed but when I wake up, if it still an issue, I will
revert the patch.

---

Bruce Momjian wrote:
> 
> I am seeing buildfarm failures from the new regression tests added by
> this patch.  Would someone research why this is happening?
> 
>   http://www.pgbuildfarm.org/cgi-bin/show_status.pl
> 
> ---
> 
> bruce wrote:
> > 
> > Patch applied.  Thanks.
> > 
> > ---
> > 
> > 
> > Brendan Jurd wrote:
> > > The attached patch implements my proposal to extend support for the
> > > ISO week date calendar.
> > > 
> > > I have added two new format fields for use with to_char, to_date and
> > > to_timestamp:
> > > - ID for day-of-week
> > > - IDDD for day-of-year
> > > 
> > > This makes it possible to convert ISO week dates to and from text
> > > fully represented in either week ('IYYY-IW-ID') or day-of-year
> > > ('IYYY-IDDD') format.
> > > 
> > > I have also added an 'isoyear' field for use with extract / date_part.
> > > 
> > > The patch includes documentation updates and some extra tests in the
> > > regression suite for the new fields.
> > > 
> > > I have tried to implement these features with as little disruption to
> > > the existing code as possible.  I built on the existing date2iso*
> > > functions in src/backend/utils/adt/timestamp.c, and added a few
> > > functions of my own, but I wonder if these functions would be more
> > > appropriately located in datetime.c, alongside date2j and j2date?
> > > 
> > > I'd also like to raise the topic of how conversion from text to ISO
> > > week dates should be handled, where the user has specified a bogus
> > > mixture of fields.  Existing code basically ignores these issues; for
> > > example, if a user were to call to_date('1998-01-01 2454050',
> > > '-MM-DD J') the function returns 2006-01-01, a result of setting
> > > the year field from , then overwriting year, month and day with
> > > the values from the Julian date in J, then setting the month and day
> > > normally from MM and DD.
> > > 
> > > 2006-01-01 is not a valid representation of either of the values the
> > > user specified.  Now you might say "ask a silly question, get a silly
> > > answer"; the user shouldn't send nonsense arguments to to_date and
> > > expect a sensible result.  But perhaps the right way to respond to a
> > > broken timestamp definition is to throw an error, rather than behave
> > > as though everything has gone to plan, and return something which is
> > > not correct.
> > > 
> > > The same situation can arise if the user mixes ISO and Gregorian data;
> > > how should Postgres deal with something like to_date('2006-250',
> > > 'IYYY-DDD')?  The current behaviour in my patch is actually to assume
> > > that the user meant to say 'IYYY-IDDD', since "the 250th Gregorian day
> > > of the ISO year 2006" is total gibberish.  But perhaps it should be
> > > throwing an error message.
> > > 
> > > That's all for now, thanks for your time.
> > > BJ
> > 
> > [ Attachment, skipping... ]
> > 
> > > 
> > > ---(end of broadcast)---
> > > TIP 2: Don't 'kill -9' the postmaster
> > 
> > -- 
> >   Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
> >   EnterpriseDB   http://www.enterprisedb.com
> > 
> >   + If your life is a hard drive, Christ can be your backup. +
> 
> -- 
>   Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
>   EnterpriseDB   http://www.enterprisedb.com
> 
>   + If your life is a hard drive, Christ can be your backup. +

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] WIP patch - INSERT-able log statements

2007-02-15 Thread Alvaro Herrera
Tom Lane wrote:
> "FAST PostgreSQL" <[EMAIL PROTECTED]> writes:
> > On Fri, 16 Feb 2007 11:50, Tom Lane wrote:
> >> Why not output the data in COPY format instead?
> 
> > Yeah, that was my initial idea too... But because the TODO item clearly 
> > mentions INSERT, I thought maybe there was some very specific reason for 
> > the 
> > output to be in INSERT stmts.. ..
> 
> I don't think the TODO entry was particularly carefully thought out.

... which is why discussing stuff in -hackers *before* coding tends to
be a good idea.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [GENERAL] ISO week dates

2007-02-15 Thread Bruce Momjian

I am seeing buildfarm failures from the new regression tests added by
this patch.  Would someone research why this is happening?

http://www.pgbuildfarm.org/cgi-bin/show_status.pl

---

bruce wrote:
> 
> Patch applied.  Thanks.
> 
> ---
> 
> 
> Brendan Jurd wrote:
> > The attached patch implements my proposal to extend support for the
> > ISO week date calendar.
> > 
> > I have added two new format fields for use with to_char, to_date and
> > to_timestamp:
> > - ID for day-of-week
> > - IDDD for day-of-year
> > 
> > This makes it possible to convert ISO week dates to and from text
> > fully represented in either week ('IYYY-IW-ID') or day-of-year
> > ('IYYY-IDDD') format.
> > 
> > I have also added an 'isoyear' field for use with extract / date_part.
> > 
> > The patch includes documentation updates and some extra tests in the
> > regression suite for the new fields.
> > 
> > I have tried to implement these features with as little disruption to
> > the existing code as possible.  I built on the existing date2iso*
> > functions in src/backend/utils/adt/timestamp.c, and added a few
> > functions of my own, but I wonder if these functions would be more
> > appropriately located in datetime.c, alongside date2j and j2date?
> > 
> > I'd also like to raise the topic of how conversion from text to ISO
> > week dates should be handled, where the user has specified a bogus
> > mixture of fields.  Existing code basically ignores these issues; for
> > example, if a user were to call to_date('1998-01-01 2454050',
> > '-MM-DD J') the function returns 2006-01-01, a result of setting
> > the year field from , then overwriting year, month and day with
> > the values from the Julian date in J, then setting the month and day
> > normally from MM and DD.
> > 
> > 2006-01-01 is not a valid representation of either of the values the
> > user specified.  Now you might say "ask a silly question, get a silly
> > answer"; the user shouldn't send nonsense arguments to to_date and
> > expect a sensible result.  But perhaps the right way to respond to a
> > broken timestamp definition is to throw an error, rather than behave
> > as though everything has gone to plan, and return something which is
> > not correct.
> > 
> > The same situation can arise if the user mixes ISO and Gregorian data;
> > how should Postgres deal with something like to_date('2006-250',
> > 'IYYY-DDD')?  The current behaviour in my patch is actually to assume
> > that the user meant to say 'IYYY-IDDD', since "the 250th Gregorian day
> > of the ISO year 2006" is total gibberish.  But perhaps it should be
> > throwing an error message.
> > 
> > That's all for now, thanks for your time.
> > BJ
> 
> [ Attachment, skipping... ]
> 
> > 
> > ---(end of broadcast)---
> > TIP 2: Don't 'kill -9' the postmaster
> 
> -- 
>   Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
>   EnterpriseDB   http://www.enterprisedb.com
> 
>   + If your life is a hard drive, Christ can be your backup. +

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] Avg performance for int8/numeric

2007-02-15 Thread Bruce Momjian

I have tested this patch but it generates regression failures.

There was some code drift, so I am attaching an updated version of the
patch, and the regression diffs.  The 'four' column is an 'int4' so my
guess is that somehow the wrong aggregate is being called.

---

Mark Kirkwood wrote:
> Neil Conway wrote:
> > On Fri, 2006-11-24 at 11:08 +1300, Mark Kirkwood wrote:
> >> - Modifies do_numeric_accum to have an extra bool parameter and does not 
> >> calc sumX2 when it is false.
> > 
> > I think it would be clearer to reorganize this function slightly, and
> > have only a single branch on "useSumX2". On first glance it isn't
> > obviously that transdatums[2] is defined (but unchanged) when useSumX2
> > is false.
> > 
> 
> Right - new patch attached that adds a new function do_numeric_avg_accum 
> that only uses N and sum(X). This means I could amend the avg aggregates 
> for numeric, int8 to have a initvalues of {0,0}.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
*** ./expected/aggregates.out	Fri Jul 28 14:33:04 2006
--- ./results/aggregates.out	Thu Feb 15 23:54:45 2007
***
*** 238,248 
  
  -- user-defined aggregates
  SELECT newavg(four) AS avg_1 FROM onek;
!avg_1
! 
!  1.5000
! (1 row)
! 
  SELECT newsum(four) AS sum_1500 FROM onek;
   sum_1500 
  --
--- 238,244 
  
  -- user-defined aggregates
  SELECT newavg(four) AS avg_1 FROM onek;
! ERROR:  expected 2-element numeric array
  SELECT newsum(four) AS sum_1500 FROM onek;
   sum_1500 
  --

==

Index: src/backend/utils/adt/numeric.c
===
RCS file: /cvsroot/pgsql/src/backend/utils/adt/numeric.c,v
retrieving revision 1.99
diff -c -c -r1.99 numeric.c
*** src/backend/utils/adt/numeric.c	16 Jan 2007 21:41:13 -	1.99
--- src/backend/utils/adt/numeric.c	16 Feb 2007 04:30:27 -
***
*** 2165,2170 
--- 2165,2204 
  	return result;
  }
  
+ /*
+  * Improve avg performance by not caclulating sum(X*X).
+  */
+ static ArrayType *
+ do_numeric_avg_accum(ArrayType *transarray, Numeric newval)
+ {
+ 	Datum	   *transdatums;
+ 	int			ndatums;
+ 	Datum		N,
+ sumX;
+ 	ArrayType  *result;
+ 
+ 	/* We assume the input is array of numeric */
+ 	deconstruct_array(transarray,
+ 	  NUMERICOID, -1, false, 'i',
+ 	  &transdatums, NULL, &ndatums);
+ 	if (ndatums != 2)
+ 		elog(ERROR, "expected 2-element numeric array");
+ 	N = transdatums[0];
+ 	sumX = transdatums[1];
+ 
+ 	N = DirectFunctionCall1(numeric_inc, N);
+ 	sumX = DirectFunctionCall2(numeric_add, sumX,
+ 			   NumericGetDatum(newval));
+ 
+ 	transdatums[0] = N;
+ 	transdatums[1] = sumX;
+ 
+ 	result = construct_array(transdatums, 2,
+ 			 NUMERICOID, -1, false, 'i');
+ 
+ 	return result;
+ }
+ 
  Datum
  numeric_accum(PG_FUNCTION_ARGS)
  {
***
*** 2175,2180 
--- 2209,2226 
  }
  
  /*
+  * Optimized case for average of numeric.
+  */
+ Datum
+ numeric_avg_accum(PG_FUNCTION_ARGS)
+ {
+ 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
+ 	Numeric		newval = PG_GETARG_NUMERIC(1);
+ 
+ 	PG_RETURN_ARRAYTYPE_P(do_numeric_avg_accum(transarray, newval));
+ }
+ 
+ /*
   * Integer data types all use Numeric accumulators to share code and
   * avoid risk of overflow.	For int2 and int4 inputs, Numeric accumulation
   * is overkill for the N and sum(X) values, but definitely not overkill
***
*** 2219,2224 
--- 2265,2286 
  	PG_RETURN_ARRAYTYPE_P(do_numeric_accum(transarray, newval));
  }
  
+ /*
+  * Optimized case for average of int8.
+  */
+ Datum
+ int8_avg_accum(PG_FUNCTION_ARGS)
+ {
+ 	ArrayType  *transarray = PG_GETARG_ARRAYTYPE_P(0);
+ 	Datum		newval8 = PG_GETARG_DATUM(1);
+ 	Numeric		newval;
+ 
+ 	newval = DatumGetNumeric(DirectFunctionCall1(int8_numeric, newval8));
+ 
+ 	PG_RETURN_ARRAYTYPE_P(do_numeric_avg_accum(transarray, newval));
+ }
+ 
+ 
  Datum
  numeric_avg(PG_FUNCTION_ARGS)
  {
***
*** 2232,2242 
  	deconstruct_array(transarray,
  	  NUMERICOID, -1, false, 'i',
  	  &transdatums, NULL, &ndatums);
! 	if (ndatums != 3)
! 		elog(ERROR, "expected 3-element numeric array");
  	N = DatumGetNumeric(transdatums[0]);
  	sumX = DatumGetNumeric(transdatums[1]);
- 	/* ignore sumX2 */
  
  	/* SQL92 defines AVG of no values to be NULL */
  	/* N is zero iff no digits (cf. numeric_uminus) */
--- 2294,2303 
  	deconstruct_array(transarray,
  	  NUMERICOID, -1, false, 'i',
  	  &transdatums, NULL, &ndatums);
! 	if (ndatums != 2)
! 		elog(ERROR, "expected 2-element numeric array");
  	N = DatumGetNumeric(transdatums[0]);
  	sumX = DatumGetNumeric(transdatums[1]);
  
  	/

Re: [PATCHES] WIP patch - INSERT-able log statements

2007-02-15 Thread Tom Lane
"FAST PostgreSQL" <[EMAIL PROTECTED]> writes:
> On Fri, 16 Feb 2007 11:50, Tom Lane wrote:
>> Why not output the data in COPY format instead?

> Yeah, that was my initial idea too... But because the TODO item clearly 
> mentions INSERT, I thought maybe there was some very specific reason for the 
> output to be in INSERT stmts.. ..

I don't think the TODO entry was particularly carefully thought out.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] [GENERAL] ISO week dates

2007-02-15 Thread Bruce Momjian

Patch applied.  Thanks.

---


Brendan Jurd wrote:
> The attached patch implements my proposal to extend support for the
> ISO week date calendar.
> 
> I have added two new format fields for use with to_char, to_date and
> to_timestamp:
> - ID for day-of-week
> - IDDD for day-of-year
> 
> This makes it possible to convert ISO week dates to and from text
> fully represented in either week ('IYYY-IW-ID') or day-of-year
> ('IYYY-IDDD') format.
> 
> I have also added an 'isoyear' field for use with extract / date_part.
> 
> The patch includes documentation updates and some extra tests in the
> regression suite for the new fields.
> 
> I have tried to implement these features with as little disruption to
> the existing code as possible.  I built on the existing date2iso*
> functions in src/backend/utils/adt/timestamp.c, and added a few
> functions of my own, but I wonder if these functions would be more
> appropriately located in datetime.c, alongside date2j and j2date?
> 
> I'd also like to raise the topic of how conversion from text to ISO
> week dates should be handled, where the user has specified a bogus
> mixture of fields.  Existing code basically ignores these issues; for
> example, if a user were to call to_date('1998-01-01 2454050',
> '-MM-DD J') the function returns 2006-01-01, a result of setting
> the year field from , then overwriting year, month and day with
> the values from the Julian date in J, then setting the month and day
> normally from MM and DD.
> 
> 2006-01-01 is not a valid representation of either of the values the
> user specified.  Now you might say "ask a silly question, get a silly
> answer"; the user shouldn't send nonsense arguments to to_date and
> expect a sensible result.  But perhaps the right way to respond to a
> broken timestamp definition is to throw an error, rather than behave
> as though everything has gone to plan, and return something which is
> not correct.
> 
> The same situation can arise if the user mixes ISO and Gregorian data;
> how should Postgres deal with something like to_date('2006-250',
> 'IYYY-DDD')?  The current behaviour in my patch is actually to assume
> that the user meant to say 'IYYY-IDDD', since "the 250th Gregorian day
> of the ISO year 2006" is total gibberish.  But perhaps it should be
> throwing an error message.
> 
> That's all for now, thanks for your time.
> BJ

[ Attachment, skipping... ]

> 
> ---(end of broadcast)---
> TIP 2: Don't 'kill -9' the postmaster

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] WIP patch - INSERT-able log statements

2007-02-15 Thread FAST PostgreSQL
On Fri, 16 Feb 2007 11:50, Tom Lane wrote:
> "FAST PostgreSQL" <[EMAIL PROTECTED]> writes:
> > The second variable is of interest. We need to specify a table in the
> > insert command. My preferred option is for the user to give one and he
> > can create it if and when he wants to. The alternative is we decide the
> > table name and make initdb to create one.
>
> Why not output the data in COPY format instead?  That (a) eliminates the
> problem of needing to predetermine a destination table name, and (b)
> should be considerably faster to load than thousands of INSERT statements.

Yeah, that was my initial idea too... But because the TODO item clearly 
mentions INSERT, I thought maybe there was some very specific reason for the 
output to be in INSERT stmts.. ..

Rgds,
Arul Shaji



>
>   regards, tom lane
This is an email from Fujitsu Australia Software Technology Pty Ltd, ABN 27 003 
693 481. It is confidential to the ordinary user of the email address to which 
it was addressed and may contain copyright and/or legally privileged 
information. No one else may read, print, store, copy or forward all or any of 
it or its attachments. If you receive this email in error, please return to 
sender. Thank you.

If you do not wish to receive commercial email messages from Fujitsu Australia 
Software Technology Pty Ltd, please email [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] SSL enhancement patch ver.2

2007-02-15 Thread Bruce Momjian

Patch applied --- SSL improvements:

o read global SSL configuration file
o add GUC "ssl_ciphers" to control allowed ciphers
o add libpq environment variable PGSSLKEY to control SSL
  hardware keys

I adjusted the documentation wording and some of the single-letter
variable names you used --- the applied verison is attached.  Thanks.

---

Victor B. Wagner wrote:
> This patch adds following functionality to PostgreSQL
> 
> 1. If PostgreSQL is compiled with OpenSSL version 0.9.7 and above,
> both backend and libpq read site-wide OpenSSL configuration file as
> described in OPENSSL_config functon manual page. 
> 
> This allows to use hardware crypto acceleration modules (engines) and,
> in future version 0.9.9 would allow to use additional cryptoalgorithms
> (i.e. national standards) which are not included in core OpenSSL.
> 
> All other configuration parameters which are supported by OpenSSL
> library also are taken into account.
> 
> 
> 2. New configuration option "ssl_ciphers" is added to postgresql.conf.
> This option allows to change list of ciphers, acceptable by backend
> during SSL connection. Changing list of ciphers can be desirable to
> tighten or relax security of particular installation, and allows quick
> fix on configuration file level in case if vulnerability is discovered
> in one of cryptoalgorithms or their OpenSSL implementation - cipher
> suites which use such algorithm can be easily disabled.
> 
> 
> 3. If libpq compiled with OpenSSL 0.9.7 and above, compiled with engine
> support, it is possible to store secret key of client certificate on the
> hardware token, supported by one of OpenSSL engines (Hardware Security
> Module). Name of engine which supports token and engine-specific key ID
> are specifyed using environment variable PGSSLKEY.
> 
> This allows use of hardware tokens such as smartcards to identify
> clients, connecting to database.
> 
> This functionality can be used in installations with high security
> requirements or in situations where several people can use same terminal
> (such as cash register in shops or malls).
> 
> If PostgreSQL is compiled with version of OpenSSL which do not support
> engines or doesn't have OPENSSL_config function, related functionality
> is excluded by preprocessor conditionals, based on value of 
> SSLEAY_VERSION_NUMBER preprocessor symbol which is defined by all
> versions of OpenSSL.
> 

[ Attachment, skipping... ]

> 
> ---(end of broadcast)---
> TIP 5: don't forget to increase your free space map settings

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/config.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/config.sgml,v
retrieving revision 1.110
diff -c -c -r1.110 config.sgml
*** doc/src/sgml/config.sgml	8 Feb 2007 15:46:03 -	1.110
--- doc/src/sgml/config.sgml	16 Feb 2007 01:26:20 -
***
*** 569,574 
--- 569,588 

   
  
+  
+   ssl_ciphers> (string)
+   
+ssl_ciphers configuration parameter
+   
+   
+
+ Specifies a list of SSL ciphers which can be used to
+ establish secure connections. See the openssl
+ manual page for a list of supported ciphers.
+
+   
+  
+ 
   
password_encryption (boolean)

Index: doc/src/sgml/libpq.sgml
===
RCS file: /cvsroot/pgsql/doc/src/sgml/libpq.sgml,v
retrieving revision 1.228
diff -c -c -r1.228 libpq.sgml
*** doc/src/sgml/libpq.sgml	6 Feb 2007 03:03:11 -	1.228
--- doc/src/sgml/libpq.sgml	16 Feb 2007 01:26:22 -
***
*** 4175,4180 
--- 4175,4192 
  
  
  
+ PGSSLKEY
+ 
+ PGSSLKEY
+ specifies the hardware token which stores the secret key for the client
+ certificate, instead of a file. The value of this variable should consist
+ of a colon-separated engine name (engines are OpenSSL
+ loadable modules) and an engine-specific key identifier.
+ 
+ 
+ 
+ 
+ 
   PGKRBSRVNAME
  
  PGKRBSRVNAME sets the Kerberos service name to use when
***
*** 4438,4457 
 for increased security. See  for details
 about the server-side SSL functionality.

! 

 If the server demands a client certificate, 
 libpq
 will send the certificate stored in file
 ~/.postgresql/postgresql.crt within the user's home directory.
 A matching private key file ~/.postgresql/postgresql.key
!must also be present, and must not be world-readable.
 (On Microsoft Windows these files are named
 %APPDATA%\postgresql\postgresql.crt and
 %APPDATA%\postgresql\postgresql.key.)

  

Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Andrew Dunstan



Tom Lane wrote:

David Fetter <[EMAIL PROTECTED]> writes:
  

I've obviously misunderstood the scope of the TODO because it appears
that an INSERT into pg_type at creation time for compound types that
looks something like the below would do it.  What have I missed?



There are a couple of issues.  One is that we probably don't want two
pg_type entries for every single table.  Will you be satisfied if only
CREATE TYPE AS ... makes an array type?  
  


There should be some better way to create the array type for tables than 
directly mangling pg_type, though. Maybe a builtin function that took an 
oid?


cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [PATCHES] remove dbname arg from bootstrap mode

2007-02-15 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> (c) Can't you remove the output-dbname argument you added to InitPostgres?

> (I'm not sure if this last comment is related to this particular patch
> or a gripe about the InitPostgres change in the autovac patch?)

No, I was thinking you could re-simplify, but I had the bootstrap and
autovac cases confused.  Never mind ...

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] remove dbname arg from bootstrap mode

2007-02-15 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > This little patch removes the dbname argument to InitPostgres in the
> > bootstrap mode.  It also cleans a bit of fallout: initdb uselessly
> > passing an unused database name, postmaster starting a dummy process,
> > and removes the -y option to the bootstrap process, which is
> > undocumented and useless.
> 
> (a) Doesn't postgres-ref.sgml need to be updated for this?

Ah, that's where this was documented!  Sure.

> (b) What dummy process?

Startup and bgwriter (neither of which connect to a database)

> (c) Can't you remove the output-dbname argument you added to InitPostgres?

Hmm, I don't see how -- the autovacuum worker needs it.  And the return
value of InitPostgres is already taken by the is_superuser flag.

One simplification I tried in that area was making a separate
InitBootPostgres to cater for bootstrap, but the common code is too much
and too interleaved to be useful.

(I'm not sure if this last comment is related to this particular patch
or a gripe about the InitPostgres change in the autovac patch?)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread David Fetter
On Thu, Feb 15, 2007 at 07:35:46PM -0500, Tom Lane wrote:
> David Fetter <[EMAIL PROTECTED]> writes:
> > I've obviously misunderstood the scope of the TODO because it appears
> > that an INSERT into pg_type at creation time for compound types that
> > looks something like the below would do it.  What have I missed?
> 
> There are a couple of issues.  One is that we probably don't want
> two pg_type entries for every single table.

Now that you mention it, I would want that if that's what it takes to
get arrays for them.  The long-term goal here is to make all of
PostgreSQL's types play nicely together.  I'm guessing that SETOF
will eventually be a way to describe a collection because MULTISET is
in SQL:2003.

> Will you be satisfied if only CREATE TYPE AS ... makes an array
> type?  The other is that, at least at the time they were written,
> the array support routines couldn't handle composite array values.
> Things might or might not be easier today; I don't think we had
> record_in and record_out in their current form then.

OK.  What about pg_depend?

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] remove dbname arg from bootstrap mode

2007-02-15 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> This little patch removes the dbname argument to InitPostgres in the
> bootstrap mode.  It also cleans a bit of fallout: initdb uselessly
> passing an unused database name, postmaster starting a dummy process,
> and removes the -y option to the bootstrap process, which is
> undocumented and useless.

(a) Doesn't postgres-ref.sgml need to be updated for this?
(b) What dummy process?
(c) Can't you remove the output-dbname argument you added to InitPostgres?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] WIP patch - INSERT-able log statements

2007-02-15 Thread Tom Lane
"FAST PostgreSQL" <[EMAIL PROTECTED]> writes:
> The second variable is of interest. We need to specify a table in the insert 
> command. My preferred option is for the user to give one and he can create it 
> if and when he wants to. The alternative is we decide the table name and make 
> initdb to create one.  

Why not output the data in COPY format instead?  That (a) eliminates the
problem of needing to predetermine a destination table name, and (b)
should be considerably faster to load than thousands of INSERT statements.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Tom Lane
David Fetter <[EMAIL PROTECTED]> writes:
> I've obviously misunderstood the scope of the TODO because it appears
> that an INSERT into pg_type at creation time for compound types that
> looks something like the below would do it.  What have I missed?

There are a couple of issues.  One is that we probably don't want two
pg_type entries for every single table.  Will you be satisfied if only
CREATE TYPE AS ... makes an array type?  The other is that, at least at
the time they were written, the array support routines couldn't handle
composite array values.  Things might or might not be easier today;
I don't think we had record_in and record_out in their current form
then.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[PATCHES] remove dbname arg from bootstrap mode

2007-02-15 Thread Alvaro Herrera
This little patch removes the dbname argument to InitPostgres in the
bootstrap mode.  It also cleans a bit of fallout: initdb uselessly
passing an unused database name, postmaster starting a dummy process,
and removes the -y option to the bootstrap process, which is
undocumented and useless.

Unless there are objections I intend to apply this shortly.
AFAICS it doesn't need a catversion bump.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.
Index: src/backend/bootstrap/bootstrap.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/bootstrap/bootstrap.c,v
retrieving revision 1.231
diff -c -p -r1.231 bootstrap.c
*** src/backend/bootstrap/bootstrap.c	15 Feb 2007 23:23:22 -	1.231
--- src/backend/bootstrap/bootstrap.c	15 Feb 2007 23:36:19 -
*** BootstrapMain(int argc, char *argv[])
*** 208,214 
  {
  	char	   *progname = argv[0];
  	int			i;
- 	char	   *dbname;
  	int			flag;
  	int			xlogop = BS_XLOG_NOP;
  	char	   *userDoption = NULL;
--- 208,213 
*** BootstrapMain(int argc, char *argv[])
*** 239,245 
  	 */
  
  	/* Set defaults, to be overriden by explicit options below */
- 	dbname = NULL;
  	if (!IsUnderPostmaster)
  		InitializeGUCOptions();
  
--- 238,243 
*** BootstrapMain(int argc, char *argv[])
*** 250,256 
  		argc--;
  	}
  
! 	while ((flag = getopt(argc, argv, "B:c:d:D:Fr:x:y:-:")) != -1)
  	{
  		switch (flag)
  		{
--- 248,254 
  		argc--;
  	}
  
! 	while ((flag = getopt(argc, argv, "B:c:d:D:Fr:x:-:")) != -1)
  	{
  		switch (flag)
  		{
*** BootstrapMain(int argc, char *argv[])
*** 282,290 
  			case 'x':
  xlogop = atoi(optarg);
  break;
- 			case 'y':
- dbname = strdup(optarg);
- break;
  			case 'c':
  			case '-':
  {
--- 280,285 
*** BootstrapMain(int argc, char *argv[])
*** 320,331 
  		}
  	}
  
! 	if (!dbname && argc - optind == 1)
! 	{
! 		dbname = argv[optind];
! 		optind++;
! 	}
! 	if (!dbname || argc != optind)
  	{
  		write_stderr("%s: invalid command-line arguments\n", progname);
  		proc_exit(1);
--- 315,321 
  		}
  	}
  
! 	if (argc != optind)
  	{
  		write_stderr("%s: invalid command-line arguments\n", progname);
  		proc_exit(1);
*** BootstrapMain(int argc, char *argv[])
*** 449,455 
  	 * Do backend-like initialization for bootstrap mode
  	 */
  	InitProcess();
! 	(void) InitPostgres(dbname, InvalidOid, NULL, NULL);
  
  	/*
  	 * In NOP mode, all we really want to do is create shared memory and
--- 439,445 
  	 * Do backend-like initialization for bootstrap mode
  	 */
  	InitProcess();
! 	(void) InitPostgres(NULL, InvalidOid, NULL, NULL);
  
  	/*
  	 * In NOP mode, all we really want to do is create shared memory and
Index: src/backend/postmaster/postmaster.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/backend/postmaster/postmaster.c,v
retrieving revision 1.522
diff -c -p -r1.522 postmaster.c
*** src/backend/postmaster/postmaster.c	15 Feb 2007 23:23:23 -	1.522
--- src/backend/postmaster/postmaster.c	15 Feb 2007 23:52:17 -
*** StartChildProcess(int xlop)
*** 3729,3737 
  	snprintf(xlbuf, sizeof(xlbuf), "-x%d", xlop);
  	av[ac++] = xlbuf;
  
- 	av[ac++] = "-y";
- 	av[ac++] = "template1";
- 
  	av[ac] = NULL;
  	Assert(ac < lengthof(av));
  
--- 3729,3734 
Index: src/bin/initdb/initdb.c
===
RCS file: /home/alvherre/Code/cvs/pgsql/src/bin/initdb/initdb.c,v
retrieving revision 1.132
diff -c -p -r1.132 initdb.c
*** src/bin/initdb/initdb.c	10 Feb 2007 14:58:55 -	1.132
--- src/bin/initdb/initdb.c	16 Feb 2007 00:01:38 -
*** test_config_settings(void)
*** 1186,1192 
   "-c max_connections=%d "
   "-c shared_buffers=%d "
   "-c max_fsm_pages=%d "
!  "template1 < \"%s\" > \"%s\" 2>&1%s",
   SYSTEMQUOTE, backend_exec, boot_options,
   test_conns, test_buffs, test_max_fsm,
   DEVNULL, DEVNULL, SYSTEMQUOTE);
--- 1186,1192 
   "-c max_connections=%d "
   "-c shared_buffers=%d "
   "-c max_fsm_pages=%d "
!  "< \"%s\" > \"%s\" 2>&1%s",
   SYSTEMQUOTE, backend_exec, boot_options,
   test_conns, test_buffs, test_max_fsm,
   DEVNULL, DEVNULL, SYSTEMQUOTE);
*** test_config_settings(void)
*** 1221,1227 
   "-c max_connections=%d "
   "-c shared_buffers=%d "
   "-c max_fsm_pages=%d "
!  "template1 < \"%s\" > \"%s\" 2>&1%s",
   SYSTEMQUOTE, backend_exec, boot_options,
   n_connections, test_buffs, test_max_fsm,
   DEVNULL, DEVNULL, SYSTEMQUOTE);
--- 1221,1227 
   "-c max_connections=%d "
   "-c shared_buffers=%d "
   "-c max_fsm_pages=%d "
!  "< \"%s\" > \"%s\" 2>&1%s",
   SYSTEMQUOTE, backend_ex

[PATCHES] WIP patch - INSERT-able log statements

2007-02-15 Thread FAST PostgreSQL
Hi,

I've been working on the following TODO item and attached is an initial patch. 
(It is only partial and not yet completely functional)

"Allow server log information to be output as INSERT statements 
This would allow server log information to be easily loaded into a database for 
analysis. "

I want to confirm, if what I have done so far is what community is looking for 
and also want to clear some doubts.

What is done so far
---

Two postgresql.conf variables

#log_output_type = 'text'   #Valid values are 'SQL' or 'text'
#log_output_table_name = 'auditlogs'

These control how to output the log. Defaults to 'text' which is status quo. If 
it is set to 'SQL' log will be output as INSERT commands.

The second variable is of interest. We need to specify a table in the insert 
command. My preferred option is for the user to give one and he can create it 
if and when he wants to. The alternative is we decide the table name and make 
initdb to create one.  

The proposed log output structure
--
INSERT INTO user_defined_table values( timestamp_with_milliseconds,  timestamp, 
username,  databasename, sessionid,  host_and_port, host, proc_id, command_tag, 
 session_start, transaction_id,  error_severity,  SQL_State_Code, 
error_message);

All these columns will follow the current rules of log output. ie, unless 
explicity requested by the user, these columns will have NULL. User can still 
give log_line_prefix in any order he wants, and logger will output it in 
appropriate columns. The code has been modified to do 
this.

Issues/Questions are:
- How about 'Statement duration log'.  This will come to the logger as a single 
string and after the query execution. In the existing log we can make sense of 
the duration log by matching it with the statement above it or by the statement 
which gets printed besides it (Again as 
a single string). But when this is loaded onto a table doesn't make much sense 
untless everything is in a single row. (My preferred option is to add another 
column to the table structure defined above as 'duration'. But haven't figured 
out how to achieve this, because the 
statement is printed first and then the duration as another log.)

- If the SQL log output is to the syslog, then it becomes pretty awkward and 
possibly useless because our current syslog writer function breaks up the log 
into several lines to accomodate various platforms. Syslog also then adds other 
information before outputting it, which 
cannot be loaded onto a table. The preferred option is to educate the user 
through documentation that SQL type log output is best served when it is output 
to stderr and redirected to a file? Same goes with other aspects such as 
verbose and various other statistics log. 

- There are also other minor issues such as, the actual query currently gets 
output in log as 'Statement: CREATE '. For sql type log we may not need 
the 'Statement:' part as it will be in a column ? Do we remove this in both 
text and SQL outputs ?

Rgds,
Arul Shaji

This is an email from Fujitsu Australia Software Technology Pty Ltd, ABN 27 003 
693 481. It is confidential to the ordinary user of the email address to which 
it was addressed and may contain copyright and/or legally privileged 
information. No one else may read, print, store, copy or forward all or any of 
it or its attachments. If you receive this email in error, please return to 
sender. Thank you.

If you do not wish to receive commercial email messages from Fujitsu Australia 
Software Technology Pty Ltd, please email [EMAIL PROTECTED]
*** pgsql/src/backend/utils/misc/postgresql.conf.sample	2007-02-01 11:38:25.0 +1100
--- workingpgsql/src/backend/utils/misc/postgresql.conf.sample	2007-02-14 15:30:45.0 +1100
***
*** 257,262 
--- 257,267 
  #syslog_facility = 'LOCAL0'
  #syslog_ident = 'postgres'
  
+ # - How to Log -
+ 
+ #log_output_type = 'text'		#Valid values are 'SQL' or 'text'
+ #log_output_table_name = 'auditlogs'
+ 
  
  # - When to Log -
  
*** pgsql/src/backend/utils/misc/guc.c	2007-02-14 15:16:24.0 +1100
--- workingpgsql/src/backend/utils/misc/guc.c	2007-02-14 16:14:03.0 +1100
***
*** 153,158 
--- 153,159 
  static const char *show_tcp_keepalives_idle(void);
  static const char *show_tcp_keepalives_interval(void);
  static const char *show_tcp_keepalives_count(void);
+ static const char *check_logtype_combination(const char *facility, bool doit, GucSource source);
  
  /*
   * GUC option variables that are exported from this module
***
*** 210,215 
--- 211,217 
  static char *log_statement_str;
  static char *log_min_error_statement_str;
  static char *log_destination_string;
+ static char *log_output_type_string;
  
  #ifdef HAVE_SYSLOG
  static char *syslog_facility_str;
***
*** 324,329 
--- 326,333 
  	gettext_noop("Reporting and Logging"),
  	/* LOGGING_WHERE */
  	gett

Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Jeremy Drake
On Thu, 15 Feb 2007, Alvaro Herrera wrote:

> Jeremy Drake wrote:
>
> > The functions added are:
> > * regexp_split(str text, pattern text) RETURNS SETOF text
> >   regexp_split(str text, pattern text, flags text) RETURNS SETOF text
> >returns each section of the string delimited by the pattern.
> > * regexp_matches(str text, pattern text) RETURNS text[]
> >returns all capture groups when matching pattern against string in an
> >array
> > * regexp_matches(str text, pattern text, flags text) RETURNS SETOF
> > (prematch text, fullmatch text, matches text[], postmatch text)
> >returns all capture groups when matching pattern against string in an
> >array.  also returns the entire match in fullmatch.  if the 'g' option
> >is given, returns all matches in the string.  if the 'r' option is
> >given, also return the text before and after the match in prematch and
> >postmatch respectively.
>
> I think the position the match is in could be important.  I'm wondering
> if you could define them like
>
> create type re_match(match text, position int)
> regexp_split(str text, pattern text) returns setof re_match

So it looks like the issues are:
1. regexp_matches without flags has a different return type than
   regexp_matches with flags.  I can make them return the same OUT
   parameters, but should I declare it as returning SETOF when I know
   for a fact that the no-flags version will never return more than one
   row?

2. regexp_split does not represent the order of the results.  I can do
   something like:

 regexp_split(str text, pattern text[, flags text], OUT result text, OUT
startpos int) RETURNS SETOF record;

It could also have the int being a simple counter to represent the
relative order, rather than the position.


Thoughts?  Do these changes address the issues recently expressed?




-- 
I have yet to see any problem, however complicated, which, when looked
at in the right way, did not become still more complicated.
-- Poul Anderson

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread David Fetter
On Thu, Feb 15, 2007 at 10:57:45AM +0100, Peter Eisentraut wrote:
> Jeremy Drake wrote:

> > # With a set-returning function, it is possible to add a LIMIT
> > clause, to prevent splitting up more of the string than is
> > necessary.
> 
> You can also add such functionality to a function in form of a
> parameter.

That's what things like Perl's split do :)

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread David Fetter
On Thu, Feb 15, 2007 at 10:37:26AM -0500, Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > so that you would have the position for each match, automatically.  Is
> > this information available in the regex code?
> 
> Certainly, that's where we got the text snippets from to begin with.
> However, I'm not sure that this is important enough to justify a
> special type --- for one thing, since we don't have arrays of
> composites,

This is a TODO :)

I've obviously misunderstood the scope of the TODO because it appears
that an INSERT into pg_type at creation time for compound types that
looks something like the below would do it.  What have I missed?

INSERT INTO pg_type
VALUES (
'_foo', /* Generated by makeArrayTypeName */
16744,  /* OID of schema */
10, /* OID of owner of the base type */
-1, /* typlen indicates varlena */
'f',/* not passed by value */
'c',/* typtype is composite */
't',/* type is already defined */
',',/* typdelim */
0,  /* should this actually refer to the type? */
'foo'::regtype, /* typelem */
'array_in', /* typinput */
'array_out',/* typoutput */
'array_recv',   /* typreceive */
'array_send',   /* typsend */
0,  /* typanalyze */
'i',/* typalign.  Should this be 'd'? */
'x',/* typstorage */
'f',/* not a DOMAIN, but while we're at it, why not arrays of 
DOMAIN? */
0,  /* base type. should this be different? */
-1, /* no typmod */
0   /* dims not specified */
);


> that would foreclose responding to Peter's concern that SETOF is the
> wrong thing.  If you look at the Perl and Tcl APIs for regexes, they
> return just the strings, not the numerical positions; and I've not
> heard anyone complaining about that.

They do return them in the order in which they appear, though, which,
as far as I can tell, Jeremy's functions also do.

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Peter Eisentraut
Alvaro Herrera wrote:
> On the other hand, I don't think it's impossible to have matches that
> start earlier than others in the string, but are actually found later
> (say, because they are a parentized expression that ends later).  So
> giving the starting positions allows one to know where are they
> located, rather than where were they reported.  (I don't really know
> if the matches are sorted before reporting though.)

I have no strong opinion about how matches are returned.  Seeing the 
definitional difficulties that you point out, it may be fine to return 
them unordered.  But then all "matches" functions should do that.

For the "split" functions, however, providing the order is clearly 
important.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Autovacuum launcher

2007-02-15 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> I think a warning comment is warranted here -- will include it when I
> commit the patch.

I was thinking the same, but didn't want to create a merge problem for
you.  Maybe "Individual lock IDs end here" or some such?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Alvaro Herrera
Tom Lane wrote:
> Alvaro Herrera <[EMAIL PROTECTED]> writes:
> > so that you would have the position for each match, automatically.  Is
> > this information available in the regex code?
> 
> Certainly, that's where we got the text snippets from to begin with.
> However, I'm not sure that this is important enough to justify a special
> type --- for one thing, since we don't have arrays of composites, that
> would foreclose responding to Peter's concern that SETOF is the wrong
> thing.

My point is that if you want to have the order in which the matches were
found, you can do that easily by looking at the positions; no need to
create an ordered array.  Which does respond to Peter's concern, since
the point was to keep the ordering of matches, which an array does; but
if we provide the positions, the SETOF way does as well.

On the other hand, I don't think it's impossible to have matches that
start earlier than others in the string, but are actually found later
(say, because they are a parentized expression that ends later).  So
giving the starting positions allows one to know where are they located,
rather than where were they reported.  (I don't really know if the
matches are sorted before reporting though.)

> If you look at the Perl and Tcl APIs for regexes, they return
> just the strings, not the numerical positions; and I've not heard anyone
> complaining about that.

I know, but that may be just because it would be too much extra
complexity for them (in terms of user API) to be returning the positions
along the text.  I know I'd be fairly annoyed if =~ in Perl returned an
array of hashes { text => 'foo', position => 42} instead of array of
text.  We don't have that problem.

In fact, I would claim that's much easier to deal with a SETOF function
than is to deal with text[].

Regarding the "nobody complains" argument, I don't find that
particularly compelling; witness how people gets used to working around
limitations in MySQL ... ;-)

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [PATCHES] Autovacuum launcher

2007-02-15 Thread Alvaro Herrera
Tom Lane wrote:

I forgot to comment:

> This part is very seriously broken:
> 
> diff -c -p -r1.33 lwlock.h
> *** src/include/storage/lwlock.h  5 Jan 2007 22:19:58 -   1.33
> --- src/include/storage/lwlock.h  13 Feb 2007 16:58:41 -
> *** typedef enum LWLockId
> *** 62,67 
> --- 62,68 
>   BtreeVacuumLock,
>   AddinShmemInitLock,
>   FirstBufMappingLock,
> + AutovacuumLock,
>   FirstLockMgrLock = FirstBufMappingLock + NUM_BUFFER_PARTITIONS,
>   
>   /* must be last except for MaxDynamicLWLock: */
> 
> I'm surprised it got through your testing at all, with the autovacuum
> lock conflicting with bufmgr.

I didn't look closely enough to notice the arithmetic.  I think the only
reason this worked at all was because the autovacuum lock was being held
for very short periods of time, always for assigning or reading some
variable.

I think a warning comment is warranted here -- will include it when I
commit the patch.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> so that you would have the position for each match, automatically.  Is
> this information available in the regex code?

Certainly, that's where we got the text snippets from to begin with.
However, I'm not sure that this is important enough to justify a special
type --- for one thing, since we don't have arrays of composites, that
would foreclose responding to Peter's concern that SETOF is the wrong
thing.  If you look at the Perl and Tcl APIs for regexes, they return
just the strings, not the numerical positions; and I've not heard anyone
complaining about that.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Tom Lane
Peter Eisentraut <[EMAIL PROTECTED]> writes:
> Jeremy Drake wrote:
>> # My experience with the array code leads me to believe that building
>> up an array is an expensive proposition.  I know I could code it
>> smarter so that the array is only constructed in the end.

> You can make any code arbitrarily fast if it doesn't have to give the 
> right answer.

Even more to the point, it's folly to suppose that the overhead of
processing a SETOF result is less than that of array construction.

I tend to agree with Peter's concern that returning a set is going to
make it hard to track which result is which.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [PATCHES] Autovacuum launcher

2007-02-15 Thread Alvaro Herrera
I've fixed all other problems according to suggestions, including adding
a SignalSomeChildren(int signal, bool only_autovac) function to
postmaster so that it can shut autovac workers down in case of smart
shutdown.

Tom Lane wrote:

> Making InitPostgres's call API vary depending on
> IsAutoVacuumWorkerProcess seems really ugly, and unnecessary.
> Why not just test for dbname == NULL or some other convention
> expressed by the arguments themselves?

Hmm, yeah.  Modified version below.  In BootstrapMain, it surprises me
that the dbname is getting computed, and seems to be a required
parameter, when it doesn't seem to be used at all.  That code needs a
serious refactoring ...

/* 
 * InitPostgres
 *  Initialize POSTGRES.
 *
 * The database can be specified by name, using the in_dbname parameter, or by
 * OID, using the dboid parameter.  In the latter case, the computed database
 * name is passed out to the caller as a palloc'ed string in out_dbname.
 *
 * In bootstrap mode no parameters are used.
 *
 * The return value indicates whether the userID is a superuser.  (That
 * can only be tested inside a transaction, so we want to do it during
 * the startup transaction rather than doing a separate one in postgres.c.)
 *
 * As of PostgreSQL 8.2, we expect InitProcess() was already called, so we
 * already have a PGPROC struct ... but it's not filled in yet.
 *
 * Note:
 *  Be very careful with the order of calls in the InitPostgres function.
 * 
 */
bool
InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 char **out_dbname)
{
boolbootstrap = IsBootstrapProcessingMode();
boolautovacuum = IsAutoVacuumWorkerProcess();
boolam_superuser;
char   *fullpath;
chardbname[NAMEDATALEN];

/*
 * Set up the global variables holding database id and path.  But note we
 * won't actually try to touch the database just yet.
 *
 * We take a shortcut in the bootstrap case, otherwise we have to look up
 * the db name in pg_database.
 */
if (bootstrap)
{
MyDatabaseId = TemplateDbOid;
MyDatabaseTableSpace = DEFAULTTABLESPACE_OID;
}
else
{
/*
 * Find tablespace of the database we're about to open. Since we're not
 * yet up and running we have to use one of the hackish FindMyDatabase
 * variants, which look in the flat-file copy of pg_database.
 *
 * If the in_dbname param is NULL, lookup database by OID.
 */
if (in_dbname == NULL)
{
if (!FindMyDatabaseByOid(dboid, dbname, &MyDatabaseTableSpace))
ereport(FATAL,
(errcode(ERRCODE_UNDEFINED_DATABASE),
 errmsg("database %u does not exist", dboid)));
MyDatabaseId = dboid;
/* pass the database name to the caller */
*out_dbname = pstrdup(dbname);
}
else
{
if (!FindMyDatabase(in_dbname, &MyDatabaseId, 
&MyDatabaseTableSpace))
ereport(FATAL,
(errcode(ERRCODE_UNDEFINED_DATABASE),
 errmsg("database \"%s\" does not exist",
in_dbname)));
/* our database name is gotten from the caller */
strlcpy(dbname, in_dbname, NAMEDATALEN);
}
}

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


autovac-launcher-2.patch.gz
Description: Binary data

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Alvaro Herrera
Jeremy Drake wrote:

> The functions added are:
> * regexp_split(str text, pattern text) RETURNS SETOF text
>   regexp_split(str text, pattern text, flags text) RETURNS SETOF text
>returns each section of the string delimited by the pattern.
> * regexp_matches(str text, pattern text) RETURNS text[]
>returns all capture groups when matching pattern against string in an
>array
> * regexp_matches(str text, pattern text, flags text) RETURNS SETOF
> (prematch text, fullmatch text, matches text[], postmatch text)
>returns all capture groups when matching pattern against string in an
>array.  also returns the entire match in fullmatch.  if the 'g' option
>is given, returns all matches in the string.  if the 'r' option is
>given, also return the text before and after the match in prematch and
>postmatch respectively.

I think the position the match is in could be important.  I'm wondering
if you could define them like

create type re_match(match text, position int)
regexp_split(str text, pattern text) returns setof re_match

or maybe
regexp_split(str text, pattern text, OUT match text, OUT position int);
(not sure of the exact syntax for this one)

so that you would have the position for each match, automatically.  Is
this information available in the regex code?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 6: explain analyze is your friend


[PATCHES] Move cursor support for pl/pgsql

2007-02-15 Thread Magnus Hagander
Hi!

I took a stab at implementing MOVE support for cursors in PL/pgsql,
because I wanted it for a project. Attached patch *seems* to work. But..
Given that this is my first venture into touching bison/flex files, I
may be completely off-target in how it's done.

Could somebody take a look at it and comment on if I'm completelyi off
or only slightly off?
(And if there are good arguments not to include MOVE support at all in
pl/pgsql, I'd still be interested in comments on the rest of it, in
order to learn..)

Thanks!

//Magnus

Index: src/pl/plpgsql/src/gram.y
===
RCS file: /projects/cvsroot/pgsql/src/pl/plpgsql/src/gram.y,v
retrieving revision 1.98
diff -c -r1.98 gram.y
*** src/pl/plpgsql/src/gram.y   8 Feb 2007 18:37:14 -   1.98
--- src/pl/plpgsql/src/gram.y   15 Feb 2007 12:15:20 -
***
*** 92,97 
--- 92,98 
PLpgSQL_exception_block *exception_block;
PLpgSQL_nsitem  *nsitem;
PLpgSQL_diag_item   *diagitem;
+   PLpgSQL_fetch_direction *fetchdir;
  }
  
  %type  decl_sect
***
*** 124,136 
  %type   stmt_assign stmt_if stmt_loop stmt_while stmt_exit
  %type   stmt_return stmt_raise stmt_execsql stmt_execsql_insert
  %type   stmt_dynexecute stmt_for stmt_perform stmt_getdiag
! %type   stmt_open stmt_fetch stmt_close stmt_null
  
  %type   proc_exceptions
  %type  exception_sect
  %type  proc_exception
  %type  proc_conditions
  
  
  %type   raise_level
  %typeraise_msg
--- 125,138 
  %type   stmt_assign stmt_if stmt_loop stmt_while stmt_exit
  %type   stmt_return stmt_raise stmt_execsql stmt_execsql_insert
  %type   stmt_dynexecute stmt_for stmt_perform stmt_getdiag
! %type   stmt_open stmt_fetch stmt_close stmt_move stmt_null
  
  %type   proc_exceptions
  %type  exception_sect
  %type  proc_exception
  %type  proc_conditions
  
+ %type  fetch_direction
  
  %type   raise_level
  %typeraise_msg
***
*** 140,151 
--- 142,156 
  %type   getdiag_kind getdiag_target
  
  %type   lno
+ %token  T_NUMBER
  
/*
 * Keyword tokens
 */
+ %tokenK_ABSOLUTE
  %tokenK_ALIAS
  %tokenK_ASSIGN
+ %tokenK_BACKWARD
  %tokenK_BEGIN
  %tokenK_BY
  %tokenK_CLOSE
***
*** 165,170 
--- 170,177 
  %tokenK_EXIT
  %tokenK_FOR
  %tokenK_FETCH
+ %tokenK_FIRST
+ %tokenK_FORWARD
  %tokenK_FROM
  %tokenK_GET
  %tokenK_IF
***
*** 173,180 
--- 180,189 
  %tokenK_INSERT
  %tokenK_INTO
  %tokenK_IS
+ %tokenK_LAST
  %tokenK_LOG
  %tokenK_LOOP
+ %tokenK_MOVE
  %tokenK_NEXT
  %tokenK_NOT
  %tokenK_NOTICE
***
*** 182,189 
--- 191,200 
  %tokenK_OPEN
  %tokenK_OR
  %tokenK_PERFORM
+ %tokenK_PRIOR
  %tokenK_ROW_COUNT
  %tokenK_RAISE
+ %tokenK_RELATIVE
  %tokenK_RENAME
  %tokenK_RESULT_OID
  %tokenK_RETURN
***
*** 616,621 
--- 627,634 
{ $$ = $1; }
| stmt_close
{ $$ = $1; }
+   | stmt_move
+   { $$ = $1; }
| stmt_null
{ $$ = $1; }
;
***
*** 1446,1452 
--- 1459,1553 
$$ = (PLpgSQL_stmt *)new;
}
;
+   
+ stmt_move : K_MOVE lno cursor_variable fetch_direction ';'
+   {
+   PLpgSQL_stmt_move *new;
+   
+   new = 
palloc(sizeof(PLpgSQL_stmt_move));
+   new->cmd_type = 
PLPGSQL_STMT_MOVE;
+   new->lineno = $2;
+   new->curvar = $3;
+   new->fetchdir = 
(PLpgSQL_fetch_direction *)$4;
+   
+   $$ = (PLpgSQL_stmt *)new;
+   }
+   ;
  
+ fetch_direction   : K_FORWARD T_NUMBER
+   {
+   PLpgSQL_fetch_direction *new;
+  

Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Peter Eisentraut
Jeremy Drake wrote:
> regexp_matches uses a text[] for the match groups.  If you specify
> the global flag, it could return multiple matches.  Couple this with
> the requested feature of pre- and postmatch returns (with its own
> flag) and the return would turn into some sort of nasty array of
> tuples, or multiple arrays.  It seems much cleaner to me to return a
> set of the matches found, and I find which order the matches occur in
> to be much less important than the fact that they did occur and their
> contents.

Then the fact that the flag-less matches function returns an array would 
be a mistake.  They have to return the same category of object.

> regexp_split returns setof text.  This has, in my opinion, a much
> greater case to return an array.  However, there are several issues
> with this approach:

Any programming language I have ever seen returns the result of a 
regular expression split as a structure with order.  That in turn 
implies that there are use cases for having the order, which your 
proposed function could not address.

> # My experience with the array code leads me to believe that building
> up an array is an expensive proposition.  I know I could code it
> smarter so that the array is only constructed in the end.

You can make any code arbitrarily fast if it doesn't have to give the 
right answer.

> # With a set-returning function, it is possible to add a LIMIT
> clause, to prevent splitting up more of the string than is necessary.

You can also add such functionality to a function in form of a 
parameter.  In fact, relying on a LIMIT clause to do this seems pretty 
fragile.  We argue elsewhere that LIMIT without ORDER BY is not 
well-defined, and while it's hard to imagine in the current 
implementation why the result of a set returning function would come 
back in arbitrary order, it is certainly possible in theory, so you 
still need to order the result set if you want reliable limits, but 
that is not possible of the order is lost in the result.

>  It is also immediately possible to insert them into a table, or do
> grouping on them, or call a function on each value.  Most of the time
> when I do a split, I intend to do something like this with each split
> value.

These sort of arguments remind me of the contrib/xml2 module, which also 
has a very, uh, pragmatic API.  Sure, these operations may seem useful 
to you.  But when we offer a function as part of the core API, it is 
also important that we offer a clean design that allows other users to 
combine reasonably orthogonal functionality into tools that are useful 
to them.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Jeremy Drake
On Thu, 15 Feb 2007, Peter Eisentraut wrote:

> Neil Conway wrote:
> > On Wed, 2007-02-14 at 16:49 -0800, Jeremy Drake wrote:
> > > What was the status of this?  Was there anything else I needed to
> > > do with this patch, or is it ready to be applied?  Let me know if
> > > there is anything else I need to do on this...
> >
> > Will do -- I'm planning to apply this as soon as I have the free
> > cycles to do so, likely tomorrow or Friday.
>
> I don't know which patch is actually being proposed now.  It would be
> good to make this more explicit and maybe include a synopsis of the
> functions in the email, so we know what's going on.

Sorry, my intent was just to check to see if I had gotten the patch
sufficiently fixed for Neil to apply and he just hadn't gotten to it yet
(which seems to be the case), or if there was something else he still
expected me to fix that I had missed in the prior discussions.  I suppose
I should have emailed him privately.

The patch in question can be seen in the archives here:
http://archives.postgresql.org/pgsql-patches/2007-02/msg00214.php

The functions added are:
* regexp_split(str text, pattern text) RETURNS SETOF text
  regexp_split(str text, pattern text, flags text) RETURNS SETOF text
   returns each section of the string delimited by the pattern.
* regexp_matches(str text, pattern text) RETURNS text[]
   returns all capture groups when matching pattern against string in an
   array
* regexp_matches(str text, pattern text, flags text) RETURNS SETOF
(prematch text, fullmatch text, matches text[], postmatch text)
   returns all capture groups when matching pattern against string in an
   array.  also returns the entire match in fullmatch.  if the 'g' option
   is given, returns all matches in the string.  if the 'r' option is
   given, also return the text before and after the match in prematch and
   postmatch respectively.

> What confuses me about some of the functions I've seen in earlier
> patches in this thread is that they return setof something.  But in my
> mind, regular expression matches or string splits are inherently
> ordered, so an array would be the correct return type.

They do return SETOF.  Addressing them separately:

regexp_matches uses a text[] for the match groups.  If you specify the
global flag, it could return multiple matches.  Couple this with the
requested feature of pre- and postmatch returns (with its own flag) and
the return would turn into some sort of nasty array of tuples, or multiple
arrays.  It seems much cleaner to me to return a set of the matches found,
and I find which order the matches occur in to be much less important than
the fact that they did occur and their contents.


regexp_split returns setof text.  This has, in my opinion, a much greater
case to return an array.  However, there are several issues with this
approach:

# My experience with the array code leads me to believe that building up
an array is an expensive proposition.  I know I could code it smarter so
that the array is only constructed in the end.

# With a set-returning function, it is possible to add a LIMIT clause, to
prevent splitting up more of the string than is necessary.  It is also
immediately possible to insert them into a table, or do grouping on them,
or call a function on each value.  Most of the time when I do a split, I
intend to do something like this with each split value.

# You can still get an array if you really want it:
#* SELECT ARRAY(SELECT * FROM regexp_split('first, second, third', E',\\s*'))



-- 
No problem is so formidable that you can't just walk away from it.
-- C. Schulz

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [PATCHES] patch adding new regexp functions

2007-02-15 Thread Peter Eisentraut
Neil Conway wrote:
> On Wed, 2007-02-14 at 16:49 -0800, Jeremy Drake wrote:
> > What was the status of this?  Was there anything else I needed to
> > do with this patch, or is it ready to be applied?  Let me know if
> > there is anything else I need to do on this...
>
> Will do -- I'm planning to apply this as soon as I have the free
> cycles to do so, likely tomorrow or Friday.

I don't know which patch is actually being proposed now.  It would be 
good to make this more explicit and maybe include a synopsis of the 
functions in the email, so we know what's going on.

What confuses me about some of the functions I've seen in earlier 
patches in this thread is that they return setof something.  But in my 
mind, regular expression matches or string splits are inherently 
ordered, so an array would be the correct return type.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings