[HACKERS] template0 database comment

2011-03-12 Thread Bruce Momjian
People are confused about what template0 is for, so I created the
attached one-line patch to add a database comment to template0. No
initdb, I assume, becuase it is just a comment.

I plan to work on more system table and view comments for 9.2.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
new file mode 100644
index acd2514..ba9b688
*** a/src/bin/initdb/initdb.c
--- b/src/bin/initdb/initdb.c
*** make_template0(void)
*** 1976,1981 
--- 1976,1983 
  		REVOKE CREATE,TEMPORARY ON DATABASE template1 FROM public;\n,
  		REVOKE CREATE,TEMPORARY ON DATABASE template0 FROM public;\n,
  
+ 		COMMENT ON DATABASE template0 IS 'only used by pg_dump';\n,
+ 
  		/*
  		 * Finally vacuum to clean up dead rows in pg_database
  		 */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 People are confused about what template0 is for, so I created the
 attached one-line patch to add a database comment to template0. No
 initdb, I assume, becuase it is just a comment.
  
 + COMMENT ON DATABASE template0 IS 'only used by pg_dump';\n,

No objection to the concept, but the actual text of this comment is
approximately 100% wrong.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Dave Page
On 3/12/11, Tom Lane t...@sss.pgh.pa.us wrote:
 Bruce Momjian br...@momjian.us writes:
 People are confused about what template0 is for, so I created the
 attached one-line patch to add a database comment to template0. No
 initdb, I assume, becuase it is just a comment.

 +COMMENT ON DATABASE template0 IS 'only used by pg_dump';\n,

 No objection to the concept, but the actual text of this comment is
 approximately 100% wrong.


I'd like to lodge a formal objection to the use of the word
'approximately' in the above comment.

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] On-the-fly index tuple deletion vs. hot_standby

2011-03-12 Thread Noah Misch
On Thu, Dec 09, 2010 at 09:48:25AM +, Simon Riggs wrote:
 On Fri, 2010-12-03 at 21:43 +0200, Heikki Linnakangas wrote:
  On 29.11.2010 08:10, Noah Misch wrote:
   I have a hot_standby system and use it to bear the load of various 
   reporting
   queries that take 15-60 minutes each.  In an effort to avoid long pauses 
   in
   recovery, I set a vacuum_defer_cleanup_age constituting roughly three 
   hours of
   the master's transactions.  Even so, I kept seeing recovery pause for the
   duration of a long-running query.  In each case, the culprit record was an
   XLOG_BTREE_DELETE arising from on-the-fly deletion of an index tuple.  The
   attached test script demonstrates the behavior (on HEAD); the index tuple
   reclamation conflicts with a concurrent SELECT pg_sleep(600) on the 
   standby.
  
   Since this inserting transaction aborts, HeapTupleSatisfiesVacuum reports
   HEAPTUPLE_DEAD independent of vacuum_defer_cleanup_age.  We go ahead and 
   remove
   the index tuples.  On the standby, btree_xlog_delete_get_latestRemovedXid 
   does
   not regard the inserting-transaction outcome, so btree_redo proceeds to 
   conflict
   with snapshots having visibility over that transaction.  Could we 
   correctly
   improve this by teaching btree_xlog_delete_get_latestRemovedXid to ignore 
   tuples
   of aborted transactions and tuples inserted and deleted within one 
   transaction?
 
 @Noah Easily the best bug reported submitted in a long time. Thanks.
 
  Seems reasonable. HeapTupleHeaderAdvanceLatestRemovedXid() will need 
  similar treatment. Actually, btree_xlog_delete_get_latestRemovedXid() 
  could just call HeapTupleHeaderAdvanceLatestRemoveXid().
 
 Yes, it applies to other cases also. Thanks for the suggestion.
 
 Fix committed. Please double-check my work, committed early since I'm
 about to jump on a plane.

The installation that inspired my original report recently upgraded from 9.0.1
to 9.0.3, and your fix did significantly decrease its conflict frequency.  The
last several conflicts I have captured involve XLOG_BTREE_REUSE_PAGE records.
(FWIW, the index has generally been pg_attribute_relid_attnam_index.)  I've
attached a test script demonstrating the behavior.  _bt_page_recyclable approves
any page deleted no more recently than RecentXmin, because we need only ensure
that every ongoing scan has witnessed the page as dead.  For the hot standby
case, we need to account for possibly-ongoing standby transactions.  Using
RecentGlobalXmin covers that, albeit with some pessimism: we really only need
LEAST(RecentXmin, PGPROC-xmin of walsender_1, .., PGPROC-xmin of walsender_N)
- vacuum_defer_cleanup_age.  Not sure the accounting to achieve that would pay
off, though.  Thoughts?

Thanks,
nm


repro-btree-reuse.sh
Description: Bourne shell script
*** a/src/backend/access/nbtree/nbtpage.c
--- b/src/backend/access/nbtree/nbtpage.c
***
*** 678,683  bool
--- 678,684 
  _bt_page_recyclable(Page page)
  {
BTPageOpaque opaque;
+   TransactionId xmin;
  
/*
 * It's possible to find an all-zeroes page in an index --- for 
example, a
***
*** 693,700  _bt_page_recyclable(Page page)
 * interested in it.
 */
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
if (P_ISDELETED(opaque) 
!   TransactionIdPrecedesOrEquals(opaque-btpo.xact, RecentXmin))
return true;
return false;
  }
--- 694,702 
 * interested in it.
 */
opaque = (BTPageOpaque) PageGetSpecialPointer(page);
+   xmin = XLogStandbyInfoActive() ? RecentGlobalXmin : RecentXmin;
if (P_ISDELETED(opaque) 
!   TransactionIdPrecedesOrEquals(opaque-btpo.xact, xmin))
return true;
return false;
  }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-12 Thread Daniel Farina
On Wed, Mar 9, 2011 at 6:02 AM, Alvaro Herrera
alvhe...@commandprompt.com wrote:
 I'll send you a perl program we wrote for a customer to check for
 strange issues in btrees.  Please give it a spin; it may give you more
 clues.  If you find additional checks to add, please let me know!

I have also, coincidentally, encountered corruption of a system
catalog index -- 8.3.11 -- I have saved the file for forensics.  Is it
possible that I also receive a copy of this program?

-- 
fdr

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] memory-related bugs

2011-03-12 Thread Noah Misch
A suitably-instrumented run of make installcheck-world under valgrind turned
up a handful of memory-related bugs:

* memcpy()/strncpy() between overlapping regions
uniqueentry() and dispell_lexize() both deduplicate an array by iteratively
copying elements downward to occlude the duplicates.  Before finding a first
duplicate, these functions call memcpy() with identical arguments.  Similarly,
resolve_polymorphic_tupdesc() calls TupleDescInitEntry() with an attributeName
pointing directly into the TupleDesc's name bytes, causing the latter to call
strncpy() with identical arguments.  The attached mem1v1-memcpy-overlap.patch
fixes these sites by checking for equal pointers before the affected call.  For
TupleDescInitEntry(), I considered instead having resolve_polymorphic_tupdesc()
pstrdup() the value.

* read past the end of a Form_pg_type in examine_attribute()
examine_attribute() copies a Form_pg_type naively.  Since the nullable columns
at the end of the structure are not present in memory, the memcpy() reads eight
bytes past the end of the source allocation.  mem2v1-analyze-overread.patch
updates this code to match how we address the same issue for Form_pg_attribute.

* off-by-one error in shift_jis_20042euc_jis_2004()
This function grabs two bytes at a time, even when only one byte remains; this
makes it read past the end of the input.  mem3v1-sjis-offbyone.patch changes it
to not do this and to report an error when the input ends in a byte that would
start a two-byte sequence.

Thanks,
nm
diff --git a/src/backend/access/common/tupdesc.c 
b/src/backend/access/common/tupdesc.c
index d78b083..1ed7195 100644
*** a/src/backend/access/common/tupdesc.c
--- b/src/backend/access/common/tupdesc.c
***
*** 459,468  TupleDescInitEntry(TupleDesc desc,
 * fill in valid resname values in targetlists, particularly for resjunk
 * attributes.
 */
!   if (attributeName != NULL)
!   namestrcpy((att-attname), attributeName);
!   else
MemSet(NameStr(att-attname), 0, NAMEDATALEN);
  
att-attstattarget = -1;
att-attcacheoff = -1;
--- 459,468 
 * fill in valid resname values in targetlists, particularly for resjunk
 * attributes.
 */
!   if (attributeName == NULL)
MemSet(NameStr(att-attname), 0, NAMEDATALEN);
+   else if (attributeName != (att-attname))
+   namestrcpy((att-attname), attributeName);
  
att-attstattarget = -1;
att-attcacheoff = -1;
diff --git a/src/backend/tsearch/dict_ispeindex 31929c0..bfd1732 100644
*** a/src/backend/tsearch/dict_ispell.c
--- b/src/backend/tsearch/dict_ispell.c
***
*** 139,145  dispell_lexize(PG_FUNCTION_ARGS)
}
else
{
!   memcpy(cptr, ptr, sizeof(TSLexeme));
cptr++;
ptr++;
}
--- 139,146 
}
else
{
!   if (ptr != cptr)
!   memcpy(cptr, ptr, sizeof(TSLexeme));
cptr++;
ptr++;
}
diff --git a/src/backend/utils/adt/tsvecindex 6810615..6c24850 100644
*** a/src/backend/utils/adt/tsvector.c
--- b/src/backend/utils/adt/tsvector.c
***
*** 125,131  uniqueentry(WordEntryIN *a, int l, char *buf, int *outbuflen)
buflen += res-poslen * sizeof(WordEntryPos) + 
sizeof(uint16);
}
res++;
!   memcpy(res, ptr, sizeof(WordEntryIN));
}
else if (ptr-entry.haspos)
{
--- 125,132 
buflen += res-poslen * sizeof(WordEntryPos) + 
sizeof(uint16);
}
res++;
!   if (ptr != res)
!   memcpy(res, ptr, sizeof(WordEntryIN));
}
else if (ptr-entry.haspos)
{
diff --git a/src/backend/commands/analyze.c b/src/backend/commands/analyze.c
index bafdc80..2e0a869 100644
*** a/src/backend/commands/analyze.c
--- b/src/backend/commands/analyze.c
***
*** 874,881  examine_attribute(Relation onerel, int attnum, Node 
*index_expr)
typtuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(stats-attrtypid));
if (!HeapTupleIsValid(typtuple))
elog(ERROR, cache lookup failed for type %u, 
stats-attrtypid);
!   stats-attrtype = (Form_pg_type) palloc(sizeof(FormData_pg_type));
!   memcpy(stats-attrtype, GETSTRUCT(typtuple), sizeof(FormData_pg_type));
ReleaseSysCache(typtuple);
stats-anl_context = anl_context;
stats-tupattnum = attnum;
--- 874,881 
typtuple = SearchSysCache1(TYPEOID, ObjectIdGetDatum(stats-attrtypid));
if 

Re: [HACKERS] template0 database comment

2011-03-12 Thread Bruce Momjian
Dave Page wrote:
 On 3/12/11, Tom Lane t...@sss.pgh.pa.us wrote:
  Bruce Momjian br...@momjian.us writes:
  People are confused about what template0 is for, so I created the
  attached one-line patch to add a database comment to template0. No
  initdb, I assume, becuase it is just a comment.
 
  +  COMMENT ON DATABASE template0 IS 'only used by pg_dump';\n,
 
  No objection to the concept, but the actual text of this comment is
  approximately 100% wrong.
 
 
 I'd like to lodge a formal objection to the use of the word
 'approximately' in the above comment.

OK, funny guys.  ;-)  Can someone give me the right text.  Obviously I
don' know what template0 is used for either.  Is it pg_dumpall perhaps?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Thom Brown
On 12 March 2011 13:59, Bruce Momjian br...@momjian.us wrote:
 Dave Page wrote:
 On 3/12/11, Tom Lane t...@sss.pgh.pa.us wrote:
  Bruce Momjian br...@momjian.us writes:
  People are confused about what template0 is for, so I created the
  attached one-line patch to add a database comment to template0. No
  initdb, I assume, becuase it is just a comment.
 
  +          COMMENT ON DATABASE template0 IS 'only used by pg_dump';\n,
 
  No objection to the concept, but the actual text of this comment is
  approximately 100% wrong.
 

 I'd like to lodge a formal objection to the use of the word
 'approximately' in the above comment.

 OK, funny guys.  ;-)  Can someone give me the right text.  Obviously I
 don' know what template0 is used for either.  Is it pg_dumpall perhaps?

'original template database' ?

-- 
Thom Brown
Twitter: @darkixion
IRC (freenode): dark_ixion
Registered Linux user: #516935

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Bruce Momjian
Thom Brown wrote:
 On 12 March 2011 13:59, Bruce Momjian br...@momjian.us wrote:
  Dave Page wrote:
  On 3/12/11, Tom Lane t...@sss.pgh.pa.us wrote:
   Bruce Momjian br...@momjian.us writes:
   People are confused about what template0 is for, so I created the
   attached one-line patch to add a database comment to template0. No
   initdb, I assume, becuase it is just a comment.
  
   + ? ? ? ? ?COMMENT ON DATABASE template0 IS 'only used by pg_dump';\n,
  
   No objection to the concept, but the actual text of this comment is
   approximately 100% wrong.
  
 
  I'd like to lodge a formal objection to the use of the word
  'approximately' in the above comment.
 
  OK, funny guys. ?;-) ?Can someone give me the right text. ?Obviously I
  don' know what template0 is used for either. ?Is it pg_dumpall perhaps?
 
 'original template database' ?

I like that.  Perhaps unmodified template database'?

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Christopher Browne
On Sat, Mar 12, 2011 at 8:59 AM, Bruce Momjian br...@momjian.us wrote:
 Dave Page wrote:
 On 3/12/11, Tom Lane t...@sss.pgh.pa.us wrote:
  Bruce Momjian br...@momjian.us writes:
  People are confused about what template0 is for, so I created the
  attached one-line patch to add a database comment to template0. No
  initdb, I assume, becuase it is just a comment.
 
  +          COMMENT ON DATABASE template0 IS 'only used by pg_dump';\n,
 
  No objection to the concept, but the actual text of this comment is
  approximately 100% wrong.
 

 I'd like to lodge a formal objection to the use of the word
 'approximately' in the above comment.

 OK, funny guys.  ;-)  Can someone give me the right text.  Obviously I
 don' know what template0 is used for either.  Is it pg_dumpall perhaps?

Whaa?!?!

pg_dump has nothing to do with it.  Only used by createdb

Possibilities include:
- 'base template database'
- 'base template (used if template1 is corrupted)'
- 'backup template (use if template1 corrupted)'

Contrast with template1
- 'default template for creation of new databases'

I dunno that those are the *best* wordings, but they may suggest one.
-- 
http://linuxfinances.info/info/linuxdistributions.html

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Christopher Browne
On Sat, Mar 12, 2011 at 9:14 AM, Bruce Momjian br...@momjian.us wrote:
 I like that.  Perhaps unmodified template database'?

why tends to be more important than what, particularly to a
confused DBA who's trying to figure out why do they have all these
extra databases???

Perhaps...
backup template database - normally immutable, used if template1 is corrupted


-- 
http://linuxfinances.info/info/linuxdistributions.html

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Bruce Momjian
Christopher Browne wrote:
 On Sat, Mar 12, 2011 at 8:59 AM, Bruce Momjian br...@momjian.us wrote:
  Dave Page wrote:
  On 3/12/11, Tom Lane t...@sss.pgh.pa.us wrote:
   Bruce Momjian br...@momjian.us writes:
   People are confused about what template0 is for, so I created the
   attached one-line patch to add a database comment to template0. No
   initdb, I assume, becuase it is just a comment.
  
   + ? ? ? ? ?COMMENT ON DATABASE template0 IS 'only used by pg_dump';\n,
  
   No objection to the concept, but the actual text of this comment is
   approximately 100% wrong.
  
 
  I'd like to lodge a formal objection to the use of the word
  'approximately' in the above comment.
 
  OK, funny guys. ?;-) ?Can someone give me the right text. ?Obviously I
  don' know what template0 is used for either. ?Is it pg_dumpall perhaps?
 
 Whaa?!?!
 
 pg_dump has nothing to do with it.  Only used by createdb
 
 Possibilities include:
 - 'base template database'
 - 'base template (used if template1 is corrupted)'
 - 'backup template (use if template1 corrupted)'
 
 Contrast with template1
 - 'default template for creation of new databases'
 
 I dunno that those are the *best* wordings, but they may suggest one.

I thought the big deal with template0 was it was used to find items that
were added to template1 by pg_dumpall.

I think Thom's idea of not describing its use but its contents might be
best, maybe unmodifiable template database.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Macros for time magic values

2011-03-12 Thread Bruce Momjian

Applied.

---

Bruce Momjian wrote:
 It has bothered me that many of our time routines use special magic
 constants for time values, e.g. 24, 12, 60, etc.
 
 The attached patch changes these magic constants to macros to clarify
 the code.  I would like to apply this for 9.1 as a cleanup.
 
 -- 
   Bruce Momjian  br...@momjian.ushttp://momjian.us
   EnterpriseDB http://enterprisedb.com
 
   + It's impossible for everything to be true. +

[ text/x-diff is unsupported, treating like TEXT/PLAIN ]

 diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
 index 80dd10b..f96fa6c 100644
 --- a/src/backend/utils/adt/date.c
 +++ b/src/backend/utils/adt/date.c
 @@ -2612,7 +2612,7 @@ timetz_zone(PG_FUNCTION_ARGS)
   type = DecodeSpecial(0, lowzone, val);
  
   if (type == TZ || type == DTZ)
 - tz = val * 60;
 + tz = val * MINS_PER_HOUR;
   else
   {
   tzp = pg_tzset(tzname);
 diff --git a/src/backend/utils/adt/datetime.c 
 b/src/backend/utils/adt/datetime.c
 index 85f0206..f0fe2e3 100644
 --- a/src/backend/utils/adt/datetime.c
 +++ b/src/backend/utils/adt/datetime.c
 @@ -342,7 +342,7 @@ j2date(int jd, int *year, int *month, int *day)
   *year = y - 4800;
   quad = julian * 2141 / 65536;
   *day = julian - 7834 * quad / 256;
 - *month = (quad + 10) % 12 + 1;
 + *month = (quad + 10) % MONTHS_PER_YEAR + 1;
  
   return;
  }/* j2date() */
 @@ -952,8 +952,8 @@ DecodeDateTime(char **field, int *ftype, int nf,
* DecodeTime()
*/
   /* test for  24:00:00 */
 - if (tm-tm_hour  24 ||
 - (tm-tm_hour == 24 
 + if (tm-tm_hour  HOURS_PER_DAY ||
 + (tm-tm_hour == HOURS_PER_DAY 
(tm-tm_min  0 || tm-tm_sec  0 || 
 *fsec  0)))
   return DTERR_FIELD_OVERFLOW;
   break;
 @@ -1371,12 +1371,12 @@ DecodeDateTime(char **field, int *ftype, int nf,
   return dterr;
  
   /* handle AM/PM */
 - if (mer != HR24  tm-tm_hour  12)
 + if (mer != HR24  tm-tm_hour  HOURS_PER_DAY / 2)
   return DTERR_FIELD_OVERFLOW;
 - if (mer == AM  tm-tm_hour == 12)
 + if (mer == AM  tm-tm_hour == HOURS_PER_DAY / 2)
   tm-tm_hour = 0;
 - else if (mer == PM  tm-tm_hour != 12)
 - tm-tm_hour += 12;
 + else if (mer == PM  tm-tm_hour != HOURS_PER_DAY / 2)
 + tm-tm_hour += HOURS_PER_DAY / 2;
  
   /* do additional checking for full date specs... */
   if (*dtype == DTK_DATE)
 @@ -2058,17 +2058,18 @@ DecodeTimeOnly(char **field, int *ftype, int nf,
   return dterr;
  
   /* handle AM/PM */
 - if (mer != HR24  tm-tm_hour  12)
 + if (mer != HR24  tm-tm_hour  HOURS_PER_DAY / 2)
   return DTERR_FIELD_OVERFLOW;
 - if (mer == AM  tm-tm_hour == 12)
 + if (mer == AM  tm-tm_hour == HOURS_PER_DAY / 2)
   tm-tm_hour = 0;
 - else if (mer == PM  tm-tm_hour != 12)
 - tm-tm_hour += 12;
 + else if (mer == PM  tm-tm_hour != HOURS_PER_DAY / 2)
 + tm-tm_hour += HOURS_PER_DAY / 2;
  
 - if (tm-tm_hour  0 || tm-tm_min  0 || tm-tm_min  59 ||
 - tm-tm_sec  0 || tm-tm_sec  60 || tm-tm_hour  24 ||
 + if (tm-tm_hour  0 || tm-tm_min  0 || tm-tm_min  MINS_PER_HOUR - 1 
 ||
 + tm-tm_sec  0 || tm-tm_sec  SECS_PER_MINUTE ||
 + tm-tm_hour  HOURS_PER_DAY ||
   /* test for  24:00:00 */
 - (tm-tm_hour == 24 
 + (tm-tm_hour == HOURS_PER_DAY 
(tm-tm_min  0 || tm-tm_sec  0 || *fsec  0)) ||
  #ifdef HAVE_INT64_TIMESTAMP
   *fsec  INT64CONST(0) || *fsec  USECS_PER_SEC
 @@ -2396,13 +2397,15 @@ DecodeTime(char *str, int fmask, int range,
  
   /* do a sanity check */
  #ifdef HAVE_INT64_TIMESTAMP
 - if (tm-tm_hour  0 || tm-tm_min  0 || tm-tm_min  59 ||
 - tm-tm_sec  0 || tm-tm_sec  60 || *fsec  INT64CONST(0) ||
 + if (tm-tm_hour  0 || tm-tm_min  0 || tm-tm_min  MINS_PER_HOUR -1 
 ||
 + tm-tm_sec  0 || tm-tm_sec  SECS_PER_MINUTE ||
 + *fsec  INT64CONST(0) ||
   *fsec  USECS_PER_SEC)
   return DTERR_FIELD_OVERFLOW;
  #else
 - if (tm-tm_hour  0 || tm-tm_min  0 || tm-tm_min  59 ||
 - tm-tm_sec  0 || tm-tm_sec  60 || *fsec  0 || *fsec  1)
 + if (tm-tm_hour  0 || tm-tm_min  0 || tm-tm_min  MINS_PER_HOUR - 1 
 ||
 + tm-tm_sec  0 || tm-tm_sec  SECS_PER_MINUTE ||
 + *fsec  0 || *fsec  1)
   return DTERR_FIELD_OVERFLOW;
  #endif
  
 @@ -2748,9 +2751,9 @@ DecodeTimezone(char *str, int *tzp)
  
   

Re: [HACKERS] FuncExpr.collid/OpExpr.collid unworkably serving double duty

2011-03-12 Thread Martijn van Oosterhout
On Fri, Mar 11, 2011 at 04:56:31PM -0500, Tom Lane wrote:
  In my implementation I needed to expand this to the general set of
  operators postgresql supported and relaxed this to only consider
  arguments to the function/operator that had the same type as the
  resulting type of the function/operator, since that's the only thing
  that makes sense.
 
 Yeah, that corresponds to Transact-SQL's distinction between functions
 that take a string and produce a string, versus those that produce a
 string without any string inputs.  But I don't see anything justifying
 such a distinction in the text of the standard.

I remember being bugged by the fact that the standard indeed said
nothing (that I could find) about the results of functions that
accepted something other than strings.

 Also note that the TSQL doc explicitly points out that collation labels
 can be carried up through changes of character string types, so I think
 you're wrong to say that collation is only carried through functions that
 produce exactly the same type as their input.  I'd say collation labels
 propagate through any function that has both collatable input(s) and a
 collatable result type.

Yes, this is the most logical and useful interpretation. Collations for
all the string types are essentially compatable so can be treated as
equivalent for this purpose.

(My bit about same type at input is due to my idea of extending
collation to more than just strings. I don't beleive the current patch
supports that anyway. I intended to treat all string types as
equivalent.

 In any case, I am sure that that what this describes is not what our
 current code does :-(, and that we can't get anywhere close to this with
 the existing infrastructure.  So at this point I'm thinking that the
 safest approach is to rip out the result-collation caching fields and
 perform collation assignment in a parsing post-pass.  That will allow us
 to revise the collation assignment algorithm without further catversion
 bumps.

Are you sure? I've grabbed the patch and an looking at it. It looks to
me like it is parse_expr properly combining the collations of the
subnodes, just like it is doing the type determination.

Hmm, but select_common_collation looks a bit dodgy indeed. For example,
whether a collation is explicit or not depends on the type of the
subnode. That's not right at all. During the parse phase you must not
only track the collation but also the state (default, implicit,
explicit, error, none). Otherwise you indeed get the issue where you
don't throw errors at the right time.

I had a collatestate object:

+   Oid colloid;/* OID of collation */
+   CollateType colltype;   /* Type of Collation */

Where CollateType is IMPLICIT, EXPLICIT, NONE or DEFAULT.

And then I had the function below to handle the combining. This I think
handles the collation rules correctly, but I don't think it can be
shoehorned into the current patch.

Hope this helps with your review.

Have a nice day,

+ /* This logic is dictated by the SQL standard */
+ CollateState *
+ CombineCollateState( CollateState *arg1, CollateState *arg2 )
+ {
+   /* If either argument is coerable (the default), return the other */
+   if( !arg1 || arg1-colltype == COLLATE_COERCIBLE )
+   return arg2;
+   if( !arg2 || arg2-colltype == COLLATE_COERCIBLE )
+   return arg1;
+   /* If both are explicit, they must be the same */
+   if( arg1-colltype == COLLATE_EXPLICIT  arg2-colltype == 
COLLATE_EXPLICIT )
+   {
+   if( arg1-colloid != arg2-colloid )
+   ereport( ERROR, (errmsg(Conflicting COLLATE 
clauses)));
+   /* Both same, doesn't matter which we return */
+   return arg1;
+   }
+   /* Otherwise, explicit overrides anything */
+   if( arg1-colltype == COLLATE_EXPLICIT )
+   return arg1;
+   if( arg2-colltype == COLLATE_EXPLICIT )
+   return arg2;
+   /* COLLATE_NONE can only be overridden by EXPLICIT */
+   if( arg1-colltype == COLLATE_NONE )
+   return arg1;
+   if( arg2-colltype == COLLATE_NONE )
+   return arg2;
+   
+   /* We only get here if both are implicit */
+   /* Same locale, return that implicitly */
+   if( arg1-colloid == arg2-colloid )
+   return arg1;
+   
+   /* Conflicting implicit collation, return NONE */
+   {
+   CollateState *result = makeNode(CollateState);
+   result-colltype = COLLATE_NONE;
+   result-colloid = InvalidOid;
+   return result;
+   }
+ }
+ 

-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: 

Re: [HACKERS] template0 database comment

2011-03-12 Thread Greg Stark
On Sat, Mar 12, 2011 at 1:59 PM, Bruce Momjian br...@momjian.us wrote:
 OK, funny guys.  ;-)  Can someone give me the right text.  Obviously I
 don' know what template0 is used for either.  Is it pg_dumpall perhaps?


template0: unmodifiable pristine empty database
template1: default template for new databases



-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] memory-related bugs

2011-03-12 Thread Greg Stark
On Sat, Mar 12, 2011 at 1:32 PM, Noah Misch n...@leadboat.com wrote:
 A suitably-instrumented run of make installcheck-world under valgrind turned
 up a handful of memory-related bugs:


Nice work. How did you instrument things so valgrind knew about palloc
et al? I remember trying this in the past and running into problems. I
think the biggest one was that we write out structs to disk including
padding so trigger lots of reads of uninitialized data warnings.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-12 Thread Greg Stark
On Sat, Mar 12, 2011 at 3:06 AM, Nikhil Sontakke
nikhil.sonta...@enterprisedb.com wrote:
 Live 522's      (LSN: logid 29, recoff 0xd1fade3c) previous points to
 the zeroed out 523 block. Note that this seems to be latest LSN in the
 data file.


So do you have logs from the server when it was restarted? It should
say how far it recovered before it started up

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Bruce Momjian
Greg Stark wrote:
 On Sat, Mar 12, 2011 at 1:59 PM, Bruce Momjian br...@momjian.us wrote:
  OK, funny guys. ?;-) ?Can someone give me the right text. ?Obviously I
  don' know what template0 is used for either. ?Is it pg_dumpall perhaps?
 
 
 template0: unmodifiable pristine empty database
 template1: default template for new databases

I think I like unmodifiable empty database.

-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + It's impossible for everything to be true. +

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Sat, Mar 12, 2011 at 1:59 PM, Bruce Momjian br...@momjian.us wrote:
 OK, funny guys.  ;-)  Can someone give me the right text.  Obviously I
 don' know what template0 is used for either.  Is it pg_dumpall perhaps?

 template0: unmodifiable pristine empty database
 template1: default template for new databases

Yeah, I think that the right way to approach this is to have initdb
comment *both* of those databases.  I don't like that specific wording
for template0 though.  Maybe

template0: unmodified copy of original template1 database
template1: default template for new databases

The problem with Greg's wording is that it's falsifiable: it is possible
for someone to modify template0 if they're determined to mess things up.
So a description like unmodifiable is promising too much.

Shouldn't the postgres database get a comment too, while we're at it?
Perhaps default database to connect to?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Collations versus user-defined functions

2011-03-12 Thread Tom Lane
I've thought of another area that AFAICT the current patch fails to
address at all: what should happen in user-defined functions?
Consider

create function my_lt(x text, y text) returns bool as
$$
begin
return x  y;
end
$$ language plpgsql;

select my_lt('foo', 'bar' collate de_DE);
select my_lt('foo', 'bar' collate fr_FR);

I think it's at least arguably desirable that the results of the two
calls respond to the collation clauses, but it does not look to me
like that will happen: plpgsql isn't doing anything to propagate
its call-site collation value into expressions it evaluates, and
if it did, it'd still get the wrong answer on the second call because it
would have cached an expression plan tree containing the collation info
from the first call.

In SQL-language functions the situation is even worse, because they will
behave differently depending on whether or not they get inlined.
(I think ... haven't really tested that case.)

What do we want to do about this?  Making it work the way it seems like
it ought to will require a rather substantial investment of effort.
It looks to me like the least invasive answer would be to have plpgsql
cache different plan trees depending on the collation it receives for
its parameters, but that's still a whole lot of work.  Does the SQL
standard have anything to say on the matter, or is there a precedent in
the behavior of TSQL or other DBMSes?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] memory-related bugs

2011-03-12 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 A suitably-instrumented run of make installcheck-world under valgrind turned
 up a handful of memory-related bugs:

Hmm, interesting work, but I don't think I believe in the necessity for
this kluge:

 + else if (attributeName != (att-attname))
 + namestrcpy((att-attname), attributeName);

The rules against overlapping memcpy/strcpy's source and destination are
meant to cover the case of partial overlap; I find it hard to imagine an
implementation that will mess up when the source and destination are
identical.  If we did think it was important to avoid this situation I
would rather find another way, like modifying the caller.  Likewise
the other changes to avoid no-op memcpy's do not appear to me to be
bugs, though possibly they might save enough cycles to be worth doing
anyway.

 ! stats-attrtype = (Form_pg_type) palloc(sizeof(FormData_pg_type));
 ! memcpy(stats-attrtype, GETSTRUCT(typtuple), sizeof(FormData_pg_type));
 ...
 ! stats-attrtype = (Form_pg_type) palloc(TYPE_FIXED_PART_SIZE);
 ! memcpy(stats-attrtype, GETSTRUCT(typtuple), TYPE_FIXED_PART_SIZE);

I wonder whether we should instead fix this by copying the correct tuple
length.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations versus user-defined functions

2011-03-12 Thread Martijn van Oosterhout
On Sat, Mar 12, 2011 at 12:17:11PM -0500, Tom Lane wrote:
 I've thought of another area that AFAICT the current patch fails to
 address at all: what should happen in user-defined functions?

snip

The POLA suggests that the collation derivation of the original query
should not be affected by the implementation of a function. In the case
of SQL functions this means that the expansion of the function must
not change the results. This would mean introducing a CollateNode above
the SQL function after expansion, though you may be able to acheive
this by doing the collation derivation prior to expansion of the SQL
function, but I don't know if that's feasable.

(Note the introduced collate node would need to remember the collation state.)

Similarly, inside the function the parameters should be considered to
be IMPLICIT collation, to avoid strange errors depending on how its
called.

This means you can't make a set_collation function, but that doesn't
seem like a loss to me.

   select my_lt('foo', 'bar' collate de_DE);
   select my_lt('foo', 'bar' collate fr_FR);
 
 I think it's at least arguably desirable that the results of the two
 calls respond to the collation clauses, but it does not look to me
 like that will happen: plpgsql isn't doing anything to propagate
 its call-site collation value into expressions it evaluates, and
 if it did, it'd still get the wrong answer on the second call because it
 would have cached an expression plan tree containing the collation info
 from the first call.

I think you need to consider the collation to be a variation of the
type. plpgsql makes new plans for each type when dealing with any
parameters, this should fit right in.

SQL would need a recollate-label node like suggested above.

For other languages you just need to provide the info, what they do
with it is not your problem.

 What do we want to do about this?  Making it work the way it seems like
 it ought to will require a rather substantial investment of effort.
 It looks to me like the least invasive answer would be to have plpgsql
 cache different plan trees depending on the collation it receives for
 its parameters, but that's still a whole lot of work.  Does the SQL
 standard have anything to say on the matter, or is there a precedent in
 the behavior of TSQL or other DBMSes?

I can't help you with other DBs, google isn't finding me anything. But
the plpgsql problem should be done already right, given it already
handles cached plans for different types.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] Collations versus user-defined functions

2011-03-12 Thread Greg Stark
On Sat, Mar 12, 2011 at 5:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
        create function my_lt(x text, y text) returns bool as
        $
                begin
                        return x  y;
                end
        $ language plpgsql;

        select my_lt('foo', 'bar' collate de_DE);
        select my_lt('foo', 'bar' collate fr_FR);

 I think it's at least arguably desirable that the results of the two
 calls respond to the collation clauses, but it does not look to me
 like that will happen: plpgsql isn't doing anything to propagate
 its call-site collation value into expressions it evaluates, and
 if it did, it'd still get the wrong answer on the second call because it
 would have cached an expression plan tree containing the collation info
 from the first call.


I don't think it's obvious that this is the right behaviour. I think
functions should provide the same answer on the same inputs regardless
of context unless they're really intended to be volatile. The default
collation specified there is not part of the value being passed. If
you want to affect the way a plpgsql function orders things in its
code you should pass an extra argument for collation and then the
plpgsql function should use COLLATE colarg -- though I'm not sure if
that works, can you put parameters in COLLATE arguments?

I do hope user defined functions return values are marked with
implicit/explicit collations based on their arguments though.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations versus user-defined functions

2011-03-12 Thread Tom Lane
Greg Stark gsst...@mit.edu writes:
 On Sat, Mar 12, 2011 at 5:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
create function my_lt(x text, y text) returns bool as
$
begin
return x  y;
end
$ language plpgsql;
 
select my_lt('foo', 'bar' collate de_DE);
select my_lt('foo', 'bar' collate fr_FR);
 
 I think it's at least arguably desirable that the results of the two
 calls respond to the collation clauses, but it does not look to me
 like that will happen: plpgsql isn't doing anything to propagate
 its call-site collation value into expressions it evaluates, and
 if it did, it'd still get the wrong answer on the second call because it
 would have cached an expression plan tree containing the collation info
 from the first call.

 I don't think it's obvious that this is the right behaviour.

I'm not sure of that either, but ...

 I think
 functions should provide the same answer on the same inputs regardless
 of context unless they're really intended to be volatile.

... that argument convinces me not at all, because they are *not* the
same inputs.  The collate clauses are different.  If I believed your
argument, then the built-in  function shouldn't respond to COLLATE
either.

 If you want to affect the way a plpgsql function orders things in its
 code you should pass an extra argument for collation and then the
 plpgsql function should use COLLATE colarg -- though I'm not sure if
 that works, can you put parameters in COLLATE arguments?

No, you cannot, the SQL committee has blown it on that.  COLLATE's
argument is an identifier not a variable.  There is no way to do
runtime selection of collation like that.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations versus user-defined functions

2011-03-12 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Sat, Mar 12, 2011 at 12:17:11PM -0500, Tom Lane wrote:
 I've thought of another area that AFAICT the current patch fails to
 address at all: what should happen in user-defined functions?

 The POLA suggests that the collation derivation of the original query
 should not be affected by the implementation of a function. In the case
 of SQL functions this means that the expansion of the function must
 not change the results. This would mean introducing a CollateNode above
 the SQL function after expansion, though you may be able to acheive
 this by doing the collation derivation prior to expansion of the SQL
 function, but I don't know if that's feasable.

CollateExpr as presently defined wouldn't get the job done, but I think
it's not a problem because those nodes aren't actually needed at runtime
--- collation assignment for operators/functions above the inlined
function should have been done when they were parsed, so it won't change
as a result of expanding an inlined function.

 Similarly, inside the function the parameters should be considered to
 be IMPLICIT collation, to avoid strange errors depending on how its
 called.

Not convinced by this.  If we say that that's how it works, then no
user-defined function should react to COLLATE in its arguments at all,
which seems pretty weird and restrictive --- especially if the COLLATE
property is expected to propagate up through the function call so
far as the calling expression is concerned.  It seems just bizarre to
me to say that a function's internal operations don't react to an
input collation spec but then its result is thought to still be affected
by that.

This would actually seem more sensible if we went with something even
simpler than the current patch's behavior, namely that COLLATE only
affects the operator it is an *immediate* input of, and nothing
propagates upward in expressions ever.  I remain unconvinced that the
SQL spec is calling for propagation ...

 I think you need to consider the collation to be a variation of the
 type. plpgsql makes new plans for each type when dealing with any
 parameters, this should fit right in.

Yeah, the same occurred to me a little bit later --- we can actually
make that work fairly easily by treating collatable input datatypes
as if they were polymorphic.  But the question is whether we should.
You seem to be arguing above that user-defined functions ought not
pay attention to COLLATE specs on their inputs.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Peter Eisentraut
On lör, 2011-03-12 at 12:01 -0500, Tom Lane wrote:
 Shouldn't the postgres database get a comment too, while we're at
 it?  Perhaps default database to connect to?

That's not actually true, though.  Maybe it's the default database used
by administration programs?  In practice it might be some otherwise
unused database that's occasionally useful. ;-)


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations versus user-defined functions

2011-03-12 Thread Peter Eisentraut
On lör, 2011-03-12 at 12:17 -0500, Tom Lane wrote:
 Does the SQL standard have anything to say on the matter, or is there
 a precedent in the behavior of TSQL or other DBMSes?

I had investigated this issue but the SQL standard doesn't say anything
about it.

The SQL inlining issue is tricky.  Other languages including PL/pgSQL
are not supported at the moment.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Greg Stark
On Sat, Mar 12, 2011 at 5:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The problem with Greg's wording is that it's falsifiable: it is possible
 for someone to modify template0 if they're determined to mess things up.
 So a description like unmodifiable is promising too much.


Eh, it's possible for someone to make any part of the documentation
wrong if they're determined to mess things up enough. Empty is not
even technically correct since it has all the system tables and stuff.
But I think there's a point of diminishing returns where if we try to
come up with something that's technically 100% true it won't help a
user understand the key attributes that make template0 useful. Under
normal usage it has no user objects in it and it is hard to change
that which tries to guarantee that that fact remains true.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Macros for time magic values

2011-03-12 Thread Peter Eisentraut
On fre, 2011-03-11 at 12:50 -0500, Bruce Momjian wrote:
 It has bothered me that many of our time routines use special magic
 constants for time values, e.g. 24, 12, 60, etc.
 
 The attached patch changes these magic constants to macros to clarify
 the code.  I would like to apply this for 9.1 as a cleanup.

I think it's much clearer with the plain numbers.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Robert Haas
On Mar 12, 2011, at 12:01 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Greg Stark gsst...@mit.edu writes:
 On Sat, Mar 12, 2011 at 1:59 PM, Bruce Momjian br...@momjian.us wrote:
 OK, funny guys.  ;-)  Can someone give me the right text.  Obviously I
 don' know what template0 is used for either.  Is it pg_dumpall perhaps?
 
 template0: unmodifiable pristine empty database
 template1: default template for new databases
 
 Yeah, I think that the right way to approach this is to have initdb
 comment *both* of those databases.  I don't like that specific wording
 for template0 though.  Maybe
 
 template0: unmodified copy of original template1 database
 template1: default template for new databases
 
 The problem with Greg's wording is that it's falsifiable: it is possible
 for someone to modify template0 if they're determined to mess things up.
 So a description like unmodifiable is promising too much.
 
 Shouldn't the postgres database get a comment too, while we're at it?
 Perhaps default database to connect to?

A preposition is something you should try not to end a sentence with.

...Robert

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] template0 database comment

2011-03-12 Thread Greg Stark
On Sat, Mar 12, 2011 at 8:42 PM, Robert Haas robertmh...@gmail.com wrote:
 A preposition is something you should try not to end a sentence with.


Something to keep in mind when someone localises Postgres for Latin
which has this rule.

-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations versus user-defined functions

2011-03-12 Thread Martijn van Oosterhout
On Sat, Mar 12, 2011 at 02:46:19PM -0500, Tom Lane wrote:
  Similarly, inside the function the parameters should be considered to
  be IMPLICIT collation, to avoid strange errors depending on how its
  called.
 
 Not convinced by this.  If we say that that's how it works, then no
 user-defined function should react to COLLATE in its arguments at all,
 which seems pretty weird and restrictive --- especially if the COLLATE
 property is expected to propagate up through the function call so
 far as the calling expression is concerned.  It seems just bizarre to
 me to say that a function's internal operations don't react to an
 input collation spec but then its result is thought to still be affected
 by that.

I think I didn't explain myself well. The *state* should be implicit,
the actual collation should be whatever the query says. What I was
thinking of is the following:

CREATE FUNCTION my_english_lt(text, text) RETURNS boolean AS $$
   return $1  $2 COLLATE en_US
$$;

(not sure about the syntax but you get the idea).

If you just propegate naively you would get:

my_english_lt(x COLLATE de_DE, y)   - error, conflicting collation
my_english_ly(x, y COLLATE de_DE)   - would work fine

Hence my suggestion that on input to the function the parameters would
be considered collation de_DE state IMPLICIT, so the collation in the
function overrides, but if the COLLATE in the function is removed, the
implicit collation takes hold.

 This would actually seem more sensible if we went with something even
 simpler than the current patch's behavior, namely that COLLATE only
 affects the operator it is an *immediate* input of, and nothing
 propagates upward in expressions ever.  I remain unconvinced that the
 SQL spec is calling for propagation ...

Well, it doesn't say in the general case, but there is under 6.29
string value function Syntax rule 4b 

4) If character substring function CSF is specified, then let DTCVE
be the declared type of the character value expression immediately
contained in CSF. The maximum length, character set, and collation of
the declared type DTCSF of CSF are determined as follows:

b) The character set and collation of the character substring
function are those of DTCVE.

A similar wording is for the trim function. While obviously it doesn't
cover all user defined functions, it seem obviously that once you do
propegation for a few builtins you may as well do it for all of them.
For the concatination operator is has something similar, though written
in a way only a spec committe could come up with.

Frankly, without propegation the feature seems entirely useless. Almost
all collations are going to be defined by implicit collations attached
to columns. If 

ORDER BY x

and

ORDER BY x || 'foo'

Don't use the same collation then that is a first grade violation of
the POLA.

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 Patriotism is when love of your own people comes first; nationalism,
 when hate for people other than your own comes first. 
   - Charles de Gaulle


signature.asc
Description: Digital signature


Re: [HACKERS] template0 database comment

2011-03-12 Thread Andrew Dunstan



On 03/12/2011 04:24 PM, Greg Stark wrote:

On Sat, Mar 12, 2011 at 8:42 PM, Robert Haasrobertmh...@gmail.com  wrote:

A preposition is something you should try not to end a sentence with.


Something to keep in mind when someone localises Postgres for Latin
which has this rule.




I assume Robert's comment was in jest, since it was in breach of the 
rule it was stating.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations versus user-defined functions

2011-03-12 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 I think I didn't explain myself well. The *state* should be implicit,
 the actual collation should be whatever the query says. What I was
 thinking of is the following:

 CREATE FUNCTION my_english_lt(text, text) RETURNS boolean AS $$
return $1  $2 COLLATE en_US
 $$;

 (not sure about the syntax but you get the idea).

 If you just propegate naively you would get:

 my_english_lt(x COLLATE de_DE, y)   - error, conflicting collation
 my_english_ly(x, y COLLATE de_DE)   - would work fine

 Hence my suggestion that on input to the function the parameters would
 be considered collation de_DE state IMPLICIT, so the collation in the
 function overrides, but if the COLLATE in the function is removed, the
 implicit collation takes hold.

Oh, I see.  Yeah, that should work correctly, because parsing inside the
function will be done with Param symbols that act pretty much like Vars
--- whatever collation they have is considered implicit.  It's important
here that we do inlining by splicing completed parsetrees together ---
we *don't* do some sort of insert-the-parameters-and-reparse-from-scratch
approach.  So the collation labelings made inside the function won't
change as a result of inlining.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump -X

2011-03-12 Thread Aaron W. Swenson
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA256

On 12/24/2010 02:32 AM, J. Roeleveld wrote:
 
 Also, unless Gentoo actually strips the man-page and --help page (which I 
 do 
 seriously doubt), I do not see the -X option in the documentation.
 
 --
 Joost
 

Delayed response: No, we don't strip the documentation. It is simply
'unfooled' around with.

- - Aaron
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.17 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iF4EAREIAAYFAk17/EYACgkQCOhwUhu5AEl0RwD/akhtWUK2FFbM4Vbg4cySN8PI
9f0seIhVO9+pZPmEC2cA/2f3FqsFlOrzp9rjZPkmCP4i3B1g8UOZlCvmcUq2PXX3
=Z4O9
-END PGP SIGNATURE-

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Collations versus user-defined functions

2011-03-12 Thread Tom Lane
Martijn van Oosterhout klep...@svana.org writes:
 On Sat, Mar 12, 2011 at 02:46:19PM -0500, Tom Lane wrote:
 This would actually seem more sensible if we went with something even
 simpler than the current patch's behavior, namely that COLLATE only
 affects the operator it is an *immediate* input of, and nothing
 propagates upward in expressions ever.  I remain unconvinced that the
 SQL spec is calling for propagation ...

 Well, it doesn't say in the general case, but there is under 6.29
 string value function Syntax rule 4b 

 4) If character substring function CSF is specified, then let DTCVE
 be the declared type of the character value expression immediately
 contained in CSF. The maximum length, character set, and collation of
 the declared type DTCSF of CSF are determined as follows:

 b) The character set and collation of the character substring
 function are those of DTCVE.

 A similar wording is for the trim function. While obviously it doesn't
 cover all user defined functions, it seem obviously that once you do
 propegation for a few builtins you may as well do it for all of them.
 For the concatination operator is has something similar, though written
 in a way only a spec committe could come up with.

 Frankly, without propegation the feature seems entirely useless.

I remain unconvinced, because there are too many corner cases.  Should
collation propagate up out of a subselect?  How about a CTE?  You're
starting to get into some pretty weird action-at-a-distance situations
if so, analogous to the function-input-arguments case that you were just
saying should NOT propagate collation.  And I still don't see anything
in the text of the spec to justify it.

My feeling is that the feature would be simple, explainable, and useful
if COLLATE only affected the immediately syntactically-containing
operator.  The rest of this stuff requires a huge amount of mechanism
whose behavior will be nothing but surprising, even though it's
inflexible as can be (cf Greg's point about not being able to select
collation at runtime).  I'm not going to say it's the worst piece of
language design that's ever come out of the SQL committee, but I'm
starting to feel like it's in the top ten.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers