Re: [HACKERS] Auto-explain patch

2008-08-31 Thread Dean Rasheed

 * auto_explain.tgz
 A contrib module version of auto-explain.
 An arguable part is initializing instruments in ExecutorRun_hook.
 The initialization should be done in ExecutorStart normally, but
 it is too late in the hook. Is it safe? or are there any better idea?
 README is a plain-text for now, and I'll rewrite it in sgml if needed.


How about adding a new hook to control instrumentation of queries in
ExecutorStart? Something like:

typedef bool (*ExecutorDoInstrument_hook_type) (QueryDesc *queryDesc, int 
eflags);

extern PGDLLIMPORT ExecutorDoInstrument_hook_type ExecutorDoInstrument_hook;

I think that would allow your module code to be simplified,
and it would allow other future extensions to hook in and
enable instrumentation before queries are run.

Regards, Dean.

_
Win a voice over part with Kung Fu Panda  Live Search   and   100’s of Kung Fu 
Panda prizes to win with Live Search
http://clk.atdmt.com/UKM/go/107571439/direct/01/
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Cause of occasional buildfarm failures in sequence test

2008-08-31 Thread Tom Lane
The current result from buildfarm member pigeon
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=pigeondt=2008-08-31%2008:04:54
shows a symptom I've noticed a few times before: everything is fine
except that one select * from sequence shows log_cnt = 31 instead
of the expected 32.  This is a bit odd since no other backend is
touching that sequence, so you'd expect perfectly reproducible results.

I finally got around to looking through the sequence code to see if I
could figure out what was happening, and indeed I did.  The relevant
parts of the test are basically
create sequence foo;
select nextval('foo');
select nextval('foo');
select * from foo;
and you can reproduce the unexpected result if you insert a checkpoint
command between the create and the first nextval.  So the buildfarm
failures are due to chance occurrences of a bgwriter-driven checkpoint
occurring just there during the test.

In the normal execution of this test, the CREATE SEQUENCE leaves the
sequence in a state where one nextval() can be done for free, without
emitting a new WAL record.  So the second nextval() is the one that
emits a WAL record and pushes log_cnt up to 32.  But if a checkpoint
intervenes, then the first nextval() decides it'd better emit the WAL
record (cf lines 500ff in sequence.c as of CVS HEAD), so it pushes
log_cnt up to 32, and then the second one decreases it to 31.

So unless we want to just live with this test failing occasionally,
it seems we have two choices: redesign the behavior of nextval()
to be insensitive to checkpoint timing, or provide an alternate
regression expected file that matches the result with log_cnt = 31.
I favor the second answer --- I don't want to touch the nextval
logic, which has been stable for over six years.

Comments?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cause of occasional buildfarm failures in sequence test

2008-08-31 Thread Hannu Krosing
On Sun, 2008-08-31 at 13:17 -0400, Tom Lane wrote:
 The current result from buildfarm member pigeon
 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=pigeondt=2008-08-31%2008:04:54
 shows a symptom I've noticed a few times before: everything is fine
 except that one select * from sequence shows log_cnt = 31 instead
 of the expected 32.  This is a bit odd since no other backend is
 touching that sequence, so you'd expect perfectly reproducible results.
 
 I finally got around to looking through the sequence code to see if I
 could figure out what was happening, and indeed I did.  The relevant
 parts of the test are basically
   create sequence foo;
   select nextval('foo');
   select nextval('foo');
   select * from foo;
 and you can reproduce the unexpected result if you insert a checkpoint
 command between the create and the first nextval.  So the buildfarm
 failures are due to chance occurrences of a bgwriter-driven checkpoint
 occurring just there during the test.
 
 In the normal execution of this test, the CREATE SEQUENCE leaves the
 sequence in a state where one nextval() can be done for free, without
 emitting a new WAL record.  So the second nextval() is the one that
 emits a WAL record and pushes log_cnt up to 32.  But if a checkpoint
 intervenes, then the first nextval() decides it'd better emit the WAL
 record (cf lines 500ff in sequence.c as of CVS HEAD), so it pushes
 log_cnt up to 32, and then the second one decreases it to 31.
 
 So unless we want to just live with this test failing occasionally,
 it seems we have two choices: redesign the behavior of nextval()
 to be insensitive to checkpoint timing, or provide an alternate
 regression expected file that matches the result with log_cnt = 31.
 I favor the second answer --- I don't want to touch the nextval
 logic, which has been stable for over six years.
 
 Comments?

Maybe you get consistent result by just changing the test thus:

checkpoint;
create sequence foo;
select nextval('foo');
select nextval('foo');
select * from foo;

--
Hannu




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Cause of occasional buildfarm failures in sequence test

2008-08-31 Thread Tom Lane
Hannu Krosing [EMAIL PROTECTED] writes:
 On Sun, 2008-08-31 at 13:17 -0400, Tom Lane wrote:
 So unless we want to just live with this test failing occasionally,
 it seems we have two choices: redesign the behavior of nextval()
 to be insensitive to checkpoint timing, or provide an alternate
 regression expected file that matches the result with log_cnt = 31.
 I favor the second answer --- I don't want to touch the nextval
 logic, which has been stable for over six years.

 Maybe you get consistent result by just changing the test thus:

 checkpoint;
 create sequence foo;
 select nextval('foo');
 select nextval('foo');
 select * from foo;

Actually I think we'd need to put the checkpoint after the create,
but yeah we could do that.  Or we could leave log_cnt out of the
set of columns displayed.  I don't really favor either of those
answers though.  They amount to avoiding testing of some code
paths ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Is this really really as designed or defined in some standard

2008-08-31 Thread Hannu Krosing
It seems that we allow several function arguments to have same 
name (or is it label :)

hannu=# create or replace function ff(a int, a int) returns int language
plpgsql as $$begin return $1+$2; end;$$;
CREATE FUNCTION
hannu=# select ff(1,1);
 ff 

  2
(1 row)

hannu=# select ff(1,2);
 ff 

  3
(1 row)

hannu=# create or replace function ffa(a int, a int) returns int
language plpgsql as $$begin return a + a; end;$$;
CREATE FUNCTION
hannu=# select ffa(1,2);
 ffa 
-
   2
(1 row)

Is this defined by some standard or just an oversight ?

--
Hannu



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch: Collation support

2008-08-31 Thread Alvaro Herrera
Radek Strnad escribió:

 - when creating database with different collation than database cluster, the
 database has to be reindexed. Any idea how to do it? Function
 ReindexDatabase works only when database is opened.

We have this Todo item:

  Set proper permissions on non-system schemas during db creation
Currently all schemas are owned by the super-user because they are
copied from the template1 database. However, since all objects are
inherited from the template database, it is not clear that setting
schemas to the db owner is correct. 

When this was discussed years ago, one proposed idea was that on the
first connection the backend should, as a first task, ensure that
certain administrative chores be done.  The first of those was changing
the ownership of schemas.  It sounds like this reindexing you propose
falls into the same category.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] In case module has wrong magic, report exact problem

2008-08-31 Thread Marko Kreen
In case module and server magic blocks do not match
report exact parameters that differ.

It is quite cumbersome problem to track down unless
user already knows the cause.

As we want to encourage module use, such situations
can happen more often.*** a/src/backend/utils/fmgr/dfmgr.c
--- b/src/backend/utils/fmgr/dfmgr.c
***
*** 19,28 
--- 19,29 
  #ifndef WIN32_ONLY_COMPILER
  #include dynloader.h
  #else
  #include port/dynloader/win32.h
  #endif
+ #include lib/stringinfo.h
  #include miscadmin.h
  #include utils/dynamic_loader.h
  #include utils/hsearch.h
  
  
*** PGFunction
*** 163,172 
--- 164,213 
  lookup_external_function(void *filehandle, char *funcname)
  {
return (PGFunction) pg_dlsym(filehandle, funcname);
  }
  
+ /*
+  * Detailed error report when module incompatibility is detected.
+  */
+ 
+ #define MAGIC_FIELD_TEST(field) do { \
+   if (srv-field != mod-field) { \
+   if (det-len) \
+   appendStringInfo(det, , ); \
+   appendStringInfo(det, %s: %d != %d, #field, mod-field, 
srv-field); \
+   } \
+ } while (0)
+ 
+ static void
+ module_incompat_error(const Pg_magic_struct *mod, const Pg_magic_struct *srv, 
const char *libname)
+ {
+   StringInfo det;
+ 
+   if (mod-version != srv-version)
+   ereport(ERROR,
+(errmsg(incompatible library \%s\: version mismatch,
+libname),
+ errdetail(Server is version %d.%d, library is version 
%d.%d.,
+   srv-version / 100,
+   srv-version % 100,
+   mod-version / 100,
+   mod-version % 100)));
+ 
+   det = makeStringInfo();
+ 
+   MAGIC_FIELD_TEST(funcmaxargs);
+   MAGIC_FIELD_TEST(indexmaxkeys);
+   MAGIC_FIELD_TEST(namedatalen);
+   MAGIC_FIELD_TEST(float4byval);
+   MAGIC_FIELD_TEST(float8byval);
+ 
+   ereport(ERROR,
+(errmsg(incompatible library \%s\: magic block mismatch: %s,
+libname, det-data)));
+ }
+ #undef MAGIC_FIELD_TEST
  
  /*
   * Load the specified dynamic-link library file, unless it already is
   * loaded.Return the pg_dl* handle for the file.
   *
*** internal_load_library(const char *libnam
*** 255,281 
  
/* try to unlink library */
pg_dlclose(file_scanner-handle);
free((char *) file_scanner);
  
!   /*
!* Report suitable error.  It's probably not 
worth writing a
!* separate error message for each field; only 
the most common
!* case of wrong major version gets its own 
message.
!*/
!   if (module_magic_data.version != 
magic_data.version)
!   ereport(ERROR,
!(errmsg(incompatible library \%s\: 
version mismatch,
!libname),
! errdetail(Server is version %d.%d, 
library is version %d.%d.,
!   
magic_data.version / 100,
!   
magic_data.version % 100,
!   
module_magic_data.version / 100,
!   
module_magic_data.version % 100)));
!   ereport(ERROR,
!(errmsg(incompatible library \%s\: magic 
block mismatch,
!libname)));
}
}
else
{
/* try to unlink library */
--- 296,306 
  
/* try to unlink library */
pg_dlclose(file_scanner-handle);
free((char *) file_scanner);
  
!   module_incompat_error(module_magic_data, 
magic_data, libname);
}
}
else
{
/* try to unlink library */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Make gram.y use palloc/pfree for memory management

2008-08-31 Thread Marko Kreen
Currently gram.y uses malloc/free for memory management,
but all pointers are local to function.  Unlike flex-based
scan.l which has static references to buffers and reuses them
across calls.

This means gram.y can leak memory if error is throws in
the middle of parsing.

The patch redefines malloc/free to palloc/pfree to fix the leak.

I did not use YYSTACK_ALLOC / YYSTACK_FREE that exists in newer bison,
because bison 1.875 does not support those.*** a/src/backend/parser/gram.y
--- b/src/backend/parser/gram.y
***
*** 60,69 
--- 60,75 
  #include utils/date.h
  #include utils/datetime.h
  #include utils/numeric.h
  #include utils/xml.h
  
+ /*
+  * bison does not re-use buffers, so we can redefine malloc/free
+  * to palloc/pfree.
+  */
+ #define malloc palloc
+ #define free   pfree
  
  /* Location tracking support --- simpler than bison's default */
  #define YYLLOC_DEFAULT(Current, Rhs, N) \
do { \
if (N) \
*** TableFuncTypeName(List *columns)
*** 10017,10022 
--- 10023,10032 
   * Must undefine base_yylex before including scan.c, since we want it
   * to create the function base_yylex not filtered_base_yylex.
   */
  #undef base_yylex
  
+ /* scan.c needs real malloc */
+ #undef malloc
+ #undef free
+ 
  #include scan.c

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Cleanup of GUC units code

2008-08-31 Thread Marko Kreen
Current GUC units code has 2 problems:

- It requires case-sensiteive representation of units (kB).
  As the main point of units is to make configuring easier,
  requiring byte-exact typing makes them unnecessarily fragile.

- In attempt to preserve maximum range of values for INT64_IS_BUSTED
  systems, the code is written rather non-obvious way.

Attached patch replaces per-unit custom code with lookup table,
where each unit is represented as multiplier of base unit.
And it compares unit names case-insensitivaly.

It sacrifices some range on INT64_IS_BUSTED systems, but as the only promise
we offer them is that Postgres may compile I don't see it as problem.

In case people like the case-sensitivity, it can be restored and the patch
applied as plain cleanup.*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***
*** 95,111 
  
  #define KB_PER_MB (1024)
  #define KB_PER_GB (1024*1024)
  
  #define MS_PER_S 1000
- #define S_PER_MIN 60
  #define MS_PER_MIN (1000 * 60)
- #define MIN_PER_H 60
- #define S_PER_H (60 * 60)
  #define MS_PER_H (1000 * 60 * 60)
- #define MIN_PER_D (60 * 24)
- #define S_PER_D (60 * 60 * 24)
  #define MS_PER_D (1000 * 60 * 60 * 24)
  
  /* XXX these should appear in other modules' header files */
  extern bool Log_disconnections;
  extern intCommitDelay;
--- 95,106 
*** parse_bool(const char *value, bool *resu
*** 4115,4124 
--- 4110,4215 
}
return true;
  }
  
  
+ #if BLCKSZ  1024 || XLOG_BLCKSZ  1024
+ #error BLCKSZ or XLOG_BLCKSZ less than 1K is not supported by GUC units code.
+ #endif
+ 
+ /*
+  * Accepted unit abbrevations.
+  *
+  * Units are represented in terms of base unit.
+  *
+  * Base unit for memory is 1 kbyte.
+  * Base unit for time is 1 ms.
+  *
+  * If several units of same type share common prefix, longer
+  * must come first. Does not apply if different types.
+  */
+ static const GucUnit unit_list [] = {
+   /* memory units */
+   { kb, GUC_UNIT_MEMORY, 1 },
+   { mb, GUC_UNIT_MEMORY, KB_PER_MB },
+   { gb, GUC_UNIT_MEMORY, KB_PER_GB },
+   /* time units */
+   { ms,  GUC_UNIT_TIME, 1 },
+   { s,   GUC_UNIT_TIME, MS_PER_S },
+   { min, GUC_UNIT_TIME, MS_PER_MIN },
+   { h,   GUC_UNIT_TIME, MS_PER_H },
+   { d,   GUC_UNIT_TIME, MS_PER_D },
+   /* end marker */
+   { NULL },
+ };
+ 
+ /* hints when unit parsing fails. keep them near unit list */
+ static const char *hintmsg_time = gettext_noop(Valid units for this 
parameter are \kB\, \MB\, and \GB\.);
+ static const char *hintmsg_mem = gettext_noop(Valid units for this parameter 
are \ms\, \s\, \min\, \h\, and \d\.);
+ 
+ /* returns guc_unit struct or NULL for error */
+ static const GucUnit *
+ lookup_unit(const char *s, int flags)
+ {
+   const GucUnit *unit;
+   for (unit = unit_list; unit-abbr; unit++)
+   {
+   if ((flags  unit-typemask) == 0)
+   continue;
+   if (pg_strncasecmp(s, unit-abbr, strlen(unit-abbr)) == 0)
+   return unit;
+   }
+   return NULL;
+ }
+ 
+ /* returns final value */
+ static int64
+ apply_unit(const GucUnit *unit, int flags, int64 val, char **errstr)
+ {
+   /* divider to get real units from base units */
+   int div;
+   /* final result */
+   int64 res;
+ 
+   switch (flags  unit-typemask)
+   {
+   /* memory values */
+   case GUC_UNIT_KB:
+   div = 1;
+   break;
+   case GUC_UNIT_BLOCKS:
+   div = BLCKSZ / 1024;
+   break;
+   case GUC_UNIT_XBLOCKS:
+   div = XLOG_BLCKSZ / 1024;
+   break;
+   /* time values */
+   case GUC_UNIT_MS:
+   div = 1;
+   break;
+   case GUC_UNIT_S:
+   div = MS_PER_S;
+   break;
+   case GUC_UNIT_MIN:
+   div = MS_PER_MIN;
+   break;
+   default:
+   /* can happen only if new GUC_UNIT_* is introduced */
+   *errstr = gettext_noop(BUG: apply_unit not updated);
+   return val;
+   }
+ 
+   res = val * unit-multiplier / div;
+ 
+ #ifdef INT64_IS_BUSTED
+   /* check for overflow */
+   if (res  (val / div) * unit-multiplier)
+   *errstr = gettext_noop(unit overflow);
+ #endif
+ 
+   return res;
+ }
  
  /*
   * Try to parse value as an integer.  The accepted formats are the
   * usual decimal, octal, or hexadecimal formats, optionally followed by
   * a unit name if flags indicates a unit is allowed.
*** parse_bool(const char *value, bool *resu
*** 4131,4140 
--- 4222,4233 
  bool
  parse_int(const char *value, int *result, int flags, const char **hintmsg)
  {

Re: [HACKERS] Is this really really as designed or defined in some standard

2008-08-31 Thread David Fetter
On Mon, Sep 01, 2008 at 12:55:21AM +0300, Hannu Krosing wrote:
 It seems that we allow several function arguments to have same 
 name (or is it label :)

Ugh!

 hannu=# create or replace function ff(a int, a int) returns int language
 plpgsql as $$begin return $1+$2; end;$$;
 CREATE FUNCTION
 hannu=# select ff(1,1);
  ff 
 
   2
 (1 row)
 
 hannu=# select ff(1,2);
  ff 
 
   3
 (1 row)
 
 hannu=# create or replace function ffa(a int, a int) returns int
 language plpgsql as $$begin return a + a; end;$$;
 CREATE FUNCTION
 hannu=# select ffa(1,2);
  ffa 
 -
2
 (1 row)
 
 Is this defined by some standard or just an oversight ?

This looks like a bug.

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TODO item: Implement Boyer-Moore searching (First time hacker)

2008-08-31 Thread David Rowley
Attached is a patch that I've been working on to implement boyer-moore
searching for strpos() and replace().

 

Benchmarking has been done, it's in spreadsheet format. It can be downloaded
from http://www.unixbeast.com/~fat/8.3_test.xls

 

This is my first patch. Please let me know if there is anything that I've
overlooked.

 

David.

 

 

-Original Message-
From: David Rowley [mailto:[EMAIL PROTECTED] 
Sent: 31 August 2008 00:21
To: 'pgsql-hackers@postgresql.org'
Subject: TODO item: Implement Boyer-Moore searching (First time hacker)

 

 

Reference: Bruce Momjian writes: -
http://archives.postgresql.org/pgsql-committers/2007-09/msg00402.php

Other references: Boyer Moore?? -
http://www.cs.utexas.edu/users/moore/best-ideas/string-searching/fstrpos-exa
mple.html

 

 

As a first time hacker of postgresql I've decided to attempt something that
does not go too deep in to the heart of the software.

 

Before I make an attempt at writing a patch, I've decided to start
benchmarking outside of Postgres. On reading the link underneath the Boyer
Moore item in the in the new TODO list wiki, I learned that someone else had
a shot at speeding up strpos around this time last year using KMP algotithm.
On looking a little deeper into Boyer Moore's algorithm it's quite obvious
that there is some overhead involved with preprocessing the search key
(needle). My target has been to not have any trade offs with this possibly
to be patch.  I didn't want smaller searches to suffer.

 

I'd like to receive some feedback on the various functions in the attached
before I go ahead and work on this any more. I've written 8 different
versions of the function. Version 8 seems to be the fastest out of them,
however I've not tested with multi byte chars.

 

Please note that these are just some initial ideas. For example startpos is
not yet implemented, but there is very little overhead to that anyway.

 

For comparision sake I mocked up the current postgresql 8.3's strpos() from
varlena.c. I've no idea how accuratly this will be to the real version.
Probably find out once I write the patch. I replaced strncmp with my own
version as I was having problems with my compilers optimizer, it optimized
too much and I was unable to benchmark. I felt running without and
optimization was unfair as strncmp is. Perhaps if someone else wants to
benchmark with -O2 on gcc they should first replace pg_strncmp with strncmp.

 

 

Currently I have a single file which can be compiled itself that contains
benchmarks for the functions I've tested so far. All of these implement a
stripped out version of the Boyer Moore algotithm. I've done away with one
of the character maps in a bid to reduce the overhead involved creating
them.

 

The position calculation will likely need changed for multi byte characters.
I'm suspecting I'm not meant to be doing length calculations on pointers.
Can someone confirm?

 

To keep this email short I've put most of the information required in as
comments into the source of the program.

 

I've uploaded some benchmark results using the attached program. The
benchmarking was done on windows.

Please see http://www.unixbeast.com/~fat/test1.html  there are 10 tests in
total, in each I attempt to exploit weaknesses, some with my functions and
some with the existing function.

 

My method for version 8 of the function probably looks quite unusual as it
does some simple cost based optimisation

 

I look forward to receiving feedback on this.

 

David.

 

 

 

 

 

 

 



boyer-moore-strpos.v1.tar.gz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Our CLUSTER implementation is pessimal

2008-08-31 Thread Gregory Stark

One thing that's been annoying me for a while is that our CLUSTER
implementation is really very slow. When I say very slow I mean it's really
very very very slow.

The problem is that it does a full index scan and looks up each tuple in the
order of the index. That means it a) is doing a lot of random i/o and b) has
to access the same pages over and over again.

Our query planner knows better and normally does a sequential heap scan and
sort for queries of this type. But CLUSTER has to use the index.

Just the other day I wanted to randomize the order of a table so I added a
random column, indexed it and started cluster running. After 6 hours I looked
at how far it had gotten and found it was only about 20% done rewriting the
heap (not including rebuilding the indexes). I cancelled it and built a new
table in the desired orderwith CREATE TABLE AS ... ORDER BY in 45 minutes!

There are a couple problems with this:

a) We need some way to decide *when* to do a sort and when to do an index
scan. The planner has all this machinery but we don't really have all the
pieces handy to use it in a utility statement. This is especially important
for the case where we're doing a cluster operation on a table that's already
clustered. In that case an index scan could conceivably actually win (though I
kind of doubt it). I don't really have a solution for this.

b) tuplesort no longer has the pieces needed to sort whole tuples including
visibility info. And actually even the old pieces that were removed had not
quite the right interface and behaviour. We need to preserve t_self for the
heap rewrite tools and we need to be able to use _bt_mkscankey_nodata() to
generate the scan keys that match the index. The cleanest approach I think is
to just add back in the old raw tuple methods to tuplesort and tweak them to
preserve t_self and take Scankeys instead of attnums etc.

c) expression indexes are a problem. We would have to start with constructing
new tuples to hold the expression values but retain the entire original heap
tuple. Perhaps pass both an indextuple and a heaptuple to tuplesort and store
both in the sorttuple? For now I figure we can just punt on expression indexes
and always do an index sacn.

I tested out the idea and it works reasonably well. The code might need some
cleanup including, I think refactoring cluster_rel() and possibly tweaking the
interface to tuplesort. But I'm fairly happy with it.

The results clustering a 2.2GB table with 2,000,000 rows on my 1.5GB laptop
using maintenance_work_mem of 768MB and work_mem of 256MB:

postgres=# \d x

   Table public.x
 Column |   Type   | Modifiers 
+--+---
 i  | integer  | 
 r  | double precision | 
 repeat | text | 
Indexes:
xi btree (i)
xi2 btree ((i + 0))
xir btree (r) CLUSTER

postgresql=# CLUSTER xi ON x;
CLUSTER
Time: 736029.322 ms

postgresql=# CLUSTER xir ON x;
CLUSTER
Time: 723610.507 ms

postgresql=# CLUSTER xi2 ON x;
CLUSTER
Time: 12842793.346 ms


That's 3.5 hours for traditional cluster and 12 minutes for cluster using a
sort...

Index: src/backend/commands/cluster.c
===
RCS file: /home/stark/src/REPOSITORY/pgsql/src/backend/commands/cluster.c,v
retrieving revision 1.177
diff -c -r1.177 cluster.c
*** src/backend/commands/cluster.c	19 Jun 2008 00:46:04 -	1.177
--- src/backend/commands/cluster.c	31 Aug 2008 11:36:14 -
***
*** 19,24 
--- 19,25 
  
  #include access/genam.h
  #include access/heapam.h
+ #include access/nbtree.h
  #include access/relscan.h
  #include access/rewriteheap.h
  #include access/transam.h
***
*** 46,52 
  #include utils/snapmgr.h
  #include utils/syscache.h
  #include utils/tqual.h
! 
  
  /*
   * This struct is used to pass around the information on tables to be
--- 47,53 
  #include utils/snapmgr.h
  #include utils/syscache.h
  #include utils/tqual.h
! #include utils/tuplesort.h
  
  /*
   * This struct is used to pass around the information on tables to be
***
*** 713,725 
  	int			natts;
  	Datum	   *values;
  	bool	   *isnull;
- 	IndexScanDesc scan;
  	HeapTuple	tuple;
  	bool		use_wal;
  	TransactionId OldestXmin;
  	TransactionId FreezeXid;
  	RewriteState rwstate;
  
  	/*
  	 * Open the relations we need.
  	 */
--- 714,739 
  	int			natts;
  	Datum	   *values;
  	bool	   *isnull;
  	HeapTuple	tuple;
  	bool		use_wal;
  	TransactionId OldestXmin;
  	TransactionId FreezeXid;
  	RewriteState rwstate;
  
+ 	bool	do_sort;
+ 
+ 	/* We use this if we're doing an index scan */
+ 	IndexScanDesc scan;
+ 
+ 	/* we need these if we're going to heap scan and sort */
+ 	HeapScanDesc hscan;
+ 	IndexInfo	*ii;
+ 	ScanKey		 scanKeys;
+ 	int			 i;
+ 
+ 	Tuplesortstate *tuplesort;
+ 	bool 		shouldfree;
+ 
  	/*
  	 * Open the relations we need.
  	 */
***
*** 773,782 
  	 * copied, we scan with 

Re: [HACKERS] posix advises ...

2008-08-31 Thread Alvaro Herrera
Greg Smith wrote:

 This patch does need a bit of general care in a couple of areas.  The  
 reviewing game plan I'm working through goes like this:

Did this review effort go anywhere?


-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Make gram.y use palloc/pfree for memory management

2008-08-31 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 This means gram.y can leak memory if error is throws in
 the middle of parsing.

Please offer some evidence for that claim.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Cleanup of GUC units code

2008-08-31 Thread Tom Lane
Marko Kreen [EMAIL PROTECTED] writes:
 - In attempt to preserve maximum range of values for INT64_IS_BUSTED
   systems, the code is written rather non-obvious way.

I do not personally object a bit to making the units comparisons
case-insensitive (I think it's mainly Peter who wants to be strict
about it).  I don't think there are any other good ideas in this
patch, however, and exposing ourselves to intermediate overflows in
the name of simplicity is definitely not one.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] WIP: extend parser to show error locations in more cases

2008-08-31 Thread Tom Lane
This is the threatened second pass at reporting parse error locations.
It makes the majority of the ereport's in backend/parser/ include an
error location.  It's not quite ready to apply, mainly because I haven't
prepared updated regression test outputs yet (something like 45 out of
the 115 existing tests will need changes, plus no doubt parts of
contrib).

There are some exceedingly useful (IMHO) improvements here --- for
example, if you have an aggregate in an illegal place, it'll point
right at the aggregate instead of just telling you there is one.
In a complex query with multiple levels of subquery this is not always
such an easy thing to know.

The main areas that are still lacking location information are errors
involving relation aliases and FOR UPDATE/FOR SHARE clauses.  I'm not
all that excited about those cases, and I'm running out of time before
September commit fest; so barring objection I'm going to clean up and
commit what I have.

regards, tom lane



bin48qiI8bCoV.bin
Description: more-error-locations-2.patch.gz

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window functions patch v04 for the September commit fest

2008-08-31 Thread Hitoshi Harada
2008/8/30 Hitoshi Harada [EMAIL PROTECTED]:
 Here's the latest window functions patch against HEAD. It seems to be
 ready for the September commit fest, as added documents, WINDOW clause
 feature and misc tests. I guess this would be the window functions
 feature freeze for 8.4. The remaining feature will be implemented for
 the later release.

 This patch consists of:
 - windowed aggregates
 - cooperate with GROUP BY aggregates
 - some optimizations with multiple windows
 - ranking functions
 - WINDOW clause
 - windowing SQL regression tests
 - sgml documents

 Looking up the total road map, the dropped features are:

 - sliding window (window framing)
 - lead(), lag(), etc. that reach for random rows
 - user defined window functions

 The first and second topics are difficult to implement currently.
 Because these features require random row access, it seems that
 tuplestore would be able to save multiple positions to mark/restore.
 This is fundamental change that is over my capability. Also, user
 defined window functions seem to have much more to decide. I think I
 can't put into shape the general needs of user's window functions now.
 Lacking these feature, this stage looks compatible to SQLServer 2005,
 while Oracle and DB2 have almost full of the specification.

 Also, current implementation has only a type of plan which uses sort
 operation. It should be optimized by re-position the windows and/or
 using hashtable.

 Oh, git hosting repository is updated as well.

README is updated.
http://umitanuki.net/pgsql/wfv04/design.html

Please add link to commit fest wiki.

Regards,

-- 
Hitoshi Harada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Window functions patch v04 for the September commit fest

2008-08-31 Thread David Fetter
On Mon, Sep 01, 2008 at 12:15:19PM +0900, Hitoshi Harada wrote:
 README is updated.
 http://umitanuki.net/pgsql/wfv04/design.html
 
 Please add link to commit fest wiki.

Added :)

Cheers,
David.
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: [EMAIL PROTECTED]

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers