Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread Valentine Gogichashvili
Hi,

with the FOR e IN SELECT UNNEST(a) construct there is an issue, again related
to the unnesting of composite type arrays:

BEGIN;
CREATE TYPE truple AS (i integer, a text, b text);

DO $SQL$
DECLARE
  start_time timestamp;
  t truple;
  ta truple[] := ARRAY( select ROW(s.i, 'A' || (s.i)::text, 'B' ||
(s.i)::text )::truple from generate_series(1, 1) as s(i) );
  i integer := 1;
BEGIN
  start_time := clock_timestamp();
  FOR t IN SELECT UNNEST(ta) LOOP
raise info 't is %', t;
i := i + 1;
  END LOOP;
  RAISE INFO 'looped in %', clock_timestamp() - start_time;
END;
$SQL$;
ROLLBACK;

fails with ERROR:  invalid input syntax for integer: "(1,A1,B1)"
CONTEXT:  PL/pgSQL function inline_code_block line 8 at FOR over SELECT
rows

So to unnest such an array one has to use SELECT * FROM UNNEST(a) to be able
to loop over it, like:

BEGIN;
CREATE TYPE truple AS (i integer, a text, b text);

DO $SQL$
DECLARE
  start_time timestamp;
  t truple;
  ta truple[] := ARRAY( select ROW(s.i, 'A' || (s.i)::text, 'B' ||
(s.i)::text )::truple from generate_series(1, 1) as s(i) );
  i integer := 1;
BEGIN
  start_time := clock_timestamp();
  FOR t IN SELECT * FROM UNNEST(ta) LOOP
raise info 't is %', t;
i := i + 1;
  END LOOP;
  RAISE INFO 'looped in %', clock_timestamp() - start_time;
END;
$SQL$;
ROLLBACK;
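
The difference between the two SELECT forms can be seen directly in plain SQL (a minimal sketch reusing the truple type from the examples above; run it in its own transaction):

```sql
BEGIN;
CREATE TYPE truple AS (i integer, a text, b text);

-- unnest() in the SELECT list produces a single column of type truple,
-- so the FOR loop tries to assign the whole row value to t.i:
SELECT unnest(ARRAY[ROW(1, 'A1', 'B1')::truple]);

-- unnest() in the FROM clause expands the composite into three columns
-- (i, a, b), which line up with the fields of the loop variable t:
SELECT * FROM unnest(ARRAY[ROW(1, 'A1', 'B1')::truple]);
ROLLBACK;
```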

Is it a bug or a feature? And if the latter, then any work on optimizing FOR
e IN SELECT UNNEST(a) should probably also cover the FOR e IN SELECT * FROM
UNNEST(a) form.

Also, would the suggested FOR-IN-ARRAY construct be able to loop over such
composite type arrays?

Best regards,

-- Valentine Gogichashvili


On Thu, Nov 18, 2010 at 8:16 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Pavel Stehule pavel.steh...@gmail.com writes:
  2010/11/18 Tom Lane t...@sss.pgh.pa.us:
  The problem here is that FOR is a syntactic choke point: it's already
  overloaded with several different sub-syntaxes that are quite difficult
  to separate.  Adding another one makes that worse, with the consequences
  that we might misinterpret the user's intent, leading either to
  misleading/unhelpful error messages or unexpected runtime behavior.

  yes, this argument is correct - but we can rearrange the parser rules
  related to the FOR statement. It can be solved.

 No, it can't.  The more things that can possibly follow FOR, the less
 likely that you correctly guess which one the user had in mind when
 faced with something that's not quite syntactically correct.  Or maybe
 it *is* syntactically correct, only not according to the variant that
 the user thought he was invoking.  We've seen bug reports of this sort
 connected with FOR already; in fact I'm pretty sure you've responded to
 a few yourself.  Adding more variants *will* make it worse.  We need
 a decent return on investment for anything we add here, and this
 proposal just doesn't offer enough benefit.

regards, tom lane

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Magnus Hagander
On Tue, Oct 5, 2010 at 17:21, Craig Ringer cr...@postnewspapers.com.au wrote:
 OK, it's pretty much ready for proper testing now. If a few people are happy
 with the results and think it's a good idea I'll chuck it in the commitfest
 app.

 As built, the crash dump handler works with a stock PostgreSQL 9.0 (release
 build) as shipped in EDB's installer. Just drop crashdump.dll in lib\,
 optionally pop the dbghelp.dll redist in bin\, add 'crashdump' to
 shared_preload_libraries, and crash some backends however you feel like
 doing so.

 The current build of crashdump.dll for release versions of PostgreSQL 9.0 on
 32-bit Windows is here:

  http://www.postnewspapers.com.au/~craig/webfiles/crashdump.dll

 If folks are happy with how this works, all it needs is:

 - Consideration of whether elog should be used or not. I'm inclined to
  suggest using elog to tell the user about the dump, but only after
  the crash dump has been written successfully.

 - Comments on whether this should be left as a loadable module, or
  integrated into core so it loads early in backend startup. The latter
  will permit crash dumping of early backend startup problems, and will
  have tiny overhead because there's no DLL to find and load. OTOH, it's
  harder to pull out if somehow something breaks.

  I'd want to start with loadable module in shared_preload_libraries
  and if that proves useful, only then consider integrating in core.
  I'm way too bad a programmer to want my code anywhere near Pg's core
  without plenty of real world testing.

 - (Maybe) a method to configure the crash dump type. I'm not too sure
  it's necessary given the set of dump flags I've landed up with,
  so I'd leave this be unless it proves to be necessary in real-world
  testing.

Finally getting to looking at this one - sorry about the very long delay.

I agree with Heikki's earlier comment that it's better to have this
included in the backend - but that's obviously not going to happen for
already-released versions. I'd therefore advocate a contrib module for
existing versions, and then in core for 9.1+.

We should then have an option to turn it off (on by default). But we
don't need to pay the overhead on every backend startup - we could
just map the value into the parameter shared memory block, and require
a full postmaster restart in order to change it.

Do we want to backpatch it into contrib/? Adding a new module there
seems kind of wrong - probably better to keep the source separate and
just publish the DLL files for people who do debugging?

Looking at the code:
* Why do we need to look at different versions of dbghelp.dll? Can't we
always use the one shipped with Windows? And in that case, can't we just
use compile-time linking - do we need to bother with DLL loading at all?

* Per your comment about using elog() - a good idea in that case is to
use write_stderr(). That will send the output to stderr if there is
one, and otherwise to the eventlog (when running as a service).

* And yes, in the crash handler, we should *definitely* not use elog().

* On Unix, the core file is dropped in the database directory; we
don't have a separate directory for crash dumps. If we want to be
consistent, we should do that here too. I do think that storing them
in a directory like crashdumps is better, but I just wanted to raise
the comment.

* However, when storing it in crashdumps, I think the code would need
to create that directory if it does not exist, doesn't it?

* Right now, we overwrite old crashdumps. It is well known that PIDs
are recycled pretty quickly on Windows - should we perhaps dump as
postgres-pid-sequence.mdmp when there is a filename conflict?

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/







Re: Latches with weak memory ordering (Re: [HACKERS] max_wal_senders must die)

2010-11-22 Thread Heikki Linnakangas

On 21.11.2010 15:18, Robert Haas wrote:

On Sat, Nov 20, 2010 at 4:07 PM, Tom Lane t...@sss.pgh.pa.us wrote:

Robert Haas robertmh...@gmail.com writes:

So what DO we need to guard against here?


I think the general problem can be stated as "process A changes two or
more values in shared memory in a fairly short span of time, and process
B, which is concurrently examining the same variables, sees those
changes occur in a different order than A thought it made them in".

In practice we do not need to worry about changes made with a kernel
call in between, as any sort of context swap will cause the kernel to
force cache synchronization.

Also, the intention is that the locking primitives will take care of
this for any shared structures that are protected by a lock.  (There
were some comments upthread suggesting maybe our lock code is not
bulletproof; but if so that's something to fix in the lock code, not
a logic error in code using the locks.)

So what this boils down to is being an issue for shared data structures
that we access without using locks.  As, for example, the latch
structures.


So is the problem case a race involving owning/disowning a latch vs.
setting that same latch?


No. (or maybe that as well, but that's not what we've been concerned 
about here). As far as I've understood correctly, the problem is that 
process A does something like this:


/* set a shared variable */
((volatile bool *) shmem)->variable = true;
/* Wake up process B to notice that we changed the variable */
SetLatch();

And process B does this:

for (;;)
{
  ResetLatch();
  if (((volatile bool *) shmem)->variable)
DoStuff();

  WaitLatch();
}

This is the documented usage pattern of latches. The problem arises if 
process A runs just before ResetLatch, but the effect of setting the 
shared variable doesn't become visible until after the if-test in 
process B. Process B will clear the is_set flag in ResetLatch(), but it 
will not DoStuff(), so it in effect misses the wakeup from process A and 
goes back to sleep even though it would have work to do.


This situation doesn't arise in the current use of latches, because the 
shared state comparable to shmem->variable in the above example is 
protected by a spinlock. But it might become an issue in some future use 
case.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] format() with embedded to_char() formatter

2010-11-22 Thread Itagaki Takahiro
format() function is very useful for constructing formatted text,
but it doesn't support embedded formatters, unlike sprintf() in C.
Of course, we can use to_char() functions for each argument value,
but an embedded formatter would be more readable.

I'd like to propose a %{...}s syntax, where format('%{xxx}s', arg)
is equivalent to format('%s', to_char(arg, 'xxx')). I think this
approach is better than implementing a C-like formatter because we
can reuse the existing to_char() functions for the purpose.

Here are examples for the usage:

=# SELECT format('%{FM}s : %{YYYY-MM-DD}L', 123, current_timestamp);
       format
---------------------
 0123 : '2010-11-22'

=# SELECT format('CREATE TABLE partition_%{YYYYMMDD}s () INHERITS
parent',  current_date);
                       format
----------------------------------------------------
 CREATE TABLE partition_20101122 () INHERITS parent
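
Without the proposed syntax, the same results require explicit to_char() calls; the %{...}s form would be sugar for this (the FM pattern below is a hypothetical stand-in, since the exact pattern in the first example is garbled):

```sql
SELECT format('%s : %L',
              to_char(123, 'FM0000'),                    -- hypothetical pattern
              to_char(current_timestamp, 'YYYY-MM-DD'));
```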

Is it interesting? Comments welcome.

-- 
Itagaki Takahiro



Re: [HACKERS] Explain analyze getrusage tracking

2010-11-22 Thread Magnus Hagander
On Mon, Nov 15, 2010 at 03:33, Greg Stark st...@mit.edu wrote:
 This is an update to my earlier patch to add getrusage resource
 tracking to EXPLAIN ANALYZE.

 With this patch you get something like:

                                                  QUERY PLAN
 --
  Seq Scan on i  (cost=0.00..6919.44 rows=262144 width=101) (actual
 time=17.240..1123.751 rows=262144 loops=1)
   Resources: sys=210.000ms user=430.000ms read=33.6MB
   Buffers: shared read=4298
  Total runtime: 1548.651 ms
 (4 rows)

 The main change is to make it work under Windows. At least I think the
 changes should make it work under Windows, I haven't been able to test
 it. Actually I'm not too happy with the way I did it; I would be more
 inclined to hack the getrusagestub.h definition of struct rusage to
 have an instr_time in it so that we can use the same macros directly.
 But that's more changes than I would be happy making without being
 able to compile them to test them.

I tried building this under windows, and got a bunch of errors.

First and easiest - you need to rename IOCOUNTERS to IO_COUNTERS in
getrusage.c :P

But then I get a number of:
c:\pgsql\src\include\portability/instr_time.h(133) : error C2371: 'instr_time' :
 redefinition; different basic types

and
.\src\backend\utils\adt\pgstatfuncs.c(1345) : error C2039: 'QuadPart' : is not a
 member of 'timeval'
C:\Program Files\Microsoft SDKs\Windows\v6.1\include\winsock2.h(176) : see declaration of 'timeval'


which I believe are related to the same issue. I haven't looked closely
enough to figure out what you actually intend for it to be :-)

Finally, a number of:
c:\pgsql\src\include\executor/instrument.h(19) : fatal error C1083: Cannot open
include file: 'sys/resource.h': No such file or directory

That include file simply doesn't exist on Windows. Hiding it behind an
#ifdef leads to complaints about fields missing from struct rusage in
some cases, and about a missing rusage definition in others. I think you
need to include pg_rusage.h, which will make sure it brings in
rusagestub.h when necessary and sys/resource.h when it's there?


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



[HACKERS] patch: fix performance problems with repeated decompression of varlena values in plpgsql

2010-11-22 Thread Pavel Stehule
Hello

this patch removes multiple detoasting of varlena values in plpgsql.

It is useful mainly for iterating over a longer array loaded directly
from a relation.

It doesn't have any impact on semantics or behavior - it just
eliminates a performance trap.

sample: table 1 rows one column with array with 1000 string fields:

patched pl time: 6 sec
unpatched pl time: 170 sec

This doesn't change my opinion on the FOR-IN-ARRAY cycle (it is still
important for readability) - this just removes one critical performance
issue.

Regards

Pavel Stehule
*** ./pl_exec.c.orig	2010-11-16 10:28:42.0 +0100
--- ./pl_exec.c	2010-11-22 13:33:01.597726809 +0100
***************
*** 222,227 ****
--- 222,228 ----
  	 * Setup the execution state
  	 */
  	plpgsql_estate_setup(&estate, func, (ReturnSetInfo *) fcinfo->resultinfo);
+ 	estate.fn_mcxt = CurrentMemoryContext;
  
  	/*
  	 * Setup error traceback support for ereport()
***************
*** 255,260 ****
--- 256,265 ----
  	var->value = fcinfo->arg[i];
  	var->isnull = fcinfo->argnull[i];
  	var->freeval = false;
+ 
+ 	/* only varlena types should be detoasted */
+ 	var->should_be_detoasted = !var->isnull && !var->datatype->typbyval
+ 			&& var->datatype->typlen == -1;
  }
  break;
  
***************
*** 493,498 ****
--- 498,504 ----
  	 * Setup the execution state
  	 */
  	plpgsql_estate_setup(&estate, func, NULL);
+ 	estate.fn_mcxt = CurrentMemoryContext;
  
  	/*
  	 * Setup error traceback support for ereport()
***************
*** 570,581 ****
--- 576,589 ----
  		elog(ERROR, "unrecognized trigger action: not INSERT, DELETE, UPDATE, or TRUNCATE");
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_name_varno]);
  	var->value = DirectFunctionCall1(namein,
  			  CStringGetDatum(trigdata->tg_trigger->tgname));
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_when_varno]);
  	if (TRIGGER_FIRED_BEFORE(trigdata->tg_event))
***************
*** 588,593 ****
--- 596,602 ----
  		elog(ERROR, "unrecognized trigger execution time: not BEFORE, AFTER, or INSTEAD OF");
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_level_varno]);
  	if (TRIGGER_FIRED_FOR_ROW(trigdata->tg_event))
***************
*** 598,620 ****
--- 607,633 ----
  		elog(ERROR, "unrecognized trigger event type: not ROW or STATEMENT");
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_relid_varno]);
  	var->value = ObjectIdGetDatum(trigdata->tg_relation->rd_id);
  	var->isnull = false;
  	var->freeval = false;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_relname_varno]);
  	var->value = DirectFunctionCall1(namein,
  			CStringGetDatum(RelationGetRelationName(trigdata->tg_relation)));
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_table_name_varno]);
  	var->value = DirectFunctionCall1(namein,
  			CStringGetDatum(RelationGetRelationName(trigdata->tg_relation)));
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_table_schema_varno]);
  	var->value = DirectFunctionCall1(namein,
***************
*** 624,634 ****
--- 637,649 ----
     trigdata->tg_relation;
  	var->isnull = false;
  	var->freeval = true;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_nargs_varno]);
  	var->value = Int16GetDatum(trigdata->tg_trigger->tgnargs);
  	var->isnull = false;
  	var->freeval = false;
+ 	var->should_be_detoasted = false;
  
  	var = (PLpgSQL_var *) (estate.datums[func->tg_argv_varno]);
  	if (trigdata->tg_trigger->tgnargs > 0)
***************
*** 654,665 ****
--- 669,682 ----
  		-1, false, 'i'));
  		var->isnull = false;
  		var->freeval = true;
+ 		var->should_be_detoasted = false;
  	}
  	else
  	{
  		var->value = (Datum) 0;
  		var->isnull = true;
  		var->freeval = false;
+ 		var->should_be_detoasted = false;
  	}
  
  	estate.err_text = gettext_noop("during function entry");
***************
*** 841,846 ****
--- 858,864 ----
  new->value = 0;
  new->isnull = true;
  new->freeval = false;
+ new->should_be_detoasted = false;
  
  result = (PLpgSQL_datum *) new;
  			}
***************
*** 3544,3550 ****
--- 3562,3571 ----
  var->value = newvalue;
  var->isnull = *isNull;
  if (!var->datatype->typbyval && !*isNull)
+ {
  	var->freeval = true;
+ 	var->should_be_detoasted = var->datatype->typlen == -1;
+ }
  break;
  			}
  
***************
*** 3944,3949 ****
--- 3965,3993 ----
  
  *typeid = var->datatype->typoid;
  *typetypmod = var->datatype->atttypmod;
+ 
+ /*
+  * explicit deTOAST and decomprim for 

Re: [HACKERS] format() with embedded to_char() formatter

2010-11-22 Thread Pavel Stehule
Hello

There is a little complication: there is no single to_char() function,
so you cannot use the DirectFunctionCall API.
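
To spell out the complication: to_char() is a family of overloaded functions, one per source type (timestamp, interval, numeric, integer, ...), resolved by the argument's type, so there is no single C entry point a generic format() could invoke. A sketch at the SQL level:

```sql
-- Each call resolves to a different underlying C function:
SELECT to_char(now(), 'YYYY-MM-DD');  -- timestamp variant
SELECT to_char(123, 'FM9999');        -- integer variant
SELECT to_char(1.5, '9.99');          -- numeric variant
```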

but I am not against this proposal.

regards

Pavel

2010/11/22 Itagaki Takahiro itagaki.takah...@gmail.com:
 format() function is very useful for constructing formatted text,
 but it doesn't support embedded formatters, unlike sprintf() in C.
 Of course, we can use to_char() functions for each argument value,
 but an embedded formatter would be more readable.

 I'd like to propose a %{...}s syntax, where format('%{xxx}s', arg)
 is equivalent to format('%s', to_char(arg, 'xxx')). I think this
 approach is better than implementing a C-like formatter because we
 can reuse the existing to_char() functions for the purpose.

 Here are examples for the usage:

 =# SELECT format('%{FM}s : %{YYYY-MM-DD}L', 123, current_timestamp);
        format
 ---------------------
  0123 : '2010-11-22'

 =# SELECT format('CREATE TABLE partition_%{YYYYMMDD}s () INHERITS
 parent',  current_date);
                        format
 ----------------------------------------------------
  CREATE TABLE partition_20101122 () INHERITS parent

 Is it interesting? Comments welcome.

 --
 Itagaki Takahiro





Re: [HACKERS] patch: fix performance problems with repeated decompression of varlena values in plpgsql

2010-11-22 Thread Andrew Dunstan



On 11/22/2010 07:46 AM, Pavel Stehule wrote:

Hello

this patch removes multiple detoasting of varlena values in plpgsql.

It is useful mainly for iterating over a longer array loaded directly
from a relation.

It doesn't have any impact on semantics or behavior - it just
eliminates a performance trap.

sample: table 1 rows one column with array with 1000 string fields:

patched pl time: 6 sec
unpatched pl time: 170 sec



Since you haven't told us exactly how you tested this it's hard to gauge 
the test results.


cheers

andrew



Re: [HACKERS] patch: fix performance problems with repeated decompression of varlena values in plpgsql

2010-11-22 Thread Pavel Stehule
2010/11/22 Andrew Dunstan and...@dunslane.net:


 On 11/22/2010 07:46 AM, Pavel Stehule wrote:

 Hello

 this patch removes multiple detoasting of varlena values in plpgsql.

 It is useful mainly for iterating over a longer array loaded directly
 from a relation.

 It doesn't have any impact on semantics or behavior - it just
 eliminates a performance trap.

 sample: table 1 rows one column with array with 1000 string fields:

 patched pl time: 6 sec
 unpatched pl time: 170 sec


 Since you haven't told us exactly how you tested this it's hard to gauge the
 test results.

sorry - it is related to tests from FOR-IN-ARRAY thread

create table t1000(x text[]);

CREATE OR REPLACE FUNCTION rndstr() RETURNS text AS $$select
array_to_string(array(select substring('ABCDEFGHIJKLMNOPQ' FROM
(random()*16)::int FOR 1) from generate_series(1,10)),'')$$ LANGUAGE
sql;

create or replace function rndarray(int) returns text[] as
$$select array(select rndstr() from generate_series(1,$1)) $$ language sql;

insert into t1000 select rndarray(1000) from generate_series(1,1);

CREATE OR REPLACE FUNCTION public.filter02(text[], text, integer)
 RETURNS text[]
 LANGUAGE plpgsql
AS $function$
DECLARE
 s text[] := '{}';
 l int := 0; i int;
 v text;
BEGIN
 FOR i IN array_lower($1,1)..array_upper($1,1)
 LOOP
   EXIT WHEN l = $3;
   IF $1[i] LIKE $2 THEN
 s := s || $1[i];
 l := l + 1;
   END IF;
 END LOOP;
 RETURN s;
END;$function$

test query: select avg(array_upper(filter02(x,'%AA%', 10),1)) from t1000;

Regards

Pavel Stehule


 cheers

 andrew




Re: [HACKERS] Tab completion for view triggers in psql

2010-11-22 Thread Alvaro Herrera
Excerpts from David Fetter's message of dom nov 21 21:17:12 -0300 2010:

 Given its small and isolated nature, I was hoping we could get this in
 sooner rather than later.  As I understand it, CFs are there to review
 patches that take significant effort for even a committer to
 understand, so this doesn't really fit that model.

It's a 300 line patch -- far from small.  It takes me 10 minutes to go
through a 3 line change in docs.  Maybe that makes me slow, but then if
I'm too slow for your patch, I probably don't want to handle it anyhow.

Please let's try to not work around the process.

(Also, like Robert, I work from the CF app, not from the mailing list
anymore.  It's far more convenient -- at least when people follow the
rules.)

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread Jaime Casanova
On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Thu, Sep 30, 2010 at 7:46 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 this patch implements a new iteration construct - iteration over an
 array. The sense of this new iteration is:
  * a simple and cleaner syntax

 i will start the review of this one...

so, what is the consensus for this patch?
return with feedback? reject the patch on the grounds that we should
go fix unnest() if it's slow?
something else?

the patch itself works as advertised (in functionality) but I haven't
done much performance testing to see if we actually win something

-- 
Jaime Casanova         www.2ndQuadrant.com
Professional PostgreSQL: Soporte y capacitación de PostgreSQL



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 6:37 AM, Magnus Hagander mag...@hagander.net wrote:
 Do we want to backpatch it into contrib/? Adding a new module there
 seems kind of wrong - probably better to keep the source separate and
 just publish the DLL files for people who do debugging?

If this works without changes to core, I see little reason not to
back-patch it into contrib.  Our primary concern with back-patching is
to avoid doing things that might break existing installations, but if
there aren't any core changes, I don't really see how that can be an
issue here.  It seems to me that it's probably simpler for us and our
users to keep the debugging tools together with our main tree.

However, I am not clear what benefit we get from moving this into core
in 9.1.  If it's still going to require a full postmaster restart, the
GUC you have to change may as well be shared_preload_libraries as a
new one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 6:21 AM, Valentine Gogichashvili
val...@gmail.com wrote:
 Hi,
 with the FOR e IN SELECT UNNEST(a) construct there is an issue again related
 to the unnesting of composite type arrays:
 [ example ]
 Is it a bug or a feature?

It looks like the problem in this example is that PL/pgSQL tries to
assign the result of unnest(ta) to t.i rather than to t as a whole.
This is pretty ridiculously stupid in this case, but it would make
sense if the subselect were of the form SELECT a, b, c FROM ...

I haven't looked at the code to see whether there's a way to make this
case smarter (it would be nice - I've been bitten by similar problems
myself) but it's only tangentially related to the patch under
discussion.
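
A sketch of the assignment rule described here (fragments only, assuming t is a row variable of the composite type truple from Valentine's example):

```sql
-- One output column: its value is assigned to the first field, t.i,
-- so a composite value fails with the "invalid input syntax" error:
FOR t IN SELECT unnest(ta) LOOP /* ... */ END LOOP;

-- Three output columns: assigned positionally to t.i, t.a, t.b:
FOR t IN SELECT * FROM unnest(ta) LOOP /* ... */ END LOOP;
```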

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 8:39 AM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Wed, Nov 17, 2010 at 7:08 PM, Jaime Casanova ja...@2ndquadrant.com wrote:
 On Thu, Sep 30, 2010 at 7:46 AM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 Hello

 this patch implements a new iteration construct - iteration over an
 array. The sense of this new iteration is:
  * a simple and cleaner syntax

 i will start the review of this one...

 so, what is the consensus for this patch?
 return with feedback? reject the patch on the grounds that we should
 go fix unnest() if it's slow?
 something else?

I think it should be marked rejected.  I don't think Tom is likely to
ever be in favor of a syntax change here, and while I hesitate to deal
in absolutes, I don't think I will be either, and certainly not
without a lot more work on improving the performance of the existing
constructs.   In particular, this seems like something that really
ought to be pursued:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg01177.php

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Patch to add a primary key using an existing index

2010-11-22 Thread Gurjeet Singh
On Sat, Nov 20, 2010 at 9:00 AM, Steve Singer ssinger...@sympatico.ca wrote:


 Submission Review:
 ------------------

 Tests
 -----
 The expected output for the regression tests you added doesn't match
 what I'm getting when I run the tests with your patch applied.
 I think you just need to regenerate the expected results; they seem
 to be from a previous version of the patch (different error messages
 etc.).


Fixed. Also modified one test to cover the case where constraint name is
provided.



 Documentation
 -------------

 I was able to generate the docs.

 The ALTER TABLE page under the synopsis has

 ADD table_constraint

 where table_constraint is defined on the CREATE TABLE page.
 On the CREATE TABLE page, table_constraint isn't defined as having the
 WITH; the WITH is part of index_parameters.

 I propose the alter table page instead have

 ADD table_constraint [index_parameters]

 where index_parameters also references the CREATE TABLE page like
 table_constraint.


IMHO index_parameters is an optional component of table_constraint, and
hence can't be mentioned here, at least not the way shown above.

I have made slight improvements to the doc which might help the user
understand that this WITH(INDEX=) option is exclusive to ALTER TABLE and not
provided by CREATE TABLE.
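
For reference, a sketch of the usage under review (hypothetical object names; the WITH (INDEX=...) spelling is this patch's current proposal and could change before commit):

```sql
BEGIN;
CREATE TABLE t (id integer NOT NULL);
CREATE UNIQUE INDEX t_id_idx ON t (id);
-- Attach the existing unique index as the primary key instead of
-- building a new index; per the review, the index is then renamed
-- (with a NOTICE) to match the constraint name:
ALTER TABLE t ADD CONSTRAINT t_pkey PRIMARY KEY (id) WITH (INDEX = 't_id_idx');
ROLLBACK;
```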


 Usability Review
 ----------------

 Behaviour
 ---------
 I feel that if the ALTER TABLE ... renames the index
 a NOTICE should be generated.  We generate notices about creating an index
 for a new pkey. We should give them a notice that we are renaming an index
 on them.


Done.



 Coding Review:
 ==============

 Error Messages
 --------------
 in tablecmds.c your errdetail messages often don't start with a capital
 letter. I believe the preference is to have the errdetail strings start with
 a capital letter and end with a period.


Fixed.




 tablecmds.c  - get_constraint_index_oid

 contains the check

/* Currently only B-tree indexes are suupported for primary keys */
if (index_rel->rd_rel->relam != BTREE_AM_OID)
elog(ERROR, "\"%s\" is not a B-Tree index",
 index_name);

 but above we already validate that the index is a unique index with another
 check.  Today only B-tree indexes support unique constraints. If this
 changed at some point and we could have a unique index of some other type,
 would something in this patch need to be changed to support them?  If we are
 only depending on the uniqueness property then I think this check is covered
 by the uniqueness one higher in the function.

 Also note the typo in your comment above (suupported)


I agree; code removed.


 Comments
 --------

 index.c: Line 671 and 694.  Your indentation changes make the comments
 run over 80 characters.  If you end up submitting a new version
 of the patch I'd reformat those two comments.


Fixed.



 Other than those issues the patch looks good to me.


Thanks for your time Steve.

Regards,

PS: I will be mostly unavailable between 11/25 and 12/6, so wouldn't mind if
somebody took ownership of this patch for that duration.
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


replace_pkey_index.revised.patch.gz
Description: GNU Zip compressed data



Re: [HACKERS] Explain analyze getrusage tracking

2010-11-22 Thread Greg Stark
On Mon, Nov 22, 2010 at 12:40 PM, Magnus Hagander mag...@hagander.net wrote:
 I tried building this under windows, and got a bunch of errors.

Thanks!

-- 
greg



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 However, I am not clear what benefit we get from moving this into core
 in 9.1.  If it's still going to require a full postmaster restart, the
 GUC you have to change may as well be shared_preload_libraries as a
 new one.

+1.  I am not in favor of randomly repackaging functionality, unless
there's some clear benefit gained by doing so.  In this case it seems
like something that could and should remain at arm's length forever,
so a contrib module is the ideal packaging.

regards, tom lane



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 9:49 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 However, I am not clear what benefit we get from moving this into core
 in 9.1.  If it's still going to require a full postmaster restart, the
 GUC you have to change may as well be shared_preload_libraries as a
 new one.

 +1.  I am not in favor of randomly repackaging functionality, unless
 there's some clear benefit gained by doing so.  In this case it seems
 like something that could and should remain at arm's length forever,
 so a contrib module is the ideal packaging.

Now, if we could make this so low-overhead that it doesn't need a
switch, that would be a good argument for moving it into core, at
least to my way of thinking.  But if it's something that needs to be
enabled with a postmaster restart anyway, meh.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] format() with embedded to_char() formatter

2010-11-22 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@gmail.com writes:
 I'd like to propose %{...}s syntax, where format('%{xxx}s', arg)
 is equivalent to format('%s', to_char(arg, 'xxx')). I think the
 approach is better than implementing a C-like formatter because we
 can reuse existing to_char() functions for the purpose.

This seems pretty gross, not least because the existing to_char
functions are so limited and broken.  I don't really want to make
format() incorporate all the brain damage in timestamp to_char, in
particular.  Also, it doesn't seem that you're really getting much
notational leverage with this proposal.  And lastly, AFAICS there
is no way to do what you suggest without some really ugly kluges
in the parser --- I think the function parsing code would have to
have special cases to make format() work like this.

regards, tom lane



Re: [HACKERS] Extensions, this time with a patch

2010-11-22 Thread Dimitri Fontaine
Itagaki Takahiro itagaki.takah...@gmail.com writes:
 I checked cfparser.v2.patch.

Thanks for reviewing it!

 It exports the static parseRecoveryCommandFileLine() in xlog.c
 as the global cfParseOneLine() in cfparser.c without modification.

 It generates one warning, but it can be easily fixed.
   cfparser.c:34: warning: no previous prototype for 'cfParseOneLine'

Mmmm, that must have been a cherry-picking error on my side; it seems I
forgot to include this patch in the cfparser branch. Sorry about that.

  
http://git.postgresql.org/gitweb?p=postgresql-extension.git;a=commitdiff;h=e6b4c670a0189ee8a799521af06c1ee63f9e530e

 Some discussions about the patch:

 * Is "cf" the best name for the prefix? Less abbreviated forms might
   be less confusable. Personally, I prefer "conf".

Will change accordingly if that's the choice.

 * Can we export ParseConfigFile() in guc-file.l rather than
   parseRecoveryCommandFileLine()? It can solve the issue that unquoted
   parameter values in recovery.conf are not recognized. Even if we
   won't merge them, just allowing unquoted values would be useful.

Should we then consider recovery.conf entries as ordinary GUCs? That
would allow connecting to a standby and issuing 'show primary_conninfo'
there. This has been asked for before.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Peter Eisentraut
On mån, 2010-11-22 at 12:37 +0100, Magnus Hagander wrote:
 Do we want to backpatch it into contrib/?

It's not a bug fix or an upgrading aid, so no.





Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 On mån, 2010-11-22 at 12:37 +0100, Magnus Hagander wrote:
 Do we want to backpatch it into contrib/?

 It's not a bug fix or an upgrading aid, so no.

I'm less than thrilled about back-patching this, too.  It seems to fly
in the face of all our historical practice.

If you drop the bit about back-patching, then I don't particularly care
whether it goes into core or contrib for 9.1 --- whichever packaging
makes the most sense is fine.

regards, tom lane



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Magnus Hagander
On Mon, Nov 22, 2010 at 16:33, Tom Lane t...@sss.pgh.pa.us wrote:
 Peter Eisentraut pete...@gmx.net writes:
 On mån, 2010-11-22 at 12:37 +0100, Magnus Hagander wrote:
 Do we want to backpatch it into contrib/?

 It's not a bug fix or an upgrading aid, so no.

 I'm less than thrilled about back-patching this, too.  It seems to fly
 in the face of all our historical practice.

 If you drop the bit about back-patching, then I don't particularly care
 whether it goes into core or contrib for 9.1 --- whichever packaging
 makes the most sense is fine.

My suggestion in the first place was not to backpatch it - I just
wanted to get people's opinions. I'm perfectly happy if we keep it
somewhere else for the time being - as long as we make the binaries
easily available. But that can go on the wiki for example.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Magnus Hagander
On Mon, Nov 22, 2010 at 15:15, Robert Haas robertmh...@gmail.com wrote:
 On Mon, Nov 22, 2010 at 6:37 AM, Magnus Hagander mag...@hagander.net wrote:
 Do we want to backpatch it into contrib/? Adding a new module there
 seems kind of wrong - probably better to keep the source separate and
 just publish the DLL files for people who do debugging?

 If this works without changes to core, I see little reason not to
 back-patch it into contrib.  Our primary concern with back-patching is
 to avoid doing things that might break existing installations, but if
 there aren't any core changes, I don't really see how that can be an
 issue here.  It seems to me that it's probably simpler for us and our
 users to keep the debugging tools together with our main tree.

 However, I am not clear what benefit we get from moving this into core
 in 9.1.  If it's still going to require a full postmaster restart, the
 GUC you have to change may as well be shared_preload_libraries as a
 new one.

on-by-default is what we gain. I think that's fairly big...


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Extensions, this time with a patch

2010-11-22 Thread Itagaki Takahiro
On Mon, Nov 22, 2010 at 18:36, Dimitri Fontaine dimi...@2ndquadrant.fr wrote:

 * Can we export ParseConfigFile() in guc-file.l rather than
   parseRecoveryCommandFileLine()?

 Should we then consider recovery.conf entries as ordinary GUCs? That
 would allow to connect to a standby and issue 'show primary_conninfo'
 there. This has been asked before, already.

No. My suggestion was just to use the internal parser.
ParseConfigFile() returns parsed parameters as head_p and tail_p.
So, we can use it independently from GUC variables.

static bool
ParseConfigFile(const char *config_file, const char *calling_file,
int depth, GucContext context, int elevel,
struct name_value_pair **head_p,
struct name_value_pair **tail_p)

Special codes for include and custom_variable_classes in it
might not be needed to parse recovery.conf and extensions.
We would disable them when we parse non-GUC configuration files.
Or, we could allow include in recovery.conf if we think it is
useful in some cases.

-- 
Itagaki Takahiro



[HACKERS] dblink versus long connection strings

2010-11-22 Thread Tom Lane
This bug report:
http://archives.postgresql.org/pgsql-bugs/2010-11/msg00139.php
shows that this patch was ill-considered:
http://archives.postgresql.org/pgsql-committers/2010-06/msg00013.php
and this later attempt didn't fix it, because it still misbehaves in
HEAD:
http://archives.postgresql.org/pgsql-committers/2010-06/msg00070.php
not to mention that that second patch didn't even touch pre-8.4
branches.

I'm inclined to think that we should just change all the
truncate_identifier calls to warn=false, and forget about providing
identifier-truncated warnings here.  It's too difficult to tell whether
a string is really meant as an identifier.

regards, tom lane



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 on-by-default is what we gain. I think that's fairly big...

Only if that's actually what we want, which is far from clear in this
corner.  There are good reasons why most Linux distros configure daemons
not to dump core by default.  It's annoying when we are trying to debug
a Postgres crash, but that doesn't mean the reasons aren't good.

regards, tom lane



Re: [HACKERS] dblink versus long connection strings

2010-11-22 Thread Itagaki Takahiro
On Tue, Nov 23, 2010 at 01:27, Tom Lane t...@sss.pgh.pa.us wrote:
 This bug report:
 http://archives.postgresql.org/pgsql-bugs/2010-11/msg00139.php
 shows that this patch was ill-considered:
 http://archives.postgresql.org/pgsql-committers/2010-06/msg00013.php
 and this later attempt didn't fix it, because it still misbehaves in
 HEAD:
 http://archives.postgresql.org/pgsql-committers/2010-06/msg00070.php
 not to mention that that second patch didn't even touch pre-8.4
 branches.

 I'm inclined to think that we should just change all the
 truncate_identifier calls to warn=false, and forget about providing
 identifier-truncated warnings here.  It's too difficult to tell whether
 a string is really meant as an identifier.

It is not a truncated identifier, but I think the truncation is still
worth warning about, because we cannot distinguish two connections that
differ only after the first 63 bytes.

Do we need different logic to name unnamed connections?
For example, an md5 hash of the connection string.

-- 
Itagaki Takahiro



Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Heikki Linnakangas

On 20.11.2010 06:10, Joachim Wieland wrote:

2010/11/19 José Arthur Benetasso Villanova jose.art...@gmail.com:

The md5.c and kwlookup.c reuse via a link doesn't look nice either.
This way you need to compile twice, among other things, but I think
that it's temporary, right?


No, it isn't. md5.c is used in the same way by e.g. libpq, and there
are other examples of links in core; check out src/bin/psql for
example.


It seems like overkill to include md5 just for hashing the random bytes 
that getRandomData() generates. And if random() doesn't produce unique 
values, it's not going to get better by hashing it. How about using a 
timestamp instead of the hash?


If you don't initialize random() with srandom(), BTW, it will always 
return the same value.


But I'm not actually sure we should be preventing mix & match of files
from different dumps. It might be very useful to do just that sometimes, 
like restoring a recent backup, with the contents of one table replaced 
with older data. A warning would be ok, though.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] dblink versus long connection strings

2010-11-22 Thread Tom Lane
Itagaki Takahiro itagaki.takah...@gmail.com writes:
 On Tue, Nov 23, 2010 at 01:27, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm inclined to think that we should just change all the
 truncate_identifier calls to warn=false, and forget about providing
 identifier-truncated warnings here.  It's too difficult to tell whether
 a string is really meant as an identifier.

 It is not a truncated identifier, but I think the truncation is still
 worth warning because we cannot distinguish two connections that
 differ only 63 bytes.

The problem is to not give a warning when the string isn't meant as a
connection name at all, but as a libpq conninfo string (which can
perfectly reasonably run to more than 63 characters).  Most if not all
of the dblink functions will accept either.

Perhaps a reasonable compromise is to issue the truncation warnings when
an overlength name is being *entered* into the connection table, but not
for simple lookups.

regards, tom lane



Re: [HACKERS] dblink versus long connection strings

2010-11-22 Thread Andrew Dunstan



On 11/22/2010 11:51 AM, Tom Lane wrote:

Itagaki Takahiro itagaki.takah...@gmail.com writes:

On Tue, Nov 23, 2010 at 01:27, Tom Lane t...@sss.pgh.pa.us wrote:

I'm inclined to think that we should just change all the
truncate_identifier calls to warn=false, and forget about providing
identifier-truncated warnings here. It's too difficult to tell whether
a string is really meant as an identifier.

It is not a truncated identifier, but I think the truncation is still
worth warning because we cannot distinguish two connections that
differ only 63 bytes.

The problem is to not give a warning when the string isn't meant as a
connection name at all, but as a libpq conninfo string (which can
perfectly reasonably run to more than 63 characters).  Most if not all
of the dblink functions will accept either.

Perhaps a reasonable compromise is to issue the truncation warnings when
an overlength name is being *entered* into the connection table, but not
for simple lookups.


Can't we distinguish a name from a conninfo string by the presence of an 
= sign?


cheers

andrew



Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 But I'm not actually sure we should be preventing mix & match of files
 from different dumps. It might be very useful to do just that sometimes, 
 like restoring a recent backup, with the contents of one table replaced 
 with older data. A warning would be ok, though.

+1.  This mechanism seems like a solution in search of a problem.
Just lose the whole thing, and instead fix pg_dump to complain if
the target directory isn't empty.  That should be sufficient to guard
against accidental mixing of different dumps, and as Heikki says
there's not a good reason to prevent intentional mixing.

regards, tom lane



Re: [HACKERS] format() with embedded to_char() formatter

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 9:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Itagaki Takahiro itagaki.takah...@gmail.com writes:
 I'd like to propose %{...}s syntax, where format('%{xxx}s', arg)
 is equivalent to format('%s', to_char(arg, 'xxx')). I think the
 approach is better than implementing a C-like formatter because we
 can reuse existing to_char() functions for the purpose.

 This seems pretty gross, not least because the existing to_char
 functions are so limited and broken.  I don't really want to make
 format() incorporate all the brain damage in timestamp to_char, in
 particular.

If the existing to_char() functions are limited and broken, maybe we
ought to fix them.  I am reminded of some wit's quote that XML is like
violence - if it doesn't solve your problem, you aren't using enough
of it.  I'm not a believer in that philosophy, and don't think that
adding a whole new set of functions with incompatible semantics is the
right way to fix problems with the existing functions.  We have enough
of that already - especially around dates - and it sucks badly enough
as it is.  Non-orthogonality is bad.

 Also, it doesn't seem that you're really getting much
 notational leverage with this proposal.

I am not sure I agree.  It seems quite convenient to me to be able to
encode all the formatting crap in the message string, rather than
spreading it out all over the SQL statement.  Maybe time for another
poll on -general.

 And lastly, AFAICS there
 is no way to do what you suggest without some really ugly kluges
 in the parser --- I think the function parsing code would have to
 have special cases to make format() work like this.

Huh?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] dblink versus long connection strings

2010-11-22 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 11/22/2010 11:51 AM, Tom Lane wrote:
 Perhaps a reasonable compromise is to issue the truncation warnings when
 an overlength name is being *entered* into the connection table, but not
 for simple lookups.

 Can't we distinguish a name from a conninfo string by the presence of an 
 = sign?

No, because = isn't disallowed in names ...

regards, tom lane



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 10:17 AM, Peter Eisentraut pete...@gmx.net wrote:
 On mån, 2010-11-22 at 12:37 +0100, Magnus Hagander wrote:
 Do we want to backpatch it into contrib/?

 It's not a bug fix or an upgrading aid, so no.

I don't see why an upgrading aid would be worthy of back-patching, but
not a debugging aid.  I'd certainly prioritize those in the other
order.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] dblink versus long connection strings

2010-11-22 Thread Andrew Dunstan



On 11/22/2010 12:08 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 11/22/2010 11:51 AM, Tom Lane wrote:

Perhaps a reasonable compromise is to issue the truncation warnings when
an overlength name is being *entered* into the connection table, but not
for simple lookups.

Can't we distinguish a name from a conninfo string by the presence of an
= sign?

No, because = isn't disallowed in names ...


Ok, true, but it still might not be a bad heuristic to use for issuing a 
warning on lookup.


cheers

andrew



Re: [HACKERS] format() with embedded to_char() formatter

2010-11-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Nov 22, 2010 at 9:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 And lastly, AFAICS there
 is no way to do what you suggest without some really ugly kluges
 in the parser --- I think the function parsing code would have to
 have special cases to make format() work like this.

 Huh?

How exactly are you going to get from

format('string here', timestamp_expr)

to

format('string here', to_char(timestamp_expr))

especially seeing that to_char is not one function but an overloaded
family of functions (doubtless soon to become even more overloaded,
if this proposal is adopted)?

Or is the intention to replicate the parser's
overloaded-function-resolution behavior at runtime?  That seems awkward,
duplicative, slow, and probably prone to security issues (think
search_path).

Or perhaps Itagaki-san intended to hard-wire a fixed set of to_char
functions that format() knows about.  That seems to lose whatever minor
charms the proposal possessed, because it wouldn't be extensible without
changing format()'s C code.

regards, tom lane



Re: [HACKERS] dblink versus long connection strings

2010-11-22 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 11/22/2010 12:08 PM, Tom Lane wrote:
 No, because = isn't disallowed in names ...

 Ok, true, but it still might not be a bad heuristic to use for issuing a 
 warning on lookup.

The fine manual says that using = in a connection name might be unwise
because of the risk of confusion.  It doesn't say that you should expect
to get a NOTICE every single time you use the name.  People have
complained of postmaster log bloat for lots less reason than this.

In any case I don't see an argument why warning on connection creation
isn't sufficient.

regards, tom lane



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I don't see why an upgrading aid would be worthy of back-patching, but
 not a debugging aid.  I'd certainly prioritize those in the other
 order.

I think the sort of upgrading aid Peter has in mind is the kind where
it's entirely useless if it's not back-patched, because it has to run in
the pre-upgraded server.  We've discussed such things before in the
context of in-place upgrade, though I believe there have been no actual
instances as yet.

I'm not really sure why we're even considering the notion of
back-patching this item.  Doing so would not fit with any past practice
or agreed-on project management practices, not even under our lax
standards for contrib (and I keep hearing people claim that contrib
is or should be as trustworthy as core, anyway).  Since when do we
back-patch significant features that have not been through a beta test
cycle?

regards, tom lane



Re: [HACKERS] dblink versus long connection strings

2010-11-22 Thread Andrew Dunstan



On 11/22/2010 12:21 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 11/22/2010 12:08 PM, Tom Lane wrote:

No, because = isn't disallowed in names ...

Ok, true, but it still might not be a bad heuristic to use for issuing a
warning on lookup.

The fine manual says that using = in a connection name might be unwise
because of the risk of confusion.  It doesn't say that you should expect
to get a NOTICE every single time you use the name.  People have
complained of postmaster log bloat for lots less reason than this.

In any case I don't see an argument why warning on connection creation
isn't sufficient.


OK.

cheers

andrew



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 12:30 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I don't see why an upgrading aid would be worthy of back-patching, but
 not a debugging aid.  I'd certainly prioritize those in the other
 order.

 I think the sort of upgrading aid Peter has in mind is the kind where
 it's entirely useless if it's not back-patched, because it has to run in
 the pre-upgraded server.  We've discussed such things before in the
 context of in-place upgrade, though I believe there have been no actual
 instances as yet.

 I'm not really sure why we're even considering the notion of
 back-patching this item.  Doing so would not fit with any past practice
 or agreed-on project management practices, not even under our lax
 standards for contrib (and I keep hearing people claim that contrib
 is or should be as trustworthy as core, anyway).  Since when do we
 back-patch significant features that have not been through a beta test
 cycle?

I am as conservative about back-patching as anybody here, but
debugging on Windows is not an easy thing to do, and I strongly
suspect we are going to point people experiencing crashes on Windows
to this code whether it's part of our official distribution or not.  I
don't see what we get out of insisting that people install it
separately.  This is a tool that is only intended to be used when
PostgreSQL is CRASHING, so arguing that we shouldn't include the code
because it might not be stable doesn't carry much water AFAICS.  As
far as I understand it, we don't back-patch new features because of
the risk of breaking things, but in this case refusing to back-patch
the code seems more likely to prevent adequate diagnosis of what is
already broken.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 * On Unix, the core file is dropped in the database directory, we
 don't have a separate directory for crashdumps. If we want to be
 consistent, we should do that here too. I do think that storing them
 in a directory like crashdumps is better, but I just wanted to raise
 the comment.

Just a note on that - it's by no means universal that Unix systems will
put the core files in $PGDATA.  OS X likes to put them in /cores, which
I think is a convention shared with some other BSDish systems.  On Linux
I believe it's possible to configure where the core goes via environment
settings.

 * However, when storing it in crashdumps, I think the code would need
 to create that directory if it does not exist, doesn't it?

If it didn't do so, then manual creation/removal of the directory could
be used as an on/off switch for the feature.  Which would have a number
of advantages, not least that you don't need to have the crash dumper
dependent on GUC working.  I haven't looked at the patch but this
discussion makes it sound like the dumper is dependent on an
uncomfortably large amount of backend code being functional.  You need
to minimize the number of assumptions of that sort.

regards, tom lane



Re: [HACKERS] format() with embedded to_char() formatter

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Nov 22, 2010 at 9:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 And lastly, AFAICS there
 is no way to do what you suggest without some really ugly kluges
 in the parser --- I think the function parsing code would have to
 have special cases to make format() work like this.

 Huh?

 How exactly are you going to get from

        format('string here', timestamp_expr)

 to

        format('string here', to_char(timestamp_expr))

 especially seeing that to_char is not one function but an overloaded
 family of functions (doubtless soon to become even more overloaded,
 if this proposal is adopted)?

 Or is the intention to replicate the parser's
 overloaded-function-resolution behavior at runtime?  That seems awkward,
 duplicative, slow, and probably prone to security issues (think
 search_path).

Ick.

 Or perhaps Itagaki-san intended to hard-wire a fixed set of to_char
 functions that format() knows about.  That seems to lose whatever minor
 charms the proposal possessed, because it wouldn't be extensible without
 changing format()'s C code.

Extensibility would be (really) nice to have, but the feature may have
some merit even without that.  I certainly spend a lot more time
formatting built-in types than custom ones.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I am as conservative about back-patching as anybody here, but
 debugging on Windows is not an easy thing to do, and I strongly
 suspect we are going to point people experiencing crashes on Windows
 to this code whether it's part of our official distribution or not.  I
 don't see what we get out of insisting that people install it
 separately.  This is a tool that is only intended to be used when
 PostgreSQL is CRASHING, so arguing that we shouldn't include the code
 because it might not be stable doesn't carry much water AFAICS.  As
 far as I understand it, we don't back-patch new features because of
 the risk of breaking things, but in this case refusing to back-patch
 the code seems more likely to prevent adequate diagnosis of what is
 already broken.

Well, if we're going to hand out prebuilt DLLs to people, we can do that
without having back-patched the code officially.  But more to the point
is that it's not clear that we're going to end up with a contrib module
at all going forward (a core feature would clearly be a lot more
reliable), and I really do not wish to get involved with maintaining two
independent versions of this code.

This argument seems to boil down to "we have to have this yesterday",
which I don't buy for a minute.  If it's as critical as that, why did
it take this long for someone to write it?  I do NOT agree that this
feature is important enough to justify a free pass around our normal
management and quality assurance processes.

regards, tom lane



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Heikki Linnakangas

On 22.11.2010 19:47, Robert Haas wrote:

I am as conservative about back-patching as anybody here, but
debugging on Windows is not an easy thing to do, and I strongly
suspect we are going to point people experiencing crashes on Windows
to this code whether it's part of our official distribution or not.


This whole thing makes me wonder: is there truly no reliable, simple 
method to tell Windows to create a core dump on crash, without writing 
custom code for it? I haven't seen one, but I find it amazing if there 
isn't. We can't be alone with this.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] format() with embedded to_char() formatter

2010-11-22 Thread Pavel Stehule
2010/11/22 Robert Haas robertmh...@gmail.com:
 On Mon, Nov 22, 2010 at 12:17 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Mon, Nov 22, 2010 at 9:55 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 And lastly, AFAICS there
 is no way to do what you suggest without some really ugly kluges
 in the parser --- I think the function parsing code would have to
 have special cases to make format() work like this.

 Huh?

 How exactly are you going to get from

        format('string here', timestamp_expr)

 to

        format('string here', to_char(timestamp_expr))

 especially seeing that to_char is not one function but an overloaded
 family of functions (doubtless soon to become even more overloaded,
 if this proposal is adopted)?

Code for this is available now - if there is a weak point, it is the
search_path. But that is a general problem. It probably wouldn't be a
significant problem to limit the search_path to just pg_catalog.


 Or is the intention to replicate the parser's
 overloaded-function-resolution behavior at runtime?  That seems awkward,
 duplicative, slow, and probably prone to security issues (think
 search_path).

 Ick.

 Or perhaps Itagaki-san intended to hard-wire a fixed set of to_char
 functions that format() knows about.  That seems to lose whatever minor
 charms the proposal possessed, because it wouldn't be extensible without
 changing format()'s C code.

 Extensibility would be (really) nice to have, but the feature may have
 some merit even without that.  I certainly spend a lot more time
 formatting built-in types than custom ones.


The implementation should not be complex or ugly, but it could
reintroduce the dependency problem.

Regards

Pavel Stehule

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company





Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Magnus Hagander
On Mon, Nov 22, 2010 at 18:46, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 * However, when storing it in crashdumps, I think the code would need
 to create that directory if it does not exist, doesn't it?

 If it didn't do so, then manual creation/removal of the directory could
 be used as an on/off switch for the feature.  Which would have a number
 of advantages, not least that you don't need to have the crash dumper
 dependent on GUC working.  I haven't looked at the patch but this
 discussion makes it sound like the dumper is dependent on an
 uncomfortably large amount of backend code being functional.  You need
 to minimize the number of assumptions of that sort.

No, it's dependent on close to zero backend functionality.
Particularly if we take out the dependency on elog() (write_stderr is
much simpler). In fact, the calls to elog() are the *only* places
where it calls into the backend as it stands today.

And yes, ISTM it could work reasonably well to use the
creation/deletion of the directory as an on/off switch for it. Which
is the default is of course up to the packager then as well ;)
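As a sketch of the directory-as-switch idea (hypothetical paths and names, not the actual patch), the on/off decision reduces to an existence check done before installing the crash handler, with no GUC machinery involved:

```python
import os

def crash_dumps_enabled(data_dir, subdir="crashdumps"):
    """The on/off switch: dumps are written only if the admin has
    created <data_dir>/crashdumps; no configuration file is consulted."""
    return os.path.isdir(os.path.join(data_dir, subdir))
```

Creating or removing the directory then enables or disables the feature without restarting or reloading anything.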

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Magnus Hagander
On Mon, Nov 22, 2010 at 18:54, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 I am as conservative about back-patching as anybody here, but
 debugging on Windows is not an easy thing to do, and I strongly
 suspect we are going to point people experiencing crashes on Windows
 to this code whether it's part of our official distribution or not.  I
 don't see what we get out of insisting that people install it
 separately.  This is a tool that is only intended to be used when
 PostgreSQL is CRASHING, so arguing that we shouldn't include the code
 because it might not be stable doesn't carry much water AFAICS.  As
 far as I understand it, we don't back-patch new features because of
 the risk of breaking things, but in this case refusing to back-patch
 the code seems more likely to prevent adequate diagnosis of what is
 already broken.

 Well, if we're going to hand out prebuilt DLLs to people, we can do that
 without having back-patched the code officially.  But more to the point
 is that it's not clear that we're going to end up with a contrib module
 at all going forward (a core feature would clearly be a lot more
 reliable), and I really do not wish to get involved with maintaining two
 independent versions of this code.

I think the reasonable options are either "don't backpatch at all" or
"backpatch the same way as we put it in HEAD", which is probably
included in backend. I agree that sticking it in contrib is a
half-measure that we shouldn't do.

*IF* we go with a contrib module for 9.1 as well, we could backpatch
as contrib module, but I think that's the only case.

 This argument seems to boil down to "we have to have this yesterday",
 which I don't buy for a minute.  If it's as critical as that, why did
 it take this long for someone to write it?  I do NOT agree that this
 feature is important enough to justify a free pass around our normal
 management and quality assurance processes.

Agreed.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Magnus Hagander
On Mon, Nov 22, 2010 at 18:56, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 22.11.2010 19:47, Robert Haas wrote:

 I am as conservative about back-patching as anybody here, but
 debugging on Windows is not an easy thing to do, and I strongly
 suspect we are going to point people experiencing crashes on Windows
 to this code whether it's part of our official distribution or not.

 This whole thing makes me wonder: is there truly no reliable, simple method
 to tell Windows to create a core dump on crash, without writing custom code
 for it? I haven't seen one, but I find it amazing if there isn't. We can't
 be alone with this.

You can do it without custom code but it's quite hard. And AFAIK you
need to install the debugger tools package from microsoft (separate
download, and  not something you'll normally find on a system).

There is DrWatson and the Error Reporting service you can use.
DrWatson is made for interactive programs (or was the last time I
looked at it). The error reporting service requires setting up a local
error reporting service and configure it for reports, if you want them
sent anywhere other than to Microsoft.  Neither one of them is a
particularly good choice. The minidumps Craig's patch creates are a lot
more useful.

We intentionally *disable* drwatson, because it's part of what pops up
an error message you have to click Ok on before your server will
continue running...

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Mon, Nov 22, 2010 at 18:46, Tom Lane t...@sss.pgh.pa.us wrote:
 ... I haven't looked at the patch but this
 discussion makes it sound like the dumper is dependent on an
 uncomfortably large amount of backend code being functional.

 No, it's dependent on close to zero backend functionality.
 Particularly if we take out the dependency on elog() (write_stderr is
 much simpler). In fact, the calls to elog() are the *only* places
 where it calls into the backend as it stands today.

Well, in the contrib-module guise, it's dependent on
shared_preload_libraries or local_preload_libraries, which at least
involves guc and dfmgr working pretty well (and not only in the
postmaster, but during backend startup).

regards, tom lane



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Magnus Hagander
On Mon, Nov 22, 2010 at 19:39, Tom Lane t...@sss.pgh.pa.us wrote:
 Magnus Hagander mag...@hagander.net writes:
 On Mon, Nov 22, 2010 at 18:46, Tom Lane t...@sss.pgh.pa.us wrote:
 ... I haven't looked at the patch but this
 discussion makes it sound like the dumper is dependent on an
 uncomfortably large amount of backend code being functional.

 No, it's dependent on close to zero backend functionality.
 Particularly if we take out the dependency on elog() (write_stderr is
 much simpler). In fact, the calls to elog() are the *only* places
 where it calls into the backend as it stands today.

 Well, in the contrib-module guise, it's dependent on
 shared_preload_libraries or local_preload_libraries, which at least
 involves guc and dfmgr working pretty well (and not only in the
 postmaster, but during backend startup).

Yes, sorry. My mindset was in the version that'll go into 9.1.


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Peter Eisentraut
On mån, 2010-11-22 at 19:56 +0200, Heikki Linnakangas wrote:
 This whole thing makes me wonder: is there truly no reliable, simple 
 method to tell Windows to create a core dump on crash, without writing
 custom code for it. I haven't seen one, but I find it amazing if there
 isn't. We can't be alone with this.

Well, there is no reliable and simple method to rename a file on
Windows, so what can you expect ... ;-)





Re: [HACKERS] visibility map

2010-11-22 Thread Robert Haas
On Mon, Jun 14, 2010 at 1:19 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 I *think* that the answer to this parenthesized question is no.
 When we vacuum a page, we set the LSN on both the heap page and the
 visibility map page.  Therefore, neither of them can get written to
 disk until the WAL record is flushed, but they could get flushed in
 either order.  So the visibility map page could get flushed before the
 heap page, as the non-parenthesized portion of the comment indicates.

 Right.

 However, at least in theory, it seems like we could fix this up during
 redo.

 Setting a bit in the visibility map is currently not WAL-logged, but yes
 once we add WAL-logging, that's straightforward to fix.

Eh, so.  Suppose - for the sake of argument - we do the following:

1. Allocate an additional infomask(2) bit that means "xmin is frozen,
no need to call XidInMVCCSnapshot()".  When we freeze a tuple, we set
this bit in lieu of overwriting xmin.  Note that freezing pages is
already WAL-logged, so redo is possible.

2. Modify VACUUM so that, when the page is observed to be all-visible,
it will freeze all tuples on the page, set PD_ALL_VISIBLE, and set the
visibility map bit, writing a single XLOG record for the whole
operation (possibly piggybacking on XLOG_HEAP2_CLEAN if the same
vacuum already removed tuples; otherwise and/or when no tuples were
removed writing XLOG_HEAP2_FREEZE or some new record type).  This
loses no forensic information because of (1).  (If the page is NOT
observed to be all-visible, we freeze individual tuples only when they
hit the current age thresholds.)

Setting the visibility map bit is now crash-safe.

Please poke holes.
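A toy model of the two steps (flag names invented, no WAL or page-format details): freezing sets a hint bit instead of overwriting xmin, so the original xmin survives for forensics, and the all-visible/page/map bits are flipped together.

```python
HEAP_XMIN_FROZEN = 0x1      # hypothetical new infomask bit (step 1)
PD_ALL_VISIBLE   = 0x2      # page-level flag

def vacuum_page(page):
    """Step 2 of the proposal: if every tuple on the page is visible
    to everyone, freeze them all and set the page flag and visibility
    map bit in one operation (one WAL record in the real thing)."""
    if all(t["visible_to_all"] for t in page["tuples"]):
        for t in page["tuples"]:
            t["infomask"] |= HEAP_XMIN_FROZEN   # xmin itself untouched
        page["flags"] |= PD_ALL_VISIBLE
        page["vm_bit"] = True
    return page
```

The point of the model is only the ordering claim: because freezing and the map bit are covered by a single WAL record, redo can reconstruct either one if a crash intervenes.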

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Instrument checkpoint sync calls

2010-11-22 Thread Jeff Janes
On Mon, Nov 15, 2010 at 12:09 PM, Greg Smith g...@2ndquadrant.com wrote:
 So my task list is:

 0) Rebase against the HEAD that just code related to this touched today

 1) Assume that log_checkpoints is sufficient control over whether the timing
 overhead added is worth collecting, and therefore remove the half-baked idea
 of also wrapping with a compile-time option.

 2) Have the sync summary returned upwards, so it can be put onto the same
 line as the rest of the rest of the log_checkpoint info.

 All seems reasonable to me.  Will rev a new patch by tomorrow.

For the individual file sync times emitted under debug1, it would be
very handy if the file being synced was identified, for example
"relation base/16384/16523", rather than being numbered sequentially
within a given checkpoint.

Cheers,

Jeff



Re: [HACKERS] Re: Proposed Windows-specific change: Enable crash dumps (like core files)

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 1:28 PM, Magnus Hagander mag...@hagander.net wrote:
 I think the reasonable options are either don't backpatch at all or
 backpatch the same way as we put it in HEAD, which is probably
 included in backend. I agree that sticking it in contrib is a
 half-measure that we shouldn't do.

 *IF* we go with a contrib module for 9.1 as well, we could backpatch
 as contrib module, but I think that's the only case.

I agree with this, certainly.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Per-column collation

2010-11-22 Thread Peter Eisentraut
On tor, 2010-11-18 at 21:37 +0200, Heikki Linnakangas wrote:
 Have you done any performance testing? Functions like text_cmp can be
 a hotspot in sorting, so any extra overhead there might be show up in
 tests.

Without having optimized it very much yet, the performance for a 1GB
ORDER BY is

* without COLLATE clause: about the same as without the patch

* with COLLATE clause: about 30%-50% slower

I can imagine that there is some optimization potential in the latter
case.  But in any case, it's not awfully slow.





Re: [HACKERS] Per-column collation

2010-11-22 Thread Peter Eisentraut
On mån, 2010-11-22 at 11:58 +0900, Itagaki Takahiro wrote:
 * COLLATE information must be explicitly passed by the caller in the patch,
 but we might forget the handover when we write new code. Is it possible
 to pass it automatically, say using a global variable? If we could do so,
 existing extensions might work with collation without being rewritten.

I don't see how that is supposed to work.  I understand the concern, but
the system is fairly robust against this becoming a problem.

 * Did you check the regression test on Windows? We probably cannot use
 en_US.utf8 on Windows. Also, some output of the test includes non-ASCII
 characters. How will we test COLLATE feature on non-UTF8 databases?

I attempted to discuss this here:

http://archives.postgresql.org/pgsql-hackers/2010-11/msg00464.php

For a lack of a solution, the approach now is to run the regression test
file manually, and we provide separate test files for as many platforms
and encodings or whatever we want to support.





Re: [HACKERS] Patch to add a primary key using an existing index

2010-11-22 Thread Steve Singer

On 10-11-22 09:37 AM, Gurjeet Singh wrote:
On Sat, Nov 20, 2010 at 9:00 AM, Steve Singer ssinger...@sympatico.ca 
wrote:



Submission Review:


Tests

The expected output for the regression tests you added don't match
what I'm getting when I run the tests with your patch applied.
I think you just need to regenerate the expected results they seem
to be from a previous version of the patch (different error
messages etc..).


Fixed. Also modified one test to cover the case where constraint name 
is provided.


Almost fixed.
I still get an unexpected difference.

! DETAIL:  cannot create PRIMARY KEY/UNIQUE constraint with a non-unique 
index.

  CREATE UNIQUE INDEX rpi_idx2 ON rpi_test(a , b);
  -- should fail; WITH INDEX option specified more than once.
  ALTER TABLE rpi_test ADD PRIMARY KEY (a, b)
--- 35,41 
  -- should fail; non-unique
  ALTER TABLE rpi_test ADD primary key(a, b) WITH (INDEX = 'rpi_idx1');
  ERROR:  rpi_idx1 is not a unique index
! DETAIL:  Cannot create PRIMARY KEY/UNIQUE constraint using a 
non-unique index.







Documentation
---

I was able to generate the docs.

The ALTER TABLE page under the synopsis has

ADD table_constraint

where table_constraint is defined on the CREATE TABLE page.
On the CREATE TABLE page table_constraint isn't defined as having
the WITH
, the WITH is part of index_parameters.

I propose the alter table page instead have

ADD table_constraint [index_parameters]

where index_parameters also references the CREATE TABLE page like
table_constraint.


IMHO index_parameters is an optional component of table_constraint, 
and hence can't be mentioned here, at least not the way shown above.




My reading of CREATE TABLE is that index_parameters is an optional 
parameter that comes after table_constraint and isn't part of 
table_constraint.   Any other opinions?


Everything else I mentioned seems fixed in this version



gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device







Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread Pavel Stehule
Hello

I spent the last two days searching for a better way to solve this
problem. I have probably removed the issue with toasting. But I found
another issue that wasn't discussed before: items are only accessed
sequentially, via the array_seek function. I thought about a variable
that stores the last accessed field and the last used subscript - but
there isn't a good place where this info can be stored (maybe in
ArrayType). Another issue is arrays with a null bitmap. It is
difficult to say whether such a cache would have a positive effect -
maybe yes, for large arrays. The problem is a possible increase in the
cost of random access to the array. And there is no hint that would
help characterize the access pattern to the array.

I don't think searching inside the expression plan is a good idea. It
is a way to get more code and more complexity. If we do it, then more
important would be accelerating expressions like var := var + 1,
var := var || var, and so on.

So, please - I know you and Tom are busy - but try to spend a little
time on this problem before you definitely reject this idea.

With my last patch (removing the toasting issue) the classic access to
arrays should be usable. But direct access to the array from the PL
executor is still 20% faster, depending on the array type. Maybe this
is a useless discussion - everything could be solved with possible
support for SRFs inside simple expressions, but I don't know when that
will be possible.

Regards

Pavel Stehule


 *
 * CAUTION: if you change the header for ordinary arrays you will also
 * need to change the headers for oidvector and int2vector!
 */




 I think it should be marked rejected.  I don't think Tom is likely to
 ever be in favor of a syntax change here, and while I hesitate to deal
 in absolutes, I don't think I will be either, and certainly not
 without a lot more work on improving the performance of the existing
 constructs.   In particular, this seems like something that really
 ought to be pursued:

 http://archives.postgresql.org/pgsql-hackers/2010-11/msg01177.php

 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




Re: [HACKERS] directory archive format for pg_dump

2010-11-22 Thread Heikki Linnakangas

On 22.11.2010 19:07, Tom Lane wrote:

Heikki Linnakangasheikki.linnakan...@enterprisedb.com  writes:

But I'm not actually sure we should be preventing mix & match of files
from different dumps. It might be very useful to do just that sometimes,
like restoring a recent backup, with the contents of one table replaced
with older data. A warning would be ok, though.


+1.  This mechanism seems like a solution in search of a problem.
Just lose the whole thing, and instead fix pg_dump to complain if
the target directory isn't empty.  That should be sufficient to guard
against accidental mixing of different dumps, and as Heikki says
there's not a good reason to prevent intentional mixing.


Extending that thought a bit, it would be nice if the per-file header 
would carry the info if the file is compressed or not, instead of just 
one such flag in the TOC. You could then also mix & match files from 
compressed and non-compressed archives.


Other than the md5 thing, the patch looks fine to me. There are quite 
many levels of indirection; it took me a while to get my head around 
call chains like DataDumper->_WriteData->WriteDataToArchive->_WriteBuf, 
but I don't have any ideas on how to improve that.


However, docs are missing, so I'm marking this as "Waiting on Author".

There's some cosmetic changes I'd like to have fixed or do myself before 
committing:


* wrap long lines
* use extern in function prototypes in header files
* inline some functions like _StartDataCompressor, _EndDataCompressor, 
_DoInflate/_DoDeflate that aren't doing anything but calling some other 
function.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-22 Thread Joachim Wieland
On Sun, Nov 21, 2010 at 11:48 PM, Fujii Masao masao.fu...@gmail.com wrote:
 --
 If you suspect a bug in Hot Standby, please set
        trace_recovery_messages = DEBUG2
 in postgresql.conf and repeat the action

 Always useful to know
 * max_connections
 * current number of sessions
 * whether we have two phase commits happening
 --

The trace_recovery_messages parameter does not give more output...

max_connections is set to 100

there have been no sessions on the standby itself, but a few on the
primary database, I don't know how much but probably not more than 10.
The sessions there were doing quite some load however, among them
slony synchronization, since the hot standby master database was
actually a slony replica.

max_prepared_transactions has not been changed from its default value of 0.


Joachim



Re: [HACKERS] Extensions, this time with a patch

2010-11-22 Thread Dimitri Fontaine
Itagaki Takahiro itagaki.takah...@gmail.com writes:
 No. My suggestion was just to use the internal parser.

What about something like the attached, cfparser.v3.patch?

If that looks ok, do we want to add some documentation about the new
lexer capabilities? Also, for what good reason would we want to prevent
people from using the include facility?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 5024,5200  str_time(pg_time_t tnow)
  }
  
  /*
-  * Parse one line from recovery.conf. 'cmdline' is the raw line from the
-  * file. If the line is parsed successfully, returns true, false indicates
-  * syntax error. On success, *key_p and *value_p are set to the parameter
-  * name and value on the line, respectively. If the line is an empty line,
-  * consisting entirely of whitespace and comments, function returns true
-  * and *keyp_p and *value_p are set to NULL.
-  *
-  * The pointers returned in *key_p and *value_p point to an internal buffer
-  * that is valid only until the next call of parseRecoveryCommandFile().
-  */
- static bool
- parseRecoveryCommandFileLine(char *cmdline, char **key_p, char **value_p)
- {
- 	char	   *ptr;
- 	char	   *bufp;
- 	char	   *key;
- 	char	   *value;
- 	static char *buf = NULL;
- 
- 	*key_p = *value_p = NULL;
- 
- 	/*
- 	 * Allocate the buffer on first use. It's used to hold both the parameter
- 	 * name and value.
- 	 */
- 	if (buf == NULL)
- 		buf = malloc(MAXPGPATH + 1);
- 	bufp = buf;
- 
- 	/* Skip any whitespace at the beginning of line */
- 	for (ptr = cmdline; *ptr; ptr++)
- 	{
- 		if (!isspace((unsigned char) *ptr))
- 			break;
- 	}
- 	/* Ignore empty lines */
- 	if (*ptr == '\0' || *ptr == '#')
- 		return true;
- 
- 	/* Read the parameter name */
- 	key = bufp;
- 	while (*ptr && !isspace((unsigned char) *ptr) &&
- 		   *ptr != '=' && *ptr != '\'')
- 		*(bufp++) = *(ptr++);
- 	*(bufp++) = '\0';
- 
- 	/* Skip to the beginning quote of the parameter value */
- 	ptr = strchr(ptr, '\'');
- 	if (!ptr)
- 		return false;
- 	ptr++;
- 
- 	/* Read the parameter value to *bufp. Collapse any '' escapes as we go. */
- 	value = bufp;
- 	for (;;)
- 	{
- 		if (*ptr == '\'')
- 		{
- 			ptr++;
- 			if (*ptr == '\'')
- *(bufp++) = '\'';
- 			else
- 			{
- /* end of parameter */
- *bufp = '\0';
- break;
- 			}
- 		}
- 		else if (*ptr == '\0')
- 			return false;		/* unterminated quoted string */
- 		else
- 			*(bufp++) = *ptr;
- 
- 		ptr++;
- 	}
- 	*(bufp++) = '\0';
- 
- 	/* Check that there's no garbage after the value */
- 	while (*ptr)
- 	{
- 		if (*ptr == '#')
- 			break;
- 		if (!isspace((unsigned char) *ptr))
- 			return false;
- 		ptr++;
- 	}
- 
- 	/* Success! */
- 	*key_p = key;
- 	*value_p = value;
- 	return true;
- }
- 
- /*
   * See if there is a recovery command file (recovery.conf), and if so
   * read in parameters for archive recovery and XLOG streaming.
   *
!  * XXX longer term intention is to expand this to
!  * cater for additional parameters and controls
!  * possibly use a flex lexer similar to the GUC one
   */
  static void
  readRecoveryCommandFile(void)
  {
  	FILE	   *fd;
- 	char		cmdline[MAXPGPATH];
  	TimeLineID	rtli = 0;
  	bool		rtliGiven = false;
! 	bool		syntaxError = false;
  
  	fd = AllocateFile(RECOVERY_COMMAND_FILE, "r");
! 	if (fd == NULL)
! 	{
! 		if (errno == ENOENT)
! 			return;/* not there, so no archive recovery */
! 		ereport(FATAL,
! (errcode_for_file_access(),
!  errmsg("could not open recovery command file \"%s\": %m",
! 		RECOVERY_COMMAND_FILE)));
! 	}
  
  	/*
! 	 * Parse the file...
  	 */
! 	while (fgets(cmdline, sizeof(cmdline), fd) != NULL)
! 	{
! 		char	   *tok1;
! 		char	   *tok2;
! 
! 		if (!parseRecoveryCommandFileLine(cmdline, &tok1, &tok2))
! 		{
! 			syntaxError = true;
! 			break;
! 		}
! 		if (tok1 == NULL)
! 			continue;
  
! 		if (strcmp(tok1, "restore_command") == 0)
  		{
! 			recoveryRestoreCommand = pstrdup(tok2);
  			ereport(DEBUG2,
  	(errmsg("restore_command = '%s'",
  			recoveryRestoreCommand)));
  		}
! 		else if (strcmp(tok1, "recovery_end_command") == 0)
  		{
! 			recoveryEndCommand = pstrdup(tok2);
  			ereport(DEBUG2,
  	(errmsg("recovery_end_command = '%s'",
  			recoveryEndCommand)));
  		}
! 		else if (strcmp(tok1, "archive_cleanup_command") == 0)
  		{
! 			archiveCleanupCommand = pstrdup(tok2);
  			ereport(DEBUG2,
  	(errmsg("archive_cleanup_command = '%s'",
  			archiveCleanupCommand)));
  		}
! 		else if (strcmp(tok1, "recovery_target_timeline") == 0)
  		{
  			rtliGiven = true;
! 			if (strcmp(tok2, "latest") == 0)
  rtli = 0;
  			else
  			{
  errno = 0;
! rtli = (TimeLineID) strtoul(tok2, NULL, 0);
  if (errno == EINVAL || errno == ERANGE)
  	ereport(FATAL,
  			(errmsg("recovery_target_timeline is not a valid number: \"%s\"",
! 	tok2)));
  			}
  			if (rtli)
  ereport(DEBUG2,
--- 
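The quoting rules of the parser being removed above (single-quoted values, '' as an escaped quote, # comments) can be sketched outside the backend. This is a rough Python translation for reference, not the flex-based replacement under discussion:

```python
def parse_recovery_line(line):
    """Parse one recovery.conf line into (key, value).
    Returns (None, None) for blank/comment lines; raises on syntax errors.
    Values are single-quoted, with '' collapsing to a literal quote."""
    s = line.lstrip()
    if not s or s.startswith('#'):
        return None, None
    key, i = '', 0
    while i < len(s) and not s[i].isspace() and s[i] not in "='":
        key += s[i]; i += 1
    q = s.find("'", i)                 # skip to the opening quote
    if q < 0:
        raise ValueError("missing opening quote")
    value, i = '', q + 1
    while True:
        if i >= len(s):
            raise ValueError("unterminated quoted string")
        if s[i] == "'":
            if i + 1 < len(s) and s[i + 1] == "'":
                value += "'"; i += 2   # collapse '' escape
            else:
                i += 1; break          # end of value
        else:
            value += s[i]; i += 1
    tail = s[i:].split('#', 1)[0]      # only comments may follow
    if tail.strip():
        raise ValueError("garbage after value")
    return key, value
```

A flex-based lexer shared with the GUC machinery would replace this hand-rolled state walk, which is the direction the patch above takes.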

[HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-22 Thread Peter Tanski

I have been working on a plugin for GiST that has some unusual features:

* The data type for both Node and Leaf keys is large (typically 4222  
bytes on 32-bit; 4230 bytes on 64-bit).


* Due to the large size the storage class is EXTENDED (main would only  
degrade to EXTENDED in any case).  Tests using EXTERNAL show the same  
results so I do not believe compression is an issue.


* This is a hash-type index: the values are large hashes for an audio  
fingerprinting application and the sole relationship between keys is a  
float match probability between two values.  For example, if A matches  
B with a score of 0.18 but A matches C with a score of 0.61, A is  
closer to C than to B.  The distance metric is calculated in reverse  
(1.0 - match_score) so the penalty for inserting A under the same Node  
as B would be 0.82 (scaled by 1000 to 820).


* The Leaf Keys contain the same data as the rows themselves.

* The Node (union) Keys are:

  - a bitwise OR of the lower-level Leaf Keys

  - a penalty or consistency comparison of Leaf or Query (L) value  
against a Node union-key (N) is effectively a scaled hamming distance  
of:

  L ^ (L & N)
so if L is under N the check will always return 1.0 (true for  
consistency; 0.0 for penalty);
by the same operation, newly inserted values compare closer to  
Node (union) keys where the corresponding Leaf keys are closer


  - Comparisons between two Nodes, N1 and N2 (used in Same(), for  
example)  have used a:

-- Tanimoto bit distance popcount(N1 & N2) / popcount(N1 | N2);
-- the same scaled-hamming distance check used for a Leaf against  
another Leaf;

-- simple memcmp for identity.
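
The key operations above can be sketched in Python, using small ints as
stand-in fingerprint bitmasks. This is only an illustration of the described
scheme, not the plugin's C code; the function names and the 16-bit width are
assumptions (the real keys are ~4222 bytes).

```python
# Sketch of the GiST key comparisons described above, using Python ints
# as stand-in fingerprint bitmasks.  Names and the tiny 16-bit width are
# illustrative assumptions; the real keys are ~4222 bytes.

BITS = 16

def union_key(leaves):
    """Node (union) key: bitwise OR of the lower-level Leaf keys."""
    n = 0
    for leaf in leaves:
        n |= leaf
    return n

def scaled_hamming(l, n):
    """Consistency score of a Leaf/Query value L against a Node key N,
    based on L ^ (L & N), i.e. the bits of L not covered by N.
    Returns 1.0 when L is entirely under N."""
    uncovered = l ^ (l & n)
    return 1.0 - bin(uncovered).count("1") / BITS

def tanimoto(n1, n2):
    """Tanimoto bit distance between two Node keys:
    popcount(N1 & N2) / popcount(N1 | N2)."""
    union = n1 | n2
    if union == 0:
        return 1.0
    return bin(n1 & n2).count("1") / bin(union).count("1")

leaves = [0b0000_0000_1111_0000, 0b0000_1111_0000_0000]
node = union_key(leaves)
assert node == 0b0000_1111_1111_0000
# A leaf fully covered by the union key scores a perfect 1.0 ...
assert scaled_hamming(leaves[0], node) == 1.0
# ... while a query with bits outside the union scores lower.
assert scaled_hamming(0b1111_0000_0000_0000, node) < 1.0
print(round(tanimoto(leaves[0], leaves[1]), 3))  # → 0.0 (disjoint leaves)
```

Note how the union-key construction makes newly inserted values compare
closer to Node keys whose underlying Leaf keys are closer, as described.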

Whatever test I use for Same(),  Penalty() and Consistent() does not  
seem to affect the problem significantly.  For now I am only using  
Consistent() as a check for retrieval.


I have developed this under debug source-builds of postgresql 8.4.5  
and 9.0.1 on Mac OS X 10.6 (Snow Leopard, x86 64-bit) and Ubuntu Linux  
10.04 (Lucid, x86 64-bit and 32-bit).  There is no difference between  
the platforms or architectures.  I am using the standard PostgreSQL  
Makefile build setup so compiler flags are the same as used by  
PostgreSQL.


The problem may be my own understanding of Picksplit(), Penalty() and  
Same().  Using gevel, on a table with 500 newly-inserted rows:


postgres=# \d fps2
      Table "public.fps2"
   Column    |     Type      | Modifiers
-------------+---------------+-----------
 soid        | character(24) | not null
 fingerprint | fprint        | not null
Indexes:
    "fps2_fingerprint_ix" gist (fingerprint)

postgres=# select gist_stat('fps2_fingerprint_ix');
gist_stat
-
 Number of levels:  4   +
 Number of pages:   61  +
 Number of leaf pages:  42  +
 Number of tuples:  193 +
 Number of invalid tuples:  0   +
 Number of leaf tuples: 133 +
 Total size of tuples:  271704 bytes+
 Total size of leaf tuples: 187236 bytes+
 Total size of index:   499712 bytes+


Note that there are only 133 leaf tuples -- for 500 rows.  If the  
index process were operating correctly, there should have been 500  
leaf tuples there.  If I REINDEX the table the number of leaf tuples  
may change slightly but not by much.  This closely corresponds to a  
query:


postgres=# select count(*) from fps2 f1 join fps2 f2 on f1.fingerprint  
= f2.fingerprint;

 count
---
   133

where = is a match operator that returns rows where the Leaf key  
comparison is > .98 (on the scaled probability score, pretty much  
exactly equal).  The above query is using the index:


postgres=# explain select count(*) from fps2 f1 join fps2 f2 on  
f1.fingerprint ~= f2.fingerprint;

   QUERY PLAN

 Aggregate  (cost=1173.67..1173.68 rows=1 width=0)
   ->  Nested Loop  (cost=0.00..1170.54 rows=1250 width=0)
         ->  Seq Scan on fps2 f1  (cost=0.00..105.00 rows=500 width=32)
         ->  Index Scan using fps2_fingerprint_ix on fps2 f2
               (cost=0.00..2.11 rows=2 width=32)
               Index Cond: (f1.fingerprint ~= f2.fingerprint)


If I use a table scan instead of the index, the query would return:

postgres=# select count(*) from fps2 f1 join fps2 f2 on  
fprint_cmp(f1.fingerprint, f2.fingerprint) > .98;

 count
---
   500


The GiST tree looks like:

postgres=# select gist_tree('fps2_fingerprint_ix');
  gist_tree
--
 0(l:0) blk: 0 numTuple: 4 free: 2532b(68.97%) rightlink:4294967295  
(InvalidBlockNumber) +
 1(l:1) blk: 37 numTuple: 2 free: 5340b(34.56%) rightlink:105  
(OK)   +
 1(l:2) blk: 90 numTuple: 3 free: 3936b(51.76%) 

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-22 Thread Peter Tanski

One minor correction:

postgres=# explain select count(*) from fps2 f1 join fps2 f2 on  
f1.fingerprint = f2.fingerprint;

   QUERY PLAN

 Aggregate  (cost=1173.67..1173.68 rows=1 width=0)
   ->  Nested Loop  (cost=0.00..1170.54 rows=1250 width=0)
         ->  Seq Scan on fps2 f1  (cost=0.00..105.00 rows=500 width=32)
         ->  Index Scan using fps2_fingerprint_ix on fps2 f2
               (cost=0.00..2.11 rows=2 width=32)
               Index Cond: (f1.fingerprint = f2.fingerprint)

(The previous query example used the ~= operator, which was defined to  
match at > .5, but in this case there are no matches in the table so ~=  
is the same as =.)


On Nov 22, 2010, at 4:18 PM, Peter Tanski wrote:

 [...]

Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 3:36 PM, Pavel Stehule pavel.steh...@gmail.com wrote:
 So, please, I know, so you and Tom are busy, but try to spend a few
 time about this problem before you are definitely reject this idea.

If I were to spend more time on this problem, what exactly would I
spend that time doing and how would it help?  If I were interested in
spending time I'd probably spend it pursuing the suggestions Tom
already made, and that's what I think you should do.  But I'm not
going to do that, because the purpose of the CommitFest is not for me
to write new patches from scratch that do something vaguely similar to
what a patch you wrote was trying to do.  It's for all of us to review
and commit the patches that have already been written.  You aren't
helping with that process, so your complaint that we aren't spending
enough time on your patches would be unfair even if were true, and it
isn't. The problem with your patch is that it has a design weakness,
not that it got short shift.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, this time with a patch

2010-11-22 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of lun nov 22 18:12:39 -0300 2010:
 Itagaki Takahiro itagaki.takah...@gmail.com writes:
  No. My suggestion was just to use the internal parser.
 
 What about something like the attached, cfparser.v3.patch?
 
 If that looks ok, do we want to add some documentation about the new
 lexer capabilities? Also, for what good reason would we want to prevent
 people from using the include facility?

Hmm, the first thought that comes to mind is that the GucContext param
to ParseConfigFile is unused and can be removed.  This is probably an
oversight from when include files were introduced in 2006.

I don't like the fact that this code handles custom_variable_classes
internally.  I think this would be exposed to the parsing of extension
control files, which is obviously wrong.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, this time with a patch

2010-11-22 Thread Alvaro Herrera
Excerpts from Alvaro Herrera's message of lun nov 22 18:59:52 -0300 2010:

 Hmm, the first thought that comes to mind is that the GucContext param
 to ParseConfigFile is unused and can be removed.  This is probably an
 oversight from when include files were introduced in 2006.

Committed and pushed.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Extensions, this time with a patch

2010-11-22 Thread Alvaro Herrera
Excerpts from Dimitri Fontaine's message of lun nov 22 18:12:39 -0300 2010:
 Itagaki Takahiro itagaki.takah...@gmail.com writes:
  No. My suggestion was just to use the internal parser.
 
 What about something like the attached, cfparser.v3.patch?

The handling of relative vs absolute paths is bogus here.  I think it'd
make more sense to have a bool are we including; and if that's false and
the path is not absolute, then the file is relative to CWD; or maybe we
make it absolute by prepending PGDATA; maybe something else?  (need to
think of something that makes sense for both recovery.conf and extension
control files)


 If that looks ok, do we want to add some documentation about the new
 lexer capabilities?

beyond extra code comments?  probably not.

 Also, for what good reason would we want to prevent
 people from using the include facility?

Not sure about this

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] reporting reason for certain locks

2010-11-22 Thread Alvaro Herrera
Hi,

When we lock on a Xid or VirtualXid, there's no way to obtain clear
information on the reason for locking.  Consider the following example:

CREATE TABLE foo (a int);

Session 1:
BEGIN;
SELECT 1;
-- we now have a snapshot

Session 2:
CREATE INDEX CONCURRENTLY foo_a ON foo(a);

This blocks until transaction 1 commits, and it's not obvious to the
user the reason for this.  There's some info in pg_locks but it just
says it's blocked in a VirtualXid.

A much more common occurrence is tuple locks.  We block on an Xid in that
case; and this has been a frequent question in the mailing lists and
IRC.

I think it would be very nice to be able to report something to the
user; however, I'm not seeing the mechanism.

A simple idea I had was that each backend would have a reserved shared
memory area where they would write what they are about to lock, when
locking an Xid or VXid.  Thus, if they block, someone else can examine
that and make the situation clearer.  The problem with this idea is that
it would require locking a LWLock just before trying each lock on
Xid/VXid, which would be horrible for performance.

... or maybe not, because when we call XactLockTableWait, we've already
established that we've accepted to sleep.

Thoughts?

-- 
Álvaro Herrera alvhe...@alvh.no-ip.org

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread David E. Wheeler
Patch attached.

Best,

David


enum_value.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread Andrew Dunstan



On 11/22/2010 06:36 PM, David E. Wheeler wrote:

Patch attached.


Thanks, I'll look at this shortly. I think it needs a little bit more, 
which I'll do. In particular, we should now avoid using the word 'value' 
to refer to the internal representation of an enum - that will just be 
confusing.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reporting reason for certain locks

2010-11-22 Thread Tom Lane
Alvaro Herrera alvhe...@alvh.no-ip.org writes:
 A much more common occurrence is tuple locks.  We block on an Xid in that
 case; and this has been a frequent question in the mailing lists and
 IRC.

 I think it would be very nice to be able to report something to the
 user; however, I'm not seeing the mechanism.

At least for tuple locks, the information is already visible, because we
have a real lock on the target tuple before we try to lock the current
holder's VXID.  So I think this isn't so much a question of needing more
low-level mechanism as one of providing a more useful view --- some kind
of self-join on pg_locks is needed.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread Tom Lane
David E. Wheeler da...@kineticode.com writes:
 Patch attached.

Most of those changes seem like they make it less readable, not more so.
In particular I don't find it an improvement to replace textual label
with textual value.  I think of value as meaning some abstract
notion of a particular enum member, which is not identical to the
concrete text string that represents it.  If you consider them the same
thing then renaming an enum value would be a meaningless concept.

Maybe instead of textual label, we should say name?  But that
doesn't seem like quite le mot juste either.  label is actually a
pretty good word for the text representation of an enum value.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] knngist - 0.8

2010-11-22 Thread Tom Lane
I've done some *very* preliminary review of this patch now.  I think
that the way to move forward is to first commit the work surrounding
changes to pg_amop, including suitable changes to CREATE/ALTER OPERATOR
CLASS/FAMILY so that there's a way to put the new info into the table.
The system won't initially *do* anything useful with ordering-operator
entries, but this is necessary infrastructure before we can start to
do anything interesting.

After reviewing both Teodor's and Robert's patches for the issue,
I believe that Teodor has a better basic approach: if an operator
is useful for both searching and ordering, the way to handle that
is to make two pg_amop entries for it, with different strategy
numbers.  This means in particular that we can continue to require
strategy numbers to be unique within an opclass, so right offhand
I see no need to widen pg_amop_fam_strat_index to five columns.
(Which means we don't need that patch to allow five-key syscaches.)
What we need instead is to add the purpose column to the existing
unique index on (amopopr, amopfamily).  In English that means that
instead of allowing an operator to have only one entry per opfamily,
it can now have one entry per opfamily per purpose.
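
That uniqueness rule can be sketched with a toy model. AMOP_SEARCH and
AMOP_ORDER are the values proposed later in this message; the OIDs, strategy
numbers, and catalog shape here are illustrative assumptions, not the real
pg_amop layout.

```python
# Toy model of the proposed pg_amop uniqueness rule: instead of one
# entry per (operator, opfamily), allow one entry per (operator,
# opfamily, purpose).  AMOP_SEARCH/AMOP_ORDER are from the proposal;
# the OIDs and catalog shape are illustrative only.

AMOP_SEARCH = 's'
AMOP_ORDER = 'o'

class AmopCatalog:
    def __init__(self):
        self.rows = {}  # unique key (amopopr, amopfamily, amoppurpose) -> strategy

    def add(self, amopopr, amopfamily, amoppurpose, strategy):
        key = (amopopr, amopfamily, amoppurpose)
        if key in self.rows:
            raise ValueError("duplicate pg_amop entry %r" % (key,))
        self.rows[key] = strategy

cat = AmopCatalog()
# An operator useful for both searching and ordering gets two entries
# in the same opfamily, with different strategy numbers ...
cat.add(amopopr=1234, amopfamily=99, amoppurpose=AMOP_SEARCH, strategy=1)
cat.add(amopopr=1234, amopfamily=99, amoppurpose=AMOP_ORDER, strategy=15)
# ... but a second entry with the same purpose is still rejected.
try:
    cat.add(1234, 99, AMOP_SEARCH, 2)
except ValueError:
    print("rejected duplicate (operator, opfamily, purpose) entry")
```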

I do however concur with Robert that it's a bit silly to go to all this
work and not leave room in the design for additional operator purposes
later.  So rather than using a boolean amoporder column, I think we want
an enumerated-type column amoppurpose.  Per our usual system catalog
conventions for poor man's enums, I suggest declaring the column as
char with the two allowed values
AMOP_SEARCH 's'
AMOP_ORDER  'o'
Aside from leaving room for possible extension, I think that using
AMOP_SEARCH/AMOP_ORDER rather than true/false in the code will be more
readable and less error-prone.

As far as the syntax of CREATE/ALTER OPERATOR CLASS/FAMILY is concerned,
I like Robert's proposal (OPERATOR ... FOR ORDER or FOR SEARCH) better
than Teodor's, though we don't need the multiple-purposes-per-entry
aspect of it.  This is mainly because it looks more easily extensible
if we ever do get around to having more purposes.

There's a lot more to be done after this, but if there are no objections
to this much, I'll go merge the relevant parts of the two patches and
commit.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread Andrew Dunstan



On 11/22/2010 06:57 PM, Tom Lane wrote:

David E. Wheelerda...@kineticode.com  writes:

Patch attached.

Most of those changes seem like they make it less readable, not more so.
In particular I don't find it an improvement to replace textual label
with textual value.  I think of value as meaning some abstract
notion of a particular enum member, which is not identical to the
concrete text string that represents it.  If you consider them the same
thing then renaming an enum value would be a meaningless concept.

Maybe instead of textual label, we should say name?  But that
doesn't seem like quite le mot juste either.  label is actually a
pretty good word for the text representation of an enum value.


Oh my boots and buttons. I think we're splitting some very fine hairs 
here. A few weeks back you were telling us that label wasn't a very good 
word and shouldn't be sanctified in the SQL.


I don't mind that much leaving it as it is, but we do seem to be 
straining at gnats a bit.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 11/22/2010 06:57 PM, Tom Lane wrote:
 Maybe instead of textual label, we should say name?  But that
 doesn't seem like quite le mot juste either.  label is actually a
 pretty good word for the text representation of an enum value.

 Oh my boots and buttons. I think we're splitting some very fine hairs 
 here. A few weeks back you were telling us that label wasn't a very good 
 word and shouldn't be sanctified in the SQL.

It isn't a very good word for the abstract value, IMO, but the text
representation is a different concept.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 7:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andrew Dunstan and...@dunslane.net writes:
 On 11/22/2010 06:57 PM, Tom Lane wrote:
 Maybe instead of textual label, we should say name?  But that
 doesn't seem like quite le mot juste either.  label is actually a
 pretty good word for the text representation of an enum value.

 Oh my boots and buttons. I think we're splitting some very fine hairs
 here. A few weeks back you were telling us that label wasn't a very good
 word and shouldn't be sanctified in the SQL.

 It isn't a very good word for the abstract value, IMO, but the text
 representation is a different concept.

+1 for what Andrew said.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread David E. Wheeler
On Nov 22, 2010, at 4:46 PM, Tom Lane wrote:

 Oh my boots and buttons. I think we're splitting some very fine hairs 
 here. A few weeks back you were telling us that label wasn't a very good 
 word and shouldn't be sanctified in the SQL.
 
 It isn't a very good word for the abstract value, IMO, but the text
 representation is a different concept.

But that's the thing we've been talking about all along! It's now

  ALTER ENUM ADD VALUE 'FOO';

But that sets the text value, the label, not the abstract value.

Boots and buttons indeed.

David

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] knngist - 0.8

2010-11-22 Thread Tom Lane
I wrote:
 As far as the syntax of CREATE/ALTER OPERATOR CLASS/FAMILY is concerned,
 I like Robert's proposal (OPERATOR ... FOR ORDER or FOR SEARCH) better
 than Teodor's, though we don't need the multiple-purposes-per-entry
 aspect of it.  This is mainly because it looks more easily extensible
 if we ever do get around to having more purposes.

Although having said that ... I was poking around a bit more and noticed
how entirely bogus the current patch's matching of pathkeys is.  It pays
no attention at all to which pk_opfamily is specified, which means it
will fail given a query like
SELECT ... ORDER BY indexable_col <-> constant USING <<<
where <<< represents some sort order that has nothing to do with the
order that GIST is expecting.  The query will be considered to match the
index and then it will proceed to produce results in the wrong order.

We could perhaps kluge around that by insisting that the pathkey opclass
be the default btree opclass for the relevant datatype; but that's
fairly expensive to check for in match_pathkey_to_index, and it's
getting even further away from having a general-purpose solution.

I'm inclined to think that instead of just specifying FOR ORDER,
the opclass definition ought to specify FOR ORDER BY something,
where something could perhaps be a pre-existing btree opclass name.
Or maybe it could be a suitable sort operator, though then we'd have
to do a bit of reverse-engineering in match_pathkey_to_index, so that's
still no fun from a lookup-speed point of view.  In any case what we'd
be specifying is a sort order for the datatype that the opclass member
operator yields, not the indexed datatype (so in practice it'd usually
be float8_ops or float8 <, no doubt).

The reason I bring this up now is that it affects the decision as to
what the unique key for pg_amop ought to be.  Instead of having an
enum purpose column, maybe we should consider that the unique key
is (operator oid, opfamily oid, order-by-oid), where order-by-oid
is zero for a search operator and the OID of the btree opclass or sort
operator for an ordering operator.  This would be of value if we
imagined that a single opclass could support ordering by more than one
distance ordering operator; which seems a bit far-fetched but perhaps
not impossible.  On the other side of the coin it'd mean we aren't
leaving room for other sorts of operator purposes.

On balance I'm inclined to leave the unique key as per previous proposal
(with a purpose column) and add the which-sort-order-is-that
information as payload columns that aren't part of the key.

Thoughts?

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread Josh Berkus
All,

Whatever we pick, someone will be confused by it and about equal numbers
regardless.  Let's just stick with the current patch.

Or we could call it extraint conclusions.  ;-)


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reporting reason for certain locks

2010-11-22 Thread Josh Berkus

 ... or maybe not, because when we call XactLockTableWait, we've already
 established that we've accepted to sleep.
 
 Thoughts?

Other than this being a sincere need, no.


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] s/LABEL/VALUE/ for ENUMs

2010-11-22 Thread Josh Berkus
On 11/22/10 5:38 PM, Josh Berkus wrote:
 All,
 
 Whatever we pick, someone will be confused by it and about equal numbers
 regardless.  Let's just stick with the current patch.

... original patch.  Sorry.  Let's not fiddle with the names.

 
 Or we could call it extraint conclusions.  ;-)
 
 


-- 
  -- Josh Berkus
 PostgreSQL Experts Inc.
 http://www.pgexperts.com

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] reporting reason for certain locks

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 5:55 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote:
 [...]

  Thoughts?

How about publishing additional details to pg_stat_activity via
pgstat_report_waiting()?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] knngist - 0.8

2010-11-22 Thread Robert Haas
On Mon, Nov 22, 2010 at 8:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 The reason I bring this up now is that it affects the decision as to
 what the unique key for pg_amop ought to be.  Instead of having an
 enum purpose column, maybe we should consider that the unique key
 is (operator oid, opfamily oid, order-by-oid), where order-by-oid
 is zero for a search operator and the OID of the btree opclass or sort
 operator for an ordering operator.  This would be of value if we
 imagined that a single opclass could support ordering by more than one
 distance ordering operator; which seems a bit far-fetched but perhaps
 not impossible.  On the other side of the coin it'd mean we aren't
 leaving room for other sorts of operator purposes.

Since the need for additional purposes is mostly hypothetical, this
wouldn't bother me any.

 On balance I'm inclined to leave the unique key as per previous proposal
 (with a purpose column) and add the which-sort-order-is-that
 information as payload columns that aren't part of the key.

This is probably OK too, although I confess I'm a lot less happy about
it now that you've pointed out the need for those payload columns.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER OBJECT any_name SET SCHEMA name

2010-11-22 Thread Robert Haas
On Sun, Nov 21, 2010 at 4:47 PM, Dimitri Fontaine
dimi...@2ndquadrant.fr wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Sat, Nov 20, 2010 at 11:23 PM, Robert Haas robertmh...@gmail.com wrote:
 Ah, nuts.  I see now there's a v7.  Never mind...

 OK.  I looked at the right version, now.  Hopefully.

 Yeah, that was the most recent one and I linked it in the commit fest
 application. Given the very fast feedback I got, there has been a lot of
 activity and patches versions produced, so that's easy to get confused.

Especially because you also posted some revs of the ALTER EXTENSION ..
SET SCHEMA patch on this thread

 It seems we have no regression tests at all for any of the existing
 SET SCHEMA commands.  This seems like a good time to correct that
 oversight, and also add some for the new commands you're adding here.

 Yeah, it's time for me to have a look at regression tests :)

 Please find attached set_schema.v8.patch with tests for the added
 commands in the patch.

 (It might be helpful to submit the regression tests for the existing
 commands as a separate patch.) Also, you're missing psql tab
 completion support, which would be nice to have.

 Do you still want me to prepare another patch adding tests for the SET
 SCHEMA variants that already existed but are not yet covered?  Which are
 the ones you spotted, btw?

[rhaas pgsql]$ git grep 'SET SCHEMA' src/test/regress/
[rhaas pgsql]$
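
For concreteness, a first regression-test fragment for one of the
existing variants might look like this (object and schema names here are
illustrative, not taken from the patch):

```sql
-- sketch of a regression test for ALTER TABLE ... SET SCHEMA
CREATE SCHEMA test_schema_1;
CREATE TABLE test_tab (a int);
ALTER TABLE test_tab SET SCHEMA test_schema_1;

-- the table should now be resolvable only under the new schema
SELECT count(*) FROM pg_class c
  JOIN pg_namespace n ON n.oid = c.relnamespace
 WHERE c.relname = 'test_tab' AND n.nspname = 'test_schema_1';

-- error case: moving a table into the schema it is already in
ALTER TABLE test_schema_1.test_tab SET SCHEMA test_schema_1;  -- should fail

DROP TABLE test_schema_1.test_tab;
DROP SCHEMA test_schema_1;
```

Each of the other SET SCHEMA variants would get the same happy-path plus
error-path treatment.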

 Completion support for psql. Isn't that stepping on David's toes? :)
 I'll see about that later if needed, maybe sometime tomorrow…

Please do.  Tab completion support should really be included in the
patch - adding it as a separate patch is better than not having it, of
course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] knngist - 0.8

2010-11-22 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Nov 22, 2010 at 8:32 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 On balance I'm inclined to leave the unique key as per previous proposal
 (with a purpose column) and add the which-sort-order-is-that
 information as payload columns that aren't part of the key.

 This is probably OK too, although I confess I'm a lot less happy about
 it now that you've pointed out the need for those payload columns.

The reason I said columns is that I can foresee eventually wanting to
specify a pathkey in its entirety --- opfamily, asc/desc, nulls_first,
and whatever we come up with for collation.  We don't currently need to
store more than the opfamily, since the others can never need to have
non-default values given the current implementation of KNNGIST.  But the
idea that they might all be there eventually makes me feel that we don't
want to try to incorporate this data in pg_amop's unique key.  I'm
satisfied to say that only one sort order can be associated with a
particular operator in a particular opclass, which is what would be
implied by using AMOP_SEARCH/AMOP_ORDER as the unique key component.
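
For illustration, the catalog shape being described might end up looking
something like this; it is only a sketch, and the column names
amoppurpose and amopsortfamily are assumptions here, not settled names:

```sql
-- hypothetical pg_amop layout under this proposal: the unique key keeps
-- (opfamily, input types, strategy number) plus a search/order purpose
-- flag, while the sort-order information rides along as a non-key
-- payload column
SELECT amopopr::regoperator,
       amoppurpose,      -- 's' = search operator, 'o' = ordering operator
       amopsortfamily    -- OID of the btree opfamily defining the sort
                         -- order; zero (InvalidOid) for a search operator
  FROM pg_amop
 WHERE amoppurpose = 'o';
```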

regards, tom lane



Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread Pavel Stehule
2010/11/22 Robert Haas robertmh...@gmail.com:
 On Mon, Nov 22, 2010 at 3:36 PM, Pavel Stehule pavel.steh...@gmail.com 
 wrote:
 So, please, I know that you and Tom are busy, but try to spend a little
 time on this problem before you definitely reject this idea.

 If I were to spend more time on this problem, what exactly would I
 spend that time doing and how would it help?  If I were interested in
 spending time I'd probably spend it pursuing the suggestions Tom
 already made, and that's what I think you should do.  But I'm not
 going to do that, because the purpose of the CommitFest is not for me
 to write new patches from scratch that do something vaguely similar to
 what a patch you wrote was trying to do.  It's for all of us to review
 and commit the patches that have already been written.  You aren't
 helping with that process, so your complaint that we aren't spending
 enough time on your patches would be unfair even if it were true, and it
 isn't.  The problem with your patch is that it has a design weakness,
 not that it got short shrift.

OK, I can only recapitulate: this feature was proposed circa two months
ago, and at minimum Tom, and maybe you, agreed to it, with a request
about the syntax. Do you remember? I am a little bit tired that this
agreement was changed after I had spent my time on it.

Pavel




 --
 Robert Haas
 EnterpriseDB: http://www.enterprisedb.com
 The Enterprise PostgreSQL Company




Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread David Fetter
On Tue, Nov 23, 2010 at 05:55:28AM +0100, Pavel Stehule wrote:
 
 OK, I can only recapitulate: this feature was proposed circa two months
 ago, and at minimum Tom, and maybe you, agreed to it, with a request
 about the syntax. Do you remember? I am a little bit tired that this
 agreement was changed after I had spent my time on it.

What you can actually do that's productive is stop all current
development and concentrate on reviewing patches.  If the language gap
is an issue, please feel free to mail me your reviews in Czech, and I
will get them translated.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [HACKERS] final patch - plpgsql: for-in-array

2010-11-22 Thread Pavel Stehule
2010/11/23 David Fetter da...@fetter.org:
 On Tue, Nov 23, 2010 at 05:55:28AM +0100, Pavel Stehule wrote:

 OK, I can only recapitulate: this feature was proposed circa two months
 ago, and at minimum Tom, and maybe you, agreed to it, with a request
 about the syntax. Do you remember? I am a little bit tired that this
 agreement was changed after I had spent my time on it.

 What you can actually do that's productive is stop all current
 development and concentrate on reviewing patches.  If the language gap
 is an issue, please feel free to mail me your reviews in Czech, and I
 will get them translated.


This was my last mail on this topic, at least for some months.

Regards

Pavel Stehule

 Cheers,
 David.
 --
 David Fetter da...@fetter.org http://fetter.org/
 Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
 Skype: davidfetter      XMPP: david.fet...@gmail.com
 iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

 Remember to vote!
 Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-22 Thread Heikki Linnakangas

On 22.11.2010 23:18, Peter Tanski wrote:

Whatever test I use for Same(), Penalty() and Consistent() does not seem
to affect the problem significantly. For now I am only using
Consistent() as a check for retrieval.


I believe it's not possible to lose leaf tuples with incorrectly defined 
GiST support functions. You might get completely bogus results, but the 
tuples should be there when you look at the gist_tree() output. So this 
sounds like a GiST bug to me.



Note that there are only 133 leaf tuples -- for 500 rows. If the indexing
were operating correctly, there should be 500 leaf tuples there. If I
REINDEX the table, the number of leaf tuples may change slightly, but not
by much.


One idea for debugging is to insert the rows to the table one by one, 
and run the query after each insertion. When do the leaf tuples disappear?


If you can put together a small self-contained test case and post it to 
the list, I can take a look.
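
That row-at-a-time loop can be scripted; here is a sketch using a plain
box column with a built-in GiST opclass as a stand-in for the custom
type (the table name and probe values are made up for illustration):

```sql
-- stand-in for the custom opclass: a box column with a GiST index
CREATE TABLE gist_probe (b box);
CREATE INDEX gist_probe_idx ON gist_probe USING gist (b);

SET enable_seqscan = off;  -- make sure the count below uses the index

DO $$
DECLARE
  n bigint;
BEGIN
  FOR i IN 1..500 LOOP
    INSERT INTO gist_probe VALUES (box(point(i, i), point(i, i)));
    -- an index scan covering everything should see exactly i rows
    SELECT count(*) INTO n
      FROM gist_probe
     WHERE b <@ box '((0,0),(1000,1000))';
    IF n <> i THEN
      RAISE NOTICE 'after % inserts the index scan sees only % rows', i, n;
    END IF;
  END LOOP;
END;
$$;
```

The first iteration where the counts diverge is the insertion to look at
with gist_tree().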


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [JDBC] Support for JDBC setQueryTimeout, et al.

2010-11-22 Thread Kris Jurka



On Mon, 22 Nov 2010, Itagaki Takahiro wrote:


On Fri, Oct 15, 2010 at 03:40, Radosław Smogura
rsmog...@softperience.eu wrote:

Regarding JDBC in the CF process -- other interfaces are handled
there.  I haven't seen one patch this size for JDBC since I've been
involved, let alone two competing patches to implement the same
feature.  Small patches which can be quickly handled don't make sense
to put into the process, but it seemed reasonable for these.


In any case I'm sending this patch, and I will put it under Miscellaneous
in the CF. This cleaned-up patch is only 47k (the uncleaned one included
some binary read classes), and about 50% of it is a big test case.


I changed the patch's topic to JDBC.
https://commitfest.postgresql.org/action/patch_view?id=399



I don't think it makes sense to try to manage anything other than core 
code in the commitfest app.  The other patch touched the backend, so it 
made sense to put it in the commitfest, but as far as I understand it, 
this one is pure Java code.  There is a backlog of JDBC issues to deal 
with, but I think it needs its own commitfest instead of trying to tack on 
to the main project's.


Kris Jurka


Re: [HACKERS] Hot Standby: too many KnownAssignedXids

2010-11-22 Thread Heikki Linnakangas

On 19.11.2010 23:46, Joachim Wieland wrote:

FATAL:  too many KnownAssignedXids. head: 0, tail: 0, nxids: 9978,
pArray->maxKnownAssignedXids: 6890


Hmm, that's a lot of entries in KnownAssignedXids.

Can you recompile with WAL_DEBUG, and run the recovery again with 
wal_debug=on?  That will print all the replayed WAL records, which is a 
lot of data, but it might give a hint of what's going on.
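
For reference, that involves both a compile-time symbol and a GUC;
roughly the following, with paths and configure options abbreviated:

```
# WAL_DEBUG is a compile-time symbol; either pass it to configure
# or uncomment its #define in src/include/pg_config_manual.h
./configure CPPFLAGS="-DWAL_DEBUG" ...
make && make install

# then, on the standby, in postgresql.conf:
#   wal_debug = on
# and rerun recovery; each replayed WAL record is written to the log
```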


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
