Re: [HACKERS] GIN index creation extremely slow ?

2006-06-28 Thread Teodor Sigaev

I won't have access to the original testcase and server for a few days
but I just redid some testing on a slower personal box of mine with a
smaller(but similiar) testset and on that box I could not reproduce that
issue.
So the problem is either caused by the size of the table or somehow by
the data itself :-(


We tested gin with 5 millions records (but not a wikipedia's text, but with blog 
records which is usually mush shorter) and index creation time was near 8 hours 
on Dual Xeon 3.2, but we didn't try GiST. In other hand, creating GiST index on 
0.5 millions emails took about 12 hours.


That's why 7 minutes is very suspicious result.

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] GIN index creation extremely slow ?

2006-06-28 Thread Martijn van Oosterhout
On Wed, Jun 28, 2006 at 01:09:23PM +0400, Teodor Sigaev wrote:
 We tested gin with 5 millions records (but not a wikipedia's text, but with 
 blog records which is usually mush shorter) and index creation time was 
 near 8 hours on Dual Xeon 3.2, but we didn't try GiST. In other hand, 
 creating GiST index on 0.5 millions emails took about 12 hours.
 
 That's why 7 minutes is very suspicious result.

Tom did commit a patch a while ago which made a huge difference in
index creation time for tsearch by changing one routine. I don't know
if it got backpatched, so it might be worth checking people are working
on the same version.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] GIN index creation extremely slow ?

2006-06-28 Thread Oleg Bartunov

On Wed, 28 Jun 2006, Martijn van Oosterhout wrote:


On Wed, Jun 28, 2006 at 01:09:23PM +0400, Teodor Sigaev wrote:

We tested gin with 5 millions records (but not a wikipedia's text, but with
blog records which is usually mush shorter) and index creation time was
near 8 hours on Dual Xeon 3.2, but we didn't try GiST. In other hand,
creating GiST index on 0.5 millions emails took about 12 hours.

That's why 7 minutes is very suspicious result.


Tom did commit a patch a while ago which made a huge difference in
index creation time for tsearch by changing one routine. I don't know
if it got backpatched, so it might be worth checking people are working
on the same version.


it's on 8.1.4 and speedup is about 10 times.



Have a nice day,



Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] GIN index creation extremely slow ?

2006-06-28 Thread Teodor Sigaev

Tom did commit a patch a while ago which made a huge difference in
index creation time for tsearch by changing one routine. I don't know
if it got backpatched, so it might be worth checking people are working
on the same version.


I saw that patch, but I still think that 7 minutes is too small :)

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] Help with casting and comparing.

2006-06-28 Thread Tzahi Fadida
Hi,

I need help finding out how to determine if two types are equality compatible
and compare them.

I am using the following call to check for equality between two values:
DatumGetBool(\
FunctionCall2((fctx-tupleSetAttEQFunctions[attID]-eq_opr_finfo)\
  , lvalue, rvalue))

The structure fctx-tupleSetAttEQFunctions[attID]-eq_opr_finfo currently 
holds a reference to an equality function that is proper for the type of 
lvalue and rvalue.

Currently i only allow two values only of the same type but i wish to allow
to compare values like 20.2=?20.2 or 20=?20.0 etc...

The first step is to find out if two attributes are equality and casting 
compatible, i.e., if one type can be cast to the other type so they can be 
compared. Or, just equality compatible and the casting is done somehow behind
the scene.
Finally, i have to use a function to compare the two values.

-- 
Regards,
Tzahi.
--
Tzahi Fadida
Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info
WARNING TO SPAMMERS:  see at 
http://members.lycos.co.uk/my2nis/spamwarning.html

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] SO_SNDBUF size is small on win32?

2006-06-28 Thread Yoshiyuki Asaba
Hi,

From: Tom Lane [EMAIL PROTECTED]
Subject: Re: [HACKERS] SO_SNDBUF size is small on win32? 
Date: Tue, 27 Jun 2006 14:43:57 -0400

  What I would think might help is a patch on the libpq side (because it
  *does* use a nonblocking socket) to avoid sending more than 
  8K per WSASend call.
 
  It could definitly be a good idea to have a patch there *as well*, but I
  think they'd both be affected.
 
 On the libpq side, sending large messages is probably rare except for
 COPY IN mode.  Has anyone noticed performance issues specifically with
 COPY IN?

I think libpq interface does not use non-blocking socket. Because
'FRONTEND' symbol is enabled.

src/include/port/win32.h
  #ifndef FRONTEND
  #define socket(af, type, protocol) pgwin32_socket(af, type, protocol)
  #define accept(s, addr, addrlen) pgwin32_accept(s, addr, addrlen)
  #define connect(s, name, namelen) pgwin32_connect(s, name, namelen)
  #define select(n, r, w, e, timeout) pgwin32_select(n, r, w, e, timeout)
  #define recv(s, buf, len, flags) pgwin32_recv(s, buf, len, flags)
  #define send(s, buf, len, flags) pgwin32_send(s, buf, len, flags)

I think this is only server-side problem. Is this right?

Regards,

--
Yoshiyuki Asaba
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] SO_SNDBUF size is small on win32?

2006-06-28 Thread Tom Lane
Yoshiyuki Asaba [EMAIL PROTECTED] writes:
 I think libpq interface does not use non-blocking socket.

Not unless the Windows port has also disabled pg_set_noblock ...

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Help with casting and comparing.

2006-06-28 Thread Martijn van Oosterhout
On Wed, Jun 28, 2006 at 03:25:57PM +0300, Tzahi Fadida wrote:
 Hi,
 
 I need help finding out how to determine if two types are equality compatible
 and compare them.

snip

 Currently i only allow two values only of the same type but i wish to allow
 to compare values like 20.2=?20.2 or 20=?20.0 etc...
 
 The first step is to find out if two attributes are equality and casting 
 compatible, i.e., if one type can be cast to the other type so they can be 
 compared. Or, just equality compatible and the casting is done somehow behind
 the scene.

There's two ways an equality could work. For example, there are
equality operators that take parameters of different types. That's the
easy case. Then you have binary compatable types, and then types with
actual conversion functions.

Fortunatly the backend contains functions that do all this already.
Check out parser/parse_oper.c, in particular oper() and
compatable_oper().

You may have to be prepared to handle a parsetree to do the actual
work.

/* oper() -- search for a binary operator
 * Given operator name, types of arg1 and arg2, return oper struct.
 *
 * IMPORTANT: the returned operator (if any) is only promised to be
 * coercion-compatible with the input datatypes.  Do not use this if
 * you need an exact- or binary-compatible match; see compatible_oper.
...etc...

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


[HACKERS] optimizing constant quals within outer joins

2006-06-28 Thread Phil Frost
I have an optimization I'd like to see which I think should be pretty
easy for someone familiar with the planner code to implement. My
situation is this: I have an application using veil[1]. Essentially, I
have a schema private and another public. Private contains regular
tables, where private contains views on those tables, like create view
public.foo as select * from foo where i_have_global_priv('select_foo'),
and i_have_global_priv is a stable function.

My problem is that in several situations, postgresql is planning a
sequential scan with i_have_global_priv(n) as a filter, where N is some
constant literal specified in the view definition. This leads to the
function being called hundreds of thousands of times, which makes my
query orders of magnitude slower.

In some cases, the planner already optimizes this by moving the where
i_have_global_priv(n) qualification out of the seq scan filter and into
the one-time filter of a result node. The relevant function in the code
seems to be pull_constant_clauses, called from query_planner in
planmain.c around line 118.

By experimentation, it seems that this optimization will not be made on
either side of an outer join. For example:

dew=# explain select * from
(select * from private.orderitem where i_have_global_priv(28)) as oi
join (
select * from private.orderitemproduct where i_have_global_priv(32)
) as oip using (objectid);
  QUERY PLAN
   
---
 Result  (cost=96.56..402.70 rows=5004 width=325)
   One-Time Filter: (i_have_global_priv(28) AND i_have_global_priv(32))
   -  Hash Join  (cost=96.55..402.69 rows=5004 width=325)
 Hash Cond: (outer.objectid = inner.objectid)
 -  Seq Scan on orderitem  (cost=0.00..165.44 rows=6044 width=306)
 -  Hash  (cost=84.04..84.04 rows=5004 width=23)
   -  Seq Scan on orderitemproduct  (cost=0.00..84.04 rows=5004 
width=23)

dew=# explain select * from
(select * from private.orderitem where i_have_global_priv(28)) as oi
left join (
select * from private.orderitemproduct where i_have_global_priv(32)
) as oip using (objectid);
   QUERY PLAN   
 
-
 Hash Left Join  (cost=100.72..301.94 rows=2015 width=325)
   Hash Cond: (outer.objectid = inner.objectid)
   -  Seq Scan on orderitem  (cost=0.00..180.55 rows=2015 width=306)
 Filter: i_have_global_priv(28)
   -  Hash  (cost=96.55..96.55 rows=1668 width=23)
 -  Seq Scan on orderitemproduct  (cost=0.00..96.55 rows=1668 width=23)
   Filter: i_have_global_priv(32)


Notice that the cross join plan results in i_have_global_priv being
called just twice -- once for each privilege being checked, while the
left join plan will result in it being called once for each row.

So, is this something I can coerce someone into doing? It would be very
much appreciated here.


[1] http://veil.projects.postgresql.org/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] optimizing constant quals within outer joins

2006-06-28 Thread Martijn van Oosterhout
On Wed, Jun 28, 2006 at 10:35:37AM -0400, Phil Frost wrote:
 I have an optimization I'd like to see which I think should be pretty
 easy for someone familiar with the planner code to implement. My
 situation is this: I have an application using veil[1]. Essentially, I
 have a schema private and another public. Private contains regular
 tables, where private contains views on those tables, like create view
 public.foo as select * from foo where i_have_global_priv('select_foo'),
 and i_have_global_priv is a stable function.
 
 My problem is that in several situations, postgresql is planning a
 sequential scan with i_have_global_priv(n) as a filter, where N is some
 constant literal specified in the view definition. This leads to the
 function being called hundreds of thousands of times, which makes my
 query orders of magnitude slower.

Is the function marked stable or immutable?

In the examples you give the planner can't move the function around the
tree because that would change the output of the query. For inner joins
it's ok, for outer joins it's much more tricky.

I thought the planner would evaluate constant conditions early on which
I why I'm asking about the function.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature


Re: [HACKERS] Help with casting and comparing.

2006-06-28 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 On Wed, Jun 28, 2006 at 03:25:57PM +0300, Tzahi Fadida wrote:
 I need help finding out how to determine if two types are equality compatible
 and compare them.

 Fortunatly the backend contains functions that do all this already.
 Check out parser/parse_oper.c, in particular oper() and
 compatable_oper().

Note that this still leaves the question of what operator to search for,
and where to look for it.  The current system doesn't really provide an
adequate way of identifying a suitable equality operator; you kind of
have to take it on faith that people won't have made = do unexpected
things (an assumption already violated by some builtin datatypes ...).
We've been moving gradually in the direction of relying on btree
operator classes to give us a better understanding of which operators
really act like equality, but it's far from all done.

The most recent thread about fixing this was
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00960.php
Nothing much has been done since then as far as fixing foreign-key
checks, but you might want to look at the code for interpreting row
value comparisons (make_row_comparison_op in parse_expr.c).
SelectSortFunction in tuplesort.c is another example of looking for
btree info to infer the behavior of an operator.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] optimizing constant quals within outer joins

2006-06-28 Thread Phil Frost
On Wed, Jun 28, 2006 at 05:11:59PM +0200, Martijn van Oosterhout wrote:
 On Wed, Jun 28, 2006 at 10:35:37AM -0400, Phil Frost wrote:
  I have an optimization I'd like to see which I think should be pretty
  easy for someone familiar with the planner code to implement. My
  situation is this: I have an application using veil[1]. Essentially, I
  have a schema private and another public. Private contains regular
  tables, where private contains views on those tables, like create view
  public.foo as select * from foo where i_have_global_priv('select_foo'),
  and i_have_global_priv is a stable function.
  
  My problem is that in several situations, postgresql is planning a
  sequential scan with i_have_global_priv(n) as a filter, where N is some
  constant literal specified in the view definition. This leads to the
  function being called hundreds of thousands of times, which makes my
  query orders of magnitude slower.
 
 Is the function marked stable or immutable?
 
 In the examples you give the planner can't move the function around the
 tree because that would change the output of the query. For inner joins
 it's ok, for outer joins it's much more tricky.
 
 I thought the planner would evaluate constant conditions early on which
 I why I'm asking about the function.

i_have_global_priv is a stable function.

The planner in fact can move the function around without changing the
output. I can make it do so by putting offset 0 in the subqueries:

dew=# explain select * from
(select * from private.orderitem where i_have_global_priv(28) offset 0) as 
oi
left join (
select * from private.orderitemproduct where i_have_global_priv(32) 
offset 0
) as oip using (objectid);
QUERY PLAN  
   
---
 Merge Right Join  (cost=1310.33..3603.67 rows=151221 width=187)
   Merge Cond: (outer.objectid = inner.objectid)
   -  Sort  (cost=441.55..454.06 rows=5004 width=45)
 Sort Key: oip.objectid
 -  Subquery Scan oip  (cost=0.00..134.08 rows=5004 width=45)
   -  Limit  (cost=0.00..84.04 rows=5004 width=23)
 -  Result  (cost=0.00..84.04 rows=5004 width=23)
   One-Time Filter: i_have_global_priv(32)
   -  Seq Scan on orderitemproduct  (cost=0.00..84.04 
rows=5004 width=23)
   -  Sort  (cost=868.78..883.89 rows=6044 width=146)
 Sort Key: oi.objectid
 -  Limit  (cost=0.00..165.44 rows=6044 width=306)
   -  Result  (cost=0.00..165.44 rows=6044 width=306)
 One-Time Filter: i_have_global_priv(28)
 -  Seq Scan on orderitem  (cost=0.00..165.44 rows=6044 
width=306)

The transformation is from this:

   -  Seq Scan on orderitem  (cost=0.00..180.55 rows=2015 width=306)
 Filter: i_have_global_priv(28)

to this:

   -  Result  (cost=0.00..165.44 rows=6044 width=306)
 One-Time Filter: i_have_global_priv(28)
 -  Seq Scan on orderitem  (cost=0.00..165.44 rows=6044 width=306)

which produce the same result. However, I'm not about to put offset 0
in all my view definitions, as that would prevent a number of other
extremely desirable optimizations.

Can a Result node not be an input to an outer join node? That would make
me sad :(

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] optimizing constant quals within outer joins

2006-06-28 Thread Tom Lane
Phil Frost [EMAIL PROTECTED] writes:
 The planner in fact can move the function around without changing the
 output.

Not when it's within the nullable side of an outer join --- moving a
WHERE clause up out of that would make the difference between no row
out, and a null-extended row out, which are certainly not the same.

I'm not sure why it's not pulling up from the left side of the left join
though.  That might be a bug.  What PG version is this exactly?

Of course the real question is why is your app generating such poorly
phrased queries ;-)

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] SO_SNDBUF size is small on win32?

2006-06-28 Thread Yoshiyuki Asaba
From: Tom Lane [EMAIL PROTECTED]
Subject: Re: [HACKERS] SO_SNDBUF size is small on win32? 
Date: Wed, 28 Jun 2006 09:54:21 -0400

 Yoshiyuki Asaba [EMAIL PROTECTED] writes:
  I think libpq interface does not use non-blocking socket.
 
 Not unless the Windows port has also disabled pg_set_noblock ...

Sorry, I misunderstood.

I tried to occur this issue on msys.

 % cat test.sh
 export PGHOST=xxx
 export PGPORT=5432
 export PGDATABASE=test
 
 dropdb $PGDATABASE
 createdb
 psql -c 'CREATE TABLE t1 (a int, b text)'
 i=0
 while [ $i -lt 50 ]; do
   psql -c insert into t1 values ($i, repeat('x', 1))
   i=`expr $i + 1`
 done
 pg_dump -a  dump
 time psql -f dump

 % sh test.sh

But, I did not occur this issue... Does anyone occur this issue?
--
Yoshiyuki Asaba
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Tom Lane
Buildfarm member platypus is showing a regression failure that I'm
surprised we have not seen before:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=platypusdt=2006-06-28%2014:05:01

Basically what this is showing is that when there is more than one
referencing table, the order in which things get done is dependent
on chance locations of system catalog entries.  That results in
cosmetic differences in which of multiple violations gets reported,
or in the order of truncate cascades to notices.

Given our push to have the buildfarm all green all the time,
I don't think I want to just live with occasional failures.
Seems like the alternatives are

1. Find a way to make the processing order consistent (eg by driving it
off OID ordering).  Doesn't seem easy, but maybe I'm missing an idea.

2. Install multiple expected files for the truncate test.

3. Dumb down the test cases so that they don't test multiple-cascade
situations.

Don't much care for any of these :-(.

Also, it seems possible that not-so-cosmetic problems could occur, for
instance deadlock between two backends trying to truncate the same
tables in different orders.  That suggests that answer #1 would be the
best way to fix it, but that would mean ordering the tables consistently
before we even get any locks on them, which seems hard.

Thoughts?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] optimizing constant quals within outer joins

2006-06-28 Thread Phil Frost
On Wed, Jun 28, 2006 at 11:40:52AM -0400, Tom Lane wrote:
 Phil Frost [EMAIL PROTECTED] writes:
  The planner in fact can move the function around without changing the
  output.
 
 Not when it's within the nullable side of an outer join --- moving a
 WHERE clause up out of that would make the difference between no row
 out, and a null-extended row out, which are certainly not the same.
 
 I'm not sure why it's not pulling up from the left side of the left join
 though.  That might be a bug.  What PG version is this exactly?
 
 Of course the real question is why is your app generating such poorly
 phrased queries ;-)

Sure it can't pull the condition to the root result node, but it can
make an intermediate result node that is a child of the join and wraps
the sequential scan. offset 0 makes it do this. I'd like this:

create table a(i int);
create table b(i int);
create function stable_function() returns bool language plpgsql stable as $$
begin return true; end $$;
create view c as select * from b where stable_function();
explain select * from a left join c using (i);
   QUERY PLAN
-
 Merge Right Join  (cost=220.32..338.32 rows=7629 width=4)
   Merge Cond: (outer.i = inner.i)
   -  Sort  (cost=70.54..72.32 rows=713 width=4)
 Sort Key: b.i
 -  Seq Scan on b  (cost=0.00..36.75 rows=713 width=4)
   Filter: stable_function()
   -  Sort  (cost=149.78..155.13 rows=2140 width=4)
 Sort Key: a.i
 -  Seq Scan on a  (cost=0.00..31.40 rows=2140 width=4)

to become this:

   QUERY PLAN
-
 Merge Right Join  (cost=220.32..338.32 rows=7629 width=4)
   Merge Cond: (outer.i = inner.i)
   -  Sort  (cost=70.54..72.32 rows=713 width=4)
 Sort Key: b.i
 -  Result
   One-Time Filter: stable_function()
   -  Seq Scan on b  (cost=0.00..36.75 rows=713 width=4)
 Filter: stable_function()
   -  Sort  (cost=149.78..155.13 rows=2140 width=4)
 Sort Key: a.i
 -  Seq Scan on a  (cost=0.00..31.40 rows=2140 width=4)

That will make the same results. Maybe there is something about the
implementation that I don't understand that makes it hard, but the
concept is simple: before you do a seq scan on b, you call
stable_function(), and if it returns true, you just do the sequential
scan without calling stable_function() for each row. If it returns
false, you can not do the sequental scan at all, and return the empty
set immediately.

I wasn't aware my queries are badly phrased. The application generates
quite nice queries like select * from saleorder_summary, which is a view
along the lines of 'select * from order left join saleorder using
(objectid)'. order and saleorder are views like select * from
private.order where i_have_global_priv(20). The subqueries are in the
examples I gave just to make it simpler to demonstrate.

The only other way I can think of phrasing a query like that is perhaps

select *
from private.order
left join purchaseorder on (
order.objectid = purchaseorder.objectid and i_have_global_priv(31)
)

This of course would not only be hugely inconvinent, but would require
that regular users have unrestricted access to the base tables, which
totally defeats the purpose of using veil. Also, that too is not
optimized as well as it could be:

test=# explain select * from a left join b on (a.i = b.i and stable_function());
   QUERY PLAN
-
 Merge Left Join  (cost=299.56..710.97 rows=7633 width=8)
   Merge Cond: (outer.i = inner.i)
   Join Filter: stable_function()
   -  Sort  (cost=149.78..155.13 rows=2140 width=4)
 Sort Key: a.i
 -  Seq Scan on a  (cost=0.00..31.40 rows=2140 width=4)
   -  Sort  (cost=149.78..155.13 rows=2140 width=4)
 Sort Key: b.i
 -  Seq Scan on b  (cost=0.00..31.40 rows=2140 width=4)

stable_function() will still be called multiple times needlessly.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Index corruption

2006-06-28 Thread Marc Munro
We have now experienced index corruption on two separate but identical
slony clusters.  In each case the slony subscriber failed after
attempting to insert a duplicate record.  In each case reindexing the
sl_log_1 table on the provider fixed the problem.

The latest occurrence was on our production cluster yesterday.  This has
only happened since we performed kernel upgrades and we are uncertain
whether this represents a kernel bug, or a postgres bug exposed by
different timings in the new kernel.

Our systems are:

Sun v40z 4 x Dual Core AMD Opteron(tm) Processor 875
Kernel 2.6.16.14 #8 SMP x86_64 x86_64 x86_64 GNU/Linux
kernel boot option: elevator=deadline
16 Gigs of RAM
postgresql-8.0.3-1PGDG
Bonded e1000/tg3 NICs with 8192 MTU.
Slony 1.1.0

NetApp FAS270 OnTap 7.0.3
Mounted with the NFS options
rw,nfsvers=3,hard,rsize=32768,wsize=32768,timeo=600,tcp,noac
Jumbo frames 8192 MTU.

All postgres data and logs are stored on the netapp.

In the latest episode, the index corruption was coincident with a
slony-induced vacuum.  I don't know if this was the case with our test
system failures.

What can we do to help identify the cause of this?  I believe we will be
able to reproduce this on a test system if there is some useful
investigation we can perform.

__
Marc


signature.asc
Description: This is a digitally signed message part


[HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as primary keys

2006-06-28 Thread Thomas Hallgren

Martijn van Oosterhout wrote:

On Wed, Jun 28, 2006 at 01:56:47PM +0200, Thomas Hallgren wrote:
A user that is trusted with installing a C-function in the backend is 
free to scan the process memory anyway so in what way did that increase 
the security? IMHO, the only relevant security in that context is to 
have trusted people install trusted modules. I'm surprised that 
something like that made you remove significant functionality.


You're missing the point. The type output function is not generally a
priveledged function. Think bpcharout, text_out, numeric_out, etc...
These can be called by users directly and the input to those functions
cannot be trusted.

Ah, OK that makes sense. An alternative solution when the signature was changed could 
perhaps have been to pass one single argument, a structure appointing the data and its 
associated type. My idea would work if the data and its type lived together always from the 
moment its instantiated (read from disk or otherwise) and until death do them apart (or the 
data is stored on disk, in which case the tupledesc knows what it is). I guess that would 
imply a major rewrite and that my desire to have a RAW fixed length type isn't enough 
motivation to do that :-)


Instead, I would like to humbly request the inclusion of a UUID datatype (or an opaque 128 
bit datatype) in the core package. It's increasingly common and some databases (MS 
SQLServer) already have built in support for it.


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Alvaro Herrera
Tom Lane wrote:

 1. Find a way to make the processing order consistent (eg by driving it
 off OID ordering).  Doesn't seem easy, but maybe I'm missing an idea.

Hmm, what about

1. get the complete list of tables to truncate, AccessShareLock'ed, get
their names
2. release locks
3. sort the list lexicographically (or by Oid, whatever)
4. acquire the stronger locks, in list order, taking care of not
aborting if a table is no longer there
5. truncate

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as primary keys

2006-06-28 Thread Jim C. Nasby
On Wed, Jun 28, 2006 at 06:39:16PM +0200, Thomas Hallgren wrote:
 Instead, I would like to humbly request the inclusion of a UUID datatype 
 (or an opaque 128 bit datatype) in the core package. It's increasingly 

ISTM that we get enough requests for this that it's probably worth
doing.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 1. Find a way to make the processing order consistent (eg by driving it
 off OID ordering).  Doesn't seem easy, but maybe I'm missing an idea.

 Hmm, what about

 1. get the complete list of tables to truncate, AccessShareLock'ed, get
 their names
 2. release locks
 3. sort the list lexicographically (or by Oid, whatever)
 4. acquire the stronger locks, in list order, taking care of not
 aborting if a table is no longer there
 5. truncate

Releasing locks is no good ... what if someone adds/drops FK constraints
while you've not got any lock?

One thing I was toying with was to add an index to pg_constraint on,
say, (confrelid, conrelid), and to replace the existing seqscans for FK
constraints with scans using this index.  The second-column ordering
would guarantee everybody visits the entries in the same order.  Not
sure about overall performance implications ... in a small database,
several indexscans might take more time than one seqscan.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as primary keys

2006-06-28 Thread mark
On Wed, Jun 28, 2006 at 06:39:16PM +0200, Thomas Hallgren wrote:
 Instead, I would like to humbly request the inclusion of a UUID datatype 
 (or an opaque 128 bit datatype) in the core package. It's increasingly 
 common and some databases (MS SQLServer) already have built in support for 
 it.

We have it. We're just not putting in the effort required to have it
included in core, as it's too much effort to convince people that the
type has value, that is is generic, and would be widely used without
being abused. All the geometric types that I'll never use in core,
with few or no uses, including functions to operate on these types,
and no UUID type... Hehe... To me, that's irony... :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Jim C. Nasby
On Wed, Jun 28, 2006 at 01:13:42PM -0400, Tom Lane wrote:
 One thing I was toying with was to add an index to pg_constraint on,
 say, (confrelid, conrelid), and to replace the existing seqscans for FK
 constraints with scans using this index.  The second-column ordering
 would guarantee everybody visits the entries in the same order.  Not
 sure about overall performance implications ... in a small database,
 several indexscans might take more time than one seqscan.

In a small database, both operations are likely to be plenty fast for
TRUNCATE, though. Surely the performance impact of getting the requisite
locks would far exceed any catalog scan times, no? And if you were doing
TRUNCATE's very often, I'd expect the right pages to be in cache
anyway...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-28 Thread A.M.
On Wed, June 28, 2006 1:14 pm, [EMAIL PROTECTED] wrote:
 On Wed, Jun 28, 2006 at 06:39:16PM +0200, Thomas Hallgren wrote:

 Instead, I would like to humbly request the inclusion of a UUID
 datatype (or an opaque 128 bit datatype) in the core package. It's
 increasingly common and some databases (MS SQLServer) already have built
 in support for it.

 We have it. We're just not putting in the effort required to have it
 included in core, as it's too much effort to convince people that the type
 has value, that is is generic, and would be widely used without being
 abused. All the geometric types that I'll never use in core, with few or
 no uses, including functions to operate on these types, and no UUID
 type... Hehe... To me, that's irony... :-)

Is it on pgfoundry? From past discussions, the new criteria for getting
something into core is to first determine if it is successful on
pgfoundry.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Index corruption

2006-06-28 Thread Tom Lane
Marc Munro [EMAIL PROTECTED] writes:
 We have now experienced index corruption on two separate but identical
 slony clusters.  In each case the slony subscriber failed after
 attempting to insert a duplicate record.  In each case reindexing the
 sl_log_1 table on the provider fixed the problem.

Please be more specific about what you mean by index corruption
... what were the exact symptoms?

 postgresql-8.0.3-1PGDG

The *first* thing you should do is update to 8.0.8 and see if the
problem is still there.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-28 Thread Jim C. Nasby
On Wed, Jun 28, 2006 at 01:18:39PM -0400, A.M. wrote:
 On Wed, June 28, 2006 1:14 pm, [EMAIL PROTECTED] wrote:
  On Wed, Jun 28, 2006 at 06:39:16PM +0200, Thomas Hallgren wrote:
 
  Instead, I would like to humbly request the inclusion of a UUID
  datatype (or an opaque 128 bit datatype) in the core package. It's
  increasingly common and some databases (MS SQLServer) already have built
  in support for it.
 
  We have it. We're just not putting in the effort required to have it
  included in core, as it's too much effort to convince people that the type
  has value, that is is generic, and would be widely used without being
  abused. All the geometric types that I'll never use in core, with few or
  no uses, including functions to operate on these types, and no UUID
  type... Hehe... To me, that's irony... :-)
 
 Is it on pgfoundry? From past discussions, the new criteria for getting
 something into core is to first determine if it is successful on
 pgfoundry.

If http://lnk.nu/pgfoundry.org/a86.php is accurate, then no one has ever
downloaded it. But I find that exceptionally hard to believe...

Looking back through the list archives I think you'd find this comes up
at least every few months.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Andrew Dunstan

Tom Lane wrote:


Buildfarm member platypus is showing a regression failure that I'm
surprised we have not seen before:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=platypusdt=2006-06-28%2014:05:01

Basically what this is showing is that when there is more than one
referencing table, the order in which things get done is dependent
on chance locations of system catalog entries.  That results in
cosmetic differences in which of multiple violations gets reported,
or in the order of truncate cascades to notices.

Given our push to have the buildfarm all green all the time,
I don't think I want to just live with occasional failures.
Seems like the alternatives are

1. Find a way to make the processing order consistent (eg by driving it
off OID ordering).  Doesn't seem easy, but maybe I'm missing an idea.

2. Install multiple expected files for the truncate test.

3. Dumb down the test cases so that they don't test multiple-cascade
situations.

Don't much care for any of these :-(.

Also, it seems possible that not-so-cosmetic problems could occur, for
instance deadlock between two backends trying to truncate the same
tables in different orders.  That suggests that answer #1 would be the
best way to fix it, but that would mean ordering the tables consistently
before we even get any locks on them, which seems hard.

Thoughts?


 



If this were a significant risk wouldn't we have seen many such failures 
before now? I guess we don't expect to see concurrent truncates being 
run. Probably worth protecting against, but also probably something of a 
corner case.


In the absence of a fix I'd go for the extra regression result file.

cheers

andrew


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-28 Thread mark
On Wed, Jun 28, 2006 at 12:38:50PM -0500, Jim C. Nasby wrote:
 On Wed, Jun 28, 2006 at 01:18:39PM -0400, A.M. wrote:
  On Wed, June 28, 2006 1:14 pm, [EMAIL PROTECTED] wrote:
   On Wed, Jun 28, 2006 at 06:39:16PM +0200, Thomas Hallgren wrote:
   Instead, I would like to humbly request the inclusion of a UUID
   datatype (or an opaque 128 bit datatype) in the core package. It's
   increasingly common and some databases (MS SQLServer) already have built
   in support for it.
   We have it. We're just not putting in the effort required to have it
   included in core, as it's too much effort to convince people that the type
   has value, that is is generic, and would be widely used without being
   abused. All the geometric types that I'll never use in core, with few or
   no uses, including functions to operate on these types, and no UUID
   type... Hehe... To me, that's irony... :-)
  Is it on pgfoundry? From past discussions, the new criteria for getting
  something into core is to first determine if it is successful on
  pgfoundry.
 If http://lnk.nu/pgfoundry.org/a86.php is accurate, then no one has ever
 downloaded it. But I find that exceptionally hard to believe...
 
 Looking back through the list archives I think you'd find this comes up
 at least every few months.

I've downloaded the version off pgfoundry.org. It is broken. It leaks
memory, and if memory is correct it can cause the client to core dump.

Two of us worked on a re-write based off a different UUID system library,
and I've been happily using it in production for a year or so. I don't
believe either of us have bothered to market it. Each time it comes up,
a number of people on this list shut it down, and it doesn't seem worth
the effort to convince them otherwise. They can have their ivory tower,
and I can have my plugin.

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-28 Thread Andrew Dunstan

Jim C. Nasby wrote:


On Wed, Jun 28, 2006 at 01:18:39PM -0400, A.M. wrote:
 


On Wed, June 28, 2006 1:14 pm, [EMAIL PROTECTED] wrote:
   


On Wed, Jun 28, 2006 at 06:39:16PM +0200, Thomas Hallgren wrote:

 


Instead, I would like to humbly request the inclusion of a UUID
datatype (or an opaque 128 bit datatype) in the core package. It's
increasingly common and some databases (MS SQLServer) already have built
in support for it.
   


We have it. We're just not putting in the effort required to have it
included in core, as it's too much effort to convince people that the type
has value, that is is generic, and would be widely used without being
abused. All the geometric types that I'll never use in core, with few or
no uses, including functions to operate on these types, and no UUID
type... Hehe... To me, that's irony... :-)
 


Is it on pgfoundry? From past discussions, the new criteria for getting
something into core is to first determine if it is successful on
pgfoundry.
   



If http://lnk.nu/pgfoundry.org/a86.php is accurate, then no one has ever
downloaded it. But I find that exceptionally hard to believe...

Looking back through the list archives I think you'd find this comes up
at least every few months.
 



That's because there is nothing there to download. See instead: 
http://gborg.postgresql.org/project/pguuid/projdisplay.php


Personally I don't buy the misuse objection - we already have plenty of 
things that can be misused. As long as there is a reasonable valid use 
and we can make it portable enough, I think there is a good case for 
including it.


cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  1. Find a way to make the processing order consistent (eg by driving it
  off OID ordering).  Doesn't seem easy, but maybe I'm missing an idea.
 
  Hmm, what about
 
  1. get the complete list of tables to truncate, AccessShareLock'ed, get
  their names
  2. release locks
  3. sort the list lexicographically (or by Oid, whatever)
  4. acquire the stronger locks, in list order, taking care of not
  aborting if a table is no longer there
  5. truncate
 
 Releasing locks is no good ... what if someone adds/drops FK constraints
 while you've not got any lock?

Recheck after acquiring the stronger locks, unlock and drop from list.

 One thing I was toying with was to add an index to pg_constraint on,
 say, (confrelid, conrelid), and to replace the existing seqscans for FK
 constraints with scans using this index.  The second-column ordering
 would guarantee everybody visits the entries in the same order.  Not
 sure about overall performance implications ... in a small database,
 several indexscans might take more time than one seqscan.

I think there is more than one place that would benefit from such an
index.  Probably turn into a syscache as well?

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-28 Thread Alvaro Herrera
[EMAIL PROTECTED] wrote:

 I've downloaded the version off pgfoundry.org. It is broken. It leaks
 memory, and if memory is correct it can cause the client to core dump.

Also it couldn't possibly be included in core, since it's based on a
GPL'ed UUID library.  If you have a more appropiately licensed package,
it could be considered for inclusion.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 If this were a significant risk wouldn't we have seen many such failures 
 before now?

Hard to tell.  It's possibly architecture-dependent, for one thing
(MAXALIGN will affect space availability).  Since this happened in a
parallel regression run, it could also be a matter of timing relative to
the concurrent tests.  I've often thought that we are not getting as much
mileage out of the parallel-testing facility as we could, because it's
really not exercising variations in timing all that much.  It'd be
interesting to throw in a small random delay at the start of each member
of a concurrent set of tests.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Tom Lane wrote:
  Alvaro Herrera [EMAIL PROTECTED] writes:
   Tom Lane wrote:
   1. Find a way to make the processing order consistent (eg by driving it
   off OID ordering).  Doesn't seem easy, but maybe I'm missing an idea.
  
   Hmm, what about
  
   1. get the complete list of tables to truncate, AccessShareLock'ed, get
   their names
   2. release locks
   3. sort the list lexicographically (or by Oid, whatever)
   4. acquire the stronger locks, in list order, taking care of not
   aborting if a table is no longer there
   5. truncate
  
  Releasing locks is no good ... what if someone adds/drops FK constraints
  while you've not got any lock?
 
 Recheck after acquiring the stronger locks, unlock and drop from list.

Oops, this doesn't cover the add FK constraints case, only drop.  I
think it would work to keep the locks on the tables initially mentioned
in the command (i.e. those not followed by CASCADE).  Hmm, but it fails
if it cascades more than once, so scratch that.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 One thing I was toying with was to add an index to pg_constraint on,
 say, (confrelid, conrelid), and to replace the existing seqscans for FK
 constraints with scans using this index.

 I think there is more than one place that would benefit from such an
 index.  Probably turn into a syscache as well?

Yeah, that was in the back of my mind too, but I haven't looked through
the code to see.  A syscache wouldn't work because it's not a unique key.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Single Index Tuple Chain (SITC) method

2006-06-28 Thread Bruce Momjian
bruce wrote:
 Greg Stark wrote:
  
  Bruce Momjian [EMAIL PROTECTED] writes:
  
   PFC wrote:

 My idea is that if an UPDATE places the new tuple on the same page as
 the old tuple, it will not create new index entries for any indexes
 where the key doesn't change.

Basically the idea behind preventing index bloat by updates is 
to have  
one index tuple point to several actual tuples having the same value.

   
   The idea is not to avoid index bloat, but to allow heap reuse, and having
   one index entry for multiple versions of an UPDATEd row is merely an
   implementation detail.
  
  It sort of sounds like you're describing a whole new index type that stores
  only the page, not the precise record of any tuple it indexes. If your table
 
 Background, indexes point to page item pointers, not to actual offsets
 in the page.  This is how vacuum can move around tuples without modifying the
 indexes.  The index points to a page item pointer that is a chain of
 tuples with the same indexed columns.

Here is an overview of the SITC method:

http://momjian.us/cgi-bin/pgsitc

Anyone want to start coding?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Fixed length datatypes. WAS [GENERAL] UUID's as

2006-06-28 Thread Jim C. Nasby
On Wed, Jun 28, 2006 at 01:49:55PM -0400, Andrew Dunstan wrote:
 Personally I don't buy the misuse objection - we already have plenty of 
 things that can be misused. As long as there is a reasonable valid use 
 and we can make it portable enough, I think there is a good case for 
 including it.

Well, since Mark has one, how about we consider adding it in?

If nothing else, can you please put your stuff on pgFoundry so others
can find it, Mark?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Single Index Tuple Chain (SITC) method

2006-06-28 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Here is an overview of the SITC method:
   http://momjian.us/cgi-bin/pgsitc

A pretty fundamental problem is that the method assumes it's OK to
change the CTID of a live tuple (by swapping its item pointer with some
expired version).  It is not --- this will break:
* active UPDATEs and DELETEs that may have fetched the CTID
  but not yet completed processing to decide whether to change
  the tuple;
* pending AFTER ROW triggers, such as foreign key checks;
* ODBC as well as other applications that assume CTID is a
  usable unique row identifier within transactions.
VACUUM FULL can get away with moving tuples to new CTIDs because it takes
AccessExclusiveLock, so there can be no open transactions with knowledge
of current CTIDs in the table.  This is not OK for something that's
supposed to happen in plain UPDATEs, though.

Another problem is you can't recycle tuples, nor item ids, without
taking a VACUUM-style lock on the page (LockBufferForCleanup).  If
anyone else is holding a pin on the page they risk getting totally
confused --- for instance, a seqscan will either miss a tuple or scan it
twice depending on which direction you're juggling item ids around it.
The concurrency loss involved in LockBufferForCleanup is OK for
background-maintenance operations like VACUUM, but I seriously doubt
anyone will find it acceptable for UPDATE.  It could easily create
application-level deadlocks, too.  (VACUUM is safe against that because
it only holds one lock.)

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Buildfarm member platypus is showing a regression failure that I'm
 surprised we have not seen before:
 http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=platypusdt=2006-06-28%2014:05:01

 If this were a significant risk wouldn't we have seen many such failures 
 before now?

mongoose just failed with almost the exact same symptoms:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mongoosedt=2006-06-28%2021:30:02

It's not quite the same diffs, which is unsurprising given the presumed
mechanism behind the failure, but that probably shoots down the add
another expected file response.

I imagine some recent change has made the probability of this behavior
much higher than it was before; perhaps there's more pg_constraint
update traffic in concurrent tests?  Anyway, it's now up to must fix
in my estimation.  I'll look into the new-pg_constraint-index idea.

I think someone should also take a hard look at that idea of introducing
more timing variability into the parallel tests.  Any volunteers?

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Instability in TRUNCATE regression test

2006-06-28 Thread Tom Lane
I wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 One thing I was toying with was to add an index to pg_constraint on,
 say, (confrelid, conrelid), and to replace the existing seqscans for FK
 constraints with scans using this index.

 I think there is more than one place that would benefit from such an
 index.  Probably turn into a syscache as well?

 Yeah, that was in the back of my mind too, but I haven't looked through
 the code to see.  A syscache wouldn't work because it's not a unique key.

Having looked through the code, the only two places that currently seem
to have any need for an index on confrelid are the two paths in TRUNCATE
that find/check for FK relationships.  So I'm hesitant to add an index
just for that; seems like too much overhead to put onto all other
updates of pg_constraint.

What we can perhaps do instead is pull out the related OIDs (ie, a
function that given a rel OID returns a list of rels that have FK
dependencies on that rel) and then sort that list into OID order before
acting on it.

Note: the OID-sort-order concept is not perfect; if the OID counter were
to wrap around while the regression tests are running, you could get a
bogus failure of this type.  That seems low enough probability to live
with, though.  Anyway it'll never happen in the buildfarm's usage, since
buildfarm only runs the tests in freshly-initdb'd databases.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [GENERAL] UUID's as primary keys

2006-06-28 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 Martijn van Oosterhout kleptog@svana.org writes:
  The input functions get it, the output functions (bpcharout,
  bpcharsend, etc) don't. Which makes it kind of hard to print a raw
  value if you don't know how long it's going to be. They used to, but
  that was removed some time back.

 Even back then you couldn't rely on the typmod value to be supplied;
 it was quite likely to be passed as -1.  The issue is not actually
 with on-disk storage, it is with function/operator arguments and
 results.  Those have never been identified any more closely than by
 giving a type OID.  So for any value that came from a function,
 you won't have a typmod, and you'd better be able to find out all
 you need to know just by inspecting the value itself.  Hence, length
 words.

Hm, so it could be stored on disk without the length header as long as the
length header is added to the in-memory representation? I don't think the type
system has hooks for reading and storing data to disk though.

 This is all pretty off-topic for pgsql-general, isn't it?

[moved to -hackers]

-- 
greg


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] optimizing constant quals within outer joins

2006-06-28 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 Phil Frost [EMAIL PROTECTED] writes:
  The planner in fact can move the function around without changing the
  output.
 
 Not when it's within the nullable side of an outer join --- moving a
 WHERE clause up out of that would make the difference between no row
 out, and a null-extended row out, which are certainly not the same.
 
 I'm not sure why it's not pulling up from the left side of the left join
 though.  That might be a bug.  What PG version is this exactly?

In fact it doesn't even pull it up out of a regular join. I looked into this
when it was first brought up on IRC and as near as I can tell it is trying to
do so and somehow just failing.


postgres=# create function foo(text) returns bool as 'select case when $1 = 
''foo'' then true else false end' language sql stable strict ;


postgres=# explain select 1 from a,a as b where foo('foo') ;
   QUERY PLAN
-
 Result  (cost=31.34..75332.74 rows=3763600 width=0)
   One-Time Filter: foo('foo'::text)
   -  Nested Loop  (cost=31.34..75332.74 rows=3763600 width=0)
 -  Seq Scan on a  (cost=0.00..29.40 rows=1940 width=0)
 -  Materialize  (cost=31.34..50.74 rows=1940 width=0)
   -  Seq Scan on a b  (cost=0.00..29.40 rows=1940 width=0)
(6 rows)


postgres=# explain select 1 from (select * from a where foo('foo')) as x, a;
   QUERY PLAN
-
 Nested Loop  (cost=31.34..25169.19 rows=1255180 width=0)
   -  Seq Scan on a  (cost=0.00..34.25 rows=647 width=0)
 Filter: foo('foo'::text)
   -  Materialize  (cost=31.34..50.74 rows=1940 width=0)
 -  Seq Scan on a  (cost=0.00..29.40 rows=1940 width=0)
(5 rows)


-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [GENERAL] UUID's as primary keys

2006-06-28 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 Hm, so it could be stored on disk without the length header as long as
 the length header is added to the in-memory representation? I don't
 think the type system has hooks for reading and storing data to disk
 though.

No, it doesn't, and we'd pay a nonzero price for allowing that.
Currently the executor doesn't have to care (much) about whether a
tuple is on-disk or in-memory --- the individual datums look the same
either way.  Allowing them to be different would force a lot of
format conversion steps that currently need not happen.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly