Re: [HACKERS] to_char and i18n

2005-12-22 Thread Karel Zak
On Wed, 2005-12-21 at 23:50 -0500, Tom Lane wrote:
 Manuel Sugawara <masm@fciencias.unam.mx> writes:
  Tom Lane [EMAIL PROTECTED] writes:
  I thought to_char already had i18n behavior.  What exactly are you
  thinking of changing?
 
  The modifiers that are suitable to localize. Month and day names come
  to mind, and maybe others; I'm not sure what the state of the code is,
  but I can say that, at least, the 'month' and 'day' modifiers do not
  behave in a localized way.

The names for months and days are hardcoded in the to_char() code, and
they are in English only.

 Can we spell the names differently but keep to the same field widths?

That's an important point. How does Oracle resolve this problem? Maybe we
can say (in the docs) that with non-English locales it works with
day/month names as in FM (fill) mode.

# select length( to_char(now(), 'Day') ) as normal,
         length( to_char(now(), 'FMDay') ) as fm;
 normal | fm
--------+----
      9 |  8

It means 'FM' uses a variable size for the Day/Month field -- without FM
the size is fixed at 9 chars.
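
For illustration (assuming an English locale, and noting that this mail's
date falls on a Thursday):

# select to_char(now(), 'Day') as normal, to_char(now(), 'FMDay') as fm;
  normal   |    fm
-----------+----------
 Thursday  | Thursday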

I think that for backward compatibility the locale-sensitive to_char()
should be implemented as a separate call, to_char(datetime, format,
locale), or we should add new modifiers to the current to_char(),
something like to_char(datetime, 'LCMonth'), or both.
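
To make the two alternatives concrete, usage might look roughly like this
(purely illustrative -- neither form exists today):

-- a new locale-aware modifier:
select to_char(now(), 'LCMonth YYYY');
-- a new three-argument form with an explicit locale:
select to_char(now(), 'Month YYYY', 'cs_CZ');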

I don't have any time to work on to_char(); I can only help review
patches.

Karel

-- 
Karel Zak [EMAIL PROTECTED]




Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Lukas Smith

Jim C. Nasby wrote:

On Wed, Dec 21, 2005 at 05:43:38PM -0500, Bruce Momjian wrote:


Rick Gigger wrote:


It seems to me like there are two classes of problems here:

1) Simply invalidating plans made with out of date statistics.
2) Using run-time collected data to update the plan to something more  
intelligent.


It also seems like #1 would be fairly straightforward and simple  
whereas #2 would be much more complex.  #1 would do me a world of  
good and probably other people as well.  Postgres's query planning  
has always been fine for me, or at least I have always been able to  
optimize my queries when I've got a representative data set to work  
with.  Query plan caching only gets me when the query plan is created  
before the statistics are present to create a good plan.


Just one user's 2 cents.


Agreed.  I just can't add #2 unless we get more agreement from the
group, because it has been a disputed issue in the past.



Well, how about this, since it's a prerequisite for #2 and would be
generally useful anyway:

Track normal resource consumption (ie: tuples read) for planned queries
and record parameter values that result in drastically different
resource consumption.

This would at least make it easy for admins to identify prepared queries
that have a highly variable execution cost.


Yeah, it seems such a log would be very helpful in its own right for 
DBAs, and also as a feedback loop to find possible issues in the query 
planner. And maybe one day this feedback loop can even be used directly 
by the server itself.


regards,
Lukas Smith



[HACKERS] PL/pgSQL proposal: using list of scalars in assign stmts, fore and fors stmts

2005-12-22 Thread Pavel Stehule

Hello

Currently, the EXECUTE INTO and SELECT INTO statements allow using a list
of scalars. FORe and FORs allow only a ROW or RECORD variable. I have
planned, and implemented, an enhancement of these statements:


for := FOR target IN {SELECT | EXECUTE} ... LOOP
target := {row|record|comma separated list of scalar vars}

assign := target2 ':=' expression
target2 := {row|record|variable|'ROW(' comma separated list of scalar vars ')'}


for example:
CREATE OR REPLACE FUNCTION test(OUT _rc integer, OUT _x varchar, OUT _y varchar)
RETURNS SETOF RECORD AS $$
DECLARE _r RECORD;
BEGIN
 _rc := 0;
 -- old style
 FOR _r IN SELECT generate_series AS x, generate_series + 1 AS y FROM
generate_series(1,4) LOOP
   _rc := _rc + 1; _x := _r.x; _y := _r.y;
   RETURN NEXT;
 END LOOP;
 -- new one
 FOR _x, _y IN SELECT generate_series, generate_series + 1 FROM
generate_series(1,4) LOOP
   _rc := _rc + 1;
   RETURN NEXT;
 END LOOP;
 -- new two
 FOR _r IN SELECT generate_series AS x, generate_series + 1 AS y FROM
generate_series(1,4) LOOP
   _rc := _rc + 1; ROW(_x,_y) := _r;
   RETURN NEXT;
 END LOOP;
 RETURN;
END; $$ LANGUAGE plpgsql;

any comments?
Regards
Pavel Stehule





Re: [HACKERS] Function call with offset and limit

2005-12-22 Thread REYNAUD Jean-Samuel
Hi

I've just tried it, and it works. So it's a good work-around.

Though, is it intended behaviour that the function is executed on
each row before the OFFSET is applied?


On Wednesday, 21 December 2005 at 13:41 -0600, Jim C. Nasby wrote:
 Have you tried
 
 SELECT *, test_func(idkeyword)
 FROM (SELECT * FROM tag OFFSET 5000 LIMIT 1) AS t;
 
 ?
 
 This should probably have been on -general, btw.
 
 On Wed, Dec 21, 2005 at 06:44:33PM +0100, REYNAUD Jean-Samuel wrote:
  Hi all,
  
  We need to find a solution for a strange problem. 
  We have a plpgsql FUNCTION which performs a heavy job (named
  test_func). 
  
  CREATE or replace function test_func(z int) returns integer as $$
  declare
  tst integer;
  begin
  --
  -- Large jobs with z
  --
  tst :=  nextval('test_truc');
  return tst;
  end;
  $$ LANGUAGE plpgsql;
  
  
  So I made this test:
  
  test=# select setval('test_truc',1);
   setval
  --------
        1
  (1 row)
  
  test=#  select currval('test_truc') ;
   currval
  ---------
         1
  (1 row)
  
  test=# select *,test_func(idkeyword) from tag offset 5000 limit 1;
   idkeyword | test_func
  -----------+-----------
        5001 |      5002
  (1 row)
  
  test=# select currval('test_truc') ;
   currval
  ---------
      5002
  (1 row)
  
  
  This demonstrates that the function is called 5001 times even though
  only one row is returned. The problem is that this heavy job is
  performed much, much more often than needed.
  
  But if I do:
  test=# select *,(select test_func(1)) from tag offset 5000 limit 1;
  my function is called only once.
  
  Is there any workaround?
  
  
  Thanks
  -- 
  REYNAUD Jean-Samuel [EMAIL PROTECTED]
  Elma
  
  
 
-- 
REYNAUD Jean-Samuel [EMAIL PROTECTED]
Elma




Re: [HACKERS] Function call with offset and limit

2005-12-22 Thread Martijn van Oosterhout
On Thu, Dec 22, 2005 at 10:52:58AM +0100, REYNAUD Jean-Samuel wrote:
 Hi
 
 I've just tried it, and it works. So it's a good work-around.
 
 Though, is it intended behaviour that the function is executed on
 each row before the OFFSET is applied?

Well, saying OFFSET 5000 pretty much means calculate the first 5000
rows and throw away the result. To calculate those it needs to execute
the function each time. What happens if the function has side-effects,
like in your case? What if you had a WHERE clause that depended on the
result of that function?

If the function has no side-effects, like say pow(), then the backend
could skip it, but that should be transparent to the user. SQL allows you
to specify what you want, and PostgreSQL is simply executing what you
wrote down...
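
(PostgreSQL does already let you declare that a function is free of side
effects, which is the information such skipping would need -- a minimal
sketch:

CREATE FUNCTION cheap_square(z int) RETURNS int AS
  'SELECT $1 * $1'
LANGUAGE sql IMMUTABLE;

The skipping itself is not done today, though.)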

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-22 Thread Simon Riggs
On Wed, 2005-12-21 at 19:07 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  While we scan, if we find two adjacent pages, both of which are less
  than (say) 40% full, we could re-join or unsplit those pages together.
 
 Curiously enough, this has been thought of before.  It is not as easy
 as you think, or it would have been done the first time around.
 nbtree/README explains why:
 
 : We consider deleting an entire page from the btree only when it's become
 : completely empty of items.  (Merging partly-full pages would allow better
 : space reuse, but it seems impractical to move existing data items left or
 : right to make this happen --- a scan moving in the opposite direction
 : might miss the items if so.  We could do it during VACUUM FULL, though.)

Sorry, I missed that. 

Seems impractical? Complex, yes. But I think it sounds a lot simpler
than the online REINDEX scheme... which is also the reason enhancing
VACUUM FULL doesn't appeal.

During the first VACUUM index scan, we can identify pages that are
candidates for reorganisation. Then during the second physical scan that
follows we can reorg pages in the same manner we delete them.

We identify a page during VACUUM for reorg if:
- it is < 20% full and we want to write this page
- the following page is < 20% full and we want to write this page
- it has a higher physical block number than the following page
That way we aren't super aggressive about reorg, but the cases we do
pick have a tendency to keep the index clustered. (Perhaps that could be
settable via an optional index parameter.) That definition also means we
have almost no additional I/O over what the VACUUM would have written
anyway.
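
If that ever became an index-level knob, usage might look something like
this (hypothetical syntax and option name -- nothing like it exists today):

-- purely illustrative; 'unsplit_threshold' is an invented option name:
CREATE INDEX tag_idx ON tag (idkeyword) WITH (unsplit_threshold = 0.20);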

Page deletion requires us to lock left, lock right, lock above. That is
exactly the same if we have identified the page for reorg, since once we
have locked left and right, we just move rows to one or both of those
other blocks, then perform the half-dead marking.

I know you've considered these things deeply, but my viewpoint is that
when what you have is already very good the only way forward consists of
considering seemingly foolish thoughts: backtracking.

Best Regards, Simon Riggs




Re: [HACKERS] Recovery from multi trouble

2005-12-22 Thread Simon Riggs
On Mon, 2005-12-19 at 17:17 +0900, OKADA Satoshi wrote:
 Tom Lane wrote:
 
 OKADA Satoshi [EMAIL PROTECTED] writes:
   
 
 The loss of log was simulated by deleting the latest xlog file. 
 
 
 
 What does that have to do with reality?  Postgres is very careful not to
 use an xlog file until it's been fully metadata-synced.  You might as
 well complain that PG doesn't recover after rm -rf / ...
   
 
 In this case (postmaster abnormal end, and log is lost), I understand
 that the database cannot recover normally.
 
 Though the database cannot recover normally, the postmaster does not
 output a clear message showing this situation. I think that is a problem.

IMHO the problem is the deletion of the xlog file, not the error
message.

If you *did* lose an xlog file, would you not expect the system to come
up anyway? You're saying that you'd want the system to stay down because
of this? Would you want the system to be less available in that
circumstance?

I guess you might want a new postmaster option: don't come up if you
are damaged. Would you really use that?

Overall, thank you for doing the durability testing. It is good to know
that you're doing that and taking the time to report any issues you see.

Best Regards, Simon Riggs




Re: [HACKERS] Recovery from multi trouble

2005-12-22 Thread Martijn van Oosterhout
On Thu, Dec 22, 2005 at 10:53:39AM +, Simon Riggs wrote:
 IMHO the problem is the deletion of the xlog file, not the error
 message.
 
 If you *did* lose an xlog file, would you not expect the system to come
 up anyway? You're saying that you'd want the system to stay down because
 of this? Would you want the system to be less available in that
 circumstance?

Well, that's what pg_resetxlog does. If you have an unclean shutdown
and you lose the xlog, you've possibly lost data. Should the postmaster
just come up and pretend nothing happened?

 I guess you might want a new postmaster option: don't come up if you
 are damaged. Would you really use that?

Well, we have zero_damaged_pages, which is off by default.
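
For reference, it is an ordinary (superuser-settable) configuration
parameter:

-- off by default; when on, a page with a damaged header is zeroed in
-- memory and read as empty instead of raising an error:
SET zero_damaged_pages = on;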

 Overall, thank you for doing the durability testing. It is good to know
 that you're doing that and taking the time to report any issues you see.

Having a system that just blithely continues in the face of possible
data loss doesn't seem very nice either. Sure, it's nice to know about
it but is it really something we can do something about? The admin
either restores from backup or runs pg_resetxlog, accepting the fact that
data will be lost. I don't think this is something postgres should be
doing on its own.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] replicating tsearch2 across versions of postgres

2005-12-22 Thread Dave Cramer

Is it possible to just not replicate the internal tsearch tables?

Dave
On 21-Dec-05, at 4:37 PM, Dave Cramer wrote:


Thanks, that might be easier than first thought.

Dave
On 21-Dec-05, at 2:04 PM, Tom Lane wrote:


Dave Cramer [EMAIL PROTECTED] writes:

Is it possible to add the old signatures back for backward
compatibility ? Something like a tsearch2-compat lib ?


The old signatures were security holes.  We are not going to put them
back.  I would suggest changing the functions on the 7.4 machine to the
new signatures --- see the 7.4.8 release notes.

regards, tom lane










Re: [HACKERS] to_char and i18n

2005-12-22 Thread Euler Taveira de Oliveira
--- Karel Zak [EMAIL PROTECTED] wrote:

I have a patch like this. But this was for 7.4.x. I have to take a look
at it.

 That's an important point. How does Oracle resolve this problem? Maybe
 we can say (in the docs) that with non-English locales it works with
 day/month names as in FM (fill) mode.
 
Yeah. We could make the new mode (TM?) ignore the FX mode and write a
note in the docs.

 I think that for backward compatibility the locale-sensitive to_char()
 should be implemented as a separate call, to_char(datetime, format,
 locale), or we should add new modifiers to the current to_char(),
 something like to_char(datetime, 'LCMonth'), or both.
 
I vote for another modifier (TM?). That's more flexible than another
function overload, because to_char() already implements modifiers.
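
For example, the proposed prefix might be used like this (purely
illustrative -- TM is not implemented yet):

select to_char(now(), 'TMMonth');  -- month name per the server locale
select to_char(now(), 'TMDay, DD. TMMonth YYYY');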

 I don't have any time to work on to_char(); I can only help review
 patches.
 
OK. I'll send a revised patch ASAP.


Euler Taveira de Oliveira
euler[at]yahoo_com_br












[HACKERS] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Simon Riggs
Having just optimized COPY to avoid writing WAL during the transaction
in which a table was first created, it seems worth considering whether
this should occur for INSERT, UPDATE and DELETE also.

It is fairly common to do data transformation using INSERT SELECTs and
UPDATEs. This is usually done with temporary tables however. (DELETE
would most efficiently be handled as an additional NOT clause on the
insert, so it is uncommonly used in this circumstance.)

However, CREATE TABLE AS SELECT (CTAS) does not allow inheritance, so a
new permanent partition has to be created using CREATE TABLE, followed
by an INSERT SELECT or COPY.
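
Concretely, the two paths look like this (table names illustrative):

-- CTAS: gets the new-table WAL bypass, but cannot create an inheritance
-- child:
CREATE TABLE sales_2005 AS SELECT * FROM staging WHERE year = 2005;

-- the partition case: the INSERT SELECT currently writes WAL as usual:
CREATE TABLE sales_2005 () INHERITS (sales);
INSERT INTO sales_2005 SELECT * FROM staging WHERE year = 2005;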

Is that sufficient reason to optimise INSERT SELECT and UPDATE also? Or
should I not bother? Or should I try to teach CTAS to use inheritance
(which sounds harder and has a few gotchas)?

Currently, CTAS optimization requires a heap_sync during ExecEndPlan. It
would be easy enough to extend this so that it also works for INSERT,
UPDATE and DELETE.

Best Regards, Simon Riggs




Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-22 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Sorry, I missed that. 

And you evidently still didn't understand it.  Locking both pages does
not fix the problem, because it doesn't guarantee that there's not a
concurrent indexscan in flight from one to the other.  If you move items
from one page to the other in the opposite direction from the way the
scan is going, then it will miss those items.  If we try to fix this by
making scans lock one page before releasing the previous, then we'll
create a bunch of deadlock cases.

regards, tom lane



Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-22 Thread Martijn van Oosterhout
On Thu, Dec 22, 2005 at 10:40:24AM -0500, Tom Lane wrote:
 And you evidently still didn't understand it.  Locking both pages does
 not fix the problem, because it doesn't guarantee that there's not a
 concurrent indexscan in flight from one to the other.  If you move items
 from one page to the other in the opposite direction from the way the
 scan is going, then it will miss those items.  If we try to fix this by
 making scans lock one page before releasing the previous, then we'll
 create a bunch of deadlock cases.

I've been thinking, maybe there's a lockless way of going about this?
Have some kind of index modification identifier that you set at the
beginning of the index scan. What you do is mark the tuples you want to
move with an IMI (I'm trying to avoid the word transaction here) and then
copy the tuples to the new page with IMI+1. Any currently running index
scan will notice the higher IMI and ignore them. When all old index
scans are done, you can remove the old ones.

It's sort of a mini-MVCC for indexes except you could probably just use
a few states: visible to all, visible to current scan, invisible to
current scan. Or use two bits to represent frozen, 1, 2 and 3. A plain
VACUUM could do the state transitions each time it moves through the
index.

The downsides are probably that it's a pain to make it work
concurrently and requires writing each index page at least twice. But
it's a thought...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] PL/pgSQL proposal: using list of scalars in assign stmts, fore and fors stmts

2005-12-22 Thread Tom Lane
Pavel Stehule [EMAIL PROTECTED] writes:
 for := FOR target IN {SELECT | EXECUTE} ... LOOP
 target := {row|record|comma separated list of scalar vars}

This part seems all right to me.

 assign := target2 ':=' expression
 target2 := {row|record|variable|'ROW(' comma separated list of scalar vars 
 ')'}

As I already said on -patches, I consider this a bad idea.  It's too
error prone (because there's no easy way of seeing what the field order
will be).  And it doesn't add anything that you can't do now.  I think
a series of var := rec.field assignments is a preferable way to do it.
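
That is, instead of ROW(_x,_y) := _r, spell it out using the variables
from the example upthread:

_x := _r.x;
_y := _r.y;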

regards, tom lane



Re: [HACKERS] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Currently, CTAS optimization requires a heap_sync during ExecEndPlan. It
 would be easy enough to extend this so that it also works for INSERT,
 UPDATE and DELETE.

If you tried to do it that way you'd break the system completely.  Not
all updates go through the executor.

I think it's a bad idea anyway; you'd be adding overhead to the lowest
level routines in order to support a feature that would be very seldom
used, at least in comparison to the number of times those routines are
executed.

regards, tom lane



Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Jim C. Nasby
On Thu, Dec 22, 2005 at 02:31:33PM +, Simon Riggs wrote:
 Having just optimized COPY to avoid writing WAL during the transaction
 in which a table was first created, it seems worth considering whether
 this should occur for INSERT, UPDATE and DELETE also.
 
 It is fairly common to do data transformation using INSERT SELECTs and
 UPDATEs. This is usually done with temporary tables however. (DELETE
 would most efficiently be handled as an additional NOT clause on the
 insert, so it is uncommonly used in this circumstance.)
 
 However, CREATE TABLE AS SELECT (CTAS) does not allow inheritance, so a
 new permanent partition has to be created using CREATE TABLE, followed
 by an INSERT SELECT or COPY.
 
 Is that sufficient reason to optimise INSERT SELECT and UPDATE also? Or
 should I not bother? Or should I try to teach CTAS to use inheritance
 (which sounds harder and has a few gotchas)?
 
 Currently, CTAS optimization requires a heap_sync during ExecEndPlan. It
 would be easy enough to extend this so that it also works for INSERT,
 UPDATE and DELETE.

Well, both UPDATE and DELETE seem like pretty odd use cases to me;
typically I'd do any needed data manipulation during the INSERT SELECT.
But it doesn't make sense to me to set this up for INSERT and ignore
UPDATE and DELETE.

I do think this needs to be something that is made either completely
transparent or must be specifically enabled. As described, I believe
this would break PITR, so users should have to specifically request that
behavior (and they should probably get a WARNING message, too).

Though, what's interesting is that theoretically it should be possible
to do this and still protect PITR, by logging the statements (but not
the actual data) to WAL. This isn't very practical with WAL (though it
might be worth looking at storing a compressed version of what's being
fed in to COPY), but in this case if we end up in a recovery situation
the data that the insert is pulling from should exist in the database in
the same state, so it should be possible to re-create the table. There's
still an issue of how to handle the pages from the new table that will
end up in WAL on subsequent transactions, since presumably they might be
identical, but someone might be able to come up with a clever solution
for that. In the meantime, breaking WAL recovery needs to be something
that users must specifically request, via something like UPDATE NOWAL.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Jim C. Nasby
BTW, this should also probably be moved over to -hackers...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-22 Thread Alvaro Herrera
Martijn van Oosterhout wrote:

 The downsides are probably that it's a pain to make it work
 concurrently and requires writing each index page at least twice. But
 it's a thought...

We already do something similar for page deletions.  Empty pages are not
deleted right away, but they are marked with BTP_DEAD, and then deleted
on a subsequent vacuum.  Or something like that, I don't remember the
exact details.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Jim C. Nasby
On Wed, Dec 21, 2005 at 11:00:31PM -0500, Bruce Momjian wrote:
  Track normal resource consumption (ie: tuples read) for planned queries
  and record parameter values that result in drastically different
  resource consumption.
  
  This would at least make it easy for admins to identify prepared queries
  that have a highly variable execution cost.
 
 We have that TODO already:
 
   * Log statements where the optimizer row estimates were dramatically
 different from the number of rows actually found?

Does the stored plan also save how many rows were expected? Otherwise
I'm not sure how that TODO covers it... If it does then please ignore my
ramblings below. :)

My idea has nothing to do with row estimates. It has to do with the
amount of work actually done to perform a query. Consider this example:

CREATE TABLE queue (status char NOT NULL, queue_item text NOT NULL);
CREATE INDEX queue__status ON queue (status);

Obviously, to process this you'll need a query like:
SELECT * FROM queue WHERE status='N' -- N for New;

Say you also occasionally need to see a list of items that have been
processed:
SELECT * FROM queue WHERE status='D' -- D for Done;

And let's say you need to keep done items around for 30 days.

Now, if both of these are done using a prepared statement, it's going to
look like:

SELECT * FROM queue WHERE status='?';

If the first one to run is the queue processing one, the planner will
probably choose the index. This means that when we're searching on 'N',
there will be a fairly small number of tuples read to execute the query,
but when searching for 'D' a very large number of tuples will be read.
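
In PREPARE/EXECUTE form (statement name illustrative), the trap is:

PREPARE get_items(char) AS SELECT * FROM queue WHERE status = $1;
EXECUTE get_items('N');  -- the cached plan suits this: few tuples read
EXECUTE get_items('D');  -- same cached plan: vastly more tuples read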

What I'm proposing is to keep track of the 'normal' number of tuples
read when executing a prepared query, and logging any queries that are
substantially different. So, if you normally have to read 50 tuples to
find all 'N' records, when the query looking for 'D' records comes along
and has to read 5000 tuples instead, we want to log that. Probably the
easiest way to accomplish this is to store a moving average of tuples
read with each prepared statement entry.
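
A sketch of that bookkeeping, with an assumed smoothing factor and
logging threshold (both numbers are hypothetical):

-- on each execution of a prepared statement:
avg_tuples := 0.9 * avg_tuples + 0.1 * tuples_this_run;  -- moving average
IF tuples_this_run > 10 * avg_tuples THEN
   -- log the statement and its parameter values
END IF;
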
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Simon Riggs
On Thu, 2005-12-22 at 12:12 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Currently, CTAS optimization requires a heap_sync during ExecEndPlan. It
  would be easy enough to extend this so that it also works for INSERT,
  UPDATE and DELETE.
 
 If you tried to do it that way you'd break the system completely.  Not
 all updates go through the executor.

Wow, didn't know that.

 I think it's a bad idea anyway; you'd be adding overhead to the lowest
 level routines in order to support a feature that would be very seldom
 used, at least in comparison to the number of times those routines are
 executed.

Agreed.

Maybe just INSERT SELECT then. That's easy enough to test for without
altering the main code path in the executor too much -- if anybody says
they want it?

Best Regards, Simon Riggs




Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-22 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 We already do something similar for page deletions.  Empty pages are not
 deleted right away, but they are marked with BTP_DEAD, and then deleted
 on a subsequent vacuum.  Or something like that, I don't remember the
 exact details.

Right, and the reason for that is exactly that there might be a
concurrent indexscan already in flight to the newly-dead page.
We must wait to recycle the page until we are certain no such scans
remain.

It doesn't matter whether a concurrent indexscan visits the dead
page or not, *because it's empty* and so there's nothing to miss.
So there's no race condition.  But if you try to move valid data
across pages then there is a race condition.

regards, tom lane



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Bruce Momjian

Oh, OK, so you are logging prepared queries where the plan generates a
significantly different number of rows from previous runs.  I am not
sure why that is better, or easier, than just invalidating the  cached
plan if the cardinality changes.

---

Jim C. Nasby wrote:
 On Wed, Dec 21, 2005 at 11:00:31PM -0500, Bruce Momjian wrote:
   Track normal resource consumption (ie: tuples read) for planned queries
   and record parameter values that result in drastically different
   resource consumption.
   
   This would at least make it easy for admins to identify prepared queries
   that have a highly variable execution cost.
  
  We have that TODO already:
  
  * Log statements where the optimizer row estimates were dramatically
different from the number of rows actually found?
 
 Does the stored plan also save how many rows were expected? Otherwise
 I'm not sure how that TODO covers it... If it does then please ignore my
 ramblings below. :)
 
 My idea has nothing to do with row estimates. It has to do with the
 amount of work actually done to perform a query. Consider this example:
 
 CREATE TABLE queue (status char NOT NULL, queue_item text NOT NULL);
 CREATE INDEX queue__status ON queue (status);
 
 Obviously, to process this you'll need a query like:
 SELECT * FROM queue WHERE status='N' -- N for New;
 
 Say you also occasionally need to see a list of items that have been
 processed:
 SELECT * FROM queue WHERE status='D' -- D for Done;
 
 And let's say you need to keep done items around for 30 days.
 
 Now, if both of these are done using a prepared statement, it's going to
 look like:
 
 SELECT * FROM queue WHERE status='?';
 
 If the first one to run is the queue processing one, the planner will
 probably choose the index. This means that when we're searching on 'N',
 there will be a fairly small number of tuples read to execute the query,
 but when searching for 'D' a very large number of tuples will be read.
 
 What I'm proposing is to keep track of the 'normal' number of tuples
 read when executing a prepared query, and logging any queries that are
 substantially different. So, if you normally have to read 50 tuples to
 find all 'N' records, when the query looking for 'D' records comes along
 and has to read 5000 tuples instead, we want to log that. Probably the
 easiest way to accomplish this is to store a moving average of tuples
 read with each prepared statement entry.
 -- 
 Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
 Pervasive Software      http://pervasive.com    work: 512-231-6117
 vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
 

-- 
  Bruce Momjian               |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and DELETE?

2005-12-22 Thread Martijn van Oosterhout
On Thu, Dec 22, 2005 at 12:37:51PM -0600, Jim C. Nasby wrote:
 I do think this needs to be something that is made either completely
 transparent or must be specifically enabled. As described, I believe
 this would break PITR, so users should have to specifically request that
 behavior (and they should probably get a WARNING message, too).

This reminds me of a friend who used MSSQL that had replication going
that broke every time you did a certain statement. It may have been
SELECT INTO [1]. His main problem was that the replication would
stop working silently. We need to be waving red flags if we broke
someone's backup procedure.

Considering "WAL bypass" is code for "breaks PITR", I think we really
need to make sure people realise that running such a command breaks
their backups/replication/whatever people are doing.

[1] 
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/trblsql/tr_reslsyserr_1_1r94.asp

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.




Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Lukas Smith

Jim C. Nasby wrote:


Now, if both of these are done using a prepared statement, it's going to
look like:

SELECT * FROM queue WHERE status='?';

If the first one to run is the queue processing one, the planner will
probably choose the index. This means that when we're searching on 'N',
there will be a fairly small number of tuples read to execute the query,
but when searching for 'D' a very large number of tuples will be read.


I do not know exactly how pg handles this internally; however, while 
skimming the Oracle tuning pocket guide I picked up for 2 euros, I 
noticed that it mentions that since Oracle 9i bound parameter values 
are evaluated before the execution plan is determined.


Maybe I am mixing up separate concepts (are bound variables and prepared 
statements different concepts?) here. I also do not really understand if 
that means that oracle does not store a query plan for a prepared query 
or if it just does some special handling in case it knows that a 
prepared statement column is known to have a highly varying selectivity 
per value.


regards,
Lukas



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Bruce Momjian
Lukas Smith wrote:
 Jim C. Nasby wrote:
 
  Now, if both of these are done using a prepared statement, it's going to
  look like:
  
  SELECT * FROM queue WHERE status='?';
  
  If the first one to run is the queue processing one, the planner will
  probably choose the index. This means that when we're searching on 'N',
  there will be a fairly small number of tuples read to execute the query,
  but when searching for 'D' a very large number of tuples will be read.
 
 I do not know exactly how pg handles this internally; however, while 
 skimming the Oracle tuning pocket guide I picked up for 2 euros, I 
 noticed that it mentions that since Oracle 9i bound parameter values 
 are evaluated before the execution plan is determined.
 
 Maybe I am mixing up separate concepts (are bound variables and prepared 
 statements different concepts?) here. I also do not really understand if 
 that means that oracle does not store a query plan for a prepared query 
 or if it just does some special handling in case it knows that a 
 prepared statement column is known to have a highly varying selectivity 
 per value.

What the Oracle manual means, I think, is that the planning of the query
is delayed until the _first_ EXECUTE, so it has some values to use in the
optimizer.  The problem is that later queries might use constants of
greatly different cardinality.

-- 
  Bruce Momjian               |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Lukas Smith

Bruce Momjian wrote:

Maybe I am mixing up separate concepts (are bound variables and prepared 
statements different concepts?) here. I also do not really understand if 
that means that oracle does not store a query plan for a prepared query 
or if it just does some special handling in case it knows that a 
prepared statement column is known to have a highly varying selectivity 
per value.


What the Oracle manual means, I think, is that the planning of the query
is delayed until the _first_ EXECUTE, so it has some values to use in the
optimizer.  The problem is that later queries might use constants of
greatly different cardinality.


ok .. which just goes to say: do not use prepared statements for a 
column with highly varying selectivity ..?


or is there a realistic shot at fixing this use case?

regards,
Lukas




Re: [HACKERS] PL/pgSQL proposal: using list of scalars in assign stmts, fore and fors stmts

2005-12-22 Thread David Fetter
On Thu, Dec 22, 2005 at 10:18:16AM +0100, Pavel Stehule wrote:
 Hello
 
 Currently, the EXECUTE INTO and SELECT INTO statements allow using a
 list of scalars. FORe and FORs allow only a ROW or RECORD variable. I
 have planned, and implemented, an enhancement of these statements:
 
 for := FOR target IN {SELECT | EXECUTE} ... LOOP
 target := {row|record|comma separated list of scalar vars}
 
 assign := target2 ':=' expression
 target2 := {row|record|variable|'ROW(' comma separated list of scalar 
 vars ')'}

How about:

target2 := {row|record|variable|'[ROW](' comma separated list of scalar vars 
')'}

instead, where the ROW is optional?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778

Remember to vote!



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Jim C. Nasby
On Thu, Dec 22, 2005 at 09:55:14PM +0100, Lukas Smith wrote:
 Bruce Momjian wrote:
 
 Maybe I am mixing up separate concepts (are bound variables and prepared 
 statements different concepts?) here. I also do not really understand if 
 that means that oracle does not store a query plan for a prepared query 
 or if it just does some special handling in case it knows that a 
 prepared statement column is known to have a highly varying selectivity 
 per value.
 
 What the Oracle manual means, I think, is that the planning of the query
 is delayed until the _first_ EXECUTE, so it has some values to use in the
 optimizer.  The problem is that later queries might use constants of
 greatly different cardinality.
 
 ok .. which just goes to say: do not use prepared statements for a 
 column with highly varying selectivity ..?
 
 or is there a realistic shot at fixing this use case?

FWIW, I believe that 10g has some brains in this regard, where it can
detect if it should store multiple plans for one prepared statement.
This is critical for them, because their parser/planner is much harder
on the system than ours is.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Bruce Momjian
Lukas Smith wrote:
 Bruce Momjian wrote:
 
  Maybe I am mixing up separate concepts (are bound variables and prepared 
  statements different concepts?) here. I also do not really understand if 
  that means that oracle does not store a query plan for a prepared query 
  or if it just does some special handling in case it knows that a 
  prepared statement column is known to have a highly varying selectivity 
  per value.
  
  What the Oralce manual means I think is that the plan of the query is
  delayed until the _first_ EXECUTE, so it has some values to use in the
  optimizer.  The problem is that later queries might use constants of
  greatly different cardinality.
 
 ok .. which just goes to tell to not use prepared statements for a 
 column with highly varying selectivity ..?
 
 or is there a realistic shot at fixing this use case?

It is an issue for all databases.  We have a TODO about it:

* Flush cached query plans when the dependent objects change,
  when the cardinality of parameters changes dramatically, or
  when new ANALYZE statistics are available


-- 
  Bruce Momjian               |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Lukas Smith

Bruce Momjian wrote:


It is an issue for all databases.  We have a TODO about it:

* Flush cached query plans when the dependent objects change,
  when the cardinality of parameters changes dramatically, or
  when new ANALYZE statistics are available


Ok, just so I understand this correctly. In the mentioned case the 
cardinality does not really change with regard to the table stats; it's 
just that I happen to use a value that has a different selectivity and 
therefore I may need a different plan. So I do not really see how this 
use case is handled with the above TODO, nor do I really see how it's 
handled with what Jim suggested earlier. The fact of the matter is that 
for this use case you need different query plans for the same 
prepared statement.


regards,
Lukas



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Jim C. Nasby
Well, not just rows; total tuples, both base heap and index. ISTM that
would be a better metric than just plain rows read out of base or rows
returned.

Depending on how far down this road we want to go, this would allow for
detecting what parameter values require different query plans, and then
using different query plans for different sets of values. Simply
invalidating the cached plan means you could potentially end up needing
to re-plan very frequently. But given the current speed of our
optimizer, it's probably not worth going to this extent.

Another concern I have is: is cardinality the only metric we need to
look at when deciding to re-plan or are there others?

In either case, my guess is that tracking the info needed to make this
idea happen is probably much easier than doing automatic plan
invalidation based on cardinality, so it would be a useful interim step.
But if we could actually get cardinality invalidation into 8.2, I'd say
put the effort into that...

On Thu, Dec 22, 2005 at 03:14:09PM -0500, Bruce Momjian wrote:
 
 Oh, OK, so you are logging prepared queries where the plan generates a
 significantly different number of rows from previous runs.  I am not
 sure why that is better, or easier, than just invalidating the  cached
 plan if the cardinality changes.
 
 ---
 
 Jim C. Nasby wrote:
  On Wed, Dec 21, 2005 at 11:00:31PM -0500, Bruce Momjian wrote:
Track normal resource consumption (ie: tuples read) for planned queries
and record parameter values that result in drastically different
resource consumption.

This would at least make it easy for admins to identify prepared queries
that have a highly variable execution cost.
   
   We have that TODO already:
   
 * Log statements where the optimizer row estimates were dramatically
   different from the number of rows actually found?
  
  Does the stored plan also save how many rows were expected? Otherwise
  I'm not sure how that TODO covers it... If it does then please ignore my
  ramblings below. :)
  
  My idea has nothing to do with row estimates. It has to do with the
  amount of work actually done to perform a query. Consider this example:
  
  CREATE TABLE queue (status char NOT NULL, queue_item text NOT NULL);
  CREATE INDEX queue__status ON queue (status);
  
  Obviously, to process this you'll need a query like:
  SELECT * FROM queue WHERE status='N' -- N for New;
  
  Say you also occasionally need to see a list of items that have been
  processed:
  SELECT * FROM queue WHERE status='D' -- D for Done;
  
  And let's say you need to keep done items around for 30 days.
  
  Now, if both of these are done using a prepared statement, it's going to
  look like:
  
  SELECT * FROM queue WHERE status='?';
  
  If the first one to run is the queue processing one, the planner will
  probably choose the index. This means that when we're searching on 'N',
  there will be a fairly small number of tuples read to execute the query,
  but when searching for 'D' a very large number of tuples will be read.
  
  What I'm proposing is to keep track of the 'normal' number of tuples
  read when executing a prepared query, and logging any queries that are
  substantially different. So, if you normally have to read 50 tuples to
  find all 'N' records, when the query looking for 'D' records comes along
  and has to read 5000 tuples instead, we want to log that. Probably the
  easiest way to accomplish this is to store a moving average of tuples
  read with each prepared statement entry.
  -- 
  Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
  Pervasive Software      http://pervasive.com    work: 512-231-6117
  vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
  
 
 -- 
   Bruce Momjian               |  http://candle.pha.pa.us
   pgman@candle.pha.pa.us   |  (610) 359-1001
   +  If your life is a hard drive, |  13 Roberts Road
   +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
 
 

-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Jim C. Nasby
On Thu, Dec 22, 2005 at 10:14:15PM +0100, Lukas Smith wrote:
 Ok, just so I understand this correctly. In the mentioned case the 
 cardinality does not really change with regard to the table stats; it's 
 just that I happen to use a value that has a different selectivity and 
 therefore I may need a different plan. So I do not really see how this 
 use case is handled with the above TODO, nor do I really see how it's 
 handled with what Jim suggested earlier. The fact of the matter is that 
 for this use case you need different query plans for the same 
 prepared statement.

What I mentioned would allow for identifying query plans that this is
happening on. Doing something about it would be the next step after
that.

Can anyone think of something other than selectivity that would make a
stored query plan go bad based solely on the parameters being fed into it?
(In other words ignore the obvious cases of bad statistics or a DDL
change).
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Bruce Momjian

We need invalidation anyway, so I don't see why an intermediate step
makes sense.

---

Jim C. Nasby wrote:
 Well, not just rows; total tuples, both base heap and index. ISTM that
 would be a better metric than just plain rows read out of base or rows
 returned.
 
 Depending on how far down this road we want to go, this would allow for
 detecting what parameter values require different query plans, and then
 using different query plans for different sets of values. Simply
 invalidating the cached plan means you could potentially end up needing
 to re-plan very frequently. But given the current speed of our
 optimizer, it's probably not worth going to this extent.
 
 Another concern I have is: is cardinality the only metric we need to
 look at when deciding to re-plan or are there others?
 
 In either case, my guess is that tracking the info needed to make this
 idea happen is probably much easier than doing automatic plan
 invalidation based on cardinality, so it would be a useful interim step.
 But if we could actually get cardinality invalidation into 8.2, I'd say
 put the effort into that...
 
 On Thu, Dec 22, 2005 at 03:14:09PM -0500, Bruce Momjian wrote:
  
  Oh, OK, so you are logging prepared queries where the plan generates a
  significantly different number of rows from previous runs.  I am not
  sure why that is better, or easier, than just invalidating the  cached
  plan if the cardinality changes.
  
  ---
  
  Jim C. Nasby wrote:
   On Wed, Dec 21, 2005 at 11:00:31PM -0500, Bruce Momjian wrote:
 Track normal resource consumption (ie: tuples read) for planned 
 queries
 and record parameter values that result in drastically different
 resource consumption.
 
 This would at least make it easy for admins to identify prepared 
 queries
 that have a highly variable execution cost.

We have that TODO already:

* Log statements where the optimizer row estimates were 
dramatically
  different from the number of rows actually found?
   
   Does the stored plan also save how many rows were expected? Otherwise
   I'm not sure how that TODO covers it... If it does then please ignore my
   ramblings below. :)
   
   My idea has nothing to do with row estimates. It has to do with the
   amount of work actually done to perform a query. Consider this example:
   
   CREATE TABLE queue (status char NOT NULL, queue_item text NOT NULL);
   CREATE INDEX queue__status ON queue (status);
   
   Obviously, to process this you'll need a query like:
   SELECT * FROM queue WHERE status='N' -- N for New;
   
   Say you also occasionally need to see a list of items that have been
   processed:
   SELECT * FROM queue WHERE status='D' -- D for Done;
   
   And let's say you need to keep done items around for 30 days.
   
   Now, if both of these are done using a prepared statement, it's going to
   look like:
   
   SELECT * FROM queue WHERE status='?';
   
   If the first one to run is the queue processing one, the planner will
   probably choose the index. This means that when we're searching on 'N',
   there will be a fairly small number of tuples read to execute the query,
   but when searching for 'D' a very large number of tuples will be read.
   
   What I'm proposing is to keep track of the 'normal' number of tuples
   read when executing a prepared query, and logging any queries that are
   substantially different. So, if you normally have to read 50 tuples to
   find all 'N' records, when the query looking for 'D' records comes along
   and has to read 5000 tuples instead, we want to log that. Probably the
   easiest way to accomplish this is to store a moving average of tuples
   read with each prepared statement entry.
   -- 
   Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
   Pervasive Software      http://pervasive.com    work: 512-231-6117
   vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
   
  
  -- 
  Bruce Momjian               |  http://candle.pha.pa.us
pgman@candle.pha.pa.us   |  (610) 359-1001
+  If your life is a hard drive, |  13 Roberts Road
+  Christ can be your backup.|  Newtown Square, Pennsylvania 19073
  
 
 -- 
 Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
 Pervasive Software      http://pervasive.com    work: 512-231-6117
 vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461
 

Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Lukas Kahwe Smith

Bruce Momjian wrote:


Right, if the cardinality changes, you realize this before execution and
optimize/save the plan again.  A further optimization would be to save
_multiple_ plans for a single prepared plan based on constants and
choose one or the other, but that is beyond what we are willing to
consider at this stage, I think.


ok .. so you store the cardinality that was used when generating the
original plan. On the next execution you look up the cardinality again
and compare it; if it's off too much, you replan. However, this could in
extreme cases mean that you replan on every execution, thereby
killing off the entire advantage of storing the plan. But that's the
absolute worst-case scenario.

regards,
Lukas

PS: Bruce's original email was only sent to me directly ..







Re: [HACKERS] Function call with offset and limit

2005-12-22 Thread Jim C. Nasby
On Thu, Dec 22, 2005 at 11:18:22AM +0100, Martijn van Oosterhout wrote:
 On Thu, Dec 22, 2005 at 10:52:58AM +0100, REYNAUD Jean-Samuel wrote:
  Hi
  
  I've just tried it, and it works. So it's a good work-around.
  
  Though, is it intended behaviour that the function is executed on
  each row before the OFFSET is applied?
 
 Well, saying OFFSET 5000 pretty much means calculate the first 5000
 rows and throw away the result. To calculate those it needs to execute
 the function each time. What happens if the function has side-effects,
 like in your case? What if you had a WHERE clause that depended on the
 result of that function?
 
 If the function has no side-effects, like say pow(), then the backend
 could skip it, but that should be transparent to the user. SQL allows you
 to specify what you want, and PostgreSQL is simply executing what you
 wrote down...

Well, it would be a good optimization to make: if the function is
immutable and isn't otherwise referenced (ie: by WHERE or ORDER BY),
there's no reason I can think of to execute it as you read through the
rows. Might be able to do this with STABLE functions as well.

TODO?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Re: Which qsort is used

2005-12-22 Thread Manfred Koizar
On Thu, 22 Dec 2005 08:01:00 +0100, Martijn van Oosterhout
<kleptog@svana.org> wrote:
But where are you including the cost to check how many cells are
already sorted? That would be O(H), right?

Yes.  I didn't mention it, because H << N.

 This is where we come back
to the issue that comparisons in PostgreSQL are expensive.

So we agree that we should try to reduce the number of comparisons.
How many comparisons does it take to sort 10^5 items?  1.5 million?

Hmm, what are the chances you have 10^5 unordered items to sort and
that the first 8% will already be in order. ISTM that that probability
will be close enough to zero to not matter...

If the items are totally unordered, the check is so cheap you won't
even notice.  OTOH in Tom's example ...

|What I think is much more probable in the Postgres environment
|is almost-but-not-quite-ordered inputs --- eg, a table that was
|perfectly ordered by key when filled, but some of the tuples have since
|been moved by UPDATEs.

... I'd not be surprised if H is 90% of N.
Servus
 Manfred

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Unsplitting btree index leaf pages

2005-12-22 Thread Simon Riggs
On Thu, 2005-12-22 at 10:40 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Sorry, I missed that. 
 
 And you evidently still didn't understand it.  Locking both pages does
 not fix the problem, because it doesn't guarantee that there's not a
 concurrent indexscan in flight from one to the other.  If you move items
 from one page to the other in the opposite direction from the way the
 scan is going, then it will miss those items.  If we try to fix this by
 making scans lock one page before releasing the previous, then we'll
 create a bunch of deadlock cases.

Thank you for explaining things; I'm sorry you had to do it twice.

I'm ever optimistic, so I try to resist the temptation to stay quiet in
case I make a mistake.

Best Regards, Simon Riggs



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Simon Riggs
On Thu, 2005-12-22 at 21:18 +0100, Martijn van Oosterhout wrote:
 Considering "WAL bypass" is code for "breaks PITR"

No it isn't. All of the WAL bypass logic does *not* operate when PITR is
active. The WAL bypass logic is aimed at Data Warehouses, which
typically never operate in PITR mode for performance reasons; however,
the choice is yours.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Re: Which qsort is used

2005-12-22 Thread Dann Corbit
An interesting article on sorting and comparison count:
http://www.acm.org/jea/ARTICLES/Vol7Nbr5.pdf

Here is the article, the code, and an implementation that I have been
toying with:
http://cap.connx.com/chess-engines/new-approach/algos.zip

Algorithm quickheap is especially interesting because it does not
require much additional space (just an array of integers of size up to
log(element_count)), and in addition it has very few data movements.

 -Original Message-
 From: Manfred Koizar [mailto:[EMAIL PROTECTED]
 Sent: Thursday, December 22, 2005 1:59 PM
 To: Martijn van Oosterhout
 Cc: Tom Lane; Dann Corbit; Qingqing Zhou; Bruce Momjian; Luke Lonergan;
 Neil Conway; pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Re: Which qsort is used
 
 On Thu, 22 Dec 2005 08:01:00 +0100, Martijn van Oosterhout
 kleptog@svana.org wrote:
 But where are you including the cost to check how many cells are
 already sorted? That would be O(H), right?
 
 Yes.  I didn't mention it, because H < N.
 
  This is where we come back
 to the issue that comparisons in PostgreSQL are expensive.
 
 So we agree that we should try to reduce the number of comparisons.
 How many comparisons does it take to sort 10^5 items?  1.5 million?
 
 Hmm, what are the chances you have 10^5 unordered items to sort and
 that the first 8% will already be in order. ISTM that that probability
 will be close enough to zero to not matter...
 
 If the items are totally unordered, the check is so cheap you won't
 even notice.  OTOH in Tom's example ...
 
 |What I think is much more probable in the Postgres environment
 |is almost-but-not-quite-ordered inputs --- eg, a table that was
 |perfectly ordered by key when filled, but some of the tuples have since
 |been moved by UPDATEs.
 
 ... I'd not be surprised if H is 90% of N.
 Servus
  Manfred

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Stephen Frost
* Simon Riggs ([EMAIL PROTECTED]) wrote:
 On Thu, 2005-12-22 at 21:18 +0100, Martijn van Oosterhout wrote:
  Considering "WAL bypass" is code for "breaks PITR"
 
 No it isn't. All of the WAL bypass logic does *not* operate when PITR is
 active. The WAL bypass logic is aimed at Data Warehouses, which
 typically never operate in PITR mode for performance reasons, however
 the choice is yours.

Eh?  PITR mode is bad for performance?  Maybe I missed something but I
wouldn't have thought PITR would degrade regular performance all that
badly.  So long as it doesn't take 15 minutes or some such to move the
WAL to somewhere else (and I'm not sure that'd even slow things down..).
For a Data Warehouse, have you got a better way of doing backups such
that you don't lose at minimum most of a day's work?  I'm not exactly a
big fan of doing a pg_dump every night either, given that the database is
360GB.  Much nicer to take a weekly dump of the database and then do
PITR for a week or two before taking another dump of the db.

I like the idea of making COPY go faster, but please don't break my
backup system while you're at it.  I'm honestly kind of nervous about
what you mean by checking if PITR is active -- how is that done, exactly?
Check if you have a script set to rotate the logs elsewhere?  Or is it
checking if you're in the taking-a-full-database-backup stage?  Or what?
What's the performance decrease when using PITR, and what's it from?  Is
it just that COPY isn't as fast?  Honestly, I could live with COPY being
not as fast as it could be if my backups work. :)

Sorry for sounding concerned but, well, backups are very important and
so is performance and I'm afraid either I've not read all the
documentation about the issues being discussed here or there isn't
enough out there to make sense of it all yet. :)

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Trent Shipley
On Thursday 2005-12-22 14:28, Lukas Kahwe Smith wrote:
 Bruce Momjian wrote:
  Right, if the cardinality changes, you realize this before execution and
  optimize/save the plan again.  A further optimization would be to save
  _multiple_ plans for a single prepared plan based on constants and
  choose one or the other, but that is beyond what we are willing to
  consider at this stage, I think.

 ok .. so you store the cardinality that was used when generating the
 original plan. on the next execution you look up the cardinality again
 and compare it; if it's off by too much, you replan. however, in
 extreme cases this could mean that you replan on every execution and
 thereby kill off the entire advantage of storing the plan. but that's
 the absolute worst-case scenario.

 regards,
 Lukas

 PS: bruce's original email was only sent to me directly ..

So you have a parameterized query (one parameter, for simplicity of
argument); as the parameter changes, cardinality changes dramatically.

It seems to me that in this case better than replanning is building a data 
structure that associates different parameter values with appropriate plans.  
The plans can be reused until, as would be the case with a no-parameter
query, a parameter-specific plan should be flushed (or the entire family
of plans can be flushed).

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Automatic function replanning

2005-12-22 Thread Bruce Momjian
Trent Shipley wrote:
 On Thursday 2005-12-22 14:28, Lukas Kahwe Smith wrote:
  Bruce Momjian wrote:
   Right, if the cardinality changes, you realize this before execution and
   optimize/save the plan again.  A further optimization would be to save
   _multiple_ plans for a single prepared plan based on constants and
   choose one or the other, but that is beyond what we are willing to
   consider at this stage, I think.
 
  ok .. so you store the cardinality that was used when generating the
  original plan. on the next execution you look up the cardinality again
  and compare it; if it's off by too much, you replan. however, in
  extreme cases this could mean that you replan on every execution and
  thereby kill off the entire advantage of storing the plan. but that's
  the absolute worst-case scenario.
 
  regards,
  Lukas
 
  PS: bruce's original email was only sent to me directly ..
 
 So you have a parameterized query (one parameter, for simplicity of
 argument); as the parameter changes, cardinality changes dramatically.
 
 It seems to me that in this case better than replanning is building a data
 structure that associates different parameter values with appropriate plans.
 The plans can be reused until, as would be the case with a no-parameter
 query, a parameter-specific plan should be flushed (or the entire family
 of plans can be flushed).

TODO updated:

* Flush cached query plans when the dependent objects change,
  when the cardinality of parameters changes dramatically, or
  when new ANALYZE statistics are available

  A more complex solution would be to save multiple plans for different
  cardinality and use the appropriate plan based on the EXECUTE values.
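
A minimal sketch of the problem the TODO describes (table, column, and
parameter values are hypothetical):

    PREPARE q(int) AS SELECT * FROM orders WHERE customer_id = $1;
    EXECUTE q(42);  -- matches a handful of rows: an index scan is ideal
    EXECUTE q(7);   -- matches half the table: a seq scan would win
    -- today the plan chosen at PREPARE time is reused for both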

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] [BUGS] Solaris cc compiler on amd: PostgreSQL does not

2005-12-22 Thread Bruce Momjian
Jim C. Nasby wrote:
 On Sat, Dec 17, 2005 at 04:59:45PM -0500, Bruce Momjian wrote:
  Tom Lane wrote:
   Bruce Momjian pgman@candle.pha.pa.us writes:
Looking before that, the 64-bit registers are now prefixed with 'r'
instead of 'e', so what I did was to convert all the long/l assembler
instructions to quad/64-bit/q, and rename the registers to use 64-bit
versions. I also modified the function alignment from 4 to 8, patch
attached.  Please give it a try and report back any error lines.
   
   Surely that breaks it for the i386 case --- don't we need a separate
   solaris_x86_64.s source file?
  
  Yes, it is only for him to test.  But his email bounced back so it is
  possible he isn't going to be back.  :-(
 
 Sorry, I don't remember the history of this thread, but I do have access
 to solaris on an opteron if something needs to be tested.

If you set up Solaris in 64-bit mode, and are using the Sun compiler,
please test the patch I posted and let me know if it compiles:

http://archives.postgresql.org/pgsql-patches/2005-12/msg00275.php

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Oracle PL/SQL Anonymous block equivalent in postgres

2005-12-22 Thread Ashok Agrawal
Hi,

Here are the requirements:

1. The Development & Support team can ONLY have READ Only ACCESS
   to the PRODUCTION Database. They will NOT have access to create
   stored procedures & functions in PRODUCTION on the fly.

2. During application support, we need to write scripts which
   use procedural language (IF ELSE, AND, OR, Cursor, Loop etc)
   to produce results or identify/fix issues. The number of such
   requests runs into the hundreds at quarter end.

Currently on Oracle, as long as you have sqlplus read only access,
you can write a PL/SQL block and get the work done.

Since postgres doesn't support procedural language except in
stored objects like procedures/functions, how do I achieve this in
postgres without using stored objects? If this is not possible,
does it make sense to add this requirement to the TO-DO list?

Please advise.

Thanks
Ashok



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Oracle PL/SQL Anonymous block equivalent in postgres

2005-12-22 Thread Bruce Momjian
Ashok Agrawal wrote:
 Hi,
 
 Here are the requirements:
 
 1. The Development & Support team can ONLY have READ Only ACCESS
    to the PRODUCTION Database. They will NOT have access to create
    stored procedures & functions in PRODUCTION on the fly.
 
 2. During application support, we need to write scripts which
    use procedural language (IF ELSE, AND, OR, Cursor, Loop etc)
    to produce results or identify/fix issues. The number of such
    requests runs into the hundreds at quarter end.
 
 Currently on Oracle, as long as you have sqlplus read only access,
 you can write a PL/SQL block and get the work done.
 
 Since postgres doesn't support procedural language except in
 stored objects like procedures/functions, how do I achieve this in
 postgres without using stored objects? If this is not possible,
 does it make sense to add this requirement to the TO-DO list?

So you want to write procedural code on the client.  psql works but
doesn't have IF and loop constructs.  pgbash allows this, though you are
writing shell scripts.  Is that OK?  Here are some examples:

http://www.psn.co.jp/PostgreSQL/pgbash/example-e.html

You can do things in perl too, if you want, and tcl, and almost any
other open source language.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PL/pgSQL proposal: using list of scalars in assign stmts, fore and fors stmts

2005-12-22 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 How about:
 target2 := {row|record|variable|'[ROW](' comma separated list of scalar 
 vars ')'}
 instead, where the ROW is optional?

If we're going to do this at all (which I'm still agin), I think the ROW
keyword is important to minimize ambiguity.  If you are allowed to start
a statement with just (x, ... then there will be way too many
situations where the parser gets confused by slightly bad input,
resulting in way-off-base syntax error reports.  Or worse, no syntax
error, but a function that does something else than you expected.

I know that ROW is optional in the bit of SQL syntax that this proposal
is based on, but that's only because the SQL spec says we have to, not
because it's a good idea.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Oracle PL/SQL Anonymous block equivalent in postgres

2005-12-22 Thread Hannu Krosing
On Thu, 2005-12-22 at 18:19, Bruce Momjian wrote:
 Ashok Agrawal wrote:
  Hi,
  
  Here are the requirements:
  
  1. The Development & Support team can ONLY have READ Only ACCESS
     to the PRODUCTION Database. They will NOT have access to create
     stored procedures & functions in PRODUCTION on the fly.
  
  2. During application support, we need to write scripts which
     use procedural language (IF ELSE, AND, OR, Cursor, Loop etc)
     to produce results or identify/fix issues. The number of such
     requests runs into the hundreds at quarter end.
  
  Currently on Oracle, as long as you have sqlplus read only access,
  you can write a PL/SQL block and get the work done.
  
  Since postgres doesn't support procedural language except in
  stored objects like procedures/functions, how do I achieve this in
  postgres without using stored objects? If this is not possible,
  does it make sense to add this requirement to the TO-DO list?
 
 So you want to write procedural code on the client.  

I guess he'd rather have don't-save-but-execute-immediately
pl/pgsql code.

could this perhaps be solved by having temporary functions, similar to
postgresql's temporary tables ?
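
For illustration, a sketch of the wrap-run-drop pattern such a feature
would streamline (table, column, and function names are made up; this
works today only where the user has a schema they can write into):

    CREATE FUNCTION tmp_check() RETURNS void AS $$
    DECLARE
        n integer;
    BEGIN
        SELECT count(*) INTO n FROM orders WHERE status = 'stuck';
        IF n > 0 THEN
            RAISE NOTICE '% stuck orders found', n;
        END IF;
    END;
    $$ LANGUAGE plpgsql;

    SELECT tmp_check();
    DROP FUNCTION tmp_check();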

--
Hannu


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] what is the smallest working page size for postgresql

2005-12-22 Thread Hannu Krosing
Hi

Could anybody tell me what the smallest working page size for
postgresql is?

I have a table where access is highly random over a huge table, usually
fetching only one small tuple from each page. One way to get more
performance could be using a smaller page size, so the per-tuple read
overhead would be smaller.

Would 4k pages work? What about 2k and 1k? 512 bytes?

What would it take, to make only heap pages small and keep index pages
larger ? Probably at least per-tablespace or per-pagesize split shared
buffer space and changes in caching algorithms ?

Has anyone tested whether 8k is big enough to trigger (in my case
unnecessary) read-ahead on disks/controllers/devices/filesystems?

---
Hannu



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Oracle PL/SQL Anonymous block equivalent in postgres

2005-12-22 Thread Bruce Momjian
Hannu Krosing wrote:
 On Thu, 2005-12-22 at 18:19, Bruce Momjian wrote:
  Ashok Agrawal wrote:
   Hi,
   
   Here are the requirements:
   
   1. The Development & Support team can ONLY have READ Only ACCESS
      to the PRODUCTION Database. They will NOT have access to create
      stored procedures & functions in PRODUCTION on the fly.
   
   2. During application support, we need to write scripts which
      use procedural language (IF ELSE, AND, OR, Cursor, Loop etc)
      to produce results or identify/fix issues. The number of such
      requests runs into the hundreds at quarter end.
   
   Currently on Oracle, as long as you have sqlplus read only access,
   you can write a PL/SQL block and get the work done.
   
   Since postgres doesn't support procedural language except in
   stored objects like procedures/functions, how do I achieve this in
   postgres without using stored objects? If this is not possible,
   does it make sense to add this requirement to the TO-DO list?
  
  So you want to write procedural code on the client.  
 
 I guess he'd rather have don't-save-but-execute-immediately
 pl/pgsql code.
 
 could this perhaps be solved by having temporary functions, similar to
 postgresql's temporary tables ?

I am thinking they want the ability to sit at a prompt and type stuff.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [pgadmin-hackers] Client-side password encryption

2005-12-22 Thread Andrew Dunstan



Tom Lane wrote:


Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 

So it appears that pg_md5_encrypt is not officially exported from libpq.  
Does anyone see a problem with adding it to the export list and the 
header file?
 



 

Is it different to normal md5?  How is this helpful to the phpPgAdmin 
project?
   



It would be better to export an API that is (a) less random (why one
input null-terminated and the other not?) and (b) less tightly tied
to MD5 --- the fact that the caller knows how long the result must be
is the main problem here.

Something like
char *pg_gen_encrypted_passwd(const char *passwd, const char *user)
with malloc'd result (or NULL on failure) seems more future-proof.


 



Where are we on this? In general I agree with Tom, but I have no time to 
do the work. Unless someone has an immediate implementation, I suggest 
that pro tem we add pg_md5_encrypt to src/interfaces/libpq/exports.txt, 
which is the minimum needed to unbreak Windows builds, while this gets 
sorted out properly.
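
For reference, what pg_md5_encrypt produces can be sketched at the SQL
level (user name and password here are made up): the stored form is the
string 'md5' followed by the hex md5 of the password concatenated with
the user name:

    SELECT 'md5' || md5('secret' || 'andrew');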


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Qingqing Zhou

Simon Riggs [EMAIL PROTECTED] wrote

 No it isn't. All of the WAL bypass logic does *not* operate when PITR is
 active. The WAL bypass logic is aimed at Data Warehouses, which
 typically never operate in PITR mode for performance reasons, however
 the choice is yours.


To make things simpler, is it possible to add a GUC to let the user
disable *all* the xlogs?

Regards,
Qingqing 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [pgadmin-hackers] Client-side password encryption

2005-12-22 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Where are we on this? In general I agree with Tom, but I have no time to 
 do the work. Unless someone has an immediate implementation, I suggest 
 that pro tem we add pg_md5_encrypt to src/interfaces/libpq/exports.txt, 
 which is the minimum needed to unbreak Windows builds, while this gets 
 sorted out properly.

I had forgotten that the Windows build is broken.  I'll see what I can
do with throwing together the cleaner-API function.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] PL/pgSQL proposal: using list of scalars in assign

2005-12-22 Thread Andrew Dunstan



Tom Lane wrote:


David Fetter [EMAIL PROTECTED] writes:
 


How about:
target2 := {row|record|variable|'[ROW](' comma separated list of scalar vars 
')'}
instead, where the ROW is optional?
   



If we're going to do this at all (which I'm still agin), I think the ROW
keyword is important to minimize ambiguity.  If you are allowed to start
a statement with just (x, ... then there will be way too many
situations where the parser gets confused by slightly bad input,
resulting in way-off-base syntax error reports.  Or worse, no syntax
error, but a function that does something other than what you expected.

I know that ROW is optional in the bit of SQL syntax that this proposal
is based on, but that's only because the SQL spec says we have to, not
because it's a good idea.


 



I see no virtue in this either. It strikes me as just more syntactic 
sugar, and unless I am misreading or out of date it would be another 
incompatibility with Oracle. I don't mind doing that, but I think it 
should be for a better reason than that it accords with someone's taste 
in syntactic style. I'd be somewhat more persuaded if Oracle did this. I 
also agree with Tom's comments about requiring ROW. As I observed 
regarding another syntax proposal, terseness is not always good, and 
redundancy is not always bad.


cheers

andrew

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Qingqing Zhou

Qingqing Zhou [EMAIL PROTECTED] wrote


 To make things simpler, is it possible to add a GUC to let the user
 disable *all* the xlogs?


It may work like this:

BEGIN TRANSACTION WITHOUT XLOG;   /* forbid vacuum, PITR etc */
BEGIN
...                               /* no xlog during this period */
END;                              /* don't mark this transaction committed */
BEGIN
...
END;
END TRANSACTION DO COMMIT;        /* at this point, issue a checkpoint and
                                     mark all transactions committed */

So during this period, if any transaction fails, the only consequence is
some invisible garbage data. When everything goes well, END TRANSACTION
DO COMMIT marks these transactions permanent. Also, it seems there is no
problem even with XLOG_NO_TRAN updates.

Regards,
Qingqing



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Oracle PL/SQL Anonymous block equivalent in postgres

2005-12-22 Thread Ashok Agrawal
I don't need the ability to sit at a prompt and type stuff.

To be more clear, I am attaching some sample code. I would like to
migrate this code to postgres without converting it into a procedure
or function.

Thanks
Ashok

Bruce Momjian wrote on 12/22/05 15:35:
 Hannu Krosing wrote:
 
On Thu, 2005-12-22 at 18:19, Bruce Momjian wrote:

Ashok Agrawal wrote:

Hi,

Here are the requirements:

1. The Development & Support team can ONLY have READ Only ACCESS
   to the PRODUCTION Database. They will NOT have access to create
   stored procedures & functions in PRODUCTION on the fly.

2. During application support, we need to write scripts which
   use procedural language (IF ELSE, AND, OR, Cursor, Loop etc)
   to produce results or identify/fix issues. The number of such
   requests runs into the hundreds at quarter end.

Currently on Oracle, as long as you have sqlplus read only access,
you can write a PL/SQL block and get the work done.

Since postgres doesn't support procedural language except in
stored objects like procedures/functions, how do I achieve this in
postgres without using stored objects? If this is not possible,
does it make sense to add this requirement to the TO-DO list?

So you want to write procedural code on the client.  

I guess he'd rather have don't-save-but-execute-immediately
pl/pgsql code.

could this perhaps be solved by having temporary functions, similar to
postgresql's temporary tables ?
 
 
 I am thinking they want the ability to sit at a prompt and type stuff.
 


sample.sql
Description: application/soffice

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes:
 BEGIN TRANSACTION WITHOUT XLOG;   /* forbid vacuum, PITR etc */

 So during this period, if any transaction fails, the only consequence is
 some invisible garbage data.

No, the likely consequence is irretrievable corruption of any table or
index page touched by the transaction.  You're going to have a very hard
time selling this as a good idea.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Oracle PL/SQL Anonymous block equivalent in postgres

2005-12-22 Thread Bruce Momjian

Wow, that is large.  I think PL/pgSQL is your best approach.  I
recommend you create a schema that users can write into.

---

Ashok Agrawal wrote:
 I don't need the ability to sit at a prompt and type stuff.
 
 To be more clear, I am attaching some sample code. I would like to
 migrate this code to postgres without converting it into a procedure
 or function.
 
 Thanks
 Ashok
 
 Bruce Momjian wrote on 12/22/05 15:35:
  Hannu Krosing wrote:
  
 On Thu, 2005-12-22 at 18:19, Bruce Momjian wrote:
 
 Ashok Agrawal wrote:
 
 Hi,
 
 Here are the requirements:
 
 1. The Development & Support team can ONLY have READ Only ACCESS
    to the PRODUCTION Database. They will NOT have access to create
    stored procedures & functions in PRODUCTION on the fly.
 
 2. During application support, we need to write scripts which
    use procedural language (IF ELSE, AND, OR, Cursor, Loop etc)
    to produce results or identify/fix issues. The number of such
    requests runs into the hundreds at quarter end.
 
 Currently on Oracle, as long as you have sqlplus read only access,
 you can write a PL/SQL block and get the work done.
 
 Since postgres doesn't support procedural language except in
 stored objects like procedures/functions, how do I achieve this in
 postgres without using stored objects? If this is not possible,
 does it make sense to add this requirement to the TO-DO list?
 
 So you want to write procedural code on the client.  
 
 I guess he'd rather have don't-save-but-execute-immediately
 pl/pgsql code.
 
 could this perhaps be solved by having temporary functions, similar to
 postgresql's temporary tables ?
  
  
  I am thinking they want the ability to sit at a prompt and type stuff.
  

[ application/x-soffice is not supported, skipping... ]

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] wrote
 Qingqing Zhou [EMAIL PROTECTED] writes:
 BEGIN TRANSACTION WITHOUT XLOG;   /* forbid vacuum, PITR etc */

 So during this period, if any transaction fails, the only consequence is
 some invisible garbage data.

 No, the likely consequence is irretrievable corruption of any table or
 index page touched by the transaction.


I guess I know (at least part of) what you mean. This is because we rely on
replaying all of the xlog, no matter whether it belongs to a committed
transaction or not. Why? Because a failed transaction is not totally useless,
since a later transaction may rely on some physical thing it creates -- for
example, a new page and its links in a btree. So for heap, there is (I'm 95%
sure) no such problem: our heap redo algorithm can automatically add empty
pages to a heap. For indexes there are problems, but I suspect they are
solvable by not bypassing those records ... if this is not a total
nonstarter, I will investigate the details of how to do it.

Regards,
Qingqing 



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [pgadmin-hackers] Client-side password encryption

2005-12-22 Thread Tom Lane
I wrote:
 I had forgotten that the Windows build is broken.  I'll see what I can
 do with throwing together the cleaner-API function.

Done, but I noticed that the change to createuser has arguably broken
it; at least we need to change the docs.  To wit, the docs say

-E
--encrypted
 Encrypts the user's password stored in the database. If not
 specified, the default password behavior is used.

-N
--unencrypted
 Does not encrypt the user's password stored in the database. If not
 specified, the default password behavior is used.

As currently coded, however, the behavior when neither switch is given
is to force the password to be encrypted --- the database's
password_encryption setting is overridden.

I'm not sure we can do much about this --- certainly we don't want the
default behavior of createuser to still be to send an unencrypted
password.  But if we leave the code as-is the docs need a change.
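
For reference, the SQL-level equivalents of the two switches (user name
made up) are:

    CREATE USER alice WITH ENCRYPTED PASSWORD 'secret';    -- -E
    CREATE USER alice WITH UNENCRYPTED PASSWORD 'secret';  -- -N

and with neither keyword, the server's password_encryption setting
decides.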

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] what is the smallest working page size for postgresql

2005-12-22 Thread Qingqing Zhou

Hannu Krosing [EMAIL PROTECTED] wrote

 Could anybody tell me what is the smallest working page size for
 postgresql ?

 I have a table where access is highly random over huge table getting
 usually only one small tuple from each page. One way to get more
 performance could be using smaller page size, so the per-tuple read
 overhead would be smaller.

 Would 4k pages work ? what about 2k and 1k ? 512bytes ?

 What would it take, to make only heap pages small and keep index pages
 larger ? Probably at least per-tablespace or per-pagesize split shared
 buffer space and changes in caching algorithms ?


I recall there was a discussion several weeks ago:

http://archives.postgresql.org/pgsql-performance/2005-12/msg00120.php

I bet a block size of less than 512 bytes won't bring you any benefits,
since that's the physical disk sector size limit.
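
For reference, BLCKSZ is fixed at compile time (in
src/include/pg_config_manual.h), so each page size needs its own build.
A quick way to confirm what a running server was built with, assuming a
version that exposes the block_size preset:

    SHOW block_size;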

Regards,
Qingqing 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [pgadmin-hackers] Client-side password encryption

2005-12-22 Thread Christopher Kings-Lynne
Where are we on this? In general I agree with Tom, but I have no time to 
do the work. Unless someone has an immediate implementation, I suggest 
that pro tem we add pg_md5_encrypt to src/interfaces/libpq/exports.txt, 
which is the minimum needed to unbreak Windows builds, while this gets 
sorted out properly.



I had forgotten that the Windows build is broken.  I'll see what I can
do with throwing together the cleaner-API function.


Another question about these encrypted passwords.  phpPgAdmin needs to 
connect to databases that are sometimes on other servers.


I use the pg_connect() function to do this.  This is passed down to
PQconnect(), I presume.


So, can I specify the password to pg_connect() as 
'md5127349123742342344234'?


ie. Can I CONNECT using an md5'd password?

Also, does this work for non-md5 host lines on the server, and how can I
avoid doing it on older (pre-7.2) PostgreSQL?


Chris


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [pgadmin-hackers] Client-side password encryption

2005-12-22 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 So, can I specify the password to pg_connect() as 
 'md5127349123742342344234'?

Certainly not.  We'd hardly be worrying about obscuring the original
password if the encrypted version were enough to get in with.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [Bizgres-general] WAL bypass for INSERT, UPDATE and

2005-12-22 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes:
 I guess I know (at least part of) what you mean. This is because we rely on
 replaying all of the xlog, no matter whether it belongs to a committed
 transaction or not. Why? Because a failed transaction is not totally useless,
 since a later transaction may rely on some physical thing it creates -- for
 example, a new page and its links in a btree. So for heap, there is (I'm 95%
 sure) no such problem.

Torn pages (partial page write) are still a problem.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [pgadmin-hackers] Client-side password encryption

2005-12-22 Thread Christopher Kings-Lynne
So, can I specify the password to pg_connect() as 
'md5127349123742342344234'?


Certainly not.  We'd hardly be worrying about obscuring the original
password if the encrypted version were enough to get in with.


AndrewSN can't post at the moment, but asked me to post this for him:

Knowing the md5 hash is enough to authenticate via the 'md5' method in 
pg_hba.conf, even if you don't know the original password. Admittedly 
you have to modify libpq to do this, but this isn't going to stop an 
attacker for more than 5 seconds.


I'll add my own note that never sending the cleartext password does not 
necessarily improve PostgreSQL security, but certainly stops someone who 
sniffs the password from then using that cleartext password to get into 
other applications.  If all they can get is the md5 hash, then all they 
can get into is PostgreSQL.
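
To make the point concrete, the exchange can be sketched at the SQL
level (user name, password, and salt are made up; the real salt is 4
random bytes chosen by the server per connection attempt):

    -- what pg_shadow stores:
    SELECT 'md5' || md5('secret' || 'chris');
    -- what the client answers to a given challenge; note it is
    -- computable from the stored hash alone, without the password:
    SELECT 'md5' || md5( md5('secret' || 'chris') || 'abcd' );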


Chris


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] [pgadmin-hackers] Client-side password encryption

2005-12-22 Thread Tom Lane
Christopher Kings-Lynne [EMAIL PROTECTED] writes:
 AndrewSN can't post at the moment, but asked me to post this for him:
 Knowing the md5 hash is enough to authenticate via the 'md5' method in 
 pg_hba.conf, even if you don't know the original password.

If you know the md5 hash, you know everything the postmaster does, so
it's hard to see where such an attacker is going to be stopped.  The
entire point here is not to expose the cleartext password, and that
really has nothing to do with whether you're going to break into the
PG database.  It's about protecting users who are foolish enough to
use the same cleartext password for multiple services.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq