Re: [HACKERS] On-disk bitmap index patch

2006-07-24 Thread Jie Zhang

Thanks Tom and Gavin for your comments!

Yes, this patch was generated against 8.2 in a short time. Since the
code is working, I posted the patch to get some comments and help.

 
 * The xlog routines need help; they seem to not be updated for recent
 changes in the API for xlog recovery code.
 
 Yep. The patch was actually against 8.1 and was hastily brought up to 8.2.
 I think Jie's intention was to simply let everyone know that this was
 going on.

Thanks for pointing this out. I didn't notice that these had changed in 8.2.

 
 
 * The hacks on vacuum.c (where it tests the AM name) are utterly
 unacceptable.  If you need the main code to do something special for a
 particular index AM, define a bool flag column for it in pg_am.
 
 Yes.

Sounds good.

 
 
 * The interface to the existing executor bitmap scan code is mighty
 messy --- seems like there's a lot of almost duplicate code, a lot
 of rather invasive hacking, etc.  This needs to be rethought and
 refactored.
 
 I agree.

I will think about this more.

 
 
 * Why in the world is it introducing duplicate copies of a lot
 of btree comparison functions?  Use the ones that are there.
 
 Yes, I raised this with Jie and she has fixed it. One thought is, we may
 want to rename those comparison functions prefixed with 'bt' to make their
 naming less confusing. They'll be used by the btree, GIN and bitmap index
 methods. Anyway, a separate patch.

Yeah, the main reason I hesitated to use btree's comparison functions was
that their names start with 'bt'. Since Gavin told me that GIN is using
those functions as well, I have changed them. Renaming them would be good.

 
 
 * The patch itself is a mess: it introduces .orig and .rej files,
 changes around $PostgreSQL$ lines, etc.
 
 
 Right, not to mention patches to configure and a lot of style which needs
 to be knocked into shape.
 

The way I generate a patch is kind of clumsy. I need to find a better way to
do that.

I will start fixing these.

Thanks,
Jie



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Adding a pgbench run to buildfarm

2006-07-24 Thread Dave Page
 

 -Original Message-
 From: [EMAIL PROTECTED] 
 [mailto:[EMAIL PROTECTED] On Behalf Of Bort, Paul
 Sent: 24 July 2006 04:52
 To: pgsql-hackers@postgresql.org
 Subject: [HACKERS] Adding a pgbench run to buildfarm
 
 -hackers,
 
 With help from Andrew Dunstan, I'm adding the ability to do a pgbench
 run after all of the other tests during a buildfarm run. 
 
 Andrew said I should solicit opinions as to what parameters to use. A
 cursory search through the archives led me to pick a scaling factor of
 10, 5 users, and 100 transactions. All of these will be 
 adjustable using
 the build-farm.conf mechanism already in place. 
 
 Comments? Suggestions?
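For reference, those parameters would correspond to a pgbench invocation roughly like the following (a sketch; the database name is illustrative):

```shell
# Initialize the pgbench tables at scaling factor 10
# (the "accounts" table gets 100,000 rows per unit of scale).
pgbench -i -s 10 testdb

# Run the benchmark: 5 concurrent clients, 100 transactions each.
pgbench -c 5 -t 100 testdb
```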

Please ensure the run is optional. The machine hosting Snake and
Bandicoot is currently running 16 builds a day, and I'd prefer not to
significantly add to its load.

Regards, Dave.



[HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Golden Liu

Updateable cursors are used as follows:

begin;
declare foo cursor for select * from bar for update;
fetch foo;
update bar set abc='def' where current of foo;
fetch foo;
delete from bar where current of foo;
commit;


PostgreSQL doesn't support this feature now (8.1.4). Will PostgreSQL
support it soon? Is anyone working on this?


thanks
Golden
 7.24

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Transaction Speed and real time database

2006-07-24 Thread Csaba Nagy
[please use reply-to-all, otherwise you'll have what you just had: the
guy you write to goes home for the weekend, and all the rest of the
people on the list who would answer you won't know there is something to
answer...]

On Fri, 2006-07-21 at 13:39, moises wrote:
 Sorry if I can't explain myself clearly.
 I want to use an extension of Postgres (because I'm very clear that
 Postgres is not a real-time database).
 I want to modify some modules of Postgres for this purpose - for example,
 add some time constraints to transactions. I know that is very hard work.
 Our applications are 100% real-time work, controlling industrial processes,
 and not OLAP or OLTP applications.
 
 My question is: how fast is Postgres at transactions currently? Because
 maybe we don't need to work on looking for more speed, just constraints and
 planners.

I have to admit RT is for me just a vague idea I still remember from
some courses, but I'm not sure RT and transactional are compatible at
all... do you really need a transactional database for RT applications?
Because Postgres is just that: an excellent transactional DB, which BTW
is fast enough for transactional work (where you expect that sometimes
operations fail due to others working in parallel). I'm not sure that in
industrial RT applications you could afford failures due to
concurrency... and in that case you would be better off using something
non-transactional - but this is just my limited understanding of the
problem. Others on this list might know better...

Cheers,
Csaba.




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Adding a pgbench run to buildfarm

2006-07-24 Thread Stefan Kaltenbrunner
Mark Kirkwood wrote:
 Tom Lane wrote:
 Bort, Paul [EMAIL PROTECTED] writes:
 Andrew said I should solicit opinions as to what parameters to use. A
 cursory search through the archives led me to pick a scaling factor of
 10, 5 users, and 100 transactions.

 100 transactions seems barely enough to get through startup transients.
 Maybe 1000 would be good.

 
 Scale factor 10 produces an accounts table of about 130 MB. Given that
 most HW these days has at least 1GB of RAM, this probably means not much
 retrieval IO is tested (only checkpoint and WAL fsync). Do we want to
 try 100 or even 200? (or recommend a scale factor such that size > RAM)?
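A quick back-of-envelope check of those sizes (assumptions: pgbench creates 100,000 accounts rows per unit of scaling factor, at roughly 130 bytes per row on disk including tuple overhead - both constants are approximations, not exact pgbench figures):

```python
# Rough pgbench "accounts" table sizing (assumed constants, not exact).
ROWS_PER_SCALE = 100_000   # accounts rows created per unit of scaling factor
BYTES_PER_ROW = 130        # assumed on-disk bytes per row incl. overhead

def accounts_size_mb(scale: int) -> float:
    """Approximate accounts table size in MB for a given scaling factor."""
    return scale * ROWS_PER_SCALE * BYTES_PER_ROW / (1024 * 1024)

print(accounts_size_mb(10))   # scale 10: roughly 124 MB, i.e. "about 130 MB"
print(accounts_size_mb(100))  # scale 100: over 1 GB, enough to exceed RAM
```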

hmm - that 1GB is a rather optimistic estimate for most of the
buildfarm boxes (mine at least).
Of the six I have, only one actually has much RAM (allocated);
lionfish, for example, is rather resource-starved at only
48(!) MB of RAM and very limited disk space - which has been plenty
enough until now for doing the builds (with enough swap, of course).
I suspect that anything that would result in additional disk space usage
in excess of maybe 150MB or so would run it out of resources :-(

I'm also not too keen on running excessively long pgbench runs on some of
the buildfarm members, so I would prefer to make that one optional.


Stefan

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Gavin Sherry
On Mon, 24 Jul 2006, Golden Liu wrote:

 Updateable cursors are used as follows:

 begin;
 declare foo cursor for select * from bar for update;
 fetch foo;
 update bar set abc='def' where current of foo;
 fetch foo;
 delete from bar where current of foo;
 commit;


 PostgreSQL doesn't support this feature now (8.1.4). Will PostgreSQL
 support it soon? Is anyone working on this?


No one has stepped up to do this for 8.2 so unfortunately you will most
likely not see this within the next year or so :-(.

Thanks,

Gavin

PS: sorry for not responding to your private email in time.

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Documenting Replication Solutions

2006-07-24 Thread Markus Schiltknecht

Hello Christopher,

I remember Bruce pushing towards a 'PostgreSQL Development 
Group' statement about replication solutions for PostgreSQL. AFAIK, 
there is a lot of uncertainty about replication solutions, and I think it 
would be a good thing to give people some guidance. In a FAQ or 
such we should cover these types of questions:


- Can I configure a standby server for hot-failover?

- Can I do load-balancing?

- What kind of replication is possible with PostgreSQL, what do the 
different solutions implement?


- What's the development status of Slony-II, Postgres-R, PgCluster, 
etc.?


- What's the difference between Slony-II and Postgres-R?

Where on the website do we put such a FAQ? Or should some of these 
questions be part of the main FAQ?


Regards

Markus

Christopher Browne wrote:

Quoth Markus Schiltknecht [EMAIL PROTECTED]:

at the code sprint, we agreed to put together some documentation about
current and upcoming replication solutions for PostgreSQL. Is somebody
already working on that?


I don't recall that anyone expressly agreed to do so; I'll see if I
can, this week...




Re: [HACKERS] Transaction Speed and real time database

2006-07-24 Thread Joerg Hessdoerfer
Hi,

On Monday 24 July 2006 10:33, Csaba Nagy wrote:
 [please use reply to all, otherwise you'll have what you just had: the
 guy who you write goes home for the weekend and all the rest of the
 people on the list who would answer you won't know there is something to
 answer...]

 On Fri, 2006-07-21 at 13:39, moises wrote:
  Sorry if I can't explain myself clearly.
  I want to use an extension of Postgres (because I'm very clear that
  Postgres is not a real-time database).
  I want to modify some modules of Postgres for this purpose - for
  example, add some time constraints to transactions. I know that is very
  hard work. Our applications are 100% real-time work, controlling
  industrial processes, and not OLAP or OLTP applications.
 
  My question is: how fast is Postgres at transactions currently?
  Because maybe we don't need to work on looking for more speed, just
  constraints and planners.

 I have to admit RT is for me just a vague idea I still remember from
 some courses, but I'm not sure RT and transactional are compatible at
 all... do you really need a transactional database for RT applications?
 Because Postgres is just that: an excellent transactional DB, which BTW
 is fast enough for transactional work (where you expect that sometimes
 operations fail due to others working in parallel). I'm not sure that in
 industrial RT applications you could afford failures due to
 concurrency... and in that case you would be better off using something
 non-transactional - but this is just my limited understanding of the
 problem. Others on this list might know better...

 Cheers,
 Csaba.


As someone 'from the industry' I have to chime in here. IMHO there is no way 
to make PostgreSQL itself RT-capable - not for a long time. 
The following items (at least) are roadblocks:

- index tree growth (nondeterministic in time and duration)
- need to vacuum (for FIFO applications)
- database file growth (also nondeterministic, but can probably be worked 
around by initializing the database with the max. no. of records, then 
deleting and vacuuming without -f. But this requires the FSM to be large enough, 
and still there is the need for frequent vacuum).

OTOH, one has to be very careful not to mix terms here. In industrial 
(production floor) applications, the term 'real time database' refers to 
something completely different from a relational, transactional DB.

A 'real time database' in this scenario is something which keeps aggregates of 
data points and events over time, and this is usually the only type of data 
retrieval they allow for: 'give me channels x,y,z from start to end time'. 
The prime properties of such an RTDB are constant-time inserts and prioritization 
of access paths. Normally they work in embedded controllers or on real-time 
OSes.

There are RTDBs which can also be seen as relational databases, but the design 
of these is usually the RT part with a relational 'add-on'.

Because of the features of a full-fledged relational database engine, 
engineers often wish they had one of those instead ;-). Usually, we solve 
this with some sort of streaming 'frontend', which buffers the data flow 
(usually to disk) until it's inserted into the database. This lowers the 
real-time requirement to 'has to be fast enough on average'. We have several 
of these types of applications in production at various customers, some for 
6+ years continuously (using PostgreSQL 7.0!).
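The streaming-frontend idea described above can be sketched roughly as follows (purely illustrative Python; the spool file name and record format are invented, not taken from any real product). Acquisition appends to an on-disk spool at a predictable cost, and a separate loader drains the spool into the database whenever the database keeps up on average:

```python
import json
import os

SPOOL = "spool.jsonl"  # illustrative on-disk buffer between acquisition and DB

def acquire(record: dict) -> None:
    """Acquisition side: append one record to the spool (predictable cost)."""
    with open(SPOOL, "a") as f:
        f.write(json.dumps(record) + "\n")
        f.flush()
        os.fsync(f.fileno())  # survive a crash of the frontend

def drain(insert) -> int:
    """Loader side: feed all spooled records to `insert` (e.g. a DB INSERT).

    Returns the number of records loaded. Only has to be fast *on average*.
    """
    if not os.path.exists(SPOOL):
        return 0
    with open(SPOOL) as f:
        records = [json.loads(line) for line in f]
    for r in records:
        insert(r)
    os.remove(SPOOL)
    return len(records)
```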

Hope this clears up the issues somewhat for 'normal' database people.

Greetings,
Jörg
-- 
Leiter Softwareentwicklung - S.E.A GmbH
Mail: [EMAIL PROTECTED]
WWW:  http://www.sea-gmbh.com



Re: [HACKERS] Transaction Speed and real time database

2006-07-24 Thread Csaba Nagy
[snip]
 OTOH, one has to be very careful not to mix terms here. In industrial 
 (production floor) applications, the term 'real time database' refers to 
 something completely different from a relational, transactional DB.

But relational and transactional are orthogonal; they don't
imply/require each other... most of the roadblocks you mentioned
(including vacuum) are part of Postgres's transactional design, and a
non-transactional DB won't have that overhead. Your input reinforces my
thinking that the transactionality of the DB is the real roadblock...
which means Postgres will never really be an RT database in the
proper sense of the word.

 Because of the features of a full-fledged relational database engine, 
 engineers often wish they had one of those instead ;-). Usually, we solve 
 this with some sort of streaming 'frontend', which buffers the data flow 
 (usually to disk) until it's inserted into the database. This lowers the 
 real-time requirement to 'has to be fast enough on average'. We have several 
 of these types of applications in production at various customers, some for 
 6+ years continuously (using PostgreSQL 7.0!).

This sounds like the most reasonable approach :-)

Cheers,
Csaba.





Re: [HACKERS] Transaction Speed and real time database

2006-07-24 Thread Joerg Hessdoerfer
Hi,

On Monday 24 July 2006 11:26, Csaba Nagy wrote:
 [snip]

  OTOH, one has to be very careful not to mix terms here. In industrial
  (production floor) applications, the term 'real time database' refers to
  something completely different from a relational, transactional DB.

 But relational and transactional are orthogonal; they don't
 imply/require each other... most of the roadblocks you mentioned
 (including vacuum) are part of Postgres's transactional design, and a
 non-transactional DB won't have that overhead. Your input reinforces my
 thinking that the transactionality of the DB is the real roadblock...
 which means Postgres will never really be an RT database in the
 proper sense of the word.

[...]

Yes, the terms are orthogonal. But most relational databases I know of are 
also transactional - because it just makes sense.

The roadblocks I mentioned were specific to PG. The storage manager is as it 
is, no way around it. So you need vacuum, you can have index growth, and you 
will have table space growth ;-)

Greetings,
Jörg
-- 
Leiter Softwareentwicklung - S.E.A GmbH
Mail: [EMAIL PROTECTED]
WWW:  http://www.sea-gmbh.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Adding a pgbench run to buildfarm

2006-07-24 Thread Andrew Dunstan

Dave Page wrote:


With help from Andrew Dunstan, I'm adding the ability to do a pgbench
run after all of the other tests during a buildfarm run. 






Please ensure the run is optional. The machine hosting Snake and
Bandicoot is currently running 16 builds a day, and I'd prefer not to
significantly add to its load.

  



Rest easy. It will be optional, of course.

cheers

andrew



Re: [HACKERS] Adding a pgbench run to buildfarm

2006-07-24 Thread Andrew Dunstan

Gavin Sherry wrote:

Not all machines stay the same over time.
A machine may be upgraded, or a machine may be getting backed up or may in
some other way be utilised during a performance test. This would skew the
stats for that machine. It may confuse people more than help them...

At the very least, the performance figures would need to be accompanied by
details of what other processes were running and what resources they were
chewing during the test.

This is what was nice about the OSDL approach. Each test was preceded by
an automatic reinstall of the OS, and the machines were specifically for
testing. The tester had complete control.

We could perhaps mimic some of that using virtualisation tools which
control access to system resources, but it won't work on all platforms. The
problem is that it probably introduces a new variable, in that I'm not
sure that virtualisation software can absolutely limit the CPU resources a
particular container has. That is, you might not be able to get
reproducible runs with the same code. :(

  


We are really not going to go in this direction. If you want ideal 
performance tests, then a heterogeneous distributed collection of 
autonomous systems like buildfarm is not what you want.


You are going to have to live with the fact that there will be 
occasional, possibly even frequent, blips in the data due to other 
activity on the machine.


If you want tightly controlled or very heavy load testing this is the 
wrong vehicle.


You might think that what that leaves us is not worth having - the 
consensus in Toronto seemed to be that it is worth having, which is why 
it is being pursued.


cheers

andrew


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


[HACKERS] Maximum size of tuples in a relation

2006-07-24 Thread Tzahi Fadida
Is there a way to figure out from the catalogs the size in bytes of the 
largest tuple in a relation, without going over the relation tuple by tuple 
and calculating it? Alternatively, without actually pulling the whole 
tuple from disk?
(The algorithm needs to know that value.)
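As far as I know the catalogs don't record a per-tuple maximum - ANALYZE only stores average column widths in pg_stats - so a scan seems unavoidable for an exact answer. A sketch (assuming PostgreSQL 8.1+, where pg_column_size() exists; 'bar' stands in for the relation):

```sql
-- Exact answer, but requires scanning the relation:
SELECT max(pg_column_size(t.*)) AS max_tuple_bytes FROM bar t;

-- Scan-free, but only an *average* row-width estimate from ANALYZE stats:
SELECT sum(avg_width) AS avg_row_bytes FROM pg_stats WHERE tablename = 'bar';
```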
10x.

-- 
Regards,
Tzahi.
--
Tzahi Fadida
Blog: http://tzahi.blogsite.org | Home Site: http://tzahi.webhop.info
WARNING TO SPAMMERS:  see at 
http://members.lycos.co.uk/my2nis/spamwarning.html



Re: [HACKERS] Adding a pgbench run to buildfarm

2006-07-24 Thread Andrew Dunstan

Tom Lane wrote:

Mark Kirkwood [EMAIL PROTECTED] writes:
  
Scale factor 10 produces an accounts table of about 130 MB. Given that 
most HW these days has at least 1GB of RAM, this probably means not much 
retrieval IO is tested (only checkpoint and WAL fsync). Do we want to 
try 100 or even 200? (or recommend a scale factor such that size > RAM)?



That gets into a different set of questions, which is what we want the
buildfarm turnaround time to be like.  The faster members today produce
a result within 10-15 minutes of pulling their CVS snaps, and I'd be
seriously unhappy if that changed to an hour or three.  Maybe we need to
divorce compile/regression tests from performance tests?


  


We could have the system report build/regression results before going on 
to do performance testing. I don't want to divorce them altogether if I 
can help it, as it will make cleanup a lot messier.


cheers

andrew




Re: [HACKERS] On-disk bitmap index patch

2006-07-24 Thread Hannu Krosing
On Sun, 2006-07-23 at 20:25, Tom Lane wrote:
 Gavin Sherry [EMAIL PROTECTED] writes:
  On Sun, 23 Jul 2006, Tom Lane wrote:
  However, the main problem I've got with this is that a new index AM is a
  pretty large burden, and no one's made the slightest effort to sell
  pghackers on taking this on.
 
  For low cardinality sets, bitmaps greatly out perform btree.
 
 If the column is sufficiently low cardinality, you might as well just do
 a seqscan --- you'll be hitting most of the heap's pages anyway.  I'm
 still waiting to be convinced that there's a sweet spot wide enough to
 justify supporting another index AM.  (I'm also wondering whether this
 doesn't overlap the use-case for GIN.)

IIRC they quoted the cardinality of 1 as something that is still
faster than btree for several use cases.

And also for ANDs of several indexes, where indexes are BIG: your btree
indexes may be almost as big as the tables, but the resulting set of pages
is small.

-- 

Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me:  callto:hkrosing
Get Skype for free:  http://www.skype.com





Re: [HACKERS] [PATCHES] LDAP patch feature freeze

2006-07-24 Thread Tom Lane
Albe Laurenz [EMAIL PROTECTED] writes:
 Any chance that my LDAP patch
 http://momjian.us/mhonarc/patches/msg0.html
 will get reviewed before the feature freeze?

Feature freeze is the deadline for patch submission, not patch application.

regards, tom lane



Re: [HACKERS] [PATCHES] Allow commenting of variables in postgresql.conf to - try 4

2006-07-24 Thread Tom Lane
Joachim Wieland [EMAIL PROTECTED] writes:
 I did not check the changes you have done to set_config_option and the like
 but tested the commenting / uncommenting / changing of guc variables and the
 behavior and log output. The general idea (at least my idea) is that
 whenever a SIGHUP is received and there is some difference between the
 config file and the active value that the server is using, a notice message
 is written to the log.

Notice message?  Where did that come from?  The behavior I thought
people were after was just that variables previously defined by the file
would revert to reset values if not any longer defined by the file.

From a reviewer's point of view, it'd be nice if the patch did not
contain so many useless changes of whitespace.

regards, tom lane



Re: [HACKERS] [PATCHES] patch implementing the multi-argument aggregates (SOC project)

2006-07-24 Thread Tom Lane
Sergey E. Koposov [EMAIL PROTECTED] writes:
 Since the feature freeze is in a few days, I'm sending the first iteration 
 of my patch implementing the multi-argument aggregates (PolyArgAgg) (SOC 
 project)

This patch is nowhere near ready for submission :-(.  Most of the
comments seem to be "I don't know what to do here" ...

A general hint on the polymorphic stuff is that you should be able to
exactly duplicate what's done for polymorphic functions --- or even
better, get rid of the separate code for aggregates and just invoke
the existing logic for functions.  (You might need to refactor code
a little bit to separate out the common functionality.)

Instead of copying data inside advance_transition_function, it might
be better for the caller to store the values into the right fields
of a temporary FunctionCallInfoData struct, and just pass that to
advance_transition_function.

The names for the new aggregates seem a bit, how to say, terse and
unfriendly.  SQL generally tends to a more verbose style of naming.

regards, tom lane



Re: [HACKERS] [PATCHES] Allow commenting of variables in postgresql.conf to - try 4

2006-07-24 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
 Joachim Wieland [EMAIL PROTECTED] writes:
  I did not check the changes you have done to set_config_option and the like
  but tested the commenting / uncommenting / changing of guc variables and the
  behavior and log output. The general idea (at least my idea) is that
  whenever a SIGHUP is received and there is some difference between the
  config file and the active value that the server is using, a notice message
  is written to the log.
 
 Notice message?  Where did that come from?  The behavior I thought
 people were after was just that variables previously defined by the file
 would revert to reset values if not any longer defined by the file.

There are two issues here, I believe.  There's the
'revert-to-reset-values' issue for things which can be changed with a
reload, and then there's also the 'notice-message-if-unable-to-change'
issue for a given variable that can't be changed without a restart.

On reload, when a variable is changed:

#1: That variable can be changed by a reload.
If the variable has been removed/commented-out then it is reverted
to the reset-value.  Otherwise, the new value is used.

#2: That variable can *not* be changed by a reload.
Notice-level message is sent to the log notifying the admin that the
change requested could not be performed.  This change could be
either a revert to reset-value if it was removed/commented-out or an
explicit change request to a different value.
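In pseudo-Python, the two cases above amount to something like the following (all names are invented for illustration; this is not the actual GUC code):

```python
# Illustrative decision logic for one variable at SIGHUP/reload time.
# file_value is None when the variable was removed/commented out.
def apply_reload(name, file_value, active_value, reset_value,
                 changeable_on_reload, log_notice):
    desired = file_value if file_value is not None else reset_value
    if changeable_on_reload:
        # Case 1: take the new value, or revert to the reset value.
        return desired
    # Case 2: cannot be changed by reload; keep the active value and
    # notify the admin that the requested change was not applied.
    if desired != active_value:
        log_notice(f"parameter \"{name}\" cannot be changed without a restart")
    return active_value
```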

Personally, I'm very interested in having both.  I'm about 90% sure both
were discussed previously on hackers and that the general consensus was
that both were good.  It's possible the second point wasn't noticed by
everyone involved though.  Of course, I might be misunderstanding what
Joachim was referring to also.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [PATCHES] patch implementing the multi-argument aggregates (SOC

2006-07-24 Thread Sergey E. Koposov

On Mon, 24 Jul 2006, Tom Lane wrote:


Sergey E. Koposov [EMAIL PROTECTED] writes:

Since the feature freeze is in a few days, I'm sending the first iteration
of my patch implementing the multi-argument aggregates (PolyArgAgg) (SOC
project)


This patch is nowhere near ready for submission :-(.  Most of the


:-(
But now at least I know that...


comments seem to be "I don't know what to do here" ...



No, that's not quite true... I have only ~2-3 such comments; all the others
just express that I marked the places where I had any little doubts 
and which I'll check additionally...



A general hint on the polymorphic stuff is that you should be able to
exactly duplicate what's done for polymorphic functions --- or even
better, get rid of the separate code for aggregates and just invoke
the existing logic for functions.  (You might need to refactor code
a little bit to separate out the common functionality.)

Instead of copying data inside advance_transition_function, it might
be better for the caller to store the values into the right fields
of a temporary FunctionCallInfoData struct, and just pass that to
advance_transition_function.


Thank you for the hints, I'll think about them...


The names for the new aggregates seem a bit, how to say, terse and
unfriendly.  SQL generally tends to a more verbose style of naming.



The names for the functions came from the SQL 2003 standard...

Regards,
Sergey

***
Sergey E. Koposov
Max Planck Institute for Astronomy/Sternberg Astronomical Institute
Tel: +49-6221-528-349
Web: http://lnfm1.sai.msu.ru/~math
E-mail: [EMAIL PROTECTED]



Re: [HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Florian G. Pflug

Gavin Sherry wrote:

On Mon, 24 Jul 2006, Golden Liu wrote:


begin;
declare foo cursor for select * from bar for update;
fetch foo;
update bar set abc='def' where current of foo;
fetch foo;
delete from bar where current of foo;
commit;



No one has stepped up to do this for 8.2 so unfortunately you will most
likely not see this within the next year or so :-(.


Couldn't this be emulated by doing
begin;
declare foo cursor for select * from bar for update;
fetch foo into v_foo ;
update bar set abc='def' where ctid = v_foo.ctid;
fetch foo into v_foo ;
delete from bar where ctid = v_foo.ctid;
commit;

Or could a concurrent vacuum run lead to the wrong
rows being updated/deleted?

greetings, Florian Pflug



[HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Tom Lane
I'm fairly dissatisfied with the naming of Greg Stark's proposed new
feature for creating indexes without blocking writers of the table.
To my mind, ONLINE just doesn't convey any useful information ---
the existing CREATE INDEX functionality could already be said to be
online, in the sense that you don't have to take down the database
to do it.  I thought about SHARED but someone could probably raise
the same objection to it.  Anyone have a better idea?

I'm also wondering about where in the command the keyword should go.
As submitted it's

  ( { <replaceable class="parameter">column</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ <replaceable class="parameter">opclass</replaceable> ] [, ...] )
  [ WITH ( <replaceable class="PARAMETER">storage_parameter</replaceable> = <replaceable class="PARAMETER">value</replaceable> [, ... ] ) ]
  [ TABLESPACE <replaceable class="parameter">tablespace</replaceable> ]
+ [ ONLINE ]
  [ WHERE <replaceable class="parameter">predicate</replaceable> ]

which seems a bit randomly chosen; what's more it creates a problem for
psql, which would have to parse nearly the entire command to discover
whether it's safe to execute inside a transaction block or not.
I'm tempted to put the new keyword at the very front:

SHARED CREATE INDEX 

which would probably mean that we'd have to document it as if it were a
completely separate command from CREATE INDEX, but then again that might
not be a bad thing considering how differently the two cases behave.
If not that, we probably still need to put it somewhere near the front
for psql's sake.

Comments?

regards, tom lane



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Rod Taylor
   SHARED CREATE INDEX 

 Comments?

CREATE [UNIQUE] INDEX foo [WITH NOLOCK] ON ...


-- 




Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Rod Taylor
Sorry, hit send too quickly.

NOLOCK is kinda like NOWAIT, except it implies that the command will not
take a strong lock, instead of stating that it will not wait for one.

On Mon, 2006-07-24 at 11:20 -0400, Rod Taylor wrote:
  SHARED CREATE INDEX 
 
  Comments?
 
 CREATE [UNIQUE] INDEX foo [WITH NOLOCK] ON ...
 
 
-- 




Re: [HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Tom Lane
Florian G. Pflug [EMAIL PROTECTED] writes:
 Couldn't this be emulated by doing
 begin;
 declare foo cursor for select * from bar for update;
 fetch foo into v_foo ;
 update bar set abc='def' where ctid = v_foo.ctid;

That wouldn't follow the expected semantics if there's a concurrent
update, because the updated row would always fail the WHERE clause,
and thus the update would just silently not happen.  (I'm thinking
about READ COMMITTED mode of course --- in SERIALIZABLE you'd just get
the expected error.)  You'd have to find some way to pump the row's most
up-to-date version through the cursor's query plan, a la EvalPlanQual,
to see if it still met the cursor's WHERE condition.

regards, tom lane

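Tom's failure mode can be sketched with a toy model (Python purely for illustration; the class and names are invented, not PostgreSQL code): under MVCC, an UPDATE marks the old row version dead and writes the new version at a new ctid, so an emulation that addresses the row by the ctid it fetched earlier silently matches zero rows after a concurrent update.

```python
# Toy model of MVCC row versions (illustrative only; invented names).
# An UPDATE marks the old version dead and writes a new one at a new
# "ctid", so a stale ctid from an earlier FETCH matches zero rows.

class Table:
    def __init__(self):
        self.rows = []                      # row versions, indexed by ctid

    def insert(self, abc):
        self.rows.append({"live": True, "abc": abc})
        return len(self.rows) - 1           # new version's ctid

    def update_by_ctid(self, ctid, abc):
        """UPDATE bar SET abc = ... WHERE ctid = ...; returns the new
        ctid, or None when no live version sits at that ctid -- the
        silent no-op described above."""
        if ctid >= len(self.rows) or not self.rows[ctid]["live"]:
            return None
        self.rows[ctid]["live"] = False     # old version dies
        return self.insert(abc)             # new version, new ctid

bar = Table()
v_foo_ctid = bar.insert("abc")                  # FETCH foo INTO v_foo
moved = bar.update_by_ctid(v_foo_ctid, "xyz")   # concurrent transaction
result = bar.update_by_ctid(v_foo_ctid, "def")  # the ctid-based emulation
print(moved, result)                            # 1 None: silently skipped
```

This is exactly why the row's newest version would have to be pumped back through the cursor's plan, a la EvalPlanQual, rather than chased by physical address.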


Re: [HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Alvaro Herrera
Florian G. Pflug wrote:
 Gavin Sherry wrote:
 On Mon, 24 Jul 2006, Golden Liu wrote:
 
 begin;
 declare foo cursor for select * from bar for update;
 fetch foo;
 update bar set abc='def' where current of foo;
 fetch foo;
 delete from bar where current of foo;
 commit;
 
 No one has stepped up to do this for 8.2 so unfortunately you will most
 likely not see this within the next year or so :-(.
 
 Couldn't this be emulated by doing
 begin;
 declare foo cursor for select * from bar for update;
 fetch foo into v_foo ;
 update bar set abc='def' where ctid = v_foo.ctid;
 fetch foo into v_foo ;
 delete from bar where ctid = v_foo.ctid;
 commit;
 
 Or could a concurrent vacuum run lead to the wrong
 rows being updated/deleted?

No, a concurrent vacuum can't change that because vacuum can't change
the page unless it can get a super-exclusive lock on it (which means
nobody else can have a scan stopped at that page, which is exactly
what this cursor has).

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Florian G. Pflug

Tom Lane wrote:

Florian G. Pflug [EMAIL PROTECTED] writes:

Couldn't this be emulated by doing
begin;
declare foo cursor for select * from bar for update;
fetch foo into v_foo ;
update bar set abc='def' where ctid = v_foo.ctid;


That wouldn't follow the expected semantics if there's a concurrent
update, because the updated row would always fail the WHERE clause,
and thus the update would just silently not happen.  (I'm thinking
about READ COMMITTED mode of course --- in SERIALIZABLE you'd just get
the expected error.)  You'd have to find some way to pump the row's most
up-to-date version through the cursor's query plan, a la EvalPlanQual,
to see if it still met the cursor's WHERE condition.


How could there be a concurrent update of the _same_ row, when
I do select * from bar *for update*? Or are you talking about
concurrent updates to the same page that could somehow alter
the ctid of _another_ tuple?

greetings, Florian Pflug



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Peter Eisentraut
On Monday, 24 July 2006 17:13, Tom Lane wrote:
 To my mind, ONLINE just doesn't convey any useful information ---
 the existing CREATE INDEX functionality could already be said to be
 online, in the sense that you don't have to take down the database
 to do it.  I thought about SHARED but someone could probably raise
 the same objection to it.  Anyone have a better idea?

CONCURRENTLY

 I'm tempted to put the new keyword at the very front:

   SHARED CREATE INDEX 

 which would probably mean that we'd have to document it as if it were a
 completely separate command from CREATE INDEX, but then again that might
 not be a bad thing considering how differently the two cases behave.

What is so different about them that would justify this?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 No, a concurrent vacuum can't change that because vacuum can't change
 the page unless it can get a super-exclusive lock on it (which means
 nobody else can have a scan stopped at that page, which is exactly
 what this cursor has).

More to the point, vacuum certainly may not delete a row that's still
visible to any open transaction, which this row would be by definition.
And VACUUM FULL couldn't move it, because it couldn't get exclusive
lock on the table.

You'd probably have to forbid use of WHERE CURRENT OF for a cursor WITH HOLD,
though, since such a cursor quite possibly would contain rows that don't exist
anymore.

regards, tom lane



Re: [HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Tom Lane
Florian G. Pflug [EMAIL PROTECTED] writes:
 How could there be a concurrent update of the _same_ row, when
 I do select * from bar *for update*?

AFAICT the spec doesn't require one to have written FOR UPDATE
in order to use WHERE CURRENT OF.  (In effect, they expect FOR UPDATE
to be the default, which is certainly not a change we're going to
want to make to DECLARE CURSOR.)  If we did make that restriction
then we could probably skip the EvalPlanQual mess.

regards, tom lane



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 On Monday, 24 July 2006 17:13, Tom Lane wrote:
 I'm tempted to put the new keyword at the very front:
 
 SHARED CREATE INDEX 
 
 which would probably mean that we'd have to document it as if it were a
 completely separate command from CREATE INDEX, but then again that might
 not be a bad thing considering how differently the two cases behave.

 What is so different about them that would justify this?

Well, mainly it's exactly the reason that psql wants to know the
difference: one can be executed inside a transaction block, and the
other one can't.  To my mind that's a sufficiently big difference
that it deserves a different command name.  We messed this up with
CLUSTER but that's not a precedent I want to follow.

regards, tom lane



Re: [HACKERS] UPDATE/DELETE XXX WHERE CURRENT OF cursor_name

2006-07-24 Thread Florian G. Pflug

Tom Lane wrote:

Florian G. Pflug [EMAIL PROTECTED] writes:

How could there be a concurrent update of the _same_ row, when
I do select * from bar *for update*?


AFAICT the spec doesn't require one to have written FOR UPDATE
in order to use WHERE CURRENT OF.  (In effect, they expect FOR UPDATE
to be the default, which is certainly not a change we're going to
want to make to DECLARE CURSOR.)  If we did make that restriction
then we could probably skip the EvalPlanQual mess.


But if they expect FOR UPDATE to be the default, then essentially they
do require one to use a cursor with FOR UPDATE semantics when
using WHERE CURRENT OF - or do they allow WHERE CURRENT OF even
for non-FOR UPDATE cursors?

If one were to restrict an implementation of WHERE CURRENT OF to
FOR UPDATE, WITHOUT HOLD cursors, the only non-trivial problem that
I can see is how to support more than one update of the same row.


Because as far as I can see, if you'd do
begin;
declare foo cursor for select * from bar for update;
fetch foo into v_foo ;
update bar set ... where ctid = v_foo.ctid ;
update bar set ... where ctid = v_foo.ctid ;
commit;

the second update would silently be ignored. But since only
updates happening in the same transaction would need to be
tracked, this should be much easier to do than supporting
the non-FOR UPDATE case.

greetings, Florian Pflug




Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Joshua D. Drake

Josh Berkus wrote:

Neil,


(FWIW, I'd be fairly comfortable hacking on PL/Ruby, as I have some
prior experience with Ruby and its C API.)


Well, if you're willing to be a maintainer, that removes a major roadblock.



O.k. so we don't lose this. Do we want to work on PL/Ruby in core or not?

Joshua D. Drake


--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/





Re: [HACKERS] [PATCHES] Generic Monitoring Framework with DTrace patch

2006-07-24 Thread Peter Eisentraut
I've committed the dtrace patch.  Some documentation would be nice now ...

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Peter Eisentraut
On Monday, 24 July 2006 18:23, Joshua D. Drake wrote:
 O.k. so we don't lose this. Do we want to work on PL/Ruby in core or not?

Unless you plan to fork or hijack the package, we need to hear from the author 
first.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] [PATCHES] Resurrecting per-page cleaner for btree

2006-07-24 Thread Tom Lane
ITAGAKI Takahiro [EMAIL PROTECTED] writes:
 This is a revised patch originated by Junji TERAMOTO for HEAD.
   [BTree vacuum before page splitting]
   http://archives.postgresql.org/pgsql-patches/2006-01/msg00301.php
 I think we can resurrect his idea because we will scan btree pages
 page-at-a-time now; the missing-restarting-point problem went away.
 Have I missed something? Comments welcome.

I think the only serious objection to this would be that it'd mean that
tuples that should have an index entry might not have one.  The current
form of VACUUM does not care, but people keep raising the idea of doing
retail vacuuming that operates by looking up index entries explicitly.
You could certainly make a retail vacuumer do nothing if it fails to
find the expected index entry, but ISTM that'd be a rather serious loss
of consistency checking --- you could not tell the someone-already-
deleted-it case apart from a bug in the vacuumer's index value
computation or lookup.

Personally I don't think retail vacuuming in that form will ever fly
anyway, so I have no problem with installing the proposed patch,
but I thought I'd better throw this comment out to see if anyone
thinks it's a big deal.

regards, tom lane



Re: [HACKERS] Adding a pgbench run to buildfarm

2006-07-24 Thread Bort, Paul
Andrew Dunstan wrote:
 
 We are really not going to go in this direction. If you want ideal 
 performance tests then a heterogenous distributed collection  of 
 autonomous systems like buildfarm is not what you want.
 
 You are going to have to live with the fact that there will be 
 occasional, possibly even frequent, blips in the data due to other 
 activity on the machine.
 
 If you want tightly controlled or very heavy load testing this is the 
 wrong vehicle.
 
 You might think that what that leaves us is not worth having - the 
 consensus in Toronto seemed to be that it is worth having, 
 which is why 
 it is being pursued.
 

I wasn't at the conference, but the impression I'm under is that the
point of this isn't to catch a change that causes a 1% slowdown; the
point is to catch much larger problems, probably 20% slowdown or more.

Given the concerns about running this on machines that don't have a lot
of CPU and disk to spare, should it ship disabled?

Andrew, what do you think of pgbench reports shipping separately? I have
no idea how the server end is set up, so I don't know how much of a pain
that would be. 

Regards,
Paul Bort

P.S. My current thought for settings is scaling factor 10, users 5,
transactions 1000.

 



Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Alvaro Herrera
Joshua D. Drake wrote:
 Josh Berkus wrote:
 Neil,
 
 (FWIW, I'd be fairly comfortable hacking on PL/Ruby, as I have some
 prior experience with Ruby and its C API.)
 
 Well, if you're willing to be a maintainer, that removes a major roadblock.
 
 
 O.k. so we don't lose this. Do we want to work on PL/Ruby in core or not?

Side question -- is it plRuby or PL/Ruby?  We should be consistent.  I
just noticed the top-level README file has all the wrong names -- what
is pl/c for starters?  Or plPgsql?  We've _never_ used those names.

Also some time ago I convinced you that the actual name for the PHP
stuff was PL/php and you agreed.  Yet I see on the README the name
plPHP which manages to not get a single letter correctly capitalized!

I'll patch the README later, but consider this a call for future
consistency ...  (I'd like to know what do we call pl/c though.  It's
just C, right?)

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Peter Eisentraut
On Monday, 24 July 2006 18:49, Alvaro Herrera wrote:
 Side question -- is it plRuby or PL/Ruby?  We should be consistent.  I
 just noticed the top-level README file has all the wrong names -- what
 is pl/c for starters?  Or plPgsql?  We've _never_ used those names.

I'm beginning to think that this is part of some obscure plot by Joshua Drake 
to confuse people.  I advise all committers not to take any documentation 
patches from him without careful scrutiny.

I've fixed the README.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] Adding a pgbench run to buildfarm

2006-07-24 Thread Andrew Dunstan

Bort, Paul wrote:

Given the concerns about running this on machines that don't have a lot
of CPU and disk to spare, should it ship disabled?
  


Yes, certainly.


Andrew, what do you think of pgbench reports shipping separately? I have
no idea how the server end is set up, so I don't know how much of a pain
that would be. 



  


Well, we'll need to put in some changes to collect the data, certainly. 
I don't see why we shouldn't ship the pgbench result separately, but ...



P.S. My current thought for settings is scaling factor 10, users 5,
transactions 1000.

  


... at this size it's hardly worth it. A quick test on my laptop showed 
this taking about a minute for the setup and another minute for the run.
Unless we scale way beyond this, I don't see any point in separate reporting.



cheers

andrew



Re: [HACKERS] inclusion of hstore software in main tarball

2006-07-24 Thread Nicolai Petri
On Wednesday 03 May 2006 19:49, Jim C. Nasby wrote:
 On Wed, May 03, 2006 at 07:06:09PM +0200, hubert depesz lubaczewski wrote:
  hi
  some time ago i was told on postgresql-general about existence of
  hstore package for postgresql.
  as i understand it is being developed by the same guys that are behind
  tsearch2, gin and (at least) recent changes in gist.
 
  would it be possible to include this software in main postgresql tarball?
  it would be best if it came as standard feature, but contrib module would
  also work.

 Why put it in contrib unless it depends on being in there? Just grab it
 from pgFoundry if you need it.
Doesn't this apply to everything in /contrib? I think hstore is such a small 
module that it would fit nicely in contrib. For me it really sucks to have all 
users of our software download and compile it, especially when they are used 
to just installing binary RPMs.

If people hate contrib so much, why not just get rid of it forever? In my 
opinion, it should either embrace as many small contrib modules as possible, 
or else disappear.

cheers,
Nicolai Petri



[HACKERS] Getting current transaction id

2006-07-24 Thread Nicolai Petri
Hello all

I need my custom-written replication engine to obtain the current 
transaction id from a trigger function. As far as I'm told, it's not possible 
today. Would people object much if this functionality were added for 8.2? 
It's the last piece of PostgreSQL C code I have left in my application, and I 
think it would be a nice piece of information in general to have available 
for users; if not for anything else, then for simple statistics.

I attached the function I use with great success today.


Best regards,
Nicolai Petri


-
#include "postgres.h"
#include "fmgr.h"
#include "access/xact.h"

PG_FUNCTION_INFO_V1(get_transaction_id);

Datum
get_transaction_id(PG_FUNCTION_ARGS)
{
	TransactionId curxact = GetTopTransactionId();

	PG_RETURN_INT32(curxact);
}




Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Joshua D. Drake

Peter Eisentraut wrote:

On Monday, 24 July 2006 18:23, Joshua D. Drake wrote:

O.k. so we don't lose this. Do we want to work on PL/Ruby in core or not?


Unless you plan to fork or hijack the package, we need to hear from the author 
first.


What do you want to hear? I have the email correspondence asking for 
the relicense and the approval to submit.


Is there something specific you are looking for?

Joshua D. Drake





--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/





Re: [HACKERS] [PATCHES] Allow commenting of variables in postgresql.conf to - try 4

2006-07-24 Thread Peter Eisentraut
On Monday, 24 July 2006 16:55, Stephen Frost wrote:
 #2: That variable can *not* be changed by a reload.
 Notice-level message is sent to the log notifying the admin that the
   change requested could not be performed.

This already happens.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Joshua D. Drake

Peter Eisentraut wrote:

On Monday, 24 July 2006 18:49, Alvaro Herrera wrote:

Side question -- is it plRuby or PL/Ruby?  We should be consistent.  I
just noticed the top-level README file has all the wrong names -- what
is pl/c for starters?  Or plPgsql?  We've _never_ used those names.


I'm beginning to think that this is part of some obscure plot by Joshua Drake 
to confuse people. 


I sincerely hope you are kidding.

I advise all committers not to take any documentation 
patches from him without careful scrutiny.


Gah... aren't you just all sour grapes. The README was reviewed by 
several people; in fact it went through two versions on the patches list.


Sorry that nobody caught it (including myself), but good lord it isn't 
that big of a deal.


Joshua D. Drake




I've fixed the README.




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/





Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Joshua D. Drake

Peter Eisentraut wrote:

On Monday, 24 July 2006 18:49, Alvaro Herrera wrote:

Side question -- is it plRuby or PL/Ruby?  We should be consistent.  I
just noticed the top-level README file has all the wrong names -- what
is pl/c for starters?  Or plPgsql?  We've _never_ used those names.


I'm beginning to think that this is part of some obscure plot by Joshua Drake 
to confuse people.  I advise all committers not to take any documentation 
patches from him without careful scrutiny.


I've fixed the README.


As a secondary note to this, I am by far not the only person who makes 
this mistake. A simple search of the archives shows that.


Sincerely,

Joshua D. Drake




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/





[HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Marc Munro
I want Veil (http://pgfoundry.org/projects/veil/) to be able to read
configuration details from a configuration file.  Rather than implement
my own config file parser, I'd like to be able to re-use the parser
defined in guc-file.l

If this is not contentious, I will submit a patch to make the parser
available to add-ins.
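
For reference, the shape of what guc-file.l accepts can be sketched outside the backend. The following is a deliberately simplified stand-in (Python, purely for illustration; it is not the actual flex grammar, and it ignores quoted '#' characters and include handling):

```python
import re

def parse_conf(text):
    """Parse postgresql.conf-style 'name = value' lines.  Simplified:
    '#' starts a comment, the '=' is optional, values may be
    single-quoted with '' as an escaped quote."""
    opts = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()     # strip comments
        if not line:
            continue
        m = re.match(r"(\w+)\s*=?\s*(.+)", line)
        if m is None:
            continue
        name, value = m.group(1), m.group(2).strip()
        if len(value) >= 2 and value[0] == value[-1] == "'":
            value = value[1:-1].replace("''", "'")
        opts[name] = value
    return opts

sample = """
# memory settings
shared_buffers = 8000
search_path 'veil, public'
"""
print(parse_conf(sample))
# {'shared_buffers': '8000', 'search_path': 'veil, public'}
```

Exposing the real parser to add-ins would spare each one from reinventing (and subtly diverging from) exactly this grammar.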

__
Marc




Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Alvaro Herrera
Joshua D. Drake wrote:
 Peter Eisentraut wrote:
 On Monday, 24 July 2006 18:49, Alvaro Herrera wrote:
 Side question -- is it plRuby or PL/Ruby?  We should be consistent.  I
 just noticed the top-level README file has all the wrong names -- what
 is pl/c for starters?  Or plPgsql?  We've _never_ used those names.
 
 I'm beginning to think that this is part of some obscure plot by Joshua 
 Drake to confuse people. 
 
 I sincerely hope you are kidding.

I understand that he is.

 I advise all committers not to take any documentation 
 patches from him without careful scrutiny.
 
 Gah... aren't you just all sour grapes. The README was reviewed by 
 several people, in fact it went through two versions to the patches list.

I saw those fly by and my gut feeling was whoever commits this is
_certainly_ going to fix it.  I'm not sure why I didn't comment on it.

 Sorry that nobody caught it (including myself), but good lord it isn't 
 that big of a deal.

Consistency is important.  It may not be _THAT_ big a deal, but we
should be at least a little careful.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [PATCHES] Allow commenting of variables in postgresql.conf to - try 4

2006-07-24 Thread Joachim Wieland
On Mon, Jul 24, 2006 at 07:09:17PM +0200, Peter Eisentraut wrote:
 On Monday, 24 July 2006 16:55, Stephen Frost wrote:
  #2: That variable can *not* be changed by a reload.
  Notice-level message is sent to the log notifying the admin that the
  change requested could not be performed.

 This already happens.

Not if the option gets commented/deleted, i.e.:

shared_buffers = 8000
START
#shared_buffers = 8000
HUP

This does not issue a message at the moment.


Joachim




Re: [HACKERS] [PATCHES] Allow commenting of variables in postgresql.conf to - try 4

2006-07-24 Thread Joachim Wieland
On Mon, Jul 24, 2006 at 10:55:47AM -0400, Stephen Frost wrote:
 #2: That variable can *not* be changed by a reload.
 Notice-level message is sent to the log notifying the admin that the
   change requested could not be performed.  This change could be
   either a revert to reset-value if it was removed/commented-out or an
   explicit change request to a different value.

Right. And what I am voting for is to not only issue such a message once but
every time a SIGHUP is received as long as the actively-used value differs
from the value in the configuration file. One of the reasons for having this
fall-back-to-default-value stuff is to make sure that an admin can restart a
server and be sure that it will behave in the same way as when it was
shut down.

Moreover it's just clearer to send the notice message every time a SIGHUP is
received since every reload is the admin's request to apply all of the
values in the configuration file independently of what has happened in the
past.
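
The policy being argued for here can be modeled in a few lines (Python, purely for illustration; the option names and the notice wording are invented, not PostgreSQL's): on every reload, any restart-only option whose active value still differs from the file produces the notice again.

```python
def reload(active, file_values, restart_only=frozenset({"shared_buffers"})):
    """Model one SIGHUP.  Changeable options are applied; for each
    restart-only option whose active value differs from the file, a
    notice is emitted -- every time, not just on the first reload."""
    notices = []
    for name, new in file_values.items():
        if name in restart_only:
            if active[name] != new:
                notices.append(f'parameter "{name}" cannot be changed '
                               f'after server start')
        else:
            active[name] = new
    return notices

active = {"shared_buffers": 8000, "work_mem": 1024}
file_values = {"shared_buffers": 16000, "work_mem": 2048}

first = reload(active, file_values)     # admin sends SIGHUP
second = reload(active, file_values)    # ... and later another SIGHUP
print(first == second, len(second))     # True 1: the notice repeats
```

Emitting the notice once and then going quiet would leave an admin who missed the first log line unaware that the running value and the file still disagree.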


Joachim




Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Joshua D. Drake


Sorry that nobody caught it (including myself), but good lord it isn't 
that big of a deal.


Consistency is important.  It may not be _THAT_ big a deal, but we
should be at least a little careful.



I do not disagree. All I was saying was that it is a very common mistake 
(see secondary note same thread).


Joshua D. Drake




--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/





Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Peter Eisentraut
Joshua D. Drake wrote:
 What do you want to hear? I have my emails in correspondence asking
 for the relicense and the approval to submit.

 Is there something specific you are looking for?

Either the author is going to abandon development, in which case it might 
make sense to pick up the pieces within the PostgreSQL source tree.  But 
then I'd like to see a specific statement to that effect from him on 
this list.  (I have no reason to believe that he is abandoning it, FWIW.)

Or the author is agreeing to continue maintenance within the PostgreSQL 
source tree.  Then he should personally talk to us about arranging 
commit access.

If it's neither of these, that is, he will continue to maintain PL/Ruby 
by himself, and we're just going to copy code back and forth, then 
we're going to have the pgaccess nightmares all over again, which no 
one is looking forward to.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] [PATCHES] Allow commenting of variables in postgresql.conf to - try 4

2006-07-24 Thread Peter Eisentraut
Joachim Wieland wrote:
 On Mon, Jul 24, 2006 at 07:09:17PM +0200, Peter Eisentraut wrote:
  On Monday, 24 July 2006 16:55, Stephen Frost wrote:
   #2: That variable can *not* be changed by a reload.
   Notice-level message is sent to the log notifying the admin
   that the change requested could not be performed.
 
  This already happens.

 Not if the option gets commented/deleted, i.e.:

 shared_buffers = 8000
 START
 #shared_buffers = 8000
 HUP

 This does not issue a message at the moment.

Because at the moment, the above does not change the value of 
shared_buffers.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] inclusion of hstore software in main tarball

2006-07-24 Thread Alvaro Herrera
Nicolai Petri wrote:

 If people hate contrib so much why not just get rid of it forever.. Either it 
 should embrace as much small contrib modules as possible - else it should 
 disappear in my opinion.

Actually that's partly the intention.  Several contrib modules have been
removed in the recent past, some others have been moved out to
pgFoundry.  Some are waiting for a little more maturity before they are
moved into core.

IMHO what you should really be doing is convincing somebody to create
binary packages for your operating system for the interesting/useful
pgFoundry projects.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [PATCHES] The vacuum-ignore-vacuum patch

2006-07-24 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Hannu Krossing asked me about his patch to ignore transactions running
 VACUUM LAZY in other vacuum transactions.  I attach a version of the
 patch updated to the current sources.

nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
version of the computation?

In general, it seems to me that a transaction running lazy vacuum could
be ignored for every purpose except truncating clog/subtrans.  Since it
will never insert its own XID into the database (note: VACUUM ANALYZE is
run as two separate transactions, hence the pg_statistic rows inserted
by ANALYZE are not a counterexample), there's no need for anyone to
include it as running in their snapshots.  So unless I'm missing
something, this is a safe change for lazy vacuum, but perhaps not for
full vacuum, which *does* put its XID into the database.

A possible objection to this is that it would foreclose running VACUUM
and ANALYZE as a single transaction, exactly because of the point that
we couldn't insert pg_statistic rows using a lazy vacuum's XID.  I think
there was some discussion of doing that in connection with enlarging
ANALYZE's sample greatly --- if ANALYZE goes back to being a full scan
or nearly so, it'd sure be nice to combine it with the VACUUM scan.
However maybe we should just accept that as the price of not having
multiple vacuums interfere with each other.

regards, tom lane



Re: [HACKERS] [PATCHES] The vacuum-ignore-vacuum patch

2006-07-24 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Hannu Krossing asked me about his patch to ignore transactions running
  VACUUM LAZY in other vacuum transactions.  I attach a version of the
  patch updated to the current sources.
 
 nonInVacuumXmin seems useless ... perhaps a vestige of some earlier
 version of the computation?

Hmm ... I remember removing a now-useless variable somewhere, but maybe
this one escaped me.  I don't have the code handy -- will check.

 In general, it seems to me that a transaction running lazy vacuum could
 be ignored for every purpose except truncating clog/subtrans.  Since it
 will never insert its own XID into the database (note: VACUUM ANALYZE is
 run as two separate transactions, hence the pg_statistic rows inserted
 by ANALYZE are not a counterexample), there's no need for anyone to
 include it as running in their snapshots.  So unless I'm missing
 something, this is a safe change for lazy vacuum, but perhaps not for
 full vacuum, which *does* put its XID into the database.

But keep in mind that in the current code, clog truncation takes
relminxid (actually datminxid) into account, not running transactions,
so AFAICS this shouldn't affect anything.

Subtrans truncation is different and it certainly should consider lazy
vacuum's Xids.

 A possible objection to this is that it would foreclose running VACUUM
 and ANALYZE as a single transaction, exactly because of the point that
 we couldn't insert pg_statistic rows using a lazy vacuum's XID.  I think
 there was some discussion of doing that in connection with enlarging
 ANALYZE's sample greatly --- if ANALYZE goes back to being a full scan
 or nearly so, it'd sure be nice to combine it with the VACUUM scan.
 However maybe we should just accept that as the price of not having
 multiple vacuums interfere with each other.

Hmm, what about having a single scan for both, and then starting a
normal transaction just for the sake of inserting the pg_statistics
tuple?

I think the interactions of Xids and vacuum and other stuff are starting
to get complex; IMHO it warrants having a README.vacuum, or something.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[HACKERS] [Fwd: dblink patch - Asynchronous queries and parallel execution]

2006-07-24 Thread Joe Conway

(cross-posted from PATCHES without the attachment)

I just received this (offlist), and have not had a chance to review it
myself yet, but figured I should post it now in case others want to have
a look and comment or discuss before feature freeze.

If there are no major objections to the concept, I'll take
responsibility to review and commit once I'm through with the Values
list-of-targetlists stuff.

(I'm not sure where we finished off with the discussion of PATCHES vs
HACKERS list for this kind of stuff, so I'm going to send another copy
of this to HACKERS without the attachment)

Thanks,

Joe


 Original Message 
Subject: dblink patch - Asynchronous queries and parallel execution
Date: Mon, 24 Jul 2006 12:47:51 +0200
From: Kai Londenberg [EMAIL PROTECTED]
To: [EMAIL PROTECTED]

Hello,

I needed parallel query execution features for a project, and so I
modified the dblink module to add support
for asynchronous query execution.

I thought others might find these features useful as well, therefore I'd
like to contribute this to the
current PostgreSQL contrib/dblink codebase.

The code is based on the contrib/dblink code included with the current
8.1.4 version of PostgreSQL.
I'm including the entire modified contrib/dblink directory in archived form.

I modified dblink.c and dblink.sql.in, and created the file README.async

Hope you like it, and include it in a possible next version of dblink.

The code still needs some testing and code review. I made it work for
me, but I don't have any
experience writing PostgreSQL extensions, and haven't touched C for a while.

The most important thing about this code is that it allows parallel
execution of queries on several
backend databases, and re-joining of their results. This solves a lot of
scalability problems.

This is my corresponding README.async file which describes my additions.

-
dblink-async patch by Kai Londenberg ([EMAIL PROTECTED])
24.7.2006

All code is licensed under the same terms as the rest of the dblink
code.

SQL Function declarations have been added at the bottom of dblink.sql

Added functions:

int dblink_send_query(connstr text, sql text)
  Sends a query to a remote server for asynchronous execution.

  returns immediately without waiting for results.

  returns 1 on success, or 0 on failure.
  results *must* be fetched by dblink_get_result(connstr)
  a running query may be cancelled by dblink_cancel_query(connstr)


dblink_get_result(connstr text[,bool fail_on_error])
  retrieves the result of a query started by dblink_send_query.

  Blocks until a result becomes available.

  This function *must* be called if dblink_send_query returned
  a 1, even on cancelled queries - otherwise the connection
  can't be used anymore.

dblink_get_connections()
  List all open dblink connections by name.
  Returns a comma separated string of all connection names.
  Takes no params

  Example: SELECT string_to_array(dblink_get_connections(), ',');

int dblink_is_busy(connstr)
  returns 1 if connection is busy, 0 if it is not busy.
  If this function returns 0, it is guaranteed that dblink_get_result
  will not block.

text dblink_cancel_query(connstr)
  Cancels a running query on a given connection.
  returns OK on success, or an error message on failure.


Examples:

 Example 1 - Union over parallel executed remote queries --

SELECT dblink_connect('dtest1', 'host=server1 port=5432 dbname=dtest_1
user=duser password=pass');
SELECT * from
   dblink_send_query('dtest1', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

SELECT dblink_connect('dtest2', 'host=server2 port=5432 dbname=dtest_2
user=duser password=pass');
SELECT * from
   dblink_send_query('dtest2', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

SELECT dblink_connect('dtest3', 'host=server3 port=5432 dbname=dtest_3
user=duser password=pass');
SELECT * from
   dblink_send_query('dtest3', 'SELECT country_code, city from
world_cities where city like \'fe%\'') as t1;

CREATE TEMPORARY TABLE result AS
(SELECT * from dblink_get_result('dtest1') as t1(country_code text, city
text))
UNION
(SELECT * from dblink_get_result('dtest2') as t2(country_code text, city
text))
UNION
(SELECT * from dblink_get_result('dtest3') as t3(country_code text, city
text))
ORDER by city DESC LIMIT 100;

SELECT dblink_disconnect('dtest1');
SELECT dblink_disconnect('dtest2');
SELECT dblink_disconnect('dtest3');
SELECT * from result;

--- End of Example 1 ---
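The dispatch/collect pattern behind Example 1 can be sketched outside SQL. The following is an illustrative Python sketch, not the patch's code: the SHARDS data and the send_query/get_result helpers are hypothetical stand-ins for the dblink connections and the dblink_send_query/dblink_get_result calls. It shows why dispatching all queries before collecting any result gives parallel execution, followed by a UNION-style merge.

```python
import threading

# Hypothetical stand-ins for three remote backends; in the real patch these
# would be dblink connections ('dtest1'..'dtest3') running SQL remotely.
SHARDS = {
    "dtest1": [("DE", "felsberg"), ("US", "fernley")],
    "dtest2": [("BR", "feira de santana")],
    "dtest3": [("US", "fernley"), ("MX", "felipe carrillo puerto")],
}

def send_query(conn, results):
    # dblink_send_query analogue: dispatch and return without waiting.
    def run():
        results[conn] = SHARDS[conn]          # pretend remote scan
    t = threading.Thread(target=run)
    t.start()
    return t

def get_result(thread, conn, results):
    # dblink_get_result analogue: block until the result is available.
    thread.join()
    return results[conn]

results = {}
# All three "queries" are dispatched before any result is collected,
# so they run concurrently -- the point of the async interface.
threads = {c: send_query(c, results) for c in SHARDS}
rows = set()
for conn, t in threads.items():
    rows.update(get_result(t, conn, results))   # UNION of the result sets
# ORDER BY city DESC LIMIT 100 analogue:
merged = sorted(rows, key=lambda r: r[1], reverse=True)[:100]
```

The total wall-clock time is roughly that of the slowest shard, not the sum of all shards, which is the scalability win described above.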

best regards,

Kai Londenberg







Re: [HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Peter Eisentraut
Marc Munro wrote:
 I want Veil (http://pgfoundry.org/projects/veil/) to be able to read
 configuration details from a configuration file.

What kind of details?  By the time any server-side module is loaded, the 
configuration file has already been read, so why would you need to read 
it again?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] [PATCHES] The vacuum-ignore-vacuum patch

2006-07-24 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 A possible objection to this is that it would foreclose running VACUUM
 and ANALYZE as a single transaction, exactly because of the point that
 we couldn't insert pg_statistic rows using a lazy vacuum's XID.

 Hmm, what about having a single scan for both, and then starting a
 normal transaction just for the sake of inserting the pg_statistics
 tuple?

We could, but I think memory consumption would be the issue.  VACUUM
wants a lotta memory for the dead-TIDs array, ANALYZE wants a lot for
its statistics gathering ... even more if it's trying to take a larger
sample than before.  (This is probably why we kept them separate in
the last rewrite.)

 I think the interactions of Xids and vacuum and other stuff are starting
 to get complex; IMHO it warrants having a README.vacuum, or something.

Go for it ...

regards, tom lane



Re: [HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Marc Munro wrote:
 I want Veil (http://pgfoundry.org/projects/veil/) to be able to read
 configuration details from a configuration file.

 What kind of details?  By the time any server-side module is loaded, the 
 configuration file has already been read, so why would you need to read 
 it again?

Probably the correct question is whether Marc's problem isn't already
solved by the custom GUC variable mechanism --- that is, whatever he
wants to configure should be defined as custom GUCs within the
existing configuration file.

regards, tom lane



Re: [HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Andrew Dunstan

Peter Eisentraut wrote:

Marc Munro wrote:
  

I want Veil (http://pgfoundry.org/projects/veil/) to be able to read
configuration details from a configuration file.



What kind of details?  By the time any server-side module is loaded, the 
configuration file has already been read, so why would you need to read 
it again?


  


We already have a mechanism to allow custom config sets in 
postgresql.conf. It's used by pl/java and pl/perl, for example. It 
probably needs better documentation.


If you want to use another config file then your module probably should 
use its own parser.


cheers

andrew



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Greg Stark
Tom Lane [EMAIL PROTECTED] writes:

 I'm fairly dissatisfied with the naming of Greg Stark's proposed new
 feature for creating indexes without blocking writers of the table.
 To my mind, ONLINE just doesn't convey any useful information ---
 the existing CREATE INDEX functionality could already be said to be
 online, in the sense that you don't have to take down the database
 to do it.  I thought about SHARED but someone could probably raise
 the same objection to it.  Anyone have a better idea?

I know Oracle calls this online index builds. In fact it works similarly
with a single keyword online tacked on near the end of the create index
statement.

Anyone know what MSSQL or DB2 call it?

 I'm also wondering about where in the command the keyword should go.
 As submitted it's
 
   [ WITH ( <replaceable class="PARAMETER">storage_parameter</replaceable>
 = <replaceable class="PARAMETER">value</replaceable> [, ... ] ) ]
   [ TABLESPACE <replaceable class="parameter">tablespace</replaceable> ]
 + [ ONLINE ]
   [ WHERE <replaceable class="parameter">predicate</replaceable> ]

One thing I didn't like about this is that really all of these clauses should
be legal to put in in any order. I'm not sure that's doable with the WHERE
clause but the others ought to be possible to make an arbitrary list that can
be given in any order. But perhaps that's irrelevant if we don't go with
ONLINE at the end at all.

 which seems a bit randomly chosen; what's more it creates a problem for
 psql, which would have to parse nearly the entire command to discover
 whether it's safe to execute inside a transaction block or not.

One thing to think about, what will the command to execute stored procedures
look like? Those will also need to be called from outside a transaction.

I keep coming back to this feeling that the server should be the one starting
the transaction, not psql. But then that could just be my experience with
Oracle. In Oracle you're never outside a transaction. Transactions
implicitly start the first time you execute almost any statement.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Marc Munro
On Mon, 2006-07-24 at 20:31 +0200, Peter Eisentraut wrote:
 Marc Munro wrote:
  I want Veil (http://pgfoundry.org/projects/veil/) to be able to read
  configuration details from a configuration file.
 
 What kind of details?  By the time any server-side module is loaded, the 
 configuration file has already been read, so why would you need to read 
 it again?
 
I want to read a veil config file rather than the postgres config file.
Basically, I just want access to the rather nice config file parser that
already exists.

As for the kind of details:
- how much shared memory to allocate for veil
- the size of the hash tables for veil's shared variables

__
Marc




Re: [HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Marc Munro
On Mon, 2006-07-24 at 14:44 -0400, Tom Lane wrote:
 Peter Eisentraut [EMAIL PROTECTED] writes:
  Marc Munro wrote:
  I want Veil (http://pgfoundry.org/projects/veil/) to be able to read
  configuration details from a configuration file.
 
  What kind of details?  By the time any server-side module is loaded, the 
  configuration file has already been read, so why would you need to read 
  it again?
 
 Probably the correct question is whether Marc's problem isn't already
 solved by the custom GUC variable mechanism --- that is, whatever he
 wants to configure should be defined as custom GUCs within the
 existing configuration file.

I don't think it's already solved but I may be missing the point.  The
Veil shared library will be loaded by process_preload_libraries only
after the postgresql config file has been read.  I was assuming that at
this point it would be too late to specify custom GUCs.

Instead I want to load my own veil.conf file which would support the
same sort of syntax as postgresql.conf.

My proposal is to simply expose a new function processAddinConfigFile()
which would mimic ProcessConfigFile but would call a user-supplied
function to deal with each entry.  Obviously, this would be usable by
future add-ins and not just Veil.

If there is a better way to do this please tell me.
__
Marc






Re: [HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Tom Lane
Marc Munro [EMAIL PROTECTED] writes:
 I don't think it's already solved but I may be missing the point.  The
 Veil shared library will be loaded by process_preload_libraries only
 after the postgresql config file has been read.  I was assuming that at
 this point it would be too late to specify custom GUCs.

No, being able to do that is exactly the point of the custom-GUC
mechanism.
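For reference, a hedged sketch of what the custom-GUC route might look like for Veil. The veil.* parameter names and values are hypothetical, and custom_variable_classes reflects the 8.2-era mechanism; the module would then register the variables (e.g. via DefineCustomIntVariable) in its _PG_init function:

```
# postgresql.conf -- illustrative sketch only; "veil" settings are hypothetical.
# Declare the class so unrecognized "veil.*" settings are accepted at startup,
# before the veil shared library has been loaded:
custom_variable_classes = 'veil'
shared_preload_libraries = 'veil'

# Custom settings the module later validates when it loads:
veil.shared_memory_size = 16384    # shared memory to allocate for veil
veil.shared_hash_elems = 32        # size of the shared-variable hash tables
```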

regards, tom lane



Re: [HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Marc Munro
On Mon, 2006-07-24 at 15:17 -0400, Tom Lane wrote:
 No, being able to do that is exactly the point of the custom-GUC
 mechanism.

Excellent.  I shall study this and hope to bother you no further :-)

Thanks.
__
Marc




Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Bort, Paul
 
Greg Stark asked:
 I know Oracle calls this online index builds. In fact it 
 works similarly
 with a single keyword online tacked on near the end of the 
 create index
 statement.
 
 Anyone know what MSSQL or DB2 call it?
 

I have to live with MSSQL at work, and I didn't remember anything like
this, so I looked up the syntax for CREATE INDEX
(http://msdn.microsoft.com/library/default.asp?url=/library/en-us/tsqlre
f/ts_tsqlcon_6lyk.asp), and I can't find anything that gives the user
control over this. The closest is this note in the remarks:

Backup and CREATE INDEX operations do not block each other. If a
backup is in progress, index is created in a fully logged mode, which
may require extra log space.

It sounds to me like they fall back to 'online' index creation if a
backup is in progress, but give the user no control over it. I also
looked in the settings and didn't see anything relevant.

Regards,
Paul Bort



Re: [HACKERS] [PATCHES] Resurrecting per-page cleaner for btree

2006-07-24 Thread Heikki Linnakangas

On Mon, 24 Jul 2006, Tom Lane wrote:


Personally I don't think retail vacuuming in that form will ever fly
anyway, so I have no problem with installing the proposed patch,
but I thought I'd better throw this comment out to see if anyone
thinks it's a big deal.


My feeling is that retail vacuuming would be useful some day. But it's 
certainly not going to be there in 8.2 so I have no objection to the 
patch. It's a fairly localized change; it can easily be reverted later if 
necessary.


- Heikki



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Alvaro Herrera
Greg Stark wrote:
 Tom Lane [EMAIL PROTECTED] writes:
 

  I'm also wondering about where in the command the keyword should go.
  As submitted it's
  
    [ WITH ( <replaceable class="PARAMETER">storage_parameter</replaceable>
      = <replaceable class="PARAMETER">value</replaceable> [, ... ] ) ]
    [ TABLESPACE <replaceable class="parameter">tablespace</replaceable> ]
  + [ ONLINE ]
    [ WHERE <replaceable class="parameter">predicate</replaceable> ]
 
 One thing I didn't like about this is that really all of these clauses should
 be legal to put in in any order. I'm not sure that's doable with the WHERE
 clause but the others ought to be possible to make an arbitrary list that can
 be given in any order. But perhaps that's irrelevant if we don't go with
 ONLINE at the end at all.

If you were going to propose something like the INCLUDING CONSTRAINTS
or EXCLUDING CONSTRAINTS stuff, which you can specify multiple times
and then the last one takes precedence, then I personally don't like it.
IMHO it's not nice that we accept that kind of thing.

  which seems a bit randomly chosen; what's more it creates a problem for
  psql, which would have to parse nearly the entire command to discover
  whether it's safe to execute inside a transaction block or not.
 
 One thing to think about, what will the command to execute stored procedures
 look like? Those will also need to be called from outside a transaction.

CALL presumably?

 I keep coming back to this feeling that the server should be the one starting
 the transaction, not psql. But then that could just be my experience with
 Oracle. In Oracle you're never outside a transaction. Transactions
 implicitly start the first time you execute almost any statement.

Well, the problem is the AUTOCOMMIT mode.  In that case it isn't the
server starting the transaction, but the client app.  We already had the
server starting the transaction (back in 7.3 IIRC) and it was such a
problem that it was rolled back, and instead the autocommit logic was
put into the client apps.

In Postgres, just like in Oracle, you're never outside of a transaction
as well.  The only difference is when does the transaction end -- in
Postgres, it's either right after the command, or when COMMIT/ROLLBACK
is issued, depending on whether the user (or the client app) issued
BEGIN or not.  But you already knew that.  So the only problem is that
psql should be figuring out if it should send a BEGIN or not, which
depends on the command being executed.  In the normal CREATE INDEX case,
psql should send a BEGIN right before.  But in the ONLINE CREATE INDEX
case, psql should figure out that it must *not* send the BEGIN.

The server logic does not change in either case.  It only knows to
reject the ONLINE CREATE INDEX when inside a transaction block, because
it cannot possibly roll it back.
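The client-side decision described above can be sketched as follows. This is illustrative only, not psql's actual logic, and the ONLINE rule is hypothetical since the syntax is still under discussion; it also illustrates Tom's point that psql would have to scan nearly the whole command to find a trailing keyword.

```python
# Statements psql must not wrap in BEGIN when autocommit is off.
# This list is a small illustrative subset, not psql's real table.
NO_BEGIN = ("VACUUM", "CREATE DATABASE", "CREATE TABLESPACE")

def needs_begin(sql):
    """Decide whether the client should send BEGIN before this command."""
    upper = sql.upper()
    if any(upper.startswith(p) for p in NO_BEGIN):
        return False
    words = upper.split()
    # Hypothetical rule for the proposed feature: CREATE INDEX ... ONLINE
    # cannot run inside a transaction block, so no BEGIN -- note the whole
    # command must be scanned because the keyword sits near the end.
    if words[:2] == ["CREATE", "INDEX"] and "ONLINE" in words:
        return False
    return True

needs_begin("CREATE INDEX i ON t (c)")          # ordinary case: wrap in BEGIN
needs_begin("CREATE INDEX i ON t (c) ONLINE")   # must run outside a block
```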

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Peter Eisentraut
Greg Stark wrote:
 One thing to think about, what will the command to execute stored
 procedures look like? Those will also need to be called from outside
 a transaction.

Huh?  Only if you invent your own stored-procedure theory or have a 
hitherto unknown interpretation of the SQL standard.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



[HACKERS] effective_cache_size is a real?

2006-07-24 Thread Peter Eisentraut
Is it intentional that effective_cache_size is a real (as opposed to 
integer)?  The initial revision of guc.c already has it that way, so it 
was probably blindly adapted from the previous adhockery that had all 
planner variables be doubles.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] [PATCHES] Generic Monitoring Framework with DTrace patch

2006-07-24 Thread Robert Lor

Excellent! I'll submit a doc patch shortly.

Regards,
-Robert

Peter Eisentraut wrote:


I've committed the dtrace patch.  Some documentation would be nice now ...

 






Re: [HACKERS] TODO: Mark change-on-restart-only values in

2006-07-24 Thread Jim Nasby

On Jul 17, 2006, at 2:48 PM, Zdenek Kotala wrote:

Josh Berkus wrote:

Zdenek,

I would like to implement Mark change-on-restart-only values in
postgresql.conf item. Anybody works on this? Does it mean add extra
comment to postgresql.conf for variable which has PG_POSTMASTER  
context?

Somehow I thought you'd already submitted a patch?
I sent a patch for a related problem, where somebody comments out an
item in the configuration file.  This item looks easy, but I am
surprised that it does not have a % prefix.  The question is whether
it is only about adding extra comments to the postgresql.conf file.


If memory serves, we also wanted to have the postmaster log a warning  
if it reloaded postgresql.conf and found values that had changed but  
wouldn't take effect until restart. So presumably the postmaster  
would have to keep an in-memory copy of the parsed postgresql.conf  
and compare that with the parsed copy it gets after reloading  
postgresql.conf.
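A minimal sketch of that comparison, assuming the postmaster keeps the previously parsed file as a name-to-value mapping; the CONTEXT table and parameter names are illustrative, not guc.c's actual representation:

```python
# Illustrative GUC contexts: "postmaster" means restart-only (PG_POSTMASTER).
CONTEXT = {"shared_buffers": "postmaster", "work_mem": "user"}

def warn_on_reload(old_conf, new_conf):
    """Compare the old and freshly reloaded config; collect warnings for
    restart-only parameters whose value changed."""
    warnings = []
    for name, value in new_conf.items():
        if old_conf.get(name) != value and CONTEXT.get(name) == "postmaster":
            warnings.append(
                f'parameter "{name}" cannot be changed without a restart')
    return warnings

old = {"shared_buffers": "1000", "work_mem": "1024"}
new = {"shared_buffers": "2000", "work_mem": "2048"}
warn_on_reload(old, new)   # warns only about shared_buffers
```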

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Alvaro Herrera
Peter Eisentraut wrote:
 Greg Stark wrote:
  One thing to think about, what will the command to execute stored
  procedures look like? Those will also need to be called from outside
  a transaction.
 
 Huh?  Only if you invent your own stored-procedure theory or have a 
 hitherto unknown interpretation of the SQL standard.

We've discussed a couple of times that stored procedures should have
enough abilities to start and commit transactions on their own, to be
useful to administration tasks like vacuuming.  It's not something that
has been set in stone but it's something to consider.

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[HACKERS] Units in postgresql.conf -- How to report?

2006-07-24 Thread Peter Eisentraut
So assuming we allowed units in postgresql.conf, how would you report 
them with SHOW?

1. The way they were set (hard)

2. Without units (not user-friendly)

3. Always in base units (seconds or bytes)

4. The largest unit that gives an integer

(4) seems the most reasonable to me in terms of interface and 
implementation.
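A minimal sketch of option (4) for byte-valued settings; the unit table is illustrative and ignores block-sized units like 8kB pages:

```python
# Largest-unit-that-gives-an-integer display, option (4) above.
UNITS = [("GB", 1024**3), ("MB", 1024**2), ("kB", 1024)]

def show_bytes(n):
    for suffix, size in UNITS:
        if n and n % size == 0:
            return f"{n // size}{suffix}"   # exact in this unit
    return str(n)                            # fall back to base units

show_bytes(8 * 1024**2)   # reported as "8MB"
show_bytes(6144)          # reported as "6kB"
show_bytes(500)           # no unit divides it, so "500"
```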

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] TODO: Mark change-on-restart-only values in

2006-07-24 Thread Peter Eisentraut
Jim Nasby wrote:
 If memory serves, we also wanted to have the postmaster log a warning
 if it reloaded postgresql.conf and found values that had changed but
 wouldn't take effect until restart.

It already does that.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] effective_cache_size is a real?

2006-07-24 Thread Simon Riggs
On Mon, 2006-07-24 at 22:55 +0200, Peter Eisentraut wrote:
 Is it intentional that effective_cache_size is a real (as opposed to 
 integer)?  The initial revision of guc.c already has it that way, so it 
 was probably blindly adapted from the previous adhockery that had all 
 planner variables be doubles.

Makes no sense to me as a real. It should be an integer, since it is the
effective number of cache pages, not KB, MB or GB.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com




[HACKERS] Back online; Trip postponed

2006-07-24 Thread Bruce Momjian
FYI, I have been offline for the past few days because of storm damage
in my area.  Things have been repaired and my server is back online.

Also, my trip to the UK, India, and Pakistan that I announced at our
10th anniversary has been postponed until September.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [Fwd: dblink patch - Asynchronous queries and parallel

2006-07-24 Thread Josh Berkus

Joe Conway wrote:

(cross-posted from PATCHES without the attachement)

I just received this (offlist), and have not had a chance to review it
myself yet, but figured I should post it now in case others want to have
a look and comment or discuss before feature freeze.


Is there a downside to this patch?  I can't see any.

--Josh



Re: [HACKERS] effective_cache_size is a real?

2006-07-24 Thread Josh Berkus

Peter,

Is it intentional that effective_cache_size is a real (as opposed to 
integer)?  The initial revision of guc.c already has it that way, so it 
was probably blindly adapted from the previous adhockery that had all 
planner variables be doubles.




I believe that it's a real because the other query estimate variables 
are reals.  Might be worth checking the estimation code to make sure 
that changing the type won't break anything.


--Josh



[HACKERS] Help! - Slony-I - saving/setting/restoring GUC

2006-07-24 Thread Chris Browne
In support of PG 8.2, we need to have the log trigger function do the
following:
  - Save the value of standard_conforming_strings
  - Set standard_conforming_strings to FALSE
  - proceed with saving data to sl_log_?
  - Restore the value of standard_conforming_strings

The variable, standard_conforming_strings, does not appear to be
widely exported, but rather seems pretty localized to guc.c

This is the prime thing holding us from RC2.

Tom, Peter, you have touched guc.c in the context of
standard_conforming_strings; perhaps you can suggest something?  Darcy
and I aren't seeing what to do...
-- 
let name=cbbrowne and tld=cbbrowne.com in name ^ @ ^ tld;;
http://www3.sympatico.ca/cbbrowne/
Why do we drive on parkways and park on driveways?



Re: [HACKERS] Help! - Slony-I - saving/setting/restoring GUC

2006-07-24 Thread Peter Eisentraut
Chris Browne wrote:
 In support of PG 8.2, we need to have the log trigger function do the
 following:
   - Save the value of standard_conforming_strings
   - Set standard_conforming_strings to FALSE
   - proceed with saving data to sl_log_?
   - Restore the value of standard_conforming_strings

Would SET LOCAL help you?
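If reverting only at end of transaction is acceptable, a sketch of that suggestion might look like this; it is illustrative only (the sl_log_? write is Slony-specific context), and note that when issued from inside a trigger, the SET LOCAL would persist for the remainder of the surrounding user transaction:

```sql
-- SET LOCAL reverts automatically at transaction end, so there is no need
-- to save and restore the setting by hand within a single transaction.
BEGIN;
SET LOCAL standard_conforming_strings = off;
-- ... log-trigger work that writes to sl_log_? happens here ...
COMMIT;   -- the previous value of the setting is back in effect
```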

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/



Re: [HACKERS] plPHP and plRuby

2006-07-24 Thread Joshua D. Drake

Peter Eisentraut wrote:

Joshua D. Drake wrote:

What do you want to hear? I have my emails in correspondence asking
for the relicense and the approval to submit.

Is there something specific you are looking for?


Either the author is going to abandon development, then it might make 
sense to pick up the pieces within the PostgreSQL source tree.  But 
then I'd like to see a specific statement to that effect from him on 
this list.  (I have no reason to believe that he is abandoning, FWIW.)


Or the author is agreeing to continue maintenance within the PostgreSQL 
source tree.  Then he should personally talk to us about arranging 
commit access.


If it's neither of these, that is, he will continue to maintain PL/Ruby 
by himself, and we're just going to copy code back and forth, then 
we're going to have the pgaccess nightmares all over again, which no 
one is looking forward to.


O.k. yes that all makes sense. I will contact him and see if I can get a 
thread going between us all.


Joshua D. Drake







--

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/





Re: [HACKERS] [PATCHES] Resurrecting per-page cleaner for btree

2006-07-24 Thread Bruce Momjian
Tom Lane wrote:
 ITAGAKI Takahiro [EMAIL PROTECTED] writes:
  This is a revised patch originated by Junji TERAMOTO for HEAD.
[BTree vacuum before page splitting]
http://archives.postgresql.org/pgsql-patches/2006-01/msg00301.php
  I think we can resurrect his idea because we will scan btree pages
  page-at-a-time now; the missing-restarting-point problem went away.
  Have I missed something? Comments welcome.
 
 I think the only serious objection to this would be that it'd mean that
 tuples that should have an index entry might not have one.  The current
 form of VACUUM does not care, but people keep raising the idea of doing
 retail vacuuming that operates by looking up index entries explicitly.
 You could certainly make a retail vacuumer do nothing if it fails to
 find the expected index entry, but ISTM that'd be a rather serious loss
 of consistency checking --- you could not tell the someone-already-
 deleted-it case apart from a bug in the vacuumer's index value
 computation or lookup.
 
 Personally I don't think retail vacuuming in that form will ever fly
 anyway, so I have no problem with installing the proposed patch,
 but I thought I'd better throw this comment out to see if anyone
 thinks it's a big deal.

Agreed.  Reverse lookup of index entries will always be too slow.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Windows buildfarm support, or lack of it

2006-07-24 Thread Jim Nasby

On Jul 16, 2006, at 3:08 PM, Andrew Dunstan wrote:

Kris Jurka wrote:
  For my cygwin buildfarm member I setup cron, but the make step  
failed for every build for unknown reasons while succeeding if not  
run from cron.



Is this still happening? We should try to get to the bottom of it.


Try setting a more complete $PATH; there's a good chance that make  
isn't able to find something it needs.

--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software    http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461





Re: [HACKERS] On-disk bitmap index patch

2006-07-24 Thread Jie Zhang


On 7/24/06 6:59 AM, Hannu Krosing [EMAIL PROTECTED] wrote:

 On Sun, 2006-07-23 at 20:25, Tom Lane wrote:
 Gavin Sherry [EMAIL PROTECTED] writes:
 On Sun, 23 Jul 2006, Tom Lane wrote:
 However, the main problem I've got with this is that a new index AM is a
 pretty large burden, and no one's made the slightest effort to sell
 pghackers on taking this on.
 
 For low cardinality sets, bitmaps greatly outperform btree.
 
 If the column is sufficiently low cardinality, you might as well just do
 a seqscan --- you'll be hitting most of the heap's pages anyway.  I'm
 still waiting to be convinced that there's a sweet spot wide enough to
 justify supporting another index AM.  (I'm also wondering whether this
 doesn't overlap the use-case for GIN.)
 
 IIRC they quoted the cardinality of 1 as something that is still
 faster than btree for several usecases.
 
 And also for AND-s of several indexes, where indexes are BIG, your btree
 indexes may be almost as big as tables but the resulting set of pages is
 small.

Yeah, Hannu points it out very well -- the bitmap index works very well when
columns have low cardinalities and AND operations will produce a small number
of results.

Also, the bitmap index is very small in low cardinality cases, whereas the
btree tends to take up at least 10 times more space.





Re: [HACKERS] pg_dump: add option to ignore TABLE DATA for failed TABLE object creation

2006-07-24 Thread Jim Nasby
You should submit it to -patches; at the least, it should either get
into the queue or you should get a reason why it didn't.


On Jul 17, 2006, at 4:27 PM, Martin Pitt wrote:

some time ago I started a discussion [1] here about modifying pg_dump
to not restore TABLE DATA objects if the corresponding TABLE object
failed to be created (usually because it already exists, but it might
fail due to a different error like a nonexisting data type). We need
this to provide automatic major version upgrades for databases with
extensions like PostGIS. Tom's reply [3] seemed to indicate that this
was not entirely crackful, so I implemented his approach, and after
some feedback I now have a fairly clean patch that works very well.

The patch was scheduled for review and inclusion [4], and indeed the
page had the patch for a while, but after some time it vanished.

Can you please reconsider this? If there is still a problem with the
patch, I'd like to work on it until it meets your standards.

For your convenience I attach the current patch version; a test script
[5] is also available (the ML kills shell script attachments, so I put
it on a Debian server). It does not alter the default behaviour, it
just adds a new option -X no-data-for-failed-tables. If you think this
mode should be the default, I'm happy to change it that way.

Thank you a lot!

Martin

[1] http://archives.postgresql.org/pgsql-hackers/2006-02/msg00694.php
[2] http://bugs.debian.org/351571
[3] http://archives.postgresql.org/pgsql-hackers/2006-02/msg00716.php
[4] http://archives.postgresql.org/pgsql-hackers/2006-02/msg01253.php
[5] http://people.debian.org/~mpitt/test-pg_restore-existing.sh

--
Martin Pitt    http://www.piware.de
Ubuntu Developer   http://www.ubuntu.com
Debian Developer   http://www.debian.org

In a world without walls and fences, who needs Windows and Gates?
13-pg_restore-ignore-failing-tables.patch


--
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461





Re: [HACKERS] On-disk bitmap index patch

2006-07-24 Thread Bruce Momjian
Jie Zhang wrote:
  IIRC they quoted the cardinality of 1 as something that is still
  faster than btree for several usecases.
  
  And also for AND-s of several indexes, where indexes are BIG, your btree
  indexes may be almost as big as tables but the resulting set of pages is
  small.
 
 Yeah, Hannu points it out very well -- the bitmap index works very well when
 columns have low cardinalities and AND operations will produce small number
 of results.

What operations on columns of low cardinality produce a small number of
results?  That seems contradictory.

 Also, the bitmap index is very small in low cardinality cases, where the
 btree tends to take up at least 10 times more space.

Also, is adding/changing rows more expensive with bitmaps than with
btrees?

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Units in postgresql.conf -- How to report?

2006-07-24 Thread David Fetter
On Mon, Jul 24, 2006 at 11:13:53PM +0200, Peter Eisentraut wrote:
 So assuming we allowed units in postgresql.conf, how would you report 
 them with SHOW?
 
 1. The way they were set (hard)
 
 2. Without units (not user-friendly)
 
 3. Always in base units (seconds or bytes)
 
 4. The largest unit that gives an integer
 
 (4) seems the most reasonable to me in terms of interface and 
 implementation.

I'm for (4), as it's also what people are used to from things like
GNU's -h option.
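Option (4) amounts to a loop over units from largest to smallest, picking the first one that divides the value evenly. A minimal sketch for byte-valued settings (unit names and factors here are illustrative, not lifted from the actual GUC code):

```python
# Report a byte-valued setting in the largest unit that divides it
# evenly -- option (4) above.  The unit list is illustrative only.
UNITS = [("GB", 1024 ** 3), ("MB", 1024 ** 2), ("kB", 1024), ("B", 1)]

def show_bytes(value_in_bytes: int) -> str:
    for name, factor in UNITS:
        if value_in_bytes % factor == 0:
            return f"{value_in_bytes // factor}{name}"
    return f"{value_in_bytes}B"  # unreachable: the "B" entry always divides

print(show_bytes(8 * 1024 ** 2))  # 8MB
print(show_bytes(1536))           # 1536B -- no larger unit divides evenly
```

A value like 1536 bytes falls back to base units, so SHOW never reports a fractional number.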

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!



Re: [HACKERS] RESET CONNECTION?

2006-07-24 Thread Bruce Momjian
Tatsuo Ishii wrote:
  Alvaro Herrera wrote:
   Bruce Momjian wrote:
Mario Weilguni wrote:
 Will this patch make it into 8.2?
 http://archives.postgresql.org/pgsql-patches/2004-12/msg00228.php
 
 It's a really nice feature, would be extremly useful with tools like 
 pgpool.

No, it will not because RESET CONNECTION can mess up interface code that
doesn't want the connection reset.  We are not sure how to handle that.
   
   Hmm, what interface code are you talking about?
  
  I believe JDBC, for example, sets things inside the interface that would
  be broken by RESET CONNECTION.  Here is a thread about it:
  
  http://archives.postgresql.org/pgsql-patches/2005-01/msg00029.php
 
 I think we had a similar problem with client encoding and solved it by
 using parameter status. Why don't we solve the JDBC problem in the
 same way?

Oh, yes, we could do that.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] On-disk bitmap index patch

2006-07-24 Thread Jie Zhang


On 7/24/06 6:04 PM, Bruce Momjian [EMAIL PROTECTED] wrote:

 Jie Zhang wrote:
 IIRC they quoted the cardinality of 1 as something that is still
 faster than btree for several usecases.
 
 And also for AND-s of several indexes, where indexes are BIG, your btree
 indexes may be almost as big as tables but the resulting set of pages is
 small.
 
 Yeah, Hannu points it out very well -- the bitmap index works very well when
 columns have low cardinalities and AND operations will produce small number
 of results.
 
 What operations on columns of low cardinality produce a small number of
 results?  That seems contradictory.

Let's see an example. Table 'orders' includes two columns, 'priority' and
'status'. The column 'priority' has 100 distinct values, say p1-p100, and
the column 'status' has 20 distinct values, say s1-s20. The query

  'select * from orders where priority=p1 and status=s1'

may produce a small number of results. Also, if the matching rows are
clustered together, that would be even better.
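The effect is easy to demonstrate with a toy bitmap AND. Plain Python ints serve as bitsets here; this illustrates the selectivity argument only, not the patch's on-disk bitmap format:

```python
# Toy demonstration of the low-cardinality AND case above: each column
# alone matches many rows, but the conjunction matches few.
import random

random.seed(42)
N = 100_000
priority = [random.randrange(100) for _ in range(N)]  # 100 distinct values
status = [random.randrange(20) for _ in range(N)]     # 20 distinct values

# One bitmap per distinct value; build the two the example query needs.
bm_p1 = sum(1 << row for row, v in enumerate(priority) if v == 0)
bm_s1 = sum(1 << row for row, v in enumerate(status) if v == 0)

# The AND the executor performs: a single bitwise intersection.
both = bm_p1 & bm_s1

print("priority=p1 rows:", bin(bm_p1).count("1"))  # roughly N/100
print("status=s1 rows:  ", bin(bm_s1).count("1"))  # roughly N/20
print("both:            ", bin(both).count("1"))   # roughly N/2000
```

Each predicate alone selects a few percent of the table, but their intersection is tiny -- exactly the case where a pre-built bitmap beats both a seqscan and a btree.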

 
 Also, the bitmap index is very small in low cardinality cases, where the
 btree tends to take up at least 10 times more space.
 
 Also, are adding/changing rows is more expensive with bitmaps than
 btrees?

Inserting a row will only affect the last word (at most the last several
words) of a bitmap vector, so this should not be very expensive: 3-4 IOs.
When a row is updated and the new row is inserted in the middle of the
heap, currently the code updates the bit in place -- at the position where
the bit should be. Searching for the page which includes the bit to be
updated is not very efficient now, but this can be fixed; currently we
have to scan the pages for a bitmap vector one by one until we hit the
right page. Since the bitmap vector is compressed, updating a bit in the
middle may cause its page to overflow. In this case, we create a new page
to accommodate those extra bits, and insert this new page right after the
original page.

Overall, inserting a row or updating a row can be done efficiently. But it
is true that the bitmap index does not perform well if there are lots of
inserts and updates, especially updates.
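The two update paths described above -- cheap appends at the tail versus a mid-vector update that must locate and split a compressed run -- can be sketched with a toy run-length-encoded bitmap. This is an illustration only, not the patch's actual compressed format, and adjacent equal runs are not re-merged:

```python
# Toy run-length-encoded bitmap vector.  append() touches only the tail
# run (cheap, like updating the last word of an on-disk vector);
# set_bit() must scan for and split a run in the middle -- the analogue
# of the page-overflow case described above.
class RLEBitmap:
    def __init__(self):
        self.runs = []  # each run is [bit_value, run_length]

    def append(self, bit):
        # Fast path: only the last run is examined.
        if self.runs and self.runs[-1][0] == bit:
            self.runs[-1][1] += 1
        else:
            self.runs.append([bit, 1])

    def set_bit(self, pos):
        # Slow path: find the run covering `pos`, then split it,
        # growing the structure in the middle.
        offset = 0
        for i, (bit, length) in enumerate(self.runs):
            if offset + length > pos:
                if bit == 1:
                    return  # already set
                left = pos - offset
                right = offset + length - pos - 1
                pieces = [r for r in ([0, left], [1, 1], [0, right]) if r[1]]
                self.runs[i:i + 1] = pieces
                return
            offset += length
        raise IndexError("bit position past end of vector")

    def to_bits(self):
        return [bit for bit, length in self.runs for _ in range(length)]

bm = RLEBitmap()
for b in [0] * 8 + [1, 1]:
    bm.append(b)     # runs: [[0, 8], [1, 2]]
bm.set_bit(3)        # zero run splits: [[0, 3], [1, 1], [0, 4], [1, 2]]
```

The split turns one compressed run into three, which is why a mid-vector update can overflow a page while an append never does.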





Re: [HACKERS] On-disk bitmap index patch

2006-07-24 Thread Gavin Sherry
On Mon, 24 Jul 2006, Bruce Momjian wrote:

 Jie Zhang wrote:
   IIRC they quoted the cardinality of 1 as something that is still
   faster than btree for several usecases.
  
   And also for AND-s of several indexes, where indexes are BIG, your btree
   indexes may be almost as big as tables but the resulting set of pages is
   small.
 
  Yeah, Hannu points it out very well -- the bitmap index works very well when
  columns have low cardinalities and AND operations will produce small number
  of results.

 What operations on columns of low cardinality produce a small number of
 results?  That seems contradictory.

WHERE a = 1 and b = 2

a = 1 may be 5% of the table and b = 2 may be 5% of the table but their
intersection may be .001%.

Luke: the URL you sent to the bitmap slides was internal to Greenplum.
Would you be able to put them on a public site?

Thanks,

Gavin



Re: [HACKERS] On-disk bitmap index patch

2006-07-24 Thread mark
On Mon, Jul 24, 2006 at 09:04:28PM -0400, Bruce Momjian wrote:
 Jie Zhang wrote:
   IIRC they quoted the cardinality of 1 as something that is still
   faster than btree for several usecases.
   
   And also for AND-s of several indexes, where indexes are BIG, your btree
   indexes may be almost as big as tables but the resulting set of pages is
   small.
  Yeah, Hannu points it out very well -- the bitmap index works very well
  when columns have low cardinalities and AND operations will produce small
  number of results.
 What operations on columns of low cardinality produce a small number of
 results?  That seems contradictory.

Not necessarily. Cardinality of 2 means 1/2. 3 means 1/3. 4 means 1/4. Etc.

Reading 1/4, for a larger table, has a good chance of being faster than
reading 4/4 of the table. :-)

No opinion on whether such tables exist in the real world - but the
concept itself seems sound.

  Also, the bitmap index is very small in low cardinality cases, where the
  btree tends to take up at least 10 times more space.
 Also, are adding/changing rows is more expensive with bitmaps than
 btrees?

Without looking at the code, but having read the Oracle docs, I would
guess yes. Every vacuum/delete would need to clear all of the bits for
the row. Every insert/update would need to allocate a bit in at least
the bitmap tree for the row inserted/updated. Seems like more pages
may need to be written. Although perhaps some clever optimization
would limit this. :-)

It seems interesting though. We won't really know until we see the
benchmarks. I see it as a form of working directly with the intermediate
form of the bitmap index scanner. If the existing scanner, building its
bitmaps on the fly, can out-perform the regular index scanner, I see
potential in a pre-built bitmap.

Obviously, it isn't the answer to everything. The use I see for it is a
few of my 1:N object attribute tables. The table has an object
identifier, and a column indicating the attribute type that the value
is for. If I have 20 or more attribute type values, and most objects
include rows for most attribute types, my regular index ends up
repeating the attribute type for every row. If I want to scan the
table for all rows that have a particular attribute type with a
particular value, it's a seqscan right now. With the bitmap scanner,
the knowledge of which rows to skip to is immediately available.

Will it be worth it or not? I won't know until I try it. :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/




Re: [HACKERS] [PATCHES] LDAP patch feature freeze

2006-07-24 Thread Bruce Momjian
Tom Lane wrote:
 Albe Laurenz [EMAIL PROTECTED] writes:
  Any chance that my LDAP patch
  http://momjian.us/mhonarc/patches/msg0.html
  will get reviewed before the feature freeze?
 
 Feature freeze is the deadline for patch submission, not patch application.

Right, and it will be applied this week, I think.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Bruce Momjian
Rod Taylor wrote:
 Sorry, hit send too quickly.
 
 NOLOCK is kinda like NOWAIT, except implies that the command will not
 take a strong lock instead of stating that it will not wait for one.
 
 On Mon, 2006-07-24 at 11:20 -0400, Rod Taylor wrote:
 SHARED CREATE INDEX 

I like NOLOCK myself.  ONLINE makes it sound like non-ONLINE index
creation has to happen when the server is down.

I don't like SHARED as the first word because we don't do this in any
other cases --- it makes it look like the command is SHARED, and I am
sure people will try putting SHARED in front of other commands like
UPDATE.  Hey, where is the SHARED manual page?

Anyway, if you want psql to easily identify it, just return NOLOCK as
part of the command string returned:

test= create index i on test(x);
CREATE INDEX NOLOCK

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Bruce Momjian
Bruce Momjian wrote:
 Rod Taylor wrote:
  Sorry, hit send too quickly.
  
  NOLOCK is kinda like NOWAIT, except implies that the command will not
  take a strong lock instead of stating that it will not wait for one.
  
  On Mon, 2006-07-24 at 11:20 -0400, Rod Taylor wrote:
SHARED CREATE INDEX 
 
 I like NOLOCK myself.  ONLINE makes it sound like non-ONLINE index
 creation has to happen when the server is down.
 
 I don't like SHARED as the first word because we don't do this in any
 other cases --- it makes it look like the command is SHARED, and I am
 sure people will try putting SHARED in front of other commands like
 UPDATE.  Hey, where is the SHARED manual page?
 
 Anyway, if you want psql to easily identify it, just return NOLOCK as
 part of the command string returned:
 
   test= create index i on test(x);
   CREATE INDEX NOLOCK

Oh, psql needs to know before the command is sent?  How do we handle it
now with CLUSTER?  Whatever psql is trying to prevent doesn't seem to
warrant mucking up the logical order of the CREATE INDEX command.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Making config file parser available to add-ins

2006-07-24 Thread Bruce Momjian
Marc Munro wrote:
-- Start of PGP signed section.
 On Mon, 2006-07-24 at 15:17 -0400, Tom Lane wrote:
  No, being able to do that is exactly the point of the custom-GUC
  mechanism.
 
 Excellent.  I shall study this and hope to bother you no further :-)
 

And in 8.2, postgresql.conf can include other configuration files.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Better name/syntax for online index creation

2006-07-24 Thread Alvaro Herrera
Bruce Momjian wrote:

  Anyway, if you want psql to easily identify it, just return NOLOCK as
  part of the command string returned:
  
  test= create index i on test(x);
  CREATE INDEX NOLOCK
 
 Oh, psql needs to know before the command is sent?  How do we handle it
 now with CLUSTER?

We don't, which is exactly the problem.  If I'm not mistaken, with psql
in autocommit-off mode CLUSTER currently doesn't start a transaction
block, which is arguably wrong because some forms of CLUSTER
(single-table) are able to work within a transaction.  But since not all
of them are, we must act as if they all were, because otherwise we would
send spurious error messages to the user.

 Whatever psql is trying to prevent doesn't seem to
 warrant mucking up the logical order of the CREATE INDEX command.

Personally I'm not sure if this is too serious an issue.

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] On-disk bitmap index patch

2006-07-24 Thread Mark Kirkwood

Jie Zhang wrote:


On 7/24/06 6:59 AM, Hannu Krosing [EMAIL PROTECTED] wrote:




And also for AND-s of several indexes, where indexes are BIG, your btree
indexes may be almost as big as tables but the resulting set of pages is
small.


Yeah, Hannu points it out very well -- the bitmap index works very well when
columns have low cardinalities and AND operations will produce small number
of results.

Also, the bitmap index is very small in low cardinality cases, where the
btree tends to take up at least 10 times more space.




The smallness of the bitmap index also means that some queries will 
require much less work_mem to achieve good performance, e.g. consider:


 TPCH dataset with scale factor 10 on my usual PIII HW, query -
 select count(*) from lineitem where l_linenumber=1;

This executes in about 100 seconds with work_mem = 20M if there is a 
bitmap index on l_linenumber. It takes 3832 seconds (!) if there is a 
btree index on the same column. Obviously cranking up work_mem will even 
up the difference (200M gets the btree to about 110 seconds), but being 
able to get good performance with less memory is a good thing!


Cheers

Mark



Re: [HACKERS] effective_cache_size is a real?

2006-07-24 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 Is it intentional that effective_cache_size is a real (as opposed to 
 integer)?

Yes --- the planner generally does all that stuff in float arithmetic to
avoid worrying about overflow.

regards, tom lane


