Re: [HACKERS] strange 'vacuum verbose analyze' behaviour

2005-01-29 Thread Oleg Bartunov
Tom,
After almost 20 hours running vacuum I see postmaster grew a little bit:
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20458 postgres  15   0 2136m 553m 204m D 33.2 54.7 198:18.36 postmaster

It's strange that I see no output since starting
vacuumdb -v -z -f wsdb > vacuum-wsdb.log 2>&1

-rw-r--r--  1 postgres postgres  48784 Jan 28 15:03 vacuum-wsdb.log
Is there something I could do ?
Oleg
On Fri, 28 Jan 2005, Oleg Bartunov wrote:
On Fri, 28 Jan 2005, Tom Lane wrote:
Oleg Bartunov oleg@sai.msu.su writes:
Memory growth stopped at 1.8GB

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
20458 postgres  15   0 1902m 503m 204m D  5.9 49.7  13:59.61 postmaster
Index-related memory leak maybe?  What are the indexes on this table,
exactly?
nothing special :)
CREATE TABLE usno (
ra real,
dec real,
bmag real,
rmag real);
COPY usno FROM stdin with delimiter '|';
CREATE INDEX radec_idx1 ON usno (ra,dec) ;
ALTER TABLE usno ADD COLUMN ipix bigint;
UPDATE usno SET ipix = q3c_ang2pix(ra,dec);
CREATE INDEX ipix_ind ON usno (ipix);

regards, tom lane
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]
Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


Re: [HACKERS] some linker troubles with rc5 on sun studio 9 ...

2005-01-29 Thread Hans-Jürgen Schönig
Tom Lane wrote:
Hans-Jürgen Schönig [EMAIL PROTECTED] writes:
tuptoaster.c, line 966: member can not have variably modified type: data

We've seen that before.  Apparently there are some versions of Sun's
compiler that are too stupid to reduce this constant expression to a
constant.  File a bug with Sun.
regards, tom lane

As Tom pointed out, this is truly a bug in the compiler included
in Sun Studio 9. It seems Sun is attempting to fix it.

Here is a test case which will fail on Sun CC 5.6:

#include <stddef.h>
#include <stdio.h>

typedef struct {
	int s1;
	char data[10];
	unsigned char bits[1];
} Ss;

int main(int argc, char **argv)
{
	/* offsetof(Ss, bits) is a constant expression, so this array size
	 * is plain C89; Sun CC 5.6 wrongly treats the member as having a
	 * variably modified type. */
	struct {
		int ap;
		char data[offsetof(Ss, bits)];
	} s1;

	printf("offsetof: %d, sizeof %d\n",
	       (int) offsetof(Ss, bits),
	       (int) sizeof(s1.data));
	return 0;
}
When compiling with the Studio 9 cc, compilation fails:
/opt/F9/SUNWspro/bin/cc -V -Xa -o test test.c
cc: Sun C 5.6 2004/07/15
acomp: Sun C 5.6 2004/07/15
test.c, line 14: member can not have variably modified type: data
cc: acomp failed for test.c
8.0.0 compiles now with --disable-spinlocks.
When spinlocks are enabled we will see one more problem ...
gmake[4]: Entering directory 
`/usr/share/src/pg/postgresql-8.0.0/src/backend/storage/lmgr'
/opt/SUNWspro/bin/cc -Xa -v -g -I../../../../src/include -c -o lmgr.o lmgr.c
/opt/SUNWspro/bin/cc -Xa -v -g -I../../../../src/include -c -o lock.o lock.c
/opt/SUNWspro/bin/cc -Xa -v -g -I../../../../src/include -c -o proc.o proc.c
/opt/SUNWspro/bin/cc -Xa -v -g -I../../../../src/include -c -o 
deadlock.o deadlock.c
/opt/SUNWspro/bin/cc -Xa -v -g -I../../../../src/include -c -o lwlock.o 
lwlock.c
/opt/SUNWspro/bin/cc -Xa -v -g -I../../../../src/include -c -o spin.o spin.c
/opt/SUNWspro/bin/cc -Xa -v -g -I../../../../src/include -c -o s_lock.o 
s_lock.c
/opt/SUNWspro/prod/bin/fbe: /tmp/yabeAAAQ.aaFl, line 277: error: can't 
compute difference between symbols in different segments
Failure in /opt/SUNWspro/prod/bin/fbe, status = 0x100
Fatal Error exec'ing /opt/SUNWspro/prod/bin/fbe
cc: acomp failed for s_lock.c
gmake[4]: *** [s_lock.o] Error 2
gmake[4]: Leaving directory 
`/usr/share/src/pg/postgresql-8.0.0/src/backend/storage/lmgr'
gmake[3]: *** [lmgr-recursive] Error 2
gmake[3]: Leaving directory 
`/usr/share/src/pg/postgresql-8.0.0/src/backend/storage'
gmake[2]: *** [storage-recursive] Error 2
gmake[2]: Leaving directory `/usr/share/src/pg/postgresql-8.0.0/src/backend'
gmake[1]: *** [all] Error 2
gmake[1]: Leaving directory `/usr/share/src/pg/postgresql-8.0.0/src'
gmake: *** [all] Error 2

The nice thing is: this error is not documented, and we have not found a
flag to preserve /tmp/yabeAAAQ.aaFl (which makes the error message very useful).

Google has no information about this issue, because all four
sites I have found are down.

Has anybody ever seen something like that before?
Hans
--
Cybertec Geschwinde u Schoenig
Schoengrabern 134, A-2020 Hollabrunn, Austria
Tel: +43/660/816 40 77
www.cybertec.at, www.postgresql.at


Re: [HACKERS] Group-count estimation statistics

2005-01-29 Thread Mischa
 From: Sailesh Krishnamurthy [EMAIL PROTECTED]
 
 >>>>> "Tom" == Tom Lane [EMAIL PROTECTED] writes:
 
     Tom> The only real solution, of course, is to acquire cross-column
     Tom> statistics, but I don't see that happening in the near
     Tom> future.
 
 Another approach is a hybrid hashing scheme where we use a hash table
 until we run out of memory, at which time we start spilling to disk. In
 other words, no longer use SortAgg at all ...
 
 Under what circumstances will a SortAgg consume more IOs than a
 hybrid hash strategy?

Goetz Graefe did a heck of a lot of analysis of this, prior to his being snapped
up by Microsoft. He also worked out a lot of the nitty-gritty for hybrid hash
algorithms, extending the Grace hash to spill to disk and adding a kind of
recursion for really huge sets. The figures say that hybrid hash beats
sort-aggregate across the board.
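
As a rough illustration of the hybrid-hash idea being discussed (a toy Python
sketch under my own assumptions, not Graefe's actual algorithm and certainly
not executor code): an in-memory hash table aggregates until it hits a size
cap, after which non-resident keys are spilled into partitions that are
aggregated recursively.

```python
def hybrid_hash_agg(tuples, max_groups, num_partitions=4, depth=0):
    """Toy hybrid-hash aggregation: count rows per key.

    Keep an in-memory hash table until it holds max_groups distinct
    keys; after that, tuples whose key is not already resident are
    spilled into one of num_partitions "files" (plain lists here),
    and each partition is aggregated recursively.
    """
    in_mem = {}
    spills = [[] for _ in range(num_partitions)]
    for key in tuples:
        if key in in_mem:
            in_mem[key] += 1
        elif len(in_mem) < max_groups:
            in_mem[key] = 1
        else:
            # Partition by a hash salted with the recursion depth, so a
            # partition keeps shrinking at each level (the recursive step).
            spills[hash((key, depth)) % num_partitions].append(key)
    result = dict(in_mem)
    for part in spills:
        if part:
            for k, v in hybrid_hash_agg(part, max_groups,
                                        num_partitions, depth + 1).items():
                result[k] = result.get(k, 0) + v
    return result

rows = [1, 2, 1, 3, 2, 1, 4, 5, 4, 1]
print(sorted(hybrid_hash_agg(rows, max_groups=2).items()))
# [(1, 4), (2, 2), (3, 1), (4, 2), (5, 1)]
```

Unlike a sort-based aggregate, no global ordering of the input is ever needed;
only the overflow partitions touch "disk".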




[HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Victor Y. Yegorov
Hello.
I'd like to implement bitmap indexes and want your comments. Here is
an essence of what I've found regarding bitmaps for the last month.
Consider the following table with one attribute A (int2):

 # | A
---+---
 1 | 1
 2 | 1
 3 | 2
 4 | 1
 5 | 1
 6 | 2
 7 | 3
 8 | 1
 9 | 2
10 | 1
11 | 1
12 | 1

So, the bitmaps for attribute A will be the following:

Val | Bitmap
----+--------------
  1 | 11011001 0111
  2 | 00100100 1000
  3 | 00000010 0000
Some points:
1) If some new value (say, 4) is inserted at some point in time, a new
   bitmap for it will be added. Same for NULLs (if the attribute has no
   NOT NULL constraint) --- one more bitmap. Or should we require NOT NULL
   for bitmap'ed attributes?
2) Queries like "where A = 1" or "where A != 2" will require only 1 scan of
   the index, while "where A < 3" will require 2 stages: 1st, create a list
   of values lesser than 3; 2nd, OR all the bitmaps for those values.
   For high-cardinality attributes, this can take a lot of time;
3) Each bitmap is only a bitmap, so there should be an array of
   corresponding ctid pointers. Maybe some more arrays (pages, I don't know).

For 2) --- there are techniques allowing better performance for "A < 3"
queries at the cost of increased storage space (see here for details:
http://delab.csd.auth.gr/papers/ADBIS03mmnm.pdf) and increased response time
for simple queries. I don't know if they should be implemented; maybe later.
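
To illustrate the points above, here is a toy Python sketch of the per-value
bitmaps, with Python integers standing in for bitsets. This is purely an
illustration of the query mechanics, not the proposed on-disk structure:

```python
def build_bitmaps(column):
    """One bitmap per distinct value; bit i is set when row i holds the value."""
    bitmaps = {}
    for row, value in enumerate(column):
        bitmaps[value] = bitmaps.get(value, 0) | (1 << row)
    return bitmaps

def rows_of(bits, n):
    """Decode a bitmap back into 1-based row numbers."""
    return [i + 1 for i in range(n) if bits >> i & 1]

A = [1, 1, 2, 1, 1, 2, 3, 1, 2, 1, 1, 1]   # the example column
bm = build_bitmaps(A)
n = len(A)
all_rows = (1 << n) - 1

print(rows_of(bm[3], n))                    # A = 3: a single bitmap scan

print(rows_of(all_rows & ~bm[2], n))        # A != 2: complement of one bitmap

lt3 = 0
for v in bm:                                # A < 3: stage 1, list values < 3;
    if v < 3:                               # stage 2, OR their bitmaps
        lt3 |= bm[v]
print(rows_of(lt3, n))
```

The range query visibly costs one OR per qualifying value, which is why it
gets expensive for high-cardinality attributes.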
The most tricky part will be combining multiple index scans on several
attributes --- as Neil Conway said on #postgresql, this will be tricky,
as some modifications will be needed in the index scan API. I remember Tom Lane
suggested on-disk bitmaps --- implementing a bitmap index access method
would be of much use not only for bitmap indexes, I think.

The WAH compression method should be used for bitmaps, in my opinion. Also,
there is a method of reordering heap tuples for better compression of bitmaps;
I thought it may be possible to implement it as an option to the existing
CLUSTER command. Papers:
WAH: http://www-library.lbl.gov/docs/LBNL/496/26/PDF/LBNL-49626.pdf
CLUSTER: http://www.cse.ohio-state.edu/~hakan/publications/reordering.pdf
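
For readers unfamiliar with WAH (word-aligned hybrid) coding, here is a toy
sketch of the idea from the paper above: the bit vector is cut into 31-bit
groups, stored either as literal words or, for runs of all-0/all-1 groups, as
counted fill words. Tuples stand in for the packed 32-bit words; this is an
illustration of the scheme, not the paper's implementation:

```python
W = 31  # payload bits per 32-bit word (hence "word-aligned")

def wah_encode(bits):
    """Encode a list of 0/1 bits into WAH-style words.

    Literal word: (0, payload) carrying 31 raw bits.
    Fill word:    (1, bitvalue, count) for `count` consecutive
    all-0 or all-1 groups of 31 bits.
    The tail is padded with zeros up to a group boundary.
    """
    words = []
    padded = bits + [0] * (-len(bits) % W)
    for i in range(0, len(padded), W):
        group = padded[i:i + W]
        if all(b == group[0] for b in group):
            # Uniform group: extend the previous fill word if it matches.
            if words and words[-1][0] == 1 and words[-1][1] == group[0]:
                words[-1] = (1, group[0], words[-1][2] + 1)
            else:
                words.append((1, group[0], 1))
        else:
            words.append((0, group))
    return words

def wah_decode(words):
    bits = []
    for w in words:
        if w[0] == 1:
            bits.extend([w[1]] * (W * w[2]))
        else:
            bits.extend(w[1])
    return bits
```

The payoff is that sparse bitmaps (long zero runs, the common case for
high-cardinality attributes) collapse to a handful of fill words, and AND/OR
can be performed run-by-run without decompressing.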

I'd like to hear from you, before starting to do something.
--
Victor


Re: [HACKERS] [pgsql-hackers] Allow GRANT/REVOKE permissions to be applied to all schema

2005-01-29 Thread Stephen Frost
* Tom Lane ([EMAIL PROTECTED]) wrote:
 Or just make the user enter two commands for this case.  Aside from
 syntactic simplicity, that might be a good idea anyway.  The NEW TABLES
 case is *fundamentally* different from every other form of GRANT, in
 that it causes future actions.  So it might be a wise idea from the
 standpoint of understandability to keep it as a separate command from
 the immediate-acting ALL TABLES.

I agree with this --- issuing two separate commands in this instance seems
like it'd be fine and not terribly onerous.

Stephen




[HACKERS] Allowing VACUUM to time out when waiting for locks?

2005-01-29 Thread Philip Warner
We have a frequently updated (peak > 5/sec) table with about 1000 rows.
We run VACUUM FULL on this table every 5 minutes.
The regular updates are not long in duration, and the vacuum is fast, so they
do not produce noticeable delays.
When we run a pg_dump on the database:
- the dump takes a long-standing AccessShareLock on this table
  (the database is large, and the table is locked for the duration).
- the regular updates run quite happily
- the VACUUM FULL comes along and asks for an AccessExclusiveLock
  (which is not granted, due to pg_dump).
- the regular updates hang until the dump completes
Is it possible to set up a vacuum to fail if a lock is not granted in
a limited period of time (eg. 1 minute)?
We could use lock files to synchronize our dumps with our vacuums, but
were hoping for a way of managing this within the DB so that ad-hoc dumps
will not cause a problem.

Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 03 5330 3172  | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp.mit.edu:11371   |/ 



Re: [HACKERS] Allowing VACUUM to time out when waiting for locks?

2005-01-29 Thread Bruno Wolff III
On Sun, Jan 30, 2005 at 01:23:11 +1100,
  Philip Warner [EMAIL PROTECTED] wrote:
 
 We have a frequently updated (peak > 5/sec) table with about 1000 rows.
 
 We run VACUUM FULL on this table every 5 minutes.

Why not just use plain VACUUM? The table will reach a steady state size.
You should only need to use VACUUM FULL if an event causes the table
to bloat to significantly more than steady state size and you want to reduce
the size again.



Re: [HACKERS] [pgsql-hackers] Patent issues and 8.1

2005-01-29 Thread Robert Treat
On Friday 28 January 2005 12:36, Josh Berkus wrote:
 Robert,

  Read the law... willful vs. unknown infringement are two different
  things.

 We're not infringing anything, yet.   That's a *pending* patent.


*sigh*  That's understood.  But you were using the counter-argument that we 
might be infringing on patents we know nothing about now, so why treat this 
one any differently. I'm pointing out that this one would be different because we 
know about it, and the law treats these things separately.

  Um... thats the way our legal system works. You could do that to any
  project if you had a patent they were infringing upon no matter how
  stoic they tried to be about it. (By our I mean the U.S. system)

 You're not following me.  Imagine this:
 1) Pervasive/Fujitsu/SRA/Mammoth PostgreSQL steals some big clients from
 Obsolete Proprietary Database Company (OPDC).
 2) OPDC has someone dig through their piles of patents and finds something
 that looks like it might infringe on something PostgreSQL does.
 3) OPDC gets a blogger or similar to post something And in the latest
 patent infringement news ...
 4) -Hackers hears about it and we derail development for another 3 months
 in order to work around the patent.
 Net Cost to OPDC: couple $thousand, to delay a PG release by 3+ months.

 What's kept patent litigation from being used against OSS projects so far
 is the bad PR that would attach, the potential cost of litigation, the
 possibility of having the patent invalidated, and the dubious prospect of
 compensation.  But if a competitor can disrupt an OSS project with a
 *threatened* patent, then the cost is minimal and the effect is huge.

 We will face this situation again -- at least, until software patents go
 away -- and both I and Bruce feel that it's important to set a precedent in
 dealing with them because you can bet this discussion is being read by
 people who are not in favor of the spread of PostgreSQL.  This isn't just
 about the ARC patent, it's about the next one after that.


I guess I don't understand your rationale here?  You want to set a precedent 
that the PGDG only responds to lawsuits?  Seems we should be willing to 
address anyone's patent concerns in a reasonable manner... but that will 
depend on the size of the changes needed and what point in the development 
cycle we are at.  This is a good-size change and it comes at a time in the dev 
cycle when we have all our options open (it would be different if we were 4 
months in with all kinds of new things already added) and it's a feature that 
*we all want to change anyway*, so why not be aggressive about it?

  FWIW I've really only been advocating

 BTW, my last post wasn't specifically addressed at you, but at the
 viewpoint that we should drop everything and work on the ARC replacement to
 get it out the door in 4 months.

  that we don't do the change in a
  patch branch, which I'm afraid the do nothing till the lawyers show up
  plan would eventually lead to. We wouldn't normally do things that way
  on technical grounds, so I'd prefer not to be forced into doing things
  that way for other reasons; enough so that I think we ought to have a
  plan to address it now.

 It's not a choice between doing something and doing nothing; you're
 mischaracterizing.   It's a choice between:

 1) Shall we begin development immediately on an 8.1 which does not include
 the ARC code and can be upgraded without initdb, for plans to release this
 version in 4 months or less?

 2) Shall we work our regular 1-year development cycle, with plans to
 replace ARC with an improved memory management approach as part of the
 features of 8.1, keeping a reversion-to-LRU patch in reserve in case we
 have to release it as a patch in the 8.0.x series?

 I advocate (2), partly because I don't believe that (1) is really possible
 for us.   When's the last time we did a fast release?   What I do advocate
 doing *now* is:


I'm not mischaracterizing, I just feel that putting out an LRU-based 8.0.x 
release is such a bad idea that I'd rather do (1) than gamble on (2).  
Honestly, I don't think anything will ever come of this, but if things go 
spectacularly bad, the fewer ARC-based releases out there the better.  Not 
to mention that the only downside I have seen to (1) is that people think it 
will disrupt development too much, but I don't buy that.  We can branch 8.1 
and 8.2 now, with a 2-month dev cycle planned for 8.1 and a 12-month dev cycle 
for 8.2, and go about things.  This would also have the advantage of pushing 
out a lot of loose ends a bit sooner (do we really want to wait a year for 
things like typo-friendly psql?) as people get more understanding of the new 
features made in 8.0.

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Allowing VACUUM to time out when waiting for locks?

2005-01-29 Thread Tom Lane
Philip Warner [EMAIL PROTECTED] writes:
 We have a frequently updated (peak > 5/sec) table with about 1000 rows.
 We run VACUUM FULL on this table every 5 minutes.

I agree with Bruno's comment that you shouldn't be doing that in the
first place.  Plain vacuum (perhaps executed even more often, like
once a minute) will cause fewer locking headaches.

 Is it possible to set up a vacuum to fail if a lock is not granted in
 a limited period of time (eg. 1 minute)?

I think you could do that by setting a statement timeout.

regards, tom lane



Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Tom Lane
Victor Y. Yegorov [EMAIL PROTECTED] writes:
 I remember, Tom Lane suggested on-disk bitmaps

I have suggested no such thing, and in fact believe that the sort of
index structure you are proposing would be of very little use.  What
I've been hoping to look into is *in memory* bitmaps used as an
interface between index scans and the subsequent heap lookups.
See eg this thread:
http://archives.postgresql.org/pgsql-hackers/2004-10/msg00439.php
particularly
http://archives.postgresql.org/pgsql-hackers/2004-10/msg00668.php

regards, tom lane



Re: [HACKERS] [pgsql-hackers] Patent issues and 8.1

2005-01-29 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes:
 I'm not mischaracterizing, I just feel that putting out an LRU-based 8.0.x 
 release is such a bad idea that I'd rather do (1) than gamble on (2).  

I don't understand why you think it's such a bad idea.  We do have the
problem of getting adequate testing, but I think the answer to that
is to put the same patch into HEAD as well.

 We can branch 8.1 and 8.2 now, with 2month dev planned for 8.1 and a
 12 month dev for 8.2 and go about things.

I will resist that idea strongly.  We have no experience as a community
with managing multiple active development branches, and I feel certain
that we'd mess it up (eg, commit things into the wrong branch, or fail
to commit things into both branches that need to be in both).  Case in
point: Teodor has already, without discussion, committed 8.1 changes in
tsearch2 that should force an initdb.  If we were taking the idea of a
backward-compatible 8.1 seriously we'd have to make him back that out of
8.1.  I can't see trying to ride herd on all the committers to make sure
no one unintentionally breaks file-level compatibility over a whole dev
cycle, even a short one.

regards, tom lane



Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Victor Y. Yegorov
* Tom Lane [EMAIL PROTECTED] [29.01.2005 18:24]:
 Victor Y. Yegorov [EMAIL PROTECTED] writes:
  I remember, Tom Lane suggested on-disk bitmaps
 
 I have suggested no such thing, and in fact believe that the sort of
 index structure you are proposing would be of very little use.

Why? I thought they would be useful for data warehouse databases.

Maybe I said something the wrong way, but what I'm trying to implement
is exactly what is described in the first link you posted below:
http://archives.postgresql.org/pgsql-hackers/2004-10/msg00439.php

Or am I misunderstanding the point?


 What I've been hoping to look into is *in memory* bitmaps used as an
 interface between index scans and the subsequent heap lookups.

Sorry, that is what I've been speaking of.

Anyway, a bitmap index API could be used for the in-memory bitmaps you're
speaking of.


-- 

Victor Y. Yegorov



Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Victor Y. Yegorov
* Paweł Niewiadomski [EMAIL PROTECTED] [29.01.2005 17:45]:
  I'd like to implement bitmap indexes and want your comments. Here is
  an essence of what I've found regarding bitmaps for the last month.
 
 Do you think it would be possible to work on it as a team?

Yes, why not.

But everything depends on the community; maybe bitmaps will be realized as a
contrib or pgfoundry module. The only thing is --- I don't know if that is
possible for indexes.


-- 

Victor Y. Yegorov



[HACKERS] Security bugs in tsearch2

2005-01-29 Thread Tom Lane
I just noticed that all of tsearch2's dict init routines are declared
like this:

CREATE FUNCTION dex_init(text)
returns internal
as 'MODULE_PATHNAME' 
language 'C';

This is really unacceptable, because it breaks the type safety of the
internal pseudotype.  I quote from datatype.sgml:

The internal pseudo-type is used to declare functions
that are meant only to be called internally by the database
system, and not by direct invocation in a SQL
query.  If a function has at least one internal-type
argument then it cannot be called from SQL.  To
preserve the type safety of this restriction it is important to
follow this coding rule: do not create any function that is
declared to return internal unless it has at least one
internal argument.

In a database with these functions installed, it will be trivial for
anyone to crash the backend (and perhaps do worse things) since he can
pass the result of dex_init to any function taking an internal argument
--- almost none of which are expecting a dict data structure.

Please change these at once.  The best solution might be to declare
an actual dict datatype for these functions to return.

Another security issue is that these functions can be used to read any
file that the backend can read.  I'm not sure at the moment whether a
bad guy could do anything useful with the result (are the contents of
stoplists exposed anywhere?) but it seems fairly dangerous to allow
arbitrary pathnames in the argument.

A possible short-term solution is to revoke public execute rights on
these functions, so that only superusers can call them.

regards, tom lane



Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Mike Rylander
On Sat, 29 Jan 2005 19:41:20 +0200, Victor Y. Yegorov [EMAIL PROTECTED] wrote:
 * Paweł Niewiadomski [EMAIL PROTECTED] [29.01.2005 17:45]:
   I'd like to implement bitmap indexes and want your comments. Here is
   an essence of what I've found regarding bitmaps for the last month.
  
  Do you think it would be possible to work on it as a team?
 
 Yes, why not.
 
  But everything depends on the community; maybe bitmaps will be realized as a
  contrib or pgfoundry module. The only thing is --- I don't know if that is
  possible for indexes.

For on-disk bitmap indexes, yes.  I don't see any reason this couldn't
be done with GiST, perhaps even as a generalization of the index stuff
in the int_array contrib module.  But the bitmaps that Tom has been
advocating, the ones used to join two index scans, will require a new
planner Op.

As a side note, wouldn't the in-memory bitmaps pretty much kill the
need for multicolumn indexes?  It seems that they would be able to
join index scans on the same table, and then there would be no need
for industrial strength cross-column correlation stats.  The planner
would be able to choose a multi index scan based on multiple single
column stat entries and completely sidestep the need for precalculated
cross-column correlations.  Am I getting that right?

-- 
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org



Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Dawid Kuroczko
On Sat, 29 Jan 2005 18:46:44 +, Mike Rylander [EMAIL PROTECTED] wrote:
 As a side note, wouldn't the in-memory bitmaps pretty much kill the
 need for multicolumn indexes?  It seems that they would be able to
 join index scans on the same table, and then there would be no need
 for industrial strength cross-column correlation stats.  The planner
 would be able to choose a multi index scan based on multiple single
 column stat entries and completely sidestep the need for precalculated
 cross-column correlations.  Am I getting that right?

I'm not too sure of that.  Let's imagine a big table with two columns,
a and b.  If we use a multicolumn index (a,b), the search must go through
a tree, find an a value, and from there find the b value.

With an in-memory bitmap, the search would start with index a; all
matching rows would form the bitmap. Then the second search
would go through the b index, forming another bitmap, which would
then be ANDed with the previous bitmap.

If I am correct, in the case of an in-memory bitmap PostgreSQL would
have to read more index tuples (the less unique the values, the
more tuples to read), which in the majority of cases would mean
more work than a multicolumn index.

However, an in-memory bitmap would speed up many other
cases (think: OR), but multicolumn indexes are there to stay. :)
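
The AND combination described above can be sketched in a few lines, again
with Python integers standing in for the row bitmaps (an illustration of the
mechanics only, not planner code):

```python
def bitmap_from_index(column, predicate):
    """Simulate one index scan: bit i is set when row i satisfies the predicate."""
    bits = 0
    for row, value in enumerate(column):
        if predicate(value):
            bits |= 1 << row
    return bits

a = [1, 1, 2, 2, 3, 3, 3, 1]
b = [9, 8, 9, 8, 9, 8, 9, 9]

# Two single-column "scans" combined with AND stand in for one probe of a
# multicolumn (a, b) index:
matches = (bitmap_from_index(a, lambda v: v == 3)
           & bitmap_from_index(b, lambda v: v == 9))
print([i for i in range(len(a)) if matches >> i & 1])
# [4, 6]
```

Note that each scan has to enumerate every row matching its own predicate
before the AND, which is the extra index-tuple cost mentioned above.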

   Regards,
 Dawid



Re: [HACKERS] [pgsql-hackers] Patent issues and 8.1

2005-01-29 Thread Robert Treat
On Saturday 29 January 2005 11:33, Tom Lane wrote:
 Robert Treat [EMAIL PROTECTED] writes:
  I'm not mischarecterizing, I just feel that putting out an lru based
  8.0.x release is such a bad idea that I'd rather do (1) than gamble on
  (2).

 I don't understand why you think it's such a bad idea.  We do have the
 problem of getting adequate testing, but I think the answer to that
 is to put the same patch into HEAD as well.


The combination of inadequate testing, making support more difficult, and 
general all-around confusion that beta/RCs for a revision-level release are 
sure to cause. Not to mention that if the patent ever does materialize into a 
problem, the scope of that problem will be that much greater the longer we 
wait.

  We can branch 8.1 and 8.2 now, with 2month dev planned for 8.1 and a
  12 month dev for 8.2 and go about things.

 I will resist that idea strongly.  We have no experience as a community
 with managing multiple active development branches, and I feel certain
 that we'd mess it up (eg, commit things into the wrong branch, or fail
 to commit things into both branches that need to be in both). Case in 
 point: Teodor has already, without discussion, committed 8.1 changes in
 tsearch2 that should force an initdb.   If we were taking the idea of a 
 backward-compatible 8.1 seriously we'd have to make him back that out of
 8.1. 

I think this is a false case, since right now we are hanging in limbo, with 
people unsure of what is proper to commit into which branch.  If there had 
been an official announcement that no initdb-level changes were to go into 
8.1, I think we'd be ok.

 I can't see trying to ride herd on all the committers to make sure 
 no one unintentionally breaks file-level compatibility over a whole dev
 cycle, even a short one.


I think the key is to put someone in charge of either 8.1 or 8.2 and let them 
be the primary gatekeeper for that release.  It can work either way... people 
develop against 8.1 and we have an 8.2 gatekeeper(s) responsible for patching 
forward any new commits into 8.2 and handling file-level-incompatible feature 
commits.  Conversely, we can have folks develop against 8.2 and have someone 
in charge of backpatching any non-file-level-incompatible changes, plus the 
ARC changes.

There are other upsides to this as well.  If we could do this now, it would 
help move us toward the ability to keep feature development going year round.  
Rather than having to stop 4-5 months every year to do beta, we could create a 
new branch during beta and let people continue on with that... we already had 
some rumblings of that idea during the 8.0 beta cycle, and this would give us 
a good test run.

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Dawid Kuroczko
On Sat, 29 Jan 2005 21:54:39 +0200, Victor Yegorov [EMAIL PROTECTED] wrote:
 * Dawid Kuroczko [EMAIL PROTECTED] [29.01.2005 21:25]:
 
  With in-memory bitmap, the search would start with index a, all
  matching rows would form the bitmap; then the second search
  would go through b index, forming another bitmap.  Which then
  would be ANDed with previous bitmap.
 
 Not only matching rows will form a bitmap, all rows should.
 
 And the physical order of rows in the table is important to form bitmap.

My mistake -- when I said all matching rows would form the bitmap
I meant all matching rows would form '1's in the bitmap. :)
Gotta work on clarity of my messages.

   Regards,
   Dawid



Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Victor Yegorov
* Dawid Kuroczko [EMAIL PROTECTED] [29.01.2005 21:25]:

 With in-memory bitmap, the search would start with index a, all
 matching rows would form the bitmap; then the second search
 would go through b index, forming another bitmap.  Which then
 would be ANDed with previous bitmap.

It's not only the matching rows that form the bitmap --- all rows do
(the matching rows are the set bits).

And the physical order of rows in the table is important to form the bitmap.


-- 

Victor Y. Yegorov



Re: [HACKERS] [pgsql-hackers] Allow GRANT/REVOKE permissions to be applied to all schema

2005-01-29 Thread Robert Treat
On Saturday 29 January 2005 09:14, Stephen Frost wrote:
 * Tom Lane ([EMAIL PROTECTED]) wrote:
  Or just make the user enter two commands for this case.  Aside from
  syntactic simplicity, that might be a good idea anyway.  The NEW TABLES
  case is *fundamentally* different from every other form of GRANT, in
  that it causes future actions.  So it might be a wise idea from the
  standpoint of understandability to keep it as a separate command from
  the immediate-acting ALL TABLES.

 I agree with this --- issuing two separate commands in this instance seems
 like it'd be fine and not terribly onerous.


In general I'd agree, although as I start to think of the different 
combinations of tables/views/functions/sequences/types/operators/etc.. my 
head does start to spin.  

-- 
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Tom Lane
Mike Rylander [EMAIL PROTECTED] writes:
 As a side note, wouldn't the in-memory bitmaps pretty much kill the
 need for multicolumn indexes?  It seems that they would be able to
 join index scans on the same table, and then there would be no need
 for industrial strength cross-column correlation stats.

No, because the ability to do it is not the same as the ability to
predict in advance how many rows will result.

regards, tom lane



Re: [HACKERS] IBM patent

2005-01-29 Thread Christopher Browne
Oops! [EMAIL PROTECTED] (Tommi Maekitalo) was seen spray-painting on a wall:
 Hi,

 I just read about this IBM patent issue at www.heise.de. IBM grants
 these patents to all projects which follow one of the licenses approved
 by the Open Source Initiative. And the BSD license is, as far as I can
 see, approved (I found the New BSD license).

 When releasing commercial closed-source variants of PostgreSQL, this
 BSD license stays intact, so the use of these patents in PostgreSQL
 seems ok.

Actually, the latter isn't so.

If Mammoth or Pervasive or such release their own release of
PostgreSQL, nothing has historically mandated that they make that
release available under the BSD license.

Presumably acceptance of the patent would change that.

You and I might not have individual objections to this situation, but
one or another of the companies putting together PostgreSQL releases
very well might.
-- 
output = (cbbrowne @ gmail.com)
http://www.ntlug.org/~cbbrowne/oses.html
If you were plowing a field, which  would you rather use?  Two strong
oxen or 1024 chickens?  -- Seymour Cray

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] IBM patent

2005-01-29 Thread Marc G. Fournier
On Wed, 26 Jan 2005, Christopher Browne wrote:
Actually, the latter isn't so.
If Mammoth or Pervasive or such release their own release of
PostgreSQL, nothing has historically mandated that they make that
release available under the BSD license.
Presumably acceptance of the patent would change that.
You and I might not have individual objections to this situation, but
one or another of the companies putting together PostgreSQL releases
very well might.
But, there is nothing stopping them from replacing the ARC code with their 
own variant though ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Neil Conway
Mike Rylander wrote:
For on-disk bitmap indexes, yes.  I don't see any reason this couldn't
be done with GiST
It might be possible to do it with GiST, but GiST is designed for 
implementing tree-structured indexes; I don't think it's the right tool 
for the job.

-Neil
---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Mike Rylander
On Sun, 30 Jan 2005 11:07:59 +1100, Neil Conway [EMAIL PROTECTED] wrote:
 Mike Rylander wrote:
  For on-disk bitmap indexes, yes.  I don't see any reason this couldn't
  be done with GiST
 
 It might be possible to do it with GiST, but GiST is designed for
 implementing tree-structured indexes; I don't think it's the right tool
 for the job.

For the initial example, where the index is implemented as a set of
unique keys from the table and a bitmap for each key, this would look
like a unique index, but with an extra datum at each index node to hold
the bitmap for that key.  If implemented that way an augmented B-Tree
structure would work fine.  At least that's how I would imagine an
on-disk bitmap index would work.  I suppose that would make the index
much more efficient for high-cardinality values, no?

-- 
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Neil Conway
Mike Rylander wrote:
For the initial example, where the index is implemented as a set of
unique keys from the table and a bitmap for each key, this would look
like a unique index, but with an extra datum at each index node to hold
the bitmap for that key.  If implemented that way an augmented B-Tree
structure would work fine.  At least that's how I would imagine an
on-disk bitmap index would work.
It might _work_, I just don't see the point. Given an attribute of a 
heap relation that has N distinct values and T tuples, you need to store

- N bitmaps, each of T bits (before compression)
- T ctids
- a way to map from a bit in one of the bitmaps to a heap tuple
- a way to decide which bitmap(s) to use for a given index scan
I don't see why it's a win to organize this data in a tree. Why not 
store the ctids in a simple array? You then know that bit K of any 
bitmap refers to entry K of the ctid array. You'd also need some meta 
data to figure out which bitmap to use for a given scankey, but it 
should be pretty easy to do that efficiently.
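The layout just described (per-value bitmaps plus one shared ctid array, with bit K of any bitmap referring to entry K of the array) can be sketched in a few lines. This is a toy in-memory illustration only, not a proposal for the on-disk format; the class name and the tiny dataset are invented:

```python
# Toy sketch: N per-value bitmaps over one shared ctid array, so bit K
# of any bitmap refers to ctid_array[K].  Illustration only.

class ToyBitmapIndex:
    def __init__(self, values_with_ctids):
        """values_with_ctids: list of (value, ctid) pairs in heap order."""
        self.ctid_array = [ctid for _, ctid in values_with_ctids]
        self.bitmaps = {}  # "meta data": value -> bitmap, stored as an int
        for k, (value, _) in enumerate(values_with_ctids):
            self.bitmaps[value] = self.bitmaps.get(value, 0) | (1 << k)

    def lookup(self, value):
        """Return the ctids whose bit is set in value's bitmap."""
        bm = self.bitmaps.get(value, 0)
        return [ctid for k, ctid in enumerate(self.ctid_array)
                if bm & (1 << k)]

# Example: a tiny "heap" of four tuples with two distinct key values.
idx = ToyBitmapIndex([('a', (0, 1)), ('b', (0, 2)),
                      ('a', (0, 3)), ('b', (0, 4))])
```

The point of the flat array is visible here: no tree traversal is needed to map a set bit back to a heap tuple, only an array subscript.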

-Neil
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Implementing Bitmap Indexes

2005-01-29 Thread Mike Rylander
On Sun, 30 Jan 2005 12:15:20 +1100, Neil Conway [EMAIL PROTECTED] wrote:
 It might _work_, I just don't see the point. Given an attribute of a
 heap relation that has N distinct values and T tuples, you need to store
 
 - N bitmaps, each of T bits (before compression)
 - T ctids
 - a way to map from a bit in one of the bitmaps to a heap tuple
 - a way to decide which bitmap(s) to use for a given index scan
 
 I don't see why it's a win to organize this data in a tree. Why not
 store the ctids in a simple array? You then know that bit K of any
 bitmap refers to entry K of the ctid array. You'd also need some meta
 data to figure out which bitmap to use for a given scankey, but it
 should be pretty easy to do that efficiently.

OK, I think it just clicked.  I was seeing a tree for the initial
lookup to find the right bitmaps to scan.  Does that seem like too much
overhead for the first step?

So, pick the bitmap(s) based on the key, scan the bitmaps and combine
them based on the WHERE condition combination type, and as you find
matching bits you toss the ctids into a matching array.  Then it's a
fast ctid scan.  That it?  I'm very interested in this after reading a
bit (heh he) about bitmap indexes.  Here's how I'm visualizing it now:

For a query like SELECT * FROM table WHERE a IN (1,3) ...

Index on table.a looks like:

bitmaps
1 | 001001001001000
2 | 100000010100001
3 | 010110100010110

ctids
1 | {2,5,8,11}
2 | {0,7,9,14}
3 | {1,3,4,6,10,12,13}


The index scan would do a bitwise OR on bitmaps 1 and 3, find the
positions of the 1s, jump to those positions in the ctid array, and
bounce to the heap for the values.
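That scan can be simulated directly with the example bitmaps above. This is purely an illustration of the bit arithmetic; a real on-disk bitmap index would compress the bitmaps and work page-at-a-time:

```python
# Toy run of the scan described above, using the example bitmaps for
# keys 1 and 3.  Bit position i corresponds to character i of the
# bitmap string (leftmost character is position 0).

bitmaps = {
    1: "001001001001000",
    3: "010110100010110",
}

def scan_in(keys):
    """OR together the bitmaps for the given keys and return the set bit
    positions, i.e. which entries of the ctid array to visit."""
    width = len(next(iter(bitmaps.values())))
    combined = 0
    for k in keys:
        combined |= int(bitmaps[k], 2)
    # recover string-index positions from the integer's bits
    return [i for i in range(width) if combined & (1 << (width - 1 - i))]
```

For `WHERE a IN (1,3)` the OR of the two bitmaps yields the union of the two ctid position lists, which is exactly the set of heap entries to fetch.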


-- 
Mike Rylander
[EMAIL PROTECTED]
GPLS -- PINES Development
Database Developer
http://open-ils.org

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] Bug in create operator and/or initdb

2005-01-29 Thread John Hansen

The following seems to me a bug in either initdb or create operator:

CREATE FUNCTION my_func (inet, inet) as '$libdir/my_func.so' LANGUAGE 'C' 
IMMUTABLE STRICT;
CREATE OPERATOR  (
PROCEDURE = my_func,
LEFTARG = cidr,
RIGHTARG = cidr,
RESTRICT = contsel,
JOIN = contjoinsel
);

ERROR:  function my_func(cidr, cidr) does not exist


Now, if you look at the catalog, taking < (the less-than operator) as an
example, you will see that:

Two operators are defined for < - one for (inet,inet) and another for
(cidr,cidr).  Only one function exists named network_lt, and it is
declared as taking (inet,inet) as arguments.

Obviously, it should either have a corresponding function declaration, or it 
should be possible to create the operators using a binary compatible function 
(eg: where a binary compatible cast exists).

I propose, that the create operator syntax be modified to accept:

PROCEDURE = function_name (type{,type}) 

and that checks be made for the existence of binary compatible casts between 
the two (four) types.

Kind Regards,

John

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [BUGS] Bug in create operator and/or initdb

2005-01-29 Thread John Hansen
 My opinion is that this is a very bogus shortcut in the 
 network datatype code.  There are no cases outside the 
 inet/cidr group where an operator doesn't exactly match its 
 underlying function.  (The whole business of inet and cidr 
 being almost but not quite the same type is maldesigned
 anyway...)
 
 The right solution for you is to declare two SQL functions.  
 Whether you make them point at the same underlying C code is 
 up to you.

Right,...

In that case, may I suggest fixing the catalog so that network_* functions
exist for both datatypes!
Anything less I'd consider inconsistent...

Kind regards,

John

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [BUGS] Bug in create operator and/or initdb

2005-01-29 Thread Tom Lane
John Hansen [EMAIL PROTECTED] writes:
 CREATE FUNCTION my_func (inet, inet) as '$libdir/my_func.so' LANGUAGE 'C' 
 IMMUTABLE STRICT;
 CREATE OPERATOR  (
   PROCEDURE = my_func,
   LEFTARG = cidr,
   RIGHTARG = cidr,
   RESTRICT = contsel,
   JOIN = contjoinsel
 );

 ERROR:  function my_func(cidr, cidr) does not exist

Right ...

 Now, if you look at the catalog, taking < (the less-than operator) as an
 example, you will see that:

 Two operators are defined for < - one for (inet,inet) and another for
 (cidr,cidr).  Only one function exists named network_lt, and it is
 declared as taking (inet,inet) as arguments.

My opinion is that this is a very bogus shortcut in the network datatype
code.  There are no cases outside the inet/cidr group where an operator
doesn't exactly match its underlying function.  (The whole business of
inet and cidr being almost but not quite the same type is maldesigned
anyway...)

The right solution for you is to declare two SQL functions.  Whether you
make them point at the same underlying C code is up to you.

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [BUGS] Bug in create operator and/or initdb

2005-01-29 Thread Tom Lane
John Hansen [EMAIL PROTECTED] writes:
 In that case, may I suggest fixing the catalog so that network_* functions
 exist for both datatypes!

Redesigning the inet/cidr distinction is on the to-do list (though I'm
afraid not very high on the list).  ISTM it should either be one type
with a distinguishing bit in the runtime representation, or two types
with no such bit needed.  Having both is a schizophrenic design.  It's
led directly to bugs in the past, and I think there are still some
corner cases that act oddly (see the archives).

regards, tom lane

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] [BUGS] Bug in create operator and/or initdb

2005-01-29 Thread Steve Atkins
On Sat, Jan 29, 2005 at 10:07:30PM -0500, Tom Lane wrote:
 John Hansen [EMAIL PROTECTED] writes:
  In that case, may I suggest fixing the catalog so that network_* functions
  exist for both datatypes!
 
 Redesigning the inet/cidr distinction is on the to-do list (though I'm
 afraid not very high on the list).  ISTM it should either be one type
 with a distinguishing bit in the runtime representation, or two types
 with no such bit needed.  Having both is a schizophrenic design.  It's
 led directly to bugs in the past, and I think there are still some
 corner cases that act oddly (see the archives).

From a network engineering point of view the inet type is utterly
bogus. I'm not aware of data of that type being needed or used in
any real application. Given that, the complexity that it causes
simply by existing seems too high a cost.

I suspect that the right thing to do is to kill the inet type
entirely, and replace it with a special case of cidr. (And possibly
then to kill cidr and replace it with something that can be indexed
more effectively.)

For a replacement type, how important is it that it be completely
compatible with the existing inet/cidr types? Is anyone actually using
inet types with a non-cidr mask?

Cheers,
  Steve

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] IBM patent

2005-01-29 Thread Kevin Brown
Marc G. Fournier wrote:

 On Wed, 26 Jan 2005, Christopher Browne wrote:
 
 Actually, the latter isn't so.
 
 If Mammoth or Pervasive or such release their own release of
 PostgreSQL, nothing has historically mandated that they make that
 release available under the BSD license.
 
 Presumably acceptance of the patent would change that.
 
 You and I might not have individual objections to this situation, but
 one or another of the companies putting together PostgreSQL releases
 very well might.
 
 But, there is nothing stop'ng them from replacing the ARC code with
 their own variant though ...

Not only that, I'd go further and say that they have a duty to either
do that or pay someone to do it.  They are, after all, the entities
that probably care about the situation the most.

This type of situation seems to me to be one that has to be examined
from a greatest good point of view.  If IBM were to allow all open
source projects to make free use of a patent (thus exposing only those
entities which sell commercial versions under a non-open-source
license to risk), then the PG group might be faced with the tradeoff
of using a superior but patented (though free for open source use)
algorithm, or using a possibly inferior but unencumbered one.  I'd
wager that the vast majority of PostgreSQL users received their copy
via the open source license.  Unless the encumbered algorithm is not
significantly superior to the unencumbered one, the greater good is
likely to be to make use of the patented algorithm and force the
non-open-source vendors to deal with removing the algorithm
themselves.


None of that really applies to the specific situation we're
discussing, however: the current ARC implementation is apparently not
showing itself to be a clearly superior approach, so some other
approach is probably warranted.


-- 
Kevin Brown   [EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Huge memory consumption during vacuum (v.8.0)

2005-01-29 Thread Oleg Bartunov
Hi there,
seems I have a serious problem with vacuuming of rather big table
(500,000,000 rows) on dual Intel(R) Xeon(TM) CPU 2.40GHz, 1Gb RAM,
running Linux 2.6.7.  I have PostgreSQL 8.0 release installed with
slightly changed postgresql.conf:
shared_buffers = 24576   # min 16, at least max_connections*2, 8KB each
maintenance_work_mem = 65536 # 16384# min 1024, size in KB
checkpoint_segments = 12 #3 # in logfile segments, min 1, 16MB each
I tried to run 'vacuumdb -v -z -f wsdb > vacuum-wsdb.log 2>&1'
with the default value of maintenance_work_mem, but it was
too small for such a big table, so I increased its value as Tom recommended.
But this change causes huge memory consumption - rather quickly memory
grew to 1Gb, and after almost 42 hours of running (yes, it's still running)
the postmaster eats more than 2Gb of RAM:

20458 postgres  15   0 2462m 646m 204m D 37.5 63.9 744:38.74 postmaster
There are no messages in the log file since the start (just pg_* tables), so
it's difficult to say if there is some useful activity :)

The only non-standard action was installing 8.0 alongside the running
7.4.6 version. I ran configure with a different prefix and pgport specified,
and use PGPORT, PGLIB, PGDATA, PATH modified to work with the new postmaster.
I don't see any problem here.
Does anybody have experience vacuuming a large database with 8.0?
The table is very simple:
 Table public.usno
 Column |  Type  | Modifiers 
--------+--------+-----------
 ra | real   |
 dec| real   |
 bmag   | real   |
 rmag   | real   |
 ipix   | bigint | 
Indexes:
ipix_ind btree (ipix)
radec_idx1 btree (ra, dec)

Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] [BUGS] Bug in create operator and/or initdb

2005-01-29 Thread Tom Lane
Steve Atkins [EMAIL PROTECTED] writes:
 For a replacement type, how important is it that it be completely
 compatible with the existing inet/cidr types? Is anyone actually using
 inet types with a non-cidr mask?

If you check the archives you'll discover that our current inet/cidr
types were largely designed and implemented by Paul Vixie (yes, that
Vixie).  I'm disinclined to second-guess Paul about the external
definition of these types; I just want to rationalize the internal
representation a bit.  In particular we've got some issues about
conversions between the two types ...

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Huge memory consumption during vacuum (v.8.0)

2005-01-29 Thread Tom Lane
Oleg Bartunov oleg@sai.msu.su writes:
 I tried to run 'vacuumdb -v -z -f wsdb > vacuum-wsdb.log 2>&1'

I'm confused.  The log trace you showed us before appeared to be from
a non-FULL vacuum, but here you're saying it's VACUUM FULL.  Which is
it ... or did you change?

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Allowing VACUUM to time out when waiting for

2005-01-29 Thread Philip Warner
At 02:53 AM 30/01/2005, Tom Lane wrote:
Philip Warner [EMAIL PROTECTED] writes:
 We have a frequently updated (peak > 5/sec) table with about 1000 rows.
 We run VACUUM FULL on this table every 5 minutes.
Plain vacuum (perhaps executed even more often, like
once a minute) will cause fewer locking headaches.
We have done both in the past, but found some tables still just grew 
(perhaps just because of infrequent locks that prevented the plain VACUUM). 
I'll go back to the plain VACUUM and monitor the table growth.

Am I correct in saying that the FSM now tracks the entire table, and that 
the FSM parameters just determine how much is stored in memory?


I think you could do that by setting a statement timeout.
This would be a good solution if we still see growth with plain VACUUM.
Is any type of opportunistic locking likely/planned for a future version 
(ie. a has lock, b asks for conflicting lock, c asks for lock that is OK 
with a but denied by b; so c's lock is allowed and b stays waiting).



Philip Warner| __---_
Albatross Consulting Pty. Ltd.   |/   -  \
(A.B.N. 75 008 659 498)  |  /(@)   __---_
Tel: (+61) 0500 83 82 81 | _  \
Fax: (+61) 03 5330 3172  | ___ |
Http://www.rhyme.com.au  |/   \|
 |----
PGP key available upon request,  |  /
and from pgp.mit.edu:11371   |/ 

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Huge memory consumption during vacuum (v.8.0)

2005-01-29 Thread Oleg Bartunov
On Sun, 30 Jan 2005, Tom Lane wrote:
Oleg Bartunov oleg@sai.msu.su writes:
I tried to run 'vacuumdb -v -z -f wsdb > vacuum-wsdb.log 2>&1'
I'm confused.  The log trace you showed us before appeared to be from
a non-FULL vacuum, but here you're saying it's VACUUM FULL.  Which is
it ... or did you change?
Yes, the first time I tried vacuum from within psql; the next time I decided
to run vacuumdb and it seems I changed the option.

regards, tom lane
Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Huge memory consumption during vacuum (v.8.0)

2005-01-29 Thread Tom Lane
Oleg Bartunov oleg@sai.msu.su writes:
 On Sun, 30 Jan 2005, Tom Lane wrote:
 I'm confused.  The log trace you showed us before appeared to be from
 a non-FULL vacuum, but here you're saying it's VACUUM FULL.  Which is
 it ... or did you change?

 Yes, the first time I tried vacuum from within psql; the next time I decided
 to run vacuumdb and it seems I changed the option.

Um.  Well, a VACUUM FULL is going to build in-memory data structures
that represent *all* of the usable free space in a table.  I don't
actually think that VACUUM FULL is useful on an enormous table ... you
want to keep after it with routine plain VACUUMs, instead.

Another possibility is to use CLUSTER or a rewriting ALTER TABLE to
shrink the space, but be aware that this requires a transient second
copy of the table and indexes.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Allowing VACUUM to time out when waiting for locks?

2005-01-29 Thread Tom Lane
Philip Warner [EMAIL PROTECTED] writes:
 Am I correct in saying that the FSM now tracks the entire table, and that 
 the FSM parameters just determine how much is stored in memory?

No.  Any free space that can't be remembered in FSM is lost to use.
(Not completely --- an update of a row on the same page can reuse it ---
but for the most part you want to make FSM large enough to remember all
the useful free space.)

 Is any type of opportunistic locking likely/planned for a future version 
 (ie. a has lock, b asks for conflicting lock, c asks for lock that is OK 
 with a but denied by b; so c's lock is allowed and b stays waiting).

That's deliberately disallowed by the current logic because of the risk
of starving b indefinitely.  IIRC it would be a trivial code change to
do the other, but I doubt it's a good idea.  The typical situation is
exactly a VACUUM that wants an exclusive lock, versus a fairly
continuous stream of shared lock requests for select/insert/update/delete.
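The starvation risk can be illustrated with a toy lock-queue model. This is not PostgreSQL's actual lock manager; the two policies, the lock modes, and the event stream below are invented for the illustration. Under a "grant anything compatible with the current holders" policy, an overlapping stream of shared lockers keeps the exclusive (VACUUM-style) request waiting indefinitely, while a queue-respecting policy lets it through once the current holders drain:

```python
# Toy model of two lock-queue policies (NOT PostgreSQL's lock manager).
# Modes: 'S' (shared) and 'X' (exclusive); only S/S is compatible.

def simulate(policy, events):
    """policy: 'fifo' (no request overtakes an earlier blocked waiter)
    or 'jump' (grant any request compatible with current holders).
    events: ('acquire', mode, name) or ('release', name) tuples.
    Returns the names in the order their locks were granted."""
    holders = {}   # name -> mode, currently granted
    queue = []     # (mode, name) still waiting, in arrival order
    granted = []

    def try_grant():
        still_waiting, blocked_seen = [], False
        for mode, name in queue:
            ok = all(mode == 'S' and held == 'S'
                     for held in holders.values())
            if policy == 'fifo' and blocked_seen:
                ok = False  # don't overtake an earlier blocked waiter
            if ok:
                holders[name] = mode
                granted.append(name)
            else:
                blocked_seen = True
                still_waiting.append((mode, name))
        queue[:] = still_waiting

    for ev in events:
        if ev[0] == 'acquire':
            queue.append((ev[1], ev[2]))
        else:
            holders.pop(ev[1], None)
        try_grant()
    return granted

# An overlapping stream of shared lockers with one exclusive request.
events = [('acquire', 'S', 's1'), ('acquire', 'X', 'vacuum'),
          ('acquire', 'S', 's2'), ('release', 's1'),
          ('acquire', 'S', 's3'), ('release', 's2')]
```

With this event stream, the 'jump' policy never grants 'vacuum' (some shared lock is always held), while the 'fifo' policy grants it as soon as s1 releases, making later shared requests queue behind it.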

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Huge memory consumption during vacuum (v.8.0)

2005-01-29 Thread Oleg Bartunov
On Sun, 30 Jan 2005, Tom Lane wrote:
Oleg Bartunov oleg@sai.msu.su writes:
On Sun, 30 Jan 2005, Tom Lane wrote:
I'm confused.  The log trace you showed us before appeared to be from
a non-FULL vacuum, but here you're saying it's VACUUM FULL.  Which is
it ... or did you change?

Yes, the first time I tried vacuum from within psql; the next time I decided
to run vacuumdb and it seems I changed the option.
Um.  Well, a VACUUM FULL is going to build in-memory data structures
that represent *all* of the usable free space in a table.  I don't
actually think that VACUUM FULL is useful on an enormous table ... you
want to keep after it with routine plain VACUUMs, instead.
OK, I'll try without FULL, but if memory does not fail me the postmaster was
also greedy. Let's see.
Another possibility is to use CLUSTER or a rewriting ALTER TABLE to
shrink the space, but be aware that this requires a transient second
copy of the table and indexes.
regards, tom lane
Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faq


Re: [HACKERS] Huge memory consumption during vacuum (v.8.0)

2005-01-29 Thread Oleg Bartunov
On Sun, 30 Jan 2005, Oleg Bartunov wrote:
On Sun, 30 Jan 2005, Tom Lane wrote:
Oleg Bartunov oleg@sai.msu.su writes:
On Sun, 30 Jan 2005, Tom Lane wrote:
I'm confused.  The log trace you showed us before appeared to be from
a non-FULL vacuum, but here you're saying it's VACUUM FULL.  Which is
it ... or did you change?

Yes, the first time I tried vacuum from within psql; the next time I decided
to run vacuumdb and it seems I changed the option.
Um.  Well, a VACUUM FULL is going to build in-memory data structures
that represent *all* of the usable free space in a table.  I don't
actually think that VACUUM FULL is useful on an enormous table ... you
want to keep after it with routine plain VACUUMs, instead.
OK, I'll try without FULL, but if memory does not fail me the postmaster was
also greedy. Let's see.
It seems the postmaster eats the expected amount of memory now!  Will see how
long it proceeds.  Probably my case should be documented somewhere.

Another possibility is to use CLUSTER or a rewriting ALTER TABLE to
shrink the space, but be aware that this requires a transient second
copy of the table and indexes.
I'm aware, but I don't have that much free space :)
Is there a TODO for scaling VACUUM FULL?

regards, tom lane
Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
 http://www.postgresql.org/docs/faq
Regards,
Oleg
_
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83
---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match