Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Magnus Hagander
Tom Lane wrote:
 In the meantime, though, it's not quite clear why this would lead to
 a buildfarm failure --- it should just mean a lot of extraneous files
 appearing in a fresh checkout.  (Looks a bit harder ... Oh, it looks
 like btree_gist has some files that used to be autogenerated and are
 now in CVS, so the bogusly new versions from CVS are suppressing the
 desired generation from the old btree_num.c file.)

Funny enough, that's exactly the problem I saw on win32 when I rsynced
from a Windows box and tried a local CVS checkout :-) Once I rsynced to
a local Linux box instead and checked out from there, things worked again :)

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] is this trigger safe and effective? - locking (caching via triggers)

2007-08-15 Thread Pavel Stehule
Hello

I wrote a sample about triggers and I have a question: is my solution
correct, and does a better solution exist?

Regards
Pavel Stehule

DROP SCHEMA safecache CASCADE;

CREATE SCHEMA safecache;

CREATE TABLE safecache.source_tbl(category int, int_value int);

CREATE TABLE safecache.cache(category int, sum_val int);

CREATE OR REPLACE FUNCTION safecache.source_tbl_trg_fce()
RETURNS trigger AS
$$
BEGIN
  IF TG_OP = 'INSERT' THEN
    -- the row may not exist in the cache yet -- the complicated case;
    -- I want to finish this transaction without a conflict
    IF NOT EXISTS(SELECT category
                    FROM safecache.cache
                   WHERE category = NEW.category) THEN
      LOCK TABLE safecache.cache IN SHARE ROW EXCLUSIVE MODE;
      -- I have to repeat the test
      IF NOT EXISTS(SELECT category
                      FROM safecache.cache
                     WHERE category = NEW.category) THEN
        INSERT INTO safecache.cache
          VALUES(NEW.category, NEW.int_value);
      END IF;
    ELSE
      -- simple
      UPDATE safecache.cache
         SET sum_val = sum_val + NEW.int_value
       WHERE category = NEW.category;
    END IF;
  ELSEIF TG_OP = 'UPDATE' THEN
    -- if the category is unchanged, it's simple
    IF NEW.category = OLD.category THEN
      UPDATE safecache.cache
         SET sum_val = sum_val + (NEW.int_value - OLD.int_value)
       WHERE category = OLD.category;
    ELSE
      -- the old category has to exist already
      UPDATE safecache.cache
         SET sum_val = sum_val - OLD.int_value
       WHERE category = OLD.category;
      -- the new category may be a problem
      IF NOT EXISTS(SELECT category
                      FROM safecache.cache
                     WHERE category = NEW.category) THEN
        LOCK TABLE safecache.cache IN SHARE ROW EXCLUSIVE MODE;
        -- I have to repeat the test
        IF NOT EXISTS(SELECT category
                        FROM safecache.cache
                       WHERE category = NEW.category) THEN
          INSERT INTO safecache.cache
            VALUES(NEW.category, NEW.int_value);
        END IF;
      ELSE
        -- simple: the new category exists
        UPDATE safecache.cache
           SET sum_val = sum_val + NEW.int_value
         WHERE category = NEW.category;
      END IF;
    END IF;
  ELSE -- DELETE
    -- the category has to exist in the cache already; simple
    UPDATE safecache.cache
       SET sum_val = sum_val - OLD.int_value
     WHERE category = OLD.category;
  END IF;
  RETURN NEW;
END
$$ LANGUAGE plpgsql;

CREATE TRIGGER actualise_cache
   AFTER INSERT OR UPDATE OR DELETE
   ON safecache.source_tbl
   FOR EACH ROW EXECUTE PROCEDURE safecache.source_tbl_trg_fce();
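
For illustration, a quick smoke test with some assumed sample data:

INSERT INTO safecache.source_tbl VALUES (1, 10);
INSERT INTO safecache.source_tbl VALUES (1, 5);   -- hits the UPDATE branch
UPDATE safecache.source_tbl SET category = 2 WHERE int_value = 5;
SELECT * FROM safecache.cache ORDER BY category;  -- expect (1, 10) and (2, 5)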

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Magnus Hagander
Tom Lane wrote:
 So I think that cvs rtag -d REL7_4_STABLE pgsql will fix it.
 I'd like someone to double-check that though.  Also maybe we should back
 up the repository first?

Just for your info: all VMs on tribble, which includes cvs, are backed
up at 02:30 every day, CEST (that's 00:30 UTC right now, but it follows
the local time on tribble, so it'll be 01:30 UTC when Austria switches
back to CET later this year. Perhaps we should switch the machine to GMT,
but that's a different discussion.)

For something like this it's of course good to run a special backup
beforehand, but it's probably good to know when the backups run anyway...

//Magnus

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Magnus Hagander
Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
 So I'd say your strategy looks good - backup and remove the  phony tag. 
 I'd also say we should probably be logging tag commands in taginfo. 
 Presumably we mere mortal committers should not be doing any tagging 
 whatsoever, and tags should only be applied in preparation for releases. 
 
 +1 ... we should at least log such commands, and maybe disallow to anyone
 except Marc's pgsql account.  Particularly since they don't get
 reported in pgsql-committers messages (or is that something we
 can/should change?)
 

I think you can, by sticking a command in the taginfo file. It's the
same script that Marc is using to try to restrict who can do it - once
he has that working, it should be as simple as adding a mail command to
the end of that script.

And +1 on doing it.

//Magnus


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Magnus Hagander
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Since I don't think that a datatype solution is the way to go,
 I don't feel that we are as far away from an agreement as Bruce
 is worried about.
 
 Well, from where I sit, there is one person saying give me the foot gun,
 and Heikki saying he wants a bullet-proof type system, and you and I are
 in the middle, so the big problem is I don't see a consensus forming,
 and we have been discussing this for a while.
 
 The people who actually use tsearch2 seem to all have the same opinion ...
 so I think we can't go too far in the bullet-proofing direction.
 
 But I would like a design that is bulletproof in dump/reload scenarios,
 and I think it's fair to question that aspect of the tsearch2 design
 because we've seen many reports of people having trouble updating
 databases that use tsearch2.

dump/reload is *the* biggest problem I've had with tsearch2 so far. But
it hasn't been with the actual data - it's been the functions, and only
when migrating between versions. But solving dump/reload reliably is one
of the main things I'm hoping for in 8.3 ;-)

As for another usage pointer: I use different configurations in the same
database - but only one per table. I always explicitly use the
to_tsvector variant that specifies a configuration - to avoid surprising
myself.

I don't use the functional index part, but for new users I can see how
that's certainly a *lot* easier. I don't see requiring the configuration
to be specified explicitly when creating such an index as a big problem
at all - compared to the work needed to set up triggers. But it's nice
not to have to do it when querying. Wouldn't that be solved by having
to_tsvector() require the configuration, but to_tsquery() and
plainto_tsquery() not require it?

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'd like someone to double-check that though.  Also maybe we should back
 up the repository first?

 Just for your info: all VMs on tribble, which includes cvs, are backed
 up at 02:30 every day, CEST

Good, but the salient followup questions to that are (1) backed up to
where exactly?, and (2) how many days' past backups could we get at,
if we had to?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'd like someone to double-check that though.  Also maybe we should back
 up the repository first?
 
 Just for your info: all VMs on tribble, which includes cvs, are backed
 up at 02:30 every day, CEST
 
 Good, but the salient followup questions to that are (1) backed up to
 where exactly?, and (2) how many days' past backups could we get at,
 if we had to?

1) The entire VM, with dump
2) We have today, yesterday and one weekly (copies the daily on Monday,
if I read it right)

They are dumped to a NFS share on this schedule. That NFS share is
dumped to tape by systems at Conova - I'll let Stefan fill in the
details about that.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Oleg Bartunov

On Wed, 15 Aug 2007, Magnus Hagander wrote:


 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
   Tom Lane wrote:
    Since I don't think that a datatype solution is the way to go,
    I don't feel that we are as far away from an agreement as Bruce
    is worried about.
 
   Well, from where I sit, there is one person saying give me the foot gun,
   and Heikki saying he wants a bullet-proof type system, and you and I are
   in the middle, so the big problem is I don't see a consensus forming,
   and we have been discussing this for a while.
 
  The people who actually use tsearch2 seem to all have the same opinion ...
  so I think we can't go too far in the bullet-proofing direction.
 
  But I would like a design that is bulletproof in dump/reload scenarios,
  and I think it's fair to question that aspect of the tsearch2 design
  because we've seen many reports of people having trouble updating
  databases that use tsearch2.
 
 dump/reload is *the* biggest problem I've had with tsearch2 so far. But
 it hasn't been with the actual data - it's been the functions, and only
 when migrating between versions. But solving dump/reload reliably is one
 of the main things I'm hoping for in 8.3 ;-)

The dump/reload problem should be gone once tsearch2 becomes a part of
core. The remaining problem is the inability to say what the correct
configuration is for an expressional index when restoring. In any other
case there are many use cases where a tsvector could be intentionally
obtained using different configurations.

 As for another usage pointer: I use different configurations in the same
 database - but only one per table. I always explicitly use the
 to_tsvector variant that specifies a configuration - to avoid surprising
 myself.
 
 I don't use the functional index part, but for new users I can see how
 that's certainly a *lot* easier. I don't see requiring the configuration
 to be specified explicitly when creating such an index as a big problem
 at all - compared to the work needed to set up triggers. But it's nice
 not to have to do it when querying. Wouldn't that be solved by having
 to_tsvector() require the configuration, but to_tsquery() and
 plainto_tsquery() not require it?

Or better, introduce a novice-level interface with the configuration name
required, and insist on using it with expressional indexes (I don't know
if there is machinery to do so).


Regards,
Oleg
_
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: [EMAIL PROTECTED], http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Good, but the salient followup questions to that are (1) backed up to
 where exactly?, and (2) how many days' past backups could we get at,
 if we had to?

 They are dumped to a NFS share on this schedule. That NFS share is
 dumped to tape by systems at Conova - I'll let Stefan fill in the
 details about that.

That's good as far as it goes, but seeing that PG is a worldwide
organization now, I wonder whether our primary CVS shouldn't have
backups on several continents.  Pardon my paranoia ... but our
collective arses have been saved by offsite backups at least once
already ...

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Magnus Hagander
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Good, but the salient followup questions to that are (1) backed up to
 where exactly?, and (2) how many days' past backups could we get at,
 if we had to?
 
 They are dumped to a NFS share on this schedule. That NFS share is
 dumped to tape by systems at Conova - I'll let Stefan fill in the
 details about that.
 
 That's good as far as it goes, but seeing that PG is a worldwide
 organization now, I wonder whether our primary CVS shouldn't have
 backups on several continents.  Pardon my paranoia ... but our
 collective arses have been saved by offsite backups at least once
 already ...

The CVS repository is already synced almost-live back to both svr1 in
Panama and the anoncvs box in the US. It will be synced to the new VM
Marc is setting up as well. I assume these machines also have some kind
of backup - Marc will have to fill in on that.

Oh, and the plan is to have the entire VM synced up between the Austria
and Panama machines once a day again, once Marc has fixed the issues with
that box.

//Magnus

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Alvaro Herrera
Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Good, but the salient followup questions to that are (1) backed up to
  where exactly?, and (2) how many days' past backups could we get at,
  if we had to?
 
  They are dumped to a NFS share on this schedule. That NFS share is
  dumped to tape by systems at Conova - I'll let Stefan fill in the
  details about that.
 
 That's good as far as it goes, but seeing that PG is a worldwide
 organization now, I wonder whether our primary CVS shouldn't have
 backups on several continents.  Pardon my paranoia ... but our
 collective arses have been saved by offsite backups at least once
 already ...

We already have that -- there's plenty of people who have rsync'ed
copies of the repository.  Now, I don't back it up, so if someone makes
a mistake my copy will faithfully follow it.

-- 
Alvaro Herrera        http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Marko Kreen
On 8/15/07, Tom Lane [EMAIL PROTECTED] wrote:
 I wrote:
  Kris Jurka [EMAIL PROTECTED] writes:
  It looks like parts of the CVS repository have been mistagged as belonging
  to REL7_4_STABLE or have been corrupted somehow:

 I have no idea how you make CVS do that, but I'm
 sure there is some magic one-liner for it.

Something like cvs tag without -F on the HEAD?

-- 
marko

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Stefan Kaltenbrunner
Magnus Hagander wrote:
 Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 Good, but the salient followup questions to that are (1) backed up to
 where exactly?, and (2) how many days' past backups could we get at,
 if we had to?
 They are dumped to a NFS share on this schedule. That NFS share is
 dumped to tape by systems at Conova - I'll let Stefan fill in the
 details about that.
 That's good as far as it goes, but seeing that PG is a worldwide
 organization now, I wonder whether our primary CVS shouldn't have
 backups on several continents.  Pardon my paranoia ... but our
 collective arses have been saved by offsite backups at least once
 already ...
 
 The CVS repository is already synced almost-live back to both svr1 in
 Panama and the anoncvs box in the US. It will be synced to the new VM
 Marc is setting up as well. I assume these machines also have some kind
 of backup - Marc will have to fill in on that.
 
 Oh, and the plan is to have the entire VM synced up between the Austria
 and Panama machines once a day again, once Marc has fixed the issues with
 that box.

Yeah, that has been my understanding as well, though I'm not sure if Marc
has set up the sync of the full VM already.


Stefan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Stefan Kaltenbrunner
Magnus Hagander wrote:
 Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'd like someone to double-check that though.  Also maybe we should back
 up the repository first?
 Just for your info: all VMs on tribble, which includes cvs, are backed
 up at 02:30 every day, CEST
 Good, but the salient followup questions to that are (1) backed up to
 where exactly?, and (2) how many days' past backups could we get at,
 if we had to?
 
 1) The entire VM, with dump
 2) We have today, yesterday and one weekly (copies the daily on Monday,
 if I read it right)
 
 They are dumped to a NFS share on this schedule. That NFS share is
 dumped to tape by systems at Conova - I'll let Stefan fill in the
 details about that.

That's exactly what happens on the backup side here - but maybe I missed
an earlier mail - is something broken on tribble or on the new CVS VM
that requires a restore from backup?


Stefan

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Stefan Kaltenbrunner
Stefan Kaltenbrunner wrote:
 Magnus Hagander wrote:
 Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
 Tom Lane wrote:
 I'd like someone to double-check that though.  Also maybe we should back
 up the repository first?
 Just for your info: all VMs on tribble, which includes cvs, are backed
 up at 02:30 every day, CEST
 Good, but the salient followup questions to that are (1) backed up to
 where exactly?, and (2) how many days' past backups could we get at,
 if we had to?
 1) The entire VM, with dump
 2) We have today, yesterday and one weekly (copies the daily on monday,
 if I read it right)

 They are dumped to a NFS share on this schedule. That NFS share is
 dumped to tape by systems at Conova - I'll let Stefan fill in the
 details about that.
 
 that's exactly what happens on the backup side here - but maybe I missed
 an earlier mail - is something broken on tribble or on the new CVS VM
 that requires a restore from backup ?

erm -ENOTENOUGHCOFFEE


Stefan

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Peter Eisentraut
Am Mittwoch, 15. August 2007 04:20 schrieb Tom Lane:
 we should at least log such commands, and maybe disallow to anyone
 except Marc's pgsql account.

I don't think we should disallow it.  Or otherwise we might one day be stuck 
if we need to release while some specific person is on vacation.

I never understood why tagging uses a special account anyway.  It should be 
done as the person doing the tagging.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Index Tuple Compression Approach?

2007-08-15 Thread Simon Riggs
On Wed, 2007-08-15 at 06:51 +0100, Heikki Linnakangas wrote:
 Jeff Davis wrote:
  On Tue, 2007-08-14 at 16:27 -0500, Decibel! wrote:
  Isn't this what Grouped Index Tuples is?
  
  http://community.enterprisedb.com/git/git-readme.txt
  
  It looks like GIT is a little different. 
  
  GIT actually stores a lower-bound key of a contiguous* range of keys
  that all point to the same page, and for each of those ranges stores a
  bitmap of page offsets. A search searches first for an exact match in
  the index, and failing that, looks to see if the previous index tuple
  happens to be one of these ranges.
  
  The algorithm Chris is talking about stores a set of tuple ids (which
  include page and offset) for each distinct key.
 
 Right.
 
  Both could be helpful, although I don't think they can work together
  very well.
 
 What Chris is suggesting is basically a special case of GIT, where all
 the heap tuples represented by an index tuple have the same key. I was
 actually thinking of adding a flag to index tuples to indicate that
 special case in GIT. We could effectively do both.

A few additional thoughts...

The approach suggested by Chris is also used by Teradata Non-Unique
Secondary Indexes, known as NUSIs (but not named by me!). The following
link is a public domain description that is detailed enough:
http://teradata.uark.edu/research/wang/indexes.html

Replicating this approach directly isn't that useful because it would
interact badly with the way we handle updates. Thinking about the basic
design pattern however, we can envisage a type of index that changes the
1:1 mapping between index and heap tuple into the concept of a tuple
set index where we have a 1:Many mapping between index and heap. 

In general, the tuple set index approach can significantly reduce index
size. This won't be of interest to anyone unless all of your data
overflows RAM and you need to do I/O to constantly page back in pieces
of your data. If your database does fit in RAM, reducing the number of
index blocks might just increase contention. This means that the tuple
set approach is only useful when you have very large databases, but when
you do it is very, very useful.

GIT is a tuple set index with two important extensions:

1. GIT allows a range of values to be indexed, not just a single value,
so this allows it to be useful for both Unique and Non-Unique cases.

2. GIT restricts the set of tuples to a small range of blocks within the
table. Making the range of blocks = 1 means that GIT is designed to work
well with HOT, which also stays on the same block. Keeping the range of
blocks small means GIT degrades as slowly as possible in the face of
cold UPDATEs. If the values are inserted in roughly ordered/clustered
sequence then this doesn't increase index size at all, so the most
common/highest volume use cases are covered.

So from my perspective, GIT is very close to the optimal design for a
tuple set index that addresses the need for high concurrency and
unique/near-uniqueness with PostgreSQL. There are certainly many other
options for a tuple set index, and bitmap indexes are simply another
version of a tuple set index but with different behaviours tuned to a
different use case. There may be other use cases that require more than
two kinds of tuple set index... and we have discussed implementing the
tuple set behaviour as part of the other main index types.

As an aside, it turns out after further research that GIT is similar to
a clustered index in SQL Server, as described by Louis Davidson, Pro SQL
Server 2005 Database Design and Optimization, p.405. SQL Server creates
a clustered index by default on each PK, or so it says.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Testing the async-commit patch

2007-08-15 Thread Simon Riggs
On Tue, 2007-08-14 at 12:29 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  On Tue, 2007-08-14 at 12:09 -0400, Tom Lane wrote:
  heapam.c lines 1843-1852 presume previous xmax can be hinted
  immediately, ditto lines 2167-2176, ditto lines 2716-2725.
  I think probably we should just remove those lines --- they are only
  trying to save work for future tqual.c calls.
 
  I'll check those out later tonight.
 
 [ looks closer ] Actually, we can't just dike out those code sections,
 because the immediately following code assumes that XMAX_INVALID is
 correct.  So I guess we need to export HeapTupleSetHintBits from tqual.c
 and do the full pushup in these places.

Looks good: the code is neater and better commented than before as well
as being fully accurate with async commit.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Simon Riggs
On Tue, 2007-08-14 at 17:41 -0400, Tom Lane wrote:

 I've just finished re-reading the prior thread, and here are what seem
 to me to be the salient points:
 
 * Oleg, Teodor, and all of the old-line users of tsearch2 are
 comfortable with setting up a trigger to maintain a materialized
 tsvector column for a table.  They argue that recomputing the tsvector
 (possibly more than once) during a query is more expensive than fetching
 it from disk.  My suspicion is that the latter argument gets weaker
 every year --- CPUs are getting faster lots faster than disks are.
 
 * Bruce (and I ... not sure about anyone else) want to support usage of
 text search via a functional index.  This is argued to be easier to set
 up (no fooling with triggers) and possibly faster depending on CPU vs
 I/O speeds.  I don't think there is any desire here to eliminate the
 trigger approach, just to provide an alternative.

ISTM that the functional index would be considerably smaller than the
additional column approach, since tsvectors can be quite long. That
seems like a very desirable thing with larger textbases. However,
without an additional column certain queries would not be possible, such
as IndexScans on a non-text search index with an additional filter on
text search. So each way would be desirable in different situations.

Would it be wrong to allow both approaches? If there is strong
disagreement then it usually means both people are right.
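
For concreteness, the two setups being compared would look roughly like
this (a sketch only - the table, column, and 'english' configuration
names are assumed, and the exact function signatures are still being
discussed):

-- functional index: no extra column, the tsvector is recomputed per row
CREATE INDEX docs_fts_idx ON docs USING gin (to_tsvector('english', body));

-- materialized column: the tsvector is stored and maintained by a trigger
ALTER TABLE docs ADD COLUMN body_fts tsvector;
CREATE INDEX docs_fts_col_idx ON docs USING gin (body_fts);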

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Simon Riggs
On Wed, 2007-08-15 at 08:10 +0200, Magnus Hagander wrote:
 Tom Lane wrote:
  Bruce Momjian [EMAIL PROTECTED] writes:
  Tom Lane wrote:
  Since I don't think that a datatype solution is the way to go,
  I don't feel that we are as far away from an agreement as Bruce
  is worried about.
  
  Well, from where I sit, there is one person saying give me the foot gun,
  and Heikki saying he wants a bullet-proof type system, and you and I are
  in the middle, so the big problem is I don't see a consensus forming,
  and we have been discussing this for a while.
  
  The people who actually use tsearch2 seem to all have the same opinion ...
  so I think we can't go too far in the bullet-proofing direction.
  
  But I would like a design that is bulletproof in dump/reload scenarios,
  and I think it's fair to question that aspect of the tsearch2 design
  because we've seen many reports of people having trouble updating
  databases that use tsearch2.
 
 dump/reload is *the* biggest problem I've had with tsearch2 so far. But
 it hasn't been with the actual data - it's been the functions, and only
 when migrating between versions. But solving dump/reload reliably is one
 of the main things I'm hoping for in 8.3 ;-)

I can see the problem, but I'm sure there are more solutions than have
been listed so far.

If dump/restore is a problem we can:

1. force pg_dump to output a SET command for the GUC, so it is correctly
set at restore time. That seems like a straightforward addition to
pg_dump. Maybe this can be done in a generalised manner to support other
dump/restore configuration difficulties that might occur in the future.

2. put the desired value in a table and make sure the text_search_config
table is dumped ahead of other objects. When we restore we build the
index based on the config option set in the table, so it all just
works. 

3... probably other options too. 
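
For option (1), the extra line in the dump preamble might look something
like this (the GUC name here is hypothetical, since it doesn't exist yet):

SET default_text_search_config = 'english';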

Maybe we should consider that the user may be dumping and reloading
*because* they want the configuration to change. Just a thought.

-- 
  Simon Riggs
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Bruce Momjian
Magnus Hagander wrote:
  But I would like a design that is bulletproof in dump/reload scenarios,
  and I think it's fair to question that aspect of the tsearch2 design
  because we've seen many reports of people having trouble updating
  databases that use tsearch2.
 
 dump/reload is *the* biggest problem I've had with tsearch2 so far. But
 it hasn't been with the actual data - it's been the functions, and only
 when migrating between versions. But solving dump/reload reliably is one
 of the main things I'm hoping for in 8.3 ;-)
 
 As for another usage pointer: I use different configurations in the same
 database - but only one per table. I always explicitly use the
 to_tsvector variant that specifies a configuration - to avoid surprising
 myself.
 
 I don't use the functional index part, but for new users I can see how
 that's certainly a *lot* easier. I don't see requiring the configuration
 to be specified explicitly when creating such an index as a big problem
 at all - compared to the work needed to set up triggers. But it's nice
 not to have to do it when querying. Wouldn't that be solved by having
 to_tsvector() require the configuration, but to_tsquery() and
 plainto_tsquery() not require it?

Yea, I have thought about splitting up the behavior so tsvector always
needs the configuration but tsquery does not.  However, for a query, you
are probably still creating a tsvector, so it didn't seem to help much in
clarity.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Magnus Hagander
On Wed, Aug 15, 2007 at 10:23:00AM -0400, Bruce Momjian wrote:
 Magnus Hagander wrote:
   But I would like a design that is bulletproof in dump/reload scenarios,
   and I think it's fair to question that aspect of the tsearch2 design
   because we've seen many reports of people having trouble updating
   databases that use tsearch2.
  
  dump/reload is *the* biggest problem I've had with tsearch2 so far. But
  it hasn't been with the actual data - it's been the functions, and only
  when migrating between versions. But solving dump/reload reliably is one
  of the main things I'm hoping for in 8.3 ;-)
  
  As for another usage pointer: I use different configurations in the same
  database - but only one per table. I always explicitly use the
  to_tsvector variant that specifies a configuration - to avoid surprising
  myself.
  
  I don't use the functional index part, but for new users I can see how
  that's certainly a *lot* easier. I don't see requiring the configuration
  to be specified explicitly when creating such an index as a big problem
  at all - compared to the work needed to set up triggers. But it's nice
  not to have to do it when querying. Wouldn't that be solved by having
  to_tsvector() require the configuration, but to_tsquery() and
  plainto_tsquery() not require it?
 
 Yea, I have thought about splitting up the behavior so tsvector always
 needs the configuration but tsquery does not.  However, for a query, you
 are probably still creating a tsvector, so it didn't seem to help much in
 clarity.

Agh, I got stuck thinking about the trigger case again - where you don't
need to create a vector at all.

//Magnus

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] is this trigger safe and effective? - locking (caching via triggers)

2007-08-15 Thread Decibel!
I don't like the locking... take a look at Ex 37-1 at the end of
http://lnk.nu/postgresql.org/fhe.html for a better way (though the
comment below about going into an infinite loop is a good observation;
I think perhaps after some number of fast tries it should start
putting a sleep in the loop, rather than just arbitrarily bombing after
10 tries).

Also, I remember discussion on -performance about this from folks using
it in the real world... the problem they ran into is that doing the
updates in the cache/mview table directly bloated it too much... they
found it was better to just insert changes into an interim table, and
then periodically batch-process that table.
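
For the record, here's a minimal sketch of that looping approach adapted
to Pavel's cache table (note this assumes a unique index on
cache(category), which the posted sample doesn't declare):

CREATE UNIQUE INDEX cache_category_idx ON safecache.cache (category);

CREATE OR REPLACE FUNCTION safecache.merge_cache(k int, delta int)
RETURNS void AS
$$
BEGIN
  LOOP
    -- first try to update an existing row
    UPDATE safecache.cache
       SET sum_val = sum_val + delta
     WHERE category = k;
    IF found THEN
      RETURN;
    END IF;
    -- no row yet: try to insert it; if another backend gets there
    -- first, the unique_violation sends us back around the loop
    BEGIN
      INSERT INTO safecache.cache VALUES (k, delta);
      RETURN;
    EXCEPTION WHEN unique_violation THEN
      NULL;  -- lost the race, loop back and retry the UPDATE
    END;
  END LOOP;
END;
$$ LANGUAGE plpgsql;

The trigger body would then collapse to a few calls to this function
instead of taking a table lock.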

On Wed, Aug 15, 2007 at 08:01:24AM +0200, Pavel Stehule wrote:
 Hello
 
 I wrote a sample about triggers and I have a question: is my solution
 correct, and does a better solution exist?
 
 Regards
 Pavel Stehule
 
 DROP SCHEMA safecache CASCADE;
 
 CREATE SCHEMA safecache;
 
 CREATE TABLE safecache.source_tbl(category int, int_value int);
 
 CREATE TABLE safecache.cache(category int, sum_val int);
 
 CREATE OR REPLACE FUNCTION safecache.source_tbl_trg_fce()
 RETURNS trigger AS
 $$
 BEGIN
   IF TG_OP = 'INSERT' THEN
     -- the row may not exist in the cache yet -- the complicated case;
     -- I want to finish this transaction without a conflict
     IF NOT EXISTS(SELECT category
                     FROM safecache.cache
                    WHERE category = NEW.category) THEN
       LOCK TABLE safecache.cache IN SHARE ROW EXCLUSIVE MODE;
       -- I have to repeat the test
       IF NOT EXISTS(SELECT category
                       FROM safecache.cache
                      WHERE category = NEW.category) THEN
         INSERT INTO safecache.cache
           VALUES(NEW.category, NEW.int_value);
       END IF;
     ELSE
       -- simple
       UPDATE safecache.cache
          SET sum_val = sum_val + NEW.int_value
        WHERE category = NEW.category;
     END IF;
   ELSEIF TG_OP = 'UPDATE' THEN
     -- if the category is unchanged, it's simple
     IF NEW.category = OLD.category THEN
       UPDATE safecache.cache
          SET sum_val = sum_val + (NEW.int_value - OLD.int_value)
        WHERE category = OLD.category;
     ELSE
       -- the old category has to exist already
       UPDATE safecache.cache
          SET sum_val = sum_val - OLD.int_value
        WHERE category = OLD.category;
       -- the new category may be a problem
       IF NOT EXISTS(SELECT category
                       FROM safecache.cache
                      WHERE category = NEW.category) THEN
         LOCK TABLE safecache.cache IN SHARE ROW EXCLUSIVE MODE;
         -- I have to repeat the test
         IF NOT EXISTS(SELECT category
                         FROM safecache.cache
                        WHERE category = NEW.category) THEN
           INSERT INTO safecache.cache
             VALUES(NEW.category, NEW.int_value);
         END IF;
       ELSE
         -- simple: the new category exists
         UPDATE safecache.cache
            SET sum_val = sum_val + NEW.int_value
          WHERE category = NEW.category;
       END IF;
     END IF;
   ELSE -- DELETE
     -- the category has to exist in the cache already; simple
     UPDATE safecache.cache
        SET sum_val = sum_val - OLD.int_value
      WHERE category = OLD.category;
   END IF;
   RETURN NEW;
 END
 $$ LANGUAGE plpgsql;
 
 CREATE TRIGGER actualise_cache
AFTER INSERT OR UPDATE OR DELETE
ON safecache.source_tbl
FOR EACH ROW EXECUTE PROCEDURE safecache.source_tbl_trg_fce();
 
 ---(end of broadcast)---
 TIP 7: You can help support the PostgreSQL project by donating at
 
 http://www.postgresql.org/about/donate
 

-- 
Decibel!, aka Jim Nasby        [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgpM1sQ9YWbpd.pgp
Description: PGP signature


Re: [HACKERS] [EMAIL PROTECTED]: Re: [GENERAL] array_to_set functions]

2007-08-15 Thread Decibel!
On Wed, Aug 15, 2007 at 06:47:05AM +0200, Pavel Stehule wrote:
 2007/8/14, Decibel! [EMAIL PROTECTED]:
  On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:
   2007/8/14, Bruce Momjian [EMAIL PROTECTED]:
   
TODO item?
   
  + If your life is a hard drive, Christ can be your backup. +
   
---(end of broadcast)---
TIP 6: explain analyze is your friend
  
   I am against. It's simple enough to do it in the SQL language.
 
  Why make everyone who works with arrays create a function just to do
  this? Something that's of use to common users should be included, simple
  or not.
  --
 
 Unpacking an array is more of an SQL construct for me than an SRF
 function. With a function you cannot control the behaviour of the
 unpacking. With an SQL construct I can:

Huh? You can do a DISTINCT or an ORDER BY on the output of a SRF.

 SELECT DISTINCT a(i) FROM generate_series ... removes duplicates
 SELECT a(i) FROM generate_series ORDER BY ... gives sorted output
 etc
 
 But I can write:
 
 SELECT * FROM generate_series(ARRAY[1,3,4,5,7,10]);
 
 given a function:
 FUNCTION generate_series(anyarray) RETURNS SETOF anyelement
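
FWIW, a polymorphic SQL function along those lines is nearly a one-liner
already; a minimal sketch (the name array_to_set is assumed, and it only
handles one-dimensional arrays):

CREATE OR REPLACE FUNCTION array_to_set(anyarray)
RETURNS SETOF anyelement AS
$$
  -- emit one row per array element, in index order
  SELECT $1[i]
    FROM generate_series(array_lower($1, 1), array_upper($1, 1)) AS i;
$$ LANGUAGE sql IMMUTABLE;

SELECT * FROM array_to_set(ARRAY[1,3,4,5,7,10]);
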
-- 
Decibel!, aka Jim Nasby        [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgphVkwnyowqO.pgp
Description: PGP signature


Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 ISTM that the functional index would be considerably smaller than the
 additional column approach, since tsvectors can be quite long. That
 seems like a very desirable thing with larger textbases. However,
 without an additional column certain queries would not be possible, such
 as IndexScans on a non-text search index with an additional filter on
 text search. So each way would be desirable in different situations.

Huh?  Of course you can do the searching without an additional column;
you just have to compute the tsvector on-the-fly at each row.  This is
a straight trade of more CPU cycles for less I/O.

 Would it be wrong to allow both approaches?

Nobody has suggested disallowing the trigger approach (indeed it's hard
to see how we could).  The argument is mostly about how to make a
functional index approach work conveniently; and secondarily about
what's needed to make dump/restore reliably reproduce the current
database state, whichever approach you choose.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Andrew Dunstan



Tom Lane wrote:
 Magnus Hagander [EMAIL PROTECTED] writes:
  Tom Lane wrote:
   Good, but the salient followup questions to that are (1) backed up to
   where exactly?, and (2) how many days' past backups could we get at,
   if we had to?

  They are dumped to a NFS share on this schedule. That NFS share is
  dumped to tape by systems at Conova - I'll let Stefan fill in the
  details about that.

 That's good as far as it goes, but seeing that PG is a worldwide
 organization now, I wonder whether our primary CVS shouldn't have
 backups on several continents.  Pardon my paranoia ... but our
 collective arses have been saved by offsite backups at least once
 already ...

Yes, I think we could improve on that. Have we considered more
sophisticated solutions that provide incremental backup on a more
frequent basis? I'd be inclined to use Bacula or similar (and it uses
Postgres for its metadata store :-) ). Ideally I think we'd like to be
able fairly easily and quickly to roll the repo (or some portion of it)
back to a fairly arbitrary and fairly precise (say within an hour or
two) point in recent time.

Meanwhile, those of us who rsync the entire repo could easily make
rolling backup copies. Arguably this might be better done from a repo
made using --numeric-ids. Tarred and compressed it's a shade under 90
MB, which isn't huge. If people do this at various times of the day we'd
get pretty good coverage :-)



cheers

andrew

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate


[HACKERS] Another idea for index-only scans

2007-08-15 Thread Bruce Momjian
I have added another idea for index-only scans to the TODO list:

   A third idea would be for a heap scan to check if all rows are visible
   and if so set a per-table flag which can be checked by index scans.
   Any change to the table would have to clear the flag.  To detect
   changes during the heap scan a counter could be set at the start and
   checked at the end --- if it is the same, the table has not been
   modified --- any table change would increment the counter.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


[HACKERS] XID wraparound and busy databases

2007-08-15 Thread Bruce Momjian
I was talking to someone at LinuxWorld and they mentioned they often
have activity of 6k SELECTs per second, and that they were needing to
autovacuum every few days because of xid wraparound.

I did some calculations and found that:

	6000 * 60 * 60 * 24 = 518,400,000

or 500 million xids per day, confirming they would need the autovacuum
to run every few days.

Is enlarging the xid field something we should consider for 8.4?  Is the
autovacuum I/O less than the extra I/O needed for expanded xid fields
on every row?

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Stefan Kaltenbrunner
Andrew Dunstan wrote:
 Tom Lane wrote:
  Magnus Hagander [EMAIL PROTECTED] writes:
   Tom Lane wrote:
    Good, but the salient followup questions to that are (1) backed up to
    where exactly?, and (2) how many days' past backups could we get at,
    if we had to?
 
   They are dumped to a NFS share on this schedule. That NFS share is
   dumped to tape by systems at Conova - I'll let Stefan fill in the
   details about that.
 
  That's good as far as it goes, but seeing that PG is a worldwide
  organization now, I wonder whether our primary CVS shouldn't have
  backups on several continents.  Pardon my paranoia ... but our
  collective arses have been saved by offsite backups at least once
  already ...
 
 Yes, I think we could improve on that. Have we considered more
 sophisticated solutions that provide incremental backup on a more
 frequent basis? I'd be inclined to use Bacula or similar (and it uses
 Postgres for its metadata store :-) ). Ideally I think we'd like to be
 able fairly easily and quickly to roll the repo (or some portion of it)
 back to a fairly arbitrary and fairly precise (say within an hour or
 two) point in recent time.

Well yeah - while I do think that something as complex as Bacula is
probably overkill for our needs, we can certainly improve over the
current state.
One thing to consider is that we actually have two major scenarios to
deal with:

1. simple repo corruption (accident, cvs software bug, admin error):
this one might require us to restore the repo from an older backup in
case the corruption cannot be repaired easily.
For this we already have myriads of copies of the trees out in the wild,
but it might be a good idea to keep a number of snapshots of the main repo
on the CVS-VPS itself (either done every few hours or made as part of
the push to anoncvs and svr1).
This way we could do a very simple in-place recovery on the same
running VPS instance with fairly low impact on all the committers (and
dependent parts of the infrastructure).

2. total loss of the main CVS-VPS (extended power failure, hardware
error, OS bug, admin error, fire, some other catastrophic event): in
this case we will have to fail over to one of the other project
datacenters, and for this we need to have full regular copies of the
whole VM on external hosts.

Stefan

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Stefan Kaltenbrunner
Peter Eisentraut wrote:
 Am Mittwoch, 15. August 2007 04:20 schrieb Tom Lane:
 we should at least log such commands, and maybe disallow to anyone
 except Marc's pgsql account.
 
 I don't think we should disallow it.  Or otherwise we might one day be stuck 
 if we need to release while some specific person is on vacation.

Is this really a problem in practice?
If such a need arises, any committer (or sysadmin) would probably be able
to remove that restriction in a few minutes.
I think the point here is to prevent such things being done by accident
(vs. on purpose), for which a script like that seems like a fairly simple
yet effective solution.


Stefan

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Tom Lane
Stefan Kaltenbrunner [EMAIL PROTECTED] writes:
 Peter Eisentraut wrote:
 Am Mittwoch, 15. August 2007 04:20 schrieb Tom Lane:
 we should at least log such commands, and maybe disallow to anyone
 except Marc's pgsql account.
 
 I don't think we should disallow it.  Or otherwise we might one day be stuck 
 if we need to release while some specific person is on vacation.

 Is this really a problem in practice?

I'd be happy if such commands got reported to the pgsql-committers
mailing list.  The real problem with this mistake is not so much
that it was made as that we had to do detective work to find out.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] [EMAIL PROTECTED]: Re: [GENERAL] array_to_set functions]

2007-08-15 Thread Pavel Stehule
2007/8/15, Decibel! [EMAIL PROTECTED]:
 On Wed, Aug 15, 2007 at 06:47:05AM +0200, Pavel Stehule wrote:
  2007/8/14, Decibel! [EMAIL PROTECTED]:
   On Tue, Aug 14, 2007 at 05:38:33PM +0200, Pavel Stehule wrote:
2007/8/14, Bruce Momjian [EMAIL PROTECTED]:

 TODO item?

   + If your life is a hard drive, Christ can be your backup. +

 ---(end of 
 broadcast)---
 TIP 6: explain analyze is your friend
   
I am against. It's simple enough to do it in the SQL language.
  
   Why make everyone who works with arrays create a function just to do
   this? Something that's of use to common users should be included, simple
   or not.
   --
 
  Unpacking an array is more of an SQL construct for me than an SRF
  function. With a function you cannot control the behaviour of the
  unpacking. With an SQL construct I can:

 Huh? You can do a DISTINCT or an ORDER BY on the output of a SRF.

Yes, but then you get something a little bit different from my form :)



  SELECT DISTINCT a(i) FROM generate_series ... removes duplicates
  SELECT a(i) FROM generate_series ORDER BY ... gives sorted output
  etc
 
  But I can write:
 
  SELECT * FROM generate_series(ARRAY[1,3,4,5,7,10]);
 
  given a function:
  FUNCTION generate_series(anyarray) RETURNS SETOF anyelement
 --
 Decibel!, aka Jim Nasby        [EMAIL PROTECTED]
 EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Is enlarging the xid field something we should consider for 8.4?

No.  We just got the tuple header down to 24 bytes, we are not going
to give that back and then some.

If you are processing 6K transactions per second, you can afford to
vacuum every couple days... and probably need to vacuum much more often
than that anyway, to avoid table bloat.

Possibly your respondent should think about trying to do more than one
thing per transaction?

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Is enlarging the xid field something we should consider for 8.4?
 
 No.  We just got the tuple header down to 24 bytes, we are not going
 to give that back and then some.
 
 If you are processing 6K transactions per second, you can afford to
 vacuum every couple days... and probably need to vacuum much more often
 than that anyway, to avoid table bloat.
 
 Possibly your respondent should think about trying to do more than one
 thing per transaction?

OK, yea, I think that makes sense.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Heikki Linnakangas
Bruce Momjian wrote:
 Is enlarging the xid field something we should consider for 8.4?  Is the
 autovacuum I/O less than the extra I/O needed for expanded xid fields
 on every row?

I doubt that's going to happen...

Maybe we can do something to reduce the xid consumption? For example,
reuse xids for read-only queries.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Marc G. Fournier
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Wednesday, August 15, 2007 12:11:35 +0200 Peter Eisentraut 
[EMAIL PROTECTED] wrote:

 Am Mittwoch, 15. August 2007 04:20 schrieb Tom Lane:
 we should at least log such commands, and maybe disallow to anyone
 except Marc's pgsql account.

 I don't think we should disallow it.  Or otherwise we might one day be stuck
 if we need to release while some specific person is on vacation.

This isn't a big issue ... note that the 'restriction' is easy to remove ...
you check out CVSROOT, modify taginfo and comment out the ALL line, and
check that in ...

What this will prevent is an 'accidental tagging' ... you would have to
consciously remove the restriction ... but it's something anyone could do ...

 I never understood why tagging uses a special account anyway.  It should be
 done as the person doing the tagging.

Agreed, I'm going to start doing it as myself from now on ... I'm not even 100% 
certain *why* I started doing it as pgsql in the first place ...


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.org   ICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFGwye54QvfyHIvDvMRAiHEAKCHGtXA+r5PM1SoBewMJDo3An7BKACg0//z
gM7eZNoTEU4sqwBIHprFK1k=
=/Qga
-END PGP SIGNATURE-


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Tom Lane
Heikki Linnakangas [EMAIL PROTECTED] writes:
 Maybe we can do something to reduce the xid consumption? For example,
 reuse xids for read-only queries.

Hmm, that's an idea.

More simply, just keep the current transaction open (resetting
everything but the XID) if we have made no changes by the time we're
told to commit or rollback ... which is something we track already,
so as not to waste cycles on useless commit XLOG records.

You'd want some upper limit on transaction lifetime, so as to avoid the
"long-lived transactions hurt VACUUM" problem, but even reusing a single
xact for a few seconds would pretty much eliminate this issue, I bet.
It's hard to see how anyone could be doing 6K xacts/sec unless most are
read-only.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


[HACKERS] Index Tuple Compression Approach?

2007-08-15 Thread Dawid Kuroczko
On 8/14/07, Chris Browne [EMAIL PROTECTED] wrote:
 I recently had a chat with someone who was pretty intimate with Adabas
 for a number of years who's in the process of figuring things out
 about PostgreSQL.  We poked at bits of the respective implementations,
 seeing some similarities and differences.  He pointed out one aspect
 of index handling that could (in principle) be an interesting
 optimization.

 Evidently, in Adabas, index leaf nodes were not simply tuples, but
 lists where the index value would not be repeated.

 In PostgreSQL, if you have the index value 'abc', and there are 10
 tuples with that value, then you'll have a page full of tuples of the
 following form:

 |abc|ptr[rec1]|abc|ptr[rec2]|abc|ptr[rec3]| ...and so forth...

 Now, the Adabas approach was rather different.  It would only have the
 index value once, and then have the list of tuple pointers:

 |abc|ptr[rec1],ptr[rec2],ptr[rec3],...[ptr[rec10]|

 This could allow a fair bit of compression, for cases where the index
 value is not unique.

Interesting.  Some time ago I played a little with quite a big table
which contained a path (file path) as a primary key.  It made sense
to have such a structure (SELECTs were mostly ORDER BY path WHERE
path > '/foo' LIMIT n).
The actual index was only a little bit smaller than the table (there were
maybe 4 or 5 columns there).

Some time ago I had an idea that it might be possible to compress
the index size, even if it is a unique index.  Take the path example.
My idea would be to split the indexed value into 8-byte chunks.
For example: /var/lib/postgresql/8.2/main would be split into:
  /var/lib
  /postgre
  sql/8.2 -- these would be inserted into the tree as a scaffold,
and only vacuum should remove them...
  main -- this would be a leaf node.  It could be repeated in non-unique
indexes.

[/etc/pas] -- scaffold-node
 |-swd-- leaf node
[/etc/sha]
 |-dow
[/var/lib]   -- a problematic mixed scaffold/leaf node.
 [/postgre]
  |-sql
  |-sql/8.2
  [sql/8.2/]
   |-main
   |-foobar

The scaffold nodes would be there to guarantee that there is some
place to attach leafs to.  They should not be removed by DELETE
(unless we are sure no other node depends on them).

Advantages?  The repeated values (as /var/lib/postgresql/8.2)
are not repeated -- they are embedded into tree, as a scaffold,
actual nodes that are significant (files, not directories, in my
example) are put as actual leafs.

I guess it would reduce the size of large indexes and at the same time
it could remove the limitation that a B-tree index cannot index values
larger than 1/3 of the database page.  8-byte chunks were given
as an example here; perhaps a larger value would be better.
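
Just to illustrate the chunking itself, here is how a path splits on
8-byte boundaries in plain SQL (the boundaries fall slightly differently
than in my hand-drawn sketch above):

SELECT i, substr('/var/lib/postgresql/8.2/main', 8*(i-1) + 1, 8) AS chunk
  FROM generate_series(1, 4) AS i;
-- chunks: '/var/lib', '/postgre', 'sql/8.2/', 'main'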

(Of course redesigning the schema to put directories separate from
files would be useful, but it would not help with ORDER BY .. LIMIT
queries -- they would have to be JOIN-ed and re-sorted in memory,
I'm afraid.)

  Regards,
Dawid

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Decibel!
On Wed, Aug 15, 2007 at 12:49:52PM -0400, Tom Lane wrote:
 Heikki Linnakangas [EMAIL PROTECTED] writes:
  Maybe we can do something to reduce the xid consumption? For example,
  reuse xids for read-only queries.
 
 Hmm, that's an idea.
 
 More simply, just keep the current transaction open (resetting
 everything but the XID) if we have made no changes by the time we're
 told to commit or rollback ... which is something we track already,
 so as not to waste cycles on useless commit XLOG records.
 
 You'd want some upper limit on transaction lifetime, so as to avoid the
 long lived transactions hurt VACUUM problem, but even reusing a single
 xact for a few seconds would pretty much eliminate this issue, I bet.
 It's hard to see how anyone could be doing 6K xacts/sec unless most are
 read-only.

Aren't there potential issues with keeping the same XID if a transaction
in the middle has modified data? IE:

SELECT * FROM a
DELETE FROM a ...
SELECT * FROM a

I'm thinking that in any system that's doing an enormous transaction
rate, most will be read-only, which means there'd still be a lot of
benefit to simply using one transaction until the next write transaction
came along.

Something else to think about... any app that's doing that kind of
transaction rate is likely going to have a large number of backends, so
it would be even better if one XID could be shared across backends.
-- 
Decibel!, aka Jim Nasby        [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)


pgp8a0Ey4UZrF.pgp
Description: PGP signature


Re: [HACKERS] Index Tuple Compression Approach?

2007-08-15 Thread Jeff Davis
On Wed, 2007-08-15 at 06:51 +0100, Heikki Linnakangas wrote:
 What Chris is suggesting is basically a special case of GIT, where all
 the heap tuples represented by an index tuple have the same key. I was
 actually thinking of adding a flag to index tuples to indicate that
 special case in GIT. We could effectively do both.
 

The bigger difference that I see is that GIT doesn't just group together
ranges of keys, it also groups by heap page number (or a small range of
page numbers, as Simon pointed out). 

Regards,
Jeff Davis




---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Tom Lane
Decibel! [EMAIL PROTECTED] writes:
 Aren't there potential issues with keeping the same XID if a transaction
 in the middle has modified data?

I don't see any, as long as you take a new snapshot.

 Something else to think about... any app that's doing that kind of
 transaction rate is likely going to have a large number of backends, so
 it would be even better if one XID could be shared across backends.

Not sane in the least.  What happens if two of them start to modify
data, and then one commits and one rolls back?

In any case, if we can cut the xact rate to one every few seconds per
backend, the problem goes away.

Actually ... an idea that might be worth investigating is to do
something similar to what we already did for subtransactions: don't
assign an XID at all until the transaction makes a data change.
I think that the main reason for assigning an XID earlier is just that
the ProcArray routines ignore backends that are not currently showing an
active XID when they figure global XMIN.  But a backend could probably
advertise an XMIN, indicating the age of the oldest active snapshot it's
got, without advertising an active XID.  (This could also tie into the
idea we discussed awhile back of tracking minimum XMIN better by
maintaining a central list of open snapshots within each backend.
Perhaps that should be totally decoupled from whether we are advertising
an active XID ...)

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Another idea for index-only scans

2007-08-15 Thread Mike Rylander
On 8/15/07, Bruce Momjian [EMAIL PROTECTED] wrote:
 I have added another idea for index-only scans to the TODO list:

A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag.  To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.

Perhaps this is naive (or discussed and discarded... if so, I couldn't
find it, but I apologize if that's the case), but wouldn't recording
the xid of non-read-only transactions, at commit time and at the table
level, be equivalent to the flag and remove the need for a counter?
Readers could just check the last-modification xid at the beginning
and end of their scans to test for heap stability.

I suppose that would require a write-exclusive lock on some metadata
for each modified table during each commit... so perhaps it's a
non-starter right there.
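
A purely hypothetical sketch of the reader side (table_write_meta and
last_write_xid exist nowhere -- they are invented names, only meant to
make the protocol concrete):

-- at scan start, remember the table's last writer:
SELECT last_write_xid FROM table_write_meta WHERE relid = 'a'::regclass;
-- ... run the index-only scan ...
-- at scan end, re-run the same SELECT: if the value is unchanged, the
-- heap was stable for the whole scan and no visibility recheck was needed.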

--miker

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Darcy Buskermolen
On Wednesday 15 August 2007 09:49:52 Tom Lane wrote:
 Heikki Linnakangas [EMAIL PROTECTED] writes:
  Maybe we can do something to reduce the xid consumption? For example,
  reuse xids for read-only queries.

 Hmm, that's an idea.

 More simply, just keep the current transaction open (resetting
 everything but the XID) if we have made no changes by the time we're
 told to commit or rollback ... which is something we track already,
 so as not to waste cycles on useless commit XLOG records.

Jan and I were discussing something like this as it would relate to the 
subscribe process for Slony.

Jan, care to summarize your thoughts on this?



 You'd want some upper limit on transaction lifetime, so as to avoid the
 long lived transactions hurt VACUUM problem, but even reusing a single
 xact for a few seconds would pretty much eliminate this issue, I bet.
 It's hard to see how anyone could be doing 6K xacts/sec unless most are
 read-only.

   regards, tom lane

 ---(end of broadcast)---
 TIP 5: don't forget to increase your free space map settings

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Another idea for index-only scans

2007-08-15 Thread Josh Berkus
Bruce,

 I have added another idea for index-only scans to the TODO list:
A third idea would be for a heap scan to check if all rows are
  visible and if so set a per-table flag which can be checked by index
  scans. Any change to the table would have to clear the flag.  To
  detect changes during the heap scan a counter could be set at the
  start and checked at the end --- if it is the same, the table has not
  been modified --- any table change would increment the counter.

Seems marginal at best.  Checking overlap between the index and the FSM/DSM 
and only checking dirty pages seems more intelligent, and able to cover a 
larger number of cases.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] change name of redirect_stderr?

2007-08-15 Thread Josh Berkus
Heikki,

 Should we change the ordering of pg_settings?

Well, an unfixed issue in pg_settings is that the category/subcategory 
relationship is represented by a non-atomic field, which makes sorting on 
it difficult.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Another idea for index-only scans

2007-08-15 Thread Gregory Stark
Bruce Momjian [EMAIL PROTECTED] writes:

 I have added another idea for index-only scans to the TODO list:

   A third idea would be for a heap scan to check if all rows are visible
   and if so set a per-table flag which can be checked by index scans.
   Any change to the table would have to clear the flag.  To detect
   changes during the heap scan a counter could be set at the start and
   checked at the end --- if it is the same, the table has not been
   modified --- any table change would increment the counter.

I think I would prefer to address this in the same infrastructure as the
dead-space-map. That way you're not dependent on having no updates happening
on the table at all. Any tuples on pages which contain no in-doubt tuples
could have their visibility check skipped, but when you come across a tuple on
a page which has been modified since the last vacuum, you have to check
the visibility.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Magnus Hagander
Marc G. Fournier wrote:
 
 
 --On Wednesday, August 15, 2007 12:11:35 +0200 Peter Eisentraut 
 [EMAIL PROTECTED] wrote:
 
 Am Mittwoch, 15. August 2007 04:20 schrieb Tom Lane:
 we should at least log such commands, and maybe disallow to anyone
 except Marc's pgsql account.
 I don't think we should disallow it.  Or otherwise we might one day be stuck
 if we need to release while some specific person is on vacation.
 
 This isn't a big issue ... note that the 'restriction' is easy to remove ... 
 you checkout CVSROOT, modify taginfo and comment out the ALL line and check 
 that in ...
 
 What this will prevent is an 'accidental tagging' ... you would have to 
 consciously remove the restriction .. but it's something anyone could do ...
 
 I never understood why tagging uses a special account anyway.  It should be
 done as the person doing the tagging.
 
 Agreed, I'm going to start doing it as myself from now on ... I'm not even
 100% certain *why* I started doing it as pgsql in the first place ...

If you're doing that, we should probably just delete the pgsql userid
from the system? Or at least change it so it doesn't have 'dev'
permissions. That way you can't do it wrong in that direction ;-)
Seems reasonable?

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Heikki Linnakangas
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 Is enlarging the xid field something we should consider for 8.4?
 
 No.  We just got the tuple header down to 24 bytes, we are not going
 to give that back and then some.
 
 If you are processing 6K transactions per second, you can afford to
 vacuum every couple days... and probably need to vacuum much more often
 than that anyway, to avoid table bloat.

Speaking of vacuum, hopefully we'll get some sort of dead space map in
8.4. If we keep track of frozen pages there, vacuuming to avoid xid
wraparound will be much cheaper.

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Another idea for index-only scans

2007-08-15 Thread Luke Lonergan
A hybrid scan approach combined with this idea would fit nicely - provide 
results for tids that are directly visible, set a bit in a bitmap for those 
that need recheck, and extend recheck to take a bitmap (wait - it already 
does :-)

- Luke

Msg is shrt cuz m on ma treo

 -Original Message-
From:   Gregory Stark [mailto:[EMAIL PROTECTED]
Sent:   Wednesday, August 15, 2007 02:58 PM Eastern Standard Time
To: Bruce Momjian
Cc: PostgreSQL-development
Subject:Re: [HACKERS] Another idea for index-only scans

Bruce Momjian [EMAIL PROTECTED] writes:

 I have added another idea for index-only scans to the TODO list:

   A third idea would be for a heap scan to check if all rows are visible
   and if so set a per-table flag which can be checked by index scans.
   Any change to the table would have to clear the flag.  To detect
   changes during the heap scan a counter could be set at the start and
   checked at the end --- if it is the same, the table has not been
   modified --- any table change would increment the counter.

I think I would prefer to address this in the same infrastructure as the
dead-space-map. That way you're not dependent on having no updates happening
on the table at all. Any tuples on pages which contain no in-doubt tuples
could have their visibility check skipped, but when you come across a tuple on
a page which has been modified since the last vacuum, you have to check
the visibility.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Marc G. Fournier



--On Wednesday, August 15, 2007 20:57:14 +0200 Magnus Hagander 
[EMAIL PROTECTED] wrote:

 Marc G. Fournier wrote:


 --On Wednesday, August 15, 2007 12:11:35 +0200 Peter Eisentraut
 [EMAIL PROTECTED] wrote:

 Am Mittwoch, 15. August 2007 04:20 schrieb Tom Lane:
 we should at least log such commands, and maybe disallow to anyone
 except Marc's pgsql account.
 I don't think we should disallow it.  Or otherwise we might one day be stuck
 if we need to release while some specific person is on vacation.

 This isn't a big issue ... note that the 'restriction' is easy to remove
 ... you checkout CVSROOT, modify taginfo and comment out the ALL line and
 check that in ...

 What this will prevent is an 'accidental tagging' ... you would have to
 consciously remove the restriction .. but it's something anyone could do ...

 I never understood why tagging uses a special account anyway.  It should be
 done as the person doing the tagging.

 Agreed, I'm going to start doing it as myself from now on ... I'm not even
 100%  certain *why* I started doing it as pgsql in the first place ...

 If you're doing that, we should probably just delete the pgsql userid
 from the system? Or at least change it so it doesn't have 'dev'
 permissions. That way you can't do it wrong in that direction ;-)
 Seems reasonable?

there is no pgsql user *on* cvs.postgresql.org, at least there wasn't when I 
logged on last night:

cvs# grep pgsql /etc/passwd
cvs#

--
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Zoltan Boszormenyi

Tom Lane wrote:
 It's hard to see how anyone could be doing 6K xacts/sec unless most are
 read-only.

 regards, tom lane

In a recent stress test with our PostgreSQL-based cluster between two
machines, 3 million transactions were performed with pgbench -c 150 -t 2
-s 200 in about _ten hours_. The primary machine (a desktop-level machine
for development) used a real disk, the secondary used tmpfs as PGDATA.
Say whatever you want about my disk lying about flush, its 75MB/sec
transfer rate is real.

So 5 million real transactions in 24 hours is not unrealistic.

--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/



---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Another idea for index-only scans

2007-08-15 Thread Bruce Momjian
Gregory Stark wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
 
  I have added another idea for index-only scans to the TODO list:
 
A third idea would be for a heap scan to check if all rows are visible
and if so set a per-table flag which can be checked by index scans.
Any change to the table would have to clear the flag.  To detect
changes during the heap scan a counter could be set at the start and
checked at the end --- if it is the same, the table has not been
modified --- any table change would increment the counter.
 
 I think I would prefer to address this in the same infrastructure as the
 dead-space-map. That way you're not dependent on having no updates happening
 on the table at all. Any tuples on pages which contain no in-doubt tuples
 could have their visibility check skipped, but when you come across a tuple on
 a page which has been modified since the last vacuum, you have to check
 the visibility.

Yea, the bitmap/page idea is already on the TODO list.  This was just a
less granular idea.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CVS corruption/mistagging?

2007-08-15 Thread Magnus Hagander
Marc G. Fournier wrote:
 If you're doing that, we should probably just delete the pgsql userid
 from the system? Or at least change it so it doesn't have 'dev'
 permissions. That way you can't do it wrong in that direction ;-)
 Seems reasonable?
 
 there is no pgsql user *on* cvs.postgresql.org, at least there wasn't when I 
 logged on last night:
 
 cvs# grep pgsql /etc/passwd
 cvs#

Ah, seems I was faster than I expected myself :-) Problem solved then.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Index Tuple Compression Approach?

2007-08-15 Thread Heikki Linnakangas
Dawid Kuroczko wrote:
 Some time ago I had an idea that it might be possible to compress
 the index size, even if it is a unique index.  Take the path example.
 My idea would be to split the indexed value into 8-byte chunks.
 For example: /var/lib/postgresql/8.2/main would be split into:
   /var/lib
   /postgre
   sql/8.2 -- these would be inserted into a tree as a scaffold,
 and only vacuum should remove them.
   main -- this would be a leaf node.  It could be repeated in non-unique
 indexes.

That general approach of storing a common leading part just once is
called prefix compression. Yeah, it helps a lot on long text fields,
tree structures like file paths in particular.

It's been discussed before. One big problem is extracting the common
leading part. You could only do it for text, but it should be done in a
datatype-neutral way.
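
For text, the extraction itself is simple enough. A throwaway sketch
(common_prefix is a made-up name; real prefix compression would of course
live inside the btree code, not in SQL):

CREATE OR REPLACE FUNCTION common_prefix(a text, b text) RETURNS text AS $$
DECLARE
  i int := 0;
  n int := least(length(a), length(b));
BEGIN
  -- advance while the two strings agree, character by character
  WHILE i < n AND substr(a, i + 1, 1) = substr(b, i + 1, 1) LOOP
    i := i + 1;
  END LOOP;
  RETURN substr(a, 1, i);
END
$$ LANGUAGE plpgsql IMMUTABLE;

SELECT common_prefix('/var/lib/postgresql/8.2/main',
                     '/var/lib/postgresql/8.2/foobar');
-- returns '/var/lib/postgresql/8.2/'

The datatype-neutral part is the hard bit: something comparable would be
needed for every indexable type, presumably as a new opclass-level support
function.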

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Zoltan Boszormenyi

Heikki Linnakangas wrote:
 Zoltan Boszormenyi wrote:
 Tom Lane wrote:
 It's hard to see how anyone could be doing 6K xacts/sec unless most are
 read-only.

 In a recent stress test with our PostgreSQL-based cluster between two
 machines, 3 million transactions were performed with pgbench -c 150 -t 2
 -s 200 in about _ten hours_. The primary machine (a desktop-level machine
 for development) used a real disk, the secondary used tmpfs as PGDATA.
 Say whatever you want about my disk lying about flush, its 75MB/sec
 transfer rate is real. So 5 million real transactions in 24 hours is not
 unrealistic.

 6k xacts / s is five *hundred* million transactions, not five million...

Blush. :-) You're right. However a single machine with a ramdisk is able
to do that.


--
Zoltán Böszörményi
Cybertec Geschwinde & Schönig GmbH
http://www.postgresql.at/



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Heikki Linnakangas
Zoltan Boszormenyi wrote:
 Tom Lane wrote:
 It's hard to see how anyone could be doing 6K xacts/sec unless most are
 read-only.
 
 In a recent stress test with our PostgreSQL-based cluster between two
 machines, 3 million transactions were performed with pgbench -c 150 -t 2
 -s 200 in about _ten hours_. The primary machine (a desktop-level machine
 for development) used a real disk, the secondary used tmpfs as PGDATA.
 Say whatever you want about my disk lying about flush, its 75MB/sec
 transfer rate is real. So 5 million real transactions in 24 hours is not
 unrealistic.

6k xacts / s is five *hundred* million transactions, not five million...
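
(The arithmetic, for anyone reaching for a calculator -- just the rate
times seconds per day:)

SELECT 6000 * 60 * 60 * 24 AS xacts_per_day;
-- 518400000, i.e. roughly five hundred million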

-- 
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Bruce Momjian
Zoltan Boszormenyi wrote:
  used a real disk, the secondary used tmpfs as PGDATA. Say whatever you want
  about my disk lying about flush, its 75MB/sec transfer rate is real.
  So 5 million real transactions in 24 hours is not unrealistic.
 
  6k xacts / s is five *hundred* million transactions, not five million...
 
 Blush. :-) You're right. However a single machine with a ramdisk is able
 to do that.

A battery-backed hard disk controller card can do that too.

-- 
  Bruce Momjian  [EMAIL PROTECTED]  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Index Tuple Compression Approach?

2007-08-15 Thread Chris Browne
[EMAIL PROTECTED] (Dawid Kuroczko) writes:
 On 8/14/07, Chris Browne [EMAIL PROTECTED] wrote:
 I recently had a chat with someone who was pretty intimate with Adabas
 for a number of years who's in the process of figuring things out
 about PostgreSQL.  We poked at bits of the respective implementations,
 seeing some similarities and differences.  He pointed out one aspect
 of index handling that could (in principle) be an interesting
 optimization.

 Evidently, in Adabas, index leaf nodes were not simply tuples, but
 lists where the index value would not be repeated.

 In PostgreSQL, if you have the index value 'abc', and there are 10
 tuples with that value, then you'll have a page full of tuples of the
 following form:

 |abc|ptr[rec1]|abc|ptr[rec2]|abc|ptr[rec3]| ...and so forth...

 Now, the Adabas approach was rather different.  It would only have the
 index value once, and then have the list of tuple pointers:

 |abc|ptr[rec1],ptr[rec2],ptr[rec3],...[ptr[rec10]|

 This could allow a fair bit of compression, for cases where the index
 value is not unique.

 Interesting.  Some time ago I played a little with quite a big table
 which contained a path (file path) as its primary key.  It did make sense
 to have a structure (SELECTs were mostly ORDER BY path WHERE path >
 '/foo' LIMIT n).
 The actual index was only a little bit smaller than the table (there were
 maybe 4 or 5 columns there).

 Some time ago I had an idea that it might be possible to compress
 the index size, even if it is a unique index.  Take the path example.
 My idea would be to split the indexed value into 8-byte chunks.
 For example: /var/lib/postgresql/8.2/main would be split into:
   /var/lib
   /postgre
   sql/8.2 -- these would be inserted into a tree as a scaffold,
 and only vacuum should remove them.
   main -- this would be a leaf node.  It could be repeated in non-unique
 indexes.

 [/etc/pas] -- scaffold-node
  |-swd-- leaf node
 [/etc/sha]
  |-dow
 [/var/lib]   -- a problematic mixed scaffold/leaf node.
  [/postgre]
   |-sql
   |-sql/8.2
   [sql/8.2/]
|-main
|-foobar

 The scaffold nodes would be there to guarantee that there is some
 place to attach leaves to.  They should not be removed by DELETE
 (unless we are sure no other node depends on them).

Note that there is a well-understood name for this; it is variously
known as a radix tree or a Patricia trie.

  http://en.wikipedia.org/wiki/Radix_tree

As you observe, the tree/trie edges consist not of individual
characters, but rather of sequences of characters.

 Advantages?  The repeated values (such as /var/lib/postgresql/8.2)
 are not repeated -- they are embedded into the tree as a scaffold, while
 the nodes that are actually significant (files, not directories, in my
 example) are stored as leaves.

Certainly as compared to a traditional trie, this representation leads
to there being a whole lot fewer nodes and a whole lot less branching.

The Radix/Patricia tree compresses things two ways:

 - As you observe, there can be fewer, larger components

 - By combining common prefixes together, this makes cases of
   strings that are highly patterned much, much, much cheaper, as the
   tree branches at (and only at) the places where they tend to
   differ.

It could conceivably make it workable to have indexes on big, highly
repeating things such as blobs of XML.  (Although it *doesn't* get you
the ability to search on substrings, which is probably what you'd also
want...)

 I guess it would reduce the size of large indexes and at the same time it
 could remove the limitation that a B-tree index cannot index values larger
 than 1/3 of a database page.  8-byte chunks were given as an
 example here; perhaps a larger value would be better.

 (Of course redesigning the schema to store directories separately from
 files would be useful, but it would not help with ORDER BY .. LIMIT
 queries -- they would have to be JOIN-ed and re-sorted in memory,
 I'm afraid.)

I'll gleefully ignore the nature of the example, as it's kind of
beside the point.  The point is to try to compress what's in the
column.  If it's being abused somewhat, and has more crud in it, that
gives a potential for a *bigger* win.
-- 
output = reverse(gro.mca @ enworbbc)
http://linuxdatabases.info/info/spiritual.html
Documentation wants to be obsolete. 
-- Bruce R. Lewis

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Robert Treat
On Wednesday 15 August 2007 13:54, Tom Lane wrote:
 Decibel! [EMAIL PROTECTED] writes:
  Aren't there potential issues with keeping the same XID if a transaction
  in the middle has modified data?

 I don't see any, as long as you take a new snapshot.


I'm a little confused: wouldn't the transaction that waits 30 minutes before 
modifying data need to get an XID that jibes with the system when its 
transaction started, not when it began manipulating data?  Would it really 
be safe to take a new snapshot at that time?  ISTM concurrent writers might 
have caused potential issues by that point. 

-- 
Robert Treat
Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Index Tuple Compression Approach?

2007-08-15 Thread Gregory Stark

Heikki Linnakangas [EMAIL PROTECTED] writes:

 That general approach of storing a common part leading part just once is
 called prefix compression. Yeah, it helps a lot on long text fields.
 Tree structures like file paths in particular.

You kind of want to avoid both the prefix and the suffix, no? 

 It's been discussed before. One big problem is extracting the common
 leading part. You could only do it for text, 

Or for multi-column indexes

I could see this being especially useful if you have some columns in the index
key which are small and some that are quite large. So if you have an event
table with an index on (userid, timestamp) you wouldn't have to store lots of
timestamps in the upper-level tree nodes. You would only store them for the
leaf nodes.

 but it should be done in a datatype neutral way.

I wonder if there's an analogous operation for other data types though.
Numeric could store a value relative to the parent value. Arrays could
store only the elements needed. bytea of course works just as well as text (or
better in the face of i18n).

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Tom Lane
Robert Treat [EMAIL PROTECTED] writes:
 I'm a little confused: wouldn't the transaction that waits 30 minutes before 
 modifying data need to get an XID that jibes with the system when its 
 transaction started, not when it began manipulating data?

Why?

 Would it really be safe to take a new snapshot at that time,

You wouldn't take a new snapshot.  The thought that occurs to me is that
there's no reason that a transaction has to have an XID for itself
before it takes a snapshot.  We always special-case our own XID anyway.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] tsearch2 in PostgreSQL 8.3?

2007-08-15 Thread Ron Mayer
Magnus Hagander wrote:
 I don't use the functional index part, but for new users I can see how
 that's certainly a *lot* easier. 

Can someone with modern hardware test whether it's
still quite a bit slower than the extra column?  I tried it
two years ago and found the functional index
to be quite a bit slower:
http://archives.postgresql.org/pgsql-general/2005-10/msg00475.php
but it'd be interesting to see if faster CPUs have changed this.
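
For anyone repeating the test, the two setups being compared look roughly
like this (a sketch in the 8.3 integrated-tsearch syntax; docs, body, id,
and the index names are invented):

-- Extra-column approach: materialize the tsvector and index it
-- (kept current with a trigger in a real setup).
ALTER TABLE docs ADD COLUMN body_tsv tsvector;
UPDATE docs SET body_tsv = to_tsvector('english', body);
CREATE INDEX docs_tsv_idx ON docs USING gin (body_tsv);
SELECT id FROM docs WHERE body_tsv @@ to_tsquery('english', 'foo');

-- Functional-index approach: nothing extra to maintain, but the
-- expression is recomputed whenever the index can't be used.
CREATE INDEX docs_fti_idx ON docs USING gin (to_tsvector('english', body));
SELECT id FROM docs
 WHERE to_tsvector('english', body) @@ to_tsquery('english', 'foo');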

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Kevin Grittner
 On Wed, Aug 15, 2007 at  5:06 PM, in message [EMAIL PROTECTED],
Tom Lane [EMAIL PROTECTED] wrote: 
 Robert Treat [EMAIL PROTECTED] writes:
 I'm a little confused, wouldnt the transaction that waits 30 minutes before 
 modifying data need to get an XID that jives with the system when it's 
 transaction started, not when it began manipulating data?
 
 Why?
 
 Would it really be safe to take a new snapshot at that time,
 
 You wouldn't take a new snapshot.  The thought that occurs to me is that
 there's no reason that a transaction has to have an XID for itself
 before it takes a snapshot.  We always special-case our own XID anyway.

I'm having trouble picturing how that would work with a transaction using
the SERIALIZABLE transaction isolation level, or would this just be done at
the READ COMMITTED level?

-Kevin



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] XID wraparound and busy databases

2007-08-15 Thread Tom Lane
Kevin Grittner [EMAIL PROTECTED] writes:
 Tom Lane [EMAIL PROTECTED] wrote:
 You wouldn't take a new snapshot.  The thought that occurs to me is that
 there's no reason that a transaction has to have an XID for itself
 before it takes a snapshot.  We always special-case our own XID anyway.

 I'm having trouble picturing how that would work with a transaction using
 the SERIALIZABLE transaction isolation level,

Why?  You take a snapshot, you use it.  If you later need to allocate an
XID for yourself, you do that --- your own XID is surely uncommitted in
any case, so this doesn't affect the validity of the snapshot.

The bottom line here is that we need own-XID-if-any to be >= snapshot
xmin, but there's no obvious reason why it has to be < snapshot xmax.
This is, in fact, *not* the case for subtransaction XIDs, and I see
no fundamental reason why it need be true for the top transaction XID.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Another idea for index-only scans

2007-08-15 Thread Decibel!

On Aug 15, 2007, at 1:54 PM, Gregory Stark wrote:
  A third idea would be for a heap scan to check if all rows are visible
  and if so set a per-table flag which can be checked by index scans.
  Any change to the table would have to clear the flag.  To detect
  changes during the heap scan a counter could be set at the start and
  checked at the end --- if it is the same, the table has not been
  modified --- any table change would increment the counter.

 I think I would prefer to address this in the same infrastructure as the
 dead-space-map. That way you're not dependent on having no updates happening
 on the table at all. Any tuples on pages which contain no in-doubt tuples
 could have their visibility check skipped, but when you come across a tuple
 on a page which has been modified since the last vacuum, you have to check
 the visibility.


The advantage to Bruce's idea is that it sounds pretty simple to
implement. While it wouldn't be of use for many general cases, it
*would* be useful for read-only tables, i.e. old partitions.
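
As a sketch of the target case, using plain inheritance partitioning
(table and column names invented):

CREATE TABLE events (id int, ts timestamptz);

-- An old, effectively read-only partition: once 2006 is over, nothing
-- writes to it again, so a "whole table visible" flag would stay set
-- and index scans on it could skip the heap visibility checks entirely.
CREATE TABLE events_2006 (
  CHECK (ts >= '2006-01-01' AND ts < '2007-01-01')
) INHERITS (events);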

--
Decibel!, aka Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq