[HACKERS] Vacuum dead tuples that are between transactions

2006-02-27 Thread Paul Tillotson
The topic of improving vacuum for use in heavy-update environments seems 
to come up frequently on the list.  Has anyone weighed the costs of 
allowing VACUUM to reclaim tuples that are not older than the oldest 
running transaction but are nonetheless invisible to all running 
transactions?  It seems that it's not that hard.


Currently, a tuple is not eligible to be reclaimed by vacuum unless it 
was deleted by a transaction that committed before the oldest currently 
running transaction started (i.e., its xmax is known to have committed 
before the oldest currently running xid began).  Right?


However, it seems that under certain scenarios (heavy updates to small 
tables while a long-running transaction is in progress) there can be a 
lot of tuples that are invisible to all transactions but still not able 
to be vacuumed under the current method.  Example: updating a single row 
over and over again while pg_dump is running, as sketched below.
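
For concreteness, here is a minimal two-session sketch of that scenario 
(table and column names are made up; the exact VACUUM VERBOSE wording 
varies by version):

    -- session 1: a long-running serializable transaction, standing in for pg_dump
    BEGIN ISOLATION LEVEL SERIALIZABLE;
    SELECT count(*) FROM some_big_table;        -- take a snapshot, then sit idle

    -- session 2: meanwhile, churn a single row...
    UPDATE counters SET n = n + 1 WHERE id = 1; -- repeated thousands of times

    -- ...then vacuum.  The thousands of intermediate row versions are invisible
    -- to session 1's snapshot (created and deleted after it was taken), yet:
    VACUUM VERBOSE counters;                    -- reports them as dead but not
                                                -- yet removable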


Suppose that in the system, we have a serializable transaction with xid 
1000 and a read committed transaction with xid 1001.  Other than these 
two, the oldest running xid is 2000.


Suppose we consider a tuple with xmin 1200 and xmax 1201.  We will 
assume that xid 1201 committed before xid 2000 began to run.


So:

(A) This tuple is invisible to the serializable transaction, since its 
snapshot can't ever advance.


(B) The read committed transaction might be able to see it.  However, if 
its current command started AFTER xid 1201 committed, it can't. 

Unless I'm missing something, it seems that when vacuuming you can leave 
serializable transactions (like pg_dump) out of the calculation of the 
oldest running transaction, so long as you keep a list of them and 
check each tuple T against each serializable transaction X to make sure 
that either T's xmin is greater than X's xid, or T's xmax committed 
before X started to run.  Of course this is a lot of extra work, but it 
should mitigate the effect of long-running serializable transactions 
until such time as processor power becomes your limiting factor.


The read committed ones are a more difficult matter, but I think you can 
treat a tuple as dead if it was inserted after the read committed 
transaction started to run AND deleted before that transaction's 
currently running command started to run.  I suppose the major 
difficulty here is that currently a backend has no way of knowing when 
another backend's current command started to run?  (A toy sketch of both 
rules follows.)
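
To make the two rules concrete, here is a toy encoding of them as SQL 
predicates, runnable on a current release.  All the names are mine, xids 
are passed as plain integers, and the "committed before" facts arrive as 
precomputed booleans, which glosses over the hard parts (clog lookups, 
and knowing when another backend's command started):

    -- Rule 1: tuple T is safely dead with respect to serializable transaction X
    -- if T was inserted after X took its snapshot, or deleted before X started.
    CREATE FUNCTION dead_wrt_serializable(
        t_xmin bigint, t_xmax_committed_before_x boolean, x_xid bigint
    ) RETURNS boolean LANGUAGE sql AS $$
        SELECT t_xmin > x_xid OR t_xmax_committed_before_x;
    $$;

    -- Rule 2: tuple T is safely dead with respect to read committed transaction Y
    -- if T was inserted after Y started AND deleted before Y's current command began.
    CREATE FUNCTION dead_wrt_read_committed(
        t_inserted_after_y_started boolean, t_deleted_before_current_command boolean
    ) RETURNS boolean LANGUAGE sql AS $$
        SELECT t_inserted_after_y_started AND t_deleted_before_current_command;
    $$;

    -- The example above: tuple (xmin 1200, xmax 1201), serializable X with xid 1000.
    SELECT dead_wrt_serializable(1200, false, 1000);   -- true: never visible to X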


Is this too difficult to do or is it a good idea that no one has enough 
'round tuits for?


Regards,

Paul Tillotson

---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] t_ctid chains

2005-08-18 Thread Paul Tillotson
(Prompted by the nearby thread about VACUUM FULL bugs, but not, properly 
speaking, having anything to do with that.)


Hackers,

For some time, I have wondered: what does postgres use t_ctid chains 
for?  It seems that they are useful for finding the newer version of a 
tuple.  However, wouldn't that newer version eventually get found 
anyway?  A sequential scan scans the whole table, and so it will find 
the new tuple.  Since indexes contain entries for all tuples, so will an 
index scan.

I infer that there must be some sort of optimization that makes it worth 
(a) using extra space in the disk pages and (b) incurring the extra 
complexity, such as the bugs mentioned in the VACUUM FULL thread.


So: what are the t_ctid chains good for?  If this is too long or too 
elementary to explain, can someone point me to the source code that uses 
t_ctid chains?
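
(For reference, the start of such a chain is easy to observe from SQL, 
since ctid is a queryable system column; the exact block/offset pairs 
will vary:)

    CREATE TEMP TABLE t (v int);
    INSERT INTO t VALUES (1);
    SELECT ctid, v FROM t;   -- e.g. (0,1)
    UPDATE t SET v = 2;
    SELECT ctid, v FROM t;   -- e.g. (0,2); the dead tuple left at (0,1) now has
                             -- its t_ctid pointing at (0,2), forming the chain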


Regards,
Paul Tillotson

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [COMMITTERS] pgsql: Fix NUMERIC modulus to properly

2005-06-06 Thread Paul Tillotson

Bruce Momjian wrote:

> Tom Lane wrote:
>> Christopher Kings-Lynne [EMAIL PROTECTED] writes:
>>>> No, I don't think so.  It doesn't seem to be something that enough
>>>> people use to risk the change in behavior --- it might break something
>>>> that was working.  But, if folks want it backported we can do it.  It is
>>>> only a change to properly do modulus for numeric.
>>>
>>> Well, from my point of view it's an absolute mathematical error - i'd
>>> backport it.  I can't see anyone relying on it :)
>>
>> Doesn't this patch break the basic theorem that
>>
>>     a = trunc(a / b) * b + (a mod b)
>>
>> ?  If division rounds and mod doesn't, you've got pretty serious issues.
>
> Well, this is a good question.  In the equation above we assume '/' is
> an integer division.  The problem with NUMERIC when used with zero-scale
> operands is that the result is already _rounded_ to the nearest whole
> number before it gets to trunc(), and that is why we used to get
> negative modulus values.  I assume the big point is that we don't offer
> any way for users to get a NUMERIC division without rounding.
>
> With integers, we always round down to the nearest whole number on
> division;  float doesn't offer a modulus operator, and C doesn't support
> it either.
>
> We round NUMERICs to the specified scale because we want to give the most
> accurate value:
>
> test=> select 100000000000000000000000::numeric(24,0) / 11::numeric(24,0);
>         ?column?
> ------------------------
>  9090909090909090909091
>
> The actual value is:
>
>  9090909090909090909090.90...
>
> But the problem is that the equation at the top assumes the division is
> not rounded.  Should we supply a NUMERIC division operator that doesn't
> round?  integer doesn't need it, and float doesn't have the accurate
> precision needed for modulus operators.  The user could supply some
> digits in the division:
>
> test=> select 100000000000000000000000::numeric(30,6) / 11::numeric(24,0);
>            ?column?
> --------------------------------
>  9090909090909090909090.909091
> (1 row)
>
> but there really is no _right_ value to prevent rounding (think
> 0.999...).  A non-rounding NUMERIC division would require duplicating
> numeric_div() but with a false for 'round', and adding operators.

I would prefer that division didn't round, as with integers.  You can 
always calculate your result to one more decimal place and then round it 
yourself, but there is no way to unround a rounded result.
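
To see Tom's identity and the rounding hazard side by side, here is a 
quick check that runs on current servers (div() and mod() are the 
truncating numeric operations there; they postdate the release being 
discussed):

    -- Truncating division preserves the identity  a = trunc(a / b) * b + (a mod b):
    SELECT div(1000::numeric, 11) * 11 + mod(1000::numeric, 11);   -- 1000

    -- A rounded quotient breaks it: 1000 / 11 = 90.909... rounds to 91, so the
    -- remainder the identity would imply comes out negative:
    SELECT 1000::numeric - round(1000::numeric / 11, 0) * 11;      -- -1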


Tom had asked whether PG would pass the regression tests if we changed 
the round_var() at the end of the function div_var() to a trunc_var().


It does not pass, but I think that is because the regression test is 
expecting that division will round up.  (Curiously, the regression test 
for numeric passes, but the regression test for aggregation--sum() I 
think--is the one that fails.)  I have attached the diffs here if anyone 
is interested.


Regards,
Paul Tillotson

*** ./expected/aggregates.out   Sun May 29 19:58:43 2005
--- ./results/aggregates.outMon Jun  6 21:01:11 2005
***
*** 10,16 
  SELECT avg(a) AS avg_32 FROM aggtest WHERE a < 100;
 avg_32
  -
!  32.6667
  (1 row)
  
  -- In 7.1, avg(float4) is computed using float8 arithmetic.
--- 10,16 
  SELECT avg(a) AS avg_32 FROM aggtest WHERE a < 100;
 avg_32
  -
!  32.
  (1 row)
  
  -- In 7.1, avg(float4) is computed using float8 arithmetic.

==

test boolean  ... ok
test char ... ok
test name ... ok
test varchar  ... ok
test text ... ok
test int2 ... ok
test int4 ... ok
test int8 ... ok
test oid  ... ok
test float4   ... ok
test float8   ... ok
test bit  ... ok
test numeric  ... ok
test strings  ... ok
test numerology   ... ok
test point... ok
test lseg ... ok
test box  ... ok
test path ... ok
test polygon  ... ok
test circle   ... ok
test date ... ok
test time ... ok
test timetz   ... ok
test timestamp... ok
test timestamptz  ... ok
test interval ... ok
test abstime  ... ok
test reltime  ... ok
test tinterval... ok
test inet ... ok
test comments ... ok
test oidjoins ... ok
test type_sanity  ... ok
test opr_sanity   ... ok
test geometry ... ok
test horology ... ok
test insert   ... ok
test create_function_1... ok
test create_type  ... ok
test create_table ... ok
test create_function_2

[HACKERS] tool for incrementally shrinking bloated tables

2004-12-20 Thread Paul Tillotson
Goal: on a production server, to gradually shrink a table (no matter how 
large) back to < 10% free space without noticeably interrupting write 
access to it.  (Noticeably = without taking any exclusive locks for 
more than a few seconds at a time.)

I am thinking about building this if it proves to be not too difficult.
To accomplish it, tuples need to be moved into free space near the 
beginning of the table, and then the table must be shrunk using ftruncate().

It seems that I could make these two changes:

(a) Modify the VACUUM command to take an option that means "pack the 
free space map with the pages that occur earliest in the table rather 
than the pages with the most free space."

(b) Create a command that will take an exclusive lock, scan the table 
backwards until it comes to a tuple that cannot be removed (i.e., a 
tuple that is not HEAPTUPLE_DEAD; see scan_heap() in 
src/backend/commands/vacuum.c) or until some preset amount of time has 
elapsed, and then ftruncate() the table.

To use this system one would do this:

(1) VACUUM KEEP_EARLY_FREE_PAGES mybloatedtable;  -- use item (a) discussed above
(2) UPDATE mybloatedtable SET foo = foo WHERE ctid >= '(n, 0)';  -- move tuples at the end of the table to the front
(3) SHRINK TABLE mybloatedtable;  -- use item (b) discussed above

Then repeat as many times as necessary to accomplish the desired 
shrinking, as in the sketch below.
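
For comparison, one pass of a cruder variant is already possible with 
stock commands, leaning on plain VACUUM's opportunistic truncation of 
empty pages at the end of the table (table, column, and block number 
here are hypothetical, and the ctid inequality needs a reasonably recent 
server):

    VACUUM mybloatedtable;                 -- record the free space in early pages
    UPDATE mybloatedtable SET foo = foo
        WHERE ctid >= '(1000,0)';          -- rewrite tuples past block 1000, so the
                                           -- new versions land nearer the front
    VACUUM mybloatedtable;                 -- remove the dead tail tuples and truncate
                                           -- the now-empty trailing pages if possible

The difference is that without item (a) there is no guarantee the 
rewritten tuples actually land in early pages, and without item (b) the 
final truncation only happens when VACUUM can get the lock 
opportunistically.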
In defense of the need for this tool: although bloat is usually 
preventable by proper vacuuming and FSM configuration, I often see 
people on the list saying that they have a huge multi-gigabyte table 
that is using up all their drive space, but they cannot afford the 
interruption that VACUUM FULL would entail.  Also, certain maintenance 
operations (e.g., adding a column and populating it within a 
transaction) can double the on-disk size of a table, not to mention user 
errors such as running an unconstrained UPDATE command inside a 
transaction and then rolling it back.

Comments?  Am I missing some obvious way of accomplishing this goal?  Is 
anyone working on something like this?

Paul Tillotson
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] pg_locks view and user locks

2004-09-13 Thread Paul Tillotson
This is the first I have ever heard of user locks, but I have more than 
once wanted a lock that would persist beyond the end of a transaction.  
Do these do that?

Paul
Merlin Moncure [EMAIL PROTECTED] writes:
> ... is there any merit to promoting the user lock wrappers out of contrib

Dunno.  Yours is the first message I can recall in quite a long time
indicating that anyone was using userlocks.  I thought the code was kind
of dying on the vine.  Of course it's hard to tell whether that's not
just *because* it's in contrib and not mainstream.  But personally I'd
like to see some more evidence of use before we promote it.  (And yeah,
the API could probably use some cleanup first.)

regards, tom lane


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] [pgsql-advocacy] What can we learn from MySQL?

2004-04-27 Thread Paul Tillotson
On the other topics...

> I think the biggest service PGSQL could provide to the open source
> community is a resource that teaches people with no database experience
> the fundamentals of databases. If people had an understanding of what a
> RDBMS should be capable of and how it should be used, they wouldn't pick
> MySQL.

I think that this is incredibly important.  Many, many developers 
choose MySQL because MySQL really does make the effort in this 
regard.  This strategy has helped both MySQL and Red Hat become the 
commercial successes they are today.
I believe that postgres is making an effort here.  I learned SQL from 
the postgres docs found in the first few chapters here:

http://www.postgresql.org/docs/7.4/static/tutorial.html

Those, in my opinion, are excellent, and were far more informative to me 
than anything on the MySQL website (I tried reading there first).  Maybe 
we are aiming for users with a clue quotient much lower than mine, but 
those chapters strike an excellent balance between too short and simple 
to be useful and too long and complicated.

Paul Tillotson
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Usability, MySQL, Postgresql.org, gborg, contrib,

2004-04-27 Thread Paul Tillotson
Joshua D. Drake wrote:

>> I know both. :-).
>> Seriously - I'd like to raise my voice in favor of installing plpgsql
>> in template1 by default. I haven't heard any good reason not to (nor
>> even a bad reason).
>
> If we install plPGSQL by default, we should install any other pl
> language that was configured at runtime by default as well. This
> includes plPerl, plTCL, and plPython.
>
> Of course only if they were compiled in, but since they are a part of
> the core distribution we shouldn't favor one over the other.
>
> Personally, plpgSQL is only useful to those who are coming from Oracle.
> People are more likely to be comfortable with plPython or plPerl than
> plpgSQL.
I disagree.  I know Python and Perl and I have never touched Oracle, but 
I find plpgsql to be more useful than the others, mainly because:

(1) It is strongly typed, has all the same types as postgresql itself, 
consistent support for NULLs, etc.

(2) Upon a casual read of the docs I couldn't figure out how to read any 
values from the database other than what was passed into the 
pl{perl|python|php} function.  I.e., how do you run a select?  What 
client libraries do you use?  Maybe the others just need better docs.
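
For what it's worth, point (2) is a non-issue in plpgsql, where a query 
is just embedded SQL.  A minimal sketch (table and column names are made 
up, and the dollar quoting needs a reasonably recent server):

    CREATE FUNCTION account_balance(acct integer) RETURNS numeric AS $$
    DECLARE
        bal numeric;   -- the same numeric type the table itself uses
    BEGIN
        SELECT balance INTO bal FROM accounts WHERE id = acct;
        RETURN bal;    -- NULL falls out naturally if no row matched
    END;
    $$ LANGUAGE plpgsql;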

Paul Tillotson
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


[HACKERS] where can I get the HTML docs

2004-03-23 Thread Paul Tillotson
Can anyone show me where to download a zipped tarball of the .html files
published at the following link?
http://www.postgresql.org/docs/7.4/static/index.html

Thanks.

Paul



---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Avoid MVCC using exclusive lock possible?

2004-03-01 Thread Paul Tillotson
I use this type of approach when mirroring data from a FoxPro database 
(yuck) to a read-only postgres database.  It is quicker and cleaner than 
deleting all of the rows and inserting them again (TRUNCATE is not 
transaction safe, and I need transaction safety).

However, for this to be useful, your table must not have any indexes, 
views, foreign keys, sequences, triggers, etc., or else you must be 
prepared to re-create all of them using application-level code.

I imagine this would break lots of things, but it would be nice if 
instead of Shridhar's rename step (see below) one could do this:

$table1node = query("SELECT relfilenode FROM pg_class WHERE relname = 
'$old_table';");
$table2node = query("SELECT relfilenode FROM pg_class WHERE relname = 
'$new_table';");
exec("UPDATE pg_class SET relfilenode = $table2node WHERE relname = 
'$old_table';");
exec("UPDATE pg_class SET relfilenode = $table1node WHERE relname = 
'$new_table';");

You would of course need to change the relfilenode for all of the 
toasted columns and indexes as well in the same atomic step, but it 
seems like this might be more compatible with postgresql's MVCC model 
than other ideas suggested.

Regards,
Paul Tillotson
Shridhar Daithankar wrote:

I am sure people have answered the approach you have suggested, so let me 
suggest a workaround for your problem.

You could run the following in a transaction (spelled out in SQL further 
below):

- begin
- create another table with exactly the same structure
- write a procedure that reads from the input table and transforms the 
values in between
- drop the original table
- rename the new table to the old one
- commit
- analyze the new table

Except for the increased disk space, this approach has all the good things 
postgresql offers.  Especially with transaction-safe DDL it is a huge 
benefit.  You certainly do save on vacuum.

If the entire table is updated then you can almost certainly get things done 
faster this way.
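
Spelled out in SQL (names hypothetical; CREATE TABLE ... LIKE needs a 
reasonably recent server), the recipe is roughly:

    BEGIN;
    CREATE TABLE mytable_new (LIKE mytable INCLUDING DEFAULTS);
    INSERT INTO mytable_new
        SELECT /* transformed columns */ * FROM mytable;
    DROP TABLE mytable;
    ALTER TABLE mytable_new RENAME TO mytable;
    COMMIT;
    ANALYZE mytable;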

HTH

Shridhar




---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])