Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Mark Kirkwood

Tom Lane wrote:

Mark Kirkwood [EMAIL PROTECTED] writes:
Interesting that the slice-8 algorithm seems to work noticeably better 
on FreeBSD than Linux -


Are you running similar gcc versions on both?  I realize I forgot to
document what I was using:

HPPA: gcc version 2.95.3 20010315 (release)
PPC: gcc version 4.0.1 (Apple Computer, Inc. build 5363)
both Intels: gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)

Interesting that it's only the oldest-line tech that's showing a win
for me ...




Ah - good point, FreeBSD is using an older compiler:

FreeBSD: gcc (GCC) 3.4.6 [FreeBSD] 20060305
Linux: gcc (GCC) 4.1.1 (Gentoo 4.1.1)

Hmm - there is a FreeBSD port for 4.1.2, I might set that off to build 
itself and see if compiling with it changes the results any


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Simon Riggs
On Sun, 2006-10-22 at 17:18 +0200, Martijn van Oosterhout wrote:
 On Sun, Oct 22, 2006 at 08:10:56PM +0530, Gurjeet Singh wrote:
  Hi all,
  
 Michael Kounavis has given a green signal (please refer to the forwarded
  message).
 
 I don't think that helps. The publishing date of this article was March
 2006. If this is really the first time this algorithm was published, that
 means that anyone else (or even the author) has the option of patenting
 this algorithm sometime before March 2007 and still claiming they
 invented it first.

Slice-By-8 was first mentioned here:

M. E. Kounavis and F. L. Berry, "A Systematic Approach to Building High
Performance Software-based CRC Generators", Proceedings, Tenth IEEE
International Symposium on Computers and Communications (ISCC 2005), La
Manga Del Mar Menor, Spain, June, 2005.

so presumably the opportunity to patent has already passed, given what
you say.

 And realistically we would wait at least a year or three after that,
 because you don't get to see patents as they're applied for.
 
 Maybe March 2010 we can look into it...

I think that's a tad overcooked. We have to balance caution with
boldness. Simply copying stuff from research can be dangerous, but I
think this is about as safe as it gets.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Anders Steinlein

On Mon, 23 Oct 2006 07:35:08 +0200, Tom Lane [EMAIL PROTECTED] wrote:


I realize I forgot to document what I was using:

HPPA: gcc version 2.95.3 20010315 (release)
PPC: gcc version 4.0.1 (Apple Computer, Inc. build 5363)
both Intels: gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)

Interesting that it's only the oldest-line tech that's showing a win
for me ...


Does anyone have an Intel compiler available? It would be interesting to
see results on that compared to gcc (given that it is an Intel algorithm
;).


\Anders

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Heikki Linnakangas

Tom Lane wrote:


Are you running similar gcc versions on both?  I realize I forgot to
document what I was using:

HPPA: gcc version 2.95.3 20010315 (release)
PPC: gcc version 4.0.1 (Apple Computer, Inc. build 5363)
both Intels: gcc version 4.1.1 20060525 (Red Hat 4.1.1-1)

Interesting that it's only the oldest-line tech that's showing a win
for me ...


It would be interesting to see if the Intel compiler works better for 
this algorithm. Anyone have it installed?


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Jeremy Drake
On Mon, 23 Oct 2006, Mark Kirkwood wrote:

 Tom Lane wrote:
 
 
  Yah, I checked.  Several times... but if anyone else wants to repeat
  the experiment, please do.  Or look for bugs in either my test case
  or Gurjeet's.



Just for fun, I tried it out with both GCC and with Intel's C compiler
with some aggressive platform-specific flags on my 2.8Ghz Xeon running
Gentoo.

                Std crc         Slice-8 crc

Intel P4 Xeon 2.8Ghz (Gentoo, gcc-3.4.5, -O2)

8192 bytes      4.697572        9.806341
1024 bytes      0.597429        1.181828
64 bytes        0.046636        0.086984

Intel P4 Xeon 2.8Ghz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)

8192 bytes      0.04            0.001085
1024 bytes      0.04            0.001292
64 bytes        0.03            0.001078


So at this point I realize that intel's compiler is optimizing the loop
away, at least for the std crc and probably for both.  So I make mycrc an
array of 2, and subscript mycrc[j & 1] in the loop.

                Std crc         Slice-8 crc

Intel P4 Xeon 2.8Ghz (Gentoo, gcc-3.4.5, -O2)

8192 bytes      51.397146       9.523182
1024 bytes      6.430986        1.229043
64 bytes        0.400062        0.128579

Intel P4 Xeon 2.8Ghz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)

8192 bytes      29.881708       0.001432
1024 bytes      3.750313        0.001432
64 bytes        0.238583        0.001431

So it looks like something fishy is still going on with the slice-8 with
the intel compiler.

I have attached my changed testcrc.c file.


 FWIW - FreeBSD and Linux results using Tom's test program on almost identical
 hardware[1]:

                 Std crc         Slice-8 crc

 Intel P-III 1.26Ghz (FreeBSD 6.2)

 8192 bytes      12.975314       14.503810
 1024 bytes      1.633557        1.852322
 64 bytes        0.111580        0.206975


 Intel P-III 1.26Ghz (Gentoo 2006.1)


 8192 bytes      12.967997       28.363876
 1024 bytes      1.632317        3.626230
 64 bytes        0.111513        0.326557


 Interesting that the slice-8 algorithm seems to work noticeably better on
 FreeBSD than Linux - but still not as well as the standard one (for these
 tests anyway)...


 Cheers

 Mark

 [1] Both boxes have identical mobos, memory and CPUs (same sspec nos).





-- 
You can tune a piano, but you can't tuna fish.

#include "postgres.h"

#include <time.h>
#include <sys/time.h>

#include "pg_crc.h"

int
main()
{
    char        buffer[TESTSIZE];
    pg_crc32    mycrc[2];
    int         j;
    struct timeval tstart;
    struct timeval tstop;

    srand(time(NULL));
    for (j = 0; j < TESTSIZE; ++j)
        buffer[j] = (char) (255 * (rand() / (RAND_MAX + 1.0)));

    gettimeofday(&tstart, NULL);

    for (j = 0; j < NTESTS; j++)
    {
        /* Alternate between two CRC slots so the compiler can't treat
         * every iteration as overwriting the same dead variable. */
        INIT_CRC32(mycrc[j & 1]);
        COMP_CRC32(mycrc[j & 1], buffer, TESTSIZE);
        FIN_CRC32(mycrc[j & 1]);
    }

    gettimeofday(&tstop, NULL);

    if (tstop.tv_usec < tstart.tv_usec)
    {
        tstop.tv_sec--;
        tstop.tv_usec += 1000000;
    }

    printf("bufsize = %d, loops = %d, elapsed = %ld.%06ld\n",
           TESTSIZE, NTESTS,
           (long) (tstop.tv_sec - tstart.tv_sec),
           (long) (tstop.tv_usec - tstart.tv_usec));

    return 0;
}

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [SPAM?] Re: [HACKERS] Asynchronous I/O Support

2006-10-23 Thread Zeugswetter Andreas ADI SD

  So far I've seen no evidence that async I/O would help us, only a
  lot of wishful thinking.
 
 is this thread moot?  while researching this thread I came across this
 article: http://kerneltrap.org/node/6642 describing claims of 30%
 performance boost when using posix_fadvise to ask the o/s to prefetch
 data.  istm that this kind of improvement is in line with what aio can
 provide, and posix_fadvise is cleaner, not requiring threads and such.
 
 Hmm, my man page says:
 
    POSIX_FADV_WILLNEED and POSIX_FADV_NOREUSE both initiate a
    non-blocking read of the specified region into the page cache.
    The amount of data read may be decreased by the kernel depending
    on VM load. (A few megabytes will usually be fully satisfied,
    and more is rarely useful.)
 
 This appears to be exactly what we want, no? It would be nice
 to get some idea of what systems support this.

POSIX_FADV_WILLNEED definitely sounds very interesting, but:

I think this interface was intended to hint larger areas (megabytes).
The wishful thinking, though, was not to hint seq scans, but to advise
single 8k pages. The OS is responsible for sequential readahead, but it
cannot anticipate the random access that results from btree access
(unless of course we are talking about very small tables).

I also doubt that with this interface many OSes will actually forward
multiple I/Os to the disk subsystem in parallel, which would be what is
needed. And the comment Bruce quoted does not sound encouraging :-(
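
For concreteness, here is a minimal sketch of what hinting a single 8k
page could look like (BLCKSZ and prefetch_block are illustrative
assumptions made up here, not an actual patch):

#define _XOPEN_SOURCE 600       /* for posix_fadvise on glibc */
#include <fcntl.h>
#include <stdio.h>

#define BLCKSZ 8192

/* Hint one 8 KB block before a random heap fetch. */
static void
prefetch_block(int fd, long blockno)
{
    int rc = posix_fadvise(fd, (off_t) blockno * BLCKSZ, BLCKSZ,
                           POSIX_FADV_WILLNEED);

    if (rc != 0)                /* returns an errno value, not -1 */
        fprintf(stderr, "posix_fadvise failed: %d\n", rc);
}

Whether the kernel then actually issues those reads in parallel is
exactly the open question above.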

Andreas

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [SPAM?] Re: [HACKERS] Asynchronous I/O Support

2006-10-23 Thread Zeugswetter Andreas ADI SD

  So far I've seen no evidence that async I/O would help us, only a
  lot of wishful thinking.
 
 is this thread moot?  while researching this thread I came across this
 article: http://kerneltrap.org/node/6642 describing claims of
 30% performance boost when using posix_fadvise to ask the o/s
 to prefetch data.  istm that this kind of improvement is in
 line with what aio can provide, and posix_fadvise is cleaner,
 not requiring threads and such.

This again is about better OS readahead for sequential access, where
standard Linux obviously behaves differently. It is not about random
access.

Btw, I do understand the opinion of the Linux developers that pg should
actually read larger blocks for seq scans. Under high disk load, OSes
tend not to do all the needed readahead, which has pros and cons, but
mainly cons for pg.

Andreas

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [SPAM?] Re: [HACKERS] Asynchronous I/O Support

2006-10-23 Thread Zeugswetter Andreas ADI SD

  Yup, that would be the scenario where it helps (provided that you
  have a smart disk or a disk array and an intelligent OS aio
  implementation). It would be used to fetch the data pages pointed at
  from an index leaf, or the next level index pages.
  We measured the IO bandwidth difference on Windows with EMC as being
  nearly proportional to parallel outstanding requests up to at least
 
 Measured it using what? I was under the impression only one
 proof-of-implementation existed, and that the scenarios and
 configuration of the person who wrote it did not show
 significant improvement.

IIRC the configuration of that test was not suitable to show any
benefit. Minimum requirements to show improvement are:
- very few active sessions (typically fewer than the number of disks)
- a table that spans multiple disks (typically on a stripe set)
  (or one intelligent scsi disk)
- only random disk access plans
 
 You have PostgreSQL on Windows with EMC with async I/O 
 support to test with?

No, sorry. Was a MaxDB issue.

Andreas

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] bug or feature, || -operator and NULLs

2006-10-23 Thread Albe Laurenz
Mario Weilguni wrote:
 This has been discussed before, but Oracle behaves differently, and
 IMHO in a more correct way.

 The following query returns NULL in PG:
 SELECT NULL || 'fisk';

 But in Oracle, it returns 'fisk':
 SELECT NULL || 'fisk' FROM DUAL;

 The latter seems more logical...
 
 I've worked a lot with Oracle a few years ago and I agree, the feature
 is handy and sometimes makes life easier, but it's simply wrong. I
 heard a while ago that newer Oracle versions changed this to the SQL
 standard; is this true?

Unfortunately not; in Oracle's current version (10.2.0.2.0)
it is still that way.

I think that this Oracle 'feature' is almost as terrible as the
fact that they treat '' as NULL, which is (as has been pointed
out) most likely the reason for treating NULL as '' in ||.

Yours,
Laurenz Albe

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Gregory Stark

Simon Riggs [EMAIL PROTECTED] writes:

 Slice-By-8 was first mentioned here:

Are you sure? 

US patent 7,047,479 filed in 2002 sounds like it may be relevant: 

http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=7047479.PN.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Mark Kirkwood

Mark Kirkwood wrote:

Tom Lane wrote:



Are you running similar gcc versions on both?  I realize I forgot to
document what I was using:



Ah - good point, FreeBSD is using an older compiler:

FreeBSD: gcc (GCC) 3.4.6 [FreeBSD] 20060305
Linux: gcc (GCC) 4.1.1 (Gentoo 4.1.1)

Hmm - there is a FreeBSD port for 4.1.2, I might set that off to build 
itself and see if compiling with it changes the results any




Here are the results after building gcc 4.1.2 (repeating the results for
gcc 3.4.6 for comparison). I suspect that performance is probably
impacted because gcc 4.1.2 (and also the rest of the tool-chain) is
built with gcc 3.4.6 - but it certainly suggests that the newer gcc
versions don't like the slice-8 algorithm for some reason.


                Std crc         Slice-8 crc

Intel P-III 1.26Ghz (FreeBSD 6.2 gcc 3.4.6)

8192 bytes      12.975314       14.503810
1024 bytes      1.633557        1.852322
64 bytes        0.111580        0.206975

Intel P-III 1.26Ghz (FreeBSD 6.2 gcc 4.1.2)

8192 bytes      19.516974       29.457739
1024 bytes      2.448570        3.742106
64 bytes        0.112644        0.335292

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Simon Riggs
On Mon, 2006-10-23 at 05:22 -0400, Gregory Stark wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
 
  Slice-By-8 was first mentioned here:
 
 Are you sure? 
 
 US patent 7,047,479 filed in 2002 sounds like it may be relevant: 
 
 http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&r=1&f=G&l=50&s1=7047479.PN.

Yes, I'm sure.

That patent is titled "Parallel CRC formulation" and the abstract
describes the use of two threads to calculate different parts of the
CRC. It's designed for parallel circuit designs in hardware.

That doesn't resemble SB8 at all - there's still only one thread. The
cleverness of SB8 is to calculate more bytes simultaneously *without*
requiring a corresponding increase in the size of the lookup table. 
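
For illustration, the inner loop looks roughly like this (a hedged
sketch, assuming a reflected CRC-32 and eight precomputed 256-entry
tables; crc_table and crc32_sliceby8 are names made up here, not
PostgreSQL's actual macros):

#include <stdint.h>
#include <stddef.h>

extern const uint32_t crc_table[8][256];    /* 8 tables = 8 KB total */

uint32_t
crc32_sliceby8(uint32_t crc, const unsigned char *p, size_t len)
{
    /* Consume 8 input bytes per iteration with 8 independent lookups. */
    while (len >= 8)
    {
        crc ^= (uint32_t) p[0] | ((uint32_t) p[1] << 8) |
               ((uint32_t) p[2] << 16) | ((uint32_t) p[3] << 24);
        crc = crc_table[7][crc & 0xFF] ^
              crc_table[6][(crc >> 8) & 0xFF] ^
              crc_table[5][(crc >> 16) & 0xFF] ^
              crc_table[4][(crc >> 24) & 0xFF] ^
              crc_table[3][p[4]] ^
              crc_table[2][p[5]] ^
              crc_table[1][p[6]] ^
              crc_table[0][p[7]];
        p += 8;
        len -= 8;
    }
    /* Tail: classic one-byte-at-a-time loop for the remainder. */
    while (len-- > 0)
        crc = crc_table[0][(crc ^ *p++) & 0xFF] ^ (crc >> 8);
    return crc;
}

The eight tables total 8 KB, versus the impossibly large single table
that a naive 64-bit-wide lookup would require.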

We have the original author's word that no patent has been filed. Even
if we disbelieve that, which I have no reason to do, that alone would be
sufficient to make a counter-claim for any damages claimed by any
hypothetical patent holder in the future.


What is the difference between this case and other patches?


-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] MVCC question

2006-10-23 Thread Gevik Babakhani
Folks,

How long has PostgreSQL supported MVCC?
Has it been there from the beginning, or was it added later?

-- 
Regards,
Gevik Babakhani
www.postgresql.nl
www.truesoftware.nl


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Benny Amorsen
 MK == Mark Kirkwood [EMAIL PROTECTED] writes:

MK Here are the results after building gcc 4.1.2 (repeating results
MK for gcc 3.4.6 for comparison). I suspect that performance is
MK probably impacted because gcc 4.1.2 (and also the rest of the
MK tool-chain) is built with gcc 3.4.6 - but it certainly suggests
MK that the newer gcc versions don't like the slice-8 algorithm for
MK some reason.

They don't seem to like the old CRC algorithms either. It is quite
strange, such dramatic performance regressions from 3.x to 4.x are
rare.


/Benny



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] MVCC question

2006-10-23 Thread imad

Release 6.5 brings MVCC to PostgreSQL.
Check out the following doc for details
http://www.postgresql.org/docs/7.1/static/release-6-5.html

--Imad
EnterpriseDB
(http://www.enterprisedb.com)

On 10/23/06, Gevik Babakhani [EMAIL PROTECTED] wrote:

Folks,

How long has PostgreSQL supported MVCC?
Has it been there from the beginning, or was it added later?

--
Regards,
Gevik Babakhani
www.postgresql.nl
www.truesoftware.nl





---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] MVCC question

2006-10-23 Thread Peter Eisentraut
On Monday, 23 October 2006 13:50, Gevik Babakhani wrote:
 How long has PostgreSQL supported MVCC?

6.5
-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 So at this point I realize that intel's compiler is optimizing the loop
 away, at least for the std crc and probably for both.  So I make mycrc an
 array of 2, and subscript mycrc[j & 1] in the loop.

That's not a good workaround, because making mycrc expensive to access
means your inner loop timing isn't credible at all.  Instead try making the
buffer array nonlocal --- malloc it, perhaps.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PgSQL users quota

2006-10-23 Thread Jim C. Nasby
On Fri, Oct 20, 2006 at 03:30:40AM +0300, Tux P wrote:
 Hi .*
 
 Is there any chance of seeing the quota implementation described in this
 post in an upcoming release?
 
 http://archives.postgresql.org/pgsql-hackers/2004-07/msg00392.php

Since Jonah hasn't done anything with it he's presumably lost interest,
so you'd need to find someone else looking for an itch to scratch. And
it appears the original patch was against 7.4, so it'd probably need a
decent amount of work to make it work with our current code.

On the bright side, there is the following TODO item:

o Allow per-tablespace quotas
-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] PgSQL users quota

2006-10-23 Thread Jonah H. Harris

On 10/23/06, Jim C. Nasby [EMAIL PROTECTED] wrote:

Since Jonah hasn't done anything with it he's presumably lost interest,
so you'd need to find someone else looking for an itch to scratch. And
it appears the original patch was against 7.4, so it'd probably need a
decent amount of work to make it work with our current code.


Well, it was more of the case of community wants vs. what I needed at
the time.  I'm not quite sure if I have the patch lying around
anywhere, but it was fairly trivial to implement.


o Allow per-tablespace quotas


Yes, this is what came out of the discussion.  A per-tablespace patch
should be fairly easy to implement once you get a consensus on what
constitutes actual space usage... as VACUUMable space may or may not
apply depending on your use-case.

--
Jonah H. Harris, Software Architect | phone: 732.331.1300
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 2nd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Simon Riggs
On Sun, 2006-10-22 at 18:06 -0400, Tom Lane wrote:

 These numbers are um, not impressive.  Considering that a large fraction
 of our WAL records are pretty short, the fact that slice8 consistently
 loses at short buffer lengths is especially discouraging.  Much of that
 ground could be made up perhaps with tenser coding of the initialization
 and finalization code, but it'd still not be worth taking any legal risk
 for AFAICS.

It doesn't look good for SB8, does it? Nor for gcc4.1 either.

Presumably Intel themselves will have some come-back, but I'm not sure
what they'll say to so many conclusive tests.

Instead, I'd like to include a parameter to turn off CRC altogether, for
heavily CPU bound operations and the WAL drive on trustworthy hardware.

wal_checksum = off

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 Instead, I'd like to include a parameter to turn off CRC altogether, for
 heavily CPU bound operations and the WAL drive on trustworthy hardware.

No can do --- we rely on the checksums to be able to tell when we've hit
the end of WAL during replay.  You may as well propose not writing WAL
at all (and no, I don't think it'd pass).

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Dawid Kuroczko

On 10/23/06, Tom Lane [EMAIL PROTECTED] wrote:

It's not so much that I don't trust Intel as that a CRC algorithm is
exactly the sort of nice little self-contained thing that people love
to try to patent these days.  What I am really afraid of is that someone
else has already invented this same method (or something close enough
to it) and filed for a patent that Intel doesn't know about either.
I'd be wondering about that no matter where the code had come from.

Given the numbers I posted earlier today, the proposal is dead in the
water anyway, quite aside from any legal considerations.


The horror, the horror.  I wonder if changing to Slicing-by-8 would be
self-contained enough that people from the software-patent-free world
would be able to just patch their distribution if they wish.

  Regards,
 Dawid

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Problems starting Template1...

2006-10-23 Thread luis garcia
Hello, I'm from Venezuela, and I've been making some modifications to
Postgres' catalog, but there seems to be a problem creating the
template1 database. When the creation of the database starts, this is
what happens:

[EMAIL PROTECTED]:~$ /home/luis/pgsql/bin/initdb -D /home/luis/pgsql/data/
The files belonging to this database system will be owned by user luis.
This user must also own the server process.
The database cluster will be initialized with locale es_ES.UTF-8.
The default database encoding has accordingly been set to UTF8.
fixing permissions on existing directory /home/luis/pgsql/data ... ok
creating directory /home/luis/pgsql/data/global ... ok
creating directory /home/luis/pgsql/data/pg_xlog ... ok
creating directory /home/luis/pgsql/data/pg_xlog/archive_status ... ok
creating directory /home/luis/pgsql/data/pg_clog ... ok
creating directory /home/luis/pgsql/data/pg_subtrans ... ok
creating directory /home/luis/pgsql/data/pg_twophase ... ok
creating directory /home/luis/pgsql/data/pg_multixact/members ... ok
creating directory /home/luis/pgsql/data/pg_multixact/offsets ... ok
creating directory /home/luis/pgsql/data/base ... ok
creating directory /home/luis/pgsql/data/base/1 ... ok
creating directory /home/luis/pgsql/data/pg_tblspc ... ok
selecting default max_connections ... 100
selecting default shared_buffers ... 1000
creating configuration files ... ok
creating template1 database in /home/luis/pgsql/data/base/1 ... FATAL:
incorrect number of columns in row (expected 32, got 29)
child process exited with exit code 1
initdb: removing contents of data directory /home/luis/pgsql/data

I just added three more fields to the pg_class structure (pg_class.h),
just like this:

...
bool        relhasfrequency;  /* has time frequency */
NameData    relfrequencytype; /* table time frequency type
                                 (REAL_TIME, HISTORIC) */
NameData    relfrequency;     /* class time frequency value */
...
#define Natts_pg_class_fixed          31
#define Natts_pg_class                32
#define Anum_pg_class_relname          1
#define Anum_pg_class_relnamespace     2
...
DATA(insert OID = 1259 ( pg_class  PGNSP 83 PGUID 0 1259 0 0 0 0 0 f f r 32 0 0 0 0 0 t f f f f null 0 f _null_ ));
...

pg_class originally has 25 fields, and one of my friends has added four
more with no problems, but there seems to be a problem with the changes
I've made, and we don't know which it is... I also made the following
changes to pg_attribute.h:

...
{ 1259, {"relhasfrequency"},  16, -1, 1, 29, 0, -1, -1, true, 'p', 'c', true, false, false, true, 0 }, \
{ 1259, {"relfrequencytype"}, 19, -1, NAMEDATALEN, 30, 0, -1, -1, false, 'p', 'i', true, false, false, true, 0 }, \
{ 1259, {"relfrequency"},     19, -1, NAMEDATALEN, 31, 0, -1, -1, false, 'p', 'i', true, false, false, true, 0 }, \
{ 1259, {"relacl"},         1034, -1, -1, 32, 1, -1, -1, false, 'x', 'i', false, false, false, true, 0 }
...
DATA(insert ( 1259 relhasfrequency  16 -1 1 29 0 -1 -1 t p c t f f t 0));
DATA(insert ( 1259 relfrequencytype 19 -1 NAMEDATALEN 30 0 -1 -1 f p i t f f t 0));
DATA(insert ( 1259 relfrequency     19 -1 NAMEDATALEN 31 0 -1 -1 f p i t f f t 0));
DATA(insert ( 1259 relacl         1034 -1 -1 32 1 -1 -1 f x i f f f t 0));
...

If anyone could help me on this I would be very grateful, because this
is the final project of my career... THANKS

-- 
Luis D. García M.
Telf: 0414-3482018
- FACYT - UC -
- Computación -


Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum

2006-10-23 Thread Jim C. Nasby
If the decision to vacuum based on autovacuum criteria is good enough
for you then I think you should just focus on getting autovac to do what
you want it to do. Perhaps you just need to decrease the sleep time to a
few seconds, so that autovac will quickly detect when something needs to
be vacuumed.

The only case I can think of where autovac might not work as well as
smartvacuum would be if you had a lot of databases in the cluster, since
autovacuum will only vacuum one database at a time.

On Mon, Oct 23, 2006 at 11:18:39AM +0900, Hitoshi Harada wrote:
 Ok, 
 
 But my point is, autovacuum may conflict with a vacuum analyze command
 run in another session. My intention with smartvacuum() is based on this.
 Any solution for this??
 
 Regards, 
 
 
 Hitoshi Harada
 
  -Original Message-
  From: [EMAIL PROTECTED]
  [mailto:[EMAIL PROTECTED] On Behalf Of Tom Lane
  Sent: Monday, October 23, 2006 11:10 AM
  To: Hitoshi Harada
  Cc: 'Peter Eisentraut'; pgsql-hackers@postgresql.org
  Subject: Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum
  
  Hitoshi Harada [EMAIL PROTECTED] writes:
   How is this different from what autovacuum does?
  
   My application needs to do vacuum by itself, while
   autovacuum does it as a daemon.
   The database is updated so frequently that
   normal vacuum costs too much, and the tables being updated are
   too few to justify vacuuming the whole database.
   I want to use autovacuum except for the daemon aspect,
   but I want to control when to vacuum and which table to vacuum.
   So, nothing is different between autovacuum and smartvacuum(),
   except that the former is a daemon and the latter is a user function.
  
  This seems completely unconvincing.  What are you going to do that
  couldn't be done by autovacuum?
  
  regards, tom lane
  
 

-- 
Jim Nasby[EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Josh Berkus
Bruce, Tom, All:

  Given the numbers I posted earlier today, the proposal is dead in the
  water anyway, quite aside from any legal considerations.

 Agreed.  I just wanted to point out we have other sharks in the water.

*IF* Slice-by-8 turned out to be a winner, I could get the legal issues 
looked into and probably resolved.   But we should only do that if it 
turns out we really want to use it.

-- 
--Josh

Josh Berkus
PostgreSQL @ Sun
San Francisco

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Simon Riggs
On Mon, 2006-10-23 at 13:52 -0400, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Instead, I'd like to include a parameter to turn off CRC altogether, for
  heavily CPU bound operations and the WAL drive on trustworthy hardware.
 
 No can do --- we rely on the checksums to be able to tell when we've hit
 the end of WAL during replay.  

No we don't: Zero length records are the trigger for EOF.

Sure, a CRC failure will cause replay to end, but only if you calculate
it and compare it to what's on disk - which is the bit I would turn off.
We don't rely on that, however, so it is avoidable.

In most cases, it would be foolish to avoid: but there are cases where
the data is CRC checked by the hardware/system already, so I'd like to
make an option to turn this off, defaulting to on, for safety.

 You may as well propose not writing WAL
 at all (and no, I don't think it'd pass).

That would undo all of my efforts, so no I wouldn't consider that.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Joshua D. Drake

 In most cases, it would be foolish to avoid: but there are cases where
 the data is CRC checked by the hardware/system already, so I'd like to
 make an option to turn this off, defaulting to on, for safety.

How would we know? What are those cases?

Sounds like a foot gun to me.

Sincerely,

Joshua D. Drake

 
 You may as well propose not writing WAL
 at all (and no, I don't think it'd pass).
 
 That would undo all of my efforts, so no I wouldn't consider that.
 


-- 

   === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
   Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum

2006-10-23 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 The only case I can think of where autovac might not work as well as
 smartvacuum would be if you had a lot of databases in the cluster, since
 autovacuum will only vacuum one database at a time.

It's conceivable that it'd make sense to allow multiple autovac
processes running in parallel.  (The infrastructure part of this is easy
enough, the hard part is keeping them from all deciding to vacuum the
same table.)

One reason we have not done that already is the thought that multiple
vacuum processes would suck too much I/O to be reasonable.  Now you
could dial back their resource demands with the cost-delay settings,
but it's not clear that ten autovacs running at one-tenth speed are
better than one autovac using all the cycles you can spare.  Usually
I think it's best if a vacuum transaction finishes as fast as it can.

In any case, these exact same concerns would apply to manual vacuums
or a combination of manual and auto vacuum.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Tsearch2 index size

2006-10-23 Thread richard-pgodbc
I am running version 8.1.2 and I installed 8.2B last week.  I dumped
the data from the old version into the new version. The db consists of
several million records.  Total disk usage is approximately 114GB.


My two observations are as follows... Also, keep in mind these are truly 
just observations, I didn't use any benchmarking tools.  If someone can 
point me to a link of performance tools, I'd be happy to try them and report 
back!


1. The release notes indicate more efficient vacuuming.  However, both
vacuums seem to take about the same amount of time, i.e. approx. 9 hours.
Does "more efficient" simply mean less IO/CPU busyness?  This one doesn't
really bother me, the next one does...


Here are my vacuum parms, I used the same ones for both versions, of 
course.

--
maintenance_work_mem = 40    # Unnecessarily high, I know. I left it
                             # for comparison's sake.
vacuum_cost_delay = 50
vacuum_cost_page_hit = 1
vacuum_cost_page_miss = 10
vacuum_cost_page_dirty = 20
vacuum_cost_limit = 2000
--



2. I have a tsearch2 index which is 756MB in size in 8.1.2 but balloons to 
910MB in 8.2!  These numbers were taken right after a REINDEX.  Normally, I 
wouldn't care about physical index size, but this particular index is 
sitting on a ramdisk, so size really does matter.  I see that the tsearch2 
type was diddled with in 8.2.  Is this an intentional change to improve 
tsearch2 performance?


Thank you for any advice or abuse you give.  No.  Wait.  No abuse please.

Richard Whidden

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 On Mon, 2006-10-23 at 13:52 -0400, Tom Lane wrote:
 No can do --- we rely on the checksums to be able to tell when we've hit
 the end of WAL during replay.  

 No we don't: Zero length records are the trigger for EOF.

Only if the file happens to be all-zero already, which is not the normal
operating state (see WAL-file recycling).  Otherwise you have to be able
to detect an invalid record.

There are actually three checks used to detect end of WAL: zero record
length, invalid checksum, and incorrect back-pointer.  Zero length is
the first and cleanest-looking test, but AFAICS we have to have both of
the others to avoid obvious failure modes.
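
In outline, the three tests amount to something like this (an
illustrative sketch with assumed field names, not the actual xlog.c
code):

#include <stdint.h>
#include <stdbool.h>

typedef struct
{
    uint32_t xl_tot_len;    /* total record length; 0 means no record */
    uint32_t xl_crc;        /* CRC covering the record */
    uint64_t xl_prev;       /* position of the previous record */
} WalRecHdr;                /* assumed layout, for illustration only */

static bool
wal_record_valid(const WalRecHdr *rec, uint64_t expected_prev,
                 uint32_t computed_crc)
{
    if (rec->xl_tot_len == 0)           /* zero record length */
        return false;
    if (rec->xl_crc != computed_crc)    /* invalid checksum */
        return false;
    if (rec->xl_prev != expected_prev)  /* incorrect back-pointer */
        return false;
    return true;                        /* replay can continue */
}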

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Jeremy Drake
On Mon, 23 Oct 2006, Tom Lane wrote:

 Jeremy Drake [EMAIL PROTECTED] writes:
  So at this point I realize that intel's compiler is optimizing the loop
  away, at least for the std crc and probably for both.  So I make mycrc an
  array of 2, and subscript mycrc[j & 1] in the loop.

 That's not a good workaround, because making mycrc expensive to access
 means your inner loop timing isn't credible at all.  Instead try making the
 buffer array nonlocal --- malloc it, perhaps.

That did not make any difference.  The way I see it, the only way to
convince the compiler it really needs to do this loop more than once is to
make it think it is not overwriting the same variable every time.  The
subscript was the cheapest way I could think of to do that.  Any other
suggestions on how to do this are welcome.


   regards, tom lane

 ---(end of broadcast)---
 TIP 6: explain analyze is your friend



-- 
I like being single.  I'm always there when I need me.
-- Art Leo

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 On Mon, 23 Oct 2006, Tom Lane wrote:
 That's not a good workaround, because making mycrc expensive to access
 means your inner loop timing isn't credible at all.  Instead try making the
 buffer array nonlocal --- malloc it, perhaps.

 That did not make any difference.  The way I see it, the only way to
 convince the compiler it really needs to do this loop more than once is to
 make it think it is not overwriting the same variable every time.  The
 subscript was the cheapest way I could think of to do that.  Any other
 suggestions on how to do this are welcome.

Hmm.  Maybe store the CRCs into a global array somewhere?

uint32 results[NTESTS];

for ...
{
INIT/COMP/FIN_CRC32...
results[j] = mycrc;
}

This still adds a bit of overhead to the outer loop, but not much.

Another possibility is to put the INIT/COMP/FIN_CRC32 into an external
subroutine, thereby adding a call/return to the outer loop overhead.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Tsearch2 index size

2006-10-23 Thread Tom Lane
[EMAIL PROTECTED] writes:
 1. The release notes indicate more efficient vacuuming.  However, both 
 vacuums seems to take about the same amount of time, ie. approx: 9 hours.  

I think the improvements were only in btree index vacuuming, which it
sounds like isn't your big problem.

 2. I have a tsearch2 index which is 756MB in size in 8.1.2 but balloons to 
 910MB in 8.2!

FILLFACTOR?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Jeremy Drake
On Mon, 23 Oct 2006, Tom Lane wrote:

 Hmm.  Maybe store the CRCs into a global array somewhere?

   uint32 results[NTESTS];

   for ...
   {
   INIT/COMP/FIN_CRC32...
   results[j] = mycrc;
   }

 This still adds a bit of overhead to the outer loop, but not much.


That seems to have worked.

                Std crc         Slice-8 crc

Intel P4 Xeon 2.8Ghz (Gentoo, gcc-3.4.5, -O2)

8192 bytes      26.765317       10.511143
1024 bytes      3.357843        1.280890
64 bytes        0.223213        0.103767

Intel P4 Xeon 2.8Ghz (Gentoo, icc-9.0.032, -O2 -xN -ipo -parallel)

8192 bytes      29.495836       0.007107
1024 bytes      3.708665        0.012183
64 bytes        0.242579        0.008700


So the gcc times are reasonable, but the icc times for the slice-by-8 are
still too fast to be believed.  I will have to take a look at the
generated assembly later and see what gives.

My changed testcrc.c is attached, again.


-- 
I'd love to go out with you, but I did my own thing and now I've got
to undo it.

#include "postgres.h"

#include <time.h>
#include <sys/time.h>

#include "pg_crc.h"

int
main()
{
    volatile char buffer[TESTSIZE];
    pg_crc32    results[NTESTS];
    pg_crc32    mycrc;
    int         j;
    struct timeval tstart;
    struct timeval tstop;

    srand(time(NULL));
    for (j = 0; j < TESTSIZE; ++j)
        buffer[j] = (char) (255 * (rand() / (RAND_MAX + 1.0)));

    gettimeofday(&tstart, NULL);

    for (j = 0; j < NTESTS; j++)
    {
        INIT_CRC32(mycrc);
        COMP_CRC32(mycrc, buffer, TESTSIZE);
        FIN_CRC32(mycrc);
        /* Store each CRC so the compiler cannot discard the loop body. */
        results[j] = mycrc;
    }

    gettimeofday(&tstop, NULL);

    if (tstop.tv_usec < tstart.tv_usec)
    {
        tstop.tv_sec--;
        tstop.tv_usec += 1000000;
    }

    printf("bufsize = %d, loops = %d, elapsed = %ld.%06ld\n",
           TESTSIZE, NTESTS,
           (long) (tstop.tv_sec - tstart.tv_sec),
           (long) (tstop.tv_usec - tstart.tv_usec));

    return 0;
}

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Zdenek Kotala
I tried to use the COPY command to export and import tables from the
catalog, but COPY has a problem with the data type regproc. See this
example:


  create table test (like pg_aggregate);
  copy pg_aggregate to '/tmp/pg_agg.out';
  copy test from '/tmp/pg_agg.out';

ERROR:  more than one function named "pg_catalog.avg"
CONTEXT:  COPY test, line 1, column aggfnoid: "pg_catalog.avg"


The problem is that the pg_proc table has the following unique indexes:

 pg_proc_oid_index UNIQUE, btree (oid)
 pg_proc_proname_args_nsp_index UNIQUE, btree (proname, proargtypes,
pronamespace)


And regprocin in backend/utils/adt/regproc.c cannot find a unique OID
for the proname.


A workaround is to use binary mode, but on the other hand aclitem is
not supported in binary mode:


  postgres=# copy pg_class to '/tmp/pg_class.out' binary;
  ERROR:  no binary output function available for type aclitem


A solution would be for the COPY command to use the OID instead of the
proc name when exporting regproc.



Zdenek

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Andrew Dunstan

Zdenek Kotala wrote:

I tried to use the COPY command to export and import tables from the catalog



Is it just me or does this seem like a strange thing to want to do? I am 
trying to think of a good use case, so far without much success.


cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Alvaro Herrera
Zdenek Kotala wrote:
 I tried to use the COPY command to export and import tables from the
 catalog, but COPY has a problem with the data type regproc. See example:
 
   create table test (like pg_aggregate);
   copy pg_aggregate to '/tmp/pg_agg.out';
   copy test from '/tmp/pg_agg.out';
 
 ERROR:  more than one function named "pg_catalog.avg"
 CONTEXT:  COPY test, line 1, column aggfnoid: "pg_catalog.avg"

Hmm, maybe it should be using regprocedure instead?  That one emits
type-qualified function names, IIRC.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Zdenek Kotala

Andrew Dunstan wrote:

Zdenek Kotala wrote:

I tried to use the COPY command to export and import tables from the catalog



Is it just me or does this seem like a strange thing to want to do? I am 
trying to think of a good use case, so far without much success.




I'm playing with catalog upgrade. The very basic idea of my experiment
is to export the data from the catalog and import it back into a newly
initialized/fresh catalog.



Zdenek

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Hmm, maybe it should be using regprocedure instead?

Not unless you want to break initdb.  The only reason regproc still
exists, really, is to accommodate loading of pg_type during initdb.
Guess what: we can't do type lookup at that point.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Hmm, maybe it should be using regprocedure instead?
 
 Not unless you want to break initdb.  The only reason regproc still
 exists, really, is to accommodate loading of pg_type during initdb.
 Guess what: we can't do type lookup at that point.

I was thinking of the copied-out table, which does not necessarily have
to be pg_aggregate.  I just tried, and it works to do this:

alvherre=# create table pg_aggregate2 (aggfnoid regprocedure, aggtransfn
alvherre(# regprocedure, aggfinalfn regprocedure, aggsortop oid,
alvherre(# aggtranstype oid, agginitval text);
CREATE TABLE
alvherre=# insert into pg_aggregate2 select * from pg_aggregate;
INSERT 0 114
alvherre=# create table test (like pg_aggregate2);
CREATE TABLE
alvherre=# copy pg_aggregate2 to '/tmp/pg_agg.out';
COPY 114
alvherre=# copy test from '/tmp/pg_agg.out';
COPY 114
alvherre=# 

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Andrew Dunstan

Zdenek Kotala wrote:

Andrew Dunstan wrote:

Zdenek Kotala wrote:

I tried to use the COPY command to export and import tables from the catalog



Is it just me or does this seem like a strange thing to want to do? I 
am trying to think of a good use case, so far without much success.




I'm playing with catalog upgrade. The very basic idea of my experiment
is to export the data from the catalog and import it back into a newly
initialized/fresh catalog.



 


Fair enough, but I am somewhat doubtful that COPY is the best way to do 
this.


cheers

andrew


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Zdenek Kotala

Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:

Hmm, maybe it should be using regprocedure instead?


Not unless you want to break initdb.  The only reason regproc still
exists, really, is to accommodate loading of pg_type during initdb.
Guess what: we can't do type lookup at that point.


Do you mean something like this:


Datum
regprocout(PG_FUNCTION_ARGS)
{

  ...

  if (donot_resolve_procname == TRUE)
  {
     result = (char *) palloc(NAMEDATALEN);
     snprintf(result, NAMEDATALEN, "%u", proid);
  }

  ...

  PG_RETURN_CSTRING(result);
}


donot_resolve_procname would be set while the COPY is being performed.


Zdenek

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Tom Lane
Zdenek Kotala [EMAIL PROTECTED] writes:
 I'm playing with catalog upgrade. The very basic idea of my experiment
 is to export the data from the catalog and import it back into a newly
 initialized/fresh catalog.

That is never going to work, at least not for any interesting catalogs.
A system with a fresh (I assume you mean empty) pg_proc, for instance,
is non functional.

A much bigger problem, if you're thinking of this as a component step
of pg_upgrade, is that you can't use anything at the COPY level of
detail because it will fail if the new version wants a different catalog
layout --- for instance, if someone's added a column to the catalog.
The right way to implement pg_upgrade is to transfer the catalog data
at the SQL-command level of abstraction, ie, pg_dump -s and reload.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Tom Lane
Zdenek Kotala [EMAIL PROTECTED] writes:
if( donot_resolve_procname == TRUE)
{
   result = (char *) palloc(NAMEDATALEN);
   snprintf(result, NAMEDATALEN, "%u", proid);
}

What for?  If you want numeric OIDs you can have that today by casting
the column to OID.  More to the point, the issue is hardly restricted
to COPY --- you'd get the same thing if you tried to insert data with
INSERT.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Zdenek Kotala

Tom Lane wrote:

Zdenek Kotala [EMAIL PROTECTED] writes:
I'm playing with catalog upgrade. The very basic idea of my experiment
is to export the data from the catalog and import it back into a newly
initialized/fresh catalog.


That is never going to work, at least not for any interesting catalogs.
A system with a fresh (I assume you mean empty) pg_proc, for instance,
is non functional.


Not empty; freshly initialized by initdb. I want to copy only the user
data which is not created during bootstrap.




A much bigger problem, if you're thinking of this as a component step
of pg_upgrade, is that you can't use anything at the COPY level of
detail because it will fail if the new version wants a different catalog
layout --- for instance, if someone's added a column to the catalog.


Yes, I know about it. It is not a problem: I want to prepare a shadow
catalog with the new structure in a separate schema on the old database
and adjust the data in these tables. After that I want to make the final
COPY - the data will be copied with the correct structure.




The right way to implement pg_upgrade is to transfer the catalog data
at the SQL-command level of abstraction, ie, pg_dump -s and reload.


I'm not sure how important it is, but I think that preserving OIDs is
important, and the SQL level does not allow setting OIDs.


Zdenek

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Andrew Dunstan

Zdenek Kotala wrote:

Tom Lane wrote:


The right way to implement pg_upgrade is to transfer the catalog data
at the SQL-command level of abstraction, ie, pg_dump -s and reload.


I'm not sure how important it is, but I think that preserving OIDs is
important, and the SQL level does not allow setting OIDs.


 



Does it matter in any case other than where it refers to an on-disk 
object? And does that need anything other than a fixup to 
pg_class::relfilenode?



cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Tsearch2 index size

2006-10-23 Thread richard-pgodbc


2. I have a tsearch2 index which is 756MB in size in 8.1.2 but balloons to 
910MB in 8.2!


FILLFACTOR?



Tom, 

Of course!  I had it in my head that fillfactor had to be explicitly set.
But then, after RTFM, it looks like there are defaults!  Thank you!


One more inane question, though.  The default fillfactors are currently 
#define'ed.  Is there or will there be a way to set the default fillfactor, 
using a system table perhaps?  I realize this is just a convenience, not 
integral to fillfactor functionality...


I sniffed around the docs and mailing lists but I could not find an answer.  My 
apologies if this question has been answered already.

Richard Whidden

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] COPY does not work with regproc and aclitem

2006-10-23 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Zdenek Kotala wrote:
 I'm not sure how important it is, but I think that preserving OIDs is
 important, and the SQL level does not allow setting OIDs.

 Does it matter in any case other than where it refers to an on-disk 
 object? And does that need anything other than a fixup to 
 pg_class::relfilenode?

The only things pg_upgrade should be trying to preserve OIDs for are
large objects.  I don't even see a need to worry about relfilenode:
you've got to link the physical files into the new directory tree
anyway, you can perfectly well link them in under whatever new
relfilenode identity happens to be assigned during the dump-reload step.

This was all worked out years ago.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Mark Kirkwood

Benny Amorsen wrote:

MK == Mark Kirkwood [EMAIL PROTECTED] writes:


MK Here are the results after building gcc 4.1.2 (repeating results
MK for gcc 3.4.6 for comparison). I suspect that performance is
MK probably impacted because gcc 4.1.2 (and also the rest of the
MK tool-chain) is built with gcc 3.4.6 - but it certainly suggests
MK that the newer gcc versions don't like the slice-8 algorithm for
MK some reason.

They don't seem to like the old CRC algorithms either. It is quite
strange, such dramatic performance regressions from 3.x to 4.x are
rare.



Right - I think the regression is caused by libc and kernel being built 
with gcc 3.4.6 and the test program being built with gcc 4.1.2. 
Rebuilding *everything* with 4.1.2 (which I'm not sure is possible for 
FreeBSD at the moment) would probably get us back to numbers that looked 
more like my Gentoo ones [1].


Cheers

Mark

[1] Note that the upgrade process for switching Gentoo from gcc 3.4 to 
4.1 involves precisely this - build 4.1, then rebuild everything using 
4.1 (including 4.1 itself!)


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Tom Lane
Mark Kirkwood [EMAIL PROTECTED] writes:
 Right - I think the regression is caused by libc and kernel being built 
 with gcc 3.4.6 and the test program being built with gcc 4.1.2. 

Why do you think that?  The performance of the CRC loop shouldn't depend
at all on either libc or the kernel, because they're not invoked inside
the loop.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum

2006-10-23 Thread Matthew O'Connor

Tom Lane wrote:

Jim C. Nasby [EMAIL PROTECTED] writes:

The only case I can think of where autovac might not work as well as
smartvacuum would be if you had a lot of databases in the cluster, since
autovacuum will only vacuum one database at a time.


It's conceivable that it'd make sense to allow multiple autovac
processes running in parallel.  (The infrastructure part of this is easy
enough, the hard part is keeping them from all deciding to vacuum the
same table.)

One reason we have not done that already is the thought that multiple
vacuum processes would suck too much I/O to be reasonable.  Now you
could dial back their resource demands with the cost-delay settings,
but it's not clear that ten autovacs running at one-tenth speed are
better than one autovac using all the cycles you can spare.  Usually
I think it's best if a vacuum transaction finishes as fast as it can.


I think this is one of the reasons table specific delay settings were 
designed in from the beginning.  I think the main use cases for multiple 
vacuums at once are:
1) Vacuum per table space, assuming each table space is on a different
drive with its own I/O.
2) the frequently updated table that can't wait to be vacuumed while a 
large table is being vacuumed.  In this case if you set a system default 
delay setting and set a more aggressive table specific delay setting for 
your hot spot tables then multiple vacuums become a clear win.  This is 
an important case that I hope we handle soon.  At this point it's one of 
the main failings of the current autovacuum system.


---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] New CRC algorithm: Slicing by 8

2006-10-23 Thread Mark Kirkwood

Tom Lane wrote:

Mark Kirkwood [EMAIL PROTECTED] writes:
Right - I think the regression is caused by libc and kernel being built 
with gcc 3.4.6 and the test program being built with gcc 4.1.2. 


Why do you think that?  The performance of the CRC loop shouldn't depend
at all on either libc or the kernel, because they're not invoked inside
the loop.




Just come to the same conclusion myself... I've got gcc 3.4.6 on Gentoo,
so I tried using that, and observed no regression at all there for the
std case - which pretty much rules out any issues connected with "what
did I build kernel + libc with" vs. "what compiler am I using now":


Intel P-III 1.26Ghz (Gentoo 2006.1)

                Std crc         Slice-8 crc

8192 bytes  12.967997   28.363876   (gcc 4.1.1)
8192 bytes  12.956765   13.978495   (gcc 3.4.6)

So Gentoo using gcc 3.4.6 looks like FreeBSD using 3.4.6, so the std vs 
slice-8 performance seems to be all about compiler version.


Now the regression on FreeBSD for the std case might be (as Kenneth 
pointed out) due to 4.1.1 being built by 3.4.6 but I guess it is 
just a nit at this point.



Cheers

Mark



Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum

2006-10-23 Thread Jim C. Nasby
On Mon, Oct 23, 2006 at 03:08:03PM -0400, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  The only case I can think of where autovac might not work as well as
  smartvacuum would be if you had a lot of databases in the cluster, since
  autovacuum will only vacuum one database at a time.
 
 It's conceivable that it'd make sense to allow multiple autovac
 processes running in parallel.  (The infrastructure part of this is easy
 enough, the hard part is keeping them from all deciding to vacuum the
 same table.)
 
It might be worth creating a generic framework that prevents multiple
vacuums from hitting a table at once, autovac or not.

 One reason we have not done that already is the thought that multiple
 vacuum processes would suck too much I/O to be reasonable.  Now you
 could dial back their resource demands with the cost-delay settings,
 but it's not clear that ten autovacs running at one-tenth speed are
 better than one autovac using all the cycles you can spare.  Usually
 I think it's best if a vacuum transaction finishes as fast as it can.
 
There are other things that would benefit from having some idea of what
I/O resources are available. For example, having a separate bgwriter (or
reader) for each set of physical volumes. So a means of grouping
tablespaces wouldn't hurt.

 In any case, these exact same concerns would apply to manual vacuums
 or a combination of manual and auto vacuum.

Well, the advantage to manual vacuums is that you can tune things to
utilize multiple arrays...
-- 
Jim Nasby                               [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)



Re: [HACKERS] [PATCHES] smartvacuum() instead of autovacuum

2006-10-23 Thread Alvaro Herrera
Jim C. Nasby wrote:
 On Mon, Oct 23, 2006 at 03:08:03PM -0400, Tom Lane wrote:
  Jim C. Nasby [EMAIL PROTECTED] writes:
   The only case I can think of where autovac might not work as well as
   smartvacuum would be if you had a lot of databases in the cluster, since
   autovacuum will only vacuum one database at a time.
  
  It's conceivable that it'd make sense to allow multiple autovac
  processes running in parallel.  (The infrastructure part of this is easy
  enough, the hard part is keeping them from all deciding to vacuum the
  same table.)
  
 It might be worth creating a generic framework that prevents multiple
 vacuums from hitting a table at once, autovac or not.

That one is easy, because vacuum gets a lock on the affected table that
conflicts with itself.  The problem is that the second vacuum would
actually wait for the first to finish.

A naive idea is to use ConditionalLockAcquire, and if it fails just skip
the table.
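
A hedged sketch of what that could look like in C (the worker routine
and the exact lmgr call names are illustrative, not actual backend code;
a real VACUUM also holds its lock until the transaction ends):

#include "postgres.h"
#include "storage/lmgr.h"

/*
 * VACUUM takes ShareUpdateExclusiveLock, which conflicts with itself,
 * so a conditional acquire tells us whether another vacuum -- auto or
 * manual -- already has the table.
 */
static void
vacuum_if_free(Oid relid)
{
    if (!ConditionalLockRelationOid(relid, ShareUpdateExclusiveLock))
        return;                    /* already being vacuumed: skip it */

    vacuum_one_table(relid);       /* hypothetical worker routine */

    UnlockRelationOid(relid, ShareUpdateExclusiveLock);
}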

  One reason we have not done that already is the thought that multiple
  vacuum processes would suck too much I/O to be reasonable.  Now you
  could dial back their resource demands with the cost-delay settings,
  but it's not clear that ten autovacs running at one-tenth speed are
  better than one autovac using all the cycles you can spare.  Usually
  I think it's best if a vacuum transaction finishes as fast as it can.

In the scenario where one table is huge and another is very small, it
can certainly be useful to vacuum the small table several times while
the huge one has only been vacuumed once.  For that you definitely
need the ability to run parallel vacuums.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.



[HACKERS] Bug related to out of memory condition

2006-10-23 Thread Jeff Davis
This behavior exists in 8.1.4 and CVS HEAD. 

I list below my preexisting schema, a set of commands that behave as I
expect (and result in an ERROR), and a similar set of commands that do
not behave as I expect (and result in a PANIC). Note the position of
BEGIN in each.

This is quite finicky behavior. Apparently the columns in the crashme
table have a significant effect on the results. If you have trouble
reproducing this, I can give more system details. However, it behaves
this way on a fresh install of CVS HEAD, and it's 100% reproducible (for
me, anyway).

Regards,
Jeff Davis

---------------- EXISTING SCHEMA ----------------

CREATE TABLE r1( i INT PRIMARY KEY );
INSERT INTO r1 VALUES(1);
CREATE TABLE r2( i INT PRIMARY KEY );
INSERT INTO r2 VALUES(1);
CREATE TABLE r3( i INT PRIMARY KEY );
INSERT INTO r3 VALUES(1);
CREATE TABLE r4( i INT PRIMARY KEY );
INSERT INTO r4 VALUES(1);

First, here is the correct behavior:

---------------- CORRECT BEHAVIOR ----------------

crashme= CREATE TABLE crashme (
crashme(   attr1SERIAL8 PRIMARY KEY,
crashme(   attr2TIMESTAMPTZ DEFAULT NOW(),
crashme(   attr3TIMESTAMPTZ,
crashme(   attr4INT REFERENCES r1(i),
crashme(   attr5INT REFERENCES r2(i),
crashme(   attr6INT REFERENCES r3(i),
crashme(   attr7INT REFERENCES r4(i),
crashme(   attr8TEXT
crashme( );
NOTICE:  CREATE TABLE will create implicit sequence "crashme_attr1_seq"
for serial column "crashme.attr1"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"crashme_pkey" for table "crashme"
CREATE TABLE
crashme= BEGIN;
BEGIN
crashme= INSERT INTO crashme
(attr1,attr2,attr3,attr4,attr5,attr6,attr7,attr8) SELECT NEXTVAL
('crashme_attr1_seq'),NOW(),NOW(),1,1,1,1,'t1' FROM generate_series
(1,500);
ERROR:  out of memory
DETAIL:  Failed on request of size 32.
crashme=

---------------- UNEXPECTED BEHAVIOR ----------------

crashme= BEGIN;
BEGIN
crashme= CREATE TABLE crashme (
crashme(   attr1SERIAL8 PRIMARY KEY,
crashme(   attr2TIMESTAMPTZ DEFAULT NOW(),
crashme(   attr3TIMESTAMPTZ,
crashme(   attr4INT REFERENCES r1(i),
crashme(   attr5INT REFERENCES r2(i),
crashme(   attr6INT REFERENCES r3(i),
crashme(   attr7INT REFERENCES r4(i),
crashme(   attr8TEXT
crashme( );
NOTICE:  CREATE TABLE will create implicit sequence "crashme_attr1_seq"
for serial column "crashme.attr1"
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index
"crashme_pkey" for table "crashme"
CREATE TABLE
crashme= INSERT INTO crashme
(attr1,attr2,attr3,attr4,attr5,attr6,attr7,attr8) SELECT NEXTVAL
('crashme_attr1_seq'),NOW(),NOW(),1,1,1,1,'t1' FROM generate_series
(1,500);
WARNING:  AbortTransaction while in ABORT state
WARNING:  AbortTransaction while in ABORT state
WARNING:  AbortTransaction while in ABORT state
ERROR:  out of memory
DETAIL:  Failed on request of size 32.
ERROR:  out of memory
DETAIL:  Failed on request of size 27.
ERROR:  out of memory
DETAIL:  Failed on request of size 27.
ERROR:  out of memory
DETAIL:  Failed on request of size 27.
ERROR:  out of memory
DETAIL:  Failed on request of size 24.
PANIC:  ERRORDATA_STACK_SIZE exceeded
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!





[HACKERS] Replication documentation addition

2006-10-23 Thread Bruce Momjian

Here is my first draft of a new replication section for our
documentation.  I am looking for any comments.

---

Replication
===========

Database replication allows multiple computers to work together, making
them appear as a single computer to user applications.  This might
involve allowing a backup server to take over if the primary server
fails, or it might involve allowing several computers to work together
at the same time.

It would be ideal if database servers could be combined seamlessly.  Web
servers serving static web pages can be combined quite easily by merely
load-balancing web requests to multiple machines.  In fact, most
read-only servers can be combined relatively easily.

Unfortunately, most database servers have a read/write mix of requests,
and read/write servers are much harder to combine.  This is because,
though read-only data has to be placed on each server only once, a
write to any server has to be seen by all other servers so that future
read requests to those servers return consistent results.

This sync problem is the fundamental difficulty of doing database
replication.  Because there is no single solution that limits the impact
of the sync problem for all workloads, there are multiple replication
solutions.  Each solution addresses the sync problem in a different way,
and minimizes its impact for a specific workload.  

This section first describes two important replication capabilities, and
then outlines various replication solutions.

Synchronous vs. Asynchronous Replication
----------------------------------------

The term synchronous replication means that a query is not considered
committed unless all servers have access to the committed records.  In
that case, a failover to a backup server will lose no data records.
Asynchronous replication has a small delay between the time of commit
and its propagation to backup servers, opening the possibility that some
transactions might be lost in a switch to a backup server.  Asynchronous
replication is used when synchronous replication would be too slow.

Full vs. Partial Replication
----------------------------

The term full replication means that only an entire database cluster can
be replicated, while partial replication means more fine-grained control
over which objects are replicated is possible.

Shared Disk Failover
--------------------

This replication solution avoids the sync problem by having only one
copy of the database.  This is possible because a single disk array is
shared by multiple servers.  If the main database server fails, the
backup server is able to mount and start the database as though it were
restarting after a database crash.  This shared hardware functionality
is common in network storage devices.  This allows synchronous, full
replication.

Warm Standby Using Point-In-Time Recovery
-----------------------------------------

A warm standby server (add doc xref) can be kept current by reading a
stream of WAL records.  If the main server fails, the warm standby
contains almost all of the data of the main server, and can be used as
the new database server.  This allows asynchronous, full replication.
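
As a hedged illustration of the shipping side only (the destination path
is made up, and the standby's restore procedure is omitted), the primary
can ship each completed WAL segment via archive_command in
postgresql.conf:

# %p is the path of the finished WAL segment, %f its file name
archive_command = 'rsync %p standby:/var/lib/pgsql/walarchive/%f'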

Point-In-Time Recovery  [Asynchronous, Full]
--------------------------------------------

Point-In-Time Recovery is the same as a warm standby server except
that the standby server must go through a full restore and archive
recovery operation, delaying how quickly it can be used as the main
database server.  This allows asynchronous, full replication.

Continuously Running Failover Server
------------------------------------

A continuously running failover server allows the backup server to
answer read-only queries while the master server is running.  It
receives a continuous stream of write activity from the master server. 
Because the failover server can be used for read-only database requests,
it is ideal for data warehouse queries. Slony offers this as
asynchronous, partial replication.

Data Partitioning
-----------------

Data partitioning splits the database into data sets, each of which can
be modified by only one server.  For example, data can be partitioned by
main office, e.g. London and Paris.  While the London and Paris servers
have all data records, only London can modify London records, and only
Paris can modify Paris records.  Such partitioning is usually
accomplished in application code, though rules and triggers can help
enforce it and keep the read-only data sets current.  Slony can also be
used in such a setup.  While Slony replicates only entire tables, the
London and Paris data can be placed in separate tables, and inheritance
can be used to pull from both tables at the same time.
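
A minimal SQL sketch of that inheritance trick (table and column names
are made up, and the Slony configuration that replicates each child
table from its owning site is omitted):

CREATE TABLE orders (
    order_id  int,
    office    text,
    amount    numeric
);

-- one child table per office; each site writes only to its own child
CREATE TABLE orders_london (CHECK (office = 'London')) INHERITS (orders);
CREATE TABLE orders_paris  (CHECK (office = 'Paris'))  INHERITS (orders);

-- selecting from the parent pulls from both tables at the same time
SELECT * FROM orders;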

Query Broadcast Replication
---------------------------

This involves sending write queries to multiple servers.  Read-only
queries can be sent to a single server because there is no need for all
servers to process them.  This can be complex to set up 

Re: [HACKERS] [DOCS] Replication documentation addition

2006-10-23 Thread Bruce Momjian

Please disregard.  I am redoing it and will post a URL with the most
recent version.


[HACKERS] Replication documentation addition

2006-10-23 Thread Bruce Momjian
Here is a new replication documentation section I want to add for 8.2:

ftp://momjian.us/pub/postgresql/mypatches/replication

Comments welcomed.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDB    http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
