Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-09 Thread Simon Riggs
On Fri, 2006-01-06 at 16:13 -0500, Greg Stark wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
 
  Before we start debating merits of proposals based on random reads, can
  someone confirm that the sampling code actually does read randomly? I
  looked at it yesterday; there is a comment that states that blocks to be
  scanned are passed to the analyze function in physical order, and AFAICT
  the function that chooses blocks does so based strictly on applying a
  probability function to block numbers as it increments a counter. It
  seems that any reading is actually sequential and not random, which
  makes all the random_page_cost hand-waving null and void.
 
 Hm. I'm curious just how much that behaves like a sequential scan actually. I
 think I'll do some experiments. 
 
 Reading 1% (1267 read, 126733 skipped):7748264us
 Reading 2% (2609 read, 125391 skipped):   12672025us
 Reading 5% (6502 read, 121498 skipped):   19005678us
 Reading 5% (6246 read, 121754 skipped):   18509770us
 Reading 10% (12975 read, 115025 skipped): 19305446us
 Reading 20% (25716 read, 102284 skipped): 18147151us
 Reading 50% (63656 read, 64344 skipped):  18089229us
 Reading 100% (128000 read, 0 skipped):18173003us
 
 These numbers don't make much sense to me. It seems like 5% is about as slow
 as reading the whole file, which is even worse than I expected. I thought I
 was being a bit pessimistic to think reading 5% would be as slow as reading
 20% of the table.

Just to put a few rumours to bed:

- the current code does *not* use block sampling, it uses random row
sampling. (There is a part of the code that selects the next block but
that should not confuse you into thinking that the whole block is
sampled).

- yes, the random sampling is random - please read the code and comments

- yes, I would expect the results you get. If you sample 5% of rows and
each block has on average at least 20 rows, then we should expect the
majority of blocks to be hit.

Best Regards, Simon Riggs


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-09 Thread Lukas Smith

Simon Riggs wrote:


- yes, the random sampling is random - please read the code and comments

- yes, I would expect the results you get. If you sample 5% of rows and
each block has on average at least 20 rows, then we should expect the
majority of blocks to be hit.


and it seems from the benchmark posted to this list that random access is 
_very_ expensive (probably because the random reads are spread out so 
well that we do a lot of physical I/O instead of just logical I/O from some cache)


regards,
Lukas



Re: [HACKERS] Incremental Backup Script

2006-01-09 Thread Gregor Zeitlinger
 As per docs, if the databases are rarely updated it could take a long
 time for the WAL segment to roll over. 
Yes, therefore I want to copy the current WAL (as I said earlier).
When restoring, I also want to make sure that I restore exactly to the point 
when I copied the current WAL segment.

Hence I am considering doing it as follows:
1) take t = the current time
2) copy the current WAL
3) when restoring, set recovery_target_time = t
 
Maybe there is even a way to ask Postgres for its last committed transaction ID x.
In that case, we could set recovery_target_xid = x.
Is that possible?
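For what it's worth, the recovery settings involved might look like the sketch below (the restore_command path and the timestamp are placeholders, not values from this thread):

```
# recovery.conf (sketch; path and timestamp are placeholders)
restore_command = 'cp /mnt/backup/wal/%f %p'

# t recorded in step 1, just before copying the current WAL segment
recovery_target_time = '2006-01-09 12:00:00'
recovery_target_inclusive = 'true'
```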
 
Regards,
 
Gregor



Re: [HACKERS] ISO 8601 Intervals

2006-01-09 Thread Michael Glaesemann


On Jan 8, 2006, at 12:12 , Larry Rosenman wrote:


   I was thinking of handling the TODO for ISO8601 Interval output.


Just to be clear, you're talking about the ISO8601 duration syntax  
(PnYnMnDTnHnMnS), correct? (The SQL standard made the unfortunate  
choice to call durations, i.e., lengths of time, intervals.)


Michael Glaesemann
grzm myrealbox com






Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Andrew Dunstan



Greg Stark wrote:


Andrew Dunstan [EMAIL PROTECTED] writes:
   

The attached patch against cvs tip does seem to work. Instead of playing 
with the environment, we simply allow perl to do its worst and then put 
things back the way we wanted them.
 



How does that affect to the API calls you can make from Perl back into the
database? What if you change the locale and then issue a query from within
Perl?

 



If you deliberately change the locale settings (especially LC_COLLATE), 
all bets are off, surely. REINDEX will be in your future.


Calling setlocale() is in fact a forbidden operation in trusted plperl.

AFAICT, perl doesn't keep any state about locale settings; it just 
reacts to whatever the current settings are. But I could be wrong.


My main concern has been that we are pushing out a point release that 
advertises a fix for a problem, when the fix doesn't work on Windows. 
Either we need to find a fix (and I tried to supply one) or we need to 
change what we say about the release.


I'm also a bit distressed that nobody else has tested this, and we have 
just assumed that the fix would work, despite what we already know about 
how setlocale() works on Windows.


cheers

andrew






Re: [HACKERS] ISO 8601 Intervals

2006-01-09 Thread Larry Rosenman
Michael Glaesemann wrote:
 On Jan 8, 2006, at 12:12 , Larry Rosenman wrote:
 
I was thinking of handling the TODO for ISO8601 Interval output.
 
 Just to be clear, you're talking about the ISO8601 duration syntax
 (PnYnMnDTnHnMnS), correct? (The SQL standard made the unfortunate
 choice to call durations, i.e., lengths of time, intervals.)  

Yes.

LER


-- 
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 512-248-2683 E-Mail: ler@lerctr.org
US Mail: 430 Valona Loop, Round Rock, TX 78681-3683 US




Re: [HACKERS] Fw: Is anyone interested in getting PostgreSQL working

2006-01-09 Thread Jim Buttafuoco
Stefan,

well that is good news. Can you tell me what version of Linux you are using, 
and which gcc version? I will let Martin know.

Thanks
Jim


-- Original Message ---
From: Stefan Kaltenbrunner [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: pgsql-hackers pgsql-hackers@postgresql.org
Sent: Mon, 09 Jan 2006 08:55:06 +0100
Subject: Re: [HACKERS] Fw: Is anyone interested in getting PostgreSQL working

 Jim Buttafuoco wrote:
  Hackers,
  
  I can confirm that HEAD does not initdb because of a SIGBUS as reported 
  below by Martin Pitt @ debian (see his email 
  below).  My build farm member (corgi) did pass all checks 6 days ago (I was 
  having some issues with the build farm 
  code before that).  If anyone would like to SSH into the box, please 
  contact me via email and I will get an account 
  setup.  Right now, I am trying to build 8.1 to see if it passes.
 
 I cannot confirm this - the mipsel box I have on the buildfarm
 (lionfish) seems to be happily building all branches and completing
 make check.
 
 Stefan
 
--- End of Original Message ---




Re: [HACKERS] Fw: Is anyone interested in getting PostgreSQL working

2006-01-09 Thread Stefan Kaltenbrunner

Jim Buttafuoco wrote:

Stefan,


first i would ask you to fix your mailserver setup because my last Mail 
to you bounced with:


550 5.0.0 Sorry we don't accept mail from Austria

which makes it rather difficult for me to reply to your personal mail



well that is good news, can you tell me what version of linux you are using and 
what gcc version also.  I will let
Martin know.



lionfish is a stock Debian/Sarge box (a cobalt cube) with gcc 3.3.5.


Stefan



Re: [HACKERS] Fw: Is anyone interested in getting PostgreSQL working

2006-01-09 Thread Jim Buttafuoco
Stefan,

My mail admin has removed the Austria block; I guess we were getting spammed by 
someone there.  Can you send the output of dpkg --list, so I can compare what 
packages you are using with what I have?

Thanks
Jim





-- Original Message ---
From: Stefan Kaltenbrunner [EMAIL PROTECTED]
To: pgsql-hackers pgsql-hackers@postgresql.org
Cc: [EMAIL PROTECTED]
Sent: Mon, 09 Jan 2006 15:03:28 +0100
Subject: Re: [HACKERS] Fw: Is anyone interested in getting PostgreSQL working

 Jim Buttafuoco wrote:
  Stefan,
 
 first i would ask you to fix your mailserver setup because my last Mail 
 to you bounced with:
 
 550 5.0.0 Sorry we don't accept mail from Austria
 
 which makes it rather difficult for me to reply to your personal mail
 
  
  well that is good news, can you tell me what version of linux you are using 
  and what gcc version also.  I will let
  Martin know.
 
 lionfish is a stock Debian/Sarge box (a cobalt cube) with gcc 3.3.5.
 
 Stefan
 
--- End of Original Message ---




Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-09 Thread Greg Stark
Simon Riggs [EMAIL PROTECTED] writes:

 - yes, I would expect the results you get. If you sample 5% of rows and
 each block has on average at least 20 rows, then we should expect the
 majority of blocks to be hit.

These results are from my test program. 5% means 5% of 8k blocks from the test
file. In other words, reading a random 5% of the blocks from the test file in
sequential order but seeking over the skipped blocks is just as slow as
reading the entire file.

I feel like that can't be right, but I can't find anything wrong with the
methodology.

An updated program is attached with which I got these results:

bash-3.00# for i in `seq 1 100` ; do umount /u6; mount /dev/sda1 /u6; 
~stark/src/pg/a.out /u6/temp/small $i ; done
Reading 1% (1280/128000 blocks 1048576000 bytes) total time 7662706us MB/s 1.37 
effective MB/s 136.84
Reading 2% (2560/128000 blocks 1048576000 bytes) total time 12495106us MB/s 
1.68 effective MB/s 83.92
Reading 3% (3840/128000 blocks 1048576000 bytes) total time 15847342us MB/s 
1.99 effective MB/s 66.17
Reading 4% (5120/128000 blocks 1048576000 bytes) total time 18281244us MB/s 
2.29 effective MB/s 57.36
Reading 5% (6400/128000 blocks 1048576000 bytes) total time 18988843us MB/s 
2.76 effective MB/s 55.22
Reading 6% (7680/128000 blocks 1048576000 bytes) total time 19225394us MB/s 
3.27 effective MB/s 54.54
Reading 7% (8960/128000 blocks 1048576000 bytes) total time 19462241us MB/s 
3.77 effective MB/s 53.88
Reading 8% (10240/128000 blocks 1048576000 bytes) total time 19747881us MB/s 
4.25 effective MB/s 53.10
Reading 9% (11520/128000 blocks 1048576000 bytes) total time 19451411us MB/s 
4.85 effective MB/s 53.91
Reading 10% (12800/128000 blocks 1048576000 bytes) total time 19546511us MB/s 
5.36 effective MB/s 53.65
Reading 11% (14080/128000 blocks 1048576000 bytes) total time 18989375us MB/s 
6.07 effective MB/s 55.22
Reading 12% (15360/128000 blocks 1048576000 bytes) total time 18722848us MB/s 
6.72 effective MB/s 56.01
Reading 13% (16640/128000 blocks 1048576000 bytes) total time 18621588us MB/s 
7.32 effective MB/s 56.31
Reading 14% (17920/128000 blocks 1048576000 bytes) total time 18581751us MB/s 
7.90 effective MB/s 56.43
Reading 15% (19200/128000 blocks 1048576000 bytes) total time 18422160us MB/s 
8.54 effective MB/s 56.92
Reading 16% (20480/128000 blocks 1048576000 bytes) total time 18148012us MB/s 
9.24 effective MB/s 57.78
Reading 17% (21760/128000 blocks 1048576000 bytes) total time 18147779us MB/s 
9.82 effective MB/s 57.78
Reading 18% (23040/128000 blocks 1048576000 bytes) total time 18023256us MB/s 
10.47 effective MB/s 58.18
Reading 19% (24320/128000 blocks 1048576000 bytes) total time 18039846us MB/s 
11.04 effective MB/s 58.13
Reading 20% (25600/128000 blocks 1048576000 bytes) total time 18081214us MB/s 
11.60 effective MB/s 57.99
...

#include <sys/types.h>
#include <sys/stat.h>
#include <sys/time.h>
#include <time.h>
#include <fcntl.h>
#include <unistd.h>

#include <stdio.h>
#include <stdlib.h>

#define BLOCKSIZE 8192

int main(int argc, char *argv[], char *arge[]) 
{
  char *fn;
  int fd;
  int perc;
  struct stat statbuf;
  struct timeval tv1,tv2;
  off_t size, offset;
  char buf[BLOCKSIZE];
  int b_toread, b_toskip, b_read=0, b_skipped=0;
  long us;

  fn = argv[1];
  perc = atoi(argv[2]);

  fd = open(fn, O_RDONLY);
  fstat(fd, &statbuf);
  size = statbuf.st_size;
  
  size = size/BLOCKSIZE*BLOCKSIZE;
  
  gettimeofday(&tv1, NULL);

  srandom(getpid()^tv1.tv_sec^tv1.tv_usec);

  b_toread = size/BLOCKSIZE*perc/100;
  b_toskip = size/BLOCKSIZE-b_toread;

  for (offset = 0; offset < size; offset += BLOCKSIZE) {
    /* Selection sampling: keep this block with probability
     * (blocks still needed) / (blocks still remaining). */
    if (random()%(b_toread+b_toskip) < b_toread) {
      lseek(fd, offset, SEEK_SET);
      read(fd, buf, BLOCKSIZE);
      b_toread--;
      b_read++;
    } else {
      b_toskip--;
      b_skipped++;
    }
  }
  
  gettimeofday(&tv2, NULL);
  
  us = (tv2.tv_sec-tv1.tv_sec)*1000000 + (tv2.tv_usec-tv1.tv_usec);
  
  fprintf(stderr,
	  "Reading %d%% (%d/%d blocks %ld bytes) total time %ldus MB/s %.2f effective MB/s %.2f\n",
	  perc,
	  b_read, b_read+b_skipped, (long) size,
	  us,
	  (double)b_read*BLOCKSIZE/us,
	  (double)size/us
	  );
  exit(0);
}



-- 
greg



Re: [HACKERS] ISO 8601 Intervals

2006-01-09 Thread Ron Mayer

Larry Rosenman wrote:

Michael Glaesemann wrote:

On Jan 8, 2006, at 12:12 , Larry Rosenman wrote:

  I was thinking of handling the TODO for ISO8601 Interval output.


Just to be clear, you're talking about the ISO8601 duration syntax
(PnYnMnDTnHnMnS), correct? (The SQL standard made the unfortunate
choice to call durations, i.e., lengths of time, intervals.)  


Back in 2003 I submitted such a patch [1,1b] that resulted in a fair
amount of discussion, including some still (AFAIK) open issues
about the naming of the datestyle settings to control it [2,3,4].

There was also some discussion of the range of ISO 8601 durations
to support (ISO 8601 Basic Format, ISO 8601 Alternative Format,
and ISO 8601 Extended Format, which is more human-readable) [5].

Finally, there is a similar, but different, syntax currently supported
by PostgreSQL (where '1Y1M' means 1 year 1 minute, while ISO 'P1Y1M'
would mean 1 year 1 month); Tom recommended ripping that code
out [7], and at one point said my patch was looking cleaner than
the existing code [8].  My patch does not yet rip that out.


I still use the patch myself, but don't have it updated to CVS tip.
I'd be happy to do so if people want that as a starting point.

   Ron


[1] http://archives.postgresql.org/pgsql-patches/2003-09/msg00103.php
[1b] http://archives.postgresql.org/pgsql-patches/2003-09/msg00286.php
[2] http://archives.postgresql.org/pgsql-patches/2003-09/msg00122.php
[3] http://archives.postgresql.org/pgsql-patches/2003-09/msg00129.php
[4] http://archives.postgresql.org/pgsql-patches/2003-09/msg00130.php
[5] http://archives.postgresql.org/pgsql-patches/2003-09/msg00133.php
[6] http://archives.postgresql.org/pgsql-patches/2003-09/msg00134.php
[7] http://archives.postgresql.org/pgsql-patches/2003-09/msg00134.php

[8] http://archives.postgresql.org/pgsql-patches/2003-09/msg00121.php



Re: [HACKERS] ISO 8601 Intervals

2006-01-09 Thread Ron Mayer

One more link...
This http://archives.postgresql.org/pgsql-patches/2003-12/msg00049.php
was the final draft of the patch I submitted, with docs patches, that
did not break backward compatibility (did not rip out the old syntax)
and supported both input and output of ISO 8601-compliant intervals
by setting the datestyle to iso8601basic, as discussed in the thread
linked in the quoted article below.

It was applied http://archives.postgresql.org/pgsql-patches/2003-12/msg00253.php
and then debated 
http://archives.postgresql.org/pgsql-patches/2003-12/msg00202.php
and then unapplied 
http://archives.postgresql.org/pgsql-patches/2003-12/msg00030.php
on Peter Eisentraut's recommendation to implement SQL standard intervals first.



Ron Mayer wrote:

Larry Rosenman wrote:


Michael Glaesemann wrote:


On Jan 8, 2006, at 12:12 , Larry Rosenman wrote:


  I was thinking of handling the TODO for ISO8601 Interval output.



Just to be clear, you're talking about the ISO8601 duration syntax
(PnYnMnDTnHnMnS), correct? (The SQL standard made the unfortunate
choice to call durations, i.e., lengths of time, intervals.)  



Back in 2003 I submitted such a patch [1,1b] that resulted in a fair
amount of discussion, including some still (AFAIK) open issues
about the naming of the datestyle settings to control it [2,3,4].

There was also some discussion of the range of ISO 8601 durations
to support (ISO 8601 Basic Format, ISO 8601 Alternative Format,
and ISO 8601 Extended Format, which is more human-readable) [5].

Finally, there is a similar, but different, syntax currently supported
by PostgreSQL (where '1Y1M' means 1 year 1 minute, while ISO 'P1Y1M'
would mean 1 year 1 month); Tom recommended ripping that code
out [7], and at one point said my patch was looking cleaner than
the existing code [8].  My patch does not yet rip that out.


I still use the patch myself, but don't have it updated to CVS tip.
I'd be happy to do so if people want that as a starting point.

   Ron


[1] http://archives.postgresql.org/pgsql-patches/2003-09/msg00103.php
[1b] http://archives.postgresql.org/pgsql-patches/2003-09/msg00286.php
[2] http://archives.postgresql.org/pgsql-patches/2003-09/msg00122.php
[3] http://archives.postgresql.org/pgsql-patches/2003-09/msg00129.php
[4] http://archives.postgresql.org/pgsql-patches/2003-09/msg00130.php
[5] http://archives.postgresql.org/pgsql-patches/2003-09/msg00133.php
[6] http://archives.postgresql.org/pgsql-patches/2003-09/msg00134.php
[7] http://archives.postgresql.org/pgsql-patches/2003-09/msg00134.php

[8] http://archives.postgresql.org/pgsql-patches/2003-09/msg00121.php




Re: [HACKERS] Stats collector performance improvement

2006-01-09 Thread Hannu Krosing
On Sun, 2006-01-08 at 11:49, Greg Stark wrote:
 Hannu Krosing [EMAIL PROTECTED] writes:
 
  Interestingly, I use the pg_stat_activity view to watch for stuck backends,
  stuck in the sense that they have not noticed that the client went away
  and are now waiting for the TCP timeout to happen. I query for backends which
  have been in the IDLE state for longer than XX seconds. I guess that at
  least some kind of indication for this should be available.
 
 You mean like the tcp_keepalives_idle option?
 
 http://www.postgresql.org/docs/8.1/interactive/runtime-config-connection.html#GUC-TCP-KEEPALIVES-IDLE
 

Kind of, only I'd like to be able to set timeouts of less than 120 minutes.

from:
http://developer.apple.com/documentation/mac/NetworkingOT/NetworkingWOT-390.html#HEADING390-0

kp_timeout
Set the requested timeout value, in minutes. Specify a value of
T_UNSPEC to use the default value. You may specify any positive
value for this field of 120 minutes or greater. The timeout
value is not an absolute requirement; if you specify a value
less than 120 minutes, TCP will renegotiate a timeout of 120
minutes.

---
Hannu





Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-09 Thread Greg Stark

  These numbers don't make much sense to me. It seems like 5% is about as slow
  as reading the whole file, which is even worse than I expected. I thought I
  was being a bit pessimistic to think reading 5% would be as slow as reading
  20% of the table.

I have a theory. My test program, like Postgres, is reading in 8k chunks.
Perhaps that's fooling Linux into thinking it's a sequential read and reading
in 32k chunks internally. That would effectively make a 25% scan a full table
scan. And a 5% scan would be a 20% scan which is about where I would have
expected the breakeven point to be.

-- 
greg




Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 I don't know. Reading that code just makes my head spin ...

Yeah, too many ifdefs :-(.  But I suppose that the initial
#ifdef LOCALE_ENVIRON_REQUIRED block is not compiled on sane
platforms, meaning that the first code in the routine is the
unconditional
if (! setlocale(LC_ALL, ""))
setlocale_failure = TRUE;

 I should have thought a library shouldn't make too many assumptions
 about locale settings anyway.

Indeed; I think a pretty strong case can be made that this is a Perl
bug.  It's reasonable to be doing the setlocale call in a standalone
Perl executable, but libperl should just work with whatever locale
settings have been chosen by the surrounding program (ie, all these
calls should be setlocale(LC_xxx, NULL) in the libperl case).

 I'm just about out of ideas and right out of time to spend on this.

We could just file a Perl bug report and wait for them to fix it.

regards, tom lane



Re: [HACKERS] lookup_rowtype_tupdesc considered harmful

2006-01-09 Thread Neil Conway
On Sun, 2006-01-08 at 20:04 -0500, Tom Lane wrote:
 On reflection I think that lookup_rowtype_tupdesc is simply misdesigned.
 We can't have it handing back a pointer to a data structure of unspecified
 lifetime.  One possibility is to give it an API comparable to the
 syscache lookup functions, ie you get a reference-counted pointer that
 you have to explicitly release when done with it.

This seems a cleaner approach.

 One big strike against the reference-count approach is that it'd be
 difficult to back-patch such a solution into existing branches, since
 it would amount to an incompatible API change.

Perhaps adding a palloc() to lookup_rowtype_tupdesc() can be done for
back branches (which should be minimally invasive), and a
reference-counting API can be implemented for 8.2.

I'd be happy to volunteer to do the refcounting changes, in the
(probably unlikely :) ) event you'd rather not do the work yourself.

-Neil





Re: [HACKERS] lookup_rowtype_tupdesc considered harmful

2006-01-09 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Sun, 2006-01-08 at 20:04 -0500, Tom Lane wrote:
 On reflection I think that lookup_rowtype_tupdesc is simply misdesigned.
 We can't have it handing back a pointer to a data structure of unspecified
 lifetime.  One possibility is to give it an API comparable to the
 syscache lookup functions, ie you get a reference-counted pointer that
 you have to explicitly release when done with it.

 This seems a cleaner approach.

I have not been able to think of an efficient way to make it work while
still handing back a simple TupleDesc pointer --- seems like we'd have
to contort the API somehow so that the release function can find the
reference count.  Any thoughts about that?
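One conventional answer, sketched below purely as an illustration (these type and function names are hypothetical, not the PostgreSQL API), is to embed the returned structure inside a counted wrapper, so the release function can step back from the plain payload pointer to the count by a fixed offset:

```c
#include <stdlib.h>
#include <stddef.h>

typedef struct
{
    int natts;                      /* stand-in for the real contents */
} TupleDescData;

typedef struct
{
    int           refcount;
    TupleDescData payload;          /* callers receive &payload */
} CountedDesc;

/* Bump the count and hand back a plain payload pointer. */
static TupleDescData *desc_acquire(CountedDesc *cd)
{
    cd->refcount++;
    return &cd->payload;
}

/* Recover the enclosing counted struct from the payload pointer by
 * subtracting the payload's offset, then drop one reference. */
static void desc_release(TupleDescData *td)
{
    CountedDesc *cd = (CountedDesc *)
        ((char *) td - offsetof(CountedDesc, payload));

    if (--cd->refcount == 0)
        free(cd);
}
```

The caller-visible API still traffics in bare TupleDesc pointers; only acquire/release know about the wrapper.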

 Perhaps adding a palloc() to lookup_rowtype_tupdesc() can be done for
 back branches (which should be minimally invasive), and a
 reference-counting API can be implemented for 8.2.

Agreed, it doesn't necessarily have to be the same solution in the back
branches.  OTOH I am still not convinced it's worth the trouble to
implement a reference-counting solution.

 I'd be happy to volunteer to do the refcounting changes, in the
 (probably unlikely :) ) event you'd rather not do the work yourself.

I've got plenty of stuff to do, will be happy to let you take up this
problem, if you have time to do something with it soon.

regards, tom lane



Re: [HACKERS] catalog corruption bug

2006-01-09 Thread Jeremy Drake
On Sun, 8 Jan 2006, Tom Lane wrote:

 Yeah, that's not very surprising.  Running the forced-cache-resets
 function will definitely expose that catcache bug pretty quickly.
 You'd need to apply the patches I put in yesterday to have a system
 that has any chance of withstanding that treatment for any length of time.

  I think I am going to just run without the function running this time and
  see if it does the duplicate type error and if it will generate two cores.

I ran without that function you made, and it got the error, but not a
crash.  I stuck an Assert(false) right before the ereport for that
particular error, and I did end up with a core there, but I don't see
anything out of the ordinary (what little I know of the ordinary ;)

#0  0x2b8a0cf9 in kill () from /usr/lib64/libc.so.6
#1  0x2b8a0a3d in raise () from /usr/lib64/libc.so.6
#2  0x2b8a1c82 in abort () from /usr/lib64/libc.so.6
#3  0x005f9878 in ExceptionalCondition (
    conditionName=0x2c53 <Address 0x2c53 out of bounds>,
    errorType=0x6 <Address 0x6 out of bounds>, fileName=0x0,
    lineNumber=-1)
    at assert.c:51
#4  0x00460967 in _bt_doinsert (rel=0x2ab05568,
btitem=0xbec2c0,
index_is_unique=1 '\001', heapRel=0x8bf0f0) at nbtinsert.c:247
#5  0x00463773 in btinsert (fcinfo=0x2c53) at nbtree.c:228
#6  0x005fe869 in FunctionCall6 (flinfo=0x8, arg1=6, arg2=0,
arg3=18446744073709551615, arg4=0, arg5=0, arg6=0) at fmgr.c:1267
#7  0x0045bf4f in index_insert (indexRelation=0x2ab05568,
values=0x7fdfde20, isnull=0x7fdfde00 , heap_t_ctid=0xbebeac,
heapRelation=0x8bf0f0, check_uniqueness=1 '\001') at indexam.c:215
#8  0x0048f8fa in CatalogIndexInsert (indstate=0x2c53,
heapTuple=0xbebb88) at indexing.c:124
#9  0x0048f994 in CatalogUpdateIndexes (heapRel=0x2c53,
heapTuple=0xbebea8) at indexing.c:149
#10 0x0049bc67 in TypeCreate (typeName=0x7fdfe3e0
push_tmp,
typeNamespace=11057063, relationOid=12171371, relationKind=114 'r',
internalSize=-16728, typeType=99 'c', typDelim=44 ',',
inputProcedure=2290, outputProcedure=2291, receiveProcedure=2402,
sendProcedure=2403, analyzeProcedure=0, elementType=0, baseType=0,
defaultTypeValue=0x0, defaultTypeBin=0x0, passedByValue=-16 '',
alignment=100 'd', storage=120 'x', typeMod=-1, typNDims=0,
typeNotNull=0 '\0') at pg_type.c:316
#11 0x0048c361 in heap_create_with_catalog (
relname=0x7fdfe3e0 push_tmp, relnamespace=11057063,
reltablespace=0, relid=12171371, ownerid=16384, tupdesc=0xbeb8e8,
relkind=114 'r', shared_relation=0 '\0', oidislocal=0 '\0',
oidinhcount=0,
oncommit=ONCOMMIT_DROP, allow_system_table_mods=0 '\0') at heap.c:634
#12 0x004de220 in DefineRelation (stmt=0x93fc30, relkind=114 'r')
at tablecmds.c:423
#13 0x0058bfd0 in ProcessUtility (parsetree=0x93fc30, params=0x0,
dest=0x814b40, completionTag=0x0) at utility.c:497
#14 0x00515cb5 in _SPI_execute_plan (plan=0x93f9a8,
Values=0x9c5798,
Nulls=0x9c57b8 ~, '\177' repeats 199 times..., snapshot=0x0,
crosscheck_snapshot=0x0, read_only=0 '\0', tcount=0) at spi.c:1449
#15 0x005165fc in SPI_execute_plan (plan=0x93f9a8,
Values=0x9c5798,
Nulls=0x9c57b8 ~, '\177' repeats 199 times..., read_only=0 '\0',
tcount=0) at spi.c:336
#16 0x2c95d8a4 in exec_stmts (estate=0x7fdfe950, stmts=0x6)
at pl_exec.c:2280
#17 0x2c95ebc2 in exec_stmt_block (estate=0x7fdfe950,
block=0x8f2c70) at pl_exec.c:936
#18 0x2c95f5ab in plpgsql_exec_function (func=0x913bc8,
fcinfo=0x7fdfea90) at pl_exec.c:286
#19 0x2c9573f5 in plpgsql_call_handler (fcinfo=0x7fdfea90)
at pl_handler.c:123
#20 0x00501a74 in ExecMakeFunctionResult (fcache=0x90a7f0,
econtext=0x90a6c0, isNull=0x90ae38
\177~\177\177\177\177\177\177!\006,
isDone=0x90aef0) at execQual.c:1095
#21 0x00505543 in ExecProject (projInfo=0x90ae58,
isDone=0x7fdfeef4) at execQual.c:3669
#22 0x0050ff5a in ExecResult (node=0x90a5a8) at nodeResult.c:157
#23 0x0050034d in ExecProcNode (node=0x90a5a8) at
execProcnode.c:306
#24 0x004ff5ea in ExecutorRun (queryDesc=0x90a5a8,
direction=ForwardScanDirection, count=0) at execMain.c:1110
#25 0x0058a5de in PortalRunSelect (portal=0x8e6c68, forward=1
'\001',
count=0, dest=0x8dad30) at pquery.c:794
#26 0x0058abdf in PortalRun (portal=0x8e6c68,
count=9223372036854775807, dest=0x8dad30, altdest=0x8dad30,
completionTag=0x7fdff320 ) at pquery.c:646
#27 0x00588fcb in PostgresMain (argc=9333864, argv=0x8dac18,
username=0x8853f0 jeremyd) at postgres.c:1754
#28 0x0055e20a in ServerLoop () at postmaster.c:2853
#29 0x0055f9f9 in PostmasterMain (argc=3, argv=0x8832e0)
at postmaster.c:943
#30 0x0051fb83 in main (argc=3, argv=0x8832e0) at main.c:256


 Please also look at putting together a 

Re: [HACKERS] [PATCHES] plpgsql: check domain constraints

2006-01-09 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 GetDomainConstraints() looks fairly expensive, so it would be nice to do
 some caching. What would the best way to implement this be? I had
 thought that perhaps the typcache would work, but there seems to be no
 method to flush stale typcache data. Perhaps we could add support for
 typcache invalidation (via a new sinval message), and then add the
 domain constraint information to the typcache. Or is there an easier
 way?

Yeah, I had been thinking of exactly the same thing a few months ago
after noting that GetDomainConstraints() can be pretty dang slow ---
it seemed to be a major bottleneck for Kevin Grittner here:
http://archives.postgresql.org/pgsql-hackers/2005-09/msg01135.php
Unfortunately the rest of that conversation was unintentionally
off-list, but we identified heavy use of pg_constraint_contypid_index
as the source of his issue, and I said

: Hmm.  The only commonly-used code path I can see that would touch
: pg_constraint_contypid_index is GetDomainConstraints(), which would be
: called (once) during startup of a command that involves a CoerceToDomain
: expression node.  So if the heavily-hit table has any domain-type
: columns, it's possible that a steady stream of inserts or updates
: could have kept that index tied up.
: 
: It might be worth introducing a system cache that could be used to
: extract the constraints for a domain without going to the catalog for
: every single command.  There's been very little work done to date on
: optimizing operations on domains :-(

The lack of typcache invalidation is something that will eventually
bite us in other ways, so we need to add the facility anyway.  We
don't really need a new sinval message, as an inval on the pg_type
row will serve perfectly well --- what we need is something comparable
to CacheInvalidateRelcache() to cause such a message to be sent when
we haven't actually changed the pg_type row itself.

Do you want to work on this?  I can if you don't.

BTW, in connection with the lookup_rowtype_tupdesc fiasco, it's pretty
obvious that any data structure returned by this function will need
to be either copied or reference-counted.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] catalog corruption bug

2006-01-09 Thread Tom Lane
Jeremy Drake [EMAIL PROTECTED] writes:
 I ran without that function you made, and it got the error, but not a
 crash.  I stuck an Assert(false) right before the ereport for that
 particular error, and I did end up with a core there, but I don't see
 anything out of the ordinary (what little I know of the ordinary ;)

Yeah, that's just the CREATE TEMP TABLE doing what it's supposed to do.
The problem is presumably that a prior DROP operation failed to remove
the pg_type row associated with a previous temp table of the same name
... but why that would happen is still really unclear.

Does your application drop these temp tables explicitly, or leave them
to be dropped automatically during commit?  It might be interesting to
see whether changing that makes any difference.  I'm also curious
whether the transaction that makes the temp table is ever rolled back
instead of committed.

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] PLs and domain constraints

2006-01-09 Thread Thomas Hallgren

Tom Lane wrote:
 Neil Conway [EMAIL PROTECTED] writes:
  For #2, I'm not sure where the right place to check domain constraints
  is. I was thinking about adding the check to the fmgr function call
  logic[1], but the domain checking code needs an ExprContext in which to
  evaluate the constraint, which wouldn't easily be accessible.

 I'd go with making the PLs do it.  fmgr is a very low logical level and
 it's inappropriate for it to even know what a domain is.  As an example
 of the problems you will run into: how is fmgr going to find out whether
 the target type is a domain, much less what its constraints are?  It
 can't assume that it's running inside a transaction, or even that the
 system catalog access machinery is alive yet.

Should I consider this as something to add to the PL/Java TODO list? Or
is there more to be discussed?


Regards,
Thomas Hallgren


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PLs and domain constraints

2006-01-09 Thread Tom Lane
Thomas Hallgren [EMAIL PROTECTED] writes:
 Neil Conway [EMAIL PROTECTED] writes:
 For #2, I'm not sure where the right place to check domain constraints 
 is.

 Should I consider this as something to add to the PL/Java TODO list?

Yup, probably.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] PLs and domain constraints

2006-01-09 Thread Neil Conway
On Mon, 2006-01-09 at 20:23 +0100, Thomas Hallgren wrote:
 Should I consider this as something to add to the PL/Java TODO list?

Probably, yes (if/when I fix the in-tree PLs I was planning to take a
look at all the externally-maintained ones, although you're welcome to
do it instead).

Before domain constraints can be efficiently checked in PLs, the backend
needs some additional infrastructure (see recent discussion of typcache
changes on pgsql-patches, subject: plpgsql: check domain constraints).
When that's finished, checking constraints in a PL should be fairly
easy.

-Neil



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] catalog corruption bug

2006-01-09 Thread Jeremy Drake
On Mon, 9 Jan 2006, Tom Lane wrote:

 Does your application drop these temp tables explicitly, or leave them
 to be dropped automatically during commit?  It might be interesting to
 see whether changing that makes any difference.

I drop them explicitly at the end of the function.

 I'm also curious
 whether the transaction that makes the temp table is ever rolled back
 instead of committed.

Not intentionally/explicitly.  The only time it should roll back is if it
gets an error (which tends to be this error).  I do sometimes hit ^C on
the perl scripts to tweak something, which would roll it back if in this
particular code, but I don't think I did that on the last run at least.


   regards, tom lane

 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


-- 
Every little picofarad has a nanohenry all its own.
-- Don Vonada

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] lookup_rowtype_tupdesc considered harmful

2006-01-09 Thread Neil Conway
On Mon, 2006-01-09 at 12:57 -0500, Tom Lane wrote:
 I have not been able to think of an efficient way to make it work while
 still handing back a simple TupleDesc pointer --- seems like we'd have
 to contort the API somehow so that the release function can find the
 reference count.  Any thoughts about that?

Perhaps the release function can just take a type OID. We could then
use that to lookup the OID's typcache entry, which would be a convenient
place to store the reference count (especially if we do generalized
typcache invalidation, per discussion on -patches). That means two hash
table lookups for each lookup/release pair, which isn't ideal but
doesn't seem too bad.

 I've got plenty of stuff to do, will be happy to let you take up this
 problem, if you have time to do something with it soon.

I'll take a look.

-Neil



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] lookup_rowtype_tupdesc considered harmful

2006-01-09 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 On Mon, 2006-01-09 at 12:57 -0500, Tom Lane wrote:
 I have not been able to think of an efficient way to make it work while
 still handing back a simple TupleDesc pointer --- seems like we'd have
 to contort the API somehow so that the release function can find the
 reference count.  Any thoughts about that?

 Perhaps the release function can just take a type OID. We could then
 use that to lookup the OID's typcache entry, which would be a convenient
 place to store the reference count (especially if we do generalized
 typcache invalidation, per discussion on -patches). That means two hash
 table lookups for each lookup/release pair, which isn't ideal but
 doesn't seem too bad.

Nah, I don't think this works.  The problem is that after an inval,
you may have to provide an updated TupleDesc to new callers while
old callers still have open reference counts to the old TupleDesc.
The type OID isn't enough info to let you determine which generation
of TupleDesc is being released.

However, you might be able to adopt the same trick used in catcache.c:
the callers think they have pointers to HeapTuples and are unaware that
that is just a field of a larger struct.  Add a reference count and a
dead flag, and a magic value for safety checking, and you've got it.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Fw: Returned mail: see transcript for details

2006-01-09 Thread Jim Buttafuoco
Tom,

My email to you was blocked.

Jim

-- Forwarded Message ---
From: Mail Delivery Subsystem [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sun, 8 Jan 2006 20:16:27 -0500
Subject: Returned mail: see transcript for details

- The following addresses had permanent fatal errors -
[EMAIL PROTECTED]
(reason: 550 5.7.1 Probable spam from 216.204.66.227 refused - see 
http://www.five-ten-sg.com/blackhole.php?216.204.66.227)

   - Transcript of session follows -
... while talking to sss.pgh.pa.us.:
 MAIL From:[EMAIL PROTECTED]
 550 5.7.1 Probable spam from 216.204.66.227 refused - see 
http://www.five-ten-sg.com/blackhole.php?216.204.66.227
554 5.0.0 Service unavailable

Return-Path: [EMAIL PROTECTED]
Received: from amanda.contactbda.com (amanda.contactbda.com [192.168.1.2])
by amanda.contactbda.com (8.12.11/8.12.11/Debian-3) with ESMTP id 
k091GQxs027867
for [EMAIL PROTECTED]; Sun, 8 Jan 2006 20:16:26 -0500
From: Jim Buttafuoco [EMAIL PROTECTED]
To: Tom Lane [EMAIL PROTECTED]
Reply-To: [EMAIL PROTECTED]
Subject: Re: [HACKERS] Fw: Is anyone interested in getting PostgreSQL working 
on mips[el]? 
Date: Sun, 8 Jan 2006 20:16:26 -0500
Message-Id: [EMAIL PROTECTED]
In-Reply-To: [EMAIL PROTECTED]
References: [EMAIL PROTECTED] [EMAIL PROTECTED] 
[EMAIL PROTECTED]
X-Mailer: Open WebMail 2.41 20040926
X-OriginatingIP: 192.168.1.1 (jim)
MIME-Version: 1.0
Content-Type: text/plain;
charset=iso-8859-1

Tom,

I will setup the ssh and forward you the info.

Thanks
Jim

-- Original Message ---
From: Tom Lane [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sun, 08 Jan 2006 16:41:38 -0500
Subject: Re: [HACKERS] Fw: Is anyone interested in getting PostgreSQL working 
on mips[el]?

 Jim Buttafuoco [EMAIL PROTECTED] writes:
  I can confirm that HEAD does not initdb because of a SIGBUS as reported 
  below by Martin Pitt @ debian (see his email below).  My build farm 
  member (corgi) did pass all checks 6 days ago (I was having some issues 
  with the build farm code before that).  If anyone would like to SSH into 
  the box, please contact me via email and I will get an account setup.  
  Right now, I am trying to build 8.1 to see if it passes.
 
 I can take a look.
 
   regards, tom lane
--- End of Original Message ---
--- End of Forwarded Message ---


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Andrew Dunstan
On Mon, 2006-01-09 at 12:06 -0500, Tom Lane wrote:
 Andrew Dunstan [EMAIL PROTECTED] writes:
  I don't know. Reading that code just makes my head spin ...
 
 Yeah, too many ifdefs :-(.  But I suppose that the initial
 #ifdef LOCALE_ENVIRON_REQUIRED block is not compiled on sane
 platforms, meaning that the first code in the routine is the
 unconditional
  if (!setlocale(LC_ALL, ""))
    setlocale_failure = TRUE;
 


*doh!* I had misread that. Now I see.

On Windows that pretty much gives the game away.



 
  I'm just about out of ideas and right out of time to spend on this.

 We could just file a Perl bug report and wait for them to fix it.
 


What's the data risk?

cheers

andrew


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 On Mon, 2006-01-09 at 12:06 -0500, Tom Lane wrote:
 We could just file a Perl bug report and wait for them to fix it.

 What's the data risk?

Given that it took us this long to identify the problem, I'm guessing
that it doesn't affect too many people.  For starters you'd have to
run the postmaster under a locale environment different from what
initdb saw (or was told).

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] lookup_rowtype_tupdesc considered harmful

2006-01-09 Thread Neil Conway
On Mon, 2006-01-09 at 14:51 -0500, Tom Lane wrote:
 Nah, I don't think this works.  The problem is that after an inval,
 you may have to provide an updated TupleDesc to new callers while
 old callers still have open reference counts to the old TupleDesc.

Good point.

 However, you might be able to adopt the same trick used in catcache.c:
 the callers think they have pointers to HeapTuples and are unaware that
 that is just a field of a larger struct.  Add a reference count and a
 dead flag, and a magic value for safety checking, and you've got it.

Hmm, okay. There's the additional complication that we need to handle
record types (see RecordCacheArray in typcache.c). Since I don't think
we need reference counting for those, I'm envisioning something like:

TupleDesc lookup_rowtype_tupdesc(Oid type_id, int32 typmod);
void release_rowtype_tupdesc(TupleDesc tdesc); /* better name? */

TypeCacheEntry *lookup_type_cache(Oid type_id, int flags);
void release_type_cache(TypeCacheEntry *tentry);

where lookup_rowtype_tupdesc() returns a pointer to this struct:

typedef struct
{
struct tupleDesc tdesc; /* must be first field */

TypeCacheEntry *tentry; /* pointer to owning TypeCacheEntry,
   or NULL if this is a record type */
} MagicTupleDesc;

and where TypeCacheEntry has been modified to contain a reference count
and an "is dead?" flag. Is there actually a need for the (ugly) magic
value hackery used by catcache?

-Neil



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] lookup_rowtype_tupdesc considered harmful

2006-01-09 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Hmm, okay. There's the additional complication that we need to handle
 record types (see RecordCacheArray in typcache.c). Since I don't think
 we need reference counting for those,

Yeah, you do.  See record_out for instance, and reflect on the fact that
it can have no idea what the called I/O functions are liable to do.

 where lookup_rowtype_tupdesc() returns a pointer to this struct:

 typedef struct
 {
 struct tupleDesc tdesc; /* must be first field */

 TypeCacheEntry *tentry; /* pointer to owning TypeCacheEntry,
or NULL if this is a record type */
 } MagicTupleDesc;

 and where TypeCacheEntry has been modified to contain a reference count
 and an "is dead?" flag.

No, the refcount and isdead flags should be in *this* struct, and
there's really no need for a back-link to TypeCacheEntry.  Think harder
about the situation where there are both old and new tupdescs
outstanding for a single typcache entry.

 Is there actually a need for the (ugly) magic
 value hackery used by catcache?

It's a safety check to make sure that what got passed to
release_rowtype_tupdesc is really one of these animals and not just any
old TupleDesc.  It's not *necessary*, granted, but given that the compiler
is not going to help people avoid such errors, I think it's prudent.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] cleaning up plperl warnings

2006-01-09 Thread Bruce Momjian
Andrew Dunstan wrote:
 Now, in src/include/port/win32.h we have this:
 
 /*
  * Supplement to sys/types.h.
  *
  * Perl already has conflicting defines for uid_t and gid_t.
  */
 #ifndef PLPERL_HAVE_UID_GID
 typedef int uid_t;
 typedef int gid_t;
 #else
 /* These are redefined by perl. */
 #define uid_t int
 #define gid_t int
 #endif
 
 
 But in my perl installation, at least, these are not redefined at all, 
 but typedef'd:
 
 typedef long uid_t;
 typedef long gid_t;
 
 Which causes a couple of pretty obscure warnings:
 
 C:/Perl/lib/CORE/win32.h:219: warning: useless keyword or type name in empty 
 declaration
 C:/Perl/lib/CORE/win32.h:219: warning: empty declaration
 C:/Perl/lib/CORE/win32.h:220: warning: useless keyword or type name in empty 
 declaration
 C:/Perl/lib/CORE/win32.h:220: warning: empty declaration
 
 
 Why do we need these defines at all? We don't use either of these types 
 anywhere in the plperl code.
 
 Is the community perl different in what it does here from what the 
 ActiveState perl?

No idea, but if you don't need them, remove them so they work on your
setup, and if it breaks other people's perl, we will hear about it
before we package 8.2 and make adjustments.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Andrew Dunstan



Tom Lane wrote:
  I'm just about out of ideas and right out of time to spend on this.

 We could just file a Perl bug report and wait for them to fix it.

done

cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Bruce Momjian

Is there a TODO here, even if the Perl folks are supposed to fix it?

---

Andrew Dunstan wrote:
 Tom Lane wrote:
   I'm just about out of ideas and right out of time to spend on this.
 
  We could just file a Perl bug report and wait for them to fix it.
 
 done
 
 cheers
 
 andrew
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Tom Lane
Bruce Momjian pgman@candle.pha.pa.us writes:
 Is there a TODO here, even if the Perl folks are supposed to fix it?

When and if they fix it, it'd be useful for us to document the gotcha
someplace (not sure where, though).  Maybe we should even go so far as
to refuse to work with older libperls on Windows.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Andrew Dunstan


It has probably been sufficiently mitigated on *nix. On Windows, the 
choice seems to be between living with the risk and trying my "put the 
locales back where they were" patch, which as Tom and Greg point out 
might have other consequences. Take your pick.


cheers

andrew

Bruce Momjian wrote:
 Is there a TODO here, even if the Perl folks are supposed to fix it?

 ---

 Andrew Dunstan wrote:
  Tom Lane wrote:
    I'm just about out of ideas and right out of time to spend on this.

   We could just file a Perl bug report and wait for them to fix it.

  done

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Bruce Momjian

I can put it in the Win32 section of the TODO list.  If we have
something not working on Win32, I would like to document it.

Is it:
"plperl changes the locale in Win32"?

---

Andrew Dunstan wrote:
 
 It has probably been sufficiently mitigated on *nix. On Windows, the 
 choice seems to be between living with the risk and trying my "put the 
 locales back where they were" patch, which as Tom and Greg point out 
 might have other consequences. Take your pick.
 
 cheers
 
 andrew
 
 Bruce Momjian wrote:
  Is there a TODO here, even if the Perl folks are supposed to fix it?
 
  ---
 
  Andrew Dunstan wrote:
   Tom Lane wrote:
     I'm just about out of ideas and right out of time to spend on this.
 
    We could just file a Perl bug report and wait for them to fix it.
 
   done
 

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Andrew Dunstan



Bruce Momjian wrote:
 I can put it in the Win32 section of the TODO list.  If we have
 something not working on Win32, I would like to document it.

 Is it:
 "plperl changes the locale in Win32"?

As long as the locale is consistent I think we're OK (is that right, Tom?)

Would that mean not using any of initdb's locale settings?

cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-09 Thread Greg Stark

Greg Stark [EMAIL PROTECTED] writes:

   These numbers don't make much sense to me. It seems like 5% is about as
   slow as reading the whole file which is even worse than I expected. I
   thought I was being a bit pessimistic to think reading 5% would be as
   slow as reading 20% of the table.
 
 I have a theory. My test program, like Postgres, is reading in 8k chunks.
 Perhaps that's fooling Linux into thinking it's a sequential read and reading
 in 32k chunks internally. That would effectively make a 25% scan a full table
 scan. And a 5% scan would be a 20% scan which is about where I would have
 expected the breakeven point to be.

Well my theory was sort of half right. It has nothing to do with fooling Linux
into thinking it's a sequential read. Apparently this filesystem was created
with 32k blocks. I don't remember if that was intentional or if ext2/3 did it
automatically based on the size of the filesystem.

So it doesn't have wide-ranging implications for Postgres's default 8k block
size. But it is a good lesson about the importance of not using a larger
filesystem block than Postgres's block size. The net effect is that if the
filesystem block is N*8k then your random_page_cost goes up by a factor of N.
That could be devastating for OLTP performance.

-- 
greg


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Is anyone interested in getting PostgreSQL working on mips[el]?

2006-01-09 Thread Jim Buttafuoco
Martin,

I have installed the Sarge binutils on my testing/Etch system and all of the 
Postgresql regression tests pass.  I don't know where to go from here, any 
suggestions?

Jim


-- Original Message ---
From: Martin Pitt [EMAIL PROTECTED]
To: Jim Buttafuoco [EMAIL PROTECTED]
Sent: Mon, 9 Jan 2006 16:10:18 +0100
Subject: Re: Is anyone interested in getting PostgreSQL working on mips[el]?

 Hi Jim!
 
 Jim Buttafuoco [2006-01-09  8:44 -0500]:
  Martin,
  
  While my build farm mipsel system (corgi) get the sigbus as reported
  below, lionfish does not.  see
  http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=lionfish&dt=2006-01-08%2016:51:36
 
 This looks indeed fine. What's the difference between these two?
 
  that I have.  So it must be in the software.  What do you think is
  the best way to tell?  I believe my raq is up to date as far as
  debian packages for sarge.
 
 I can't test this on sarge/mips myself. I have a sid dchroot on a box
 from Florian Lohoff, that's all.
 
 Indeed it could very well be that Debian's sarge packages worked fine
 on mips and that this whole story turns out to be a gcc or libc bug in
 sid, I don't know. Maybe you could do the test with identical
 compiler/postgresql versions and identical postgresql
 patches/configure options once on sarge and once on sid?
 
 Thanks a lot for looking into this issue!
 
 Martin
 
 -- 
 Martin Pitthttp://www.piware.de
 Ubuntu Developer   http://www.ubuntu.com
 Debian Developer   http://www.debian.org
 
 In a world without walls and fences, who needs Windows and Gates?
--- End of Original Message ---


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] cleaning up plperl warnings

2006-01-09 Thread Andrew Dunstan



Bruce Momjian wrote:
  Why do we need these defines at all? We don't use either of these types 
  anywhere in the plperl code.

  Is the community perl different in what it does here from what the 
  ActiveState perl?

 No idea, but if you don't need them, remove them so they work on your
 setup, and if it breaks other people's perl, we will hear about it
 before we package 8.2 and make adjustments.

done

cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: plperl vs LC_COLLATE (was Re: [HACKERS] Possible savepoint bug)

2006-01-09 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 As long as the locale is consistent I think we're OK (is that right, Tom?)

Right.

 Would that mean not using any of initdb's locale settings?

Yeah, you'd want to not use the --locale switch for initdb, and also not
to change the system-wide locale settings afterwards.

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Improving N-Distinct estimation by ANALYZE

2006-01-09 Thread Greg Stark

Greg Stark [EMAIL PROTECTED] writes:

 Well my theory was sort of half right. It has nothing to do with fooling Linux
 into thinking it's a sequential read. Apparently this filesystem was created
 with 32k blocks. I don't remember if that was intentional or if ext2/3 did it
 automatically based on the size of the filesystem.
 
 So it doesn't have wide-ranging implications for Postgres's default 8k block
 size. But it is a good lesson about the importance of not using a larger
 filesystem block than Postgres's block size. The net effect is that if the
 filesystem block is N*8k then your random_page_cost goes up by a factor of N.
 That could be devastating for OLTP performance.

Hm, apparently I spoke too soon. tune2fs says the block size is in fact 4k.
Yet the performance of the block reading test program with 4k or 8k blocks
behaves as if Linux is reading 32k blocks. And in fact when I run it with 32k
blocks I get the kind of behaviour we were expecting, where the breakeven
point is around 20%.

So it's not the 8k block reading that's fooling Linux into reading ahead 32k.
It seems 32k readahead is the default for Linux, or perhaps it's the
sequential access pattern that's triggering it.

I'm trying to test it with posix_fadvise() set to random access but I'm having
trouble compiling. Do I need a special #define to get posix_fadvise from
glibc?

-- 
greg


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match