Re: [HACKERS] make output

2003-11-05 Thread Peter Eisentraut
Andrew Overholt writes:

 [EMAIL PROTECTED] postgresql]$ time make all  make.nonsmp
 mkdir man7
 some minor bison and ant warnings

 Should the man directory creation be echoed like that?

Yes, that appears to be normal.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Open Sourcing pgManage

2003-11-05 Thread Dave Page
 

 -Original Message-
 From: Andrew Dunstan [mailto:[EMAIL PROTECTED] 
 Sent: 05 November 2003 00:53
 To: PostgreSQL-development
 Subject: Re: [HACKERS] Open Sourcing pgManage
 
 
 
 Andreas Pflug wrote:
  pgAdmin is designed for a good interactive experience, which isn't 
  achievable using web technologies. SSL connection is 
 supported, so on 
  not-too-slow lines remote usage should be possible without security 
  issues, or over a VPN (I'm working like this).
  For web access, phpPGadmin should be usable; haven't tried so far.
 
 
 I don't think any of this contradicts what Marc said.

Maybe not, but you can easily run pgAdmin remotely over X if you have a
need to run it locally on a remote server. I've done so a number of
times and found it quite usable on my cheapo DSL line at home.

 And, as Joshua pointed out it could with some work be made to 
 run as an applet, which would be very cool for, say, an ISP 
 to provide (nothing at all required for the user to install).

Yeah, I agree X is not a solution to that. phpPgAdmin is though...

 Don't get me wrong - pgadmin is cool - I especially recommend 
 it to my Windows oriented clients and colleagues who hate 
 using command lines.

Why not your Linux or FreeBSD oriented colleagues? It runs just as well
on those platforms.

Regards, Dave.

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] UPPER()/LOWER() and UTF-8

2003-11-05 Thread Karel Zak
On Tue, Nov 04, 2003 at 04:52:33PM -0500, Tom Lane wrote:
 Alexey Mahotkin [EMAIL PROTECTED] writes:
  I'm running Postgresql 7.3.4 with ru_RU.UTF-8 locale (with UNICODE
  database encoding), and all is almost well, except that UPPER() and
  LOWER() seem to ignore locale.
 
 upper/lower aren't going to work desirably in any multi-byte character
 set encoding.  I think Peter E. is looking into what it would take to

 It's a PostgreSQL and no  UTF problem, because standard PostgreSQL text
 functions doesn't know something about  arguments encoding and for this
 functions cannot use another (an  example UTF's lower/upper) method for
 a work with strings.

 Maybe a little  extend internal text datatype and  like VARSIZE() use
 VARENCODING(). Maybe Peter already has some better idea.

 fix this for 7.5, but at present you are going to need to use a
 single-byte encoding within the server.  (Nothing to stop you from using
 UTF-8 on the client side though.)

 You  can use  mutibyte on  server side  too, but  you must  to use  for
 example convert() function for upper/lower arguments.

Karel

-- 
 Karel Zak  [EMAIL PROTECTED]
 http://home.zf.jcu.cz/~zakkr/

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


[HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Jason Godden
Hi all,

This is my first patch for PostgreSQL against the 7.5devel cvs (please advise if this 
is the wrong place to post patches).  This patch simply enables the \xDD (or \XDD) 
hexadecimal import in the copy command (im starting with the simple stuff first).  I 
did notice that there may be a need to issue an error if an invalid octal or hex 
character is found following a \ or \x.  No errors are currently flagged by the octal 
(or this hex) import.

Rgds,

Jason

cvs server: Diffing .
Index: copy.c
===
RCS file: /projects/cvsroot/pgsql-server/src/backend/commands/copy.c,v
retrieving revision 1.213
diff -u -r1.213 copy.c
--- copy.c  6 Oct 2003 02:38:53 -   1.213
+++ copy.c  5 Nov 2003 11:19:33 -
@@ -48,9 +48,10 @@
 #include utils/lsyscache.h
 #include utils/syscache.h

-
 #define ISOCTAL(c) (((c) = '0')  ((c) = '7'))
 #define OCTVALUE(c) ((c) - '0')
+#define ISHEX(c) c)='0')  ((c)='9')) || (((c)='A')  ((c)='F')) || 
(((c)='a')  ((c)='f')))
+#define HEXVALUE(c) (((c)='a') ? ((c)-87) : (((c)='A') ? ((c)-55) : ((c)-'0')))

 /*
  * Represents the different source/dest cases we need to worry about at
@@ -1947,6 +1948,33 @@
c = line_buf.data[line_buf.cursor++];
switch (c)
{
+   case 'x':
+   case 'X':
+   {
+   if (line_buf.cursor  line_buf.len)
+   {
+   int hexval = 0;
+
+   c = 
line_buf.data[line_buf.cursor];
+   if (ISHEX(c))
+   {
+   line_buf.cursor++;
+   hexval = HEXVALUE(c);
+   if (line_buf.cursor  
line_buf.len)
+   {
+   c = 
line_buf.data[line_buf.cursor];
+   
line_buf.cursor++;
+   if (ISHEX(c))
+   {
+   
line_buf.cursor++;
+   hexval 
= (hexval  4) + HEXVALUE(c);
+   }
+   }
+   }
+   c = hexval;
+   }
+   }
+   break;
case '0':
case '1':
case '2':


---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Peter Eisentraut
Jason Godden writes:

 This is my first patch for PostgreSQL against the 7.5devel cvs (please advise if 
 this is the wrong place to post patches).  This patch simply enables the \xDD (or 
 \XDD) hexadecimal import in the copy command (im starting with the simple stuff 
 first).  I did notice that there may be a need to issue an error if an invalid octal 
 or hex character is found following a \ or \x.  No errors are currently flagged by 
 the octal (or this hex) import.

I think this belongs into the string literal parser (at least in addition
to COPY).

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Open Sourcing pgManage

2003-11-05 Thread Andrew Dunstan


Dave Page wrote:

-Original Message-
From: Andrew Dunstan [mailto:[EMAIL PROTECTED] 
   


Don't get me wrong - pgadmin is cool - I especially recommend 
it to my Windows oriented clients and colleagues who hate 
using command lines.
   

Why not your Linux or FreeBSD oriented colleagues? It runs just as well
on those platforms.
 

especially != only

:-)

I have it installed on the RH9 machine I use for development, and 
happily show it to people there.

most *nix people I come into contact with are old fossils like me who 
prefer scripts and command lines for doing things. I don't use an IDE 
(unless you count emacs as an IDE) for development, and I rarely use 
control-panel-like apps.

BTW, pgadmin could improve its Linux coverage somewhat by a) providing 
RPMs for versions of RedHat before 9, or at least providing SRPMs that 
can be built on such platforms, and b) getting pgadmin included in the 
fedora package set.

Anyway, CommandPrompt have apparently done something cool and they are 
donating it and we should all be happy, no? :-)

cheers

andrew

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Open Sourcing pgManage

2003-11-05 Thread Dave Page
 

 -Original Message-
 From: Andrew Dunstan [mailto:[EMAIL PROTECTED] 
 Sent: 05 November 2003 12:48
 To: PostgreSQL-development
 Subject: Re: [HACKERS] Open Sourcing pgManage
 
 
 
 especially != only
 

Very true :-)


 BTW, pgadmin could improve its Linux coverage somewhat by a) 
 providing 
 RPMs for versions of RedHat before 9, 

Unfortunately we are somewhat limited to the boxes that our developers
have available, however if anyone can help out  with additional
ports/distributions we would welcome them and help out in any way we
can.

 or at least providing 
 SRPMs that 
 can be built on such platforms, 

I believe they should do know - Jean-Michel, were you looking at this?

 and b) getting pgadmin 
 included in the 
 fedora package set.

I only heard about Fedora about 20 minutes ago! Jean-Michel, do you have
any contacts that might be able to help with this?

 Anyway, CommandPrompt have apparently done something cool and 
 they are 
 donating it and we should all be happy, no? :-)

With my PostgreSQL hat on, yes, it's a good thing. With my pgAdmin hat,
no, it's a bad thing!

Regards, Dave.

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Christopher Kings-Lynne
This is my first patch for PostgreSQL against the 7.5devel cvs (please advise if this is the wrong place to post patches).  This patch simply enables the \xDD (or \XDD) hexadecimal import in the copy command (im starting with the simple stuff first).  I did notice that there may be a need to issue an error if an invalid octal or hex character is found following a \ or \x.  No errors are currently flagged by the octal (or this hex) import.


I think this belongs into the string literal parser (at least in addition
to COPY).
That's what always happens when I start working on something - someone 
points out something that makes it 100 times harder :P

Chris



---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Tom Lane
Jason Godden [EMAIL PROTECTED] writes:
 This is my first patch for PostgreSQL against the 7.5devel cvs (please
 advise if this is the wrong place to post patches).

pgsql-patches in future, please.

 +#define HEXVALUE(c) (((c)='a') ? ((c)-87) : (((c)='A') ? ((c)-55) : ((c)-'0')))

This seems excessively dependent on the assumption that the character
set is ASCII.  Why have you hard-coded numeric equivalents into this
macro?

BTW, the patch is incomplete because it is lacking documentation.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Experimental patch for inter-page delay in VACUUM

2003-11-05 Thread Zeugswetter Andreas SB SD

  The only idea I have come up with is to move all buffer write operations
  into a background writer process, which could easily keep track of
  every file it's written into since the last checkpoint.  
 
 I fear this approach. It seems to limit a lot of design flexibility later. But
 I can't come up with any concrete way it limits things so perhaps that
 instinct is just fud.

A lot of modern disk subsystems can only be saturated with more then one parallel 
IO request. So it would at least need a tuneable number of parallel writer processes,
or one writer that uses AIO to dump all outstanding IO requests out at once.
(Optimal would be all, in reality it would need to be batched into groups of 
n pages, since most systems have a max aio request queue size, e.g. 8192).

Andreas

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Experimental ARC implementation

2003-11-05 Thread Zeugswetter Andreas SB SD
 My plan is to create another background process very similar to 
 the checkpointer and to let that run forever basically looping over that 
 BufferSync() with a bool telling that it's the bg_writer.

Why not use the checkpointer itself inbetween checkpoints ?
use a min and a max dirty setting like Informix. Start writing
when more than max are dirty stop when at min. This avoids writing
single pages (which is slow, since it cannot be grouped together
by the OS).

Andreas

---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Hacking PostgreSQL to work in Mac OS X 10.3 (Panther

2003-11-05 Thread scott.marlowe
Is this a bug we should fix for 7.3.5 when it eventually comes out?

On Tue, 4 Nov 2003, Andrew Rawnsley wrote:

 
 Just build RC1 today on Panther, no problems.
 
 
 On Nov 4, 2003, at 5:06 PM, Jeff Hoffmann wrote:
 
  Tom Lane wrote:
  [EMAIL PROTECTED] writes:
  After spending a few hours of trying to get Postgresql7.3.4 to build
  from source (tar.gz) on a Panther (release, not beta) system,
  Try 7.4RC1 instead.  Apple made some incompatible changes in their
  compiler in Panther.
 
  I was going to recommend the same thing.  I compiled a 7.4 beta out of 
  the box without a hitch, so I'd assume the RC would be fine as well.
 
  -- 
 
  Jeff Hoffmann
  PropertyKey.com
 
 
  ---(end of 
  broadcast)---
  TIP 8: explain analyze is your friend
 
 
 
 Andrew Rawnsley
 President
 The Ravensfield Digital Resource Group, Ltd.
 (740) 587-0114
 www.ravensfield.com
 
 
 ---(end of broadcast)---
 TIP 5: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faqs/FAQ.html
 
 


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] 7.4RC1 tag'd, branched and bundled ...

2003-11-05 Thread scott.marlowe
On Tue, 4 Nov 2003, Gaetano Mendola wrote:

 I agree in general with you for these general arguments, but here we 
 are talking about to introduce a sleep ( removable by guc ) or not! What 
 about the hash refactoring introduced with 7.4? Are you going to
 discourage people to use the hash?

That's not fair.  Everyone's had the chance to test 7.4 betas 1 through 4 
on their own hardware to see if it is safe and stable.  If I've load 
tested the betas and found them reliable, and suddenly find that 7.4 rc1 
or release were to be less reliable (I'm not talking about THIS patch in 
particular, I'm talking about ANY patch...) than the betas I would be 
upset that the change happened at the end instead of the beginning of the 
release cycle.

I think this patch is a great idea, but it's probably got a lot of 
refinement coming down the line.  I'd be glad to test it on the 7.4 branch 
as a patch, but it's just too late to put performance enhancements into 
the main branch.  Now is bug fixing time, not performance tweaking time.  
That was pre-beta1 really.

To review: The hasgagg has had testing, and lots of it, by the people who 
need it.  This patch has had little testing.  While I'm sure it's a fairly 
safe change, and likely to be a good thing overall, it's just too late in 
the cycle for it to get in.


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Open Sourcing pgManage

2003-11-05 Thread Christopher Browne
[EMAIL PROTECTED] (Greg Stark) writes:
 Andrew Dunstan [EMAIL PROTECTED] writes:

 I think there is room for lots of GUIs, though, and having a Java
 admin GUI would be cool too, as would having a servlet/JSP based
 admin client deployable as a web archive.

 If someone's looking for an interesting GUI project, Applix had a
 database frontend that was geared more for data rather than DDL. It
 presented a spreadsheet-like interface for arbitrary sql queries and
 handled dealing with arbitrary sized result sets and allowing
 editing of fields using primary keys etc.

 It was actually part of their open source release. I looked at
 trying to pull it out of their build system and package it up
 independently a while back. It was a bit of a pain. But I did manage
 to get it compiled and up and running against Oracle at the
 time. The main pain was getting the ODBC drivers set up.

 Getting that working smoothly with postgres and actively developed
 could make for a really nice DML tool.

Was that a 'native' part of SHELF?  Or more related to their TM1
product?

FYI, while Applix and VistaSource have orphaned it, source code for
SHELF is still available at SourceForge.

ftp://ftp.sourceforge.net/pub/sourceforge/shelf/

It was written for GTK 1.2; we're up to much newer stuff, and it's not
self-evident that it will play with newer versions.  (Old versions are
presumably still available and quasi-usable...)
-- 
output = reverse(moc.enworbbc @ enworbbc)
http://cbbrowne.com/info/sap.html
For be a man's intellectual superiority what it will, it can never
assume the practical, available supremacy over other men, without the
aid of some sort of external arts and entrenchments, always, in
themselves, more or less paltry and base.  This it is, that forever
keeps God's true princes of the Empire from the world's hustings; and
leaves the highest honors that this air can give, to those men who
become famous more through their infinite inferiority to the choice
hidden handful of the Divine Inert, than through their undoubted
superiority over the dead level of the mass. --Moby Dick, Ch 33

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] BTree information

2003-11-05 Thread Anand, VJ (MED, GEMS-IT)

I am trying to find information regarding creation of B-tree index in
postgres for variable length
character data (Char/varchar type). Specifically, what pagination policy
is used, does it use prefix,
or any other form of compression (encoding)? 

 Regards,
 VJ Anand


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Experimental ARC implementation

2003-11-05 Thread Jan Wieck
Zeugswetter Andreas SB SD wrote:

My plan is to create another background process very similar to 
the checkpointer and to let that run forever basically looping over that 
BufferSync() with a bool telling that it's the bg_writer.
Why not use the checkpointer itself inbetween checkpoints ?
use a min and a max dirty setting like Informix. Start writing
when more than max are dirty stop when at min. This avoids writing
single pages (which is slow, since it cannot be grouped together
by the OS).
Current approach is similar ... if I strech the IO and syncing over the 
entire 150-300 second checkpoint interval, grouping in 50 pages then 
sync()+nap, the system purr's pretty nice and without any peaks.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Open Sourcing pgManage

2003-11-05 Thread Greg Stark

Christopher Browne [EMAIL PROTECTED] writes:

 Was that a 'native' part of SHELF?  Or more related to their TM1
 product?

The full source was included in SHELF (if that was the source release I'm
thinking of.) I think it was called axdata.

 FYI, while Applix and VistaSource have orphaned it, source code for
 SHELF is still available at SourceForge.
 
 ftp://ftp.sourceforge.net/pub/sourceforge/shelf/

-- 
greg


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] Erroneous PPC spinlock code

2003-11-05 Thread Peter Eisentraut
The SuSE PPC guru said that the PPC spinlock code we currently use may
behave erroneously on multiprocessor systems.  Attached is the proposed
patch, suggested for inclusion in 7.4.  Comments?

-- 
Peter Eisentraut   [EMAIL PROTECTED]
--- src/include/storage/s_lock.h
+++ src/include/storage/s_lock.h
@@ -207,24 +207,24 @@
slock_t _t;
int _res;
 
-   __asm__ __volatile__(
-  lwarx   %0,0,%2 \n
-  cmpwi   %0,0\n
-  bne 1f  \n
-  addi%0,%0,1 \n
-  stwcx.  %0,0,%2 \n
-  beq 2f  \n
-1:li  %1,1\n
-  b   3f  \n
-2:\n
-  isync   \n
-  li  %1,0\n
-3:\n
-
-:  =r (_t), =r (_res)
-:  r (lock)
-:  cc, memory
-   );
+__asm__ __volatile__(
+   lwarx   %0,0,%3 \n
+   cmpwi   %0,0\n
+   bne 2f  \n
+   addi%0,%0,1 \n
+   stwcx.  %0,0,%3 \n
+   bne 2f  \n
+   isync   \n
+1: li  %2,0\n
+   b   3f  \n
+2: \n
+   li  %2,1\n
+3: \n
+
+:   =r (_t), =r (lock), =r (_res)
+:   r (lock)
+:   cc, memory
+);
return _res;
 }
 
@@ -260,7 +260,7 @@
 #define S_UNLOCK(lock) \
 do \
 {\
-   __asm__ __volatile__ ( sync \n); \
+   __asm__ __volatile__ ( lwsync \n); \
*((volatile slock_t *) (lock)) = 0; \
 } while (0)
 

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


[HACKERS] Open Issues for 7.4

2003-11-05 Thread Peter Eisentraut
Here are some issues that might need to be addressed before 7.4 goes out:

* ECPG has some new include files such as

datetime.h
decimal.h

that come as part of the Informix compatibility mode.  I don't think these
should be installed directly in $includedir because of potential
conflicts.  We can't rename them either, without ruining the compatibility
aspect.  So I'm looking for suggestions about what to do with them.

* Fix uselessly executable files in the source tree.  See my recent post.
Any ideas on that?

* I'm planning to merge some of Jason Tishler's Cygwin README into our
FAQ_MSWIN, plus recently submitted changes for the same file.

* INSTALL, HISTORY and the regress/README need to be regenerated.  I'll do
1 and 3 soon, 2 has to wait until the release notes are edited to death.

* There is a bug in configure that prints an ugly error message on systems
without nsgmls and certain weird shells (NetBSD, UnixWare, for example),
because they interpret the construct ${NSGMLS-false} differently.

* I've generated new man pages that are on the ftp server under
pub/dev/doc/man-7.4.tar.gz.  They need to be copied into the release
tarball as doc/man.tar.gz.  This was a lot of manual work this time, so
any further reference page changes need to be integrated by hand.

* Did we really decide not to rename the check_function_bodies parameter
to something more general?  Maybe I've missed it.

* PPC spinlock patch from SuSE.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Erroneous PPC spinlock code

2003-11-05 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 The SuSE PPC guru said that the PPC spinlock code we currently use may
 behave erroneously on multiprocessor systems.

What's his evidence for that claim?  The code we have is based directly
on the recommendations in the PPC manuals, and has been tested on
multi-CPU systems.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] Performance features the 4th

2003-11-05 Thread Jan Wieck
I've just uploaded

http://developer.postgresql.org/~wieck/all_performance.v4.74.diff.gz

This patch contains the still not yet ready performance improvements 
discussed over the couple last days.

_Shared buffer replacement_:

The buffer replacement strategy is a slightly modified version of ARC. 
The modifications are some specializations about CDB promotions. Since 
PostgreSQL allways looks for buffers multiple times when updating (first 
during the scan, then during the heap_update() etc.), every updated 
block would jump right into the T2 (frequent accessed) queue. To prevent 
that the Xid when a buffer got added to the T1 queue is remembered and 
if a block is found in T1, the same transaction will not promote it into 
T2. This also affects blocks accessed like SELECT ... FOR UPDATE; UPDATE 
as this is a usual strategy and does not mean that this particular datum 
is accessed frequently.

Blocks faulted in by vacuum are handled special in that they end up at 
the LRU of the T1 queue and when evicted from there their CDB get's 
destroyed instead of added to the B1 queue to prevent vacuum from 
polluting the caches autotuning.

A guc variable

buffer_strategy_status_interval = 0 # 0-600 seconds

controls DEBUG1 messages every n seconds showing the current queue sizes 
and the cache hitrates during the last interval.

_Vacuum page delay_:

Tom Lane's napping during vacuums with another tuning option. I replaced 
the usleep() call with a PG_DELAY(msec) macro in miscadmin.h, which does 
use select(2) instead. That should address the possible portability 
problems.

The config options

vacuum_page_group_delay = 0  # 0-100 milliseconds
vacuum_page_group_size  = 10 # 1-1000 pages
control how many pages get vacuumed as a group and how long vacuum will 
nap between groups.

I think this can be improved more if vacuum get's feedback from the 
buffer manager if a page actually was found clean or already dirty in 
the cache or faulted in. This together with the fact if vacuum actually 
dirties the page or not would result in a sort of vacuum page cost 
that is accumulated and controls how often to nap. So that vacuuming a 
page found in the cache and that has no dead tuples is cheap, but 
vacuuming a page that caused another dirty block to get evicted, then 
read in and finally ends up dirty because of dead tuples is expensive.

_Lazy checkpoint_:

This is the checkpoint process with the ability to schedule the buffer 
flushing over some time. Also the buffers are written in an order told 
by the buffer replacement strategy. Currently that is a merged list of 
dirty buffers in the order of the T1 and T2 queues of ARC. Since buffers 
are replaced in that order, it causes backends to find clean buffers for 
eviction more often.

The config options

lazy_checkpoint_time = 0# 0-3600 seconds
lazy_checkpoint_group_size = 50 # 10-1000 pages
lazy_checkpoint_maxdelay = 500  # 100-1000 milliseconds
control how long the buffer flushing should take, how many dirty pages 
to write as a group before syncing and napping. The maxdelay is a 
parameter that causes really small amounts of changes not to spread out 
over that long.

The syncing is currently done in a new function in md.c, mdfsyncrecent() 
called through the smgr. The intention is to maintain some LRU of 
written to file descriptors and pg_fdatasync() them. I haven't found the 
right place for that yet, so it simply does a system global sync().

My idea here is that it really does not matter how accurate the single 
files are forced to disk during this, all we care for is to cause some 
physical writes performed by the kernel while we're writing them out, 
and not to buffer those writes in the OS until we finish the checkpoint.

The lazy checkpoint configuration should only affect automatic 
checkpoints started by postmaster because a checkpoint_timeout occured. 
Acutally it seems to apply this to manually started checkpoints as well. 
BufferSync() monitors the time to finish, held in shared memory, so it 
would be relatively easy to hurry up a running lazy checkpoint by 
setting that to zero. It's just that the postmaster can't do that 
because he does not have a PGPROC structure and therefore can't lock 
that shmem structure. This is a must fix item because to hurry up the 
checkpointer is very critical at shutdown time.

_TODO_:

* Replace the global sync() in mdfsyncrecent(int max) with calls to
  pg_fdatasync()
* Add functionality to postmaster to hurry up a running checkpoint
  at shutdown.
* Make sure that manual checkpoints are not affected by the lazy
  checkpoint config options and that they too hurry up a running one.
* Further improve vacuums napping strategy depending on actual caused
  IO per page.
_NOTE_:

The core team is well aware of the high demand for these features. As 
things stand however, it is impossible to get this functionality 
released in version 7.4.

That does not mean, that we have no chance to include 

[HACKERS] Very poor estimates from planner

2003-11-05 Thread Rod Taylor
Since this is a large query, attachments for the explains / query.

Configuration:
dev_iqdb=# select version();
version

 PostgreSQL 7.4beta1 on i386-portbld-freebsd4.7, compiled by GCC
2.95.4
(1 row)

SET default_statistics_target = 1000;
ANALYZE;
set from_collapse_limit = 100;
set join_collapse_limit = 20;

The query is a segment of a report but shows the issue well enough.

Effectively, the planner has amazingly inaccurate row estimates.  With
nestloop on, it estimates 1 row returned. With it off it estimates 12
rows returned.  The query actually returns several hundred thousand
entries.

ANALYZE output is in the nestloopoff file. With nestloopon I actually
run out of memory prior to query completion (1GB ram to the single
process).

Any hints? I'm basically stuck. Oh, and I would like to ask for a
pgadmin feature -- visual explain :)
QUERY PLAN
  
--
 Merge Right Join  (cost=62642.05..62651.25 rows=12 width=74) (actual time=682187.16..698880.37 rows=362126 loops=1)
   Merge Cond: (outer.service_id = inner.service_id)
   Filter: ((outer.account_id = inner.account_id) OR (outer.account_id IS NULL))
   -  Index Scan using exposure_service_id_key on exposure tse  (cost=0.00..8.27 rows=300 width=8) (actual time=0.47..18.46
rows=300 loops=1)
   -  Sort  (cost=62642.05..62642.08 rows=12 width=74) (actual time=676920.81..683084.38 rows=813179 loops=1)
 Sort Key: tss.service_id
 -  Hash Left Join  (cost=62630.22..62641.84 rows=12 width=74) (actual time=590658.20..641520.35 rows=813179 loops=1)
   Hash Cond: (outer.service_order_detail_id = inner.service_order_detail_id_new)
   Filter: (inner.service_order_detail_id_new IS NULL)
   -  Merge Join  (cost=62622.49..62626.34 rows=12 width=74) (actual time=590651.37..621488.04 rows=829688 loops=1)
 Merge Cond: (outer.account_id = inner.parent_account_id)
 Join Filter: (inner.account_id = outer.parent_account_id)
 -  Index Scan using account_pkey on account aa  (cost=0.00..1004.01 rows=19539 width=8) (actual time=0.43..0.88 rows=11 loops=1)
 -  Sort  (cost=62622.49..62622.89 rows=158 width=82) (actual time=577505.17..591090.69 rows=1659376 loops=1)
   Sort Key: ab.parent_account_id
   -  Hash Join  (cost=61817.14..62616.72 rows=158 width=82) (actual time=459071.50..511843.21 rows=1659376 loops=1)
 Hash Cond: (outer.account_id = inner.account_id)
 -  Seq Scan on breakdown ab  (cost=0.00..602.67 rows=39067 width=8) (actual time=4.64..493.15 rows=39067 loops=1)
 -  Hash  (cost=61816.94..61816.94 rows=79 width=78) (actual time=459061.75..459061.75 rows=0 loops=1)
   -  Hash Join  (cost=61803.77..61816.94 rows=79 width=78) (actual time=287563.65..450075.88 rows=829688 loops=1)
 Hash Cond: (outer.service_order_detail_type_id = inner.service_order_detail_type_id)
 -  Hash Join  (cost=61802.74..61814.33 rows=158 width=82) (actual time=287562.87..419673.93 rows=1121988 loops=1)
   Hash Cond: (outer.service_semantic_type_id = inner.service_semantic_type_id)
   -  Hash Join  (cost=61801.71..61810.93 rows=158 width=74) (actual time=287562.31..383292.28 rows=1121988 loops=1)
 Hash Cond: (outer.service_semantic_type_id = inner.service_semantic_type_id)
 -  Merge Join  (cost=61800.67..61807.52 rows=158 width=66) (actual
time=287561.86..349895.76 rows=1121988 loops=1)
   Merge Cond: (outer.service_id = inner.service_id)
   -  Sort  (cost=61660.77..61661.17 rows=158 width=66) (actual
time=287532.78..298241.29 rows=1121988 loops=1)
 Sort Key: tsb.service_id
 -  Hash Join  (cost=60582.21..61655.00 rows=158 

Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Markus Bertheau
 , 05.11.2003,  16:25, Tom Lane :

  +#define HEXVALUE(c) (((c)='a') ? ((c)-87) : (((c)='A') ? ((c)-55) : ((c)-'0')))
 
 This seems excessively dependent on the assumption that the character
 set is ASCII.  Why have you hard-coded numeric equivalents into this
 macro?

What not ASCII compatible character sets are out there in use still
today?

-- 
Markus Bertheau [EMAIL PROTECTED]

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Experimental ARC implementation

2003-11-05 Thread Zeugswetter Andreas SB SD

  Why not use the checkpointer itself inbetween checkpoints ?
  use a min and a max dirty setting like Informix. Start writing
  when more than max are dirty stop when at min. This avoids writing
  single pages (which is slow, since it cannot be grouped together
  by the OS).
 
 Current approach is similar ... if I strech the IO and syncing over the 
 entire 150-300 second checkpoint interval, grouping in 50 pages then 
 sync()+nap, the system purr's pretty nice and without any peaks.

But how do you handle a write IO bound system then ? My thought was to 
let the checkpointer write dirty pages inbetween checkpoints with a min max,
but still try to do the checkpoint as fast as possible. I don't think
streching the checkpoint is good, since it needs to write hot pages, which the 
inbetween IO should avoid doing. The checkpointer would have two tasks,
that it handles alternately, checkpoint or flush LRU from max to min.

Andreas

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Manfred Spraul
Jan Wieck wrote:

_Vacuum page delay_:

Tom Lane's napping during vacuums with another tuning option. I 
replaced the usleep() call with a PG_DELAY(msec) macro in miscadmin.h, 
which does use select(2) instead. That should address the possible 
portability problems.
What about skipping the delay if there are no outstanding disk 
operations? Then vacuum would get the full disk bandwidth if the system 
is idle.

--
   Manfred


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
 joining column's datatypes do not match


Re: [HACKERS] Experimental ARC implementation

2003-11-05 Thread Jan Wieck
Zeugswetter Andreas SB SD wrote:
 Why not use the checkpointer itself inbetween checkpoints ?
 use a min and a max dirty setting like Informix. Start writing
 when more than max are dirty stop when at min. This avoids writing
 single pages (which is slow, since it cannot be grouped together
 by the OS).
Current approach is similar ... if I strech the IO and syncing over the 
entire 150-300 second checkpoint interval, grouping in 50 pages then 
sync()+nap, the system purr's pretty nice and without any peaks.
But how do you handle a write IO bound system then ? My thought was to 
let the checkpointer write dirty pages inbetween checkpoints with a min max,
but still try to do the checkpoint as fast as possible. I don't think
streching the checkpoint is good, since it needs to write hot pages, which the 
inbetween IO should avoid doing. The checkpointer would have two tasks,
that it handles alternately, checkpoint or flush LRU from max to min.

Andreas
By actually moving a lot of the IO work into the checkpointer. It asks 
the buffer strategy about the order in which dirty blocks would 
currently get evicted from the cache. The checkpointer now flushes them 
in that order. Your hot pages will be found at the end of that list 
and thus flushed last in the checkpoint, why it's good to keep them 
dirty longer.

The problem with the checkpointer flushing as fast as possible is, that 
the entire system literally freezes. In my tests I use something that 
resembles the transaction profile of a TPC-C including the thinking and 
keying times. Those are important as they are a very realistic thing. A 
stock 7.4.RC1 handles a right scaled DB with new_order response times of 
0.2 to 1.5 seconds, but when the checkpoint occurs, it can't keep up and 
the response times go up to anything between 20-60 seconds. What makes 
the situation worse is that in the meantime, all simulated terminals hit 
the send button again, which lead's to a transaction pileup right 
during the checkpoint. It takes a while until the system recovers from 
that.

If the system is write-bound, the checkpointer will find that many dirty 
blocks that he has no time to nap and will burst them out as fast as 
possible anyway. Well, at least that's the theory.

PostgreSQL with the non-overwriting storage concept can never have 
hot-written pages for a long time anyway, can it? They fill up and cool 
down until vacuum.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Neil Conway
Jan Wieck [EMAIL PROTECTED] writes:
 This patch contains the still not yet ready performance improvements
 discussed over the couple last days.

Cool stuff!

 The buffer replacement strategy is a slightly modified version of
 ARC.

BTW Jan, I got your message about taking a look at the ARC code; I'm
really busy at the moment, but I'll definitely take a look at it when
I get a chance.

 I personally would like to see this work included in a 7.4.x
 release.

Personally, I can't see any circumstance under which I would view this
as appropriate for integration into the 7.4 branch -- the changes this
patch introduces are pretty fundamental to the system; even with
testing I'd rather not see a stable release series potentially
destabilized. Furthermore, it's not as if these performance issues
have been recently discovered: we've been aware of most of them for at
least one or two prior releases (if not much longer).

-Neil


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Jan Wieck
Manfred Spraul wrote:

Jan Wieck wrote:

_Vacuum page delay_:

Tom Lane's napping during vacuums with another tuning option. I 
replaced the usleep() call with a PG_DELAY(msec) macro in miscadmin.h, 
which does use select(2) instead. That should address the possible 
portability problems.
What about skipping the delay if there are no outstanding disk 
operations? Then vacuum would get the full disk bandwidth if the system 
is idle.
All we could do is to monitor our own recent activity. I doubt that 
anything else would be portable. And on a dedicated DB server that is 
very close to the truth anyway.

How portable is getrusage()? Could the postmaster issue that frequently 
for RUSAGE_CHILDREN and leave the result somewhere in the shared memory 
for whoever is concerned?

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Jan Wieck
Neil Conway wrote:

Jan Wieck [EMAIL PROTECTED] writes:
This patch contains the still not yet ready performance improvements
discussed over the couple last days.
Cool stuff!

The buffer replacement strategy is a slightly modified version of
ARC.
BTW Jan, I got your message about taking a look at the ARC code; I'm
really busy at the moment, but I'll definitely take a look at it when
I get a chance.
I personally would like to see this work included in a 7.4.x
release.
Personally, I can't see any circumstance under which I would view this
as appropriate for integration into the 7.4 branch -- the changes this
patch introduces are pretty fundamental to the system; even with
testing I'd rather not see a stable release series potentially
destabilized. Furthermore, it's not as if these performance issues
have been recently discovered: we've been aware of most of them for at
least one or two prior releases (if not much longer).
There are many aspects to this, and a full consensus will probably not 
be reachable.

As a matter of fact, people who have performance problems are likely to 
be the same who have upgrade problems. And as Gaetano pointed out 
correctly, we will see wildforms with one or the other feature applied.

My opinion is that it is best for us as supporters and for the 
reputation of PostgreSQL to try to keep the number of wildforms as small 
as possible and to provide those features applied in the best possible 
quality.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
   (send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Open Issues for 7.4

2003-11-05 Thread Marc G. Fournier


On Wed, 5 Nov 2003, Peter Eisentraut wrote:

 Here are some issues that might need to be addressed before 7.4 goes out:

 * ECPG has some new include files such as

 datetime.h
 decimal.h

 that come as part of the Informix compatibility mode.  I don't think these
 should be installed directly in $includedir because of potential
 conflicts.  We can't rename them either, without ruining the compatibility
 aspect.  So I'm looking for suggestions about what to do with them.

could they not be installed in $includedir/informix?  building software
with them should just be a matter of adding -I$includedir/informix, no?

 * I've generated new man pages that are on the ftp server under
 pub/dev/doc/man-7.4.tar.gz.  They need to be copied into the release
 tarball as doc/man.tar.gz.  This was a lot of manual work this time, so
 any further reference page changes need to be integrated by hand.

I changed the build script to rename appropriately, and RC1 should have
the proper files in place ... no?


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Andrew Dunstan
Jan Wieck wrote:

How portable is getrusage()? Could the postmaster issue that 
frequently for RUSAGE_CHILDREN and leave the result somewhere in the 
shared memory for whoever is concerned?

SVr4, BSD4.3, SUS2 and POSIX1003.1, I believe.

I also believe there is a M$ dll available that gives that functionality 
(psapi.dll).

cheers

andrew

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


[HACKERS] Schema boggle...

2003-11-05 Thread Chris Bowlby
Hi All, 

 I work with Marc Fournier, for those who don't know, and have been
working extensively with the schemas feature for Hub.Org's new Account
Management package. Each client's data is stored in a seperate schema
set asside just for them (though they will never have direct access to
it, it helps us balance out the data. Each schema has the same set of
tables, with which different data is stored, depending on the client.

 I've run into two issues and I've been reading over the docs in
relation to schemas in the hopes that I could find a solution to my
problems. Here are the details of the issues:

 1. The first issue I'm running with, is that if I have several schemas
with the following name:

  public
  test_001
  test_002
  test_003

 the test_* schemas have the same tables listed in the their name
spaces, such that:

 Schema test_001 contains:

  test_contact;
  test_domains;
  test_accounts;
  ...
 
 Schema test_002 contains:

  test_contact;
  test_domains;
  test_accounts;
  ...

 etc.

 As you can see this is a nice, clean way to break down some datasets.
But, if I do:

 set search_path to public, test_001, test_002;

 I only get access to the tables in test_001 and public, the tables in
test_002 are not listed, and thus I do not see them on the screen while
doing a \d.

2. Issue 2, is more a SQL issue then anything, but I'd like to be able
to do something to the affect of:

 SELECT * FROM test_*.test_domains ORDER BY domain_name;

 or better yet:

 SELECT * from public.domain_summary ds LEFT JOIN test_*.test_domains td
ON (td.domain_id = ds.id);

 This would mean I can run one query to get all of the information I
need, rather then having to run a query for each client that I have to
list domains for. If I've got 10,000 clients, that's going to take a ton
of time to get the results I'm looking for...

 Does anyone know if any of these issues have been addressed in 7.4? I
see some references to schema based changes, but no details on what
those changes were. If not, are there any plans to do something like
that?
-- 
Chris Bowlby [EMAIL PROTECTED]
Hub.Org Networking Services


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Jason Godden
On Thu, 6 Nov 2003 06:25 am, Markus Bertheau wrote:
  , 05.11.2003,  16:25, Tom Lane :
   +#define HEXVALUE(c) (((c)='a') ? ((c)-87) : (((c)='A') ? ((c)-55) :
   ((c)-'0')))
 
  This seems excessively dependent on the assumption that the character
  set is ASCII.  Why have you hard-coded numeric equivalents into this
  macro?

 What not ASCII compatible character sets are out there in use still
 today?

Ah, yes - didn't even think about the character sets.  If thats the case then 
octal needs attention as well because it makes a similar assumption.  Peter 
Eisentraut commented that this should be in the string literal parser.  
Should this be the case? and if so should i migrate both octal and hex to 
this parser?

Rgds,

Jason


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Larry Rosenman


--On Thursday, November 06, 2003 07:43:07 +1100 Jason Godden 
[EMAIL PROTECTED] wrote:

On Thu, 6 Nov 2003 06:25 am, Markus Bertheau wrote:
? ???, 05.11.2003, ? 16:25, Tom Lane ?:
  +#define HEXVALUE(c) (((c)='a') ? ((c)-87) : (((c)='A') ? ((c)-55)
  : ((c)-'0')))

 This seems excessively dependent on the assumption that the character
 set is ASCII.  Why have you hard-coded numeric equivalents into this
 macro?
What not ASCII compatible character sets are out there in use still
today?

EBCDIC as far as I know is still the default on IBM Mainframes (been 5+ 
years but...).



--
Larry Rosenman http://www.lerctr.org/~ler
Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED]
US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749


pgp0.pgp
Description: PGP signature


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Jan Wieck
Andrew Dunstan wrote:

Jan Wieck wrote:

How portable is getrusage()? Could the postmaster issue that 
frequently for RUSAGE_CHILDREN and leave the result somewhere in the 
shared memory for whoever is concerned?

SVr4, BSD4.3, SUS2 and POSIX1003.1, I believe.

I also believe there is a M$ dll available that gives that functionality 
(psapi.dll).
Remains the question when it is updated, the manpage doesn't tell. If 
the RUSAGE_CHILDREN information is updated only when the child exits, 
each backend has to do it.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Andrew Sullivan
On Wed, Nov 05, 2003 at 03:08:53PM -0500, Neil Conway wrote:
 Jan Wieck [EMAIL PROTECTED] writes:
  I personally would like to see this work included in a 7.4.x
  release.
 
 Personally, I can't see any circumstance under which I would view this
 as appropriate for integration into the 7.4 branch -- the changes this

As unhappy as I am to say so, I agree strongly.  Dot releases don't
get anything like enough testing to make me comfortable with putting
this kind of patch into such a release.  I'm just a user though.

A

-- 

Andrew Sullivan 204-4141 Yonge Street
Afilias CanadaToronto, Ontario Canada
[EMAIL PROTECTED]  M2P 2A8
 +1 416 646 3304 x110


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Schema boggle...

2003-11-05 Thread Jan Wieck
Chris Bowlby wrote:

Hi All, 

 I work with Marc Fournier, for those who don't know, and have been
working extensively with the schemas feature for Hub.Org's new Account
Management package. Each client's data is stored in a seperate schema
set asside just for them (though they will never have direct access to
it, it helps us balance out the data. Each schema has the same set of
tables, with which different data is stored, depending on the client.
 I've run into two issues and I've been reading over the docs in
relation to schemas in the hopes that I could find a solution to my
problems. Here are the details of the issues:
 1. The first issue I'm running with, is that if I have several schemas
with the following name:
  public
  test_001
  test_002
  test_003
 the test_* schemas have the same tables listed in the their name
spaces, such that:
 Schema test_001 contains:

  test_contact;
  test_domains;
  test_accounts;
  ...
 
 Schema test_002 contains:

  test_contact;
  test_domains;
  test_accounts;
  ...
 etc.

 As you can see this is a nice, clean way to break down some datasets.
But, if I do:
 set search_path to public, test_001, test_002;

 I only get access to the tables in test_001 and public, the tables in
test_002 are not listed, and thus I do not see them on the screen while
doing a \d.
2. Issue 2, is more a SQL issue then anything, but I'd like to be able
to do something to the affect of:
 SELECT * FROM test_*.test_domains ORDER BY domain_name;

 or better yet:

 SELECT * from public.domain_summary ds LEFT JOIN test_*.test_domains td
ON (td.domain_id = ds.id);
 This would mean I can run one query to get all of the information I
need, rather then having to run a query for each client that I have to
list domains for. If I've got 10,000 clients, that's going to take a ton
of time to get the results I'm looking for...
 Does anyone know if any of these issues have been addressed in 7.4? I
see some references to schema based changes, but no details on what
those changes were. If not, are there any plans to do something like
that?
You can't use a wildcard like that, but you can create a view like

CREATE VIEW public.all_test_domains AS
SELECT * FROM test_001.test_domains
UNION ALL
SELECT * FROM test_002.test_domains;
You have to change that view as customers (resp. schemas) come and go.

Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 Manfred Spraul wrote:
 What about skipping the delay if there are no outstanding disk 
 operations?

 How portable is getrusage()? Could the postmaster issue that frequently 
 for RUSAGE_CHILDREN and leave the result somewhere in the shared memory 
 for whoever is concerned?

How would that tell you about currently outstanding operations?

Manfred's idea is interesting but AFAICS completely unimplementable
in any portable fashion.  You'd have to have hooks into the kernel.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Kurt Roeckx
On Wed, Nov 05, 2003 at 03:49:54PM -0500, Jan Wieck wrote:
 Andrew Dunstan wrote:
 
 Jan Wieck wrote:
 
 
 How portable is getrusage()? Could the postmaster issue that 
 frequently for RUSAGE_CHILDREN and leave the result somewhere in the 
 shared memory for whoever is concerned?
 
 SVr4, BSD4.3, SUS2 and POSIX1003.1, I believe.
 
 I also believe there is a M$ dll available that gives that functionality 
 (psapi.dll).
 
 Remains the question when it is updated, the manpage doesn't tell. If 
 the RUSAGE_CHILDREN information is updated only when the child exits, 
 each backend has to do it.

If the value of the who argument is RUSAGE_CHILDREN,
information shall be returned about resources used by the
terminated and waited-for children of the current process


Kurt


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Schema boggle...

2003-11-05 Thread Marc G. Fournier


On Wed, 5 Nov 2003, Jan Wieck wrote:

 Chris Bowlby wrote:

  Hi All,
 
   I work with Marc Fournier, for those who don't know, and have been
  working extensively with the schemas feature for Hub.Org's new Account
  Management package. Each client's data is stored in a seperate schema
  set asside just for them (though they will never have direct access to
  it, it helps us balance out the data. Each schema has the same set of
  tables, with which different data is stored, depending on the client.
 
   I've run into two issues and I've been reading over the docs in
  relation to schemas in the hopes that I could find a solution to my
  problems. Here are the details of the issues:
 
   1. The first issue I'm running with, is that if I have several schemas
  with the following name:
 
public
test_001
test_002
test_003
 
   the test_* schemas have the same tables listed in the their name
  spaces, such that:
 
   Schema test_001 contains:
 
test_contact;
test_domains;
test_accounts;
...
 
   Schema test_002 contains:
 
test_contact;
test_domains;
test_accounts;
...
 
   etc.
 
   As you can see this is a nice, clean way to break down some datasets.
  But, if I do:
 
   set search_path to public, test_001, test_002;
 
   I only get access to the tables in test_001 and public, the tables in
  test_002 are not listed, and thus I do not see them on the screen while
  doing a \d.
 
  2. Issue 2, is more a SQL issue then anything, but I'd like to be able
  to do something to the affect of:
 
   SELECT * FROM test_*.test_domains ORDER BY domain_name;
 
   or better yet:
 
   SELECT * from public.domain_summary ds LEFT JOIN test_*.test_domains td
  ON (td.domain_id = ds.id);
 
   This would mean I can run one query to get all of the information I
  need, rather then having to run a query for each client that I have to
  list domains for. If I've got 10,000 clients, that's going to take a ton
  of time to get the results I'm looking for...
 
   Does anyone know if any of these issues have been addressed in 7.4? I
  see some references to schema based changes, but no details on what
  those changes were. If not, are there any plans to do something like
  that?

 You can't use a wildcard like that, but you can create a view like

  CREATE VIEW public.all_test_domains AS
  SELECT * FROM test_001.test_domains
  UNION ALL
  SELECT * FROM test_002.test_domains;

 You have to change that view as customers (resp. schemas) come and go.

Oh, just thought of it ... I know there used to be limitation on this, but
I swore that they were removed ... how about a function?  could you do
something like:

for i in `SELECT nspname FROM nspname WHERE nspname LIKE 'test_*'`

... I know functins returning multi-rows was a limitation in the past, but
didn't that *just* get corrected, or am I mis-remembering?


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Stephan Szabo
On Thu, 6 Nov 2003, Jason Godden wrote:

 On Thu, 6 Nov 2003 06:25 am, Markus Bertheau wrote:
   , 05.11.2003,  16:25, Tom Lane :
+#define HEXVALUE(c) (((c)='a') ? ((c)-87) : (((c)='A') ? ((c)-55) :
((c)-'0')))
  
   This seems excessively dependent on the assumption that the character
   set is ASCII.  Why have you hard-coded numeric equivalents into this
   macro?
 
  What not ASCII compatible character sets are out there in use still
  today?

 Ah, yes - didn't even think about the character sets.  If thats the case then
 octal needs attention as well because it makes a similar assumption.  Peter

I haven't looked at the code in question, but assuming the digits are
contiguous and in order is safe, the C spec mandates that.  Assuming that
the letters are in order and contiguous is not safe.

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Tom Lane
Jan Wieck [EMAIL PROTECTED] writes:
 As a matter of fact, people who have performance problems are likely to 
 be the same who have upgrade problems. And as Gaetano pointed out 
 correctly, we will see wildforms with one or the other feature applied.

I'd believe that for patches of the size of my original VACUUM-delay
hack (or even a production-grade version of same, which'd probably be
10x larger).  The kind of wholesale rewrite you are currently proposing
is much too large to consider folding back into 7.4.*, IMHO.

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Manfred Spraul
Tom Lane wrote:

Manfred's idea is interesting but AFAICS completely unimplementable
in any portable fashion.  You'd have to have hooks into the kernel.
 

I thought about outstanding operations from postgres - I don't know 
enough about the buffer layer if it's possible to keep a counter of the 
currently running read() and write() operations, or something similar.

--
   Manfred
---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Schema boggle...

2003-11-05 Thread Tom Lane
Chris Bowlby [EMAIL PROTECTED] writes:
  As you can see this is a nice, clean way to break down some datasets.
 But, if I do:

  set search_path to public, test_001, test_002;

  I only get access to the tables in test_001 and public, the tables in
 test_002 are not listed, and thus I do not see them on the screen while
 doing a \d.

Well, sure.  They are masked by the identically named tables in
test_001.  How else would you expect it to work?

 2. Issue 2, is more a SQL issue then anything, but I'd like to be able
 to do something to the affect of:

  SELECT * FROM test_*.test_domains ORDER BY domain_name;

I can't see trying to support this at the SQL level.  You might consider
making a plpgsql function that constructs a UNION ALL query listing all
the proper schemas and then EXECUTEs it on-the-fly.

Another possible solution is to make all the individual tables be
inheritance children of a master table.  Then selecting from the master
would work (and would be equivalent to the UNION performance-wise, I
think).

regards, tom lane

---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Tom Lane
Stephan Szabo [EMAIL PROTECTED] writes:
 On Thu, 6 Nov 2003, Jason Godden wrote:
 On Thu, 6 Nov 2003 06:25 am, Markus Bertheau wrote:
 +#define HEXVALUE(c) (((c)='a') ? ((c)-87) : (((c)='A') ? ((c)-55) :
 ((c)-'0')))

 I haven't looked at the code in question, but assuming the digits are
 contiguous and in order is safe, the C spec mandates that.  Assuming that
 the letters are in order and contiguous is not safe.

I believe that's a true statement with respect to the character sets
used in the field; I dunno whether the C spec actually says that though.

My original concern about this macro had several different levels:

1. I can't see any reason why the subtractions are coded as -55 and
not -'A' + 10.  The existing coding is less understandable than doing
it right, and won't save anything in runtime (since the compiler will fold
the constants anyway), on top of being a character set dependency.

2. I don't much care for the assumption that lower case letters are
greater than upper case are greater than digits.  This could be avoided
at a fairly small runtime penalty by making range tests:

#define HEXVALUE(c) \
(((c) = '0'  (c) = '9') ? ((c) - '0') : \
 (((c) = 'A'  (c) = 'F') ? ((c) - 'A' + 10) : \
  ((c) - 'a' + 10)))

3. The third level would be to get rid of the assumption that letters
are contiguous, which would probably require making a lookup table to
map from char code to hex value.

I'm not sure level 3 is really worth doing, since AFAIK no one tries to
run Postgres on any EBCDIC platform.  (It's likely that there are other
places that depend on the letters-are-contiguous assumption anyway.)
I do think level 1 and probably level 2 are appropriate changes.

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


[HACKERS] Changes to Contributor List

2003-11-05 Thread Josh Berkus
Folks,

If possible, for the upcoming release we'd like to get the Contributor List on 
developer.postgresql.org updated.   Can everyone please take a gander at:
http://developer.postgresql.org/bios.php

... and tell me what's out of date other than me  Vadim?   i.e.:

A) What contributors are missing from the list?
B) What contributors are listed under Major Developers who haven't contributed 
any code since 7.1.0?
C) Who needs their e-mail address updated?  
D) Who needs their description updated?  (Text, please ... I won't write it)

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Schema boggle...

2003-11-05 Thread Marc G. Fournier


On Wed, 5 Nov 2003, Tom Lane wrote:

 Chris Bowlby [EMAIL PROTECTED] writes:
   As you can see this is a nice, clean way to break down some datasets.
  But, if I do:

   set search_path to public, test_001, test_002;

   I only get access to the tables in test_001 and public, the tables in
  test_002 are not listed, and thus I do not see them on the screen while
  doing a \d.

 Well, sure.  They are masked by the identically named tables in
 test_001.  How else would you expect it to work?

 List of relations
 Schema   | Name  |   Type   |   Owner
--+---+--+---
 public   | categories| table| 186_pgsql
 public   | categories_rec_id_seq | sequence | 186_pgsql
 test_001 | table1| table| 186_pgsql
 test_002 | table1| table| 186_pgsql

the uniqueness, I would have thought, woudl have been schema.name, not
just name ...


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Schema boggle...

2003-11-05 Thread Josh Berkus
Chris,

  I work with Marc Fournier, for those who don't know, and have been
 working extensively with the schemas feature for Hub.Org's new Account
 Management package. Each client's data is stored in a seperate schema
 set asside just for them (though they will never have direct access to
 it, it helps us balance out the data. Each schema has the same set of
 tables, with which different data is stored, depending on the client.

While this is an interesting use of Schema, it is not, IMHO, a good way to 
partition client data.Far better to have monolithic tables with a 
client_id column and then construct *views* which are available in each 
schema only to that client.   Otherwise, as you've discovered, top-level 
management becomes a royal pain.

I'd be happy to discuss this further on PGSQL-SQL, which is really the 
appropriate mailing list.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Schema boggle...

2003-11-05 Thread Tom Lane
Marc G. Fournier [EMAIL PROTECTED] writes:
  How else would you expect it to work?

  List of relations
  Schema   | Name  |   Type   |   Owner
 --+---+--+---
  public   | categories| table| 186_pgsql
  public   | categories_rec_id_seq | sequence | 186_pgsql
  test_001 | table1| table| 186_pgsql
  test_002 | table1| table| 186_pgsql

You can get something like that if you say \d *.*.  If you say \d *
or equivalently just \d, then what you see is only the tables that you
could reference with unqualified names.  Which does not include the
tables in test_002, because they're hidden by the ones in test_001.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Performance features the 4th

2003-11-05 Thread Matthew T. O'Connor
Tom Lane wrote:

Jan Wieck [EMAIL PROTECTED] writes:
 

As a matter of fact, people who have performance problems are likely to 
be the same who have upgrade problems. And as Gaetano pointed out 
correctly, we will see wildforms with one or the other feature applied.
   

I'd believe that for patches of the size of my original VACUUM-delay
hack (or even a production-grade version of same, which'd probably be
10x larger).  The kind of wholesale rewrite you are currently proposing
is much too large to consider folding back into 7.4.*, IMHO.
 

Do people think that the VACUUM-delay patch by itself, would be usefully 
enough on it's own to consider working it into 7.4.1 or something?  From 
the little feedback I have read on the VACUUM-delay patch used in 
isolation, it certainly does help.  I would love to see it put into 7.4 
somehow. 

The far more rigorous changes that Jan is working on, will be welcome 
improvements for 7.5.

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Changes to Contributor List

2003-11-05 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 B) What contributors are listed under Major Developers who haven't
 contributed any code since 7.1.0?

I think we had agreed that formerly-listed contributors would not be
deleted, but would be moved to a new section titled Contributors
Emeritus or some such.  Please make sure that Tom Lockhart and Vadim
get listed that way, at least.

regards, tom lane

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Changes to Contributor List

2003-11-05 Thread Josh Berkus
Tom,

 I think we had agreed that formerly-listed contributors would not be
 deleted, but would be moved to a new section titled Contributors
 Emeritus or some such.  Please make sure that Tom Lockhart and Vadim
 get listed that way, at least.

Yeah, I was thinking of moving them from Major Contributors to just 
Contributors.   We'd rather have a Past Contributors?

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Changes to Contributor List

2003-11-05 Thread Bruce Momjian
Tom Lane wrote:
 Josh Berkus [EMAIL PROTECTED] writes:
  B) What contributors are listed under Major Developers who haven't
  contributed any code since 7.1.0?
 
 I think we had agreed that formerly-listed contributors would not be
 deleted, but would be moved to a new section titled Contributors
 Emeritus or some such.  Please make sure that Tom Lockhart and Vadim
 get listed that way, at least.

Yes, I think we are going to use that Emeritus list only for special
contributors who aren't involved any more.  I think Brian ??? would be a
good addition too.  He was around during 1.X and 6.X. Marc?  Remember?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Bruce Momjian
Tom Lane wrote:
 Josh Berkus [EMAIL PROTECTED] writes:
  B) What contributors are listed under Major Developers who haven't
  contributed any code since 7.1.0?
 
 I think we had agreed that formerly-listed contributors would not be
 deleted, but would be moved to a new section titled Contributors
 Emeritus or some such.  Please make sure that Tom Lockhart and Vadim
 get listed that way, at least.


I think the Emeritus word might be too hard for non-native English
speakers, and even for less educated English speakers.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Changes to Contributor List

2003-11-05 Thread Jan Wieck
Josh Berkus wrote:

Folks,

If possible, for the upcoming release we'd like to get the Contributor List on 
developer.postgresql.org updated.   Can everyone please take a gander at:
http://developer.postgresql.org/bios.php

... and tell me what's out of date other than me  Vadim?   i.e.:

A) What contributors are missing from the list?
B) What contributors are listed under Major Developers who haven't contributed 
any code since 7.1.0?
C) Who needs their e-mail address updated?  
D) Who needs their description updated?  (Text, please ... I won't write it)

Jan Wieck in Philadelphia, Pennsylvania, USA, ([EMAIL PROTECTED],
a href=http://www.afilias.info;Afilias USA INC./a)
overhauled the query rewrite rule system, wrote our procedural languages
PL/pgSQL and PL/Tcl and many other complex features like TOAST.
Jan

--
#==#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.  #
#== [EMAIL PROTECTED] #
---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Josh Berkus
Guys,

Oh, and how about we kill the Image Map of major developers?   It's about 4 
years out of date, and makes developers.postgresql.org load like molasses in 
January.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Bruce Momjian
Josh Berkus wrote:
 Guys,
 
 Oh, and how about we kill the Image Map of major developers?   It's about 4 
 years out of date, and makes developers.postgresql.org load like molasses in 
 January.

Agreed.  I think Jan is the only one who knows how to updated it.  Jan?

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Josh Berkus wrote:
 Oh, and how about we kill the Image Map of major developers?   It's about 4 
 years out of date, and makes developers.postgresql.org load like molasses in
 January.

 Agreed.

Yeah.  It was cute when we did it but it's overkill as a way of pointing
out that we have a worldwide development community.  We can just say
that...

regards, tom lane

---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Josh Berkus
Tom,

 Yeah.  It was cute when we did it but it's overkill as a way of pointing
 out that we have a worldwide development community.  We can just say
 that...

I'll be happy to re-do it someday using OOo's imagemapper.   But not this week 
...

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Neil Conway
Josh Berkus [EMAIL PROTECTED] writes:
 Oh, and how about we kill the Image Map of major developers?   It's about 4 
 years out of date, and makes developers.postgresql.org load like molasses in 
 January.

Personally, I don't really see the need for developer.postgresql.org
to be a separate sub-site. Since it is basically just a collection of
links to other resources (CVSweb, TODO list, devel docs build, etc.),
why not just make it a page or two on www.postgresql.org?

-Neil


---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread scott.marlowe
On Wed, 5 Nov 2003, Rod Taylor wrote:

 Since this is a large query, attachments for the explains / query.
 
 Configuration:
 dev_iqdb=# select version();
 version
 
  PostgreSQL 7.4beta1 on i386-portbld-freebsd4.7, compiled by GCC
 2.95.4
 (1 row)
 
 SET default_statistics_target = 1000;
 ANALYZE;
 set from_collapse_limit = 100;
 set join_collapse_limit = 20;

I'm not sure if that will actually change the default_statistics_target of 
the tables you're analyzing, I think it will only apply to newly created 
tables.  

I believe you have to alter table alter column set statistics 1000 for 
each column you want a statistic of 1000.  You might wanna try starting 
with 50 or 100 and see if that works first.


---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [HACKERS] Changes to Contributor List

2003-11-05 Thread Peter Eisentraut
Josh Berkus writes:

 If possible, for the upcoming release we'd like to get the Contributor List on
 developer.postgresql.org updated.   Can everyone please take a gander at:
 http://developer.postgresql.org/bios.php

One thing that really puzzles me is this web page:

http://advocacy.postgresql.org/about/

This is sort of the same web page that we're talking about here, but it
lists the developers below the press contacts as also-rans.  What's worse,
this is the web page that people will get to if they go to if they go to
http://www.postgresql.org and click on the first link they see: What
is  Don't get me wrong, press contacts and advocacy team deserve
recognition as well, but not on a page that is indirectly labelled What
is PostgreSQL.  It makes it look like the project is run by marketing
dudes.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Erroneous PPC spinlock code

2003-11-05 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 The SuSE PPC guru said that the PPC spinlock code we currently use may
 behave erroneously on multiprocessor systems.  Attached is the proposed
 patch, suggested for inclusion in 7.4.  Comments?

I looked into this some more.  The current CVS tip is identical to the
TAS code used in 7.3.* except for being gcc inlined assembler instead of
an out-of-line subroutine.  The 7.3 code is known to work, cf
http://archives.postgresql.org/pgsql-bugs/2002-09/msg00252.php
Given that testing and the lack of any bug reports against 7.3.*,
I think the burden of proof is on the person who thinks we should
change it.

AFAICS the proposed change for TAS() simply amounts to reversing the
sense of the test following stwcx., so that the success path
corresponds to branch not taken rather than branch taken.
I cannot see anything in the IBM PPC architecture manual
http://www-3.ibm.com/chips/techlib/techlib.nsf/techdocs/852569B20050FF778525699600682CC7
to justify thinking that this changes anything.  If there is any
difference in behavior then I'd think that having the isync in the
forward branch path is safer than not.  The TAS example in the manual
(p. 398) looks like

loop: lwarx ...
  ...
  stwcx....
  bne   loop
  isync

which might be read as saying that the isync should be in the fall
through path, but I think it is more correctly read as putting the
isync in the not predicted to be taken path.  Branch backward will
be statically predicted to be taken, branch forward not.  In any case
there's no mention here of needing to code the branch in one particular
way.

The proposed change from sync to lwsync during S_UNLOCK is interesting,
but we have to keep in mind that that is *not* a bug fix but an attempt
at performance improvement --- lwsync is a weaker constraint than sync.
I am not convinced that this change is safe for our usage, and I think
it would be folly to stick it into 7.4 at the RC stage of the cycle.

In short, my vote is show me why for the TAS change, and no way for
7.4, but we can look at it later for the S_UNLOCK change.

regards, tom lane

---(end of broadcast)---
TIP 9: the planner will ignore your desire to choose an index scan if your
  joining column's datatypes do not match


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Peter Eisentraut
Neil Conway writes:

 Personally, I don't really see the need for developer.postgresql.org
 to be a separate sub-site. Since it is basically just a collection of
 links to other resources (CVSweb, TODO list, devel docs build, etc.),
 why not just make it a page or two on www.postgresql.org?

I agree to that.  The best way to get people involved is if we have an
integrated presentation of the project.  That is, users, developers,
marketing, documentation, web mastering, translation, whatever.  Right
now, the developers sit in their own corner, and users think, These
people can't even be bothered to present relevant information in the main
web site; they don't want me.  The marketing people sit in another
corner, and apparently their marketing strategy is make the marketing
site look as much unlike anything else in the project as possible.  And,
well, all the other people don't sit anywhere, because the main site
understands itself as a portal, and there is no obvious way that other
groups can integrate.

Check out www.debian.org or www.freebsd.org to see what I mean.
Everything is there at one glance, everything looks the same, everyone is
invited everywhere.

-- 
Peter Eisentraut   [EMAIL PROTECTED]


---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Andrew Dunstan


Bruce Momjian wrote:

Tom Lane wrote:
 

Josh Berkus [EMAIL PROTECTED] writes:
   

B) What contributors are listed under Major Developers who haven't
contributed any code since 7.1.0?
 

I think we had agreed that formerly-listed contributors would not be
deleted, but would be moved to a new section titled Contributors
Emeritus or some such.  Please make sure that Tom Lockhart and Vadim
get listed that way, at least.
   



I think the Emeritus word might be too hard for non-native English
speakers, and even for less educated English speakers.
 

Rupert Murdoch once sacked an editor (over the Hitler Diaries forgery 
fiasco) by giving him the title of Editor Emeritus. The editor asked 
what it meant and Murdoch is reported to have replied It's Latin, 
Frank. The 'e' means you're out and the 'meritus' means you deserve it.

:-)

cheers

andrew

---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Open Issues for 7.4

2003-11-05 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 * Fix uselessly executable files in the source tree.  See my recent post.
 Any ideas on that?

They've been uselessly executable since they were put there.  I don't
think this is something that we need to fix in time for 7.4, or even
should risk trying to fix during RC stage.

 * There is a bug in configure that prints an ugly error message on systems
 without nsgmls and certain weird shells (NetBSD, UnixWare, for example),
 because they interpret the construct ${NSGMLS-false} differently.

Okay, should fix that, unless it's harder to fix than it sounds.

 * Did we really decide not to rename the check_function_bodies parameter
 to something more general?  Maybe I've missed it.

There didn't seem to be either widespread interest in this, or good
consensus on what to change it to.  I proposed doing nothing here:
http://archives.postgresql.org/pgsql-hackers/2003-10/msg01521.php
and so far as I saw no one responded one way or the other.

 * PPC spinlock patch from SuSE.

See my prior response.

regards, tom lane

---(end of broadcast)---
TIP 2: you can get off all lists at once with the unregister command
(send unregister YourEmailAddressHere to [EMAIL PROTECTED])


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread Rod Taylor
 I'm not sure if that will actually change the default_statistics_target of 
 the tables you're analyzing, I think it will only apply to newly created 
 tables.  
 
 I believe you have to alter table alter column set statistics 1000 for 
 each column you want a statistic of 1000.  You might wanna try starting 
 with 50 or 100 and see if that works first.

Hmm.. I was under the impression that it would work for any tables that
haven't otherwise been overridden.

Sets the default statistics target for table columns that have
not had a column-specific target set via ALTER TABLE SET
STATISTICS. Larger values increase the time needed to do
ANALYZE, but may improve the quality of the planner's estimates.
The default is 10.


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread Tom Lane
Rod Taylor [EMAIL PROTECTED] writes:
 I'm not sure if that will actually change the default_statistics_target

 Hmm.. I was under the impression that it would work for any tables that
 haven't otherwise been overridden.

It will.  I think Scott is recalling the original circa-7.2
implementation, where it wouldn't.  If you're unsure that you
affected it, check out the actual sizes of the array values in pg_stats.

regards, tom lane

---(end of broadcast)---
TIP 3: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread Rod Taylor
On Wed, 2003-11-05 at 18:57, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  I'm not sure if that will actually change the default_statistics_target
 
  Hmm.. I was under the impression that it would work for any tables that
  haven't otherwise been overridden.
 
 It will.  I think Scott is recalling the original circa-7.2
 implementation, where it wouldn't.  If you're unsure that you
 affected it, check out the actual sizes of the array values in pg_stats.

The plan does change when I remove it and re-analyze it.

Either way, it's off by a factor of a 10^5 or so.


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread Tom Lane
Rod Taylor [EMAIL PROTECTED] writes:
 Effectively, the planner has amazingly inaccurate row estimates.

It seems the key estimation failure is in this join step:

  -  Hash Join  (cost=1230.79..60581.82 rows=158 width=54) (actual 
time=1262.35..151200.29 rows=1121988 loops=1)
Hash Cond: (outer.account_id = inner.account_id)
-  Hash Join  (cost=1226.78..52863.43 rows=1542558 width=50) (actual 
time=1261.63..100418.30 rows=1573190 loops=1)
  (join of bsod, tsb, tss)
-  Hash  (cost=4.01..4.01 rows=1 width=4) (actual time=0.33..0.33 
rows=0 loops=1)
  -  Index Scan using single_null_parent_account_hack on account 
ap  (cost=0.00..4.01 rows=1 width=4) (actual time=0.26..0.28 rows=1 loops=1)
Filter: (parent_account_id IS NULL)

The estimated number of rows out of the join of bsod, tsb, tss isn't far
off, but the estimate for the result of joining that to ap is WAY off.
Apparently the planner thinks that only a few rows in the join will have
matches in ap, but really they almost all do.  Any idea why?  The
account_id stats for each seem to be the thing to look at.

 Any hints? I'm basically stuck. Oh, and I would like to ask for a
 pgadmin feature -- visual explain :)

You do know that Red Hat has been offering a Visual Explain tool for
some time?
http://sources.redhat.com/rhdb/
I've not had much occasion to use it myself, but it works ...

regards, tom lane

---(end of broadcast)---
TIP 6: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread scott.marlowe
On Wed, 5 Nov 2003, Tom Lane wrote:

 Rod Taylor [EMAIL PROTECTED] writes:
  I'm not sure if that will actually change the default_statistics_target
 
  Hmm.. I was under the impression that it would work for any tables that
  haven't otherwise been overridden.
 
 It will.  I think Scott is recalling the original circa-7.2
 implementation, where it wouldn't.  If you're unsure that you
 affected it, check out the actual sizes of the array values in pg_stats.

Hey, can anyone guess what version I'm running in production :-)  Thanks 
for the catch.


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] Very poor estimates from planner

2003-11-05 Thread Rod Taylor
On Wed, 2003-11-05 at 19:18, Tom Lane wrote:
 Rod Taylor [EMAIL PROTECTED] writes:
  Effectively, the planner has amazingly inaccurate row estimates.
 
 It seems the key estimation failure is in this join step:
 
   -  Hash Join  (cost=1230.79..60581.82 rows=158 width=54) (actual 
 time=1262.35..151200.29 rows=1121988 loops=1)
 Hash Cond: (outer.account_id = inner.account_id)
 -  Hash Join  (cost=1226.78..52863.43 rows=1542558 width=50) 
 (actual time=1261.63..100418.30 rows=1573190 loops=1)
   (join of bsod, tsb, tss)
 -  Hash  (cost=4.01..4.01 rows=1 width=4) (actual time=0.33..0.33 
 rows=0 loops=1)
   -  Index Scan using single_null_parent_account_hack on 
 account ap  (cost=0.00..4.01 rows=1 width=4) (actual time=0.26..0.28 rows=1 loops=1)
 Filter: (parent_account_id IS NULL)
 
 The estimated number of rows out of the join of bsod, tsb, tss isn't far
 off, but the estimate for the result of joining that to ap is WAY off.
 Apparently the planner thinks that only a few rows in the join will have
 matches in ap, but really they almost all do.  Any idea why?  The
 account_id stats for each seem to be the thing to look at.

The account structure is tree, as is the product catalogue. Essentially
what the query does is convert high level recorded invoices into the
lower (not quite base) items for billing the different parties involved
in the transaction. bsod and tsb are both foreign keys to tss on the
columns being joined. Since these are full table scans, all values will
join.

This join in particular is  bsod (lineitems) and tsb (cached graph of
the product catalog tree) which are both foreign key'd off of tss
(product catalog tree).

So yes, since this is a full table scan all values will be joined since
the foreign key enforces them all to exist.

  Any hints? I'm basically stuck. Oh, and I would like to ask for a
  pgadmin feature -- visual explain :)
 
 You do know that Red Hat has been offering a Visual Explain tool for
 some time?
 http://sources.redhat.com/rhdb/
 I've not had much occasion to use it myself, but it works ...

Yeah.. but pgadmin is in the ports tree.

I'll take a peak at it.


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] Changes to Contributor List

2003-11-05 Thread Josh Berkus
Peter,
 
 http://advocacy.postgresql.org/about/
 
 This is sort of the same web page that we're talking about here, but it
 lists the developers below the press contacts as also-rans.  What's worse,
 this is the web page that people will get to if they go to if they go to
 http://www.postgresql.org and click on the first link they see: What
 is  

??? I thought we fixed that link.

 Don't get me wrong, press contacts and advocacy team deserve
 recognition as well, but not on a page that is indirectly labelled What
 is PostgreSQL.  It makes it look like the project is run by marketing
 dudes.

Yeah, you're right,it's confusing  we should have two seperate pages, one 
for What is the PGDG possibly linking to developer., and one page for 
Contact Us.   As it is, they two are munged together.

The problem is getting it fixed before 7.4  there are 7 translations 

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco


---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster


Re: [HACKERS] Schema boggle...

2003-11-05 Thread Marc G. Fournier

Actually, the use of schema's was my idea, to speed up some dreadfully
slow queries dealing with traffic stats from a table that was growing
painfully monolithic ... the idea is/was that it would be easier to
backup/remove all data pertaining to a specific client if they decided to
close their account ...

On Wed, 5 Nov 2003, Josh Berkus wrote:

 Chris,

   I work with Marc Fournier, for those who don't know, and have been
  working extensively with the schemas feature for Hub.Org's new Account
  Management package. Each client's data is stored in a seperate schema
  set asside just for them (though they will never have direct access to
  it, it helps us balance out the data. Each schema has the same set of
  tables, with which different data is stored, depending on the client.

 While this is an interesting use of Schema, it is not, IMHO, a good way to
 partition client data.Far better to have monolithic tables with a
 client_id column and then construct *views* which are available in each
 schema only to that client.   Otherwise, as you've discovered, top-level
 management becomes a royal pain.

 I'd be happy to discuss this further on PGSQL-SQL, which is really the
 appropriate mailing list.

 --
 -Josh Berkus
  Aglio Database Solutions
  San Francisco


 ---(end of broadcast)---
 TIP 8: explain analyze is your friend


---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Open Issues for 7.4

2003-11-05 Thread Christopher Kings-Lynne

* Fix uselessly executable files in the source tree.  See my recent post.
Any ideas on that?
As far as I'm aware, the only way to fix this is to get into the cvsroot 
 and chmod them by hand.

Chris



---(end of broadcast)---
TIP 5: Have you checked our extensive FAQ?
  http://www.postgresql.org/docs/faqs/FAQ.html


Re: [HACKERS] Changes to Contributor List

2003-11-05 Thread Gaetano Mendola
Josh Berkus wrote:

Folks,

If possible, for the upcoming release we'd like to get the Contributor List on 
developer.postgresql.org updated.   Can everyone please take a gander at:
http://developer.postgresql.org/bios.php

... and tell me what's out of date other than me  Vadim?   i.e.:

A) What contributors are missing from the list?
B) What contributors are listed under Major Developers who haven't contributed 
any code since 7.1.0?
C) Who needs their e-mail address updated?  
D) Who needs their description updated?  (Text, please ... I won't write it)
Who is eligible to be a contributor ?
Who wrote a single line of code that now is inside Postgres?
Who discovered a major bug ?
Who partecipate actively to all discussions ?
Regards
Gaetano Mendola


---(end of broadcast)---
TIP 6: Have you searched our list archives?
  http://archives.postgresql.org


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Christopher Kings-Lynne
I think we had agreed that formerly-listed contributors would not be
deleted, but would be moved to a new section titled Contributors
Emeritus or some such.  Please make sure that Tom Lockhart and Vadim
get listed that way, at least.
I think the Emeritus word might be too hard for non-native English
speakers, and even for less educated English speakers.
Isn't that an even better reason to use it? :)

Chris



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


Re: [pgsql-www] [HACKERS] Changes to Contributor List

2003-11-05 Thread Joshua D. Drake


I think the Emeritus word might be too hard for non-native English
speakers, and even for less educated English speakers.


Isn't that an even better reason to use it? :)
My personal opinion would be that they can use dictionary.com if they 
don't know what it means.



Chris



---(end of broadcast)---
TIP 1: subscribe and unsubscribe commands go to [EMAIL PROTECTED]


---(end of broadcast)---
TIP 8: explain analyze is your friend


Re: [HACKERS] Schema boggle...

2003-11-05 Thread Greg Stark

Marc G. Fournier [EMAIL PROTECTED] writes:

 Actually, the use of schema's was my idea, to speed up some dreadfully
 slow queries dealing with traffic stats from a table that was growing
 painfully monolithic ... the idea is/was that it would be easier to
 backup/remove all data pertaining to a specific client if they decided to
 close their account ...

I would add an aol me too /aol to the comment that this is a horrible idea
and will be an enormous pain. You're denormalizing your data in a way that
will cause you great pain. 

What you're trying to implement is called partitioned tables. And you're
right, bulk loading and deleting is one of the big advantages of partitioned
tables. But Postgres doesn't support partitioned tables, so trying to roll
your own is hard.

Using schemas seems like a bad impedance match here too, as you've found out
with the search_path. Inherited tables is a closer match, still a lot of
things won't be automatic, and primary keys may be a problem.

But what most people use as the closest approximation to partitioned tables in
Postgres is partial indexes. It lets you keep your indexes to a reasonable
size but still accelerate the poor-selectivity client_id column in parallel.

But you still would be storing all the records in a single table and would
have to do some big vacuuming whenever you delete a whole client. vacuum full
may be your friend.

I don't understand why the queries should be any slower dealing with the
normalized data in a single table versus the dernormalized tables in multiple
tables. The one big exception is any query doing where client_id = ? where
the latter allows the use of a sequential scan instead of an index scan.

-- 
greg


---(end of broadcast)---
TIP 7: don't forget to increase your free space map settings


Re: [HACKERS] \xDD patch for 7.5devel

2003-11-05 Thread Jason Godden
 #define HEXVALUE(c) \
   (((c) = '0'  (c) = '9') ? ((c) - '0') : \
(((c) = 'A'  (c) = 'F') ? ((c) - 'A' + 10) : \
 ((c) - 'a' + 10)))

 3. The third level would be to get rid of the assumption that letters
 are contiguous, which would probably require making a lookup table to
 map from char code to hex value.

 I'm not sure level 3 is really worth doing, since AFAIK no one tries to
 run Postgres on any EBCDIC platform.  (It's likely that there are other
 places that depend on the letters-are-contiguous assumption anyway.)
 I do think level 1 and probably level 2 are appropriate changes.

   regards, tom lane

Hi Guys,

Thanks for the feedback.  Tom I'll make the changes proposed here to the macro 
and repost the patch to pgsql-patches (and do some reading on Unicode!).  I 
guess at this stage I would like to offer any of my time to any janitorial 
work that might be needed as until im more knowledgeable about the pg source 
I think any large scale stuff is off the cards.

Rgds,

Jason

---(end of broadcast)---
TIP 4: Don't 'kill -9' the postmaster