[HACKERS] postgresql-8.1RC1 on Solaris 10, amd64x2

2005-11-04 Thread Cedric Berger

./configure --without-readline
Everything compiled fine, and all tests passed on make check.

However, when I looked at my server this morning, one of the four
postmaster process was taking all CPU on one of my core (no db was
created, the postmaster should have been totally idle)

 load averages:  0.09,  0.10,  
0.09   
08:47:40

 252 processes: 250 sleeping, 2 on cpu
 CPU states: 46.9% idle,  2.0% user, 51.0% kernel,  0.0% iowait,  0.0% swap
 Memory: 2048M real, 1321M free, 389M swap in use, 3077M swap free

  PID USERNAME LWP PRI NICE  SIZE   RES STATETIMECPU COMMAND
 7188 postgres   1   00   17M   12M cpu/05:07 50.00% postgres
10440 root   1  490 3192K 1296K cpu/10:00  0.20% top
10392 cedric 1  590 9700K 2056K sleep0:00  0.01% sshd
10438 root   1  590 5212K 2040K sleep0:00  0.01% zsh

I tried to stop postgress using: 'su - postgres -c pg_ctl stop -swm fast'
but that function never returns. Even kill -9 couldn't stop the process.

This looks like a Solaris bug firstly, but I thought I'd mention it.
Cedric


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


[HACKERS] somebody could explain this?

2005-11-04 Thread Cristian Prieto
Hello, I'm using PostgreSQL 8.0.4 in Fedora Core 3, right now I'm learning a
little about the postgresql internals and the way some kind of SPs could be
written in c language; I found something really weird and I cannot explain
to me this behavior:

#include postgres.h
#include fmgr.h

PG_FUNCTION_INFO_V1(test);

Datum
repeat_item(PG_FUNCTION_ARGS)
{
int num_times;

num_times = PG_GETARG_FLOAT8(0) * 100;
PG_RETURN_INT32(num_times);
}


Inside psql this happens:

# Create or replace function test(float) returns integer as 'test.so'
language 'c' stable;

select test(0.1);
Returns 10

Select test(0.11);
Returns 11

Select test(0.12)
Returns 11

Select test(0.13)
Returns 13

Select test(0.14)
Returns 14

Select test(0.15)
Returns 14


What Is happening here?


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] PG 8.1 supported platforms list: IRIX is MIA

2005-11-04 Thread Tom Lane
The 8.1 supported-platforms list is looking pretty good, I think -- we
don't have updates for every single combination of OS and hardware,
but we have updates for every OS and at least one instance of all
supported CPU types.

Except IRIX.  There's been no port report since Robert Bruccoleri
confirmed PG 7.4 worked.  If we can't get a confirmation of that
I fear we ought to drop it down to the unsupported list.
Can anyone test 8.1RC1 on IRIX?

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] postgresql-8.1RC1 on Solaris 10, amd64x2

2005-11-04 Thread Andrew Dunstan



Cedric Berger wrote:



I tried to stop postgress using: 'su - postgres -c pg_ctl stop -swm 
fast'

but that function never returns. Even kill -9 couldn't stop the process.

This looks like a Solaris bug firstly, but I thought I'd mention it.



Yeah, that sounds like it's stuck in the kernel.

cheers

andrew

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] somebody could explain this?

2005-11-04 Thread Andrew Dunstan


Classic behaviour, which has nothing to do with postgres. Try the 
program below to see the same effect. You probably should be using a 
rounding function to see what you seem to expect.


cheers

andrew



#include stdio.h

main()
{
double x[] = { 0.1, 0.11, 0.12, 0.13, 0.14, 0.15 };
int i,n;
for (i=0 ; i  6; i++)
{
n = x[i] * 100;
printf(%d\n,n);
}

}


Cristian Prieto wrote:


Hello, I'm using PostgreSQL 8.0.4 in Fedora Core 3, right now I'm learning a
little about the postgresql internals and the way some kind of SPs could be
written in c language; I found something really weird and I cannot explain
to me this behavior:

#include postgres.h
#include fmgr.h

PG_FUNCTION_INFO_V1(test);

Datum
repeat_item(PG_FUNCTION_ARGS)
{
int num_times;

num_times = PG_GETARG_FLOAT8(0) * 100;
PG_RETURN_INT32(num_times);
}


Inside psql this happens:

# Create or replace function test(float) returns integer as 'test.so'
language 'c' stable;

select test(0.1);
Returns 10

Select test(0.11);
Returns 11

Select test(0.12)
Returns 11

Select test(0.13)
Returns 13

Select test(0.14)
Returns 14

Select test(0.15)
Returns 14


What Is happening here?


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

 



---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] somebody could explain this?

2005-11-04 Thread Martijn van Oosterhout
On Fri, Nov 04, 2005 at 10:16:50AM -0600, Cristian Prieto wrote:
 Hello, I'm using PostgreSQL 8.0.4 in Fedora Core 3, right now I'm learning a
 little about the postgresql internals and the way some kind of SPs could be
 written in c language; I found something really weird and I cannot explain
 to me this behavior:

What's happening here is that the multiplication, being floating point,
has some accumulated error such that when you multiply it by 100 and
convert it to an int, it hits the cutoff.

Note: converting a float to an int tends to invoke trunc() or something
similar. trunc(14.) = 14. You need to decide whether maybe
you want round() instead. Or decide the actual cutoff yourself. I
sometimes fix this by adding 0.1 to numbers before converting to
integer, to avoid these issues.

Floating points numbers are accurate but not precise.

Hope this helps,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpkJMD5sNVGk.pgp
Description: PGP signature


Re: [HACKERS] somebody could explain this?

2005-11-04 Thread Tom Lane
Cristian Prieto [EMAIL PROTECTED] writes:
 Datum
 repeat_item(PG_FUNCTION_ARGS)
 {
   int num_times;

   num_times = PG_GETARG_FLOAT8(0) * 100;
   PG_RETURN_INT32(num_times);
 }

 # Create or replace function test(float) returns integer as 'test.so'
 language 'c' stable;

float is FLOAT4 not FLOAT8 ... your function is picking up some
garbage data.

There might also be some roundoff issues to think about --- I'd be
inclined to add an rint() call, or at least add 0.5 before truncating to
integer.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread mark
On Thu, Nov 03, 2005 at 09:17:43PM -0500, Tom Lane wrote:
 Gregory Maxwell [EMAIL PROTECTED] writes:
  Another way to look at this is in the context of compression: With
  unicode, characters are really 32bit values... But only a small range
  of these values is common.  So we store and work with them in a
  compressed format, UTF-8.
  As such it might be more interesting to ask some other questions like:
  are we using the best compression algorithm for the application, and,
  why do we sometimes stack two compression algorithms?
 Actually, the real reason we use UTF-8 and not any of the
 sorta-fixed-size representations of Unicode is that the backend is by
 and large an ASCII, null-terminated-string engine.  *All* of the
 supported backend encodings are ASCII-superset codes.  Making
 everything null-safe in order to allow use of UCS2 or UCS4 would be
 a huge amount of work, and the benefit is at best questionable.

Perhaps on a side note - my intuition (which sometimes lies) would tell
me that, if the above is true, the backend is doing unnecessary copies
of read-only data, if only, to insert a '\0' at the end of the strings.
Is this true?

I'm thinking along the lines of the other threads that speak of PostgreSQL
being CPU or I/O bound, not disk bound, for many sorts of operations. Is
PostgreSQL unnecessary copying string data around (and other data, I would
assume).

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS][OT] somebody could explain this?

2005-11-04 Thread Csaba Nagy
[snip]
 Floating points numbers are accurate but not precise.

OK, now this one beats me... what's the difference between accurate
and exact ? I thought both mean something like correct, but precise
refers to some action and accurate applies to a situation or
description...

I'm actually curios what it means. Merriam-Webster refers for both to
correct as a synonym.

Cheers,
Csaba.



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS][OT] somebody could explain this?

2005-11-04 Thread Peter Eisentraut
Csaba Nagy wrote:
 [snip]

  Floating points numbers are accurate but not precise.

 OK, now this one beats me... what's the difference between accurate
 and exact ? I thought both mean something like correct, but
 precise refers to some action and accurate applies to a situation or
 description...

Accurate means close to the real value, precise means having a lot of 
detail.

So 3 is more accurate than 4 as a representation of Pi, but both are 
not very precise.

5.32290753057207250735 is a very precise representation of Pi but 
totally inaccurate.

This also means that the statement at the top is wrong.  It should be 
the other way around.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] PG 8.1 supported platforms list: IRIX is MIA

2005-11-04 Thread Kevin Grittner
I don't know if this fills in any of the gaps, but...

We passed regression tests with 8.1RC1 on dual hyperthreaded
Xeon systems.  One had SUSE 9.3 Professional; the other had
Windows Server 2003 Enterprise Edition (Service Pack 1).
Both had 8 GB RAM and six disk drives set up as RAID 5.

Configure switches were:
--enable-integer-datetimes
--enable-debug
--disable-nls

On SUSE we also had:
--with-python

If it would be helpful to have more info, let me know.


 Tom Lane [EMAIL PROTECTED]  
The 8.1 supported-platforms list is looking pretty good, I think -- we
don't have updates for every single combination of OS and hardware,
but we have updates for every OS and at least one instance of all
supported CPU types.


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Constraint exclusion description bug?

2005-11-04 Thread Peter Eisentraut
I really haven't read up much on the constraint exclusion feature, but 
this description in guc.c really doesn't want to make me use it. :)

This prevents table access if the table constraints guarantee that 
table access is necessary.

This is a typo, right?

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS][OT] somebody could explain this?

2005-11-04 Thread Kevin Grittner
I think the crucial point is that the common IEEE floating point
formats are unable to store an EXACT representation of common
decimal fractions (such as .1) -- they can only store an
APPROXIMATION.

 Peter Eisentraut [EMAIL PROTECTED]  
Csaba Nagy wrote:
 [snip]

  Floating points numbers are accurate but not precise.

 OK, now this one beats me... what's the difference between accurate
 and exact ? I thought both mean something like correct, but
 precise refers to some action and accurate applies to a situation or
 description...

Accurate means close to the real value, precise means having a lot of 
detail.

So 3 is more accurate than 4 as a representation of Pi, but both are 
not very precise.

5.32290753057207250735 is a very precise representation of Pi but 
totally inaccurate.

This also means that the statement at the top is wrong.  It should be 
the other way around.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] [OT] somebody could explain this?

2005-11-04 Thread Bruno Wolff III
On Fri, Nov 04, 2005 at 18:30:56 +0100,
  Csaba Nagy [EMAIL PROTECTED] wrote:
 [snip]
  Floating points numbers are accurate but not precise.
 
 OK, now this one beats me... what's the difference between accurate
 and exact ? I thought both mean something like correct, but precise
 refers to some action and accurate applies to a situation or
 description...
 
 I'm actually curios what it means. Merriam-Webster refers for both to
 correct as a synonym.

My memory from science courses a long time ago, is that precision refers
to how much information you have (e.g. the number of digits in a number)
and accuracy refers to how close your information is to reality.
Using a precision that was much higher than justified by accuracy used to
get points deducted from lab report grades.

In mathematics precise has a somewhat different meaning, but isn't a synonym
for accurate in that context.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 Nailed it.

 problem is in mainloop.c - setup_cancel_handler.  Apparently you can
 have multiple handlers and windows keeps track of them all, even if they
 do the same thing.  Keeping track of so many system handles would
 naturally slow the whole process down.

Yipes.  So we really want to do that only once.

AFAICS it is appropriate to move the sigsetjmp and setup_cancel_handler
calls in front of the per-line loop inside MainLoop --- can anyone see
a reason not to?

I'm inclined to treat this as an outright bug, not just a minor
performance issue, because it implies that a sufficiently long psql
script would probably crash a Windows machine.

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] PG 8.1 supported platforms list

2005-11-04 Thread Chris Browne
[EMAIL PROTECTED] (Tom Lane) writes:
 The 8.1 supported-platforms list is looking pretty good, I think -- we
 don't have updates for every single combination of OS and hardware,
 but we have updates for every OS and at least one instance of all
 supported CPU types.

Not to pester overly...

AIX 5.3 has had results reported, and entries for doc/FAQ_AIX
submitted; none of that is yet included :-(.
-- 
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://www.ntlug.org/~cbbrowne/linux.html
No, I'm not interested in developing a powerful brain.  All I'm after
is just  a mediocre  brain, something like  the president  of American
Telephone and Telegraph Company.  -- Alan Turing on the possibilities
of a thinking machine, 1943.

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Merlin Moncure
 Merlin Moncure [EMAIL PROTECTED] writes:
  Nailed it.
 
  problem is in mainloop.c - setup_cancel_handler.  Apparently you
can
  have multiple handlers and windows keeps track of them all, even if
they
  do the same thing.  Keeping track of so many system handles would
  naturally slow the whole process down.
 
 Yipes.  So we really want to do that only once.
 
 AFAICS it is appropriate to move the sigsetjmp and
setup_cancel_handler
 calls in front of the per-line loop inside MainLoop --- can anyone see
 a reason not to?

hm. mainloop is re-entrant, right?  That means each \i would reset the
handler...what is downside to keeping global flag?


 I'm inclined to treat this as an outright bug, not just a minor
certainly...

 performance issue, because it implies that a sufficiently long psql
 script would probably crash a Windows machine.

actually, it's worse than that, it's more of a dos on the whole system,
as windows will eventually stop granting handles, but there is a good
chance of side effects on other applications.

Merlin

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread David Fetter
On Fri, Nov 04, 2005 at 01:01:20PM -0500, Tom Lane wrote:
 Merlin Moncure [EMAIL PROTECTED] writes:
  Nailed it.
 
  problem is in mainloop.c - setup_cancel_handler.  Apparently you
  can have multiple handlers and windows keeps track of them all,
  even if they do the same thing.  Keeping track of so many system
  handles would naturally slow the whole process down.
 
 Yipes.  So we really want to do that only once.
 
 AFAICS it is appropriate to move the sigsetjmp and
 setup_cancel_handler calls in front of the per-line loop inside
 MainLoop --- can anyone see a reason not to?
 
 I'm inclined to treat this as an outright bug, not just a minor
 performance issue, because it implies that a sufficiently long psql
 script would probably crash a Windows machine.

Ouch.  In light of this, are we *sure* what we've got a is a candidate
for release?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 510 893 6100   mobile: +1 415 235 3778

Remember to vote!

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Bruce Momjian
Tom Lane wrote:
 Merlin Moncure [EMAIL PROTECTED] writes:
  Nailed it.
 
  problem is in mainloop.c - setup_cancel_handler.  Apparently you can
  have multiple handlers and windows keeps track of them all, even if they
  do the same thing.  Keeping track of so many system handles would
  naturally slow the whole process down.
 
 Yipes.  So we really want to do that only once.
 
 AFAICS it is appropriate to move the sigsetjmp and setup_cancel_handler
 calls in front of the per-line loop inside MainLoop --- can anyone see
 a reason not to?

Nope.

 I'm inclined to treat this as an outright bug, not just a minor
 performance issue, because it implies that a sufficiently long psql
 script would probably crash a Windows machine.

Agreed.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 AFAICS it is appropriate to move the sigsetjmp and
 setup_cancel_handler
 calls in front of the per-line loop inside MainLoop --- can anyone see
 a reason not to?

 hm. mainloop is re-entrant, right?  That means each \i would reset the
 handler...what is downside to keeping global flag?

Ah, right, and in fact I'd missed the comment at line 325 pointing out
that we're relying on the sigsetjmp to be re-executed every time
through.  That could be improved on, likely, but not right before a
release.

Does the flag need to be global?  I'm thinking

  void
  setup_cancel_handler(void)
  {
+   static bool done = false;
+
+   if (!done)
SetConsoleCtrlHandler(consoleHandler, TRUE);
+   done = true;
  }


regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Tom Lane
David Fetter [EMAIL PROTECTED] writes:
 On Fri, Nov 04, 2005 at 01:01:20PM -0500, Tom Lane wrote:
 I'm inclined to treat this as an outright bug, not just a minor
 performance issue, because it implies that a sufficiently long psql
 script would probably crash a Windows machine.

 Ouch.  In light of this, are we *sure* what we've got a is a candidate
 for release?

Sure.  This problem exists in 8.0.* too.  Pre-existing bugs don't
disqualify an RC in my mind --- we fix them and move on, same as we
would do at any other time.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Bruce Momjian
David Fetter wrote:
 On Fri, Nov 04, 2005 at 01:01:20PM -0500, Tom Lane wrote:
  Merlin Moncure [EMAIL PROTECTED] writes:
   Nailed it.
  
   problem is in mainloop.c - setup_cancel_handler.  Apparently you
   can have multiple handlers and windows keeps track of them all,
   even if they do the same thing.  Keeping track of so many system
   handles would naturally slow the whole process down.
  
  Yipes.  So we really want to do that only once.
  
  AFAICS it is appropriate to move the sigsetjmp and
  setup_cancel_handler calls in front of the per-line loop inside
  MainLoop --- can anyone see a reason not to?
  
  I'm inclined to treat this as an outright bug, not just a minor
  performance issue, because it implies that a sufficiently long psql
  script would probably crash a Windows machine.
 
 Ouch.  In light of this, are we *sure* what we've got a is a candidate
 for release?

Good point.  It is something we would fix in a minor release, so it
doesn't seem worth doing another RC just for that.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Magnus Hagander
  I'm inclined to treat this as an outright bug, not just a minor
 certainly...
 
  performance issue, because it implies that a sufficiently long psql 
  script would probably crash a Windows machine.
 
 actually, it's worse than that, it's more of a dos on the 
 whole system, as windows will eventually stop granting 
 handles, but there is a good chance of side effects on other 
 applications.

Does it actually use up *handles* there? I don't see anything in the
docs that says it should do that - and they usually do document when
handles are used. You should be seeing a *huge* increase in system
handles very fast if it does, right? 

That said, I definitly agree with calling it a bug :-)

//Magnus

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Constraint exclusion description bug?

2005-11-04 Thread Tom Lane
Peter Eisentraut [EMAIL PROTECTED] writes:
 I really haven't read up much on the constraint exclusion feature, but 
 this description in guc.c really doesn't want to make me use it. :)

 This prevents table access if the table constraints guarantee that 
 table access is necessary.

 This is a typo, right?

One hopes ;-).  I was planning to make an editorial pass over those doc
additions anyway ... will definitely get it done now.  Or are you doing
that already?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Martijn van Oosterhout
On Fri, Nov 04, 2005 at 08:38:38AM -0500, [EMAIL PROTECTED] wrote:
 On Thu, Nov 03, 2005 at 09:17:43PM -0500, Tom Lane wrote:
  Actually, the real reason we use UTF-8 and not any of the
  sorta-fixed-size representations of Unicode is that the backend is by
  and large an ASCII, null-terminated-string engine.  *All* of the
  supported backend encodings are ASCII-superset codes.  Making
  everything null-safe in order to allow use of UCS2 or UCS4 would be
  a huge amount of work, and the benefit is at best questionable.
 
 Perhaps on a side note - my intuition (which sometimes lies) would tell
 me that, if the above is true, the backend is doing unnecessary copies
 of read-only data, if only, to insert a '\0' at the end of the strings.
 Is this true?

It's not quite that bad. Obviously for all on disk datatype zeros are
allowed. Bit strings, arrays, timestamps, numerics can all have
embedded nulls and they have a length header.

Where this becomes an issue is for things like table names, field
names, encoding names, etc. The name type is a fixed length string
which is kept in a way that it can be treated as a C string. If these
could contain null characters it would get messy.

I do conceive that the backend could support a UTF-16 datatype which
would be indexable and have various support functions. But as soon as
it came to talking to clients, it would be converted back to UTF-8
because libpq treats all strings coming back as null terminated.
Similarly, querys sent couldn't be anything other than UTF-8 also.

And if users can't send or receive UTF-16 text, why should the backend
store it that way?

 I'm thinking along the lines of the other threads that speak of PostgreSQL
 being CPU or I/O bound, not disk bound, for many sorts of operations. Is
 PostgreSQL unnecessary copying string data around (and other data, I would
 assume).

Well, there is a bit of copying around while creating tuples and such,
but it's not to add null terminators.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpHxbnCTUZPz.pgp
Description: PGP signature


Re: [HACKERS] insert performance for win32

2005-11-04 Thread Merlin Moncure
 
   void
   setup_cancel_handler(void)
   {
 + static bool done = false;
 +
 + if (!done)
   SetConsoleCtrlHandler(consoleHandler, TRUE);
 + done = true;
   }
 

That works, I tried ctrl-c various ways including from within \i copy.
Problem solved!

Merlin

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread mark
On Fri, Nov 04, 2005 at 04:13:29PM +0100, Martijn van Oosterhout wrote:
 On Fri, Nov 04, 2005 at 08:38:38AM -0500, [EMAIL PROTECTED] wrote:
  On Thu, Nov 03, 2005 at 09:17:43PM -0500, Tom Lane wrote:
   Actually, the real reason we use UTF-8 and not any of the
   sorta-fixed-size representations of Unicode is that the backend is by
   and large an ASCII, null-terminated-string engine.  *All* of the
   supported backend encodings are ASCII-superset codes.  Making
   everything null-safe in order to allow use of UCS2 or UCS4 would be
   a huge amount of work, and the benefit is at best questionable.
  Perhaps on a side note - my intuition (which sometimes lies) would tell
  me that, if the above is true, the backend is doing unnecessary copies
  of read-only data, if only, to insert a '\0' at the end of the strings.
  Is this true?
 It's not quite that bad. Obviously for all on disk datatype zeros are
 allowed. Bit strings, arrays, timestamps, numerics can all have
 embedded nulls and they have a length header.

Are you and Tom conflicting in opinion? :-)

I read the backend is by and large an ASCII, null-terminated-string
engine with we use UTF-8 [for varlena strings?] as, a lot of the
code assumes varlena strings are '\0' terminated, and an assumption
on my part, that the varlena strings are not stored in the backend
with a '\0' terminator, therefore, they require being copied out,
terminated with a '\0', before they can be used?

Or perhaps I'm just confused. :-)

  I'm thinking along the lines of the other threads that speak of PostgreSQL
  being CPU or I/O bound, not disk bound, for many sorts of operations. Is
  PostgreSQL unnecessary copying string data around (and other data, I would
  assume).
 Well, there is a bit of copying around while creating tuples and such,
 but it's not to add null terminators.

How much effort (past discussions that I've missed from a decade ago? 
hehe) has been put into determining whether a zero-copy architecture,
or really, a minimum copy architecture, would address some of these
bottlenecks? Am I dreaming? :-)

Cheers,
mark

-- 
[EMAIL PROTECTED] / [EMAIL PROTECTED] / [EMAIL PROTECTED] 
__
.  .  _  ._  . .   .__.  . ._. .__ .   . . .__  | Neighbourhood Coder
|\/| |_| |_| |/|_ |\/|  |  |_  |   |/  |_   | 
|  | | | | \ | \   |__ .  |  | .|. |__ |__ | \ |__  | Ottawa, Ontario, Canada

  One ring to rule them all, one ring to find them, one ring to bring them all
   and in the darkness bind them...

   http://mark.mielke.cc/


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] insert performance for win32

2005-11-04 Thread Tom Lane
Merlin Moncure [EMAIL PROTECTED] writes:
 That works, I tried ctrl-c various ways including from within \i copy.
 Problem solved!

Good.  I've applied the patch in both HEAD and 8.0 branches.

Since we're very nearly ready to wrap 8.1, would someone with access to
a Windows machine please double-check that CVS tip still works?

regards, tom lane

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Magnus Hagander
  AFAICS it is appropriate to move the sigsetjmp and 
  setup_cancel_handler calls in front of the per-line loop inside 
  MainLoop --- can anyone see a reason not to?
 
  hm. mainloop is re-entrant, right?  That means each \i 
 would reset the 
  handler...what is downside to keeping global flag?
 
 Ah, right, and in fact I'd missed the comment at line 325 
 pointing out that we're relying on the sigsetjmp to be 
 re-executed every time through.  That could be improved on, 
 likely, but not right before a release.
 
 Does the flag need to be global?  I'm thinking
 
   void
   setup_cancel_handler(void)
   {
 + static bool done = false;
 +
 + if (!done)
   SetConsoleCtrlHandler(consoleHandler, TRUE);
 + done = true;
   }
 

Seems like a simple enough solution, don't see why it shouldn't work. As
long as psql is single-threaded, which it is...
(Actually, that code seems to re-set done=true on every call which seems
unnecessary - but that might be optimised away, I guess)

//Magnus


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Tom Lane
[EMAIL PROTECTED] writes:
 I read the backend is by and large an ASCII, null-terminated-string
 engine with we use UTF-8 [for varlena strings?] as, a lot of the
 code assumes varlena strings are '\0' terminated, and an assumption
 on my part, that the varlena strings are not stored in the backend
 with a '\0' terminator, therefore, they require being copied out,
 terminated with a '\0', before they can be used?

There are places where we have to do that, the worst from a performance
viewpoint being in string comparison --- we have to null-terminate both
values before we can pass them to strcoll().

One of the large bits that would have to be done before we could even
contemplate using UCS2/UCS4 is getting rid of our dependence on strcoll,
since its API is null-terminated-string.

 How much effort (past discussions that I've missed from a decade ago? 
 hehe) has been put into determining whether a zero-copy architecture,
 or really, a minimum copy architecture, would address some of these
 bottlenecks? Am I dreaming? :-)

We've already done it in places, for instance the new implementation
of virtual tuples in TupleTableSlots eliminates a lot of copying
of pass-by-reference values.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Martijn van Oosterhout
On Fri, Nov 04, 2005 at 01:54:04PM -0500, Tom Lane wrote:
 [EMAIL PROTECTED] writes:
  I read the backend is by and large an ASCII, null-terminated-string
  engine with we use UTF-8 [for varlena strings?] as, a lot of the
  code assumes varlena strings are '\0' terminated, and an assumption
  on my part, that the varlena strings are not stored in the backend
  with a '\0' terminator, therefore, they require being copied out,
  terminated with a '\0', before they can be used?
 
 There are places where we have to do that, the worst from a performance
 viewpoint being in string comparison --- we have to null-terminate both
 values before we can pass them to strcoll().
 
 One of the large bits that would have to be done before we could even
 contemplate using UCS2/UCS4 is getting rid of our dependence on strcoll,
 since its API is null-terminated-string.

Yeah, and while one way of removing that dependance is to use ICU, that
library wants everything in UTF-16. So we replace copying to add NULL
to string with converting UTF-8 to UTF-16 on each call. Ugh! The
argument for UTF-16 is that if you're using a language that doesn't use
ASCII at all, UTF-8 gets inefficient pretty quickly.

Locale sensetive, efficient storage, fast comparisons, pick any two!

My guess is that in the long run there would be two basic string
datatypes, one UTF-8, null terminated string used in the backend code
as a standard C string, default collation strcmp. The other UTF-16 for
user data that wants to be able to collate in a locale dependant way.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgp1SqmeHlE7l.pgp
Description: PGP signature


Re: [HACKERS] [OT] somebody could explain this?

2005-11-04 Thread Otto Hirr
Lets start with an agreed upon expert, Knuth.
The art of computer programming. Vol2,Seminumerical Algorithms.Ed2.
pg682: Precision: The number of digits in a representation.
pg212: Section: Accuracy of floating point numbers.
A rough (but reasonably useful) way to express the behavior of
floating point arithmetic can be based upon the concept of
significant figures or relative error.

The float point rep is built of a exponent and fractional part.
(Note mantissa means a worthless addition(pg199)

Most notably, the IEEE rep, either single or double, most certainly
has the ability to store the EXACT value for 0.1.

The issue is how to REPRESENT an exact value in the computer.
Some values may be represented exactly. Some values may not.
The tricky part is then operations on these values. We may have
exact representation of two values, then applying an operation
that results in a theoretical value that can only have an
approximate representation. Accuracy deals with the difference
between the theoretical exact value and our ability to
represent it.

Note that Knuth's observation about add/sub v. mul/div:

Roughly speaking, the operations of floating point multiplication
and division do not magnify the relative error by very much; but
floating point subtraction of nearly equal quantities (and floating
point addition, u+v, where u is nearly equal to -v) can very greatly
increase the relative error.

Surprised?

Things to consider:
Is the value I want to represent have, shall we say, a limit to
the number of digits required to fully define its value,
eg 0.5, 0.9987

Is the value I want to represent have no limit to the number of
digits, eg pi, etc.

Does my representation provide a value that is within acceptable
limits of error, between the rep and actual value.

!Are the operations that I am performing introducing greater
amounts of relative error!

Break out the old computer book/math book and review...
Lets not assume or re-invent the wheel.

..Otto

 I think the crucial point is that the common IEEE floating point
 formats are unable to store an EXACT representation of common
 decimal fractions (such as .1) -- they can only store an
 APPROXIMATION.
 
  Peter Eisentraut [EMAIL PROTECTED]  
 Csaba Nagy wrote:
  [snip]
 
   Floating points numbers are accurate but not precise.
 
  OK, now this one beats me... what's the difference between 
 accurate
  and exact ? I thought both mean something like correct, but
  precise refers to some action and accurate applies to a situation or
  description...
 
 Accurate means close to the real value, precise means having a lot of 
 detail.
 
 So 3 is more accurate than 4 as a representation of Pi, but 
 both are 
 not very precise.
 
 5.32290753057207250735 is a very precise representation of Pi but 
 totally inaccurate.
 
 This also means that the statement at the top is wrong.  It should be 
 the other way around.
 
 -- 
 Peter Eisentraut
 http://developer.postgresql.org/~petere/
 
 ---(end of 
 broadcast)---
 TIP 2: Don't 'kill -9' the postmaster
 
 
 ---(end of 
 broadcast)---
 TIP 6: explain analyze is your friend
 


---(end of broadcast)---
TIP 6: explain analyze is your friend


[HACKERS] Crash during elog.c...

2005-11-04 Thread Jim C. Nasby
My client (same one with the slru.c issue) has had 3 of these in the
past day...

The backtrace:
Program terminated with signal 11, Segmentation fault.
(gdb) bt
#0  0x003b8946fb20 in strlen () from /lib64/tls/libc.so.6
#1  0x003b894428dc in vfprintf () from /lib64/tls/libc.so.6
#2  0x003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
#3  0x004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e 
%s, args=0x7fbfffdb50) at stringinfo.c:125
#4  0x004ff746 in appendStringInfo (str=0x7fbfffde30, fmt=0x65f59e 
%s) at stringinfo.c:75
#5  0x005d3a26 in log_line_prefix (buf=0x7fbfffde30) at elog.c:1425
#6  0x005d4beb in EmitErrorReport () at elog.c:1465
#7  0x005d4345 in errfinish (dummy=Variable dummy is not available.
) at elog.c:382
#8  0x0056625f in exec_simple_query (
query_string=0x89e760 update summary_clicks set clicks = t.clicks, 
impressions = t.impressions, dollars = t.dollars from 
pending_summary_clicks_2005_11_02 t where summary_clicks.listingindex = 
t.listingindex and summary_cl...) at postgres.c:1030
#9  0x00567bb3 in PostgresMain (argc=4, argv=0x846380, 
username=0x846350 iacm) at postgres.c:3007
#10 0x0053acf0 in ServerLoop () at postmaster.c:2836
#11 0x0053c3f4 in PostmasterMain (argc=5, argv=0x843530) at 
postmaster.c:918
#12 0x0050806f in main (argc=5, argv=0x843530) at main.c:268
(gdb) f 3
#3  0x004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e 
%s, args=0x7fbfffdb50)
at stringinfo.c:125
125 nprinted = vsnprintf(str-data + str-len, avail, fmt, args);
(gdb) print *str
$39 = {data = 0x848030 2005-11-04 00:01:02 EST|2005-11-04 00:00:08 
EST|216.187.113.78(39476)|didit|, len = 76,
  maxlen = 256, cursor = 0}

Asserts are on, but for performance reasons the memory checking stuff is
commented out.

The good news is there's been no slru.c asserts...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] somebody could explain this?

2005-11-04 Thread Csaba Nagy
Cristian,

I bet it's related to some rounding issue and the fact that floating
formats are approximative even for small integers.
Probably 12 ands up being slightly less in floating format (something
like 11.999...), and the cast to integer is truncating it.
Not 100% sure though... read up on your API, I'm not a C programmer :-)

HTH,
Csaba.


On Fri, 2005-11-04 at 17:16, Cristian Prieto wrote:
 Hello, I'm using PostgreSQL 8.0.4 in Fedora Core 3, right now I'm learning a
 little about the postgresql internals and the way some kind of SPs could be
 written in c language; I found something really weird and I cannot explain
 to me this behavior:
 
 #include postgres.h
 #include fmgr.h
 
 PG_FUNCTION_INFO_V1(test);
 
 Datum
 repeat_item(PG_FUNCTION_ARGS)
 {
   int num_times;
   
   num_times = PG_GETARG_FLOAT8(0) * 100;
   PG_RETURN_INT32(num_times);
 }
 
 
 Inside psql this happens:
 
 # Create or replace function test(float) returns integer as 'test.so'
 language 'c' stable;
 
 select test(0.1);
 Returns 10
 
 Select test(0.11);
 Returns 11
 
 Select test(0.12)
 Returns 11
 
 Select test(0.13)
 Returns 13
 
 Select test(0.14)
 Returns 14
 
 Select test(0.15)
 Returns 14
 
 
 What Is happening here?
 
 
 ---(end of broadcast)---
 TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] [PERFORM] insert performance for win32

2005-11-04 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Magnus Hagander
 Sent: Friday, November 04, 2005 10:31 AM
 To: Tom Lane; Merlin Moncure
 Cc: pgsql-hackers@postgresql.org; pgsql-performance@postgresql.org
 Subject: Re: [HACKERS] [PERFORM] insert performance for win32
 
   AFAICS it is appropriate to move the sigsetjmp and
   setup_cancel_handler calls in front of the per-line loop inside
   MainLoop --- can anyone see a reason not to?
 
   hm. mainloop is re-entrant, right?  That means each \i
  would reset the
   handler...what is downside to keeping global flag?
 
  Ah, right, and in fact I'd missed the comment at line 325
  pointing out that we're relying on the sigsetjmp to be
  re-executed every time through.  That could be improved on,
  likely, but not right before a release.
 
  Does the flag need to be global?  I'm thinking


How about:

   void
   setup_cancel_handler(void)
   {
 +  static bool done = false;
 +
 +  if (!done)
 +{
SetConsoleCtrlHandler(consoleHandler, TRUE);
 +done = true;
 + }
   }

 Seems like a simple enough solution, don't see why it shouldn't work.
As
 long as psql is single-threaded, which it is...
 (Actually, that code seems to re-set done=true on every call which
seems
 unnecessary - but that might be optimised away, I guess)
 
 //Magnus
 
 
 ---(end of
broadcast)---
 TIP 3: Have you checked our extensive FAQ?
 
http://www.postgresql.org/docs/faq

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] somebody could explain this?

2005-11-04 Thread Otto Hirr

 What's happening here is that the multiplication, being 
 floating point,
 has some accumulated error such that when you multiply it by 100 and
 convert it to an int, it hits the cutoff.

Trivia...
I heard a story many years ago that landed a programmer in prison...
He worked on the program that calculated interest that was to be
deposited into an account. Instead of rounding or truncating the
amount beyond what the bank wanted to use, he saved the small fractional
dollars (really micro-pennies) into an account. Soon he had several
hundred thousand in the account - there is a lot of multiplication...
He got caught.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Martijn van Oosterhout kleptog@svana.org wrote:
 Yeah, and while one way of removing that dependance is to use ICU, that
 library wants everything in UTF-16. So we replace copying to add NULL
 to string with converting UTF-8 to UTF-16 on each call. Ugh! The
 argument for UTF-16 is that if you're using a language that doesn't use
 ASCII at all, UTF-8 gets inefficient pretty quickly.

Is this really the case? Only unicode values 000800 - 00 are
smaller in UTF-16 than in UTF-8, and in their case it's three bytes vs
two. Cyrilic, Arabic, Greek, Latin, etc are all two bytes in both.

So, yes in some cases UTF-8 will use three bytes where UTF-16 would be
two, but thats less inefficient than UTF-16 for ASCII, which many
people find acceptable.

 Locale sensetive, efficient storage, fast comparisons, pick any two!

I don't know that the choices are that limited, as I indicated earlier
in the thread I think it's useful to think of all of these encodings
as just different compression algorithms. If our desire was to have
all three, the backend could be made null safe and we could use the
locale-sensitive and fast representation (Probably UTF-16 or UTF-32)
in memory, and store on disk whatever is most efficient for storage.
(lz compressed UTF-whatever for fat fields, UTF-8 for mostly ascii
small fields, SCSU for non-ascii short fields
(http://www.unicode.org/reports/tr6/), etc)

 My guess is that in the long run there would be two basic string
 datatypes, one UTF-8, null terminated string used in the backend code
 as a standard C string, default collation strcmp. The other UTF-16 for
 user data that wants to be able to collate in a locale dependant way.

So if we need locale dependant colation we suffer 2x inflation for
many texts, and multibyte complexity still required if we are to
collate correctly when there are characters outside of the BMP. Yuck.

Disk storage type, memory strorage type, user API type, and collation
should be decoupled.

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Crash during elog.c...

2005-11-04 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 The backtrace:
 Program terminated with signal 11, Segmentation fault.
 (gdb) bt
 #0  0x003b8946fb20 in strlen () from /lib64/tls/libc.so.6
 #1  0x003b894428dc in vfprintf () from /lib64/tls/libc.so.6
 #2  0x003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
 #3  0x004ff420 in appendStringInfoVA (str=0x7fbfffde30, fmt=0x65f59e 
 %s, args=0x7fbfffdb50) at stringinfo.c:125

Hrm ... what's the platform again?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Tom Lane [EMAIL PROTECTED] wrote:
 Martijn van Oosterhout kleptog@svana.org writes:
  Yeah, and while one way of removing that dependance is to use ICU, that
  library wants everything in UTF-16.

 Really?  Can't it do UCS4 (UTF-32)?  There's a nontrivial population
 of our users that isn't satisfied with UTF-16 anyway, so if that really
 is a restriction of ICU then we're going to have to look elsewhere :-(

The correct question to ask is something like Does it support non-bmp
characters? or Does it really support UTF-16 or just UCS2?

UTF-16 is (now) a variable width encoding which is a strict superset
of UCS2 which allows the representation of all Unicode characters.
UCS2 is fixed width and only supports characters from the basic
multilingual plane.  UTF-32 and UCS4 are (now) effectively the same
thing and can represent all unicode characters with a 4 byte fixed
length word.

The code can demand UTF-16 but still be fine for non-BMP characters.
However, many things which claim to support UTF-16 really only support
UCS2 or at least have bugs in their handling of non-bmp characters. 
Software that supports UTF-8 is somewhat more likely to support
non-bmp characters correctly since the variable length code paths get
more of a workout in many environments. :)

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] [OT] somebody could explain this?

2005-11-04 Thread Tom Lane
Otto Hirr [EMAIL PROTECTED] writes:
 Most notably, the IEEE rep, either single or double, most certainly
 has the ability to store the EXACT value for 0.1.

Oh really?

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Tom Lane
Martijn van Oosterhout kleptog@svana.org writes:
 Yeah, and while one way of removing that dependance is to use ICU, that
 library wants everything in UTF-16.

Really?  Can't it do UCS4 (UTF-32)?  There's a nontrivial population
of our users that isn't satisfied with UTF-16 anyway, so if that really
is a restriction of ICU then we're going to have to look elsewhere :-(

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] [OT] somebody could explain this?

2005-11-04 Thread Kevin Grittner
No, the IEEE formats can not store .1 exactly.  How close it
comes depends on the rest of the number.  For single and
double precision, respectively, the IEEE representations fall
at about:

0.10001490116119384765625
0.155511151231257827021181583404541015625

Libraries must do some rounding of these approximations when
formatting for display, so the error is rarely seen, except when
conversions to other data types are done via truncation.  The
problem values from the previous example were actually stored
as something close to the following.

0.11999555910790149937383830547332763671875
0.1499944488848768742172978818416595458984375

Whole numbers, on the other hand, are always accurate unless
they exceed the number of significant digits for the format.

-Kevin


 Otto Hirr [EMAIL PROTECTED]  
Most notably, the IEEE rep, either single or double, most certainly
has the ability to store the EXACT value for 0.1.


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] insert performance for win32

2005-11-04 Thread Andrew Dunstan



Tom Lane wrote:


Merlin Moncure [EMAIL PROTECTED] writes:
 


That works, I tried ctrl-c various ways including from within \i copy.
Problem solved!
   



Good.  I've applied the patch in both HEAD and 8.0 branches.

Since we're very nearly ready to wrap 8.1, would someone with access to
a Windows machine please double-check that CVS tip still works?


 



Worked for me. See buildfarm. Or are there more tests you want run?

cheers

andrew

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Crash during elog.c...

2005-11-04 Thread Jim C. Nasby
On Fri, Nov 04, 2005 at 02:45:41PM -0500, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  The backtrace:
  Program terminated with signal 11, Segmentation fault.
  (gdb) bt
  #0  0x003b8946fb20 in strlen () from /lib64/tls/libc.so.6
  #1  0x003b894428dc in vfprintf () from /lib64/tls/libc.so.6
  #2  0x003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
  #3  0x004ff420 in appendStringInfoVA (str=0x7fbfffde30, 
  fmt=0x65f59e %s, args=0x7fbfffdb50) at stringinfo.c:125
 
 Hrm ... what's the platform again?

8-way opteron, RHEL4.

BTW, should I be opening bugs for things like this? I guess I probably
should...
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] roundoff problem in time datatype

2005-11-04 Thread Gurjeet Singh
On 10/13/05, Josh Berkus josh@agliodbs.com wrote:
 Tom,

  I think my preference is to allow '24:00:00' (but not anything larger)
  as a valid input value of the time datatypes.  This for two reasons:
* existing dump files may contain such values
* it's consistent with allowing, eg, '12:13:60', which we
  allow even though it's certainly not a valid leap second.

we shouldn't be allowing such timestamps! We should enforce only the
canonical formats of any datatype. Imagine what chaos would have been
caused if we didn't have IEEE specifications for the floating point
numbers!!!


 It's also consistent with how several other RDBMSes do things (SQL Server,
 MySQL), and several programming languages.

Just wanted to note that this is not really consistent with other
databases. For eg. SQL Server's o/p is shown below.

select convert( datetime, '23:59:59.998' )
1900-01-01 23:59:59.997

select convert( datetime, '23:59:59.999' )
1900-01-02 00:00:00.000  /* the date changes but the time remains
under 24:00:00 */

select convert( datetime, '24:00:00' )
Server: Msg 242, Level 16, State 3, Line 1
The conversion of a char data type to a datetime data type resulted in
an out-of-range datetime value.

Moreover, 24:00:00 not in canonical format so it should not be encoraged at all.

Gujreet.

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Jim C. Nasby
On Thu, Nov 03, 2005 at 04:07:41PM +0100, Marcus Engene wrote:
 Simon Riggs wrote:
 On Thu, 2005-11-03 at 11:13 -0300, Alvaro Herrera wrote:
 
 Simon Riggs wrote:
 
 On PostgreSQL, CHAR(12) is a bpchar datatype with all instantiations of
 that datatype having a 4 byte varlena header. In this example, all of
 those instantiations having the varlena header set to 12, so essentially
 wasting the 4 byte header.
 
 We need the length word because the actual size in bytes is variable,
 due to multibyte encoding considerations.
 
 
 Succinctly put, thanks.
 
 Incidentally, you remind me that other databases do *not* vary the
 character length, even if they do have varying length UTF-8 within them.
 So if you define CHAR(255) then it could blow up at a random length if
 you store UTF-8 within it.
 
 That's behaviour that I could never sanction, so I'll leave this now.
 
 Best Regards, Simon Riggs
 
 
 Just as a side note, in Oracle you can use the syntax (f.ex on on a db 
 with utf-8 charset):
 
 column VARCHAR2(10 CHAR)
 
 ...to indicate that Oracle should fit 10 characters there. It might use 
 up to 40 bytes in the db, but that's up to Oracle. If I s/10 CHAR/10, at 
 most 10 characters will fit.
 
 This works very well. The only catch is that it's not good to use more 
 than 1000 chars since oracle's varchars dont want to go past 4000 bytes.

Likewise other databases use different character types such as NCHAR
(nationalized char), which is the 16 bit variant.

I think it's perfectly acceptable to have a char type that is a
fixed-width in terms of number of bytes, so long as we provide an
alternative. Heck, in my experience char is only used to store things
like hashes that are in ASCII anyway.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Assert failure found in 8.1RC1

2005-11-04 Thread Jim C. Nasby
On Wed, Nov 02, 2005 at 06:45:21PM -0500, Tom Lane wrote:
 Robert Creager [EMAIL PROTECTED] writes:
  Ran with both for an hour with no problem, where I could produce the ASSERT
  failure within minutes for the non patched version.
 
 Great.  I'll go ahead and commit the smaller fix into HEAD and the back
 branches, and hold the larger fix for 8.2.
 
 It's curious that two different people stumbled across this just
 recently, when the bug has been there since 7.2.  I suppose that the
 addition of pg_subtrans increased the probability of seeing the bug by
 a considerable amount, but I'm still surprised it wasn't identified
 before.  At the very least, we should have heard about it earlier in
 the 8.0 release cycle ...

Well, the common theme in each case IIRC is a fairly high transaction
rate; on the order of hundreds if not thousands per second.

Could something like that be added to regression, or maybe as a seperate
test case for the buildfarm?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Jim C. Nasby
On Thu, Nov 03, 2005 at 10:32:03AM -0500, Tom Lane wrote:
 I'd feel a lot happier about this if we could keep the dynamic range
 up to, say, 10^512 so that it's still true that NUMERIC can be a
 universal parse-time representation.  That would also make it even
 more unlikely that anyone would complain about loss of functionality.

Would it be feasable to have a type that satisfies that constraint but
isn't generally intended for on-disk use? My thought is that this new
type would be used mostly for casting purposes. Kind of like the
UNKNOWNNUMBER but easier to do since it'd just be another type. (BTW,
I'm not suggesting that we disallow un-disk storage of the type, only
discourage it unless someone really, really needs an absurd number of
digits).
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 On Thu, Nov 03, 2005 at 10:32:03AM -0500, Tom Lane wrote:
 I'd feel a lot happier about this if we could keep the dynamic range
 up to, say, 10^512 so that it's still true that NUMERIC can be a
 universal parse-time representation.  That would also make it even
 more unlikely that anyone would complain about loss of functionality.

 Would it be feasable to have a type that satisfies that constraint but
 isn't generally intended for on-disk use? My thought is that this new
 type would be used mostly for casting purposes. Kind of like the
 UNKNOWNNUMBER but easier to do since it'd just be another type.

What exactly would be easier about it?  ISTM you just described
UNKNOWNNUMERIC to a T.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Assert failure found in 8.1RC1

2005-11-04 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 Could something like that be added to regression, or maybe as a seperate
 test case for the buildfarm?

If you don't have a self-contained, reproducible test case, it's a bit
pointless to suggest adding the nonexistent test case to the regression
suite.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Exclusive lock for database rename

2005-11-04 Thread Jim C. Nasby
On Thu, Nov 03, 2005 at 07:40:15PM -, Andrew - Supernews wrote:
 On 2005-11-03, Peter Eisentraut [EMAIL PROTECTED] wrote:
  Peter Eisentraut wrote:
  Someone wanted to rename a database while someone else was running a
  rather long pg_dump, so the rename had to wait, and everyone else had
  to wait for the rename because no new connections would be allowed.
 
  As an auxiliary issue, why do the new connections have to wait in this
  case?  The rename waits for the AccessShareLock of the dump to be
  released, but meanwhile new connections should be able to get
  AccessShareLocks of their own.
 
 No. New AccessShare locks block behind the pending AccessExclusive lock.
 Otherwise AccessShare locks could starve the exclusive lock forever.

I would argue that in cases like this (and 'this' means just about any
DDL, for starters) that it would be better not to block everyone until
work can actually be done. Or at least make that an option.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Jim C. Nasby
On Fri, Nov 04, 2005 at 04:30:27PM -0500, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  On Thu, Nov 03, 2005 at 10:32:03AM -0500, Tom Lane wrote:
  I'd feel a lot happier about this if we could keep the dynamic range
  up to, say, 10^512 so that it's still true that NUMERIC can be a
  universal parse-time representation.  That would also make it even
  more unlikely that anyone would complain about loss of functionality.
 
  Would it be feasable to have a type that satisfies that constraint but
  isn't generally intended for on-disk use? My thought is that this new
  type would be used mostly for casting purposes. Kind of like the
  UNKNOWNNUMBER but easier to do since it'd just be another type.
 
 What exactly would be easier about it?  ISTM you just described
 UNKNOWNNUMERIC to a T.

Apologies then; it sounded like UNKNOWNNUMERIC was going to be something
that was internal-use only, presumably making it much harder to
implement than just adding an additional type.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Assert failure found in 8.1RC1

2005-11-04 Thread Jim C. Nasby
On Fri, Nov 04, 2005 at 04:35:10PM -0500, Tom Lane wrote:
 Jim C. Nasby [EMAIL PROTECTED] writes:
  Could something like that be added to regression, or maybe as a seperate
  test case for the buildfarm?
 
 If you don't have a self-contained, reproducible test case, it's a bit
 pointless to suggest adding the nonexistent test case to the regression
 suite.

Well, for things like race conditions I don't know that you can create
reproducable test cases. My point was that this bug was exposed by
databases with workloads that involved very high transaction rates. I
know in the case of my client this is due to some sub-optimal design
decisions, and I believe the other case was similar. My suggestion is
that having a test that involves a lot of row-by-row type operations
that generate a very high transaction rate would help expose these kinds
of bugs.

Of course if someone can come up with a self-contained reproducable test
case for this race condition that would be great as well. :)
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Crash during elog.c...

2005-11-04 Thread Jim C. Nasby
On Fri, Nov 04, 2005 at 04:34:35PM -0500, Bruce Momjian wrote:
 Jim C. Nasby wrote:
  On Fri, Nov 04, 2005 at 02:45:41PM -0500, Tom Lane wrote:
   Jim C. Nasby [EMAIL PROTECTED] writes:
The backtrace:
Program terminated with signal 11, Segmentation fault.
(gdb) bt
#0  0x003b8946fb20 in strlen () from /lib64/tls/libc.so.6
#1  0x003b894428dc in vfprintf () from /lib64/tls/libc.so.6
#2  0x003b89461ba4 in vsnprintf () from /lib64/tls/libc.so.6
#3  0x004ff420 in appendStringInfoVA (str=0x7fbfffde30, 
fmt=0x65f59e %s, args=0x7fbfffdb50) at stringinfo.c:125
   
   Hrm ... what's the platform again?
  
  8-way opteron, RHEL4.
  
  BTW, should I be opening bugs for things like this? I guess I probably
  should...
 
 Nope, reporting it here is fine.

I'm soon to be AFK all weekend... is there any more info anyone wanted
about this?
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] PG 8.1 supported platforms list

2005-11-04 Thread Bruce Momjian
Chris Browne wrote:
 [EMAIL PROTECTED] (Tom Lane) writes:
  The 8.1 supported-platforms list is looking pretty good, I think -- we
  don't have updates for every single combination of OS and hardware,
  but we have updates for every OS and at least one instance of all
  supported CPU types.
 
 Not to pester overly...
 
 AIX 5.3 has had results reported, and entries for doc/FAQ_AIX
 submitted; none of that is yet included :-(.

Done.  Sorry for the delay.

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  pgman@candle.pha.pa.us   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Assert failure found in 8.1RC1

2005-11-04 Thread Andrew Dunstan



Jim C. Nasby wrote:


On Fri, Nov 04, 2005 at 04:35:10PM -0500, Tom Lane wrote:
 


Jim C. Nasby [EMAIL PROTECTED] writes:
   


Could something like that be added to regression, or maybe as a seperate
test case for the buildfarm?
 


If you don't have a self-contained, reproducible test case, it's a bit
pointless to suggest adding the nonexistent test case to the regression
suite.
   



Well, for things like race conditions I don't know that you can create
reproducable test cases. My point was that this bug was exposed by
databases with workloads that involved very high transaction rates. I
know in the case of my client this is due to some sub-optimal design
decisions, and I believe the other case was similar. My suggestion is
that having a test that involves a lot of row-by-row type operations
that generate a very high transaction rate would help expose these kinds
of bugs.

Of course if someone can come up with a self-contained reproducable test
case for this race condition that would be great as well. :)
 



These conditions make it quite unsuitable for buildfarm, which is 
designed as a thin veneer over the postgres build process, and intended 
to run anywhere you can build postgres.


Maybe you could use one of the Linux labs, since your client is on RHEL.

cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org


Re: [HACKERS] Assert failure found in 8.1RC1

2005-11-04 Thread Jim C. Nasby
On Fri, Nov 04, 2005 at 05:26:25PM -0500, Andrew Dunstan wrote:
 Well, for things like race conditions I don't know that you can create
 reproducable test cases. My point was that this bug was exposed by
 databases with workloads that involved very high transaction rates. I
 know in the case of my client this is due to some sub-optimal design
 decisions, and I believe the other case was similar. My suggestion is
 that having a test that involves a lot of row-by-row type operations
 that generate a very high transaction rate would help expose these kinds
 of bugs.
 
 Of course if someone can come up with a self-contained reproducable test
 case for this race condition that would be great as well. :)
  
 
 
 These conditions make it quite unsuitable for buildfarm, which is 
 designed as a thin veneer over the postgres build process, and intended 
 to run anywhere you can build postgres.
 
 Maybe you could use one of the Linux labs, since your client is on RHEL.

I'm not worried about my client, I'm just thinking of a way to better
ferret out bugs like this. And there's no real reason why something like
this couldn't be part of regression, or an additional build target.

BTW, I just realized that part of the answer to Tom's musing about why
this hasn't been seen before now is that few (if any) regular users are
running with asserts turned on, so odds are good that they'd never know
if this problem occured or not. Further argument for trying to test this
on the buildfarm and/or enabling assertions by default, IMHO.
-- 
Jim C. Nasby, Sr. Engineering Consultant  [EMAIL PROTECTED]
Pervasive Software  http://pervasive.comwork: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf   cell: 512-569-9461

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


[HACKERS] Possible problem with pg_reload_conf() and view pg_settings

2005-11-04 Thread Tony Caduto

Hi,
I have been playing around with pg_reload_conf() and the pg_settings view.

I understand that the pg_settings view, if updated, applies to the 
current session only.
However I was under the impression that if I did a pg_reload_conf(), the 
pg_settings view would be updated at that time, but that does not seem to
happen. 
I am running on win32, but the same thing happens on Linux.


If I restart the Postgresql service then the pg_settings view contains 
the changes I made to the postgresql.conf file.


Any ideas, does this seem like a possible bug?  It just seems to me that 
pg_settings should be updated if a pg_reload_conf() is executed.



Thanks,

Tony

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Martijn van Oosterhout
On Fri, Nov 04, 2005 at 02:58:05PM -0500, Gregory Maxwell wrote:
 The correct question to ask is something like Does it support non-bmp
 characters? or Does it really support UTF-16 or just UCS2?
 
 UTF-16 is (now) a variable width encoding which is a strict superset
 of UCS2 which allows the representation of all Unicode characters.
 UCS2 is fixed width and only supports characters from the basic
 multilingual plane.  UTF-32 and UCS4 are (now) effectively the same
 thing and can represent all unicode characters with a 4 byte fixed
 length word.

It's all on their website:

: How is a Unicode string represented in ICU?
:
: A Unicode string is currently represented as UTF-16 by default. The
: endianess of UTF-16 is platform dependent. You can guarantee the
: endianess of UTF-16 by using a converter. UTF-16 strings can be
: converted to other Unicode forms by using a converter or with the UTF
: conversion macros.
:
: ICU does not use UCS-2. UCS-2 is a subset of UTF-16. UCS-2 does not
: support surrogates, and UTF-16 does support surrogates. This means
: that UCS-2 only supports UTF-16's Base Multilingual Plane (BMP). The
: notion of UCS-2 is deprecated and dead. Unicode 2.0 in 1996 changed
: its default encoding to UTF-16.
snip
: What is the performance difference between UTF-8 and UTF-16?
:
: Most of the time, the memory throughput of the hard drive and RAM is
: the main performance constraint. UTF-8 is 50% smaller than UTF-16 for
: US-ASCII, but UTF-8 is 50% larger than UTF-16 for East and South
: Asian scripts. There is no memory difference for Latin extensions,
: Greek, Cyrillic, Hebrew, and Arabic.
snip
http://icu.sourceforge.net/userguide/icufaq.html

: Using UTF-8 strings with ICU
:
: As mentioned in the overview of this chapter, ICU and most other
: Unicode-supporting software uses 16-bit Unicode for internal
: processing. However, there are circumstances where UTF-8 is used
: instead. This is usually the case for software that does little or no
: processing of non-ASCII characters, and/or for APIs that predate
: Unicode, use byte-based strings, and cannot be changed or replaced
: for various reasons.
snip
: While ICU does not natively use UTF-8 strings, there are many ways to
: work with UTF-8 strings and ICU. The following list is probably
: incomplete.
http://icu.sourceforge.net/userguide/strings.html#strings

Basically you use a converter to process the UTF-8 strings,
prusumably converting them to UTF-16 (which is not UCS-2 as noted
above). UTF-32 needs a converter also, so no point using that either.

 The code can demand UTF-16 but still be fine for non-BMP characters.
 However, many things which claim to support UTF-16 really only support
 UCS2 or at least have bugs in their handling of non-bmp characters. 
 Software that supports UTF-8 is somewhat more likely to support
 non-bmp characters correctly since the variable length code paths get
 more of a workout in many environments. :)

I think ICU deals with that, but feel free to peruse the website
yourself...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a
 tool for doing 5% of the work and then sitting around waiting for someone
 else to do the other 95% so you can sue them.


pgpm7YasNf1Cn.pgp
Description: PGP signature


Re: [HACKERS] Seeing context switch storm with 10/13 snapshot of

2005-11-04 Thread Tom Lane
Robert Creager [EMAIL PROTECTED] writes:
 I have a Perl script, a Perl module and a 1Mb database (from pg_dump
 -F c).  Are you interested at this time in receiving this?

Sure.  Please send it off-list, of course.

 An upgrade to RC2 might occur when RC2 comes out, unless there would
 be great benefit on running my tests against 8.1RC1.

There isn't going to be an RC2, barring catastrophic problems being
identified *very* shortly ...

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Spinlocks, yet again: analysis and proposed patches

2005-11-04 Thread Mark Wong
On Thu, 03 Nov 2005 18:29:09 +
Simon Riggs [EMAIL PROTECTED] wrote:

 On Thu, 2005-11-03 at 08:03 -0800, Mark Wong wrote:
  On Tue, 01 Nov 2005 07:32:32 +
  Simon Riggs [EMAIL PROTECTED] wrote:
   Concerned about the awful checkpointing. Can you bump wal_buffers to
   8192 just to make sure? Thats way too high, but just to prove it.
   
   We need to rdeuce the number of blocks to be written at checkpoint.
   
bgwriter_all_maxpages   5  -  15
bgwriter_all_percent0.333
bgwriter_delay  200  
bgwriter_lru_maxpages   5-  7
bgwriter_lru_percent1
   
shared_buffersset lower to 10
(which should cause some amusement on-list)
  
  
  Okay, here goes, all with the same source base w/ the lw.patch:
  
  http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/44/
  only increased wal_buffers to 8192 from 2048
  3242 notpm
 
 That looks to me like a clear negative effect from increasing
 wal_buffers. Try putting it back down to 1024.
 Looks like we need to plug that gap.
 
  http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/43/
  only increased bgwriter_all_maxpages to 15, and bgwriter_lru_maxpages to 7
  3019 notpm (but more interesting graph)
 
 Man that sucks. What the heck is happening there? Hackers - if you
 watching you should see this graph - it shows some very poor behaviour.
 
 I'm not happy with that performance at all any chance you could re-
 run that exact same test to see if we can get that repeatably?
 
 I see you have 
 vm.dirty_writeback_centisecs = 0
 
 which pretty much means we aren't ever writing to disk by the pdflush
 daemons, even when the bgwriter is active.
 
 Could we set the bgwriter stuff back to default and try 
 vm.dirty_writeback_centisecs = 500

http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/47/
3309 notpm
 
  http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/45/
  Same as the previously listen run with hared_buffers lowered to 1
  2503 notpm
 
 Sorry, that was 100,000 not 10,000. 

Oops!
http://www.testing.osdl.org/projects/dbt2dev/results/dev4-014/46/
2794 notpm

 Looks like we need dates on the log_line_prefix so we can check the
 logs.

Oops again!  I didn't check to make sure I had set this correctly before
I ran the last two tests, I'll get on it.
 
 ...not sure about the oprofile results. Seems to show CreateLWLocks
 being as high as xlog_insert, which is mad. Either that shows startup
 time is excessive, or it means the oprofile timing range is too short.
 Not sure which.

Yeah, we've seen this before.  I think I'll have to try pulling the
oprofile cvs code to see if there's any improvement.  I've been working
with oprofile-0.9.1.

Mark

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Reducing the overhead of NUMERIC data

2005-11-04 Thread Gregory Maxwell
On 11/4/05, Martijn van Oosterhout kleptog@svana.org wrote:
[snip]
 : ICU does not use UCS-2. UCS-2 is a subset of UTF-16. UCS-2 does not
 : support surrogates, and UTF-16 does support surrogates. This means
 : that UCS-2 only supports UTF-16's Base Multilingual Plane (BMP). The
 : notion of UCS-2 is deprecated and dead. Unicode 2.0 in 1996 changed
 : its default encoding to UTF-16.
 snip

This means it's fine.. ICU's use of UTF-16 will not break our support
for all of unicode. Conversion too and from UTF-16 isn't cheap,
however, if you're doing it all the time. Storing ASCII in UTF-16 is
pretty lame. Widespread use of UTF-16 tends to hide bugs in the
handling of non-bmp characters. ...  I would be somewhat surprised to
see a substantial performance difference in working with UTF-16 data
over UTF-8, but then again ... they'd know and I wouldn't.

Other lame aspects of using unicode encodings other than UTF-8
internally is that it's harder to figure out what is text in GDB
output and such.. can make debugging more difficult.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Possible problem with pg_reload_conf() and view pg_settings

2005-11-04 Thread Tom Lane
Tony Caduto [EMAIL PROTECTED] writes:
 However I was under the impression that if I did a pg_reload_conf(), the 
 pg_settings view would be updated at that time, but that does not seem to
 happen. 

It works for me ...

regression=# select setting from pg_settings where name = 
'constraint_exclusion';
 setting
-
 off
(1 row)

-- edit postgresql.conf in another window, set constraint_exclusion = on

regression=# select pg_reload_conf();
 pg_reload_conf

 t
(1 row)

regression=# select setting from pg_settings where name = 
'constraint_exclusion';
 setting
-
 on
(1 row)


regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Assert failure found in 8.1RC1

2005-11-04 Thread Marc G. Fournier

On Fri, 4 Nov 2005, Jim C. Nasby wrote:


On Fri, Nov 04, 2005 at 05:26:25PM -0500, Andrew Dunstan wrote:

Well, for things like race conditions I don't know that you can create
reproducable test cases. My point was that this bug was exposed by
databases with workloads that involved very high transaction rates. I
know in the case of my client this is due to some sub-optimal design
decisions, and I believe the other case was similar. My suggestion is
that having a test that involves a lot of row-by-row type operations
that generate a very high transaction rate would help expose these kinds
of bugs.

Of course if someone can come up with a self-contained reproducable test
case for this race condition that would be great as well. :)




These conditions make it quite unsuitable for buildfarm, which is
designed as a thin veneer over the postgres build process, and intended
to run anywhere you can build postgres.

Maybe you could use one of the Linux labs, since your client is on RHEL.


I'm not worried about my client, I'm just thinking of a way to better
ferret out bugs like this. And there's no real reason why something like
this couldn't be part of regression, or an additional build target.


For all the talk about couldn't it be part of regression, I haven't seen 
anyone submit a patch that would test for it ... since I believe both you 
and Tom have both stated that for things like race conditions, I don't 
know that you can create reproducable cases, can you submit a patch for 
how you propose this should be added to the regression tests?



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Crash during elog.c...

2005-11-04 Thread Tom Lane
Jim C. Nasby [EMAIL PROTECTED] writes:
 My client (same one with the slru.c issue) has had 3 of these in the
 past day...

 (gdb) print *str
 $39 = {data = 0x848030 2005-11-04 00:01:02 EST|2005-11-04 00:00:08 
 EST|216.187.113.78(39476)|didit|, len = 76,
   maxlen = 256, cursor = 0}

Um, what's your log_line_prefix setting, and is the next format code
%i by any chance?  I've just noticed an utterly brain-dead assumption
somebody stuck into ps_status.c awhile back.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Possible problem with pg_reload_conf() and view pg_settings

2005-11-04 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] wrote in message 
news:[EMAIL PROTECTED]
 Tony Caduto [EMAIL PROTECTED] writes:
 However I was under the impression that if I did a pg_reload_conf(), the
 pg_settings view would be updated at that time, but that does not seem to
 happen.

I repeated Tony's result (Win32):

Welcome to psql 8.1RC1, the PostgreSQL interactive terminal.

Type:  \copyright for distribution terms
   \h for help with SQL commands
   \? for help with psql commands
   \g or terminate with semicolon to execute query
   \q to quit

test=# select setting from pg_settings where name = 'constraint_exclusion';
 setting
-
 off
(1 row)

test=# select pg_reload_conf();
 pg_reload_conf

 t
(1 row)

test=# select setting from pg_settings where name = 'constraint_exclusion';
 setting
-
 off
(1 row)

test=# LOG:  received SIGHUP, reloading configuration files

test=# select setting from pg_settings where name = 'constraint_exclusion';
 setting
-
 off
(1 row)

test=# select setting from pg_settings where name = 'constraint_exclusion';
 setting
-
 on
(1 row)


-- Seems that's due to delay of process SIGHUP ...




---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Possible problem with pg_reload_conf() and view pg_settings

2005-11-04 Thread Andrew Dunstan



Qingqing Zhou wrote:



test=# LOG:  received SIGHUP, reloading configuration files

test=# select setting from pg_settings where name = 'constraint_exclusion';
setting
-
off
(1 row)

test=# select setting from pg_settings where name = 'constraint_exclusion';
setting
-
on
(1 row)


-- Seems that's due to delay of process SIGHUP ...


 



What's the delay? 1s? 5? 10?

cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Possible problem with pg_reload_conf() and view

2005-11-04 Thread Qingqing Zhou



 What's the delay? 1s? 5? 10?


Delay is the time difference we kill a signal and the time we really
process it. We kill at once, but only process it at proper idle time. In
my test, the delay is 2 seconds or so. I am looking into the problem - not
sure the exact details now though ...

Regards,
Qingqing

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Seeing context switch storm with 10/13 snapshot of

2005-11-04 Thread Robert Creager
On Thu, 20 Oct 2005 17:35:31 -0400
Tom Lane [EMAIL PROTECTED] wrote:

 Robert Creager [EMAIL PROTECTED] writes:
  Interesting.  7.4.1 is worse for this test, as two jump up to 130k.  But, my
  app runs fine against 7.4.1...
 
  Would it still be helpful to try and pull together a test case from my app
  against 8.1beta3?
 
 Yes, if you can show a case where 8.1 is much worse than 7.4 then we
 need to know about it yesterday.
 

Ok.  I finally have a stand alone test case that does not show that 8.1 is worse
than 7.4.  It shows the opposite quite nicely ;-).  In tests I'm running, 7.4.1
bogs down immediately and never recovers.  8.1 behaves itself for about 20
minutes before it goes out to lunch.  So, in those terms, it is worse as it
looks fine at the start, but then gets progressively worse, but overall, it's
much better.

I have a Perl script, a Perl module and a 1Mb database (from pg_dump -F c).  Are
you interested at this time in receiving this?  I plan on taking it home and
running against my AMD system to see what it does, but that will be against 8.0
if I remember correctly, maybe 8.03.  An upgrade to RC2 might occur when RC2
comes out, unless there would be great benefit on running my tests against
8.1RC1.

Cheers,
Rob





---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Possible problem with pg_reload_conf() and view pg_settings

2005-11-04 Thread Tom Lane
Qingqing Zhou [EMAIL PROTECTED] writes:
 I repeated Tony's result (Win32):

Hmm, some delay in the signal being recognized in our Win32 signal
implementation?  Why would that be?

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Possible problem with pg_reload_conf() and view pg_settings

2005-11-04 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] wrote

 Hmm, some delay in the signal being recognized in our Win32 signal
 implementation?  Why would that be?


I believe this is a disease for all platforms, not only Windows. This is 
because the signals are asynchoronized. Think when you returned from 
kill(SIGHUP), you can't suppose that Postmaster already got it. Now when you 
do your next command very fast, then there is a chance you didn't receive 
the forwarded SIGHUP from Postmaster. But windows does make the disease 
easier to be seen.

Regards,
Qingqing 



---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] postgresql-8.1RC1 on Solaris 10, amd64x2

2005-11-04 Thread Qingqing Zhou

Cedric Berger [EMAIL PROTECTED] wrote

 I tried to stop postgress using: 'su - postgres -c pg_ctl stop -swm 
 fast'
 but that function never returns. Even kill -9 couldn't stop the process.


If repeatable, can you strace (maybe not called this in Solaris) what 
postgres is doing?

Regards,
Qingqing 



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


[HACKERS] Its a first!! We are on scheduale ...

2005-11-04 Thread Marc G. Fournier


'k, 8.1.0 is bundled ... this is effectively RC2, so *please* look it 
over, test it and make sure that we haven't missed anything ... official 
release is on Tuesday the 8th, the intervening time is meant to allow Dave 
time to get the Windows binary/installer ready, Devrim to get RPMs, Magnus 
to get Bittorrent ... and mirrors to sync up ...


I haven't created all the links on the ftp yet, will do *that* on Monday 
morning, so that the mirrors can sync up before the press release ...


Well done folks ... I think this is the first time we've actually not had 
to slip on a release :)



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Possible problem with pg_reload_conf() and view pg_settings

2005-11-04 Thread Tony Caduto

Andrew Dunstan wrote:




Qingqing Zhou wrote:



test=# LOG: received SIGHUP, reloading configuration files

test=# select setting from pg_settings where name = 
'constraint_exclusion';

setting
-
off
(1 row)

test=# select setting from pg_settings where name = 
'constraint_exclusion';

setting
-
on
(1 row)


-- Seems that's due to delay of process SIGHUP ...






What's the delay? 1s? 5? 10?

cheers

andrew

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
choose an index scan if your joining column's datatypes do not
match


hmm, I waited for at least 1 minute after doing the reload and it was 
never updated.
It seemed at the time that the only way to get a updated pg_settings 
view was to actually restart the

server. I plan on doing some more testing on Saturday.

Tony


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Possible problem with pg_reload_conf() and view pg_settings

2005-11-04 Thread Qingqing Zhou


On Fri, 4 Nov 2005, Tony Caduto wrote:


 hmm, I waited for at least 1 minute after doing the reload and it was
 never updated. It seemed at the time that the only way to get a updated
 pg_settings view was to actually restart the server. I plan on doing
 some more testing on Saturday.


Did you set constraint_exclusion = on, then comment it
#constraint_exclusion = on in the .conf file? If so, then
ProcessConfigFile() will ignore this line since it is a comment, instead
of setting back its value to off. Try to do set constraint_exclusion =
off explicitely, see if long delay ever happens.

Btw, is this considered a bug of ProcessConfigFile()? To fix it, we can
call InitializeGUCOptions() somewhere but I am not sure which values
should be untouched, like ConfigFileName ...

Regards,
Qingqing

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] PG 8.1 supported platforms list: IRIX is MIA

2005-11-04 Thread Qingqing Zhou

Tom Lane [EMAIL PROTECTED] wrote
 Can anyone test 8.1RC1 on IRIX?


On a separate matter, I did a simple regression (--enable-cassert) of cvs 
tip on SunOS 5.8, seems no problem. I found the only thing related in our 
document is that SunOS 4 is not supported in 2001. Shall we add something 
in doc?

Regards,
Qingqing 



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] somebody could explain this?

2005-11-04 Thread Greg Stark
Otto Hirr [EMAIL PROTECTED] writes:

 Trivia...
 I heard a story many years ago that landed a programmer in prison...
 He worked on the program that calculated interest that was to be
 deposited into an account. Instead of rounding or truncating the
 amount beyond what the bank wanted to use, he saved the small fractional
 dollars (really micro-pennies) into an account. Soon he had several
 hundred thousand in the account - there is a lot of multiplication...
 He got caught.

Uh yeah, that story would be the plot to Superman II (and Office Space too).


-- 
greg


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Its a first!! We are on scheduale ...

2005-11-04 Thread Tom Lane
Marc G. Fournier [EMAIL PROTECTED] writes:
 'k, 8.1.0 is bundled ...

Seems like it's time to cut the REL 8.1 branch?

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


[HACKERS] Old interfaces directory in CVS tree?

2005-11-04 Thread jtv
A libpqxx user just informed me that the anonymous CVS repository at
anoncvs.postgresql.org still contained a 2002 version of libpqxx in the
interfaces directory.  I checked it out and otherwise it seems to be the
current source tree--at least I found an 8.1 version number somewhere.

Could someone cvs remove this old libpqxx version?  Having the old code
available may still be useful, but it should not be in current checkouts!


Jeroen



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match