date:20061220

Re: [HACKERS] Companies Contributing to Open Source

2006-12-20 Thread David Fetter

On Tue, Dec 19, 2006 at 07:17:13PM -0800, Joshua D. Drake wrote:
 On Tue, 2006-12-19 at 22:04 -0500, Tom Lane wrote:
  Joshua D. Drake [EMAIL PROTECTED] writes:
   I remember the president of Great Bridge
   saying that the company needs the community, but not visa-vera --- if
   the company dies, the community keeps going (as it did after Great
   Bridge, without a hickup), but if the community dies, the company dies
   too. 
  
   I 95% agree here. If EDB or CMD were go to down in flames, it could hurt
   the community quite a bit.
  
  Josh, I hate to burst your bubble, but Great Bridge employed a much
  larger fraction of the hacker-community-at-the-time than either EDB or
  CMD do today.  We survived that, and if EDB or CMD or Greenplum or the
  entire lot went down tomorrow, we'd survive that too.
 
 I never once suggested that the community would not survive. I said it
 would hurt productivity for n amount of time.
 
 Let's just be realistic here:
 
 In one fails swoop:
 
 Devrim, Alvaro, Darcy, Heikki, Bruce, Simon, Greg, Dave, Marc and I are
 all suddenly looking for employment...

and by Friday, all are working again, unless they want to take a few
weeks off. :)

 You don't think there would be an issue that could cause some grief to
 the community?

Not really.  It might cause some personal grief to each, which might
cause some temporary loss of productivity.  Then again, it might be a
stimulating shakeup, or creative destruction, to borrow a phrase.

 Is it surmountable?  Of course, that isn't the point.  The point is
 that it is not painless. 

Nothing is painless.

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Companies Contributing to Open Source

2006-12-20 Thread David Fetter

On Tue, Dec 19, 2006 at 05:40:12PM -0500, Bruce Momjian wrote:
 Lukas Kahwe Smith wrote:
  Hi,
  
  I think another point you need to bring out more clearily is that
  the community is also often miffed if they feel they have been
  left out of the design and testing phases. This is sometimes just
  a reflex that is not always based on technical reasoning. Its just
  that as you correctly point out are worried of being high-jacked
  by companies.
 
 I hate to mention an emotional community reaction in this document.

You don't have to name it that if you don't want to, although respect
(or at least a good simulation of it) is crucial when dealing with any
person or group.  Handing the community a /fait accompli/ is a great
way to convey disrespect, no matter how well-meaning the process
originally was.

 We normally just highlight the inefficiency of a company doing
 things on their own, and the wasted effort of them having to make
 adjustments.

Would it really hurt to touch on some of the whys of this?

Cheers,
D
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

[HACKERS] Doc bug

2006-12-20 Thread Gurjeet Singh


The documentation at
http://www.postgresql.org/docs/8.2/interactive/config-setting.html states
that:

Boolean values may be written as ON, OFF, TRUE, FALSE, YES, NO, 1, 0 (all
case-insensitive) or any unambiguous prefix of these.

But the following doesn't work:

postgres=# set enable_seqscan = of;
ERROR:  parameter enable_seqscan requires a Boolean value
postgres=#

'of' is an unambiguous prefix of OFF, but it clearly doesn't work. Is it the
documentation that needs fix or is it the code?

I tried the following too:

set enable_seqscan = of; -- doesn't work
set enable_seqscan = off; -- works

BTW, I tried TR, TRU, FA, FAL, FALS, YE. They all work fine.

Best regards,

--
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | yahoo }.com

Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Martijn van Oosterhout

On Wed, Dec 20, 2006 at 01:39:58AM +, Tom Dunstan wrote:
 Not with this patch, and AFAIK not possible generally, without writing 
 separate I/O functions for each type. I'd love to be able to do that, 
 but I don't think it's possible currently. The main stumbling block is 
 the output function (and cast-to-text function), because output 
 functions do not get provided the oid of the type that they're dealing 
 with, for security reasons IIRC. It was never clear to me why I/O 
 functions should ever be directly callable by a user (and hence open to 
 security issues), but apparently it was enough to purge any that were 
 designed like that from the system, so I wasn't going to go down that 
 road with the patch.

I worked around this in taggedtypes by indeed creating seperate copies
of the i/o functions on demand and at execution time looking up the
required type from the function signiture.

The only solution indeed is to change the calling convention if the I/O
functions so that the relevent datatype oid stored in a safe place,
that isn't set for normal function calls.

BTW, being able to call type i/o functions directly is very useful. For
example date_in(text_out(blah)) is a form of cast between types that
don't usually have a cast. If you change the calling convention as
indicated, that trick will still work, just not for types with the
restricted i/o functions.

Also, it's not just I/O functions that are the issue, consider the
enum-to-integer cast.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Takayuki Tsunakawa

Hello, Itagaki-san, all

I have to report a sad result.  Your patch didn't work.  Let's
consider the solution together.  What you are addressing is very
important for the system designers in the real world -- smoothing
response time.

Recall that unpatched PostgreSQL showed the following tps's in case
(1) (i.e. with default bgwriter_* and checkpoint_* settings.)

235  80  226 77  240

The patched PostgreSQL showed the following tps's:

230  228  77  209  66

[disk usage]
The same tendency can be seen as with the unpatched PostgreSQL.  That
is:
When the tps is low, the %util of disk for data files is high, and
%util of disk for WAL is low.  Why is transaction logging is disturbed
by cleaning and/or syncing activity?  While the bgwriter is
fsync()ing, it does not lock any data structures that the transactions
want to access.  Even though they share the same SCSI controller and
bus, they are different disks.  The bandwidth does not appear to be
exhausted, since Ultra320 is said to have 256MB band width in
practice.
(Recall that WAL is on sdd and data files are on sde.)

Device:rrqm/s wrqm/s   r/s   w/s  rsec/s  wsec/srkB/swkB/s
avgrq-sz avgqu-sz   await  svctm  %util
sdd  0.00 810.78  0.00 102.200.00 7306.99 0.00
3653.4971.50 1.12   10.95   7.32  74.77
sde  0.00  25.35  0.00  6.190.00  252.30 0.00   126.15
40.77 0.50   81.32   5.94   3.67
sdd  0.00 884.20  0.00 126.000.00 8080.00 0.00
4040.0064.13 1.26   10.00   7.11  89.64
sde  0.00  21.40  0.00  5.000.00  211.20 0.00   105.60
42.24 0.31   62.56   6.52   3.26
sdd  0.00 924.80  0.00 116.200.00 8326.40 0.00
4163.2071.66 1.23   10.59   7.37  85.64
sde  0.00  27.60  0.00 26.600.00  433.60 0.00   216.80
16.30 4.24  159.29   2.44   6.50
sdd  0.00 721.20  0.00 102.400.00 6588.80 0.00
3294.4064.34 0.999.71   7.07  72.40
sde  0.00 1446.80  0.00 101.600.00 20289.60 0.00
10144.80   199.70  1192.40  572.45   2.29  23.30
sdd  0.00   0.00  0.00  0.200.001.60 0.00 0.80
8.00 0.11  539.00 539.00  10.80
sde  0.00   0.00  0.00 452.100.000.00 0.00
0.00 0.00  3829.57 3715.83   2.22 100.22
sdd  0.00   0.00  0.00  0.000.000.00 0.00 0.00
0.00 0.000.00   0.00   0.00
sde  0.00   0.00  0.00 349.800.000.00 0.00
0.00 0.00  1745.52 8515.74   2.86 100.02
sdd  0.00 442.40  0.00 51.000.00 3948.80 0.00  1974.40
77.43 0.60   11.73   7.54  38.46
sde  0.00   2.80  0.00 184.000.00   25.60 0.00
12.80 0.14   277.52 12629.41   3.19  58.74
sdd  0.00 898.00  0.00 124.800.00 8182.40 0.00
4091.2065.56 1.30   10.40   7.24  90.30
sde  0.00  19.20  0.00  3.800.00  184.00 0.0092.00
48.42 0.24   62.11  14.11   5.36
sdd  0.00 842.28  0.00 109.020.00 7612.02 0.00
3806.0169.82 1.33   12.26   8.35  91.02
sde  0.00  45.49  0.00 46.890.00  739.08 0.00   369.54
15.76 9.04  192.73   3.38  15.85
sdd  0.00 1198.41  0.00 71.510.00 10505.18 0.00
5252.59   146.90   128.19   99.76  13.48  96.43
sde  0.00 1357.77  0.00 199.800.00 19263.75 0.00
9631.8796.41  2251.09 1179.42   2.39  47.81
sdd  0.00   0.00  0.00  7.200.000.00 0.00 0.00
0.00   203.87 5671.83 138.92 100.02
sde  0.00   0.00  0.00 409.600.000.00 0.00
0.00 0.00  3171.04 4779.83   2.44 100.02
sdd  0.00   0.00  0.00 17.800.000.00 0.00 0.00
0.00   137.87 10240.90  56.19 100.02
sde  0.00   0.00  0.00 240.600.000.00 0.00
0.00 0.00  1573.85 9815.29   4.16 100.02
sdd  0.00 109.80  0.00 35.400.00 1012.80 0.00   506.40
28.6142.14 7974.47  27.86  98.64
sde  0.00   2.80  0.00 198.800.00   30.40 0.00
15.20 0.15   428.49 14474.39   4.30  85.56
sdd  0.00 466.20  0.00 62.800.00 4230.40 0.00  2115.20
67.36 0.599.49   6.79  42.62
sde  0.00   5.20  0.00  0.800.00   48.00 0.0024.00
60.00 0.01   16.25  11.25   0.90
sdd  0.00   0.00  0.00  0.200.001.60 0.00 0.80
8.00 0.01   35.00  35.00   0.70
sde  0.00   0.00  0.00  0.000.000.00 0.00 0.00
0.00 0.000.00   0.00   0.00


I suspect that fsync() is the criminal as I've been afraid.  I'll show
you an interesting data.  I collected the stack traces of backend
processes while a checkpoint is happening.


[bgwriter]
Oh, he is fsync()ing hard.

#0  0x003a629bfbb2 in __fsync_nocancel () from
/lib64/tls/libc.so.6
#1  0x005742a1 in mdsync ()
#2  0x005753d7 in smgrsync ()
#3  0x00564d65 in FlushBufferPool ()
...

[some backends]
They are forced to wait for some lock pertaining to WAL when they try
to insert a log record.

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Martijn van Oosterhout

On Wed, Dec 20, 2006 at 08:10:56PM +0900, Takayuki Tsunakawa wrote:
 One question is the disk utilization.  While bgwriter is fsync()ing,
 %util of WAL disk drops to almost 0.  But the the bandwidth of
 Ultra320 SCSI does not appear to be used fully.  Any idea?

That implies that fsyncing a datafile blocks fsyncing the WAL. That
seems terribly unlikely (although...). What OS/Kernel/Filesystem is
this. I note a sync bug in linux for ext3 that may have relevence.

Have a nice day,

-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Russell Smith


Tom Lane wrote:

Stephen Frost [EMAIL PROTECTED] writes:
  

Force references to go through macros which implement the lookup for the
appropriate type?  ie: LOGICAL_COL(table_oid,2) vs.
PHYSICAL_COL(table_oid,1)  Perhaps that's too simplistic.



It doesn't really address the question of how you know which one to
use at any particular line of code; or even more to the point, what
mechanism will warn you if you use the wrong one.

My gut feeling about this is that we could probably enforce such a
distinction if we were using C++, but while coding in C I have no
confidence in it.  (And no, that's not a vote to move to C++ ...)
  

What about a comprimise...

The 8.1 documentation for ALTER TABLE states the following.

Adding a column with a non-null default or changing the type of an 
existing column will require the entire table to be rewritten. This may 
take a significant amount of time for a large table; and it will 
temporarily require double the disk space.



Now, we are rewriting the table from scratch anyway, the on disk format 
is changing.  What is stopping us from switching the column order at the 
same time.  The only thing I can think is that the catalogs will need 
more work to update them.  It's a middle sized price to pay for being 
able to reorder the columns in the table.  One of the problems I have is 
wanting to add a column in the middle of the table, but FK constraints 
stop me dropping the table to do the reorder.  If ALTER TABLE would let 
me stick it in the middle and rewrite the table on disk, I wouldn't 
care.  It's likely that I would be rewriting the table anyway.  And by 
specifying AT POSITION, or BEFORE/AFTER you know for big tables it's 
going to take a while.


Not that I'm able to code this at all, but I'm interested in feedback on 
this option.


Regards

Russell Smith

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


  



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly

Fw: Fw: [HACKERS] choosing use an index or not

2006-12-20 Thread Felipe Rondon Rocha

Hi, 


can you halp me?

Thks,
Felipe
- Original Message - 
From: Martijn van Oosterhout kleptog@svana.org

To: Felipe Rondon Rocha [EMAIL PROTECTED]
Sent: Tuesday, December 19, 2006 8:39 AM
Subject: Re: Fw: [HACKERS] choosing use an index or not


No idea, ask on -hackers...

I only know the concepts of how it works, I have no idea about the code
itself.

Have a nice day,

On Tue, Dec 19, 2006 at 08:39:12AM -0300, Felipe Rondon Rocha wrote:

Hi Martijn,

my problem is that our group of study made a modification to always use the 
index on a select explain ... statement. But as a matter of fact we found 
some bugs that have to be fixed.  I know how the choice of an index 
happens. What I need to find is the moment that PostgreSQL compares the 
costs of all trees (paths) and decide to use the index or not. Do you have 
any tips?


Thank you, very much.

Regards,
Felipe

- Original Message - 
From: Martijn van Oosterhout kleptog@svana.org

To: Felipe Rondon Rocha [EMAIL PROTECTED]
Cc: pgsql-hackers@postgresql.org
Sent: Thursday, December 14, 2006 9:56 AM
Subject: Re: [HACKERS] choosing use an index or not




On Thu, Dec 14, 2006 at 10:06:45AM -0300, Felipe Rondon Rocha wrote:
Hi everyone,

Do you know in which part of the file analyze.c, PostgreSQL decides
to use or not an index? I?m making a test and I can?t trace the
moment that he makes the choice.

It doesn't explicitly make the choice anywhere. The planner makes a set
of paths that can be used. So for each table a path is made for a
sequential scan and one or more paths for each index. For each a cost
is calcualted and whichever has the lowest cost is the one chosen.

It's quite dynamic, the choice also depends on what happens to the
result. There's no point using an index if the result is in the wrong
order for the join, for example.

Have a nice day,
--
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
From each according to his ability. To each according to his ability to 
litigate.




--
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/

From each according to his ability. To each according to his ability to 
litigate.


signature.asc
Description: Binary data

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] psql: core dumped

2006-12-20 Thread Mario


On 19/12/06, Alvaro Herrera [EMAIL PROTECTED] wrote:


I think the problem Mario is really trying to solve is quitting at
psql's Password:  prompt.  Ctrl-C is ignored at that point apparently.
SIGQUIT (thus Ctrl-\ in most people's setup) does it but it also dumps
core.



 yes, that is true and also when you are is psql prompt too. But, I
had no idea that a core dump is normal with a SIGQUIT signal.

 Thank you all for your time and patience; this was my first try
with a postgres patch, I'll try with something else in other chance
:-)





--
http://www.advogato.org/person/mgonzalez/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] psql: core dumped

2006-12-20 Thread Mario


On 19/12/06, Andrew Dunstan [EMAIL PROTECTED] wrote:


This normally a SIGQUIT, and on my machine at least the default action for
that is a core dump. Perhaps you need to say what you are trying to do and
why.



 I'd like to help :-)   I wanted to avoid a core dumped but you told
me that's a normal thing for a SIGQUIT signal.





--
http://www.advogato.org/person/mgonzalez/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] [PATCHES] Load distributed checkpoint patch

2006-12-20 Thread Takayuki Tsunakawa

From: ITAGAKI Takahiro [EMAIL PROTECTED]
 Bruce Momjian [EMAIL PROTECTED] wrote:
 Do you use the same delay autovacuum uses?

 What do you mean 'the same delay'? Autovacuum does VACUUM, not
CHECKPOINT.
 If you think cost-based-delay, I think we cannot use it here. It's
hard to
 estimate how much checkpoints delay by cost-based sleeping, but we
should
 finish asynchronous checkpoints by the start of next checkpoint. So
I gave
 priority to punctuality over load smoothing.

I consider that smoothing the load (more meaningfully, response time)
has higher priority over checkpoint punctuality in a practical sense,
because the users of a system benefit from good steady response and
give good reputation to the system.  If the checkpoint processing is
not punctual, crash recovery would take longer time.  But which would
you give higher priority, the unlikely event (=crash of the system) or
likely event (=peek hours of the system)?  I believe the latter should
be regarded.  The system can write dirty buffers after the peek hours
pass.  User experience should be taken much case of.



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: Fw: Fw: [HACKERS] choosing use an index or not

2006-12-20 Thread Gurjeet Singh


As a starting point, you can look at these two functions:

src/backend/optimizer/path/indxpath.c : find_usable_indexes()
src/backend/optimizer/util/pathnode.c : add_path()

Hope this helps.


On 12/20/06, Felipe Rondon Rocha [EMAIL PROTECTED] wrote:


Hi,

can you halp me?

Thks,
Felipe
- Original Message -
From: Martijn van Oosterhout kleptog@svana.org
To: Felipe Rondon Rocha [EMAIL PROTECTED]
Sent: Tuesday, December 19, 2006 8:39 AM
Subject: Re: Fw: [HACKERS] choosing use an index or not




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match







--
[EMAIL PROTECTED]
[EMAIL PROTECTED] gmail | hotmail | yahoo }.com

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Martijn van Oosterhout

On Tue, Dec 19, 2006 at 11:29:24PM -0500, Tom Lane wrote:
 Stephen Frost [EMAIL PROTECTED] writes:
  Force references to go through macros which implement the lookup for the
  appropriate type?  ie: LOGICAL_COL(table_oid,2) vs.
  PHYSICAL_COL(table_oid,1)  Perhaps that's too simplistic.
 
 It doesn't really address the question of how you know which one to
 use at any particular line of code; or even more to the point, what
 mechanism will warn you if you use the wrong one.

There's one method: Set it up so that when you create a table, it
randomizes the order of the fields on disk. Obviously for production
this isn't smart, but it would test the code a lot. Though in the
regression tests many tables only have one column so they won't be
affected.

If we had unit tests you could create a function called
heap_mangle_tuple which simply does physical reordering but logically
does nothing and feed it in at each point to check the code is
invarient.

Another approach is to number logical columns starting at 1000. This
would mean that at a glance you could tell what you're talking about.
And code using the wrong one will do something obviously bad. If
performance is an issue you could only enable the offset for
--enable-assert builds.

Personally I like this approach because it would encourage everyone
to use the macro to access the fields, since not doing so will place a
constant in an obvious place. It's also trivial for the system to
check.

Personally I'm unsure of the scope of the problem. AFAICS there's
hardly anywhere that would use physical offsets...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Takayuki Tsunakawa

 That implies that fsyncing a datafile blocks fsyncing the WAL. That
 seems terribly unlikely (although...). What OS/Kernel/Filesystem is
 this. I note a sync bug in linux for ext3 that may have relevence.

Oh, really?  What bug?  I've heard that ext3 reports wrong data to
iostat when it performs writes (the data is correct when performing
reads.)
My env is:

OS: RHEL 4.0 for AMD64/EM64T
kernel: 2.6.9-42.ELsmp
The file system is ext3.

Terribly unlikely?  But I've seen the disk utilization quite often.




---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Peter Eisentraut

Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
 If you can show me a reasonably bulletproof or machine-checkable way to
 keep the two kinds of column numbers distinct, I'd be all for it.

The only way I can see is to make the domains of the numbers distinct.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] effective_cache_size vs units

2006-12-20 Thread Tom Dunstan


Tom Lane wrote:


(Hmm, I wonder what Tom Dunstan's enum patch does about case
sensitivity...)


Currently enum labels are case sensitive. I was a bit ambivalent about 
it... case insensitivity can lead to less surprises in some cases, but 
many programming languages that have enums are case sensitive, and so 
this wouldn't be a direct map for them. OTOH, if someone's doing evil 
things like sticking labels that differ only in case into an enum, 
perhaps they *should* be dissuaded. :)


The question is where does it end, though? Should we treat letters with 
accents and umlauts as equivalent as well? Do we remove punctuation 
characters? It gets into a (for me) more murky localization issue, and 
I'm not familiar with the postgresql apis for handling that. Maybe it's 
easy.


Since we basically accept any old thing into an enum label, I think we 
probably shouldn't muck with it. If we want to have some sort of 
normalized version, then we should probably restrict the characters that 
we accept fairly severely.


Also note that enum values are far more likely to be set by application 
code than by a human typing the value in directly, so in that sense the 
need for case insensitivity seems somewhat diminished.


I suppose we should think about mysql refugees at some point, though. I 
wonder what they do. The documentation is silent on the matter (and all 
their examples are in lower case). Mysql is generally case insensitive, 
right?


Cheers

Tom

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] Interface for pg_autovacuum

2006-12-20 Thread Florian G. Pflug


Jim Nasby wrote:
I'm teaching a class this week and a student asked me about OIDs. He 
related the story of how in Sybase, if you moved a database from one 
server from another, permissions got all screwed up because user IDs no 
longer matched. I explained that exposing something like an integer ID 
in a user interface or an API is just a bad idea and PostgreSQL doesn't 
do that.


Then I got to pg_autovacuum

So... is there any reason there isn't a prescribed interface to 
pg_autovacuum that doesn't expose vacrelid? Can we get that added to TODO?


Wouldn't it be sufficient to change the type of vacrelid from oid
to regclass? Then just dumping and restoring pg_autovacuum like any
other table should Just Work.

greetings, Florian Pflug

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Martijn van Oosterhout

On Wed, Dec 20, 2006 at 07:20:14AM -0600, Kenneth Marshall wrote:
 On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote:
  Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
   If you can show me a reasonably bulletproof or machine-checkable way to
   keep the two kinds of column numbers distinct, I'd be all for it.
  
  The only way I can see is to make the domains of the numbers distinct.
  
 Negative vs. positive numbers?

Negative is used by system columns. Just adding some large constant
(say 1) should be enough.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Martijn van Oosterhout

On Wed, Dec 20, 2006 at 09:14:50PM +0900, Takayuki Tsunakawa wrote:
  That implies that fsyncing a datafile blocks fsyncing the WAL. That
  seems terribly unlikely (although...). What OS/Kernel/Filesystem is
  this. I note a sync bug in linux for ext3 that may have relevence.
 
 Oh, really?  What bug?  I've heard that ext3 reports wrong data to
 iostat when it performs writes (the data is correct when performing
 reads.)

I was referring to this in the 2.6.6 changelog:

http://www.linuxhq.com/kernel/changelog/v2.6/6/index.html

   ext3's fsync/fdatasync implementation is currently syncing the inode via a
   full journal commit even if it was unaltered.
   
However you're running a later version so that's not it.

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Peter Eisentraut

Am Mittwoch, 20. Dezember 2006 14:20 schrieb Kenneth Marshall:
 On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote:
  Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
   If you can show me a reasonably bulletproof or machine-checkable way to
   keep the two kinds of column numbers distinct, I'd be all for it.
 
  The only way I can see is to make the domains of the numbers distinct.

 Negative vs. positive numbers?

That would be an obvious choice, but negative column numbers are already in 
use for system columns.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Andrew Dunstan


Martijn van Oosterhout wrote:

On Wed, Dec 20, 2006 at 07:20:14AM -0600, Kenneth Marshall wrote:
  

On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote:


Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
  

If you can show me a reasonably bulletproof or machine-checkable way to
keep the two kinds of column numbers distinct, I'd be all for it.


The only way I can see is to make the domains of the numbers distinct.

  

Negative vs. positive numbers?



Negative is used by system columns. Just adding some large constant
(say 1) should be enough.

Have a nice day,
  


Or we could divide the positive number space in two, by starting at 2^14 
(attnums are int2). Then a simple bitmask test would work to distinguish 
them.



cheers

andrew

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Stephen Frost

* Tom Lane ([EMAIL PROTECTED]) wrote:
 Stephen Frost [EMAIL PROTECTED] writes:
  Force references to go through macros which implement the lookup for the
  appropriate type?  ie: LOGICAL_COL(table_oid,2) vs.
  PHYSICAL_COL(table_oid,1)  Perhaps that's too simplistic.
 
 It doesn't really address the question of how you know which one to
 use at any particular line of code; or even more to the point, what
 mechanism will warn you if you use the wrong one.

That'd be the point of doing the typing, you then declare functions as
accepting the type and then if someone passes the wrong type to a
function the compiler will complain.  Inside of a particular function it
would hopefully be easier to keep it clear.  I'd think that most
functions would deal with one type or the other (which would be declared
in the arguments or in the local variables) and that functions which
have to deal with both would be able to keep them straight.

 My gut feeling about this is that we could probably enforce such a
 distinction if we were using C++, but while coding in C I have no
 confidence in it.  (And no, that's not a vote to move to C++ ...)

I need to go research what Linux does for this because aiui it's
pretty good about being able to enforce better type-checking than the
stock C types.  The only downside is that I *think* it might be a
GCC-only thing.  In that case I'd think we would still use it but build
some macros which essentially disable it for non-GCC compilers.  As a
mainly-for-developers compile-time check I think as long as a build-farm
member is running GCC and complaining when there are errors (and it can
be disabled on non-GCC compilers) we won't lose any portability from it.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Andrew Dunstan


Russell Smith wrote:

Tom Lane wrote:

Stephen Frost [EMAIL PROTECTED] writes:
 
Force references to go through macros which implement the lookup for 
the

appropriate type?  ie: LOGICAL_COL(table_oid,2) vs.
PHYSICAL_COL(table_oid,1)  Perhaps that's too simplistic.



It doesn't really address the question of how you know which one to
use at any particular line of code; or even more to the point, what
mechanism will warn you if you use the wrong one.

My gut feeling about this is that we could probably enforce such a
distinction if we were using C++, but while coding in C I have no
confidence in it.  (And no, that's not a vote to move to C++ ...)
  

What about a comprimise...

The 8.1 documentation for ALTER TABLE states the following.

Adding a column with a non-null default or changing the type of an 
existing column will require the entire table to be rewritten. This 
may take a significant amount of time for a large table; and it will 
temporarily require double the disk space.



Now, we are rewriting the table from scratch anyway, the on disk 
format is changing.  What is stopping us from switching the column 
order at the same time.  The only thing I can think is that the 
catalogs will need more work to update them.  It's a middle sized 
price to pay for being able to reorder the columns in the table.  One 
of the problems I have is wanting to add a column in the middle of the 
table, but FK constraints stop me dropping the table to do the 
reorder.  If ALTER TABLE would let me stick it in the middle and 
rewrite the table on disk, I wouldn't care.  It's likely that I would 
be rewriting the table anyway.  And by specifying AT POSITION, or 
BEFORE/AFTER you know for big tables it's going to take a while.




This isn't really a compromise. Remember that this discussion started 
with consideration of optimal record layout (minimising space use by 
reducing or eliminating alignment padding). The above proposal really 
does nothing for that.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] effective_cache_size vs units

2006-12-20 Thread Kenneth Marshall

On Tue, Dec 19, 2006 at 10:12:34PM +, Gregory Stark wrote:
 
 Tom Lane [EMAIL PROTECTED] writes:
 
  Magnus Hagander [EMAIL PROTECTED] writes:
  Oh, you mean MB vs Mb. Man, it had to be that simple :)
 
  ISTM we had discussed whether guc.c should accept units strings in
  a case-insensitive manner, and the forces of pedantry won the first
  round.  Shall we reopen that argument?
 
 Nope, I just checked back in the archive and that's not what happened. There
 was an extended discussion about whether to force users to use the silly KiB,
 MiB, etc units. Thankfully the pedants lost that round soundly.
 
 There was no particular discussion about case sensitivity though Simon made
 the case for user-friendly behaviour:
 
  I think we are safe assume to that
  
  kB = KB = kb = Kb = 1024 bytes
  
  mB = MB = mb = Mb = 1024 * 1024 bytes
  
  gB = GB = gb = Gb = 1024 * 1024 * 1024 bytes
  
  There's no value in forcing the use of specific case and it will be just
  confusing for people.
 
 http://archives.postgresql.org/pgsql-hackers/2006-07/msg01253.php
 
 And Jim Nasby said something similar:
 
  Forcing people to use a specific casing scheme is just going to lead to
  confusion and user frustration. If there's not a very solid *functional*
  argument for it, we shouldn't do it. Wanting to enforce a convention that
  people rarely use isn't a good reason.
 
 http://archives.postgresql.org/pgsql-hackers/2006-07/msg01355.php
 
 There was a lone comment from Thomas Hallgren in favour of case sensitivity in
 the name of consistency. But Nasby's comment was directly in response and
 nobody else piped up after that.
 

My one comment is that a little 'b' is used to indicate bits normally
and a capital 'B' is used to indicate bytes. So
   kb = '1024 bits'
   kB = '1024 bytes'
...

I do think that whether or not the k/m/g is upper case or lower case
is immaterial.

Ken
 -- 
   Gregory Stark
   EnterpriseDB  http://www.enterprisedb.com
 
 ---(end of broadcast)---
 TIP 4: Have you searched our list archives?
 
http://archives.postgresql.org
 

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Kenneth Marshall

On Wed, Dec 20, 2006 at 01:26:59PM +0100, Peter Eisentraut wrote:
 Am Mittwoch, 20. Dezember 2006 04:44 schrieb Tom Lane:
  If you can show me a reasonably bulletproof or machine-checkable way to
  keep the two kinds of column numbers distinct, I'd be all for it.
 
 The only way I can see is to make the domains of the numbers distinct.
 
Negative vs. positive numbers?

Ken

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

VS: [HACKERS] Companies Contributing to Open Source

2006-12-20 Thread Heikki Linnakangas

Jonah Harris wrote:
 We could also mention all the Ingres-based offshoots that were
 commercial.  Let me think of some other examples... but there may not
 be that many seeing as there aren't really that many PostgreSQL-like
 communities.  I guess I could mention more if I had a clear
 understanding of what we mean when we say, community.

That's something that bothers me as well: The article and the discussions talks 
about the community, but what's The Community? There really isn't any clear 
definition, the closest thing is the list of committers or core members, but I 
don't think anyone considers The Community to be just the committers, though 
that's the group of people whose opinions matter the most when trying to get a 
patch accepted. I don't think it's fruitful to spend time on a precise 
definition, but it's important to realize that there isn't one and that the 
community consists of individuals with different priorities, opinions and 
points of view. Therefore it's a bit meaningless to say that The Community 
thinks this or The Community says that. On one topic, some people might have a 
very strong opinion one way, and others might just not care at all. On some 
topics, everyone agrees. And on some topics, people strongly disagree. And 
that's ok. Respecting all the different viewpoints leads to a well-balanced 
product.

How does that affect a company trying to get a patch accepted? First, do no 
harm. If you're proposing something that for example brakes someone else's 
application, your proposal is likely to be rejected. Or if you're proposing a 
patch that increases the performance of something, at a very high cost on some 
other things, your patch is likely to be rejected. Another kind of harm that 
many people miss is the maintainability of the codebase. Adding complexity for 
little gain is likely to be rejected, just because it'll make the code harder 
to read.

Secondly, getting a large feature accepted is easier if you're not just dumping 
a large patch to pgsql-patches, but you're committed to maintainting it and 
developing it further. Remember, some  things you might have ignored as not 
important might be crucial to other people.

Also, a note to all Members of The Community: people like to work in different 
ways. Some might want to seek acceptance and commitment to a feature from 
others before starting development. Some might want to write a large up-front 
design document before proposing something. Some might want to write an 
experimental patch with a lot of quick hacks and no comments, and refine that 
according to feedback. And we, The Members of The Community, if I may count 
myself as one, don't get to choose how others prefer to work.

--
 Heikki Linnakangas
 EnterpriseDB   http://www.enterprisedb.com

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Stephen Frost

* Andrew Dunstan ([EMAIL PROTECTED]) wrote:
 This isn't really a compromise. Remember that this discussion started 
 with consideration of optimal record layout (minimising space use by 
 reducing or eliminating alignment padding). The above proposal really 
 does nothing for that.

While I agree that's how the discussion started the column ordering
issue can stand on its own and any proposal which provides that feature
should be considered.  I don't think we should throw out the
rewrite-the-table idea because it doesn't solve other problems.

Thanks,

Stephen


signature.asc
Description: Digital signature

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Richard Huxton


Andrew Dunstan wrote:


Or we could divide the positive number space in two, by starting at 2^14 
(attnums are int2). Then a simple bitmask test would work to distinguish 
them.


Perhaps divide-by-four, then it would be possible to have calculated 
columns (as mentioned recently on one of the lists). In particular, that 
would let you have FK constraints with a constant as part of the key.


--
  Richard Huxton
  Archonet Ltd

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] [PATCHES] Load distributed checkpoint patch

2006-12-20 Thread Bruce Momjian

ITAGAKI Takahiro wrote:
 Bruce Momjian [EMAIL PROTECTED] wrote:
 
  OK, if I understand correctly, instead of doing a buffer scan, write(),
  and fsync(), and recyle the WAL files at checkpoint time, you delay the
  scan/write part with the some delay.
 
 Exactly. Actual behavior of checkpoint is not changed by the patch. Compared
 with existing checkpoints, it just takes longer time in scan/write part.
 
  Do you use the same delay autovacuum uses?

Sorry, I meant bgwriter delay, not autovauum.

 What do you mean 'the same delay'? Autovacuum does VACUUM, not CHECKPOINT.
 If you think cost-based-delay, I think we cannot use it here. It's hard to
 estimate how much checkpoints delay by cost-based sleeping, but we should
 finish asynchronous checkpoints by the start of next checkpoint. So I gave
 priority to punctuality over load smoothing.

OK.

  As I remember, often the checkpoint is caused because
  we are using the last WAL file.  Doesn't this delay the creation of new
  WAL files by renaming the old ones to higher numbers (we can't rename
  them until the checkpoint is complete)?
 
 Checkpoints should be done by the next one, so we need WAL files for two
 checkpoints. It is the same as now.

Ah, OK, so we already reserve a full set of WAL files while we are
waiting for the checkpoint to complete.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] effective_cache_size vs units

2006-12-20 Thread Peter Eisentraut

Am Mittwoch, 20. Dezember 2006 13:42 schrieb Tom Dunstan:
 I suppose we should think about mysql refugees at some point, though. I
 wonder what they do. The documentation is silent on the matter (and all
 their examples are in lower case). Mysql is generally case insensitive,
 right?

Maybe you can make sense of this, but I can't ...


mysql create table test (a int, b enum ('x', 'X', 'y'));
Query OK, 0 rows affected, 1 warning (0.02 sec)

mysql show warnings;
+---+--+-+
| Level | Code | Message |
+---+--+-+
| Note  | 1291 | Column 'b' has duplicated value 'x' in ENUM |
+---+--+-+
1 row in set (0.00 sec)

mysql insert into test values (1, 'x');
Query OK, 1 row affected (0.00 sec)

mysql insert into test values (1, 'X');
Query OK, 1 row affected (0.00 sec)

mysql insert into test values (1, 'y');
Query OK, 1 row affected (0.00 sec)

mysql insert into test values (1, 'Y');
Query OK, 1 row affected (0.00 sec)

mysql insert into test values (1, 'z');
Query OK, 1 row affected, 1 warning (0.00 sec)

## You think that was funny -- now watch this:

mysql select * from test;
+--+--+
| a| b|
+--+--+
|1 | x|
|1 | x|
|1 | y|
|1 | y|
|1 |  |
+--+--+
5 rows in set (0.00 sec)

mysql drop table test;
Query OK, 0 rows affected (0.00 sec)

mysql create table test (a int, b enum ('ä', 'Ä', ' ', '  '));
## Above is a-diaeresis, A-diaeresis.
Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql show warnings;
+---+--+-+
| Level | Code | Message |
+---+--+-+
| Note  | 1291 | Column 'b' has duplicated value '?' in ENUM | # literal ?
| Note  | 1291 | Column 'b' has duplicated value '' in ENUM  |
+---+--+-+
2 rows in set (0.00 sec)

mysql insert into test values (1, ' ');
Query OK, 1 row affected (0.00 sec)

mysql insert into test values (1, '  ');
Query OK, 1 row affected (0.00 sec)

mysql insert into test values (1, '   ');
Query OK, 1 row affected (0.01 sec)

mysql insert into test values (1, '   |');
Query OK, 1 row affected, 1 warning (0.00 sec)

mysql show warnings;
+-+--++
| Level   | Code | Message|
+-+--++
| Warning | 1265 | Data truncated for column 'b' at row 1 |
+-+--++
1 row in set (0.00 sec)

mysql select distinct * from test;
+--+--+
| a| b|
+--+--+
|1 | ä|
|1 |  |
|1 |  |
+--+--+


Better not imitate that.

-- 
Peter Eisentraut
http://developer.postgresql.org/~petere/

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-20 Thread Teodor Sigaev


   Perhaps an array of int4 would be better?  How much

Done
http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz


The patch needs more cleanup before applying, too, eg make comments
match code, get rid of unused keywords added to gram.y.


Cleaned.


--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate

Re: [HACKERS] [PATCHES] Bundle of patches

2006-12-20 Thread Tom Lane

Teodor Sigaev [EMAIL PROTECTED] writes:
 Perhaps an array of int4 would be better?  How much

 Done
 http://www.sigaev.ru/misc/user_defined_typmod-0.9.gz

 The patch needs more cleanup before applying, too, eg make comments
 match code, get rid of unused keywords added to gram.y.

 Cleaned.

OK.  I'm up to my rear in the opclass/opfamily rewrite, but will take
another look at the typmod code as soon as I have a working HEAD again ;-)

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Andrew Dunstan


Martijn van Oosterhout wrote:

Also, it's not just I/O functions that are the issue, consider the
enum-to-integer cast.
  



er, what cast? :-)

IIRC Tom hasn't provided one. If you don't break the enum abstraction 
there should be no need for one, and given the implementation it's not 
quite trivial - probably the best way if this is needed would be to 
precalculate it at type creation time and store the value in an extra 
column in pg_enum.


cheers

andrew

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [HACKERS] Companies Contributing to Open Source

2006-12-20 Thread Bruce Momjian

David Fetter wrote:
 On Tue, Dec 19, 2006 at 05:40:12PM -0500, Bruce Momjian wrote:
  Lukas Kahwe Smith wrote:
   Hi,
   
   I think another point you need to bring out more clearily is that
   the community is also often miffed if they feel they have been
   left out of the design and testing phases. This is sometimes just
   a reflex that is not always based on technical reasoning. Its just
   that as you correctly point out are worried of being high-jacked
   by companies.
  
  I hate to mention an emotional community reaction in this document.
 
 You don't have to name it that if you don't want to, although respect
 (or at least a good simulation of it) is crucial when dealing with any
 person or group.  Handing the community a /fait accompli/ is a great
 way to convey disrespect, no matter how well-meaning the process
 originally was.
 
  We normally just highlight the inefficiency of a company doing
  things on their own, and the wasted effort of them having to make
  adjustments.
 
 Would it really hurt to touch on some of the whys of this?

Sure, I mention that once they deal with the community, it might
requires extensive rewriting:

This means the employee is not benefitting from community oversight
and suggestions, often leading to a major rewrite when a patch is
submitted to the community.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch

2006-12-20 Thread Bruce Momjian

Andrew Dunstan wrote:
  Now, we are rewriting the table from scratch anyway, the on disk 
  format is changing.  What is stopping us from switching the column 
  order at the same time.  The only thing I can think is that the 
  catalogs will need more work to update them.  It's a middle sized 
  price to pay for being able to reorder the columns in the table.  One 
  of the problems I have is wanting to add a column in the middle of the 
  table, but FK constraints stop me dropping the table to do the 
  reorder.  If ALTER TABLE would let me stick it in the middle and 
  rewrite the table on disk, I wouldn't care.  It's likely that I would 
  be rewriting the table anyway.  And by specifying AT POSITION, or 
  BEFORE/AFTER you know for big tables it's going to take a while.
 
 
 This isn't really a compromise. Remember that this discussion started 
 with consideration of optimal record layout (minimising space use by 
 reducing or eliminating alignment padding). The above proposal really 
 does nothing for that.

I assume space waste will be mostly fixed when we have 0/1 byte headers
for varlena data types.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Tom Lane

Bruce Momjian [EMAIL PROTECTED] writes:
 I assume space waste will be mostly fixed when we have 0/1 byte headers
 for varlena data types.

Hardly.  int float timestamp etc types will all still have alignment issues.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] effective_cache_size vs units

2006-12-20 Thread Andrew Dunstan


Peter Eisentraut wrote:

Am Mittwoch, 20. Dezember 2006 13:42 schrieb Tom Dunstan:
  

I suppose we should think about mysql refugees at some point, though. I
wonder what they do. The documentation is silent on the matter (and all
their examples are in lower case). Mysql is generally case insensitive,
right?



Maybe you can make sense of this, but I can't ...


  

[lots of amusing non-orthogonal braindead stuff ...]


Better not imitate that.

  


The MySQL treatment of enums is generally quite reprehensible. The 
proposed patch by contrast fits quite well into our existing type 
system, I think. MySQL users migrating will have a bit of work to do.


I don't think their experience has much to teach us (except how not to 
do enums). We should decide on case sensitivity without having reference 
to it.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend

[HACKERS] Stats Collector Oddity

2006-12-20 Thread Chris Browne

We're getting a bit of an anomaly relating to pg_stat_activity...

oxrstld=# SELECT * from pg_stat_activity where current_query  'IDLE';
   datid| datname | procpid | usesysid | usename | current_query |  
query_start  
+-+-+--+-+---+---
 1347729970 | oxrstld |  893094 |  122 | tldepp  | commit| 
2006-12-16 19:34:08.583978+00
(1 row)
oxrstld=# select version();
   version
--
 PostgreSQL 7.4.12 on powerpc-ibm-aix5.3.0.0, compiled by GCC gcc (GCC) 4.0.1
(1 row)

That PID has been dead for several days, but this connection is marked
as being open, still, after lo many days.

This has *traditionally* been a sign that the stats collector has been
clobbered, at which point pg_stat_activity becomes useless.  That's
not the case here; a less restricted query on pg_stat_activity shows
other more recent data that keeps changing, seemingly consistent with
system activity.

I can track most of the lifecycle of that PID; the connection was
established at 2006-12-16 18:46:38, and I see a few errors associated
with the PID (violations of unique constraints; usual business stuff).
I see nothing in the logs at around 19:34 which would suggest a reason
for any strange behaviour.  There are no records about that PID after
19:34...

I'd like to get rid of this entry, if I can; it's blowing up tests
that warn us about elderly transactions, causing a false positive...
-- 
output = reverse(moc.enworbbc @ enworbbc)
http://www3.sympatico.ca/cbbrowne/
Rules  of the  Evil Overlord  #212. I  will not  send  out battalions
composed wholly of robots or  skeletons against heroes who have qualms
about killing living beings.  http://www.eviloverlord.com/

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] Release 8.2.0 done, 8.3 development starts

2006-12-20 Thread Kaare Rasmussen

I'm not sure, but as far as I remember, it will be a short release cycle for 
8.3 in order to finish some big items that couldn't be ready in time for 8.2. 

But which items are more or less expected for 8.3? I recall
- Hierarchical Queries
- On disk bitmap index
- Clustered/replication solutions

being discussed. What are on people's minds?

-- 

Med venlig hilsen
Kaare Rasmussen, Jasonic

Jasonic Telefon: +45 3816 2582
Nordre Fasanvej 12
2000 Frederiksberg  Email: [EMAIL PROTECTED]

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: VS: [HACKERS] Companies Contributing to Open Source

2006-12-20 Thread Bruce Momjian

Heikki Linnakangas wrote:
 Jonah Harris wrote:
  We could also mention all the Ingres-based offshoots that were
  commercial.  Let me think of some other examples... but there may not
  be that many seeing as there aren't really that many PostgreSQL-like
  communities.  I guess I could mention more if I had a clear
  understanding of what we mean when we say, community.
 
 That's something that bothers me as well: The article and the
 discussions talks about the community, but what's The Community? There
 really isn't any clear definition, the closest thing is the list of
 committers or core members, but I don't think anyone considers The
 Community to be just the committers, though that's the group of people
 whose opinions matter the most when trying to get a patch accepted. I

Not really.  The committers/core just weigh in on the patch, but it is
the general discussion that determines if a patch gets in.  Now, if all
committers don't want a patch, it is unlikely it will be applied, but in
most cases some committers want something, and some don't, and the
discussion determines if it gets in or not, perhaps with modification so
all community members are happy.

The remainder of this email talks about general principles for all
developers, not just for company employees.  If we need more text in
this area, it should be added to the developer's FAQ.  Feel free to
suggest additions to that.

I am kind of stumped that we have so much text in the developer's FAQ,
but it seems many people don't know what is in there.  The Developer's
FAQ is referenced in the main FAQ, and on the web site.  I am inclinded
to think that developers know the contents of the developer's FAQ, but
think they know better, and go ahead and do what they want anyway.  I
have no idea how to fix that.  :-(

---

 don't think it's fruitful to spend time on a precise definition, but
 it's important to realize that there isn't one and that the community
 consists of individuals with different priorities, opinions and points
 of view. Therefore it's a bit meaningless to say that The Community
 thinks this or The Community says that. On one topic, some people might
 have a very strong opinion one way, and others might just not care at
 all. On some topics, everyone agrees. And on some topics, people
 strongly disagree. And that's ok. Respecting all the different
 viewpoints leads to a well-balanced product.
 
 How does that affect a company trying to get a patch accepted? First,
 do no harm. If you're proposing something that for example brakes
 someone else's application, your proposal is likely to be rejected. Or
 if you're proposing a patch that increases the performance of something,
 at a very high cost on some other things, your patch is likely to be
 rejected. Another kind of harm that many people miss is the
 maintainability of the codebase. Adding complexity for little gain is
 likely to be rejected, just because it'll make the code harder to read.
 
 Secondly, getting a large feature accepted is easier if you're not just
 dumping a large patch to pgsql-patches, but you're committed to
 maintainting it and developing it further. Remember, some  things you
 might have ignored as not important might be crucial to other people.
 
 Also, a note to all Members of The Community: people like to work in
 different ways. Some might want to seek acceptance and commitment to
 a feature from others before starting development. Some might want to
 write a large up-front design document before proposing something. Some
 might want to write an experimental patch with a lot of quick hacks
 and no comments, and refine that according to feedback. And we, The
 Members of The Community, if I may count myself as one, don't get to
 choose how others prefer to work.
 
 --
  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com

--
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org

Re: VS: [HACKERS] Companies Contributing to Open Source

2006-12-20 Thread Joshua D. Drake


 The remainder of this email talks about general principles for all
 developers, not just for company employees.  If we need more text in
 this area, it should be added to the developer's FAQ.  Feel free to
 suggest additions to that.
 
 I am kind of stumped that we have so much text in the developer's FAQ,
 but it seems many people don't know what is in there.  The Developer's
 FAQ is referenced in the main FAQ, and on the web site.  I am inclinded
 to think that developers know the contents of the developer's FAQ, but
 think they know better, and go ahead and do what they want anyway.  I
 have no idea how to fix that.  :-(

We don't accept patches from anyone who doesn't follow the developer FAQ
rules. Which means of course, that I need to go read it ;)

Sincerely,

Joshua D. Drake




-- 

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate




---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

[HACKERS] tsearch in core patch, for review

2006-12-20 Thread Teodor Sigaev

We (Oleg and me) are glad to present tsearch2 in core of pgsql patch. In basic, 
layout, functions, methods, types etc are the same as in current tsearch2 with a 
lot of improvements:


 - pg_ts_* tables now are in pg_catalog
 - parsers, dictionaries, configurations now have owner and namespace similar to
   other pgsql's objects like tables, operator classes etc
 - current tsearch configuration is managed with a help of GUC variable
   tsearch_conf_name.
 - choosing of tsearch cfg by locale may be done for each schema separately
 - managing of tsearch configuration with a help of SQL commands, not with
   insert/update/delete statements. This allows to drive dependencies,
   correct dumping and dropping.
 - psql support with a help of \dF* commands
 - add all available Snowball stemmers and corresponding configuration
 - correct memory freeing by any dictionary

Work is sponsored by EnterpriseDB's PostgreSQL Development Fund

patch: http://www.sigaev.ru/misc/tsearch_core-0.27.gz

Sorry, but documentation is under heavy development now and temporary placed at 
http://mira.sai.msu.su/~megera/pgsql/ftsdoc/ (changes quickly :) )

So, below there is a short overview of syntax.

Comments, suggestions, objections, questions  will be appreciated. We are 
planning to complete the patch and suggest to commit in HEAD soon.


SQL syntax:
CREATE FULLTEXT PARSER prsname (
START = funcname,
GETTOKEN = funcname,
END = funcname,
LEXTYPES = funcname
[ , HEADLINE = funcname ]
);

CREATE FULLTEXT DICTIONARY dictname (
LEXIZE = funcname
[ , INIT = funcname ]
[ , OPT = text ]
);

CREATE FULLTEXT DICTIONARY dictname [(
{   INIT = funcname |
LEXIZE = funcname |
OPT = text }
 [, ... ]
)] LIKE template_dictname;
ALTER FULLTEXT DICTIONARY dictname SET OPT=text;

CREATE FULLTEXT CONFIGURATION cfgname (
PARSER = prsname
[, LOCALE = localename]
) [AS DEFAULT];

CREATE FULLTEXT CONFIGURATION cfgname [(
   {  LOCALE = localename |
  PARSER = prsname  }
   [, ...]
)] LIKE template_cfg [WITH MAP] [AS DEFAULT];

 ALTER FULLTEXT CONFIGURATION cfgname SET
{ LOCALE=localename | PARSER=prsname } [, ...];
 ALTER FULLTEXT CONFIGURATION cfgname SET AS DEFAULT;
 ALTER FULLTEXT CONFIGURATION cfgname DROP DEFAULT;

CREATE FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...]
  WITH dictname1[, ...];
ALTER FULLTEXT MAPPING ON cfgname FOR lexemetypename[, ...]
WITH dictname1[, ...];
DROP FULLTEXT MAPPING [IF EXISTS] ON cfgname FOR lexemetypename;

DROP FULLTEXT { DICTIONARY | PARSER | CONFIGURATION } [IF EXISTS] objname [ 
CASCADE | RESTRICT ];

ALTER FULLTEXT { DICTIONARY | PARSER | CONFIGURATION } objname RENAME TO 
newobjname;
ALTER FULLTEXT { DICTIONARY | PARSER | CONFIGURATION } objname OWNER TO 
newowner;
COMMENT ON FULLTEXT { DICTIONARY | PARSER | CONFIGURATION } objname IS text;

psql backslesh commands:
\dF [PATTERN]  list fulltext configurations (add + for more detail)
\dFd [PATTERN] list fulltext dictionaries (add + for more detail)
\dFp [PATTERN] list fulltext parsers (add + for more detail)

--
Teodor Sigaev   E-mail: [EMAIL PROTECTED]
   WWW: http://www.sigaev.ru/

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate

[HACKERS] Rare corruption of pg_class index

2006-12-20 Thread Greg Sabino Mullane


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


I'm encountering some disconcerting problems on a 8.1.3 database.
Very occasionally, I get a could not open relation with OID xxx.
This always occurs inside of a plpgsql function, and always refers
to a normal, stable table that has not been dropped. The first
time this occured, I reindexed the system indexes, but it has occured
again, and on more than one database, which seems to lessen the chance
of a hardware issue or a temporary index corruption issue. The functions
in question are called many times, but the error only happens once in
a blue moon. The last time it happened, the function ran with no problem
one minute before the error, and again four minutes afterwards. The
table name is hard-coded into the functions. This happens on average
five times a day or so, on a very busy database ( 10M statements/day,
the functions fire tens of thousands of times)

Any clues on what could be causing this, or ways to go forward on
debugging? Thanks.

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200612201252
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8
-BEGIN PGP SIGNATURE-

iD8DBQFFiXmzvJuQZxSWSsgRAkr4AKDhTKTILjrVbitWTu4wzktejXD2egCfSk/H
h+vBn6lJegKeRD+tsBFSYng=
=svv7
-END PGP SIGNATURE-



---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Rare corruption of pg_class index

2006-12-20 Thread Jeff Davis

On Wed, 2006-12-20 at 18:06 +, Greg Sabino Mullane wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1
 
 
 I'm encountering some disconcerting problems on a 8.1.3 database.
 Very occasionally, I get a could not open relation with OID xxx.
 This always occurs inside of a plpgsql function, and always refers
 to a normal, stable table that has not been dropped. The first
 time this occured, I reindexed the system indexes, but it has occured
 again, and on more than one database, which seems to lessen the chance
 of a hardware issue or a temporary index corruption issue. The functions
 in question are called many times, but the error only happens once in
 a blue moon. The last time it happened, the function ran with no problem
 one minute before the error, and again four minutes afterwards. The
 table name is hard-coded into the functions. This happens on average
 five times a day or so, on a very busy database ( 10M statements/day,
 the functions fire tens of thousands of times)
 

It has to do with PL/pgSQL caching the plans of the statements inside.
If there is some kind of change that causes that plan to become invalid,
that could cause that error.

You can almost certainly fix it by using EXECUTE in the function, which
will not cache the plan, or telling all clients to reconnect when you
make a change that could invalidate the plan.

Regards,
Jeff Davis


---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Rare corruption of pg_class index

2006-12-20 Thread Alvaro Herrera

Greg Sabino Mullane wrote:

 I'm encountering some disconcerting problems on a 8.1.3 database.
 Very occasionally, I get a could not open relation with OID xxx.
 This always occurs inside of a plpgsql function, and always refers
 to a normal, stable table that has not been dropped. The first
 time this occured, I reindexed the system indexes, but it has occured
 again, and on more than one database, which seems to lessen the chance
 of a hardware issue or a temporary index corruption issue. The functions
 in question are called many times, but the error only happens once in
 a blue moon. The last time it happened, the function ran with no problem
 one minute before the error, and again four minutes afterwards. The
 table name is hard-coded into the functions. This happens on average
 five times a day or so, on a very busy database ( 10M statements/day,
 the functions fire tens of thousands of times)

Well, if the error fixed by itself four minutes after the failure,
then it's not likely to be a corrupted index.  My first guess would be
that there's some kind of race condition on the relcache or the sinval
mechanism.

It would help if you could get a stack trace at the moment of the
problem, but I'm not sure how to do that.  Maybe cause the backend to
send a SIGSTOP to itself and then have an external program to quickly
launch GDB on it and get a backtrace, then send a SIGCONT.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate

Re: [HACKERS] Rare corruption of pg_class index

2006-12-20 Thread Tom Lane

Alvaro Herrera [EMAIL PROTECTED] writes:
 Greg Sabino Mullane wrote:
 I'm encountering some disconcerting problems on a 8.1.3 database.
 Very occasionally, I get a could not open relation with OID xxx.

Does the mentioned OID actually correspond to the OID of the table it's
supposed to be opening, or is it wrong?  Is anything being done to
the table schema in parallel?

If the table is occasionally dropped and recreated, there's a known race
condition that could cause it: we lookup the table name in pg_class to
get the OID, then lock the relation by OID, then try to finish opening
the relation.  By the time we obtain lock the original rel could be gone
and the name now refers to some other OID, but we'll fail because the
old OID is no longer anywhere to be found.  I think this is fixed in 8.2.

If the table is perfectly static then another explanation is needed ...

 It would help if you could get a stack trace at the moment of the
 problem, but I'm not sure how to do that.

Perhaps insert an abort() call right before the elog(ERROR)
that's reporting this.  (I think there are three possibilities,
but they're all in heapam.c so you might as well just hack them all.)

regards, tom lane

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Release 8.2.0 done, 8.3 development starts

2006-12-20 Thread David Fetter

On Wed, Dec 20, 2006 at 05:49:15PM +0100, Kaare Rasmussen wrote:
 I'm not sure, but as far as I remember, it will be a short release
 cycle for 8.3 in order to finish some big items that couldn't be
 ready in time for 8.2. 
 
 But which items are more or less expected for 8.3? I recall
 - Hierarchical Queries
 - On disk bitmap index

These two are happening

 - Clustered/replication solutions

This one's stalled because no two replication solutions appear to need
anything in the core.  Instead, lots of things are going on outside
the core in the clustering/replication/HA arena.

 being discussed. What are on people's minds?

Other things coming down the pike are:

* SQL/XML support per SQL:2003
* SQL/PSM support per SQL:2003

Hrm.  While two things isn't much of a pattern, it's suggestive.

Personally, I'd like to see MERGE, windowing functions, and better
integration of (INSERT|UPDATE|DELETE) ... RETURNING.

Cheers,
D (SQL/MED, anybody?)
-- 
David Fetter [EMAIL PROTECTED] http://fetter.org/
phone: +1 415 235 3778AIM: dfetter666
  Skype: davidfetter

Remember to vote!

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Release 8.2.0 done, 8.3 development starts

2006-12-20 Thread Chris Browne

[EMAIL PROTECTED] (Kaare Rasmussen) writes:

 I'm not sure, but as far as I remember, it will be a short release cycle for 
 8.3 in order to finish some big items that couldn't be ready in time for 8.2. 

 But which items are more or less expected for 8.3? I recall
 - Hierarchical Queries
 - On disk bitmap index
 - Clustered/replication solutions

 being discussed. What are on people's minds?

- WITH (recursive) queries
- Better handling of partitioning
- SQL:2003 windowing queries
- Discussion is under way on improvements to autovac
- Simon Riggs has reuse of index tuples under way
-- 
let name=cbbrowne and tld=acm.org in name ^ @ ^ tld;;
http://linuxfinances.info/info/advocacy.html
I think there is a world market for maybe five computers
-- Tom Watson, CEO of IBM, 1943

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Rare corruption of pg_class index

2006-12-20 Thread Greg Sabino Mullane


-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


(Jeff Davis: I've not changed the function, so it's not the plan cache)

Tom Lane wrote:
 Does the mentioned OID actually correspond to the OID of the table it's
 supposed to be opening, or is it wrong?  Is anything being done to
 the table schema in parallel?

Yes, it is the correct OID. No, nothing done to the schema in parallel,
although there is a process that disables/re-enables triggers and rules
on that table via pg_class tweaking (inside a txn, of course).

 If the table is occasionally dropped and recreated...

Yeah, that's the first thing I thought of, but it's definitely not
being dropped and recreated.

Alvaro Herrera wrote:
 It would help if you could get a stack trace at the moment of the
 problem, but I'm not sure how to do that.

 Perhaps insert an abort() call right before the elog(ERROR)
 that's reporting this.  (I think there are three possibilities,
 but they're all in heapam.c so you might as well just hack them all.)

Argh, that will have to be a last resort measure, as this is a
production system. Have not been able to duplicate yet on a dev
box, but will look into adding the abort() for when/if I can
duplicate it there. Thanks everyone.

- --
Greg Sabino Mullane [EMAIL PROTECTED]
PGP Key: 0x14964AC8 200612201412
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-BEGIN PGP SIGNATURE-

iD8DBQFFiYtwvJuQZxSWSsgRArEbAJ9udj9/Kh2Vi45A8ej3YCC2RKESwgCcCkT8
5ZnN0+yCU0rQ6+PuHtJRtnw=
=C1gB
-END PGP SIGNATURE-



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] Release 8.2.0 done, 8.3 development starts

2006-12-20 Thread Andrew Dunstan


Kaare Rasmussen wrote:
I'm not sure, but as far as I remember, it will be a short release cycle for 
8.3 in order to finish some big items that couldn't be ready in time for 8.2. 


But which items are more or less expected for 8.3? I recall
- Hierarchical Queries
- On disk bitmap index
- Clustered/replication solutions

being discussed. What are on people's minds?

  


Items on my personal agenda for this time frame - all have been 
previously discussed:


. notification payload messages
. fix permissions properly on custom GUC vars
. create a mechanism for plperl to load modules safely

I am committed to getting the first of these done to meet a businees 
need. The last two will be best effort.


cheers

andrew

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Rare corruption of pg_class index

2006-12-20 Thread Alvaro Herrera

Tom Lane wrote:
 Alvaro Herrera [EMAIL PROTECTED] writes:
  Greg Sabino Mullane wrote:

  It would help if you could get a stack trace at the moment of the
  problem, but I'm not sure how to do that.
 
 Perhaps insert an abort() call right before the elog(ERROR)
 that's reporting this.

Yeah, but doing this in a production environment is not likely to go
very far ...

 (I think there are three possibilities,
 but they're all in heapam.c so you might as well just hack them all.)

Maybe it would be good to run under log_error_verbosity='verbose' for a
while to discover whether it's always the same one.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Interface for pg_autovacuum

2006-12-20 Thread Jim Nasby


On Dec 20, 2006, at 7:56 AM, Florian G. Pflug wrote:

Jim Nasby wrote:
I'm teaching a class this week and a student asked me about OIDs.  
He related the story of how in Sybase, if you moved a database  
from one server from another, permissions got all screwed up  
because user IDs no longer matched. I explained that exposing  
something like an integer ID in a user interface or an API is just  
a bad idea and PostgreSQL doesn't do that.

Then I got to pg_autovacuum
So... is there any reason there isn't a prescribed interface to  
pg_autovacuum that doesn't expose vacrelid? Can we get that added  
to TODO?


Wouldn't it be sufficient to change the type of vacrelid from oid
to regclass? Then just dumping and restoring pg_autovacuum like any
other table should Just Work.


I think that would work, though as I mentioned we'd also want to set  
reasonable defaults on the table if we decide to keep that as our  
interface.


On the other hand, this would be the only part of the system where  
the official interface/API is a system catalog table. Do we really  
want to expose the internal representation of something as our API?  
That doesn't seem wise to me...


Additionally, AFAIK it is not safe to go poking data into catalogs  
willy-nilly. Having one table where this is the interface to the  
system seems like it could lead to some dangerous confusion.

--
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Martijn van Oosterhout

On Wed, Dec 20, 2006 at 09:15:05AM -0500, Stephen Frost wrote:
  It doesn't really address the question of how you know which one to
  use at any particular line of code; or even more to the point, what
  mechanism will warn you if you use the wrong one.
 
 That'd be the point of doing the typing, you then declare functions as
 accepting the type and then if someone passes the wrong type to a
 function the compiler will complain.  Inside of a particular function it
 would hopefully be easier to keep it clear.  I'd think that most
 functions would deal with one type or the other (which would be declared
 in the arguments or in the local variables) and that functions which
 have to deal with both would be able to keep them straight.

I'm not sure how much you can do with typing. Things like heap_getattr
are macros, and thus untyped. Most places use attr as an index to an
array, which also can't be type checked.

If you switched everything over to inline functions you might get it to
work, but that's about it.

IMHO the best solution is to offset the logical numbers by some
constant...

Have a nice day,
-- 
Martijn van Oosterhout   kleptog@svana.org   http://svana.org/kleptog/
 From each according to his ability. To each according to his ability to 
 litigate.


signature.asc
Description: Digital signature

Re: [HACKERS] Release 8.2.0 done, 8.3 development starts

2006-12-20 Thread Lukas Kahwe Smith


Hi,

I just collected all the items mentioned in this thread as well as what 
people quickly came up with on IRC and put it on a list in the developer 
wiki [1]. I tried to put names and links behind the items where ever 
possible. Let me know if there is something missing or if you know any 
other information (links, names) etc. that should be mentioned.


The list is kinda scary long ..

regards
Lukas

[1] http://developer.postgresql.org/index.php/Todo:WishlistFor83

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Inaam Rana


On 12/20/06, Takayuki Tsunakawa [EMAIL PROTECTED] wrote:


[Conclusion]
I believe that the problem cannot be solved in a real sense by
avoiding fsync/fdatasync().  We can't ignore what commercial databases
have done so far.  The kernel does as much as he likes when PostgreSQL
requests him to fsync().



I am new to the community and am very interested in the tests that you have
done. I am also working on resolving the sudden IO spikes at checkpoint
time. I agree with you that fsync() is the core issue here.

Being a new member I was wondering if someone on this list has done testing
with O_DIRECT and/or O_SYNC for datafiles as that seems to be the most
logical way of dealing with fsync() flood at checkpoint time. If so, I'll be
very interested in the results. As mentioned in this thread that a single
bgwriter with O_DIRECT will not be able to keep pace with cleaning effort
causing backend writes. I think (i.e. IMHO) multiple bgwriters and/or
AsyncIO with O_DIRECT can resolve this issue.

Talking of bgwriter_* parameters I think we are missing a crucial internal
counter i.e. number of dirty pages. How much work bgwriter has to do at each
wakeup call should be a function of total buffers and currently dirty
buffers. Relying on both these values instead of just one static NBuffers
should allow bgwriter to adapt more quickly to workload changes and ensure
that not much work is accumulated for checkpoint.

--
Inaam Rana
EnterpriseDB   http://www.enterprisedb.com

Re: [HACKERS] Question about debugging bootstrapping and catalog

2006-12-20 Thread Zdenek Kotala


Zdenek Kotala wrote:

Gregory Stark wrote:

Martijn van Oosterhout kleptog@svana.org writes:


Here's what I did: you can step over functions in initdb until it fails
(although I alredy know which part it's failing I guess). Restart. Then
you go into that function and step until the new backend has been
started. At this point you attach another gdb to the backend and let it
run.


Hm, I suppose. Though starting a second gdb is a pain. What I've done 
in the
past is introduce a usleep(3000) in strategic points in the 
backend to

give me a chance to attach.


I use dtrace which wait on write syscall for stderr output and if it is 
happen then stop(freeze) the process and I able to connect into the 
process with debugger and examine what happened.




There is dtrace script which sitting on exec. It stops postgres 
process after exec.  It works on Solaris. Different name of kernel 
function probably will be on other platform where is dtrace implemented 
(Freebsd,MacOS).




::exec_common:return
/execname == initdb/
{
  exec_pg = 1;
}


syscall:::entry
/execname == postgres  exec_pg == 1/
{
  stop();
  printf(Postgres is stopped.\n);
  exec_pg = 0;
}


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

   http://www.postgresql.org/about/donate

Re: column ordering, was Re: [HACKERS] [PATCHES] Enums patch v2

2006-12-20 Thread Russell Smith


Andrew Dunstan wrote:

Russell Smith wrote:

Tom Lane wrote:

Stephen Frost [EMAIL PROTECTED] writes:
 
Force references to go through macros which implement the lookup 
for the

appropriate type?  ie: LOGICAL_COL(table_oid,2) vs.
PHYSICAL_COL(table_oid,1)  Perhaps that's too simplistic.



It doesn't really address the question of how you know which one to
use at any particular line of code; or even more to the point, what
mechanism will warn you if you use the wrong one.

My gut feeling about this is that we could probably enforce such a
distinction if we were using C++, but while coding in C I have no
confidence in it.  (And no, that's not a vote to move to C++ ...)
  

What about a comprimise...

The 8.1 documentation for ALTER TABLE states the following.

Adding a column with a non-null default or changing the type of an 
existing column will require the entire table to be rewritten. This 
may take a significant amount of time for a large table; and it will 
temporarily require double the disk space.



Now, we are rewriting the table from scratch anyway, the on disk 
format is changing.  What is stopping us from switching the column 
order at the same time.  The only thing I can think is that the 
catalogs will need more work to update them.  It's a middle sized 
price to pay for being able to reorder the columns in the table.  One 
of the problems I have is wanting to add a column in the middle of 
the table, but FK constraints stop me dropping the table to do the 
reorder.  If ALTER TABLE would let me stick it in the middle and 
rewrite the table on disk, I wouldn't care.  It's likely that I would 
be rewriting the table anyway.  And by specifying AT POSITION, or 
BEFORE/AFTER you know for big tables it's going to take a while.




This isn't really a compromise. Remember that this discussion started 
with consideration of optimal record layout (minimising space use by 
reducing or eliminating alignment padding). The above proposal really 
does nothing for that.


cheers

andrew


This is partly true.  If you have the ability to rewrite the table and 
put columns in a specific order you can manually minimize the 
alignment padding.  However that will probably produce a table that is 
not in the logical order you would like.  I still see plenty of use case 
for both my initial case as the alignment padding case, even without 
logical layout being different to disk layout.


Also there has been a large about of discussion on performance relating 
to having firm numbers for proposals for different compiler options.  Do 
anybody have tested numbers, and known information about where/how you 
can eliminate padding by column ordering?  Tom suggests in this thread 
that lots of types have padding issues, so how much is it really going 
to buy us space wise if we re-order the table in optimal format.  What 
is the optimal ordering to reduce disk usage?


Russell.

---(end of broadcast)---
TIP 4: Have you searched our list archives?

  http://archives.postgresql.org

Re: [HACKERS] Release 8.2.0 done, 8.3 development starts

2006-12-20 Thread Bruce Momjian

Lukas Kahwe Smith wrote:
 Hi,
 
 I just collected all the items mentioned in this thread as well as what 
 people quickly came up with on IRC and put it on a list in the developer 
 wiki [1]. I tried to put names and links behind the items where ever 
 possible. Let me know if there is something missing or if you know any 
 other information (links, names) etc. that should be mentioned.
 
 The list is kinda scary long ..
 
 regards
 Lukas
 
 [1] http://developer.postgresql.org/index.php/Todo:WishlistFor83

That looks helpful.

-- 
  Bruce Momjian   [EMAIL PROTECTED]
  EnterpriseDBhttp://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Release 8.2.0 done, 8.3 development starts

2006-12-20 Thread Jonah H. Harris


On 12/20/06, Lukas Kahwe Smith [EMAIL PROTECTED] wrote:

wiki [1]. I tried to put names and links behind the items where ever
possible. Let me know if there is something missing or if you know any
other information (links, names) etc. that should be mentioned.


I know Mark Cave-Ayland was interested in taking over hierarchical
queries... let me see where he stands on it and whether he was going
to complete it on an 8.3 deadline.  If not, I'll do it.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation| fax: 732.331.1301
33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED]
Iselin, New Jersey 08830| http://www.enterprisedb.com/

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match

Re: [HACKERS] psql: core dumped

2006-12-20 Thread Philip Yarra


Mario wrote:

 I'd like to help :-)   I wanted to avoid a core dumped but you told
me that's a normal thing for a SIGQUIT signal.


Did you try running `ulimit -c 0` first? That should do what you want - 
 prevent generation of the dump file.


Regards, Philip.

--
Philip Yarra
Senior Software Engineer, Utiba Pty Ltd
[EMAIL PROTECTED]

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

[HACKERS] inet/cidr

2006-12-20 Thread Worky Workerson


I was looking at upgrading to 8.2, but I make extensive use of the IP4
module.  It doesn't make properly due to the changes in the inet/cidr
types, notably the removal of the type (is_cidr) field of the
inet_struct, which the module uses when casting:  ip4_cast_to_cidr,
ip4r_cast_to_cidr, and a check in ip4r_cast_from_cidr.

I read a bit about the cidr/inet types becoming binary compatible, but
I wasn't sure if I needed to do more extensive conversion, or if I
could just remove the references to the type field and be happy.

-Mike

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq

Re: [HACKERS] Interface for pg_autovacuum

2006-12-20 Thread Russell Smith


Jim Nasby wrote:

On Dec 20, 2006, at 7:56 AM, Florian G. Pflug wrote:

Jim Nasby wrote:
I'm teaching a class this week and a student asked me about OIDs. He 
related the story of how in Sybase, if you moved a database from one 
server from another, permissions got all screwed up because user IDs 
no longer matched. I explained that exposing something like an 
integer ID in a user interface or an API is just a bad idea and 
PostgreSQL doesn't do that.

Then I got to pg_autovacuum
So... is there any reason there isn't a prescribed interface to 
pg_autovacuum that doesn't expose vacrelid? Can we get that added to 
TODO?


Wouldn't it be sufficient to change the type of vacrelid from oid
to regclass? Then just dumping and restoring pg_autovacuum like any
other table should Just Work.


I think that would work, though as I mentioned we'd also want to set 
reasonable defaults on the table if we decide to keep that as our 
interface.


On the other hand, this would be the only part of the system where the 
official interface/API is a system catalog table. Do we really want to 
expose the internal representation of something as our API? That 
doesn't seem wise to me...


Additionally, AFAIK it is not safe to go poking data into catalogs 
willy-nilly. Having one table where this is the interface to the 
system seems like it could lead to some dangerous confusion.
I thought the plan was to change the ALTER TABLE command to allow vacuum 
settings to be set.  I may be totally away from the mark.  But if this 
was the case it would mean that dumps would just need an alter table 
statement to maintain autovacuum information.  There is an advantage 
that if you only dump some tables, their autovac settings would go with 
them. But is that a good thing?


Reagrds

Russell Smith

--
Jim Nasby   [EMAIL PROTECTED]
EnterpriseDB  http://enterprisedb.com  512.569.9461 (cell)




---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster





---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread Takayuki Tsunakawa

On 12/20/06, Takayuki Tsunakawa [EMAIL PROTECTED] wrote:
  [Conclusion]
  I believe that the problem cannot be solved in a real sense by
  avoiding fsync/fdatasync().  We can't ignore what commercial databases
  have done so far.  The kernel does as much as he likes when PostgreSQL
  requests him to fsync().


From: Inaam Rana
 I am new to the community and am very interested in the tests that you
have done. I am also working on resolving the sudden IO spikes at checkpoint
time. I agree with you that fsync() is the core issue here.

Thank you for understanding my bad English correctly.  Yes, what I've been
insisting is that it is necessary to avoid fsync()/fdatasync() and to use
O_SYNC (plus O_DIRECT if supported on the target platform) to really
eliminate the big spikes.
In my mail, the following sentence made a small mistake.

I believe that the problem cannot be solved in a real sense by avoiding
fsync/fdatasync().

The correct sentence is:

I believe that the problem cannot be solved in a real sense without
avoiding fsync/fdatasync().


 Being a new member I was wondering if someone on this list has done
testing with O_DIRECT and/or O_SYNC for datafiles as that seems to be the
most logical way of dealing with fsync() flood at checkpoint time. If so,
I'll be very interested in the results.

Could you see the mail I sent on Dec 18?  Its content was so long that I
zipped the whole content and attached to the mail.  I just performed the
same test simply adding O_SYNC to open() in mdopen() and another function in
md.c.  I couldn't succeed in running with O_DIRECT because O_DIRECT requires
the shared buffers to be aligned on the sector-size boundary.  To perform
O_DIRECT test, a little more modification is necessary to the code where the
shared buffers are allocated.
The result was bad.  But that's just a starting point.  We need some
improvements that commercial databases have done.  I think some approaches
we should take are:

(1) two-checkpoint (described in Jim Gray's textbook Transaction
Processing: Concepts and Techniques
(2) what Oracle suggests in its manual (see my previous mails)
(3) write multiple contiguous buffers with one write() to decrease the count
of write() calls

 As mentioned in this thread that a single bgwriter with O_DIRECT will not
be able to keep pace with cleaning effort causing backend writes. I think
(i.e. IMHO) multiple bgwriters and/or AsyncIO with O_DIRECT can resolve this
issue.

I agree with you.  Oracle provides a parameter called DB_WRITER_PROCESSES to
set the number of database writer processes.  Oracle also provides
asynchronous I/O to solve the problem you are saying about.  Please see
section 10.3.9 the following page:

http://download-west.oracle.com/docs/cd/B19306_01/server.102/b14211/instance_tune.htm#sthref1049

 Talking of bgwriter_* parameters I think we are missing a crucial internal
counter i.e. number of dirty pages. How much work bgwriter has to do at each
wakeup call should be a function of total buffers and currently dirty
buffers. Relying on both these values instead of just one static NBuffers
should allow bgwriter to adapt more quickly to workload changes and ensure
that not much work is accumulated for checkpoint.

I agree with you in the sense that the current bgwriter is a bit careless
about the system load.  I believe that PostgreSQL should be more gentle to
OLTP transactions -- many users of the system as a result.  I think the
speed of WAL accumulation should also be taken into account.  Let's list up
the problems and ideas.

--

Re: [HACKERS] New version of money type

2006-12-20 Thread D'Arcy J.M. Cain

On Thu, 12 Oct 2006 14:24:22 -0400
D'Arcy J.M. Cain darcy@druid.net wrote:
 On Thu, 12 Oct 2006 14:17:33 -0400
 Tom Lane [EMAIL PROTECTED] wrote:
  D'Arcy J.M. Cain darcy@druid.net writes:
   Cool.  So what do I do with the patch?  Should I add the currency
   symbol back in and commit or should I resubmit the patch to hackers for
   further review?
  
  Well, one thing you definitely *don't* do is commit right now, because
  we're in feature freeze, not to mention trying to avoid forced initdbs
  now that beta has started.  Sit on it till 8.3 is branched, and
 
 OK.  I hadn't thought of it as a new feature per se but I understand
 the initdb issue.  Holding at 30,000 feet, ground control.
 
  meanwhile think about what you want to do with the currency-symbol
  issue...
 
 Personally I don't see a need for it but I am currently in favour of
 adding it back in before committing just so that we can deal with the
 issue separately.  The same as the other changes being discussed.

Now that 8.3 has been branched shall I go ahead and commit?  As
discussed I will put the currency symbol back in just so that it can be
discussed and worked on as a completely separate issue.  I have
attached the current patch against HEAD.

-- 
D'Arcy J.M. Cain darcy@druid.net |  Democracy is three wolves
http://www.druid.net/darcy/|  and a sheep voting on
+1 416 425 1212 (DoD#0082)(eNTP)   |  what's for dinner.


cash64_patch
Description: Binary data

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Load distributed checkpoint

2006-12-20 Thread ITAGAKI Takahiro


Takayuki Tsunakawa [EMAIL PROTECTED] wrote:

 I have to report a sad result.  Your patch didn't work.  Let's
 consider the solution together.  What you are addressing is very
 important for the system designers in the real world -- smoothing
 response time.

You were running the test on the very memory-depend machine.
 shared_buffers = 4GB / The scaling factor is 50, 800MB of data.
Thet would be why the patch did not work. I tested it with DBT-2, 10GB of
data and 2GB of memory. Storage is always the main part of performace here,
even not in checkpoints.

If you use Linux, it has very unpleased behavior in fsync(); It locks all
metadata of the file being fsync-ed. We have to wait for the completion of
fsync when we do read(), write(), and even lseek().

Almost of your data is in the accounts table and it was stored in a single
file. All of transactions must wait for fsync to the single largest file,
so you saw the bottleneck was in the fsync.

 [Conclusion]
 I believe that the problem cannot be solved in a real sense by
 avoiding fsync/fdatasync().

I think so, too. However, I assume we can resolve a part of the
checkpoint spikes with smoothing of write() alone.

BTW, can we use the same way to fsync? We call fsync()s to all modified
files without rest in mdsync(), but it's not difficult at all to insert
sleeps between fsync()s. Do you think it helps us? One of issues is that
we have to sleep in file unit, which is maybe rough granularity.

Regards,
---
ITAGAKI Takahiro
NTT Open Source Software Center



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq

Re: [HACKERS] Interface for pg_autovacuum

2006-12-20 Thread Tom Lane

Jim Nasby [EMAIL PROTECTED] writes:
 On the other hand, this would be the only part of the system where  
 the official interface/API is a system catalog table.

I don't think it was ever intended by anyone that that would be the
long-term solution.  Where we are currently at is experimenting to find
out what autovacuum's control knobs ought to be.  The catalog table was
a suitably low-effort way to expose a first cut at the knobs.  The fact
that pg_dump doesn't dump the settings is entirely deliberate: that's to
avoid locking us into a forward compatibility commitment before we're
ready.  Once we are happy with the control design, we can think about
what the long-term API ought to be.

regards, tom lane

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster

Re: [HACKERS] Release 8.2.0 done, 8.3 development starts

2006-12-20 Thread Bruno Wolff III

On Wed, Dec 20, 2006 at 17:49:15 +0100,
  Kaare Rasmussen [EMAIL PROTECTED] wrote:
 I'm not sure, but as far as I remember, it will be a short release cycle for 
 8.3 in order to finish some big items that couldn't be ready in time for 8.2. 

I believe the point of the short release cycle was more to change the timing
of the release to avoid lots of people being on holiday during a critical
period. After the short release cycle, the tentative plan was to go back to
yearly releases shifted a few months from where they typically used to
occur in the past.
Whether or not this actually works out, remains to be seen.

---(end of broadcast)---
TIP 6: explain analyze is your friend

Re: [HACKERS] Stats Collector Oddity

2006-12-20 Thread Tom Lane

Chris Browne [EMAIL PROTECTED] writes:
 We're getting a bit of an anomaly relating to pg_stat_activity...
 ...
 That PID has been dead for several days, but this connection is marked
 as being open, still, after lo many days.

This probably just means that the backend termination stats message
got dropped due to heavy load.  That's expected behavior in all pre-8.2
releases: the stats system was never intended to provide
guaranteed-exactly-correct status.  PG 8.2 has reimplemented the
pg_stat_activity view to make it more trustworthy.  (The other stuff is
still probabilistic, but being just event counters, message loss isn't
so obvious.)

regards, tom lane

---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match

Re: [HACKERS] Question about debugging bootstrapping and catalog entries

2006-12-20 Thread Tom Lane

Gurjeet Singh [EMAIL PROTECTED] writes:
 On 12/18/06, Tom Lane [EMAIL PROTECTED] wrote:
 There is already an option to sleep early in backend startup for the
 normal case.  Not sure if it works for bootstrap, autovacuum, etc,
 but I could see making it do so.

 You are probably referring to the command-line switch -W to posrgres, that
 translates to 'PostAuthDelay' GUC variable; I think that kicks in a bit too
 late!

No, I was thinking of PreAuthDelay.  There might be cases where even
that is too late in the procedure --- probably not on Unix, but on
Windows there's a lot that happens before BackendInitialize.  But
offhand I don't know how we'd have a configurable delay much earlier
... custom insertions of hardwired delays into the source code are
probably the only good approach if you find that, say, guc.c
initialization fails in individual backends under Windows.

Back at the ranch, though, the question was whether it'd be worth
honoring PreAuthDelay in the other startup code paths such as
BootstrapMain.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] ERROR: tuple concurrently updated

2006-12-20 Thread Tom Lane

Stephen Frost [EMAIL PROTECTED] writes:
   Subject pretty much says it all.  I've put up with this error in the
   past when it has caused me trouble but it's now starting to hit our
   clients on occation which is just unacceptable.

Have you tracked down the exact scenario making it happen?

   If this is correct then the solution seems to be either add versioning
   to the SysCache data,

You have provided no evidence that that would fix anything at all.
To my mind a concurrently updated failure is more likely to mean
insufficient locking around update operations.

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly

Re: [HACKERS] ERROR: tuple concurrently updated

2006-12-20 Thread Stephen Frost

* Tom Lane ([EMAIL PROTECTED]) wrote:
 Stephen Frost [EMAIL PROTECTED] writes:
Subject pretty much says it all.  I've put up with this error in the
past when it has caused me trouble but it's now starting to hit our
clients on occation which is just unacceptable.
 
 Have you tracked down the exact scenario making it happen?

I think I might have confused two different issues, sorry. :/

If this is correct then the solution seems to be either add versioning
to the SysCache data,
 
 You have provided no evidence that that would fix anything at all.
 To my mind a concurrently updated failure is more likely to mean
 insufficient locking around update operations.

I havn't built a reliable test case yet but I *think* the tuple
concurrently updated problem is with an analyze being run inside of a
function and also being run by autovacuum.  The SysCache stuff I was
thinking about previously was actually for another problem that I hadn't
seen in a long time (because I hadn't been doing a particular set of
operations, not because it's that difficult to have happen) but just ran
into again today:
ERROR:  cache lookup failed for relation ...

I first started seeing this happen, iirc, when I created a function 
which went against pg_class/pg_attribute/pg_type and used
pg_table_is_visible().  This function (and another which uses pg_class,
pg_constraint and pg_attribute) gets used over and over again from
another pl/pgsql function.  Basically we are walking through a list of
tables pulling the column names and primary keys and then running some
checks on the tables.  Anyhow, while this is running (and it takes
upwards of half an hour to run) other activity on the database (this
time it was creating a view in a seperate completely unrelated schema)
can cause the lookup failure which kills the long-running function
(which gets to be very frustrating...).

Thanks,

Stephen


signature.asc
Description: Digital signature

71 matches

Mail list logo