Re: [pgtranslation-translators] [HACKERS] Opinions about wording of error messages for bug #3883?

2008-01-30 Thread Guillaume Lelarge

Alvaro Herrera wrote:

Tom Lane wrote:

Alvaro Herrera [EMAIL PROTECTED] writes:



I suggest
cannot execute "%s" on "%s" because ...

Hmm, why not just

cannot execute %s "%s" because ...

?


Hmm, yeah, that seems fine too.  Thinking more about it, from the POV of
the translator probably the three forms are the same because he has all
the elements to construct the phrase however he sees fit.



Alvaro's sentence seems better to me. Anyway, I have no problem with 
such a change this close to a release. As Alvaro said, if the 
translation of this sentence is not available for 8.3, it can be for 
8.3.1. That's not such a big deal.
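For illustration, the second form would look something like this at the
C call site (a hypothetical sketch only; the errcode, command name and
reason shown here are made up, not the actual fix for bug #3883):

    ereport(ERROR,
            (errcode(ERRCODE_OBJECT_IN_USE),
             errmsg("cannot execute %s \"%s\" because it is being used by active queries in this session",
                    "TRUNCATE TABLE",
                    RelationGetRelationName(rel))));

Keeping the command name out of the translated string is exactly what
gives translators the freedom to rebuild the phrase as they see fit.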


And thanks for asking for the translators' opinion on this; I really appreciate it.

Regards.


--
Guillaume.
 http://www.postgresqlfr.org
 http://dalibo.com



Re: [HACKERS] Will PostgreSQL get ported to CUDA?

2008-01-30 Thread Simon Riggs
On Tue, 2008-01-29 at 22:12 -0800, Dann Corbit wrote:
 http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html
 
 http://www.nvidia.com/object/cuda_learn.html
 
 http://www.nvidia.com/object/cuda_get.html
 

I assume you mean to have interesting CPU-intensive tasks offloaded to
the GPU, rather than a full port...

I looked into this and it seems like an innovative plan technically, but
most servers running PostgreSQL don't have a GPU. So that makes it more a
personal-computer opportunity than a business-server one, no?

Plus I wouldn't look at it ahead of a debugger being available.

If you wrote a CUDA function to perform a useful operation, that might be
a good starting place for a general understanding of how we might use this
in the future. PostgreSQL is extensible, so an add-in function might be
a useful module on pgfoundry. 

Maybe we could build in a hook to allow an external-function library to
provide sort capability... but I don't think anyone is going to take it
seriously for the database core without a greater-than-normal amount of
test results and prototypes.
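To make that concrete, here is a minimal sketch of what such an add-in
module's entry point could look like (hypothetical throughout: the
function name is made up, and the GPU call is stubbed out with qsort()
where a real module would launch a CUDA sort kernel):

    #include "postgres.h"
    #include "fmgr.h"
    #include "utils/array.h"

    PG_MODULE_MAGIC;

    static int
    cmp_float8(const void *a, const void *b)
    {
        float8  x = *(const float8 *) a;
        float8  y = *(const float8 *) b;

        if (x < y)
            return -1;
        if (x > y)
            return 1;
        return 0;
    }

    PG_FUNCTION_INFO_V1(gpu_sort_float8);

    Datum
    gpu_sort_float8(PG_FUNCTION_ARGS)
    {
        /* work on a copy so we never scribble on the caller's array */
        ArrayType  *arr = PG_GETARG_ARRAYTYPE_P_COPY(0);
        int         nitems;
        float8     *data;

        if (ARR_NDIM(arr) != 1 || ARR_HASNULL(arr))
            ereport(ERROR,
                    (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                     errmsg("gpu_sort_float8 expects a 1-D array without nulls")));

        nitems = ArrayGetNItems(ARR_NDIM(arr), ARR_DIMS(arr));
        data = (float8 *) ARR_DATA_PTR(arr);

        /* the GPU offload would happen here; qsort stands in for the kernel */
        qsort(data, nitems, sizeof(float8), cmp_float8);

        PG_RETURN_ARRAYTYPE_P(arr);
    }

Registered with CREATE FUNCTION ... LANGUAGE C STRICT, that would let
people benchmark a GPU path against the in-core sort without touching
the backend at all.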

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Zeugswetter Andreas ADI SD

  The plural seems better to me; there's no such thing as a solitary
  synchronized scan, no?  The whole point of the feature is to affect
  the behavior of multiple scans.
 
 +1. The plural is important IMHO.

ok, good.

 As I stated earlier, I don't really like this argument (we already
 broke badly designed applications a few times in the past) but we
 really need a way to guarantee that the execution of a query is stable
 and doesn't depend on external factors. And the original problem was
 to guarantee that pg_dump builds a dump as identical as possible to
 the existing data by ignoring external factors. It's now the case with
 your patch.
 The fact that it allows us not to break existing applications relying
 too much on physical ordering is a nice side effect though :).

One more question. Would it be possible for a session that turned off
synchronized_seqscans to still be a pack leader for other, later
sessions? Do/should we consider that?

The procedure would be:
start from page 0
only if no other pack is present, fill in the current scan position for others
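Very roughly, in heapam's scan setup that could look like this (a
hand-wavy sketch against the 8.3 syncscan API; the "no other pack
present" test is the part that does not exist today):

    /* hypothetical variant of initscan() */
    if (synchronize_seqscans)
        scan->rs_startblock = ss_get_location(scan->rs_rd, scan->rs_nblocks);
    else
    {
        scan->rs_startblock = 0;

        /*
         * Proposed: even with the GUC off, keep publishing our position
         * via ss_report_location() as we scan -- but only when no other
         * pack is already present -- so that later scans can join in
         * behind us.
         */
    }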

Andreas



Re: [HACKERS] MSVC Build error

2008-01-30 Thread Magnus Hagander
On Sun, Jan 27, 2008 at 10:04:38PM +0100, Gevik Babakhani wrote:
 
  Do you have the dumpbin command available in the path?
  
  //Magnus
  
 
 :) yes. This is why I do not understand why the command does not run
 correctly!

Strange indeed. I recognise this problem from earlier, but I thought it had
been fixed a long time ago. Can you confirm that you are building
8.3-current and not 8.2 here?


Also, looking at the original one it seems that it has dumped *some* files,
but not all. Could you modify gendef.pl line 22 to include the filename
that it's failing on, to see if that gets you further?

//Magnus



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Kenneth Marshall
On Wed, Jan 30, 2008 at 10:56:47AM +0100, Zeugswetter Andreas ADI SD wrote:
 
   The plural seems better to me; there's no such thing as a solitary
   synchronized scan, no?  The whole point of the feature is to affect
   the behavior of multiple scans.
  
  +1. The plural is important IMHO.
 
 ok, good.
 
  As I stated earlier, I don't really like this argument (we already
  broke badly designed applications a few times in the past) but we
  really need a way to guarantee that the execution of a query is stable
  and doesn't depend on external factors. And the original problem was
  to guarantee that pg_dump builds a dump as identical as possible to
  the existing data by ignoring external factors. It's now the case with
  your patch.
  The fact that it allows us not to break existing applications relying
  too much on physical ordering is a nice side effect though :).
 
 One more question. Would it be possible for a session that turned off
 synchronized_seqscans to still be a pack leader for other, later
 sessions? Do/should we consider that?
 
 The procedure would be:
 start from page 0
 only if no other pack is present, fill in the current scan position for others
 

I think that allowing other scans to use the scan started by a query that
disabled the sync scans would have value. It would prevent these types
of queries from completely tanking the I/O.

+1

Ken



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Alvaro Herrera
Zeugswetter Andreas ADI SD wrote:
 
   The plural seems better to me; there's no such thing as a solitary
   synchronized scan, no?  The whole point of the feature is to affect
   the behavior of multiple scans.
  
  +1. The plural is important IMHO.
 
 ok, good.

Hmm, if you guys are going to add another GUC variable, please hurry
because we have to translate the description text.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Simon Riggs
On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Rather than having a boolean GUC, we should have a number and make the
  parameter synchronised_scan_threshold.
 
 This would open up a can of worms I'd prefer not to touch, having to do
 with whether the buffer-access-strategy behavior should track that or
 not.  As the note in heapam.c says,
 
  * If the table is large relative to NBuffers, use a bulk-read access
  * strategy and enable synchronized scanning (see syncscan.c).  Although
  * the thresholds for these features could be different, we make them the
  * same so that there are only two behaviors to tune rather than four.
 
 It's a bit late in the cycle to be revisiting that choice.  Now we do
 already have three behaviors to worry about (BAS on and syncscan off)
 but throwing in a randomly settable knob will take it back to four,
 and we have no idea how that fourth case will behave.  The other tack we
 could take (having the one GUC variable control both thresholds) is
 not good since it will result in pg_dump trashing the buffer cache.
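For context, the shared threshold being discussed is roughly this test
in initscan() (a paraphrased sketch, assuming the NBuffers/4 cutoff):

    if (!RelationUsesLocalBuffers(relation) &&
        scan->rs_nblocks > NBuffers / 4)
    {
        allow_strat = true;     /* bulk-read buffer access strategy */
        allow_sync = true;      /* synchronized scanning */
    }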

I'm still not very happy with any of the options here.

BAS is great if you didn't want to trash the cache, but it's also
annoying to people who really did want to load a large table into
cache. However we set it, we're going to have problems because not
everybody has the same database.

We're trying to guess which data is in memory and which is on disk and
then act accordingly. That question cannot be answered
solely by how big shared_buffers is. It really ought to be a combination
of (at least) shared_buffers and total database size. I think we must
either put some more intelligence into the setting of the threshold, or
give it to the user as a parameter, possibly one not
mentioned in the sample .conf.

If we set the threshold unintelligently or in a way that cannot be
overridden, we will still get weird bug reports from people who set
shared_buffers higher and got a performance drop.

We need to make a final decision on this quickly, so I'll say no more on
this for 8.3 to help that process.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [HACKERS] Truncate Triggers

2008-01-30 Thread Simon Riggs
On Fri, 2008-01-25 at 10:44 -0500, Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  Notes: As the syntax shows, these would be statement-level triggers
  (only). Requesting row level triggers will cause an error. [As Chris
  Browne explained, if people really want, they can use these facilities
  to create a Before Statement trigger that executes a DELETE, which then
  fires row level calls.]
 
 Is there a way for a BS trigger to return a flag to skip the statement,
 as there is for BR?

I've got a working version of truncate triggers now; it will be posted
to -patches shortly.

Answering the above question is the last point of the implementation.
ISTM it would be best to think of it as a separate and not-very related
feature, implemented as a separate patch, if we decide we really do want
that. It doesn't seem important to do that for replication, which was
the main use case for truncate triggers.

Currently, BS trigger functions return NULL. This is handled in various
ways within each PL and is specifically tested for within the main trigger
exec code. Returning different information in some form or other would
be required to signal "skip the main statement". FOR EACH ROW triggers
return NULL when they want to skip the change for that row, so the
current implementation is the wrong way round for BS triggers. I'm not
sure how to handle that in a way that makes obvious sense for future
trigger developers, so suggestions welcome.

So allowing us to skip commands as a result of statement-level triggers
is as much work for INSERT, UPDATE, DELETE and TRUNCATE together as it
is for TRUNCATE alone. I also think that if we did do that for TRUNCATE
it would be useful to do for the other commands anyway. The SQL standard
doesn't say we *can't* do this.

Having said that, some PLs simply ignore the return value from BS
triggers, so interpreting return values in new ways might make existing
trigger code break or behave differently. So if we did BS trigger
skipping for all statement types, we would need to introduce the
concept slowly over a couple of releases, with a trigger-behaviour
parameter that is non-default at first and default later.

I've written the truncate trigger handling in such a way that it would
be straightforward to extend this to include statement skipping, should
we do it in the future.

Can we just skip statement skipping?

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [HACKERS] autonomous transactions

2008-01-30 Thread Josh Berkus

All,



Added to TODO:

* Add autonomous transactions

  http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php



IMHO, autonomous transactions should be part of a package with a 
spec-compliant CREATE PROCEDURE statement. That is, the difference 
between PROCEDURES and FUNCTIONS would be that:


-- PROCs have autonomous transactions
-- PROCs have to be executed with CALL, and can't go in a query
-- PROCs don't necessarily return a result

--Josh Berkus



Re: [HACKERS] autonomous transactions

2008-01-30 Thread Alvaro Herrera
Josh Berkus wrote:
 All,


 Added to TODO:

 * Add autonomous transactions

   http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php

 IMHO, autonomous transactions should be part of a package with a  
 spec-compliant CREATE PROCEDURE statement.

IMHO we should try to get both things separately, otherwise we will
never get either.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Tom Lane
Alvaro Herrera [EMAIL PROTECTED] writes:
 Hmm, if you guys are going to add another GUC variable, please hurry
 because we have to translate the description text.

Yeah, I'm going to put it in today --- just the on/off switch.
Any discussions of exposing threshold parameters will have to wait
for 8.4.

regards, tom lane



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Tom Lane
Simon Riggs [EMAIL PROTECTED] writes:
 I'm still not very happy with any of the options here.

 BAS is great if you didn't want to trash the cache, but it's also
 annoying to people who really did want to load a large table into
 cache. However we set it, we're going to have problems because not
 everybody has the same database.

That argument leads immediately to the conclusion that you need
per-table control over the behavior.  Which maybe you do, but it's
far too late to be proposing it for 8.3.  We should put this whole
area of more-control-over-BAS-and-syncscan on the TODO agenda.

regards, tom lane



Re: [HACKERS] Will PostgreSQL get ported to CUDA?

2008-01-30 Thread Tom Lane
Dann Corbit [EMAIL PROTECTED] writes:
 http://www.nvidia.com/object/cuda_get.html

The license terms here seem to be sufficient reason why not.

regards, tom lane



[HACKERS] Will PostgreSQL get ported to CUDA?

2008-01-30 Thread Christopher Browne
2008/1/30 Dann Corbit [EMAIL PROTECTED]:
http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html

 http://www.nvidia.com/object/cuda_learn.html

 http://www.nvidia.com/object/cuda_get.html

Someone at CMU has tried this, somewhat fruitfully.

http://www.andrew.cmu.edu/user/ngm/15-823/project/Draft.pdf
http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf

This was based on GPUSort:
http://gamma.cs.unc.edu/GPUSORT/

Unfortunately, the licensing of GPUSort is, if anything, more awful
than that for CUDA.
http://gamma.cs.unc.edu/GPUSORT/terms.html

This would need to be pretty much totally reimplemented to be useful with
PostgreSQL.  Happily, we actually have some evidence that the exercise
would be of some value.  Further, it looks to me like the
implementation was done in a pretty naive way.
Something done more seriously would likely be radically better...
-- 
http://linuxfinances.info/info/linuxdistributions.html
The definition of insanity is doing the same thing over and over and
expecting different results.  -- assortedly attributed to Albert
Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Tom Lane
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
 One more question. Would it be possible for a session that turned off
 synchronized_seqscans to still be a pack leader for other, later
 sessions? Do/should we consider that?

Seems like a reasonable thing to consider ... for 8.4.  I'm not willing
to go poking the syncscan code that much at this late point in the 8.3
cycle.  (I'm not sure if it's been mentioned yet on -hackers, but the
current plan is to freeze 8.3.0 tomorrow evening.)

regards, tom lane



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Simon Riggs
On Wed, 2008-01-30 at 13:07 -0500, Tom Lane wrote:
 Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes:
  One more question. Would it be possible for a session that turned off
  synchronized_seqscans to still be a pack leader for other, later
  sessions? Do/should we consider that?
 
 Seems like a reasonable thing to consider ... for 8.4. 

Definitely. I thought about this the other day and decided it had some
strange behaviour in some circumstances, so it wouldn't be desirable
overall.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Heikki Linnakangas

Tom Lane wrote:

Simon Riggs [EMAIL PROTECTED] writes:

I'm still not very happy with any of the options here.



BAS is great if you didn't want to trash the cache, but it's also
annoying to people who really did want to load a large table into
cache. However we set it, we're going to have problems because not
everybody has the same database.


That argument leads immediately to the conclusion that you need
per-table control over the behavior.


It's even worse than that. Elsewhere in this thread Simon mentioned a 
partitioned table, where each partition on its own is smaller than the 
threshold, but you're seq scanning several partitions and the total size 
of the seq scans is larger than memory size. In that scenario, you would 
want BAS and synchronized scans, but even a per-table setting wouldn't 
cut it.


One idea would be to look at the access plan and estimate the total size 
of all scans in the plan. Another idea would be to switch to a more seq 
scan resistant cache replacement algorithm, and forget about the threshold.


For synchronized scans to help in the partitioned situation, I guess 
you'd want to synchronize across partitions. If someone is already 
scanning partition 5, you'd want to start from that partition and join 
the pack, instead of starting from partition 1.


I think we'll be wiser after we see some real-world use of what we have 
there now...


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Simon Riggs
On Wed, 2008-01-30 at 18:42 +, Heikki Linnakangas wrote:
 Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
  I'm still not very happy with any of the options here.
  
  BAS is great if you didn't want to trash the cache, but it's also
  annoying to people who really did want to load a large table into
  cache. However we set it, we're going to have problems because not
  everybody has the same database.
  
  That argument leads immediately to the conclusion that you need
  per-table control over the behavior.
 
 It's even worse than that. Elsewhere in this thread Simon mentioned a 
 partitioned table, where each partition on its own is smaller than the 
 threshold, but you're seq scanning several partitions and the total size 
 of the seq scans is larger than memory size. In that scenario, you would 
 want BAS and synchronized scans, but even a per-table setting wouldn't 
 cut it.

 For synchronized scans to help in the partitioned situation, I guess 
 you'd want to synchronize across partitions. If someone is already 
 scanning partition 5, you'd want to start from that partition and join 
 the pack, instead of starting from partition 1.

You're right, but in practice it's not quite that bad with the
multi-table route. When you have partitions you generally exclude most
of them, with typically 1-2 per query, usually different ones.

If you were scanning lots of partitions in sequence so frequently that
you'd get benefit from synch scans then your partitioning scheme isn't
working for you - and that is the worst problem by far.

But yes, it does need to be addressed.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [HACKERS] Will PostgreSQL get ported to CUDA?

2008-01-30 Thread Gregory Stark
Christopher Browne [EMAIL PROTECTED] writes:

 This was based on GPUSort:
 http://gamma.cs.unc.edu/GPUSORT/

I looked briefly at GPUSort a while back. I couldn't see how to shoehorn into
Postgres the assumption that you're always sorting floating-point numbers. You
would have to add some property of data types that mapped every value to
a floating-point value. I suppose we could add that as a btree proc entry for
specific data types, but it would be a pretty radical change.
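For instance, such a per-type mapping might look like this (purely
hypothetical; no such btree support proc exists, and the name is made
up):

    /* map a Datum to a float8 whose ordering matches the type's own */
    static float8
    int4_sortkey(Datum value)
    {
        return (float8) DatumGetInt32(value);
    }

The hard part is types like text, where no order-preserving mapping
onto a single float exists, so a GPU sort could at best work on a key
prefix.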

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUCvariable

2008-01-30 Thread Heikki Linnakangas

Simon Riggs wrote:

On Wed, 2008-01-30 at 18:42 +, Heikki Linnakangas wrote:
It's even worse than that. Elsewhere in this thread Simon mentioned a 
partitioned table, where each partition on its own is smaller than the 
threshold, but you're seq scanning several partitions and the total size 
of the seq scans is larger than memory size. In that scenario, you would 
want BAS and synchronized scans, but even a per-table setting wouldn't 
cut it.


For synchronized scans to help in the partitioned situation, I guess 
you'd want to synchronize across partitions. If someone is already 
scanning partition 5, you'd want to start from that partition and join 
the pack, instead of starting from partition 1.


You're right, but in practice it's not quite that bad with the
multi-table route. When you have partitions you generally exclude most
of them, with typically 1-2 per query, usually different ones.


Yep. And in that case, you *don't* want BAS or sync scans to kick in, 
because you're only accessing a relatively small chunk of data, and it's 
worthwhile to cache it.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Will PostgreSQL get ported to CUDA?

2008-01-30 Thread Dann Corbit
 -Original Message-
 From: [EMAIL PROTECTED] [mailto:pgsql-hackers-
 [EMAIL PROTECTED] On Behalf Of Christopher Browne
 Sent: Wednesday, January 30, 2008 9:56 AM
 To: pgsql-hackers@postgresql.org
 Subject: [HACKERS] Will PostgreSQL get ported to CUDA?
 
 2008/1/30 Dann Corbit [EMAIL PROTECTED]:

 http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html
 
  http://www.nvidia.com/object/cuda_learn.html
 
  http://www.nvidia.com/object/cuda_get.html
 
 Someone at CMU has tried this, somewhat fruitfully.
 
 http://www.andrew.cmu.edu/user/ngm/15-823/project/Draft.pdf
 http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf
 
 This was based on GPUSort:
 http://gamma.cs.unc.edu/GPUSORT/
 
 Unfortunately, the licensing of GPUSort is, if anything, more awful
 than that for CUDA.
 http://gamma.cs.unc.edu/GPUSORT/terms.html
 
 This would need to be pretty much totally reimplemented to be useful with
 PostgreSQL.  Happily, we actually have some evidence that the exercise
 would be of some value.  Further, it looks to me like the
 implementation was done in a pretty naive way.
 Something done more seriously would likely be radically better...

It's too bad that they have a restrictive license.

Perhaps there is an opportunity to create an information appliance
that contains a special build of PostgreSQL, a nice heap of super-speedy
disk, and a big pile of GPUs for sort and merge type operations.  The
thing that seems nice to me about this idea is that you would have a
very stable test platform (all hardware and software combinations would
be known and thoroughly tested) and you might also get some extreme
performance.

I guess that a better sort than GPUSort could be written from scratch,
but legal entanglements with the use of the graphics cards may make the
whole concept DOA.




Re: [HACKERS] 8.3RC1 on windows missing descriptive Event handle names

2008-01-30 Thread Magnus Hagander

Stephen Denne wrote:

I said...

On Windows XP, using Process Explorer with the lower pane showing
Handles, not all postgres.exe processes include an Event
type with a description of what the process is doing.


I've had difficulty reproducing this, but I now suspect that it only
happens when running both v8.2 and v8.3rc1 at once, and I
think it is the second one started that is missing the process
descriptions.


That makes sense, really - I think you nailed it. We create a global 
event, and for those that are duplicates, it won't show up in the 
second process.


I think the solution to this is to add the process id to the name of the 
event. So instead of:

pgident: postgres: autovacuum launcher process

we'd have:

pgident(12345): postgres: autovacuum launcher process
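Concretely, wherever the event gets created that would amount to
something like this (a hypothetical sketch; ident_event and
ps_display_string are made-up names):

    char        eventname[128];
    HANDLE      ident_event;

    snprintf(eventname, sizeof(eventname), "pgident(%lu): %s",
             (unsigned long) GetCurrentProcessId(), ps_display_string);
    ident_event = CreateEvent(NULL, TRUE, FALSE, eventname);

With the pid baked into the name, two postmasters (or any two
processes) can never collide on the same global event name.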

Seems reasonable? I'll be able to write up and properly test a patch 
tomorrow.


//Magnus



[HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Dave Page
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00

/D



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Magnus Hagander

Dave Page wrote:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00


Maybe I shouldn't have had those beers after work today, but that looks 
like it's for example failing tsearch2, which hasn't been touched for 
over a month!


Any chance there's something dodgy in the build env?

(If I'm missing the obvious, I blame the beer!)

//Magnus




Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Magnus Hagander

Dave Page wrote:

On Jan 30, 2008 9:13 PM, Magnus Hagander [EMAIL PROTECTED] wrote:

Dave Page wrote:

http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00

Maybe I shouldn't have had those beers after work today, but that looks
like it's for example failing tsearch2, which hasn't been touched for
over a month!

Any chance there's something dodgy in the build env?


I can't remember the last time I logged into that box so if it's
something in the buildenv, it's either caused by a Windows update, or
some failing hardware.


I won't have access to my MSVC box until tomorrow, but unless beaten to 
it I can dig into it a bit more. I don't see anything obvious in the 
latest patches though (but again, that could be the beer :-P).


Any chance you could just do a forced run on it now to show if it was 
some kind of transient stuff?


//Magnus



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Dave Page
On Jan 30, 2008 9:21 PM, Magnus Hagander [EMAIL PROTECTED] wrote:

 I won't have access to my MSVC box until tomorrow, but unless beaten to
 it I can dig into it a bit more. I don't see anything obvious in the
 latest patches though (but again, that could be the beer :-P).

 Any chance you could just do a forced run on it now to show if it was
 some kind of transient stuff?

Not from here. :-(

/D



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Dave Page
On Jan 30, 2008 9:13 PM, Magnus Hagander [EMAIL PROTECTED] wrote:
 Dave Page wrote:
  http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00

 Maybe I shouldn't have had those beers after work today, but that looks
 like it's for example failing tsearch2, which hasn't been touched for
 over a month!

 Any chance there's something dodgy in the build env?

I can't remember the last time I logged into that box so if it's
something in the buildenv, it's either caused by a Windows update, or
some failing hardware.

/D



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Andrew Dunstan



Dave Page wrote:

On Jan 30, 2008 9:13 PM, Magnus Hagander [EMAIL PROTECTED] wrote:
  

Dave Page wrote:


http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00
  

Maybe I shouldn't have had those beers after work today, but that looks
like it's for example failing tsearch2, which hasn't been touched for
over a month!

Any chance there's something dodgy in the build env?



I can't remember the last time I logged into that box so if it's
something in the buildenv, it's either caused by a Windows update, or
some failing hardware.


  


None of the CVS changes in the relevant period seems to have any 
relation to the errors, so I suspect a local problem.


red_bat is due to build in a couple of hours, so we will soon see if it 
reproduces the error.


cheers

andrew




Re: [HACKERS] GSSAPI doesn't play nice with non-canonical host names

2008-01-30 Thread Tom Lane
Magnus Hagander [EMAIL PROTECTED] writes:
 On Sun, Jan 27, 2008 at 09:32:54PM -0500, Stephen Frost wrote:
 While I'm complaining: that's got to be one of the least useful error
 messages I've ever seen, and it's for a case that's surely going to be
 fairly common in practice.  Can't we persuade GSSAPI to produce
 something more user-friendly?  At least convert "7" to "Server not
 found in Kerberos database"?
 
 I agree, and have found it to be very frustrating while working w/
 Kerberos in general.  I *think* there's a library which can convert
 those error-codes (libcomm-err?), but I've not really looked into it
 yet.

 AFAIK, that one is for Kerberos only. For GSSAPI, we already use the
 gss_display_status function to get the error messages. I think the problem
 here is in the Kerberos library?

Yeah, I found it:
https://bugzilla.redhat.com/show_bug.cgi?id=430983

The best fix is not entirely clear, but in any case it's not our bug.

regards, tom lane



Re: [HACKERS] Will PostgreSQL get ported to CUDA?

2008-01-30 Thread Simon Riggs
On Wed, 2008-01-30 at 17:55 +, Christopher Browne wrote:
 2008/1/30 Dann Corbit [EMAIL PROTECTED]:
 http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html
 
  http://www.nvidia.com/object/cuda_learn.html
 
  http://www.nvidia.com/object/cuda_get.html
 
 Someone at CMU has tried this, somewhat fruitfully.
 
 http://www.andrew.cmu.edu/user/ngm/15-823/project/Draft.pdf
 http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf

Well done that man! Excellent piece of research.

Clearly GPUsort is cool; is it cool enough? Here are a few thoughts and
questions that we still need answers to:

The concept of CPU offload can be generalised to any specialised
hardware. Can we offload such tasks easily? If so, to what? Should it be
a GPU, or just another more general CPU? Is the cost and difficulty of
making the GPU work in generalised form a better investment than spending
that money on more resources, e.g. memory?

Can the sorting network really be reused in the general case, or must we
realistically recreate it for each new sort set?

Can we have multiple concurrent sorts on the GPU, or is it one user at a
time? Would we need multiple GPUs? Is such an architecture available?

I note that the comparison with HeapSort is worst case, since we sort 1
GB of memory without increasing work_mem beyond 1MB.

There doesn't seem to be a discussion of how GPUsort would handle sorts
too large to fit within the GPU, so we would need to have an external
sort mechanism. So if qsort is better for smaller inputs and external
sorts are needed for larger, then there seems to be a narrow-ish middle
band of benefit.

Another thought would be to replace the external heap sort with an external
sort based around qsort or GPUsort, which would extend the range of
usefulness. But then we're back to redesigning external sorts.

-- 
  Simon Riggs
  2ndQuadrant  http://www.2ndQuadrant.com 




Re: [HACKERS] [PATCHES] Better default_statistics_target

2008-01-30 Thread Decibel!
On Mon, Jan 28, 2008 at 11:14:05PM +, Christopher Browne wrote:
 On Dec 6, 2007 6:28 PM, Decibel! [EMAIL PROTECTED] wrote:
  FWIW, I've never seen anything but a performance increase or no change
  when going from 10 to 100. In most cases there's a noticeable
  improvement since it's common to have over 100k rows in a table, and
  there's just no way to capture any kind of a real picture of that with
  only 10 buckets.
 
 I'd be more inclined to try to do something that was at least somewhat
 data aware.
 
 The interesting theory that I'd like to verify if I had a chance
 would be to run through a by-column tuning using a set of heuristics.
 My first order approximation would be:
 
 - If a column defines a unique key, then we know there will be no
 clustering of values, so no need to increase the count...
 
 - If a column contains a datestamp, then the distribution of values is
 likely to be temporal, so no need to increase the count...
 
 - If a column has a highly constricted set of values (e.g. - boolean),
 then we might *decrease* the count.
 
 - We might run a query that runs across the table, looking at
 frequencies of values, and if it finds a lot of repeated values, we'd
 increase the count.
 
 That's a bit hand-wavy, but that could lead to both increases and
 decreases in the histogram sizes.  Given that, we can expect the
 overall stat sizes to not forcibly need to grow *enormously*, because
 we can hope for there to be cases of shrinkage.

I think that before doing any of that you'd be much better off
investigating how much performance penalty there is for maxing out
default_statistics_target. If, as I suspect, it's essentially 0 on
modern hardware, then I don't think it's worth any more effort.

BTW, that investigation wouldn't just be academic either; if we could
convince ourselves that there normally wasn't any cost associated with a
high default_statistics_target, we could increase the default, which
would reduce the amount of traffic we'd see on -performance about bad
query plans.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828




Re: [HACKERS] [PATCHES] Better default_statistics_target

2008-01-30 Thread Gregory Stark
Decibel! [EMAIL PROTECTED] writes:

 I think that before doing any of that you'd be much better off
 investigating how much performance penalty there is for maxing out
 default_statistics_target. If, as I suspect, it's essentially 0 on
 modern hardware, then I don't think it's worth any more effort.

That's not my experience. Even just raising it to 100 multiplies the number of
rows ANALYZE has to read by 10. And the arrays for every column become ten
times larger. Eventually they start being toasted...
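(For scale: ANALYZE samples on the order of 300 * statistics_target
rows per table, so the default of 10 means roughly 3,000 sampled rows
while 100 means roughly 30,000.)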

 BTW, that investigation wouldn't just be academic either; if we could
 convince ourselves that there normally wasn't any cost associated with a
 high default_statistics_target, we could increase the default, which
 would reduce the amount of traffic we'd see on -performance about bad
 query plans.

I suspect we could raise it, we just don't know by how much.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 None of the CVS changes in the relevant period seems to have any 
 relation to the errors, so I suspect a local problem.

skylark and baiji are now red too, so I guess that theory is dead in the
water.  Something in today's changes broke the MSVC build, but what?

I diffed yesterday's and today's make logs from skylark, and found
nothing interesting except this:

***
*** 605,611 
  Generate DEF file^M
  Generating POSTGRES.DEF from directory Release\postgres^M
  
\
..\
.^M
! Generated 5208 symbols^M
  Linking...^M
 Creating library Release\postgres\postgres.lib and object 
Release\postgres\postgres.exp^M
  Embedding manifest...^M
--- 605,611 
  Generate DEF file^M
  Generating POSTGRES.DEF from directory Release\postgres^M
  
\
..\
.^M
! Generated 5205 symbols^M
  Linking...^M
 Creating library Release\postgres\postgres.lib and object 
Release\postgres\postgres.exp^M
  Embedding manifest...^M
***

Presumably the three missing symbols include the two that are being
complained of later, but what the heck?

(Hmm, actually today's commits should have added two global symbols to
the backend, so it seems there are five not three symbols to be
accounted for.)

It is probably significant that both of the known missing symbols come
from guc.c, which we added another variable to today.  I have a
sickening feeling that we have hit some kind of undocumented internal
limit in MSVC as to the number of symbols imported/exported by one
source file...

regards, tom lane



Re: [HACKERS] [PATCHES] Better default_statistics_target

2008-01-30 Thread Guillaume Smet
On Jan 31, 2008 12:08 AM, Gregory Stark [EMAIL PROTECTED] wrote:
 Decibel! [EMAIL PROTECTED] writes:

  I think that before doing any of that you'd be much better off
  investigating how much performance penalty there is for maxing out
  default_statistics_target. If, as I suspect, it's essentially 0 on
  modern hardware, then I don't think it's worth any more effort.

 That's not my experience. Even just raising it to 100 multiplies the number of
 rows ANALYZE has to read by 10. And the arrays for every column become ten
 times larger. Eventually they start being toasted...

+1. From the tests I did on our new server, I set the
default_statistics_target to 30. Those tests were mainly based on the
ANALYZE time though, not the planner overhead introduced by larger
statistics - with higher values, I considered the ANALYZE time too
high for the benefits. I set it higher on a per-column basis only if I
see it can lead to better stats, but from all the tests I did so far,
it was sufficient for our data set.

--
Guillaume



Re: [HACKERS] [PATCHES] Better default_statistics_target

2008-01-30 Thread Tom Lane
Guillaume Smet [EMAIL PROTECTED] writes:
 On Jan 31, 2008 12:08 AM, Gregory Stark [EMAIL PROTECTED] wrote:
 That's not my experience. Even just raising it to 100 multiplies the
 number of rows ANALYZE has to read by 10. And the arrays for every
 column become ten times larger. Eventually they start being toasted...

 +1. From the tests I did on our new server, I set the
 default_statistics_target to 30. Those tests were mainly based on the
 ANALYZE time though, not the planner overhead introduced by larger
 statistics - with higher values, I considered the ANALYZE time too
 high for the benefits.

eqjoinsel(), for one, is O(N^2) in the number of MCV values kept.
Possibly this could be improved, but in general I'd be real wary
of pushing the default to the moon without some explicit testing of
the impact on planning time.
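(To put a number on that: with the default of 10 MCVs per side that is
on the order of 100 comparisons per join clause; at a target of 1000 it
becomes on the order of a million.)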

regards, tom lane



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Tom Lane
I wrote:
 I diffed yesterday's and today's make logs from skylark, and found
 nothing interesting except this:

 ***
 *** 605,611 
   Generating POSTGRES.DEF from directory Release\postgres^M
 ! Generated 5208 symbols^M
   Linking...^M
 --- 605,611 
   Generating POSTGRES.DEF from directory Release\postgres^M
 ! Generated 5205 symbols^M
   Linking...^M
 ***

Looking at this a bit closer, I realize that it's coming from
gendef.pl's dumpbin usage of recent infamy.  So there are a couple
of ideas that come to mind:

* Has the buildfarm script changed recently in a way that might change
the execution PATH and thereby suck in a different version of dumpbin?
(Or even a different version of Perl?)

* Is it conceivable that dumpbin's output format has changed in a way
that confuses the bit of Perl code that's parsing it?  One idea that
comes to mind is that it contains a timestamp that just got wider ---
I remember seeing some bugs like that when the value of Unix time_t
reached 1 billion and became 10 instead of 9 digits.

Neither of these sound very plausible, but it seems the next step for
investigation is to look closely at what's happening in gendef.pl.

regards, tom lane



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Tom Lane
Dave Page [EMAIL PROTECTED] writes:
 I can't remember the last time I logged into that box so if it's
 something in the buildenv, it's either caused by a Windows update,

Re-reading the thread ... could that last point be significant?  Are
all four of these boxen set to auto-accept updates from Redmond?

regards, tom lane



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Andrew Dunstan



Tom Lane wrote:


* Has the buildfarm script changed recently in a way that might change
the execution PATH and thereby suck in a different version of dumpbin?
(Or even a different version of Perl?)
  



No. In at least the case of red_bat nothing has changed for months.


* Is it conceivable that dumpbin's output format has changed in a way
that confuses the bit of Perl code that's parsing it?  One idea that
comes to mind is that it contains a timestamp that just got wider ---
I remember seeing some bugs like that when the value of Unix time_t
reached 1 billion and became 10 instead of 9 digits.

Neither of these sound very plausible, but it seems the next step for
investigation is to look closely at what's happening in gendef.pl.


  


Right. I agree that your diff makes gendef.pl the prime suspect.

You also just said:

Dave Page [EMAIL PROTECTED] writes:
  

 I can't remember the last time I logged into that box so if it's
 something in the buildenv, it's either caused by a Windows update,



Re-reading the thread ... could that last point be significant?  Are
all four of these boxen set to auto-accept updates from Redmond?


No. red_bat does not auto-accept anything.

cheers

andrew



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Bruce Momjian
Tom Lane wrote:
 Simon Riggs [EMAIL PROTECTED] writes:
  I'm still not very happy with any of the options here.
 
  BAS is great if you didn't want to trash the cache, but it's also
  annoying to people who really did want to load a large table into
  cache. However we set it, we're going to have problems because not
  everybody has the same database.
 
 That argument leads immediately to the conclusion that you need
 per-table control over the behavior.  Which maybe you do, but it's
 far too late to be proposing it for 8.3.  We should put this whole
 area of more-control-over-BAS-and-syncscan on the TODO agenda.

Another question --- why don't we just turn off synchronized_seqscans
when we do COPY TO?  That would fix pg_dump and be transparent.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Bruce Momjian
Bruce Momjian wrote:
 Tom Lane wrote:
  Simon Riggs [EMAIL PROTECTED] writes:
   I'm still not very happy with any of the options here.
  
   BAS is great if you didn't want to trash the cache, but it's also
   annoying to people who really did want to load a large table into
   cache. However we set it, we're going to have problems because not
   everybody has the same database.
  
  That argument leads immediately to the conclusion that you need
  per-table control over the behavior.  Which maybe you do, but it's
  far too late to be proposing it for 8.3.  We should put this whole
  area of more-control-over-BAS-and-syncscan on the TODO agenda.
 
 Another question --- why don't we just turn off synchronized_seqscans
 when we do COPY TO?  That would fix pg_dump and be transparent.

Sorry, I was unclear. I meant don't have a GUC at all but just set an
internal variable to turn off synchronized sequential scans when we do
COPY TO.
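A minimal sketch of that idea (hypothetical; a real patch would need to
restore the value on error, e.g. with PG_TRY):

    /* somewhere in the COPY TO code path */
    bool    save_sync = synchronize_seqscans;

    synchronize_seqscans = false;   /* scan starts at block 0, joins no pack */
    /* ... run the heap scan and emit the rows ... */
    synchronize_seqscans = save_sync;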

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Another question --- why don't we just turn off synchronized_seqscans
 when we do COPY TO?  That would fix pg_dump and be transparent.

Enforcing this from the server side seems a pretty bad idea.  Note that
there were squawks about having pg_dump behave this way at all; if the
control is on the pg_dump side then at least we have the chance to make
it a user option later.

Also, you forgot about pg_dump -d.

regards, tom lane



Re: [HACKERS] [PATCHES] Better default_statistics_target

2008-01-30 Thread Christopher Browne
On Jan 30, 2008 5:58 PM, Decibel! [EMAIL PROTECTED] wrote:

 On Mon, Jan 28, 2008 at 11:14:05PM +, Christopher Browne wrote:
  On Dec 6, 2007 6:28 PM, Decibel! [EMAIL PROTECTED] wrote:
   FWIW, I've never seen anything but a performance increase or no change
   when going from 10 to 100. In most cases there's a noticeable
   improvement since it's common to have over 100k rows in a table, and
   there's just no way to capture any kind of a real picture of that with
   only 10 buckets.
 
  I'd be more inclined to try to do something that was at least somewhat
  data aware.
 
  The interesting theory that I'd like to verify if I had a chance
  would be to run through a by-column tuning using a set of heuristics.
  My first order approximation would be:
 
  - If a column defines a unique key, then we know there will be no
  clustering of values, so no need to increase the count...
 
  - If a column contains a datestamp, then the distribution of values is
  likely to be temporal, so no need to increase the count...
 
  - If a column has a highly constricted set of values (e.g. - boolean),
  then we might *decrease* the count.
 
  - We might run a query that runs across the table, looking at
  frequencies of values, and if it finds a lot of repeated values, we'd
  increase the count.
 
  That's a bit hand-wavy, but that could lead to both increases and
  decreases in the histogram sizes.  Given that, we can expect the
  overall stat sizes to not forcibly need to grow *enormously*, because
  we can hope for there to be cases of shrinkage.

 I think that before doing any of that you'd be much better off
 investigating how much performance penalty there is for maxing out
 default_statistics_target. If, as I suspect, it's essentially 0 on
 modern hardware, then I don't think it's worth any more effort.

 BTW, that investigation wouldn't just be academic either; if we could
 convince ourselves that there normally wasn't any cost associated with a
 high default_statistics_target, we could increase the default, which
 would reduce the amount of traffic we'd see on -performance about bad
 query plans.

There seems to be *plenty* of evidence out there that the performance
penalty would NOT be essentially zero.

Tom points out:
   eqjoinsel(), for one, is O(N^2) in the number of MCV values kept.

It seems to me that there are cases where we can *REDUCE* the
histogram width, and if we do that, and then pick and choose the
columns where the width increases, the performance penalty may be
yea, verily *actually* 0.

This fits somewhat with Simon Riggs' discussion earlier in the month
about Segment Exclusion; these both represent cases where it is quite
likely that there is emergent data in our tables that can help us to
better optimize our queries.
-- 
http://linuxfinances.info/info/linuxdistributions.html
The definition of insanity is doing the same thing over and over and
expecting different results.  -- assortedly attributed to Albert
Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Another question --- why don't we just turn off synchronized_seqscans
  when we do COPY TO?  That would fix pg_dump and be transparent.
 
 Enforcing this from the server side seems a pretty bad idea.  Note that
 there were squawks about having pg_dump behave this way at all; if the
 control is on the pg_dump side then at least we have the chance to make
 it a user option later.
 
 Also, you forgot about pg_dump -d.

OK, but keep in mind if we use synchronized_seqscans in pg_dump we will
have to recognize that GUC forever.

-- 
  Bruce Momjian  [EMAIL PROTECTED]http://momjian.us
  EnterpriseDB http://postgres.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Andrew Dunstan



Tom Lane wrote:


Neither of these sound very plausible, but it seems the next step for
investigation is to look closely at what's happening in gendef.pl.


  


Yes, I have found the problem. It is this line, which I am amazed hasn't 
bitten us before:

   next unless /^\d/;

The first field in the dumpbin output looks like a 3-digit hex number. 
The line on my system for GetConfigOptionByName starts with 'A02', which 
of course fails the test above.


For now I'm going to try to fix it by changing it to:

   next unless $pieces[0] =~ /^[A-F0-9]{3}$/;

I also propose to have the gendef.pl script save the dumpbin output so 
this sort of problem will be easier to debug.


cheers

andrew



Re: [HACKERS] Oops - BF:Mastodon just died

2008-01-30 Thread Tom Lane
Andrew Dunstan [EMAIL PROTECTED] writes:
 Yes, I have found the problem. It is this line, which I am amazed hasn't 
 bitten us before:
 next unless /^\d/;
 The first field in the dumpbin output looks like a 3 digit hex number. 

Argh, so it was crossing a hex-digit boundary that got us.  Good catch.

 For now I'm going to try to fix it by changing it to:
 next unless $pieces[0] =~ /^[A-F0-9]{3}$/;

Check.

 I also propose to have the gendef.pl script save the dumpbin output so 
 this sort of problem will be easier to debug.

Agreed, but I suggest waiting till 8.4 is branched unless you are really
sure about this addition.  We freeze for 8.3.0 in less than 24 hours.

regards, tom lane



Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable

2008-01-30 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 OK, but keep in mind if we use synchronized_seqscans in pg_dump we will
 have to recognize that GUC forever.

No, because it's being used on the query side, not in the emitted dump.
We have *never* promised that pg_dump version N could dump from server
version N+1; in fact, personally I'd like to make that case a hard
error, rather than something people could override with -i.

regards, tom lane



Re: [HACKERS] Truncate Triggers

2008-01-30 Thread Decibel!
On Mon, Jan 28, 2008 at 09:09:13PM -0300, Alvaro Herrera wrote:
 Decibel! wrote:
  On Fri, Jan 25, 2008 at 11:40:19AM +, Simon Riggs wrote:
   (for 8.4 ...)
   I'd like to introduce triggers that fire when we issue a truncate:
  
  Rather than focusing exclusively on TRUNCATE, how about triggers that
  fire whenever any kind of DDL operation is performed? (Ok, truncate is
  more DML than DDL, but still).
 
 I don't think it makes sense in general.  For example, would we fire
 triggers on CLUSTER?  Or on ALTER TABLE / SET STATISTICS?

CLUSTER isn't DDL. Most forms of ALTER TABLE are. And CREATE blah, etc.

My point is that people have been asking for triggers that fire when
specific commands are executed for a long time; it would be
short-sighted to come up with a solution that only works for TRUNCATE if
we could instead come up with a more generic solution that works for a
broader class of (or perhaps all) commands.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828




Re: [HACKERS] [PATCHES] Better default_statistics_target

2008-01-30 Thread Decibel!
On Wed, Jan 30, 2008 at 09:13:37PM -0500, Christopher Browne wrote:
 There seems to be *plenty* of evidence out there that the performance
 penalty would NOT be essentially zero.
 
 Tom points out:
eqjoinsel(), for one, is O(N^2) in the number of MCV values kept.
 
 It seems to me that there are cases where we can *REDUCE* the
 histogram width, and if we do that, and then pick and choose the
 columns where the width increases, the performance penalty may be
 yea, verily *actually* 0.
 
 This fits somewhat with Simon Riggs' discussion earlier in the month
 about Segment Exclusion; these both represent cases where it is quite
 likely that there is emergent data in our tables that can help us to
 better optimize our queries.

This is all still hand-waving until someone actually measures what the
impact of the stats target is on planner time. I would suggest actually
measuring that before trying to invent more machinery. Besides, I think
you'll need that data for the machinery to make an intelligent decision
anyway...

BTW, with autovacuum I don't really see why we should care about how
long analyze takes, though perhaps it should have a throttle a la
vacuum_cost_delay.
-- 
Decibel!, aka Jim C. Nasby, Database Architect  [EMAIL PROTECTED] 
Give your computer some brain candy! www.distributed.net Team #1828

