Re: [HACKERS] O_DIRECT support for Windows

2007-03-28 Thread Magnus Hagander
On Wed, Mar 28, 2007 at 02:47:12PM +0900, ITAGAKI Takahiro wrote:
> Magnus Hagander <[EMAIL PROTECTED]> wrote:
> 
> > IIRC, we're still waiting for performance numbers showing there exists a
> > win from this patch.
> 
> Here are performance numbers for direct I/O support on Windows.
> There was a 10%+ performance win on pgbench (263.33 vs. 290.79 tps) with O_DIRECT.

That sounds good enough to go for it.


> However, I only have a desktop-class machine for Windows.
> (Pentium 4 3.6GHz with HT, 3GB of RAM, 2 ATA drives)
> Test on production-class machines might show different results.

Yes, that would be very good. I don't have any "server-grade" machines I
can run it on ATM. But perhaps someone else does. Stefan, you mentioned you
might have one to run other tests for me - can you run this one? Do you
have a build system on it?

> In addition, I'm slightly worried about the alignment issues reported by Magnus.
> We might need a fall-back-to-non-direct feature on error, for safety.

The question is: is there a risk that this works for weeks and then suddenly
stops working, or will we detect it on the first attempt to write? If we
detect it on the first attempt, that's enough, I think - the user can
configure it to use the old behaviour in that case. But if there is a risk
that we hit it later on, we need to fall back automatically at the time of
the write.
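
For what it's worth, here is a minimal sketch of the detect-on-first-write
idea, written against POSIX open()/O_DIRECT (the Windows patch would use
CreateFile with FILE_FLAG_NO_BUFFERING instead). The function name, the
alignment constant and the probe strategy are illustrative only; this is not
the submitted patch:

#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define WAL_ALIGN 8192          /* assumed alignment requirement */

int
open_direct_with_fallback(const char *path)
{
    void   *buf = NULL;
    int     fd = open(path, O_WRONLY | O_CREAT | O_DIRECT, 0600);

    if (fd < 0)                                 /* O_DIRECT refused outright */
        return open(path, O_WRONLY | O_CREAT, 0600);

    if (posix_memalign(&buf, WAL_ALIGN, WAL_ALIGN) != 0)
        return fd;                              /* cannot probe; keep direct fd */
    memset(buf, 0, WAL_ALIGN);

    /* Probe with one aligned block: if the filesystem rejects direct I/O
     * here, reopen buffered rather than failing at some much later write.
     * (A zero probe block is harmless for a pre-zeroed WAL segment.) */
    if (write(fd, buf, WAL_ALIGN) < 0 && errno == EINVAL)
    {
        close(fd);
        fd = open(path, O_WRONLY | O_CREAT, 0600);
    }
    else
        lseek(fd, 0, SEEK_SET);                 /* rewind after the probe */

    free(buf);
    return fd;
}

The point is simply that this failure mode shows up on the very first aligned
write, so falling back there keeps the user from hitting it weeks later.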

//Magnus




Re: [PATCHES] [HACKERS] Full page writes improvement, code update

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 10:54 +0900, Koichi Suzuki wrote:

> As written below, full page writes can be
> categorized as follows:
> 
> 1) Needed for crash recovery: first page update after each checkpoint.
> This has to be kept in WAL.
> 
> 2) Needed for archive recovery: page update between pg_start_backup and
> pg_stop_backup. This has to be kept in archive log.
> 
> 3) For log-shipping slave such as pg_standby: no full page writes will
> be needed for this purpose.
> 
> My proposal deals with 2). So, if we mark each "full_page_write", I'd
> rather mark it when it is needed. We still need only one bit because
> case 3) does not need any mark.

I'm very happy with this proposal, though I still have some points on the
details.

If you accept that 1 & 2 are valid goals, then 1 & 3 or 1, 2 & 3 are
also valid goals, ISTM. i.e. you might choose to use full_page_writes on
the primary and yet would like to see optimised data transfer to the
standby server. In that case, you would need the mark.

> > - Not sure why we need "full_page_compress", why not just mark them
> > always? That harms noone. (Did someone else ask for that? If so, keep
> > it)
> 
> No, no one asked to have a separate option, and there'd be no harm in
> doing so. So, if we mark each "full_page_write", I'd rather mark it
> when it is needed. We still need only one bit because case 3) does not
> need any mark.

OK, different question: 
Why would anyone ever set full_page_compress = off? 

Why have a parameter that does so little? ISTM this is:

i) one more thing to get wrong

ii) cheaper to mark the block when appropriate than to perform the if()
test each time. That can be done only in the path where backup blocks
are present.

iii) If we mark the blocks every time, it allows us to do an offline WAL
compression. If the blocks aren't marked that option is lost. The bit is
useful information, so we should have it in all cases.
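
To make point (iii) concrete, here is a toy sketch of what an offline
pg_compresslog-style filter could do with such a mark. The record layout and
flag names below are invented purely for the example; they are not the real
XLogRecord/BkpBlock format:

#include <stdint.h>

#define INFO_HAS_BACKUP_BLOCK   0x01
#define INFO_BACKUP_REMOVABLE   0x02    /* the proposed one-bit mark */

typedef struct ToyRecord
{
    uint8_t  info;          /* flag bits */
    uint32_t main_len;      /* length of the logical ("main") data */
    uint32_t bkp_len;       /* length of the trailing backup block, if any */
    char     data[8192];    /* main data followed by the backup block */
} ToyRecord;

/* Return the record's data length after stripping a removable backup block. */
static uint32_t
strip_removable_backup_block(ToyRecord *rec)
{
    if ((rec->info & INFO_HAS_BACKUP_BLOCK) &&
        (rec->info & INFO_BACKUP_REMOVABLE))
    {
        /* Drop the backup block; the logical data is enough downstream.
         * Because the backup block trails the main data in this toy layout,
         * dropping it is just a truncation. */
        rec->info &= ~(INFO_HAS_BACKUP_BLOCK | INFO_BACKUP_REMOVABLE);
        rec->bkp_len = 0;
    }
    return rec->main_len + rec->bkp_len;
}

Without the mark in the WAL itself, an offline tool has no way to tell which
backup blocks are safe to drop, which is the information-loss argument above.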

> > - OTOH I'd like to see an explicit parameter set during recovery since
> > you're asking the main recovery path to act differently in case a single
> > bit is set/unset. If you are using that form of recovery, we should say
> > so explicitly, to keep everybody else safe.
> 
> The only thing I had to do is create "dummy" full page writes to
> maintain LSNs. Full page writes are omitted in the archive log, but we
> have to keep the LSNs the same as those in the original WAL. In this
> case, recovery has to read the logical log, not the "dummy" full page
> writes. On the other hand, if both a logical log and a "real" full page
> write are found in a log record, recovery has to use the "real" full
> page write.

I apologise for not understanding your reply, perhaps my original
request was unclear.

In recovery.conf, I'd like to see a parameter such as

dummy_backup_blocks = off (default) | on

to explicitly indicate to the recovery process that backup blocks are
present, yet they are garbage and should be ignored. Having garbage data
within the system is potentially dangerous, and I want to be told by the
user that they were expecting that and that it's OK to ignore that data.
Otherwise I want to throw informative errors. Maybe it seems OK now, but
the next change to the system may have unintended consequences, and it
may not be us making the change. "It's OK, the Alien will never escape
from the lab" is the starting premise for many good sci-fi horrors and I
want to watch them, not be in one myself. :-)

We can call it other things, of course. e.g.
ignore_dummy_blocks
decompressed_blocks
apply_backup_blocks

> Yes, I believe so. As pg_standby has no chance of encountering
> partial writes of pages, I believe you can omit all the full page
> writes. Of course, as Tom Lane suggested in
> http://archives.postgresql.org/pgsql-hackers/2007-02/msg00034.php
> removing full page writes loses the chance to recover from
> partial/inconsistent writes if pg_standby crashes. In that case,
> we have to import a backup and archive logs (with full page writes
> taken during the backup) to recover. (We have to import them when the
> file system crashes anyway.) If that's okay, I believe
> pg_compresslog/pg_decompresslog can be integrated with pg_standby.
> 
> Maybe we can work together to include pg_compresslog/pg_decompresslog in 
> pg_standby.

ISTM there are two options.

I think this option is already possible:

1. Allow pg_decompresslog to operate on a file, replacing it with the
expanded form, like gunzip, so we would do this:
  restore_command = 'pg_standby %f decomp.tmp && pg_decompresslog
decomp.tmp %p'

though the decomp.tmp file would not get properly initialised or cleaned
up when we finish.

whereas this will take additional work

2. Allow pg_standby to write to stdout (and pg_decompresslog to read from
stdin), so that we can do this:
  restore_command = 'pg_standby %f | pg_decompresslog - %p'

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com




Re: [HACKERS] Concurrent connections in psql

2007-03-28 Thread Heikki Linnakangas

Tom Lane wrote:

"Simon Riggs" <[EMAIL PROTECTED]> writes:

I would love, love, love to be able to use this syntax within pg_dump as
well, so we can create multiple indexes in parallel at restore time.


I can hardly conceive of greater folly than putting an *experimental*
psql facility into pg_dump scripts, thereby forcing us to support it
until the end of time.


We could write an [awk | perl | scripting language of your choice] script 
to transform CREATE INDEX statements in normal pg_dump output to use 
the concurrent psql commands. That way we don't need to modify pg_dump. 
Whether or not to build indexes should be a restore-time, not dump-time, 
option anyway. Whether you want it or not depends on the target, not 
the source.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Patch queue concern

2007-03-28 Thread Pavel Stehule

Hello,

In the patch queue I found only "custom variables protection, Pavel Stehule", 
which you removed, and I didn't find my patch for scrollable cursors in 
plpgsql.


Regards
Pavel Stehule






Re: [HACKERS] Patch queue concern

2007-03-28 Thread Bruce Momjian
Josh Berkus wrote:
> Bruce,
> 
> > However, with feature freeze coming on Sunday, I am worried because
> > there are a significant number of patches that are not ready for
> > review because they have not been completed by their authors.
> 
> Can you flag those somehow?

I have sent out email on every one in the past few days, with the lists
cc'ed.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +



Re: [HACKERS] Patch queue concern

2007-03-28 Thread Gregory Stark
"Bruce Momjian" <[EMAIL PROTECTED]> writes:

> Right now, all the patches I think are ready for review are in the patch
> queue:
>
>   http://momjian.postgresql.org/cgi-bin/pgpatches
>
> However, with feature freeze coming on Sunday, I am worried because
> there are a significant number of patches that are not ready for
> review because they have not been completed by their authors.

That seems like a bit of a wacky criterion to use before reviewing a patch.
It favours people who are short-sighted and don't see what improvements their
code could still use. No code in an ongoing project like this is ever
"completed" anyway.

It's also an artifact of the working model we have where patches are sent in
big chunks and reviewed much later during a feature freeze. If we were
committing directly into a CVS repository we would have wanted to commit these
changes as soon as they were ready for committing, not wait until they're
"completed". Then continue working and commit further changes. It's only
because there's a two step process and the reviews are mainly happening during
the feature freeze that there's any sense that some of them are "completed".
In fact they're not of course, there will be further changes in the same area
once the freeze is lifted.

I think you should be asking people whether they think the code is in a state
where it can be committed, not whether they've finished working on it. Just
because they see further work that can be done is no reason not to commit
useful patches that are functional as they are.

In fact Postgres historically has had an even looser standard. If the code is
ready to be committed modulo bugs then it's been included in the feature
freeze in the past.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Kenneth Marshall
On Wed, Mar 28, 2007 at 08:07:14AM +0100, Simon Riggs wrote:
> On Tue, 2007-03-27 at 20:48 -0400, Tom Lane wrote:
> > "Simon Riggs" <[EMAIL PROTECTED]> writes:
> > > It seems possible to reduce overall WAL volume by roughly 25% on common
> > > workloads by optimising the way UPDATE statements generate WAL.
> > 
> > This seems a huge amount of work to optimize *one* benchmark.  
> 
> Please don't beat me with that. I wouldn't suggest it if I didn't think
> it would help real users. The analysis of the WAL volume was done using
> a benchmark, but only as a guide to indicate likely usage patterns.
> There aren't many real world heavy UPDATE scenarios to analyze right now
> because people have previously actively avoided such usage.
> 
> > If it
> > weren't so narrowly focused on the properties of a particular benchmark
> > (mostly UPDATE, mostly a few columns in wide tuples), I'd get more
> > excited. 
> 
> Updating the current balance on a Customer Account is one of the main
> focus areas for HOT. Those records are typically at least 250 bytes
> long, so we can save ~200 bytes of WAL per UPDATE for the most frequent
> types of UPDATE. Sure, not all UPDATEs would be optimised, but then they
> are much less frequent.
> 
> As I mentioned, the WAL volume is disproportionately generated by
> UPDATEs of longer rows, so optimising WAL for just a few tables can make
> a big difference to the overall volume.
> 
> >  The extra time spent holding exclusive lock on the page
> > doesn't sound attractive either ...
> 
> Agreed, thats why I set a fairly high bar for when this would kick in.
> The fewer rows on a page, the less contention.
> 
We use DSPAM as one of our anti-spam options. Its UPDATE pattern is to
increment a spam counter or a not-spam counter while keeping the user and
token information the same. This would benefit from this optimization.
Currently we are forced to use MySQL with MyISAM tables to support the
update load, although PostgreSQL 8.2 performance is right at the I/O
break-even point for switching databases. With HOT and more optimized
UPDATE I/O, 8.3 would give us enough I/O headroom to switch to PostgreSQL.

Ken Marshall



Re: [HACKERS] Log levels for checkpoint/bgwriter monitoring

2007-03-28 Thread Greg Smith

On Tue, 27 Mar 2007, Magnus Hagander wrote:


Would not at least some of these numbers be better presented through the
stats collector, so they can be easily monitored?
That goes along the line of my way way way away from finished attempt
earlier, perhaps a combination of these two patches?


When I saw your patch recently, I thought to myself "hmmm, the data 
collected here sure looks familiar"--you even made some of the exact same 
code changes I did.  I've been bogged down recently chasing a performance 
issue that, as it turns out, was mainly caused by the "high CPU usage for 
stats collector" bug.  That caused the background writer to slow to a 
crawl under heavy load, which is why I was having all these checkpoint and 
writer issues that got me monitoring that code in the first place.


With that seemingly resolved, I have a slightly new plan now.  Next I want to 
take the data I've been collecting in my patch, bundle the most important parts 
of that into messages sent to the stats writer the way it was suggested 
you rewrite your patch, and then submit the result.  I've got the log files down 
and have a real good idea what data should be collected, but as this would be 
my first time adding stats I'd certainly love some help with that.


Once that monitoring infrastructure is in place, I then plan to merge 
Itagaki's "Load distributed checkpoint" patch (it touches a lot of the 
same code) and test that out under heavy load.  I think it gives a much 
better context to evaluate that patch in if, rather than measuring just its 
gross results, you can say something like "with the patch in place the 
average fsync time on my system dropped from 3 seconds to 1.2 seconds when 
writing out more than 100MB at checkpoint time".  That's the direct cause 
of the biggest problem in that area of code, so why not stare right at it 
rather than measuring it indirectly.


--
* Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD



Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Kenneth Marshall <[EMAIL PROTECTED]> writes:
> We use DSPAM as one of our anti-spam options. Its UPDATE pattern is to
> increment a spam counter or a not-spam counter while keeping the user and
> token information the same. This would benefit from this optimization.

Would it?  How wide is the "user and token" information?

regards, tom lane



Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Florian G. Pflug

Kenneth Marshall wrote:

We use DSPAM as one of our anti-spam options. Its UPDATE pattern is to
increment a spam counter or a not-spam counter while keeping the user and
token information the same. This would benefit from this optimization.
Currently we are forced to use MySQL with MyISAM tables to support the
update load, although PostgreSQL 8.2 performance is right at the I/O
break-even point for switching databases. With HOT and more optimized
UPDATE I/O, 8.3 would give us enough I/O headroom to switch to PostgreSQL.


Interesting. I've switched from MySQL to PostgreSQL for dspam, because
of concurrency issues with MyISAM which caused bad performance.

I am eager to see how much HOT speeds up my setup, though ;-)

BTW, the "COMMIT NOWAIT" feature Simon Riggs proposed should provide
a huge speedup too, since dspam runs one transaction for each token
it has to update.

greetings, Florian Pflug



Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Kenneth Marshall <[EMAIL PROTECTED]> writes:
> On Wed, Mar 28, 2007 at 09:46:30AM -0400, Tom Lane wrote:
>> Would it?  How wide is the "user and token" information?

> Sorry about the waste of time. I just noticed that the proposal is
> only for rows over 128 bytes. The token definition is:

> CREATE TABLE dspam_token_data (
>   uid smallint,
>   token bigint,
>   spam_hits int,
>   innocent_hits int,
>   last_hit date
> );

> which is below the cutoff for the proposal.

Yeah, this illustrates my concern that the proposal is too narrowly
focused on a specific benchmark.

regards, tom lane



[HACKERS] ECPG threads test

2007-03-28 Thread Magnus Hagander
Hi!

From what I can see, the ecpg thread tests (src/interfaces/ecpg/threads)
don't ever run. They rely on ENABLE_THREAD_SAFETY to be set, but even when
I build with --enable-thread-safety, it's not set. This is because ecpg
does not pull in pg_config.h, and also does not specify it on the
commandline.

Or am I missing something completely here? FWIW, it's not running the
threads test on windows or linux in my tests, but I may have missed
something...

//Magnus




Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 15:51 +0200, Florian G. Pflug wrote:

> BTW, the "COMMIT NOWAIT" feature Simon Riggs proposed should provide
> a huge speedup too, since dspam runs one transaction for each token
> it has to update.

I've switched to doing the COMMIT NOWAIT as a priority now, but do plan
to do both for 8.3.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> Yeah, this illustrates my concern that the proposal is too narrowly
> focused on a specific benchmark.

A lot of the recently proposed changes don't really fit in the "optimizations"
category very well at all. I think of them more as "avoiding pitfalls".

Currently Postgres works quite well if your application is designed around its
performance profile. But as soon as you do something "strange" you run the
risk of running into various pitfalls.

If you keep a long-running transaction open you suddenly find your tables
bloating. If your table grows too large vacuum takes too long to complete and
your tables bloat. If you update the same record many times instead of
batching updates and performing a single update your table bloats. 

This one is similar, if you keep a bunch of static data attached to some small
dynamic data your WAL and table bloats. Certainly you could have engineered
your system not to fall into this pitfall, but only if you knew about it and
only if it was worth the effort and other possible costs of doing so.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Joshua D. Drake

Gregory Stark wrote:

"Tom Lane" <[EMAIL PROTECTED]> writes:


Yeah, this illustrates my concern that the proposal is too narrowly
focused on a specific benchmark.


A lot of the recently proposed changes don't really fit in the "optimizations"
category very well at all. I think of them more as "avoiding pitfalls".

Currently Postgres works quite well if your application is designed around its
performance profile. But as soon as you do something "strange" you run the
risk of running into various pitfalls.


I would go a step further. Once you get into real life scenarios with 
real life work loads, you run into various pitfalls.




If you keep a long-running transaction open you suddenly find your tables
bloating. If your table grows too large vacuum takes too long to complete and
your tables bloat. If you update the same record many times instead of
batching updates and performing a single update your table bloats. 


Long-running transactions are a big problem. I wish I knew how to solve that.



This one is similar, if you keep a bunch of static data attached to some small
dynamic data your WAL and table bloats. Certainly you could have engineered
your system not to fall into this pitfall, but only if you knew about it and
only if it was worth the effort and other possible costs of doing so.



It seems to me the solution could be useful. We have lots of tables that 
fall into the category that the test table presented.


Sincerely,

Joshua D. Drake



--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/




Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> A lot of the recently proposed changes don't really fit in the
> "optimizations" category very well at all. I think of them more as
> "avoiding pitfalls".

Well, we can't put a major amount of complexity into the system for
each possible "pitfall".

> This one is similar, if you keep a bunch of static data attached to
> some small dynamic data your WAL and table bloats.

Actually, PG does extremely well on that in the situation where the
static data is *really* wide, ie, wide enough to be toasted out-of-line.
Simon's proposal will only help for an intermediate range of situations
where the row is wide but not very wide.

It strikes me that a more useful solution might come from the recent
discussions about offering more user control of per-column toasting
decisions.  Or maybe we just need to revisit the default toast
thresholds --- AFAIR there has never been any significant study of
the particular values that Jan picked originally.

regards, tom lane



Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread August Zajonc
Simon Riggs wrote:
> On Tue, 2007-03-27 at 20:48 -0400, Tom Lane wrote:
>> "Simon Riggs" <[EMAIL PROTECTED]> writes:
>>> It seems possible to reduce overall WAL volume by roughly 25% on common
>>> workloads by optimising the way UPDATE statements generate WAL.
>> This seems a huge amount of work to optimize *one* benchmark.  
> 
> Please don't beat me with that. I wouldn't suggest it if I didn't think
> it would help real users. The analysis of the WAL volume was done using
> a benchmark, but only as a guide to indicate likely usage patterns.
> There aren't many real world heavy UPDATE scenarios to analyze right now
> because people have previously actively avoided such usage.
> 
>> If it
>> weren't so narrowly focused on the properties of a particular benchmark
>> (mostly UPDATE, mostly a few columns in wide tuples), I'd get more
>> excited. 
> 

For reference, unless there is some further restriction I'm not
understanding, I've seen a lot of scenarios with this profile.

An online multiplayer game with around 60,000 active users (out of 250k
registered) did 6M page views per day, most of them dynamic.

Most of the interactions between players, or between players and objects
in the system, resulted in updates to tables perhaps 512 bytes to 1 KB wide,
with a fair number of strings etc.

Session information was tracked for active users - a similar description,
but with even more updates.

I'd echo the other poster: at the time we didn't fully know
PostgreSQL's performance profile to code to (this was 6 years ago now),
and we ended up using MySQL because it worked and deadlines were tight.

I can think of a lot of other scenarios as well that would be helped.

Echoing another poster, it would be nice to round off a couple of the
gotcha edges. This may be one of them.

- August



Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 10:51 -0400, Tom Lane wrote:
> Kenneth Marshall <[EMAIL PROTECTED]> writes:
> > On Wed, Mar 28, 2007 at 09:46:30AM -0400, Tom Lane wrote:
> >> Would it?  How wide is the "user and token" information?
> 
> > Sorry about the waste of time. I just noticed that the proposal is
> > only for rows over 128 bytes. The token definition is:
> 
> > CREATE TABLE dspam_token_data (
> >   uid smallint,
> >   token bigint,
> >   spam_hits int,
> >   innocent_hits int,
> >   last_hit date
> > );
> 
> > which is below the cutoff for the proposal.

More to the point, this looks like it has already been optimised to
reduce the row length on a heavily updated table.

> Yeah, this illustrates my concern that the proposal is too narrowly
> focused on a specific benchmark.

Not really. I specifically labelled that recommendation as a discussion
point, so if you don't like the limit, please say so. My reasoning for
having a limit at all is that block contention goes up at least as fast
as the inverse of row length since the best case is when rows are
randomly distributed and updated.

What other aspect of the proposal has anything whatsoever to do with
this single benchmark you think I'm over-fitting to? 

You and I discussed this in Toronto, so I'm surprised by your comments.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> Actually, PG does extremely well on that in the situation where the
> static data is *really* wide, ie, wide enough to be toasted out-of-line.
> Simon's proposal will only help for an intermediate range of situations
> where the row is wide but not very wide.

The reason I think this idea is exciting is that later I would suggest
applying it to HOT updates. Having to keep a spare tuple's worth of space in
every page is pretty annoying. But if we could get by with the average
half-tuple of dead space to do an update -- or even several updates -- it
would make a huge difference.

> It strikes me that a more useful solution might come from the recent
> discussions about offering more user control of per-column toasting
> decisions.  Or maybe we just need to revisit the default toast
> thresholds --- AFAIR there has never been any significant study of
> the particular values that Jan picked originally.

I agree that these values need a second look. I think a TOAST_TUPLE_THRESHOLD
well smaller than the current value would still easily pay its way. With a
little caution to avoid wasting too much effort on the last few bytes I
suspect even as low as 400-500 bytes is probably worthwhile.

Also, it may make sense to take into account what percentage of the overall
tuple a field is. It doesn't make much sense to start toasting fields in a
table that consists of forty 40-byte varchars, for example. Whereas it
probably does make sense to toast a single 500-byte varchar in a table if the
rest of the table consists of just ten integers.

But considering how large the toast pointer itself is, how expensive it is to
fetch it, and that we need one for each attribute, it still won't be able to
usefully handle anything under 32 bytes or so. That's still a lot more than a
single byte indicating that the field is unchanged. It's not uncommon to have
tables that are wide because they have lots of small data in them.

In fact looking at this code now, is there a bug here? I don't see anything in
there stopping us from trying to toast varlenas that are smaller than a toast
pointer. Have I just missed something obvious?

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> The reason I think this is idea is exciting is that later I would suggest
> applying it to HOT updates. Having to keep a spare tuple's worth of space in
> every page is pretty annoying. But if we could get by with the average
> half-tuple dead space to do an update-- or even several updates--it would make
> a huge difference.

Uh ... what?  This proposal is about reducing WAL volume, not about
changing the live data storage.

>> Or maybe we just need to revisit the default toast
>> thresholds --- AFAIR there has never been any significant study of
>> the particular values that Jan picked originally.

> I agree that these values need a second look. I think a TOAST_TUPLE_THRESHOLD
> well smaller than the current value would still easily pay its way. With a
> little caution to avoid wasting too much effort on the last few bytes I
> suspect even as low as 400-500 bytes is probably worthwhile.

Maybe.  One thing I was just thinking about is that it's silly to have
the threshold constrained so strongly by a desire that tuples in toast
tables not be toastable.  It would be trivial to tweak the heapam.c
routines so that they simply don't invoke the toaster when relkind is
't', and then we could have independent choices of toast-tuple size and
main-tuple size.  This would be particularly good because in the current
scheme you can't modify toast-tuple size without an initdb, but if that
were decoupled there'd be no reason not to allow changes in the
main-tuple thresholds.

> In fact looking at this code now, is there a bug here? I don't see anything in
> there stopping us from trying to toast varlenas that are smaller than a toast
> pointer. Have I just missed something obvious?

Note the initialization of biggest_size.

regards, tom lane



[HACKERS] ECPG regression tests expected files

2007-03-28 Thread Magnus Hagander
If I change the code in one of the ecpg regression tests (porting the tests
as well to non-pthread win32), am I supposed to manually change the .c files
in the expected directory? Or is there some other process for it?

//Magnus



Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Zeugswetter Andreas ADI SD

> > I agree that these values need a second look. I think a
> > TOAST_TUPLE_THRESHOLD well smaller than the current value would still
> > easily pay its way. With a little caution to avoid wasting too much
> > effort on the last few bytes I suspect even as low as 400-500 bytes is
> > probably worthwhile.

But a seq scan (or non-cached access) would suddenly mutate into multiple
random accesses, so this is not a win-win situation.

Btw: Do we consider the existence of toasted columns in the seq-scan
cost estimation ?

Andreas 



Re: [HACKERS] Patch queue concern

2007-03-28 Thread Simon Riggs
On Tue, 2007-03-27 at 21:15 -0400, Bruce Momjian wrote:
> Right now, all the patches I think are ready for review are in the patch
> queue:
> 
>   http://momjian.postgresql.org/cgi-bin/pgpatches
> 
> However, with feature freeze coming on Sunday, I am worried because
> there are a significant number of patches that are not ready for
> review because they have not been completed by their authors.

It's probably a good idea to have a queue of those too, to allow others
to finish them if the original author hasn't/can't/won't. I'm not sure
which ones you mean.

I have at least 2 patches that depend upon other patches in the queue.
I'm not sure how to go about completing them, so any advice or guidance
would be welcome:

- Scan_recycle_buffers depends upon synchronised scans because we agreed
we would use the same parameter (if any exists) to govern the behaviour.
Should I write a patch-on-patch? What happens if the patch changes after
review? ISTM I should just wait until the first one is applied and then
I can make the necessary changes in about an hour. The patch's main
functionality is complete.

- Fast cluster conflicts with Heikki's cluster patch, so one of them
will need fixing depending which is applied first. I don't mind if its
me going second. I also have proposed an additional mode on VACUUM FULL
that builds upon Heikki's patch - should I submit that also, even though
it cannot be applied?

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Tom Lane
"Zeugswetter Andreas ADI SD" <[EMAIL PROTECTED]> writes:
> Btw: Do we consider the existence of toasted columns in the seq-scan
> cost estimation ?

Not at present.  There was some discussion of this but it seems like
a fair amount of work --- we don't currently track statistics on how
many of a column's entries are toasted or how big they are.  For that
matter it would be entirely unreasonable to pin the cost on "seq scan";
you'd need to look in close detail at exactly where and how the Vars get
used, and distinguish simply copying a Var from actual use of its value.
The planner is mostly uninterested in the evaluation costs of plan node
targetlists at the moment, and IIRC that's something not so easily
changed...

regards, tom lane



Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Pavan Deolasee

On 3/23/07, Pavan Deolasee <[EMAIL PROTECTED]> wrote:

It's slightly different for the HOT-chains created by this transaction
which is creating the index. We should index the latest version of the row,
which is not yet committed. But that's ok because when CREATE INDEX commits
this latest version would also get committed.

> Sounds like you'll need to store the Next TransactionId rather than the
> TransactionId of the CREATE INDEX.



Just when I thought we had nailed down CREATE INDEX, I realized
that there is something more to worry about. The problem is with the
HOT-chains created by our own transaction which is creating the index. We
thought it would be enough to index the tuple at the head of the chain since
that would be the visible copy once the transaction commits. We thought
of keeping the index unavailable for queries in pre-existing transactions
by setting a new "xid" attribute in pg_index. The question is what value
to assign to "xid". I thought we would assign ReadNewTransactionId().

Now, a new transaction can start before we commit and hence have
transaction_id > xid. This transaction can still see the old tuple
(because the transaction creating the index is not yet committed)
which we did not index while creating the index. Once the transaction
creating the index commits, the index is also available to this new
transaction and we are in trouble at that point.

I think Tom had already seen this, but his comment got overlooked
in the flow of discussion. If that's the case, I regret that.

Any idea how to handle this case ? One ugly hack I can think of
is to remember all those indexes created in the transaction for which
we had seen DELETE_IN_PROGRESS tuples while building
the index. At commit time, we (somehow) stop new transactions
from starting, note the next transaction id, set it in pg_index and commit
the transaction. New transactions are then enabled again.

I know it looks ugly. Is there something better that we can do ?

Thanks,
Pavan


--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Arrays of Complex Types

2007-03-28 Thread David Fetter
On Wed, Mar 28, 2007 at 07:05:24AM -, Andrew - Supernews wrote:
> On 2007-03-27, David Fetter <[EMAIL PROTECTED]> wrote:
> > Per further discussion with Andrew of Supernews and Merlin
> > Moncure, I've added a check for compound types and moved the
> > creation of the array type from DefineRelation in
> > backend/commands/tablecmds.c to heap_create_with_catalog in
> > backend/catalog/heap.c.
> 
> You've still got the usage of the relation OID and the relation
> _type_ OID reversed.
> 
> The array element type that you pass to TypeCreate must be the
> _type_ OID.

The attached patch takes it down to two regression test failures, also
attached: 

The first is in type_sanity, which basically doesn't understand that
complex types now have array types associated with them and thinks
they're orphan array types, so it's actually the test that's not
right.

The second is in alter_table where ALTER TABLE ... SET SCHEMA doesn't
pick up the array types associated with the tables.

Andrew at Supernews also noticed that in general, the array type
doesn't change schemas when its base type does.  Is this the intended
behavior?  If not, should we change it globally?

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

Re: [HACKERS] Arrays of Complex Types

2007-03-28 Thread Alvaro Herrera
David Fetter wrote:

> The first is in type_sanity, which basically doesn't understand that
> complex types now have array types associated with them and thinks
> they're orphan array types, so it's actually the test that's not
> right.

Hmm, I question the usefulness of automatically creating array types for
all relation types that are created -- the catalog bloat seems a bit too
much.  An array of pg_autovacuum for example, does that make sense?

I'm not sure what the reaction was to having a "CREATE TYPE foo ARRAY
OF bar" command of some kind?  I think this was discussed but not
explicitly rejected, or was it?

-- 
Alvaro Herrera    http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



[HACKERS] Modifying TOAST thresholds

2007-03-28 Thread Tom Lane
In another thread I wrote:
> ... One thing I was just thinking about is that it's silly to have
> the threshold constrained so strongly by a desire that tuples in toast
> tables not be toastable.  It would be trivial to tweak the heapam.c
> routines so that they simply don't invoke the toaster when relkind is
> 't', and then we could have independent choices of toast-tuple size and
> main-tuple size.  This would be particularly good because in the current
> scheme you can't modify toast-tuple size without an initdb, but if that
> were decoupled there'd be no reason not to allow changes in the
> main-tuple thresholds.

After thinking about this more I'm convinced that the above is a good
idea, eg in heap_insert change

if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
heaptup = toast_insert_or_update(relation, tup, NULL, use_wal);
else
heaptup = tup;

to

if (relation->rd_rel->relkind == RELKIND_TOASTVALUE)
{
/* toast table entries should never be recursively toasted */
Assert(!HeapTupleHasExternal(tup));
heaptup = tup;
}
else if (HeapTupleHasExternal(tup) || tup->t_len > TOAST_TUPLE_THRESHOLD)
heaptup = toast_insert_or_update(relation, tup, NULL, use_wal);
else
heaptup = tup;

I also think that we ought to add TOAST_MAX_CHUNK_SIZE to the set of
compiled-in parameters that are recorded in pg_control and checked for
compatibility at startup (like BLCKSZ) --- this will prevent anyone from
shooting themselves in the foot while experimenting.
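
For illustration, here is the shape of such a check, written as a standalone
mock rather than the real ReadControlFile() code in xlog.c (which would use
ereport); the struct, field names and the placeholder values are mine, not
the actual pg_control layout:

#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define BLCKSZ               8192
#define TOAST_MAX_CHUNK_SIZE 1994   /* placeholder for the compiled-in value */

typedef struct ControlFileData
{
    uint32_t blcksz;                /* already recorded and checked today */
    uint32_t toast_max_chunk_size;  /* the proposed new field */
} ControlFileData;

static void
check_control_file(const ControlFileData *cf)
{
    if (cf->blcksz != BLCKSZ)
    {
        fprintf(stderr, "database files are incompatible with server: "
                "pg_control has BLCKSZ %u, server was compiled with %u\n",
                cf->blcksz, (unsigned) BLCKSZ);
        exit(1);
    }
    if (cf->toast_max_chunk_size != TOAST_MAX_CHUNK_SIZE)
    {
        fprintf(stderr, "database files are incompatible with server: "
                "pg_control has TOAST_MAX_CHUNK_SIZE %u, server was "
                "compiled with %u\n",
                cf->toast_max_chunk_size, (unsigned) TOAST_MAX_CHUNK_SIZE);
        exit(1);
    }
}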

Any objections?

regards, tom lane



Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 22:24 +0530, Pavan Deolasee wrote:

> Just when I thought we had nailed down CREATE INDEX, I realized
> that there is something more to worry about. The problem is with the
> HOT-chains created by our own transaction which is creating the index.
> We thought it would be enough to index the tuple at the head of the
> chain since that would be the visible copy once the transaction commits.
> We thought of keeping the index unavailable for queries in pre-existing
> transactions by setting a new "xid" attribute in pg_index. The question
> is what value to assign to "xid". I thought we would assign
> ReadNewTransactionId().

> Any idea how to handle this case ?

Set it at the end, not the beginning.

If you are indexing a table that hasn't just been created by you, set
the xcreate field on pg_index at the *end* of the build using
ReadNewTransactionId(). Any xid less than that sees the index as
invalid. If you created the table in this transaction (i.e.
createSubId != 0) then set xcreate to creating xid.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] ECPG regression tests expected files

2007-03-28 Thread Michael Meskes
On Wed, Mar 28, 2007 at 06:13:03PM +0200, Magnus Hagander wrote:
> If I change the code in one of the ecpg regression tests (porting tests as
> well to non-pthread win32), am I supposed to manually change the .c files
> in the expected directory? Or is there some other process for it?

Just run the test, check whether it's okay and then replace the expected
.c file with yours.

Please change what you need for win32. I will have a look at the changes
on Linux as soon as I find time. The other thread-related emails are
still in my inbox, so the same holds for those. :-)

Michael
-- 
Michael Meskes
Email: Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
ICQ: 179140304, AIM/Yahoo: michaelmeskes, Jabber: [EMAIL PROTECTED]
Go SF 49ers! Go Rhein Fire! Use Debian GNU/Linux! Use PostgreSQL!



Re: [HACKERS] ECPG regression tests expected files

2007-03-28 Thread Magnus Hagander
Michael Meskes wrote:
> On Wed, Mar 28, 2007 at 06:13:03PM +0200, Magnus Hagander wrote:
>> If I change the code in one of the ecpg regression tests (porting tests as
>> well to non-pthread win32), am I supposed to manually change the .c files
>> in the expected directory? Or is there some other process for it?
> 
> Just run the test, check whether it's okay and then replace the expected
> .c file with yours.

Ok.


> Please change what you need for win32. I will have a look at the changes
> on Linux as soon as I find time. The other thread related emails are
> still in my inbox. So the same holds for those. :-)

Ok. Will do.

If you want to pick one early, please look at the one about the thread
regression tests not appearing to run at all. I'd like to have that
confirmed before I try to dig into how to fix it - in case it's not
actually broken, and it's just me who's doing something wrong...

//Magnus



Re: [HACKERS] Reduction in WAL for UPDATEs

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 11:17 -0400, Tom Lane wrote:
> > This one is similar, if you keep a bunch of static data attached to
> > some small dynamic data your WAL and table bloats.
> 
> Actually, PG does extremely well on that in the situation where the
> static data is *really* wide, ie, wide enough to be toasted out-of-line.
> Simon's proposal will only help for an intermediate range of situations
> where the row is wide but not very wide.

Trouble is, that's lots of commonly updated tables. Thin tables generate
only small amounts of WAL, while very wide tables are optimised for
UPDATE already.

But right now most relational tables that represent Major Entities, i.e.
objects in the real world, have row lengths in the range 100-2000 bytes.

Page hit counters, Customer Accounts, Financials-to-date, Event
bookings, Seats sold.

> It strikes me that a more useful solution might come from the recent
> discussions about offering more user control of per-column toasting
> decisions.  Or maybe we just need to revisit the default toast
> thresholds --- AFAIR there has never been any significant study of
> the particular values that Jan picked originally.

That's effectively definable vertical partitioning. The user has to know
about this and do something about it themselves. By default we compress
before we move out, so you'd need to know that also.

I've investigated that route briefly but there does seem to be a
surprising overhead in splitting off small pieces of data. Currently
TOAST seems to be optimised for when we have more than one chunk of
data.

I agree there's something worth looking at there, but even so I don't
see a direct correspondence between seldom updated (=> WAL reduction is
beneficial) and seldom used (=> TOAST is beneficial) columns.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com





Re: [HACKERS] Arrays of Complex Types

2007-03-28 Thread Andrew Dunstan

Alvaro Herrera wrote:

David Fetter wrote:

  

The first is in type_sanity, which basically doesn't understand that
complex types now have array types associated with them and thinks
they're orphan array types, so it's actually the test that's not
right.



Hmm, I question the usefulness of automatically creating array types for
all relation types that are created -- the catalog bloat seems a bit too
much.  An array of pg_autovacuum for example, does that make sense?

I'm not sure what was the reaction to having an "CREATE TYPE foo ARRAY
OF bar" command of some kind?  I think this was discussed but not
explicitely rejected, or was it?

  


It certainly seems rather inconsistent to have array types autocreated 
for some types but not others. But unless we create them for all types 
then I think we need a command such as you suggest.


How much bloat will this really be? If it's not used it won't get into 
the type cache. I find it hard to believe there will be any very 
significant performance effect.


cheers

andrew





Re: [HACKERS] Arrays of Complex Types

2007-03-28 Thread Tom Lane
Alvaro Herrera <[EMAIL PROTECTED]> writes:
> Hmm, I question the usefulness of automatically creating array types for
> all relation types that are created -- the catalog bloat seems a bit too
> much.  An array of pg_autovacuum for example, does that make sense?

Not only that, it won't even work for pg_statistic, which has got a
special kluge to allow anyarray in a place where it usually mustn't go.

> I'm not sure what the reaction was to having a "CREATE TYPE foo ARRAY
> OF bar" command of some kind?  I think this was discussed but not
> explicitly rejected, or was it?

I think this is a much better idea than automatically creating a pile of
usually-useless types.  In the long run maybe we should even migrate to
the assumption that array types aren't automatically created?

If we think this way, it changes the ground rules for Andrew's question
about whether an array type ought to be affected by ALTER TYPE SET
SCHEMA on its base type --- it starts to look more like an independent
entity than an auxiliary component.  I'm not really sure which answer
I like better.

One point here is that currently the system depends on the naming
convention "foo[] is named _foo" to be able to find the array type
from the base type.  The syntax you suggest would break that.  We
could fix it by adding More Stuff to pg_type, but I wonder whether
it's worth it, compared to say
CREATE ARRAY TYPE FOR foo
Also, at the moment ALTER TYPE SET SCHEMA is certainly broken because
it destroys this naming convention ... we either abandon the convention
or fix SET SCHEMA.
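
For readers following along, a rough standalone sketch of the "_foo" naming
convention being referred to; the real backend helper also handles name
collisions and truncation of multibyte names, which is glossed over here:

#include <stdio.h>

#define NAMEDATALEN 64      /* matches the default catalog name length */

/* Illustrative only: derive the array type name from the element type name
 * by prepending '_' and truncating so the result still fits in a Name. */
static void
make_array_type_name(const char *base, char dest[NAMEDATALEN])
{
    snprintf(dest, NAMEDATALEN, "_%.*s", NAMEDATALEN - 2, base);
}

int
main(void)
{
    char buf[NAMEDATALEN];

    make_array_type_name("foo", buf);
    printf("%s\n", buf);        /* prints "_foo" */
    return 0;
}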

regards, tom lane



Re: [HACKERS] Modifying TOAST thresholds

2007-03-28 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

> I also think that we ought to add TOAST_MAX_CHUNK_SIZE to the set of
> compiled-in parameters that are recorded in pg_control and checked for
> compatibility at startup (like BLCKSZ) --- this will prevent anyone from
> shooting themselves in the foot while experimenting.

Is there any reason to experiment with this? I would have thought we would
divorce TOAST_MAX_CHUNK_SIZE from TOAST_THRESHOLD and hard-code it as the same
expression that's there now, i.e. the largest size that can fit in a page.

That doesn't mean it shouldn't go in pg_control, of course, but it would just
be a function of BLCKSZ and the architecture alignment and not dependent on
any user-configurable value.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com




Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Florian G. Pflug

Simon Riggs wrote:

> On Wed, 2007-03-28 at 22:24 +0530, Pavan Deolasee wrote:
>
> > Just when I thought we had nailed down CREATE INDEX, I realized that
> > there is something more to worry about. The problem is with the
> > HOT-chains created by our own transaction which is creating the index.
> > We thought it would be enough to index the tuple at the head of the
> > chain since that would be the visible copy once the transaction
> > commits. We thought of keeping the index unavailable for queries in
> > pre-existing transactions by setting a new "xid" attribute in
> > pg_index. The question is what value to assign to "xid". I thought we
> > would assign ReadNewTransactionId().
>
> If you are indexing a table that hasn't just been created by you, set
> the xcreate field on pg_index at the *end* of the build using
> ReadNewTransactionId(). Any xid less than that sees the index as
> invalid. If you created the table in this transaction (i.e.
> createSubId != 0) then set xcreate to creating xid.


Couldn't you store the creating transaction's xid in pg_index, and
let other transactions check that against their snapshot like they
would for any tuple's xmin or xmax? (With one exception - the creating
transaction would consider indices it built itself invalid, which
is not how things usually work for xmin/xmax.)

This would mean that any transaction that believes that the creating
transaction has committed also consideres the index to be valid.

greetings, Florian Pflug




Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Pavan Deolasee

On 3/28/07, Tom Lane <[EMAIL PROTECTED]> wrote:


"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> Couldn't you store the creating transaction's xid in pg_index, and
> let other transaction check that against their snapshot like they
> would for any tuple's xmin or xmax?

What snapshot?  I keep having to remind people that system catalog
operations are SnapshotNow.  In the particular context here, the
place where doing something else would break down is that the planner
has no idea when it makes a plan what snapshot(s) the plan might later
be used with.





Tom, please correct me if I am wrong. But ISTM that this idea might
work in this context. In get_relation_info(), we would check if "xcreate"
xid stored in pg_index for the index under consideration is seen
committed with respect to the snapshot at that point of time.
Even if the snapshot changes later and index becomes valid, we
might not replan and hence not use index. But that doesn't seem
like a big problem to me.

So in get_relation_info():

We get the transaction snapshot. If it's a serializable transaction,
it can't change later. If it's a read-committed transaction, we don't
care anyway, because the transaction can only see the last committed
version of the tuple in the table and we have indexed that. And so
we are safe.

- if xcreate < snapshot->xmin, index is valid
- if xcreate > snapshot->xmax, index is invalid
- if snapshot->xmax > xcreate > snapshot->xmin, we check xids in the
  snapshot to determine validity of the index.
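
In C-like pseudocode, that check might look roughly like the sketch below
(hypothetical only: the "xcreate" column and this helper do not exist, it
merely restates the three cases above using the fields a backend Snapshot
already carries):

static bool
index_usable_for_snapshot(TransactionId xcreate, Snapshot snapshot)
{
	int			i;

	/* creator committed before the snapshot was taken: index usable */
	if (TransactionIdPrecedes(xcreate, snapshot->xmin))
		return true;

	/* creator had not committed (or even started) as of the snapshot */
	if (TransactionIdFollowsOrEquals(xcreate, snapshot->xmax))
		return false;

	/* in between: usable unless the creator was still in progress */
	for (i = 0; i < snapshot->xcnt; i++)
	{
		if (TransactionIdEquals(xcreate, snapshot->xip[i]))
			return false;
	}

	return true;
}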

I might have got something wrong here, but I hope I made the point.

Am I missing something here?

Thanks,
Pavan



--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Tom Lane
"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> Couldn't you store the creating transaction's xid in pg_index, and
> let other transaction check that against their snapshot like they
> would for any tuple's xmin or xmax?

What snapshot?  I keep having to remind people that system catalog
operations are SnapshotNow.  In the particular context here, the
place where doing something else would break down is that the planner
has no idea when it makes a plan what snapshot(s) the plan might later
be used with.

regards, tom lane

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Pavan Deolasee

On 3/28/07, Simon Riggs <[EMAIL PROTECTED]> wrote:




Set it at the end, not the beginning.




At the end of what? It does not help to set it at the end of CREATE
INDEX because the transaction may not commit immediately. In
the meantime, many new transactions may start with
transaction id > xcreate. All these transactions can see the old
tuple (which we did not index) and can also see the index once
CREATE INDEX commits.



If you are indexing a table that hasn't just been created by you, set

the xcreate field on pg_index at the *end* of the build using
ReadNewTransactionId(). Any xid less than that sees the index as
invalid. If you created the table in this transaction (i.e.
createSubId != 0) then set xcreate to creating xid.





Why do we need to handle the case where the table is created in
the same transaction? Neither the table nor the index is visible
until we commit. So that's a simple case.


Thanks,
Pavan


--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Florian G. Pflug

Tom Lane wrote:

"Florian G. Pflug" <[EMAIL PROTECTED]> writes:

Couldn't you store the creating transaction's xid in pg_index, and
let other transaction check that against their snapshot like they
would for any tuple's xmin or xmax?


What snapshot?  I keep having to remind people that system catalog
operations are SnapshotNow.  In the particular context here, the
place where doing something else would break down is that the planner
has no idea when it makes a plan what snapshot(s) the plan might later
be used with.


Sorry - now that you say it, I remember that you've said that already
multiple times...

So the question is: why did this work until now, while CREATE INDEX+HOT
just doesn't seem to fit into this scheme?

I think the answer is that all other DDL statements manage to ensure
that any database objects they create or modify are usable for everybody
else immediately after they are committed. This usually implies pretty
strong locking requirements - for example, I think that the core reason
why TRUNCATE needs an exclusive lock is precisely that guarantee it has
to make.

Maybe this could somehow be relaxed? Could, for example, the planner
be allowed to base some of its decisions on the SerializableSnapshot
that every transaction (even read-only ones) possesses? It seems that
this would prevent plans from living longer than a transaction,
but maybe plan invalidation could help here?

greetings, Florian Pflug


---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Arrays of Complex Types

2007-03-28 Thread David Fetter
On Wed, Mar 28, 2007 at 01:33:56PM -0400, Andrew Dunstan wrote:
> Alvaro Herrera wrote:
> >David Fetter wrote:
> >>The first is in type_sanity, which basically doesn't understand
> >>that complex types now have array types associated with them and
> >>thinks they're orphan array types, so it's actually the test
> >>that's not right.
> >
> >Hmm, I question the usefulness of automatically creating array
> >types for all relation types that are created -- the catalog bloat
> >seems a bit too much.  An array of pg_autovacuum for example, does
> >that make sense?
> >
> >I'm not sure what the reaction was to having a "CREATE TYPE foo
> >ARRAY OF bar" command of some kind?  I think this was discussed but
> >not explicitly rejected, or was it?
> 
> It certainly seems rather inconsistent to have array types
> autocreated for some types but not others.

This was my thought in the latest version of the patch.

> But unless we create them for all types then I think we need a
> command such as you suggest.
> 
> How much bloat will this really be? If it's not used it won't get
> into the type cache. I find it hard to believe there will be any
> very significant performance effect.

So do I, but how would we check this?

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 23:42 +0530, Pavan Deolasee wrote:
> 
> 
> On 3/28/07, Simon Riggs <[EMAIL PROTECTED]> wrote:
> 
> 
> Set it at the end, not the beginning.
> 
> 
> At the end of what ? It does not help to set it at the end of CREATE
> INDEX because the transaction may not commit immediately. In
> the meantime, many new transactions may start with 
> transaction id > xcreate. All these transactions can see the old
> tuple (which we did not index) and can also see the index once
> CREATE INDEX commits.

AtEOX_Reincarnate()... :-)

Set xcreate to InvalidTransactionId when we build the index

If we created an index in this transaction, as soon as we commit the top
level transaction, run another top level transaction to set xcreate
using ReadNewTransactionId().

During WAL replay, we remember any index creations and reset xcreate if
we were unlucky enough to crash between the two transactions.

(I'll be offline now for a few hours until the flames subside.)

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
   choose an index scan if your joining column's datatypes do not
   match


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Florian G. Pflug

Pavan Deolasee wrote:

On 3/28/07, Tom Lane <[EMAIL PROTECTED]> wrote:


"Florian G. Pflug" <[EMAIL PROTECTED]> writes:
> Couldn't you store the creating transaction's xid in pg_index, and
> let other transaction check that against their snapshot like they
> would for any tuple's xmin or xmax?

What snapshot?  I keep having to remind people that system catalog
operations are SnapshotNow.  In the particular context here, the
place where doing something else would break down is that the planner
has no idea when it makes a plan what snapshot(s) the plan might later
be used with.


Tom, please correct me if I am wrong. But ISTM that this idea might
work in this context. In get_relation_info(), we would check if "xcreate"
xid stored in pg_index for the index under consideration is seen
committed with respect to the snapshot at that point of time.
Even if the snapshot changes later and index becomes valid, we
might not replan and hence not use index. But that doesn't seem
like a big problem to me.


The problem is use cases like
PREPARE my_plan  ;
BEGIN;
EXECUTE my_plan  ;
COMMIT ;

Is that "PREPARE" even run inside a transaction? Even if it is, it
probably won't have created a snapshot...

I think allowing the use of some sort of snapshot from inside the planner
would allow some locking to be relaxed, but there seem to be a lot of
corner cases to consider :-(

OTOH, if you manage to make this work, a TRUNCATE that doesn't block
concurrent selects might become possible to do. This would for example
allow dropping and rebuilding subscriptions on a slony node while it
is in use.

greetings, Florian Pflug

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings


Re: [HACKERS] Arrays of Complex Types

2007-03-28 Thread Gregory Stark

"Tom Lane" <[EMAIL PROTECTED]> writes:

>   CREATE ARRAY TYPE FOR foo

I also made a suggestion along the way that we never create array types
automatically except for domains. I.e., we don't need a new command; we just
document that what you do if you want an array of something is to create
a domain for it and then use arrays of that domain.

I'm not sure whether having to create a new command is cleaner or less clean
than overloading an existing command with two purposes.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Arrays of Complex Types

2007-03-28 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
>> CREATE ARRAY TYPE FOR foo

> I also made a suggestion along the way that we never create array types
> automatically except for domains.

That seems awfully strange, not to mention very non-backwards-compatible
since it exactly reverses what happens now.

I'd be willing to consider it if a domain were a zero-cost addition to
the equation, but it is not --- every operation on a domain has to check
to see if there are constraints to enforce.  You shouldn't have to buy
into that overhead to have an array.

regards, tom lane

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Arrays of Complex Types

2007-03-28 Thread David Fetter
On Wed, Mar 28, 2007 at 03:24:26PM -0400, Tom Lane wrote:
> Gregory Stark <[EMAIL PROTECTED]> writes:
> > "Tom Lane" <[EMAIL PROTECTED]> writes:
> >> CREATE ARRAY TYPE FOR foo
> 
> > I also made a suggestion along the way that we never create array
> > types automatically except for domains.
> 
> That seems awfully strange, not to mention very
> non-backwards-compatible since it exactly reverses what happens now.
> 
> I'd be willing to consider it if a domain were a zero-cost addition
> to the equation, but it is not --- every operation on a domain has
> to check to see if there are constraints to enforce.  You shouldn't
> have to buy into that overhead to have an array.

The way I see the big picture, complex types, arrays and domains
should all compose without limit, as in arrays of domains of complex
types, etc.  The SQL standard even has something like our SETOF (which
should probably be called BAGOF, but let's not go there just now ;) in
the form of MULTISET, and that, too, should eventually be in the above
mix.

I'm not advocating the idea that people should *store* those
compositions--if it were just up to me, I'd disallow it--but they're
very handy for input and output :)

Cheers,
D
-- 
David Fetter <[EMAIL PROTECTED]> http://fetter.org/
phone: +1 415 235 3778    AIM: dfetter666
  Skype: davidfetter

Remember to vote!
Consider donating to PostgreSQL: http://www.postgresql.org/about/donate

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Bruce Momjian
Gregory Stark wrote:
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
> 
> > Right now, all the patches I think are ready for review are in the patch
> > queue:
> >
> > http://momjian.postgresql.org/cgi-bin/pgpatches
> >
> > However, with feature freeze coming on Sunday, I am worried because
> > there are a significant number of patches that are not ready for
> > review because they have not been completed by their authors.
> 
> That seems like a bit of a whacky criterion to use before reviewing a patch.

"wacky"?

> It favours people who are short-sighted and don't see what possible
> improvements their code has. No code in an ongoing project like this is ever
> "completed" anyways.

It favors those who do not wait until the last minute, but complete them
well before the freeze date.

> It's also an artifact of the working model we have where patches are sent in
> big chunks and reviewed much later during a feature freeze. If we were
> committing directly into a CVS repository we would have wanted to commit these
> changes as soon as they were ready for committing, not wait until they're
> "completed". Then continue working and commit further changes. It's only

This would have CVS containing incomplete features --- and before beta,
we would either have to beg the authors to complete them, or rip them
out, neither of which we want to do.

> because there's a two step process and the reviews are mainly happening during
> the feature freeze that there's any sense that some of them are "completed".
> In fact they're not of course, there will be further changes in the same area
> once the freeze is lifted.
> 
> I think you should be asking people whether they think the code is in a state
> where it can be committed, not whether they've finished working on it. Just
> because they see further work that can be done is no reason not to commit
> useful patches that are functional as they are.

OK, but we don't want something that is ready to be committed, we need
it complete.

> In fact Postgres historically has had an even looser standard. If the code is
> ready to be committed modulo bugs then it's been included in the feature
> freeze in the past.

Well, if we know something has bugs, we fix them.  Things are committed
with bugs only because we didn't know they had bugs when they were committed.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Bruce Momjian
Simon Riggs wrote:
> On Tue, 2007-03-27 at 21:15 -0400, Bruce Momjian wrote:
> > Right now, all the patches I think are ready for review are in the patch
> > queue:
> > 
> > http://momjian.postgresql.org/cgi-bin/pgpatches
> > 
> > However, with feature freeze coming on Sunday, I am worried because
> > there are a significant number of patches that are not ready for
> > review because they have not been completed by their authors.
> 
> It's probably a good idea to have a queue of those too, to allow others
> to finish them if the original author hasn't/can't/won't. I'm not sure
> which ones you mean.

At this point, with four days left before feature freeze, if the authors
don't finish them, I doubt someone else is going to be able to do it.

> I have at least 2 patches that depend upon other patches in the queue.
> I'm not sure how to go about completing them, so any advice or guidance
> would be welcome:
> 
> - Scan_recycle_buffers depends upon synchronised scans because we agreed
> we would use the same parameter (if any exists) to govern the behaviour.
> Should I write a patch-on-patch? What happens if the patch changes after
> review? ISTM I should just wait until the first one is applied and then
> I can make the necessary changes in about an hour. The patch's main
> functionality is complete.

Yes, that is fine.  I was unaware that is why the patch wasn't "done".
Once synchronised scans is in, I will go back to you and ask for a new
version against CVS.  I will put your email in the patch queue as a
reminder.

> - Fast cluster conflicts with Heikki's cluster patch, so one of them
> will need fixing depending on which is applied first. I don't mind if it's
> me going second. I also have proposed an additional mode on VACUUM FULL
> that builds upon Heikki's patch - should I submit that also, even though
> it cannot be applied?

OK, same rules.  I am just glad that is all that was holding them up.  I
was worried.  What about the delayed fsync patch?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Joshua D. Drake

That seems like a bit of a whacky criterion to use before reviewing a patch.


"wacky"?


It favours people who are short-sighted and don't see what possible
improvements their code has. No code in an ongoing project like this is ever
"completed" anyways.


It favors those who do not wait until the last minute, but complete them
well before the freeze date.


But wouldn't it hurt those that are continuously working the patch with 
the community? Just asking.





It's also an artifact of the working model we have where patches are sent in
big chunks and reviewed much later during a feature freeze. If we were
committing directly into a CVS repository we would have wanted to commit these
changes as soon as they were ready for committing, not wait until they're
"completed". Then continue working and commit further changes. It's only


This would have CVS containing uncomplete features --- and before beta,
we would either have to beg the authors to complete them, or rip them
out, neither of which we want to do.


I agree here.


I think you should be asking people whether they think the code is in a state
where it can be committed, not whether they've finished working on it. Just
because they see further work that can be done is no reason not to commit
useful patches that are functional as they are.


OK, but we don't want something that is ready to be committed, we need
it complete.


Right, feature complete does not mean bug free; that is what the testing
period is for.





In fact Postgres historically has had an even looser standard. If the code is
ready to be committed modulo bugs then it's been included in the feature
freeze in the past.


Well, if we know something has bugs, we fix them.  Things are committed
with bugs only because we don't know it has bugs when it was committed.


Yep :)

Joshua D. Drake


--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

  http://www.postgresql.org/docs/faq


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Bruce Momjian
Joshua D. Drake wrote:
> at seems like a bit of a whacky criterion to use before reviewing a patch.
> > 
> > "wacky"?
> > 
> >> It favours people who are short-sighted and don't see what possible
> >> improvements their code has. No code in an ongoing project like this is 
> >> ever
> >> "completed" anyways.
> > 
> > It favors those who do not wait until the last minute, but complete them
> > well before the freeze date.
> 
> But wouldn't it hurt those that are continuously working the patch with 
> the community? Just asking.

Yea, it might, and it certainly hampers complex patches.  I was caught
up on the patch queue until the start of March, when I went on vacation,
Tom started on cache invalidation, _and_ more complex patches started
appearing.  With those three, we had a perfect storm and the patch queue
has gotten clogged, and I am afraid it isn't going to get unclogged
until after feature freeze.  I talked to Tom about this yesterday and he
and I feel there isn't much we can do to change that, in the sense we
are already doing the best we can, and clearing the remaining patches
after feature freeze isn't that bad.  One thing committers have to be
willing to do is to give authors ample time after feature freeze to
adjust patches after receiving feedback, because technically they should
have received feedback _before_ feature freeze.  Hopefully this will not
significantly lengthen feature freeze.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 15:48 -0400, Bruce Momjian wrote:
> What about the delayed fsync patch?

All complete bar two fiddly items, as of Mar 11, design-to-complete
posted along with patch.

Working on those now.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Gregory Stark

"Bruce Momjian" <[EMAIL PROTECTED]> writes:

> Simon Riggs wrote:
>
>> It's probably a good idea to have a queue of those too, to allow others
>> to finish them if the original author hasn't/can't/won't. I'm not sure
>> which ones you mean.
>
> At this point, with four days left before feature freeze, if the authors
> don't finish them, I doubt someone else is going to be able to do it.

This isn't the standard that we've used in the past. In the past patches that
are mostly done and need some extra work done to polish them off are
considered to have met the feature freeze. 

In any case I think Simon and you have fallen into the trap of thinking of
development as a single-person project. Most developers here, especially
first-time contributors, don't just work in the dark on their own and turn up
with a finished patch. They have questions and need help in areas. If you
insist on a "finished" patch before you even consider reviewing their work
it's not going to work.

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate


Re: [HACKERS] ECPG threads test

2007-03-28 Thread Bruce Momjian
Magnus Hagander wrote:
> Hi!
> 
> >From what I can see, the ecpg thread tests (src/interfaces/ecpg/threads)

Uh, the directory src/interfaces/ecpg/threads doesn't exist.  I assume
you mean src/interfaces/ecpg/test/thread.

> don't ever run. They rely on ENABLE_THREAD_SAFETY to be set, but even when
> I build with --enable-thread-safety, it's not set. This is because ecpg
> does not pull in pg_config.h, and also does not specify it on the
> commandline.

Yep, that's a problem.  Michael Meskes committed those files on
2006/08/02 so I am hoping he can address it.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] ECPG threads test

2007-03-28 Thread Magnus Hagander
Bruce Momjian wrote:
> Magnus Hagander wrote:
>> Hi!
>>
>> >From what I can see, the ecpg thread tests (src/interfaces/ecpg/threads)
> 
> Uh, the directory src/interfaces/ecpg/threads doesn't exist.  I assume
> you mean src/interfaces/ecpg/test/thread.

Yes, that's what I mean. Sorry 'bout that.


>> don't ever run. They rely on ENABLE_THREAD_SAFETY to be set, but even when
>> I build with --enable-thread-safety, it's not set. This is because ecpg
>> does not pull in pg_config.h, and also does not specify it on the
>> commandline.
> 
> Yep, that's a problem.  Michael Meskes committed thoses files on
> 2006/08/02 so I am hoping he can address it.

Ok. At least I'm not completely lost then.

I'll see if I can put something together while I do the rest of that
work, but if Michael (or someone else) has a quick-fix, go ahead.

//Magnus

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Bruce Momjian
Gregory Stark wrote:
> 
> "Bruce Momjian" <[EMAIL PROTECTED]> writes:
> 
> > Simon Riggs wrote:
> >
> >> It's probably a good idea to have a queue of those too, to allow others
> >> to finish them if the original author hasn't/can't/won't. I'm not sure
> >> which ones you mean.
> >
> > At this point, with four days left before feature freeze, if the authors
> > don't finish them, I doubt someone else is going to be able to do it.
> 
> This isn't the standard that we've used in the past. In the past patches that
> are mostly done and need some extra work done to polish them off are
> considered to have met the feature freeze. 

My assumption is if authors don't finish them in the next few days, they
are unlikely to finish them during some grace period during feature
freeze.  And the extra time is usually allowed for changes requested by
committers, while at this point the authors aren't done and haven't even
gotten to committer review.

> In any case I think Simon and you have fallen into the trap of thinking of
> development as a single-person project. Most developers here, especially
> first-time contributors, don't just work in the dark on their own and turn up
> with a finished patch. They have questions and need help in areas. If you
> insist on a "finished" patch before you even consider reviewing their work
> it's not going to work.

Fine, if they need help, let them ask, but many authors are not asking
for help --- they are just not completing the patches.

Or they are going to surprise us by completing them on March 31.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] ECPG threads test

2007-03-28 Thread Bruce Momjian
Magnus Hagander wrote:
> Bruce Momjian wrote:
> > Magnus Hagander wrote:
> >> Hi!
> >>
> >> >From what I can see, the ecpg thread tests (src/interfaces/ecpg/threads)
> > 
> > Uh, the directory src/interfaces/ecpg/threads doesn't exist.  I assume
> > you mean src/interfaces/ecpg/test/thread.
> 
> Yes, that's what I mean. Sorry 'bout that.
> 
> 
> >> don't ever run. They rely on ENABLE_THREAD_SAFETY to be set, but even when
> >> I build with --enable-thread-safety, it's not set. This is because ecpg
> >> does not pull in pg_config.h, and also does not specify it on the
> >> commandline.
> > 
> > Yep, that's a problem.  Michael Meskes committed thoses files on
> > 2006/08/02 so I am hoping he can address it.
> 
> Ok. At least I'm not completely lost then.
> 
> I'll see if I can put something together while I do the rest of that
> work, but if Michael (or someone else) has a quick-fix, go ahead.

I believe the solution is to test in the Makefile, and just not call the
thread part at all unless threading is enabled.  Doing the thread test
in the C code is not going to work because it will not produce output
that matches the expected file.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 17:02 -0400, Bruce Momjian wrote:

> they

It would be good to know who/what you're talking about, specifically.

Some patchers may think they have completed their work.

Not a name-and-shame, just fair warning their work is considered
incomplete and is about to be rejected as a result.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 3: Have you checked our extensive FAQ?

   http://www.postgresql.org/docs/faq


Re: [HACKERS] Modifying TOAST thresholds

2007-03-28 Thread Tom Lane
Gregory Stark <[EMAIL PROTECTED]> writes:
> "Tom Lane" <[EMAIL PROTECTED]> writes:
>> I also think that we ought to add TOAST_MAX_CHUNK_SIZE to the set of
>> compiled-in parameters that are recorded in pg_control and checked for
>> compatibility at startup (like BLCKSZ) --- this will prevent anyone from
>> shooting themselves in the foot while experimenting.

> Is there any reason to experiment with this? I would have thought we would
> divorce TOAST_MAX_CHUNK_SIZE from TOAST_THRESHOLD and hard code it as the same
> expression that's there now. Ie, the largest size that can fit in a page.

No, right now it's the largest size that you can fit 4 on a page.  It's
not obvious to me that 4 is optimal once it's divorced from TOAST_THRESHOLD.
It seems possible that the correct number is 1, and even if it's useful
to keep the tuples smaller than that, there's no reason to assume 4 is
the best number per page.
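
To make the arithmetic concrete, here is a rough stand-in sketch (these
are made-up names and a made-up overhead constant, not the real macros;
it only shows how a tuples-per-page divisor turns into a chunk size):

#define SKETCH_TUPLES_PER_PAGE	4	/* the "4" under discussion */
#define SKETCH_TUPLE_OVERHEAD	64	/* stand-in for tuple header, item
					 * pointer, chunk_id/chunk_seq and
					 * varlena overhead */

#define SKETCH_MAX_CHUNK_SIZE \
	((BLCKSZ - sizeof(PageHeaderData)) / SKETCH_TUPLES_PER_PAGE \
	 - SKETCH_TUPLE_OVERHEAD)

With BLCKSZ = 8192, a divisor of 4 gives chunks of roughly 2000 bytes,
while a divisor of 1 would allow chunks close to a full page.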

regards, tom lane

---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
   subscribe-nomail command to [EMAIL PROTECTED] so that your
   message can get through to the mailing list cleanly


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Bruce Momjian
Simon Riggs wrote:
> On Wed, 2007-03-28 at 17:02 -0400, Bruce Momjian wrote:
> 
> > they
> 
> It would be good to know who/what you're talking about, specifically.
> 
> Some patchers may think they have completed their work.
> 
> Not a name-and-shame, just fair warning their work is considered
> incomplete and is about to be rejected as a result.

Not sure how to do this without name-and-shame.  I sent out emails to
the list asking where we were on various open patches.  I can do it
again tomorrow so there is some context in the requests.  Would that
help?

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Joshua D. Drake

Bruce Momjian wrote:

Gregory Stark wrote:
  

"Bruce Momjian" <[EMAIL PROTECTED]> writes:



Simon Riggs wrote:

  

It's probably a good idea to have a queue of those too, to allow others
to finish them if the original author hasn't/can't/won't. I'm not sure
which ones you mean.


At this point, with four days left before feature freeze, if the authors
don't finish them, I doubt someone else is going to be able to do it.
  

This isn't the standard that we've used in the past. In the past patches that
are mostly done and need some extra work done to polish them off are
considered to have met the feature freeze. 



My assumption is if authors don't finish them in the next few days, they
are unlikely to finish them during some grace period during feature
freeze.  And the extra time is usually allowed for changes requested by
committers, while at this point the authors aren't done and haven't even
gotten to committer review.
  
Well hold on Bruce, that isn't quite fair. I know there are patches in
this cycle that have been waiting on reviewers/committers, not the other
way around.
Clustered indexes, for example. I know that Gavin is "this close" to
having vacuum finished for bitmap index on disk. Alvaro's vacuum patch
isn't done either, although he has submitted WIP.

Perhaps it makes sense to say:

Feature Freeze: April 1st, no "new" patches accepted for 8.3
Patch Freeze: April 15th, authors have until the 15th to address any
committer concerns


?

Sincerely,

Joshua D. Drake

  



---(end of broadcast)---
TIP 1: if posting/reading through Usenet, please send an appropriate
  subscribe-nomail command to [EMAIL PROTECTED] so that your
  message can get through to the mailing list cleanly


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Bruce Momjian
Joshua D. Drake wrote:
> > My assumption is if authors don't finish them in the next few days, they
> > are unlikely to finish them during some grace period during feature
> > freeze.  And the extra time is usually allowed for changes requested by
> > committers, while at this point the authors aren't done and haven't even
> > gotten to committer review.
> >   
> Well hold on Bruce, that isn't quite fair. I know there are patches in 
> this cycle that have been waiting on reviewers/comitters not the other 
> way around.
> Clustered indexes for example. I know that Gavin is "this close" to 
> having vacuum finished for bitmap index on disk. Alvaro's vacuum patch 
> isn't done
> either, although he has submitted WIP.

Yes, for one, I am worried about bitmap indexes, and the performance
testing time we are going to need to decide if we want it.  

In general, I am more concerned about patches where I don't see public
patches/commit, like bitmap indexes, rather than patches like HOT that
are being publicly advanced.  All the patches might be advancing, but of
course, I only see the public ones, and those are the only ones I can
guess are near completion.

I am speaking of my concerns now, rather than after feature freeze,
because author options are more limited after feature freeze.


> Perhaps it makes sense to say:
> 
> Feature Freeze: April 1st., no "new" patches accepted for 8.3
> Patch Freeze April 15th., Authors have until the 15th to address any 
> committer concerns

Well, I am OK with that, but we need _community_ agreement on that.

I realize it isn't fair that committers are behind on patches, while we
are expecting submitters to make the deadline, but there are far fewer
committers than submitters, and there was never a promise to commit
everything before feature freeze.

-- 
  Bruce Momjian  <[EMAIL PROTECTED]>  http://momjian.us
  EnterpriseDB   http://www.enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Joshua D. Drake





Perhaps it makes sense to say:

Feature Freeze: April 1st., no "new" patches accepted for 8.3
Patch Freeze April 15th., Authors have until the 15th to address any 
committer concerns


Well, I am OK with that, but we need _community_ agreement on that.

I realize it isn't fair that committers are behind on patches, while we
are expecting submitters to make the deadline, but there are far fewer
committers than submitters, and there was never a promise to commit
everything before feature freeze.


Yeah, that was kind of my thinking: everyone knows that the
committers are behind (and overworked). So if we have this two-week
breather where it is all about patch review...


Joshua D. Drake



--

  === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive  PostgreSQL solutions since 1997
 http://www.commandprompt.com/

Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL Replication: http://www.commandprompt.com/products/


---(end of broadcast)---
TIP 9: In versions below 8.0, the planner will ignore your desire to
  choose an index scan if your joining column's datatypes do not
  match


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 17:12 -0400, Bruce Momjian wrote:
> Simon Riggs wrote:
> > On Wed, 2007-03-28 at 17:02 -0400, Bruce Momjian wrote:
> > 
> > > they
> > 
> > It would be good to know who/what you're talking about, specifically.
> > 
> > Some patchers may think they have completed their work.
> > 
> > Not a name-and-shame, just fair warning their work is considered
> > incomplete and is about to be rejected as a result.
> 
> Not sure how to do this without name-and-shame.  I sent out emails to
> the list asking where we were on various open patches.  I can do it
> again tomorrow so there is some context in the requests.  Would that
> help?

Please publish the list. I'm sure it will raise eyebrows, but we can
sort out any misunderstandings; there's no shame in attempting something
and meeting a blockage - that's normal.

If everybody knows where everybody stands then we'll all be better off.
There may be other dependencies that need resolution, or last minute
decisions required to allow authors to finish.

Plus I want to check whether I'm on it, or not.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Jeremy Drake
On Wed, 28 Mar 2007, Simon Riggs wrote:

> On Wed, 2007-03-28 at 17:12 -0400, Bruce Momjian wrote:
>
> If everybody knows where everybody stands then we'll all be better off.
> There may be other dependencies that need resolution, or last minute
> decisions required to allow authors to finish.

Wasn't this the purpose of the wiki page that was set up?  I notice it has
not been updated in a while...

http://developer.postgresql.org/index.php/Todo:WishlistFor83

-- 
If the aborigine drafted an IQ test, all of Western civilization would
presumably flunk it.
-- Stanley Garn

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Guarenteeing complex referencial integrity through custom triggers

2007-03-28 Thread Joris Dobbelsteen
>-Original Message-
>From: [EMAIL PROTECTED] 
>[mailto:[EMAIL PROTECTED] On Behalf Of Greg 
>Sabino Mullane
>Sent: woensdag 28 maart 2007 2:50
>To: pgsql-hackers@postgresql.org
>Subject: Re: [HACKERS] Guarenteeing complex referencial 
>integrity through custom triggers
>
[snip]
>
>Much too elaborate - I'm sorry, but I don't think anyone here 
>is willing to wade through nearly 900 lines of code. Can you 
>break it down to a simpler test case? Or try and break the 
>schema I provided in my previous message perhaps?

It was the only readily available example I had at the moment on
very short notice.

-- Generic type of printer
CREATE TABLE printer (
  id        SERIAL NOT NULL PRIMARY KEY,
  brand TEXT   NOT NULL,
  model TEXT   NOT NULL
);
INSERT INTO printer (brand,model) VALUES ('epson','1200');
INSERT INTO printer (brand,model) VALUES ('hp','laserjet99');

-- A printer can hold one or more cartridges, distinguished by 'color'
CREATE TABLE cartridge_types (
  ptype INT  NOT NULL,
  CONSTRAINT cc_ptype FOREIGN KEY (ptype) REFERENCES printer(id) ON DELETE RESTRICT,
  color TEXT NOT NULL,
  PRIMARY KEY (ptype, color)
);
INSERT INTO cartridge_types VALUES (1,'black');
INSERT INTO cartridge_types VALUES (2,'black');
INSERT INTO cartridge_types VALUES (2,'color');

-- Specific printers in the company
CREATE TABLE printers (
  id        SERIAL  NOT NULL PRIMARY KEY,
  ptype INTEGER NOT NULL,
CONSTRAINT ptype FOREIGN KEY (ptype) REFERENCES printer(id),
  location  TEXTNOT NULL
);
INSERT INTO printers(ptype,location) VALUES (1,'Room 234');
INSERT INTO printers(ptype,location) VALUES (2,'Break room #2');
INSERT INTO printers(ptype,location) VALUES (2,'NE corner of warehouse');

-- Printers require lots of pesky hardware updates
CREATE TABLE cartridge_change (
  printer_id INT  NOT NULL,
CONSTRAINT change_printer FOREIGN KEY (printer_id) REFERENCES printers(id),
  color  TEXT NOT NULL,
  whenchanged TIMESTAMPTZ NOT NULL DEFAULT now()
);

--
-- !!!
-- NOTICE
--
-- This constraint is invalid: printer_id should reference printers, not printer...
-- If this constraint were valid, you could never change a cartridge on printer #3...
--ALTER TABLE cartridge_change ADD CONSTRAINT cc FOREIGN KEY (printer_id, color)
--  REFERENCES cartridge_types (ptype,color);

INSERT INTO cartridge_change (printer_id, color) VALUES (1,'black');
-- dropped your last insert, was a constraint violation...




>> You cannot enforce the above constraint in the database. The keyword 
>> is serializable isolation level.
>>
>> Create new printer + cartidge_defs and such.
>> Now in T1 delete a cartridge_def
>> Now in T2 insert a cartridge replacement.
>
>Works as expected in my schema when I tried it: T2 blocked at 
>the insert, waiting on T1. Once T1 committed, T2 threw an 
>error, as the insert was no longer valid. Using serializable 
>or not, same result.

As noted above, your constraint does not enforce the constraint I
mentioned (though stated with loose grammar). It cannot use the primitives
that are in the PostgreSQL database, but rather needs triggers to have it
enforced.

FOR ALL t IN cartridge_change
THERE MUST EXIST (SELECT 1
                  FROM printers
                  INNER JOIN cartridge_types
                          ON cartridge_types.ptype = printers.ptype
                  WHERE printers.id = cartridge_change.printer_id
                    AND cartridge_types.color = cartridge_change.color
                 )

If we replace a cartridge, the cartridge_type should be defined.

Obviously we follow good practices:
Before inserting a row into cartridge_change, we should check the
cartridge_types table and lock the row.
After deleting a row in cartridge_types, we should check the
cartridge_change table for constraint violation.
We will first lock the parent and then the children. This gives
consistency (normally) and reduces the chance of deadlocks.

This means two triggers must be defined (assuming you have plpgsql
available).

CREATE FUNCTION "TR_cartridgeRI_cartridge_change_insupd"() RETURNS
trigger AS
$BODY$BEGIN
-- This must be a BEFORE trigger: we lock the parent first...
PERFORM 1
        FROM printers p INNER JOIN cartridge_types c ON p.ptype = c.ptype
WHERE p.id = NEW.printer_id
AND c.color = NEW.color
FOR SHARE;

IF NOT FOUND THEN
RAISE EXCEPTION 'Cartridge_type not defined';
END IF;

RETURN NEW;
END;$BODY$
LANGUAGE 'plpgsql' VOLATILE;

CREATE TRIGGER "TR_cartridgeRI"
BEFORE INSERT OR UPDATE ON cartridge_change FOR EACH ROW
EXECUTE PROCEDURE "TR_cartridgeRI_cartridge_change_insupd"();

CREATE FUNCTION "TR_cartridgeRI_cartridge_types_upddel"() RETURNS
trigger AS
$BODY$BEGIN
-- This must be an AFTER trigger: we lock the parent first...
PERFORM 1
        FROM printers p INNER JOIN cartridge_change c ON p.id = c.printer_id
WHERE p.ptype = OLD.ptype
AND c.color = OLD.color
 

Re: [HACKERS] Patch queue concern

2007-03-28 Thread Simon Riggs
On Wed, 2007-03-28 at 17:37 -0400, Bruce Momjian wrote:

> I realize it isn't fair that committers are behind on patches, while we
> are expecting submitters to make the deadline, but there are far fewer
> committers than submitters, and there was never a promise to commit
> everything before feature freeze.

I'm expecting to review patches after freeze and I'm much more free to
do that now than I have been previously. It seems important we have a
tiered review process so that some of the more obvious flaws can be
driven out of patches as early as possible. 

If we can set expectations that every developer has to contribute review
time, committer or not, then we'll all be better off. That need not take
away authority from committers, nor give it to reviewers.

Anybody and everybody is certainly welcome to comment on my own patches.


My feeling is we should have more regular sync points where the patch
queue is emptied and everything committed or rejected. That way
rejection is less of a problem and we will all have more opportunity to
build upon each others good work.

-- 
  Simon Riggs 
  EnterpriseDB   http://www.enterprisedb.com



---(end of broadcast)---
TIP 4: Have you searched our list archives?

   http://archives.postgresql.org


[HACKERS] problems with plan invalidation

2007-03-28 Thread Kris Jurka


Running the JDBC driver's regression test suite for the first time in a 
while I got a lot of failures that I would have to guess are related to 
plan invalidation work.  Attached is a self contained test case and the 
JDBC driver's log of what protocol messages it is sending.


The end result is:

ERROR: cache lookup failed for type 2139062143

Kris Jurka

17:49:35.382 (1) PostgreSQL 8.3devel JDBC3g with SSL (build 600)
17:49:35.390 (1) Trying to establish a protocol version 3 connection to 
localhost:5830
17:49:35.413 (1)  FE=> StartupPacket(user=jurka, database=jurka, 
client_encoding=UNICODE, DateStyle=ISO)
17:49:35.417 (1)  <=BE AuthenticationOk
17:49:35.432 (1)  <=BE ParameterStatus(client_encoding = UNICODE)
17:49:35.432 (1)  <=BE ParameterStatus(DateStyle = ISO, MDY)
17:49:35.433 (1)  <=BE ParameterStatus(integer_datetimes = off)
17:49:35.433 (1)  <=BE ParameterStatus(is_superuser = on)
17:49:35.433 (1)  <=BE ParameterStatus(server_encoding = LATIN1)
17:49:35.433 (1)  <=BE ParameterStatus(server_version = 8.3devel)
17:49:35.433 (1)  <=BE ParameterStatus(session_authorization = jurka)
17:49:35.433 (1)  <=BE ParameterStatus(standard_conforming_strings = off)
17:49:35.433 (1)  <=BE ParameterStatus(TimeZone = Navajo)
17:49:35.433 (1)  <=BE BackendKeyData(pid=27730,ckey=216493403)
17:49:35.434 (1)  <=BE ReadyForQuery(I)
17:49:35.434 (1) compatible = 8.3
17:49:35.434 (1) loglevel = 2
17:49:35.434 (1) prepare threshold = 5
getConnection returning driver[className=org.postgresql.Driver,[EMAIL PROTECTED]
17:49:35.459 (1) simple execute, [EMAIL PROTECTED], maxRows=0, fetchSize=0, 
flags=17
17:49:35.460 (1)  FE=> Parse(stmt=null,query="CREATE TEMP TABLE nulltest (a 
int)",oids={})
17:49:35.460 (1)  FE=> Bind(stmt=null,portal=null)
17:49:35.460 (1)  FE=> Describe(portal=null)
17:49:35.461 (1)  FE=> Execute(portal=null,limit=0)
17:49:35.461 (1)  FE=> Sync
17:49:35.482 (1)  <=BE ParseComplete [null]
17:49:35.482 (1)  <=BE BindComplete [null]
17:49:35.482 (1)  <=BE NoData
17:49:35.482 (1)  <=BE CommandStatus(CREATE TABLE)
17:49:35.483 (1)  <=BE ReadyForQuery(I)
17:49:35.484 (1) simple execute, [EMAIL PROTECTED], maxRows=0, fetchSize=0, 
flags=21
17:49:35.484 (1)  FE=> Parse(stmt=null,query="INSERT INTO nulltest (a) VALUES 
($1)",oids={23})
17:49:35.485 (1)  FE=> Bind(stmt=null,portal=null,$1=)
17:49:35.485 (1)  FE=> Describe(portal=null)
17:49:35.485 (1)  FE=> Execute(portal=null,limit=1)
17:49:35.485 (1)  FE=> Sync
17:49:35.487 (1)  <=BE ParseComplete [null]
17:49:35.489 (1)  <=BE ErrorMessage(ERROR: cache lookup failed for type 
2139062143)
org.postgresql.util.PSQLException: ERROR: cache lookup failed for type 
2139062143
at 
org.postgresql.core.v3.QueryExecutorImpl.receiveErrorResponse(QueryExecutorImpl.java:1548)
at 
org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1316)
at 
org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:191)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:452)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:351)
at 
org.postgresql.jdbc2.AbstractJdbc2Statement.executeUpdate(AbstractJdbc2Statement.java:305)
at TestSetNull.main(TestSetNull.java:16)
SQLException: SQLState(XX000)
17:49:35.491 (1)  <=BE ReadyForQuery(I)
import java.sql.*;

public class TestSetNull {

public static void main(String args[]) throws Exception {
Class.forName("org.postgresql.Driver");

Connection conn = 
DriverManager.getConnection("jdbc:postgresql://localhost:5830/jurka?loglevel=2","jurka","");

Statement stmt = conn.createStatement();
stmt.execute("CREATE TEMP TABLE nulltest (a int)");
stmt.close();

        PreparedStatement ps = conn.prepareStatement("INSERT INTO nulltest (a) VALUES (?)");
ps.setNull(1, Types.INTEGER);
ps.executeUpdate();

conn.close();
}
}

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Gregory Stark

"Bruce Momjian" <[EMAIL PROTECTED]> writes:

>> It favours people who are short-sighted and don't see what possible
>> improvements their code has. No code in an ongoing project like this is ever
>> "completed" anyways.
>
> It favors those who do not wait until the last minute, but complete them
> well before the freeze date.

What is this "complete" you keep talking about? Should I stop working on the
sort/limit patch even though Heikki pointed out a few things to clean up and
the cost model isn't updated yet just so that you'll consider it "complete"
and put it on the patch queue? If I don't stop working on it you think we
should just ignore it even if it's in a usable state now? Even the cost model
changes could be done pretty easily with some guidance from a review.

>> It's also an artifact of the working model we have where patches are sent in
>> big chunks and reviewed much later during a feature freeze. If we were
>> committing directly into a CVS repository we would have wanted to commit 
>> these
>> changes as soon as they were ready for committing, not wait until they're
>> "completed". Then continue working and commit further changes. It's only
>
> This would have CVS containing uncomplete features --- and before beta,
> we would either have to beg the authors to complete them, or rip them
> out, neither of which we want to do.

You don't want to commit something if it's in an unusable state and would have
to be ripped out without more work. I said "as soon as they're ready for
committing" as opposed to "completed".

You're asking people if they've stopped working on patches and you're
surprised to find that there are a lot of patches people are still working on.

That's silly; of course people are still working on them. Many of these tasks
are open-ended and can be improved as long as we have time. Just because
they're still working on them doesn't necessarily mean what they have so far
isn't worth committing as is yet.

> OK, but we don't want something that is ready to be committed, we need
> it complete.

So how many more releases before you think Postgres is "complete"? 

-- 
  Gregory Stark
  EnterpriseDB  http://www.enterprisedb.com


---(end of broadcast)---
TIP 6: explain analyze is your friend


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Carlo Florendo

Gregory Stark wrote:

"Bruce Momjian" <[EMAIL PROTECTED]> writes:
That's silly, of course people are still working on them, many of these tasks
are open ended and can be improved as long as we have time. just because
they're still working on them doesn't necessarily mean what they have so far
isn't worth committing as is yet.


OK, but we don't want something that is ready to be committed, we need
it complete.


So how many more releases before you think Postgres is "complete"? 


You are using the word complete as in final and unalterable.  That's not, 
it seems to me, what Bruce means.  Bruce has a point, and a valid and 
sensible one at that.


A patch that is ready to be committed does not mean it is usable.  Just 
because you can commit a patch does not mean that the patch will be useful.


Well, if a patch author has promised to supply a patch for the X function, 
and has not completed a stable and generally usable patch for X, then the 
patch is not worth committing.


Thank you very much.

Best Regards,

Carlo

--
Carlo Florendo
Software Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, Diliman 1101, Quezon City
Philippines
http://www.astra.ph

--
The Astra Group of Companies
5-3-11 Sekido, Tama City
Tokyo 206-0011, Japan
http://www.astra.co.jp

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Carlo Florendo

Gregory Stark wrote:


In any case I think Simon and you have fallen into the trap of thinking of
development as a single-person project. Most developers here, especially
first-time contributors, don't just work in the dark on their own and turn up
with a finished patch. They have questions and need help in areas. If you
insist on a "finished" patch before you even consider reviewing their work
it's not going to work.


This isn't about "finished" patches.  It's about "commit-worthy" patches, 
and since the term is very subjective, there has to be some way for an 
arbiter to be able to say that such a patch is worth committing.  And I 
think the arbiter should not come from either of the two opposing sides with 
diametrically opposed claims or opinions.


Thank you very much.

Best Regards,

Carlo


--
Carlo Florendo
Software Engineer/Network Co-Administrator
Astra Philippines Inc.
UP-Ayala Technopark, Diliman 1101, Quezon City
Philippines
http://www.astra.ph

--
The Astra Group of Companies
5-3-11 Sekido, Tama City
Tokyo 206-0011, Japan
http://www.astra.co.jp

---(end of broadcast)---
TIP 2: Don't 'kill -9' the postmaster


Re: [HACKERS] CREATE INDEX and HOT - revised design

2007-03-28 Thread Pavan Deolasee

On 3/29/07, Florian G. Pflug <[EMAIL PROTECTED]> wrote:


Pavan Deolasee wrote:

> Tom, please correct me if I am wrong. But ISTM that this idea might
> work in this context. In get_relation_info(), we would check if
"xcreate"
> xid stored in pg_index for the index under consideration is seen
> committed with respect to the snapshot at that point of time.
> Even if the snapshot changes later and index becomes valid, we
> might not replan and hence not use index. But that doesn't seem
> like a big problem to me.

That problem are usecases like
PREPARE my_plan  ;
BEGIN;
EXECUTE my_plan  ;
COMMIT ;




Oh, I see.


Is that "PREPARE" even run inside a transaction? Even if it is, it

probably won't have created a snapshot...




In this specific context, this particular case is easy to handle because
we are only concerned about the serializable transactions started before
CREATE INDEX commits. If PREPARE can see the new index, it
implies that the CI transaction is committed. So the transaction
starting after that can only see the tuple version that we have indexed.


But I don't know if this is the only case or there are more cases to
consider :-(


Thanks,
Pavan

--

EnterpriseDB http://www.enterprisedb.com


Re: [HACKERS] Patch queue concern

2007-03-28 Thread Tom Lane
"Simon Riggs" <[EMAIL PROTECTED]> writes:
> My feeling is we should have more regular sync points where the patch
> queue is emptied and everything committed or rejected.

No doubt, but the real problem here is that reviewing/committing other
people's patches is not fun, it's just work :-(.  So it's no surprise
that it tends to get put off.  Not sure what to do about that.

regards, tom lane

---(end of broadcast)---
TIP 7: You can help support the PostgreSQL project by donating at

http://www.postgresql.org/about/donate