Re: [HACKERS] [pgsql-hackers-win32] fsync with sync, and Win32 unlink

2004-03-11 Thread Magnus Hagander
 I have talked to Tom today and he is willing to implement the 
 discussed method of doing fsync on every file modified 
 between checkpoints, and add unlink handling for open files for Win32.

Great news. I'm sure this will benefit Unix platforms as well, given
the earlier discussions.

<snip>

 3)  On checkpoint request, either by the user or postmaster, 
 the background writer will create a subprocess, do a sync(), 
 wait, then do fsync of all files that were marked as dirty.  
 The sync() should flush out most of the dirty files in an 
 optimal manner.

Please make a config variable to make this sync() call optional. This
goes for Unix as well, and not just win32 (which doesn't support it).
Consider either a box with many different postgresql instances, or one
that runs both postgresql and other software. Issuing sync() in that
situation will cause sync of a lot of data that probably doesn't need
syncing.
But it'd probably be a very good thing on a dedicated server, giving the
kernel the chance to optimise.
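
In outline, something like this (the checkpoint_use_sync GUC and the
dirty-file bookkeeping here are a hypothetical sketch, not actual
PostgreSQL code):

#include <stdbool.h>
#include <unistd.h>

extern bool checkpoint_use_sync;    /* proposed GUC: allow sync()? */
extern int  NumDirtyFiles;          /* files modified since last checkpoint */
extern int  DirtyFileFds[];

static void
CheckpointFlushDirtyFiles(void)
{
    /*
     * On a dedicated server, sync() lets the kernel schedule most of the
     * dirty writes in an optimal order before we fsync each file.  On a
     * shared box (or on Win32, which has no sync()), skip it.
     */
#ifndef WIN32
    if (checkpoint_use_sync)
        sync();
#endif

    /* fsync() every file modified since the last checkpoint. */
    for (int i = 0; i < NumDirtyFiles; i++)
        (void) fsync(DirtyFileFds[i]);
}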


//Magnus



Re: [HACKERS] [pgsql-hackers-win32] fsync with sync, and Win32 unlink

2004-03-11 Thread Zeugswetter Andreas SB SD

 Consider either a box with many different postgresql instances, or one
 that runs both postgresql and other software. Issuing sync() in that
 situation will cause sync of a lot of data that probably doesn't need
 syncing.
 But it'd probably be a very good thing on a dedicated server, giving the
 kernel the chance to optimise.

It is not like the sync is done every few seconds! It is currently done
every 5 minutes (I actually think this is too frequent now that we have
the bgwriter; 10 - 20 min would be sufficient). So imho even on a heavily
otherwise-used system the sync will be better.

Andreas



Re: [HACKERS] How to get RelationName ??

2004-03-11 Thread Tom Lane
Ramanujam H S Iyengar [EMAIL PROTECTED] writes:
  The optimizer has no need, ever, to find a relation by name; all it ever
  sees are predigested relation OIDs.  So you are not making a lot of
  sense here.  You certainly cannot assume that a search-path lookup is
  appropriate for a relation that the optimizer is dealing with.

 I am trying to plug in another optimizer module that we have developed,
 which works on relation names and gives the optimal plan. I am trying to
 convert the plans given by this (our) optimizer to the
 Plan structure of PostgreSQL. So finally I have to convert the relation
 names back to their relOids.

If your optimizer emits unqualified relation names then it is broken,
as it will never be safe to use in the presence of schemas.  People
will not want to use a database that might apply updates meant for
a.foo to b.foo.

regards, tom lane



Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Fabien COELHO

 Dear patchers,

Sorry, wrong list:-(

-- 
Fabien.



Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Josh Berkus
Scott,

 I like it.  Would a multiplier be acceptable?  
 default_stats_index_multiplier = 10

Yeah, I thought about that, but a multiplier would be harder to manage for 
most people.  I mean, what if your default_stats are at 25 and you want 
your index_stats at 40?   PITA.   Also, if you wanted to increase the default 
stats but *forgot* that the index_stats were a multiplier ...

I think a straight number is less confusing.
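
Concretely, the two shapes on the table would look something like this in
postgresql.conf (the index settings are just this thread's proposals, not
existing GUCs):

# straight number (the proposal):
default_statistics_target = 25
default_stats_index = 40                # absolute target for indexed columns

# multiplier (the alternative):
# default_stats_index_multiplier = 10   # indexed columns get 25 * 10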

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Tom Lane
Fabien COELHO [EMAIL PROTECTED] writes:
 Please find attached my first attempt at providing a localisation
 information on the client side for syntax errors in psql.

"Localisation" tends to mean something quite different around here.
I'd suggest renaming the function to something like ReportSyntaxErrorPosition.

Would it read better to add "..." at the front and/or back of the query
extract, when you have actually truncated text at that end?

The "on line N" bit seems just noise to me.

regards, tom lane



Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Tom Lane
Fabien COELHO [EMAIL PROTECTED] writes:
  The "on line N" bit seems just noise to me.

 It depends.

I can see that it would be useful in a very large query.  Perhaps
include it only when the query has more than N lines, for some N
like three to five or so?

Another possibility is to keep the cursor as just "^", and bury the
line info in the query extract.  For instance:

Short single-line query, no truncation needed:

QUERY: SELECT * FROM foo WHRE bar;
QUERY:                   ^

Truncation on the right just adds ..., no other change needed:

QUERY: SELECT * FROM foo WHRE lots-of-conditions...
QUERY:                   ^

When you truncate on the left, count the number of newlines removed,
and emit "LINE n:" if there were any:

QUERY: LINE 4: FROM foo WHRE lots-of-conditions...
QUERY:                  ^
or
QUERY: LINE 4: ...FROM foo WHRE lots-of-conditions...
QUERY:                     ^

(So "LINE n" would never say "LINE 1", only larger line numbers.)

Here I'm imagining that the leading ... gets inserted when text has been
removed after the start of the indicated line.  So another possible
output format is

QUERY: ...FROM foo WHRE lots-of-conditions...
QUERY:             ^

if you removed some text but not any newlines from the start of the
query.

I think this wouldn't be terribly complex to implement, and it would
make things look fairly nice for both short and long queries.
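
A rough sketch of that logic in C (illustrative only, not the eventual
psql code; it assumes a single-byte encoding and no newline inside the
printed extract):

#include <stdio.h>
#include <string.h>

#define EXTRACT_WIDTH 40

static void
report_error_position(const char *query, int errpos)
{
    int start = (errpos > EXTRACT_WIDTH / 2) ? errpos - EXTRACT_WIDTH / 2 : 0;
    int line = 1;
    char prefix[32] = "";

    /* Count newlines removed from the left of the extract. */
    for (int i = 0; i < start; i++)
        if (query[i] == '\n')
            line++;
    if (line > 1)                        /* never print "LINE 1" */
        snprintf(prefix, sizeof prefix, "LINE %d: ", line);

    const char *ldots = (start > 0) ? "..." : "";
    int len = (int) strlen(query) - start;
    const char *rdots = (len > EXTRACT_WIDTH) ? "..." : "";
    if (len > EXTRACT_WIDTH)
        len = EXTRACT_WIDTH;

    printf("QUERY: %s%s%.*s%s\n", prefix, ldots, len, query + start, rdots);
    /* The caret lands exactly under query[errpos] within the extract. */
    printf("QUERY: %*s^\n",
           (int) (strlen(prefix) + strlen(ldots)) + (errpos - start), "");
}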

One last point: it seems a bit odd to use "QUERY:" as the prefix for both
the query extract and the cursor line.  I don't have a suggestion what
to use instead, offhand.

regards, tom lane



Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Fabien COELHO

Dear Tom,

   The "on line N" bit seems just noise to me.
  It depends.

 I can see that it would be useful in a very large query.  Perhaps
 include it only when the query has more than N lines, for some N
 like three to five or so?

Yes, I can do that.

 Another possibility is to keep the cursor as just "^", and bury the
 line info in the query extract.  For instance:

Well, I want to preserve space for the query extract, that's where the
context information is! If I put more information there, I'll have to
reduce the extract length.

Moreover the cursor line information content is small, so it seems better
to put the line information there.

 (So "LINE n" would never say "LINE 1", only larger line numbers.)

I agree that "LINE 1" looks pretty useless, especially if there is only
one line ;-)

 I think this wouldn't be terribly complex to implement, and it would
 make things look fairly nice for both short and long queries.

There is also an alignment issue here: depending on the number of
digits in the line number, the cursor line would have to be shifted accordingly.

 One last point: it seems a bit odd to use "QUERY:" as the prefix for both
 the query extract and the cursor line.  I don't have a suggestion what
 to use instead, offhand.

I agree, but couldn't think of a better choice either.

There is also a localisation issue here, as the translation of both lines
must match so that the alignment is kept. I thought that if it is the very
same word, the translation should be the same.

Thanks for your comments and ideas, I'll submit a v3 on (my) tomorrow.

Have a nice day,

-- 
Fabien Coelho - [EMAIL PROTECTED]



Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Fabien COELHO

Dear Tom,

  Please find attached my first attempt at providing a localisation
  information on the client side for syntax errors in psql.

 Localisation tends to mean something quite different around here. I'd
 suggest renaming the function to something like ReportSyntaxErrorPosition.

Ok. "Localisation" was the right French word ;-)

 Would it read better to add ... at the front and/or back of the query
 extract, when you have actually truncated text at that end?

I thought about it as well, but did not implement it.
I agree that it would look better. I can have a go.

 The "on line N" bit seems just noise to me.

It depends.

If you're in the middle of a sql-script, as I often am, giving the line
within the statement isn't bad. Say:

SELECT o.titre
FROM collections AS col, ouvrages AS o, a_ecrit AS e, auteurs AS a
WHERE o.id=e.oeuvre AND e.auteur=a.id AND o.collection=col.id
 AND col.nom LIKE '%Spirou%' AND a.nom<>'Franquin'
EXCEPT
SELECT o.titre
FROM collections AS col, ouvrages AS o, a_ecrit AS e, auteurs AS a
WHERE o.id=e.oeuvre AND e.auteur=a.id AND o.collection=col.id
 AND col.nom LIKE '%Spirou%' AND a.nom='Franquin';

All lines look the same... So I would prefer to keep it.

Well, if it is the only issue against the adoption of the patch, I will do
without the line number.

I'll submit a new patch in a moment.

Thanks for your comments,

-- 
Fabien.



Re: [HACKERS] [pgsql-hackers-win32] fsync with sync, and Win32 unlink

2004-03-11 Thread Greg Stark

Zeugswetter Andreas SB SD [EMAIL PROTECTED] writes:

  Consider either a box with many different postgresql instances, or one
  that runs both postgresql and other software. Issuing sync() in that
  situation will cause sync of a lot of data that probably doesn't need
  syncing.
  But it'd probably be a very good thing on a dedicated server, giving the
  kernel the chance to optimise.
 
 It is not like the sync is done every few seconds! It is currently done
 every 5 minutes (I actually think this is too frequent now that we have
 the bgwriter; 10 - 20 min would be sufficient). So imho even on a heavily
 otherwise-used system the sync will be better.

Well, the further apart it is the more dangerous it is to be calling sync...

I've seen some pretty severe damage caused by calling sync(2) on a loaded
system. The system in question was in the process of copying data to an NFS
mounted archival site. When the sync hit basically everything stopped until
the buffered network writes could be synced. The live database was basically
frozen for a few seconds and the web site nearly crashed. The sysadmin had to
send out a notice asking everybody to refrain from using sync until the
archival process had completed.

Now that's not a common situation, but I think it shows how doing things that
cause system-wide effects is unwise.


-- 
greg




Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Tom Lane
Fabien COELHO [EMAIL PROTECTED] writes:
  Another possibility is to keep the cursor as just "^", and bury the
  line info in the query extract.  For instance:

 Well, I want to preserve space for the query extract, that's where the
 context information is! If I put more information there, I'll have to
 reduce the extract length.

But in the form you are using, you have to reserve that space anyway.
Consider an error marker at the end of the line:

QUERY: select 'text that is long enough to be a problem':
QUERY:                                                  ^ on line 1

Having to leave space for "on line 1" at the right end is not better
than putting "LINE 1:" at the left end.

 Moreover the cursor line information content is small, so it seems better
 to put the line information there.

It seems confusing to me.  The cursor is just a marker.  Consider also
the next step up from this scheme: instead of assuming a dumb terminal,
suppose you can use color or underlining or reverse-video or some such.
Then you might want to do just one line of output:

QUERY: SELECT foo FROM bar WHRE baz;
                           ----

where my underlining represents using the terminal's highlight ability.
With such a scheme, putting "LINE n" on the same line fits naturally.

 There is also an alignment issue here, as depending on the number of
 figures, the cursor line would have to be fixed accordingly.

Sure, but you have to buy into that anyway if you are going to do "on line N".
It's not going to add more than about one strlen() call to what you need
to do.

 There is also a localisation issue here, as the translation of both lines
 must match so that the alignment is kept. I thought that if it is the very
 same word, the translation should be the same.

Hmm, that's a good point, although perhaps you could insert a note to
translators to add space as needed to make the two strings the same
length.

regards, tom lane



Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread scott.marlowe
On Thu, 11 Mar 2004, Josh Berkus wrote:

 Scott,
 
  I like it.  Would a multiplier be acceptable?  
  default_stats_index_multiplier = 10
 
 Yeah, I thought about that, but a multiplier would be harder to manage for 
 most people.  I mean, what if your default_stats are at 25 and you want 
 your index_stats at 40?   PITA.   Also, if you wanted to increase the default 
 stats but *forgot* that the index_stats were a multiplier ...
 
 I think a straight number is less confusing.

But possibly more error prone.  If you crank up the default statistics to 
50, but the index default is still 25...  OTOH, you could always have the 
setting used for the index default be whichever is greater... hmmm.




[HACKERS] creating index on changed field type

2004-03-11 Thread David Smith
Hello,
	the subject is obscure, so I will try to explain. I would like to 
develop an index based on a text field (or tsvector, stolen from tsearch2), 
but containing a different type (for example cstring, varchar, etc.) in 
order to tokenize the original field. I would like to use the postgresql 
btree implementation, but AFAICS I cannot do it. Example:

CREATE TABLE test (id int, mytext text);
CREATE INDEX myindex on test USING myindex (mytext) ;
INSERT INTO test VALUES(1,'this is my first text');
In the index I do not want to keep the whole phrase, but words derived from it
('this', 'is', 'my', 'first', 'text').
My idea was to create functions mybtgettuple, mybtinsert, mybtbeginscan,
mybtrescan and so on, in every case ignoring the original
IndexTuple, creating a set of new IndexTuples (one for every term) and
invoking the original functions.

The problem is that index_create() in catalog/index.c creates everything
in the system tables, in particular the type of the index field.
Should I forget about btrees and move to GIST, or is there any hack
that could solve my problem? Please help me.
Thanks in advance,
David
ps. maybe I should create the index on a TEXT field and store the terms (words
from the original field) also as TEXT type? Will it work?





Re: [HACKERS] creating index on changed field type

2004-03-11 Thread Tom Lane
David Smith [EMAIL PROTECTED] writes:
 Should I forget about btrees and move to GIST,

Yes.  There's no provision in the btree code for an index storage type
different from the column datatype.
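
For reference, a sketch of the GiST route using the contrib/tsearch2
pieces David mentioned (illustrative only; check the contrib docs for
the exact signatures):

-- keep a tsvector column holding the tokenized form of the text
ALTER TABLE test ADD COLUMN mytext_vec tsvector;
UPDATE test SET mytext_vec = to_tsvector(mytext);
CREATE INDEX myindex ON test USING gist (mytext_vec);
-- the GiST index then stores individual lexemes rather than the
-- whole phrase, which is what the btree AM cannot do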

regards, tom lane



Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Josh Berkus
Scott,

 But possible more error prone.  If you crank up the default statistics to 
 50, but the index default is still 25...  OTOH, you could always have the 
 setting of used for index default be whichever is greater... hmmm.

Well, I'm not 100% opposed to a multiplier.   I'd like to take a poll of DBAs 
to find out which they would find more accessible.   But since most people 
seem to be ignoring this thread, I'm not sure we'll get much response ...

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [HACKERS] [pgsql-hackers-win32] fsync with sync, and Win32 unlink

2004-03-11 Thread Tom Lane
Greg Stark [EMAIL PROTECTED] writes:
 I've seen some pretty severe damage caused by calling sync(2) on a loaded
 system. The system in question was in the process of copying data to an NFS
 mounted archival site. When the sync hit basically everything stopped until
 the buffered network writes could be synced. The live database was basically
 frozen for a few seconds and the web site nearly crashed. The sysadmin had to
 send out a notice asking everybody to refrain from using sync until the
 archival process had completed.

This seems, um, hard to believe.  Did he shut down the standard syncer
daemon?  I have never seen a Unix system that would allow more than
thirty seconds' worth of unwritten buffers to accumulate, and would not
care to use one if it existed.

regards, tom lane



Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Peter Eisentraut
Fabien COELHO wrote:
 There is also a localisation issue here, as the translation of both
 lines must match so that the alignment is kept. I thought that if it
 is the very same word, the translation should be the same.

You can just indent with as many spaces.  This is done in other places 
as well.

Btw., a better word than "query" would be "statement".




Re: [HACKERS] creating index on changed field type

2004-03-11 Thread David Smith
Tom Lane wrote:

David Smith [EMAIL PROTECTED] writes:

Should I forget about btrees and move to GIST,


Yes.  There's no provision in the btree code for an index storage type
different from the column datatype.
			regards, tom lane


Thank you for the reply.
Let us suppose that we retain the type of the field (column), but instead of 
storing the original value (key), we store tokens (instead of 'this is my 
first text', we would keep 5 tokens (5 different BTItems), respectively: 
'this', 'is', 'my', 'first', 'text'). Will it work, or is there another 
catch I cannot see?

My performance tests suggested that GIST would be slower than the original 
btree index. Maybe I made a mistake somehow...

Best regards,
David


Re: [HACKERS] [pgsql-hackers-win32] fsync with sync, and Win32 unlink

2004-03-11 Thread Greg Stark

Tom Lane [EMAIL PROTECTED] writes:

 This seems, um, hard to believe.  Did he shut down the standard syncer
 daemon?  I have never seen a Unix system that would allow more than
 thirty seconds' worth of unwritten buffers to accumulate, and would not
 care to use one if it existed.

Well, it was Solaris, so it didn't have the BSD-style 30s sync strategy. But
this was a large NFS file transfer to another host on a 100Mb/s network. In
30s there could be a lot of writes buffered up.

I'm not saying the behaviour was ideal, and I don't know exactly why it
interfered with anything else. But I'm not entirely surprised either.

-- 
greg




Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 Well, I'm not 100% opposed to a multiplier.   I'd like to take a poll of DBAs
 to find out which they would find more accessible.   But since most people 
 seem to be ignoring this thread, I'm not sure we'll get much response ...

Maybe you should ask on -admin or -general.  Personally I thought there
wasn't anything to say until someone did some experiments to show
whether an indexed-column differential is really worthwhile and what a
plausible default value would be.  The idea sounds good in the abstract,
but will it really help or just be another useless setting?

regards, tom lane



Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Josh Berkus
Tom,

 Maybe you should ask on -admin or -general.  Personally I thought there
 wasn't anything to say until someone did some experiments to show
 whether an indexed-column differential is really worthwhile and what a
 plausible default value would be.  The idea sounds good in the abstract,
 but will it really help or just be another useless setting?

Yeah, that's our next step, a test.

On Monday,  I hope to have comparative stats for a difficult database on the 3 
solutions (leaving things as-is, raising the general default stats, and doing 
index_stats).

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [HACKERS] unsafe floats

2004-03-11 Thread Neil Conway
Dennis Bjorklund [EMAIL PROTECTED] writes:
 In C one can set a signal handler to catch floating point exceptions
 (SIGFPE). Without a handler you can get NaN and Infinity as the
 result of mathematical operations.

Okay, I think this would be a reasonable set of behavior:

- define a new GUC var that controls how exceptional floating
  point values (NaN, inf, -inf) are handled.

- by default, we still raise an error when a floating point
  operation results in NaN / inf / etc.; if the GUC var is toggled
  from its default value, no error is raised. This preserves
  backward compatibility with applications that expect floating
  point overflow to be reported, for example.

- that also means that, by default, we should reject 'NaN',
  'Infinity', and '-Infinity' as input to float4/float8: if these
  values are illegal as the result of FP operations by default, it
  seems only logical to disallow them as input to the FP types.

Does that sound ok?
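
Concretely, the proposed behavior would look something like this (the GUC
name here is invented purely for illustration):

-- hypothetical GUC name, default off:
SET allow_exceptional_floats = off;
SELECT '1e300'::float8 * '1e300'::float8;  -- ERROR: overflow reported
SELECT 'Infinity'::float8;                 -- ERROR: rejected as input

SET allow_exceptional_floats = on;
SELECT '1e300'::float8 * '1e300'::float8;  -- returns Infinity
SELECT 'Infinity'::float8;                 -- accepted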

Unfortunately, I have more important things that I need to get wrapped
up in time for 7.5, so I can't finish this now. Dennis, would you like
to implement this?

Finally, I've attached a revised patch for 'Infinity' input to float4
and float8: this avoids breaking the regression tests by only allowing
'Infinity' and '-Infinity' as input, not as a legal result of FP
operations. This is obviously incomplete, as discussed above, but it
might be a good starting point. Should I commit this?

-Neil

Index: doc/src/sgml/syntax.sgml
===================================================================
RCS file: /var/lib/cvs/pgsql-server/doc/src/sgml/syntax.sgml,v
retrieving revision 1.89
diff -c -r1.89 syntax.sgml
*** a/doc/src/sgml/syntax.sgml	29 Nov 2003 19:51:37 -0000	1.89
--- b/doc/src/sgml/syntax.sgml	11 Mar 2004 21:18:57 -0000
***************
*** 359,364 ****
--- 359,382 ----
  </literallayout>
  </para>
  
+  <para>
+   In addition, there are several special constant values that are
+   accepted as numeric constants. The <type>float4</type> and
+   <type>float8</type> types allow the following special constants:
+ <literallayout>
+ Infinity
+ -Infinity
+ NaN
+ </literallayout>
+   These represent the special values <quote>infinity</quote>,
+   <quote>negative infinity</quote>, and
+   <quote>not-a-number</quote>, respectively. The
+   <type>numeric</type> type only allows <literal>NaN</>, and the
+   integral types do not allow any of these constants. These
+   constants are treated without sensitivity to case. These values
+   should be enclosed in single quotes.
+  </para>
+ 
  <para>
   <indexterm><primary>integer</primary></indexterm>
   <indexterm><primary>bigint</primary></indexterm>
Index: src/backend/utils/adt/float.c
===================================================================
RCS file: /var/lib/cvs/pgsql-server/src/backend/utils/adt/float.c,v
retrieving revision 1.98
diff -c -r1.98 float.c
*** a/src/backend/utils/adt/float.c	11 Mar 2004 02:11:13 -0000	1.98
--- b/src/backend/utils/adt/float.c	11 Mar 2004 20:53:15 -0000
***************
*** 114,134 ****
  
  
  /*
!  * check to see if a float4 val is outside of
!  * the FLOAT4_MIN, FLOAT4_MAX bounds.
   *
!  * raise an ereport warning if it is
! */
  static void
  CheckFloat4Val(double val)
  {
- 	/*
- 	 * defining unsafe floats's will make float4 and float8 ops faster at
- 	 * the cost of safety, of course!
- 	 */
- #ifdef UNSAFE_FLOATS
- 	return;
- #else
  	if (fabs(val) > FLOAT4_MAX)
  		ereport(ERROR,
  (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
--- 114,127 ----
  
  
  /*
!  * check to see if a float4 val is outside of the FLOAT4_MIN,
!  * FLOAT4_MAX bounds.
   *
!  * raise an ereport() error if it is
!  */
  static void
  CheckFloat4Val(double val)
  {
  	if (fabs(val) > FLOAT4_MAX)
  		ereport(ERROR,
  (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
***
*** 137,163 
  		ereport(ERROR,
  (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
   errmsg("type \"real\" value out of range: underflow")));
- 
- 	return;
- #endif   /* UNSAFE_FLOATS */
  }
  
  /*
!  * check to see if a float8 val is outside of
!  * the FLOAT8_MIN, FLOAT8_MAX bounds.
   *
!  * raise an ereport error if it is
   */
  static void
  CheckFloat8Val(double val)
  {
- 	/*
- 	 * defining unsafe floats's will make float4 and float8 ops faster at
- 	 * the cost of safety, of course!
- 	 */
- #ifdef UNSAFE_FLOATS
- 	return;
- #else
  	if (fabs(val) > FLOAT8_MAX)
  		ereport(ERROR,
  (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
--- 130,146 ----
  		ereport(ERROR,
  (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
   errmsg("type \"real\" value out of range: underflow")));
  }
  
  /*
!  * check to see if a float8 val is outside of the FLOAT8_MIN,
!  * FLOAT8_MAX bounds.
   *
!  * raise an ereport() error if it is
   */
  static void
  CheckFloat8Val(double val)
  {
  	if (fabs(val) > FLOAT8_MAX)
  		ereport(ERROR,
  

[HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
I combed the archives but could not find a discussion on this, and am 
amazed it hasn't come up before.

My experience with Oracle (and now limited experience with Pg) is that 
the major choke point in performance is not the CPU or read I/O, it is 
the log performance of big update and select statements.

Essentially, the data is written twice: first to the log and then the 
data files.  This would be ok except the transaction is regularly frozen 
while the log files sync to disk with a bunch of tiny (8KB for Oracle 
and Pg) write requests.

I realize that the logs must be there to ensure crash recovery and that 
PITR is on the way to supplement this.

If a transaction will do large updates or inserts, why don't we just log 
the parsed statements in the WAL instead of the individual data blocks 
that have changed?  Then, the data files could be fsync()ed every 
checkpoint, but essentially no write I/O takes place in the interim.

Outside of checkpoints, large updates would only need to fsync() a very 
small addition to the log files.

Recovery could be similar to how I understand it currently works:
1. Roll back in-flight changes
2. Roll forward log entries in order, either applying direct changes to the 
data or re-executing the parsed command in the log.

Some informal testing suggests that we get a factor of 8 improvement in 
speed here if we completely disable fsync() in large updates under Pg.

I would suspect that a big portion of those gains would be preserved if 
fsync() calls were limited to checkpoints and to saving the parsed SQL 
command in the log.

Why have I not seen this in any database?

There must be a reason.

Thanks in advance.

Sincerely,
Marty


Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Rod Taylor
 If a transaction will do large updates or inserts, why don't we just log 
 the parsed statements in the WAL instead of the individual data blocks 

UPDATE table SET col = random();





Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Sailesh Krishnamurthy
>>>>> "Marty" == Marty Scholes [EMAIL PROTECTED] writes:

    Marty> Why have I not seen this in any database?
    Marty> There must be a reason.

For ARIES-style systems, logging parsed statements (commonly called
"logical logging") is not preferred compared to logging data items
(physical or physiological logging). 

A major reason for this is that logical logs make recovery contingent
on being able to execute the parsed statements. This execution
might, however, not be possible if the system is itself not in a
consistent state .. as is normally the case during recovery. 

What if, for instance, it's the catalog tables that were hosed when
the system went down? It may be difficult to execute the parsed
statements without the catalogs. 

For this reason, a major goal of ARIES was to have each and every data
object (tables/indexes) individually recoverable. So ARIES follows
page-oriented redo logging.

Having said that, page-oriented undo logging can be a pain when B-tree
pages split. For higher concurrency, ARIES uses logical undo
logging. In this case, the logs are akin to your parsed statement
idea.

In any case, the only places where parsed statements are useful, imo, are
with searched updates that cause a large number of records to change
and with "insert into ... select" statements.

Then there is also the case that this, the parsed-statements
approach, is not a general solution. How would you handle the "update
... where current of cursor" scenarios? In this case, there is some
application logic that determines the precise records that change and
how they change. 

Ergo, it is my claim that while logical redo logging does have some
benefits, it is not a viable general solution.
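
To make the distinction concrete, a rough sketch of the two record shapes
(purely illustrative, not any real WAL format):

/* Physiological (page-oriented) redo record, ARIES-style: it names the
 * page and the bytes to apply, so each object is recoverable on its own. */
typedef struct PhysioLogRecord {
    unsigned int   relfilenode;   /* which relation */
    unsigned int   blockno;       /* which page */
    unsigned short offset;        /* where in the page */
    unsigned short len;           /* how many payload bytes follow */
    /* ... payload bytes ... */
} PhysioLogRecord;

/* Logical record, as in Marty's proposal: replaying it requires a
 * working executor and consistent catalogs. */
typedef struct LogicalLogRecord {
    unsigned int stmt_len;
    /* ... SQL text, e.g. "UPDATE addr SET zip5 = ..." ... */
} LogicalLogRecord;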

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh





Re: [HACKERS] unsafe floats

2004-03-11 Thread Tom Lane
Dennis Bjorklund [EMAIL PROTECTED] writes:
 On Thu, 11 Mar 2004, Neil Conway wrote:
 So, what is the correct behavior: if you multiply two values and get a
 result that exceeds the range of a float8, should you get
 'Infinity'/'-Infinity', or an overflow error?

 That's the issue and I think we should allow infinity as a result of a 
 float operation (like you got in the example). It's part of IEEE 754 
 math, so not getting Infinity as a result would break that.

This would still be infinitely (ahem) better than the behavior you get
when an integer operation overflows.  We return whatever the hardware
gives in such cases.  Returning whatever the hardware gives for a float
overflow doesn't seem out of line, particularly if it's a well-defined
behavior.

I am somewhat concerned about the prospect of implementation-dependent
results --- machines that do not implement IEEE-style math are going to
return something other than an Infinity.  However, I think that
providing access to the IEEE semantics is more useful than a (rather
vain) attempt to impose uniformity across IEEE and non-IEEE machines.

regards, tom lane



Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Robert Treat
On Thursday 11 March 2004 14:17, Josh Berkus wrote:
 Tom,

  Maybe you should ask on -admin or -general.  Personally I thought there
  wasn't anything to say until someone did some experiments to show
  whether an indexed-column differential is really worthwhile and what a
  plausible default value would be.  The idea sounds good in the abstract,
  but will it really help or just be another useless setting?

 Yeah, that's our next step, a test.

 On Monday,  I hope to have comparative stats for a difficult database on
 the 3 solutions (leaving things as-is, raising the general default stats,
 and doing index_stats).

Do you plan on handling primary key columns differently (since they are 
likely to be unique and indexed)?

Also, how will you handle columns that are part of expressional indexes (where 
foo is true, for example)?

Final thought... I'm a DBA and I think the straight number is simpler, though 
I could be convinced to go with whichever is higher... 

Robert Treat
-- 
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL



Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Josh Berkus
Robert,

 Do you plan on handling primary key columns differently (since they are 
 likely to be unique and indexed)?

The same as any other indexed column.

 Also, how will you handle columns that are part of expressional indexes (where 
 foo is true, for example)?

See my original proposal.  These columns will be ignored.  Expressions have 
their own stats.

 Final thought... I'm a DBA and I think the straight number is simpler, though 
 I could be convinced to go with whichever is higher... 

Thanks.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [HACKERS] unsafe floats

2004-03-11 Thread Tom Lane
Neil Conway [EMAIL PROTECTED] writes:
 Okay, I think this would be a reasonable set of behavior:

 - define a new GUC var that controls how exceptional floating
   point values (NaN, inf, -inf) are handled.

 - by default, we still raise an error when a floating point
   operation results in NaN / inf / etc.; if the GUC var is toggled
   from its default value, no error is raised. This preserves
   backward compatibility with applications that expect floating
   point overflow to be reported, for example.

That sounds okay.  Also we might want to distinguish NaN from Infinity
--- I would expect most people to want zero-divide to continue to get
reported, for instance, even if they want to get Infinity for overflow.

 - that also means that, by default, we should reject 'NaN',
   'Infinity', and '-Infinity' as input to float4/float8: if these
   values are illegal as the result of FP operations by default, it
   seems only logical to disallow them as input to the FP types.

This I disagree with.  It would mean, for example, that you could not
dump and reload columns containing such data unless you remembered to
switch the variable first.  If you did this then I'd insist on pg_dump
scripts setting the variable to the permissive state.  In any case,
I don't see why a restriction that certain operations can't produce a
certain value should render the value illegal overall.

regards, tom lane



Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Tom Lane
Marty Scholes [EMAIL PROTECTED] writes:
 My experience with Oracle (and now limited experience with Pg) is that 
 the major choke point in performance is not the CPU or read I/O, it is 
 the log performance of big update and select statements.

If your load is primarily big update statements, maybe so...

 Essentially, the data is written twice: first to the log and then the 
 data files.  This would be ok except the transaction is regularly frozen 
 while the log files sync to disk with a bunch of tiny (8KB for Oracle 
 and Pg) write requests.

I don't think I buy that claim.  We don't normally fsync the log file
except at transaction commit (and read-only transactions don't generate
any commit record, so they don't cause an fsync).  If a single
transaction is generating lots of log data, it doesn't have to wait for
that data to hit disk before it can do more stuff.

But having said that --- on some platforms our default WAL sync method
is open_datasync, which could result in the sort of behavior you are
talking about.  Try experimenting with the other possible values of
wal_sync_method to see if you like them better.
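
For example (these are the methods PostgreSQL knows about; which ones are
actually available varies by platform):

# in postgresql.conf -- candidate values:
#   fsync, fdatasync, open_sync, open_datasync
wal_sync_method = fdatasync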

 If a transaction will do large updates or inserts, why don't we just log 
 the parsed statements in the WAL instead of the individual data blocks 
 that have changed?

As already pointed out, this would not give enough information to
reproduce the database state.

 Some informal testing suggests that we get a factor of 8 improvement in 
 speed here if we completely disable fsync() in large updates under Pg.

That probably gets you into a situation where no I/O is really happening
at all, it's just being absorbed by kernel disk buffers.  Unfortunately
that doesn't have a lot to do with the performance you can get if you
want to be sure you don't lose data ...

BTW, one thing you can do to reduce the WAL I/O volume in Postgres is
to increase the inter-checkpoint interval (there are two settings to
increase, one time-based and one volume-based).  The first write of a
given data page after a checkpoint dumps the whole page into WAL, as a
safety measure to deal with partial page writes during power failures.
So right after a checkpoint the WAL volume goes way up.  With a longer
interval between checkpoints you don't pay that price as often.
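
For example, in postgresql.conf (values purely illustrative):

checkpoint_timeout = 1800     # time-based setting, in seconds
checkpoint_segments = 32      # volume-based setting, in 16 MB WAL segments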

regards, tom lane



Re: [HACKERS] The Name Game: postgresql.net vs. pgfoundry.org

2004-03-11 Thread Jeroen T. Vermeulen
On Thu, Mar 11, 2004 at 03:14:10PM -0800, Josh Berkus wrote:
 
 So far, only 4 people, total, have expressed opinons on the matter.  I'm 
 throwing this on Hackers so that members of projects we will be hosting can 
 indicate whether they:
 
 A) Favor www.postgresql.net
 B) Favor www.pgfoundry.org
 C) Don't care as long as the porting is relatively painless.

I'm not crazy about the name pgfoundry, but otherwise I think it's the
better choice.  The www. problem could be circumvented by renaming the
project, perhaps, but I think it's best to keep a distinction between
Postgres, the database, and related projects.


Jeroen




Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Tom Lane
Josh Berkus [EMAIL PROTECTED] writes:
 Also, how will you handle columns that are part of expressional indexes (where
 foo is true, for example)?

 See my original proposal.  These columns will be ignored.  Expressions have 
 their own stats.

Yeah, I see no particular need to increase the stats allocation for a
column merely because it is used in an expression index.  The mechanism
Josh is proposing should cause the default amount of stats collected
*for the expression index* to go up, though.


BTW, there's an as-yet-undone bit of business associated with collecting
stats for expression indexes.  The ANALYZE code will honor an explicit
attstattarget setting for an expressional index column, but there's no
clean way to get that setting into the system.  What works in CVS tip is

regression=# create table foo (f1 int, f2 int);
CREATE TABLE
regression=# create index fooi on foo ((f1 + f2));
CREATE INDEX
regression=# alter table fooi alter column pg_expression_1 set statistics 100;
ALTER TABLE

but I don't much care for this; the arbitrary names that are used for
expressional columns shouldn't be embedded into SQL commands, and doing
alter table on an index looks a bit funny as well.

I want to make pg_dump dump these settings, but I'm not feeling
comfortable with having it dump commands that look like the above.
That would nail down the current method of assigning expression column
names as something we could never change without breaking dump scripts.

Can anyone think of a better way?

regards, tom lane



Re: [HACKERS] The Name Game: postgresql.net vs. pgfoundry.org

2004-03-11 Thread Tom Lane
Jeroen T. Vermeulen [EMAIL PROTECTED] writes:
 On Thu, Mar 11, 2004 at 03:14:10PM -0800, Josh Berkus wrote:
 A) Favor www.postgresql.net
 B) Favor www.pgfoundry.org
 C) Don't care as long as the porting is relatively painless.

 I'm not crazy about the name pgfoundry, but otherwise I think it's the
 better choice.  The www. problem could be circumvented by renaming the
 project, perhaps, but I think it's best to keep a distinction between
 Postgres, the database, and related projects.

Actually, proposal (A) does provide such a separation: notice that the
projects would go under *.postgresql.net, with the core database remaining
at *.postgresql.org.  I am not sure if that will provoke confusion or
not, but I think I like it better than pgfoundry because it is clear
that the domains are related.  pgfoundry seems a bit, um, random.

regards, tom lane



Re: [HACKERS] The Name Game: postgresql.net vs. pgfoundry.org

2004-03-11 Thread Jeroen T. Vermeulen
On Thu, Mar 11, 2004 at 07:01:47PM -0500, Tom Lane wrote:
 
 Actually, proposal (A) does provide such a separation: notice that the
 projects would go under *.postgresql.net, with the core database remaining
 at *.postgresql.org.  I am not sure if that will provoke confusion or
 not, but I think I like it better than pgfoundry because it is clear
 that the domains are related.  pgfoundry seems a bit, um, random.

Agree with the last bit, but I really feel that the difference between
postgresql.org and postgresql.net is too subtle--at least for people who
don't work with either very often.

Here's another idea: couldn't we have a subdomain for the projects, as in
project.forge.postgresql.org?  Or would that be too long?


Jeroen




Re: [HACKERS] The Name Game: postgresql.net vs. pgfoundry.org

2004-03-11 Thread Tom Lane
Jeroen T. Vermeulen [EMAIL PROTECTED] writes:
 Here's another idea: couldn't we have a subdomain for the projects, as in
 project.forge.postgresql.org?  Or would that be too long?

That would be okay with me ...

regards, tom lane



[HACKERS] Log rotation

2004-03-11 Thread Fernando Nasser
Hi,

Please remind me again why the postmaster cannot close and reopen the log 
file when it receives a SIGHUP (to re-read the configuration)?  This was 
discussed before, but I cannot remember whether and why it was not possible, 
or whether the arguments are still valid after -l was added.

If it were possible, it could even be done after re-reading the 
configuration, and even use the value of a GUC variable for the log file 
name, which would allow us to change the value of -l without the need to 
restart (actually, with a GUC variable set in postgresql.conf one could 
set the log without the -l, but would perhaps lose a few initial messages).
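
As a minimal C sketch of the idea (the Log_filename variable and the
handler wiring are hypothetical, not actual postmaster code):

#include <signal.h>
#include <stdio.h>

extern char *Log_filename;     /* imagined GUC replacing -l */

static volatile sig_atomic_t got_sighup = 0;

/* installed at startup with, e.g., signal(SIGHUP, sighup_handler) */
static void
sighup_handler(int signum)
{
    got_sighup = 1;            /* defer the real work to the main loop */
}

static void
maybe_reopen_log(void)
{
    if (got_sighup)
    {
        got_sighup = 0;
        /* ... re-read postgresql.conf here ... */
        if (Log_filename != NULL)
            freopen(Log_filename, "a", stderr);  /* close + reopen the log */
    }
}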

regards to all,
Fernando


Re: [HACKERS] The Name Game: postgresql.net vs. pgfoundry.org

2004-03-11 Thread Josh Berkus
Jeroen,

  Here's another idea: couldn't we have a subdomain for the projects, as in
  project.forge.postgresql.org?  Or would that be too long?

Hmmm ... wouldn't that be rather awkward with the projects with longer names?

http://orapgsqlviews.foundry.postgresql.org

That's 39 characters, not including the http ...

To speak up, I'd rather have either option (A) or (B) than this option.  

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [HACKERS] unsafe floats

2004-03-11 Thread Neil Conway
Tom Lane [EMAIL PROTECTED] writes:
 That sounds okay.  Also we might want to distinguish NaN from
 Infinity --- I would expect most people to want zero-divide to
 continue to get reported, for instance, even if they want to get
 Infinity for overflow.

Yeah, good point.

 This I disagree with.  It would mean, for example, that you could not
 dump and reload columns containing such data unless you remembered to
 switch the variable first.

Hmmm... on reflection, you're probably correct.

Since that removes the potential objection to the previous patch, I've
applied it to CVS.

-Neil




Re: [HACKERS] The Name Game: postgresql.net vs. pgfoundry.org

2004-03-11 Thread Andrew Dunstan


Tom Lane wrote:

 Jeroen T. Vermeulen [EMAIL PROTECTED] writes:
  On Thu, Mar 11, 2004 at 03:14:10PM -0800, Josh Berkus wrote:
   A) Favor www.postgresql.net
   B) Favor www.pgfoundry.org
   C) Don't care as long as the porting is relatively painless.

  I'm not crazy about the name pgfoundry, but otherwise I think it's the
  better choice.  The www. problem could be circumvented by renaming the
  project, perhaps, but I think it's best to keep a distinction between
  Postgres, the database, and related projects.

 Actually, proposal (A) does provide such a separation: notice that the
 projects would go under *.postgresql.net, with the core database remaining
 at *.postgresql.org.  I am not sure if that will provoke confusion or
 not, but I think I like it better than pgfoundry because it is clear
 that the domains are related.  pgfoundry seems a bit, um, random.

I'm really going to try hard to stay out of all the hoohaa that seems to 
be boiling ... I got involved arse-end foremost because I was stupid 
enough to tell Josh that he could call on me if he needed things done in 
Perl or Java, and he took sufficient license from that to inveigle me 
into a lot of other stuff, none of which looks remotely like Perl or 
Java. :-). I do enough webbish stuff by day and would far rather spend 
*my* time in a modest way making postgresql even better than it is.

There are 2 name issues - the base site and the project sites. If I am 
web surfing and I go to foo.net or www.foo.net I expect (other things 
being equal) to go to the main page for organization foo. Going to some 
other page for the foo organization is just a bit weird. Now I know 
there are exceptions, millions of them, but they always jar slightly. 
That's why we settled on pgfoundry.net.

There is no reason, however, that the individual projects could not live 
under both domains, i.e. projname.postgresql.net and 
projname.pgfoundry.net. This is very doable.

As for pgfoundry - the name isn't set in concrete. In fact it is 
entirely trivial to change. Suggest another that might be better.

cheers

andrew





Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Tatsuo Ishii
 Please find attached my first attempt at providing a localisation
 information on the client side for syntax errors in psql. Basically, one
 function is added which takes care of the burden. I put a lot of
 comments in the function to try make it clear what is going on.
 
 It validates for me against current cvs head.
 
 I've also added new regression tests in errors.sql.
 
 Although I have taken great and painful care to deal with multi-byte
 encodings, I'm not sure how I can validate that, as I don't have some
 japanese terminal to play with.

As far as I understand your code, it will be broken on many multi-byte
encodings.

1) a character is not always represented on a terminal proportionally to
   its storage size. For example, a kanji character in UTF-8 encoding
   has a storage size of 3 bytes while it occupies only twice the space
   of an ASCII character on a terminal. The same thing can probably be
   said of LATIN 2, 3 etc. in UTF-8.

2) it assumes all encodings are ASCII-compatible. Apparently some
   client-side-only encodings do not satisfy this requirement. Examples
   include SJIS and Big5.

I think 2) could be solved by careful coding. 

However 1) is hard to solve. We need a kind of mapping table between
storage size and terminal size for each encoding. Note that this
is not a new problem; sophisticated editors such as Emacs (mule) have
already solved it.
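
For illustration, issue 1) in terms of POSIX wcwidth() (a sketch only;
psql would need its own width tables for the client-side-only encodings,
and the caller must have done setlocale(LC_CTYPE, "") first):

#include <stdlib.h>
#include <wchar.h>

/* Return the number of terminal columns `s` occupies; for a UTF-8
 * kanji this is 2 even though strlen() reports 3 bytes. */
static int
display_width(const char *s)
{
    mbstate_t st = {0};
    wchar_t wc;
    size_t n;
    int width = 0;

    while ((n = mbrtowc(&wc, s, MB_CUR_MAX, &st)) > 0 &&
           n != (size_t) -1 && n != (size_t) -2)
    {
        int w = wcwidth(wc);      /* display columns for this character */
        width += (w > 0) ? w : 0;
        s += n;                   /* advance by storage bytes, not columns */
    }
    return width;
}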
--
Tatsuo Ishii



Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
I can see that and considered it.

The seed state would need to be saved, or any particular command that is 
not reproducible would need to be exempted from this sort of logging.

Again, this would apply only to situations where a small SQL command 
created huge changes.

Marty

Rod Taylor wrote:
If a transaction will do large updates or inserts, why don't we just log 
the parsed statements in the WAL instead of the individual data blocks 


UPDATE table SET col = random();






Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Rod Taylor
On Thu, 2004-03-11 at 21:04, Marty Scholes wrote:
 I can see that and considered it.
 
 The seed state would need to be saved, or any particular command that is 
 not reproducible would need to be exempted from this sort of logging.
 
 Again, this would apply only to situations where a small SQL command 
 created huge changes.

I would say that covers a majority of SQL queries in most designed systems
(think timestamps). Not to mention the fact that the statement itself may use
very expensive functions -- perhaps even user functions that don't repeatably
do the same thing or that depend on data from other tables.

Consider a successful write to table X for transaction 2, but an
unsuccessful write to table Y for transaction 1. Transaction 1 calls a
function that uses information from table X -- but it'll get different
information this time around.


Anyway, it really doesn't matter. You're trying to save a large amount
of time that simply isn't spent in this area in PostgreSQL. fsync()
happens once with commit -- and on a busy system, a single fsync call
may be sufficient for a number of parallel backends.





Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
 A major reason for this is that logical logs make recovery contingent
 on being able to execute the parsed statements. This execution
 might, however, not be possible if the system is itself not in a
 consistent state .. as is normally the case during recovery.

I am not sure I follow you here.  The logs should (IMHO) save both types 
of data: physical pages (what happens now), or the SQL statement if it 
is small and generates a bunch of changes.

If the DB state cannot be put back to a consistent state prior to a SQL 
statement in the log, then NO amount of logging will help.  The idea is 
that the state can be put back to what it was prior to a particular log 
entry, be it raw datafile blocks or a SQL statement.

 What if, for instance, it's the catalog tables that were hosed when
 the system went down ? It may be difficult to execute the parsed
 statements without the catalogs.

See above.  If this cannot be resolved prior to re-executing a statement 
in the log, then the problem is beyond ANY subsequent logging.

 Having said that, page-oriented undo logging can be a pain when B-tree
 pages split. For higher concurrency, ARIES uses logical undo
 logging. In this case, the logs are akin to your parsed statement
 idea.

Yes, my experience exactly.  Maybe we are the only company on the planet 
that experiences this sort of thing.  Maybe not.

 In any case, the only place that parsed statements are useful, imo are
 with searched updates that cause a large number of records to change
 and with insert into from select statements.

Yes.  Correlated UPDATE, INSERT INTO with subselects AND mass DELETE on 
heavily indexed tables.  Index creation...  The list goes on and on.  I 
experience and live it all on a daily basis with Oracle.  And I 
despise it.

The difference is, of course, I can't even have this kind of discussion 
with Oracle, but I can here.  ;-)

 Then, there is also the case that this, the parsed statements
 approach, is not a general solution. How would you handle the update
 current of cursor scenarios ? In this case, there is some application
 logic that determines the precise records that change and how they
 change.

 Ergo, it is my claim that while logical redo logging does have some
 benefits, it is not a viable general solution.

Agreed, this is not a general solution.  What it is, however, is a 
tremendous improvement over the current situation for transactions that 
do massive changes to heavily indexed datasets.

I am working on an application right now that will require current 
postal information on EVERY address in the U.S. -- street name, street 
address, directional, subunit, 5 digit zip, 3 digit zip, city, state, 
delivery point barcode, carrier route, latitude, longitude, etc.  Most 
of these fields will need to be indexed, because they will be searched 
in real time via a web application several thousand times per day.

To keep the addresses current, we will be updating them all (150+ million) 
on a programmed basis, so we will go through and update several 
million addresses EVERY DAY, while needing to ensure that the address 
updates happen atomically so that they don't disrupt web activity.

Maybe this is not a traditional RDBMS app, but I am not in the mood to 
write my own storage infrastructure for it.

Then again, maybe I don't know what I am talking about...

Marty

Sailesh Krishnamurthy wrote:
>>>>> "Marty" == Marty Scholes [EMAIL PROTECTED] writes:

    Marty> Why have I not seen this in any database?
    Marty> There must be a reason.
For ARIES-style systems, logging parsed statements (commonly called
logical logging) is not preferred compared to logging data items
(physical or physiological logging). 

A major reason for this is that logical logs make recovery contingent
on being able to execute the parsed statements. This execution
might, however, not be possible if the system is itself not in a
consistent state .. as is normally the case during recovery. 

What if, for instance, it's the catalog tables that were hosed when
the system went down ? It may be difficult to execute the parsed
statements without the catalogs. 

For this reason, a major goal of ARIES was to have each and every data
object (tables/indexes) individually recoverable. So ARIES follows
page-oriented redo logging.
Having said that, page-oriented undo logging can be a pain when B-tree
pages split. For higher concurrency, ARIES uses logical undo
logging. In this case, the logs are akin to your parsed statement
idea.
In any case, the only place that parsed statements are useful, imo are
with searched updates that cause a large number of records to change
and with insert into from select statements.
Then, there is also the case that this, the parsed statements
approach, is not a general solution. How would you handle the update
current of cursor scenarios ? In this case, there is some application
logic that determines the precise records that change and how they
change. 

Ergo, it is my claim that while logical redo logging does have some
benefits, it is not a viable general solution.

Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
 If your load is primarily big update statements, maybe so...

It is.  Maybe we are anomalous here.

 I don't think I buy that claim.  We don't normally fsync the log file
 except at transaction commit (and read-only transactions don't generate
 any commit record, so they don't cause an fsync).  If a single
 transaction is generating lots of log data, it doesn't have to wait for
 that data to hit disk before it can do more stuff.
I have glanced at the code, and I agree that reads do not generate 
fsync() calls.  Since I watched my mirrored RAID 5 arrays hit 2,000 iops 
with an average request of 4 KB on a recent batch update with Pg, I 
still think that Pg may be fsync()-ing a bit too often.

Since I haven't digested all of the code, I am speaking a bit out of turn.

 But having said that --- on some platforms our default WAL sync method
 is open_datasync, which could result in the sort of behavior you are
 talking about.  Try experimenting with the other possible values of
 wal_sync_method to see if you like them better.
I will have to check that.  I am running Sparc Solaris 8.
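
Presumably that means trying the other documented values in 
postgresql.conf, something along the lines of

wal_sync_method = fdatasync   # versus fsync, open_sync, open_datasync

and re-running the same batch job to compare.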

 That probably gets you into a situation where no I/O is really happening
 at all, it's just being absorbed by kernel disk buffers.
Few things would please me more.

 Unfortunately that doesn't have a lot to do with the performance you can
 get if you want to be sure you don't lose data ...
I am not sure these are as mutually exclusive as it looks here.


 BTW, one thing you can do to reduce the WAL I/O volume in Postgres is
 to increase the inter-checkpoint interval (there are two settings to
 increase, one time-based and one volume-based).  The first write of a
 given data page after a checkpoint dumps the whole page into WAL, as a
 safety measure to deal with partial page writes during power failures.
 So right after a checkpoint the WAL volume goes way up.  With a longer
 interval between checkpoints you don't pay that price as often.
I did that and it helped tremendously.  Without proper tuning, I just 
made the numbers pretty large:

shared_buffers = 10
sort_mem = 131072
vacuum_mem = 65536
wal_buffers = 8192
checkpoint_segments = 32
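
(I left the time-based setting alone; if I understand Tom right, that one 
is checkpoint_timeout, so something like

checkpoint_timeout = 900  # seconds between automatic checkpoints; default 300

would stretch the interval further.)
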
Thanks for your feedback.

Sincerely,
Marty
Tom Lane wrote:
Marty Scholes [EMAIL PROTECTED] writes:

My experience with Oracle (and now limited experience with Pg) is that 
the major choke point in performance is not the CPU or read I/O, it is 
the log performance of big update and select statements.


If your load is primarily big update statements, maybe so...


Essentially, the data is written twice: first to the log and then the 
data files.  This would be ok except the transaction is regularly frozen 
while the log files sync to disk with a bunch of tiny (8KB for Oracle 
and Pg) write requests.


I don't think I buy that claim.  We don't normally fsync the log file
except at transaction commit (and read-only transactions don't generate
any commit record, so they don't cause an fsync).  If a single
transaction is generating lots of log data, it doesn't have to wait for
that data to hit disk before it can do more stuff.
But having said that --- on some platforms our default WAL sync method
is open_datasync, which could result in the sort of behavior you are
talking about.  Try experimenting with the other possible values of
wal_sync_method to see if you like them better.

If a transaction will do large updates or inserts, why don't we just log 
the parsed statements in the WAL instead of the individual data blocks 
that have changed?


As already pointed out, this would not give enough information to
reproduce the database state.

Some informal testing suggests that we get a factor of 8 improvement in 
speed here if we completely disable fsync() in large updates under Pg.


That probably gets you into a situation where no I/O is really happening
at all, it's just being absorbed by kernel disk buffers.  Unfortunately
that doesn't have a lot to do with the performance you can get if you
want to be sure you don't lose data ...
BTW, one thing you can do to reduce the WAL I/O volume in Postgres is
to increase the inter-checkpoint interval (there are two settings to
increase, one time-based and one volume-based).  The first write of a
given data page after a checkpoint dumps the whole page into WAL, as a
safety measure to deal with partial page writes during power failures.
So right after a checkpoint the WAL volume goes way up.  With a longer
interval between checkpoints you don't pay that price as often.
			regards, tom lane



Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
 Anyway, it really doesn't matter. You're trying to save a large amount
 of time that simply isn't spent in this area in PostgreSQL. fsync()
 happens once with commit -- and on a busy system, a single fsync call
 may be sufficient for a number of parallel backends.
I think you may be right.  I suspect that most busy installations run 
a large number of light update/delete/insert statements.

In this scenario, the kind of logging I am talking about would make 
things worse, much worse.
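
(Rod's timestamp example below makes it concrete.  Replaying a logged 
statement like

UPDATE orders SET shipped_at = now() WHERE order_id = 42;  -- made-up table

during recovery would stamp a different time than the original execution 
did, so the replayed database would diverge from the one that crashed.)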

Marty

Rod Taylor wrote:
On Thu, 2004-03-11 at 21:04, Marty Scholes wrote:

I can see that and considered it.

The seed state would need to be saved, or any particular command that is 
not reproducible would need to be exempted from this sort of logging.

Again, this would apply only to situations where a small SQL command 
created huge changes.


I would say that would exempt a majority of SQL queries in most designed
systems (timestamps). Not to mention the fact that the statement itself
may use very expensive functions -- perhaps even user functions that
don't repeatably do the same thing or that depend on data from other
tables.
Consider a successful write to table X for transaction 2, but an
unsuccessful write to table Y for transaction 1. Transaction 1 calls a
function that uses information from table X -- but it'll get different
information this time around.
Anyway, it really doesn't matter. You're trying to save a large amount
of time that simply isn't spent in this area in PostgreSQL. fsync()
happens once with commit -- and on a busy system, a single fsync call
may be sufficient for a number of parallel backends.




Re: [HACKERS] The Name Game: postgresql.net vs. pgfoundry.org

2004-03-11 Thread Michael Glaesemann
On Mar 12, 2004, at 9:07 AM, Jeroen T. Vermeulen wrote:

On Thu, Mar 11, 2004 at 07:01:47PM -0500, Tom Lane wrote:
Actually, proposal (A) does provide such a separation: notice that the
projects would go under *.postgresql.net, with the core database remaining
at *.postgresql.org.  I am not sure if that will provoke confusion or
not, but I think I like it better than pgfoundry because it is clear
that the domains are related.  pgfoundry seems a bit, um, random.
Agree with the last bit, but I really feel that the difference between
postgresql.org and postgresql.net is too subtle--at least for people who
don't work with either very often.
Just to speak up (as an avid lurker), I agree with Jeroen that this 
distinction is quite subtle and may cause confusion. Some may even 
expect the two to resolve to the same site, as a lot of popular sites 
own .com/.net/.org, all resolving to the same site.

This can be read as support for *.pgfoundry.org, *.postgresql.org, or 
*.pgfoundry.postgresql.org.

*.pgfoundry.org is short and clearly distinguished from postgresql.org

*.postgresql.org is short, and clearly associated with postgresql.org 
(of course!), but there's no clear distinction that the former gborg 
projects are separate from, say, developer.postgresql.org or 
techdocs.postgresql.org. Is this distinction important? Maybe not?

*.pgfoundry.postgresql.org is longer, clearly associated with 
postgresql.org, and clear that it's a distinct part of postgresql.org

Michael Glaesemann
grzm myrealbox com


Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Sailesh Krishnamurthy

(Just a note: my comments are not pg-specific .. indeed I don't know
much about pg recovery).

 Marty == Marty Scholes [EMAIL PROTECTED] writes:

Marty If the DB state cannot be put back to a consistent state
Marty prior to a SQL statement in the log, then NO amount of
Marty logging will help.  The idea is that the state can be put
Marty back to what it was prior to a particular log entry, be it
Marty raw datafile blocks or a SQL statement.

The point is that with redo logging, you can just blindly apply the
log to the data pages in question, without even really restarting the
database.

Note that in ARIES, recovery follows: (1) analysis, (2) redo
_everything_ since last checkpoint, (3) undo losers. 

So logging carefully will indeed help get the system to a consistent
state - actually after phase (2) above the system will be in precisely
the state during the crash .. and all that's left to do is undo all
the live transactions (losers). 

BTW, logging raw datafile blocks would be pretty gross (physical
logging) and so ARIES logs the changes to each tuple in logical
fashion .. so if only one column changes only that value (before and
after images) are logged. This is what's called physiological
logging.
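
(Schematically, an UPDATE that changes one column of one tuple logs 
roughly <page id, slot, column, before-image, after-image> -- the notation 
is just illustrative -- rather than a full page image.)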

Marty See above.  If this cannot be resolved prior to
Marty re-executing a statement in the log, then the problem is
Marty beyond ANY subsequent logging.

Not true ! By applying the log entries carefully you should be able to
restore the system to a consistent state. 

 Having said that, page-oriented undo logging can be a pain when
 B-tree pages split. For higher concurrency, ARIES uses logical
 undo logging. In this case, the logs are akin to your parsed
 statement idea.
 

Marty Yes, my experience exactly.  Maybe we are the only company
Marty on the planet that experiences this sort of thing.  Maybe

Well, logical undo is still at a much lower level than parsed
statements. Each logical undo log is something like delete key 5 from
index xyz. 

Marty Maybe this is not a traditional RDBMS app, but I am not
Marty in the mood to write my own storage infrastructure for it.

I agree that your app has a lot of updates .. it's just that I'm not
convinced that logical logging is a clean solution. 

I also don't have a solution for your problem :-)

-- 
Pip-pip
Sailesh
http://www.cs.berkeley.edu/~sailesh





Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Marty Scholes
 The point is that with redo logging, you can just blindly apply the
 log to the data pages in question, without even really restarting the
 database.
I also am not a recovery expert, but I have watched it happen more than 
once.

You bring up a good point.  My (perhaps false) understanding of 
recovery/redo was that the last checkpointed state was recreated, then 
log changes that finished with a commit were applied and the rest 
discarded.

Actually, this approach would not be ideal, since the last checkpointed 
state would be unavailable if any data file writes took place between 
the checkpoint and the crash.

Further, post-checkpoint log entries would surely contain multiple 
copies of the same data block, and there is no point in applying any but 
the last log entry for a particular block.

Ok.  To recap:

* Logging parsed statements doesn't apply to light updates, and most 
installations live mostly on light updates
* Logging parsed statements would not work if the checkpoint state could 
not be recreated, which is likely the case.

So, this solution won't work, and even if it did, it would not apply to 
the vast majority of users.

Hmmm...  No wonder it hasn't been implemented yet.  ;-)

Thanks again,
Marty
Sailesh Krishnamurthy wrote:
(Just a note: my comments are not pg-specific .. indeed I don't know
much about pg recovery).

Marty == Marty Scholes [EMAIL PROTECTED] writes:

Marty If the DB state cannot be put back to a consistent state
Marty prior to a SQL statement in the log, then NO amount of
Marty logging will help.  The idea is that the state can be put
Marty back to what it was prior to a particular log entry, be it
Marty raw datafile blocks or a SQL statement.
The point is that with redo logging, you can just blindly apply the
log to the data pages in question, without even really restarting the
database.
Note that in ARIES, recovery follows: (1) analysis, (2) redo
_everything_ since last checkpoint, (3) undo losers. 

So logging carefully will indeed help get the system to a consistent
state - actually after phase (2) above the system will be in precisely
the state during the crash .. and all that's left to do is undo all
the live transactions (losers). 

BTW, logging raw datafile blocks would be pretty gross (physical
logging) and so ARIES logs the changes to each tuple in logical
fashion .. so if only one column changes only that value (before and
after images) are logged. This is what's called physiological
logging.
Marty See above.  If this cannot be resolved prior to
Marty re-executing a statement in the log, then the problem is
Marty beyond ANY subsequent logging.
Not true ! By applying the log entries carefully you should be able to
restore the system to a consistent state. 

 Having said that, page-oriented undo logging can be a pain when
 B-tree pages split. For higher concurrency, ARIES uses logical
 undo logging. In this case, the logs are akin to your parsed
 statement idea.
 

Marty Yes, my experience exactly.  Maybe we are the only company
Marty on the planet that experiences this sort of thing.  Maybe
Well, logical undo is still at a much lower level than parsed
statements. Each logical undo log is something like delete key 5 from
index xyz. 

Marty Maybe this is not a traditional RDBMS app, but I am not
Marty in the mood to write my own storage infrastructure for it.
I agree that your app has a lot of updates .. it's just that I'm not
convinced that logical logging is a clean solution. 

I also don't have a solution for your problem :-)





Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Tom Lane
Bruce Momjian [EMAIL PROTECTED] writes:
 Another idea is whether a foreign key column should get extra
 statistics?

In practice, both ends of an FK relationship have to be indexed,
so I don't see that we need any extra special case for that.

regards, tom lane



Re: [pgsql-www] [HACKERS] The Name Game: postgresql.net vs.

2004-03-11 Thread Marc G. Fournier
On Thu, 11 Mar 2004, Tom Lane wrote:

 Jeroen T. Vermeulen [EMAIL PROTECTED] writes:
  Here's another idea: couldn't we have a subdomain for the projects, as in
  project.forge.postgresql.org?  Or would that be too long?

 That would be okay with me ...

I'd go for too long myself ...



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664



Re: [HACKERS] Performance and WAL on big inserts/updates

2004-03-11 Thread Tom Lane
Sailesh Krishnamurthy [EMAIL PROTECTED] writes:
 (Just a note: my comments are not pg-specific .. indeed I don't know
 much about pg recovery).
 ...
 BTW, logging raw datafile blocks would be pretty gross (physical
 logging) and so ARIES logs the changes to each tuple in logical
 fashion .. so if only one column changes only that value (before and
 after images) are logged. This is what's called physiological
 logging.

What PG currently uses is sort of a hybrid approach.  The basic WAL
entry normally gives tuple-level detail: a tuple image and location
for an INSERT, a new tuple image and old and new locations for UPDATE,
a tuple location for DELETE, etc.  This might be a bit bulkier than
necessary for UPDATEs that change few columns, but it's consistent and
easy to replay.

However, as I mentioned to Marty, the very first change to any disk page
after a checkpoint will dump an entire (post-change) image of that page
into the log, in place of the tuple image or other detail that would
allow us to redo the change to that page.  The purpose of this is to
ensure that if a page is only partially written during a power failure,
we have the ability to recreate the entire correct page contents from
WAL by replaying everything since the last checkpoint.  Replay is thus
a write-only operation as far as the data files are concerned --- we
only write full pages or modify previously written pages.

(You may be thinking well, what about a partial write to WAL --- but
if that happens, that's where we stop replaying WAL.  The checksums on
WAL entries allow us to detect the invalid data before we use it.
So we will still have a consistent valid state on disk, showing the
effects of everything up through the last undamaged WAL record.)

BTW this is a one-directional log.  We do not care about UNDO, only
replay.  There is nothing that need be undone from a failed transaction;
it can only have littered the database with dead tuples, which will
eventually be reclaimed by VACUUM.

 Having said that, page-oriented undo logging can be a pain when
 B-tree pages split.

We don't bother to undo index page splits either ...

regards, tom lane



Re: [pgsql-www] [HACKERS] The Name Game: postgresql.net vs.

2004-03-11 Thread Marc G. Fournier
On Thu, 11 Mar 2004, Gavin M. Roy wrote:

 I think having a pgfoundry.postgresql.net/org is good, but it should
 have its own identity, pgfoundry.org for the main url gets my vote for
 what it's worth.

I like the shortness myself ...

IMHO, the domain name isn't the make/break of whether going to GForge will
succeed ... the success will be a matter of marketing it, and making sure
that its projects are well known ... personally, focusing on the domain is
like focusing on the name of a car when you buy it, not on its features
and/or price ...



Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664



Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Bruce Momjian
Tom Lane wrote:
 Bruce Momjian [EMAIL PROTECTED] writes:
  Another idea is whether a foreign key column should get extra
  statistics?
 
 In practice, both ends of an FK relationship have to be indexed,
 so I don't see that we need any extra special case for that.

Do they?  We don't create an index automatically when using REFERENCES.
We do create an index for PRIMARY KEY.

I was just wondering if the REFERENCES column is more sensitive to join
usage and would benefit from more accurate statistics even if it doesn't
have an index.
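
For example (just a sketch, not tested):

CREATE TABLE parent (id integer PRIMARY KEY);              -- index created
CREATE TABLE child (parent_id integer REFERENCES parent);  -- no index

-- the per-column knob we already have, if we decide such columns
-- deserve a larger statistics sample:
ALTER TABLE child ALTER COLUMN parent_id SET STATISTICS 100;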

-- 
  Bruce Momjian|  http://candle.pha.pa.us
  [EMAIL PROTECTED]   |  (610) 359-1001
  +  If your life is a hard drive, |  13 Roberts Road
  +  Christ can be your backup.|  Newtown Square, Pennsylvania 19073



Re: [HACKERS] Default Stats Revisited

2004-03-11 Thread Josh Berkus
Bruce,

 Do they?  We don't create an index automatically when using REFERENCES.
 We do create an index for PRIMARY KEY.
 
 I was just wondering if the REFERENCES column is more sensitive to join
 usage and would benefit from more accurate statistics even if it doesn't
 have an index.

I don't think so.   If one does not create an index on an FK column, this is 
usually because the column does not have enough of a range of values to 
deserve indexing (for example, 4 potential values across 100,000 records).  
In that case, we would not want to up the statistics either.

And we haven't made it a practice to hand-hold for database designers who 
don't know when to index, so I don't think we should start now.

-- 
-Josh Berkus
 Aglio Database Solutions
 San Francisco




Re: [pgsql-www] [HACKERS] The Name Game: postgresql.net vs.

2004-03-11 Thread Joshua D. Drake
IMHO, the domain name isn't the make/break of whether going to GForge will
succeed ... the success will be a matter of marketing it, and making sure
that its projects are well known ... personally, focusing on the domain is
like focusing on the name of a car when you buy it, not on its features
and/or price ...


Really? What about BMW, Volvo or Mercedes?

Sincerely,

Joshua D. Drake






Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email: [EMAIL PROTECTED]   Yahoo!: yscrappy  ICQ: 7615664




Re: [HACKERS] client side syntax error localisation for psql (v1)

2004-03-11 Thread Fabien COELHO

Dear Tom,

  Well, I want to preserve space for the query extract, that's where the
  context information is! If I put more information there, I'll have to
  reduce the extract length.

 But in the form you are using, you have to reserve that space anyway.
 Consider an error marker at the end of the line:

   QUERY: select 'text that is long enough to be a problem':
   QUERY:  ^ on line 1

 Having to leave space for "on line 1" at the right end is not better
 than putting "line 1:" at the left end.

Yes, I partly agree. However the space above can be used nevertheless for
the extract, so it is not totally lost:

QUERY: select 'text that is long enough to be a problem': from foo ...
QUERY:  ^ on line 1

  Moreover the cursor line information content is small, so it seems better
  to put the line information there.

 It seems confusing to me.  The cursor is just a marker.

Sure.

 Consider also the next step up from this scheme: instead of assuming a
 dumb terminal, suppose you can use color or underlining or reverse-video
 or some such. Then you might want to do just one line of output:

 QUERY: SELECT foo FROM bar WHRE baz;

 where my underlining represents using the terminal's highlight ability.
 With such a scheme, putting "LINE n" on the same line fits naturally.

Sure, but the current status of psql is to have no such thing, and
I'm not going to start terminal-dependent features in the software;-)
I just want to help my students with syntax errors.

As a compromise, I can suggest the following:

LINE 4: WHERE id=123 AND name LIKE 'calvin' GROP BY name...
^

So "QUERY:" is dropped out. If there are very few lines, I can just
put "LINE:". I'll have to deal with the alignment problem, no big deal,
but not very nice either.

  There is also an alignment issue here, as depending on the number of
  figures, the cursor line would have to be fixed accordingly.

 Sure, but you have to buy into that anyway if you are going to do 

Well, I know that "..." is 3 chars long, so I can pad the cursor line by 3
in that case. If it is a line number, the length depends on its value.

 It's not going to add more than about one strlen() call to what you need
 to do.

I cannot strlen an integer value;-) I have to convert to a string, or
deal directly with the line number.

With my current implementation, the line number is only shown if the
query is more than 3 lines.

-- 
Fabien Coelho - [EMAIL PROTECTED]
