Re: [PERFORM] [ADMIN] Raw devices vs. Filesystems

2004-04-07 Thread Grega Bremec
...and on Wed, Apr 07, 2004 at 09:09:16AM -0700, Josh Berkus used the keyboard:
> 
> Does it work, though?   Without Oracle admin tools?

Hello, Josh. :)

Well, as I said, that's why I was asking - I'm willing to give it a go
if nobody can prove me wrong. :)

> > Now, if both goals can be achieved in one go, hell, I'm willing to try
> > it out myself in an attempt to extract from it some performance
> > indicators that could be compared to other database performance tests
> > sent to both this and the PERFORM mailing list.
> 
> Hey, any test you wanna run is fine with us.  I'm pretty sure that OCFS 
> belongs to Oracle, though, patent & copyright, so we couldn't actually use it 
> in practice.

I thought you knew - OCFS, OCFS-Tools and OCFSv2 have not only been open
source for quite a while now, they're actually released under the GPL.

http://oss.oracle.com/projects/ocfs/
http://oss.oracle.com/projects/ocfs-tools/
http://oss.oracle.com/projects/ocfs2/

I don't know what that means to you (probably nothing good, as PostgreSQL
is released under the BSD license), but it most definitely can be considered
a good thing for the end user, as she can download it, compile, and set it
up on her disks, without the need to pay Oracle royalties. :)

> If your intention in this test is to show the superiority of raw devices, let 
> me give you a reality check: barring some major corporate backing getting 
> involved, we can't possibly implement our own PG-FS for database support.  We 
> already have a TODO list which is far too long for our developer pool, and 
> implementing a custom FS either takes a large team (OCFS) or several years of 
> development (Reiser).   

Not really - I was just thinking about something not-entirely-a-filesystem
and POK!, OCFS sprang to mind. It omits many POSIX features that slow down
a traditional filesystem, yet it does know the concept of inodes and most
of all, it's _really_ heavy on caching. As such, it sounded quite promising
to me, but trial, I think, is the best test.

The question Steve raised in another post does spring up, though - just
for the record, which POSIX semantics can a postmaster live without in a
filesystem?

Cheers,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/




Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > Tom Lane wrote:
> >> But it should be okay to set the filesystem to journal only its own
> >> metadata.  There's no need for it to journal file contents.
> 
> > Can you set ext2 to journal metadata?  I didn't know it could do that.
> 
> No, ext2 has no journal at all AFAIK.  But I believe ext3 has an option
> to journal or not journal file contents, and at least on a Postgres-only
> volume you'd want to turn that off.

Right, ext3 has that option.  I don't think XFS needs it (it does
meta-data only by default).

-- 
  Bruce Momjian                     |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                 |  (610) 359-1001
  +  If your life is a hard drive,  |  13 Roberts Road
  +  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073



Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> Tom Lane wrote:
>> But it should be okay to set the filesystem to journal only its own
>> metadata.  There's no need for it to journal file contents.

> Can you set ext2 to journal metadata?  I didn't know it could do that.

No, ext2 has no journal at all AFAIK.  But I believe ext3 has an option
to journal or not journal file contents, and at least on a Postgres-only
volume you'd want to turn that off.

regards, tom lane



Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Bruce Momjian
Tom Lane wrote:
> Bruce Momjian <[EMAIL PROTECTED]> writes:
> > scott.marlowe wrote:
> >>> There is no real need (or benefit) from having the database on a
> >>> journalled filesystem - the journalling is only trying to give similar
> >>> sorts of guarantees to what the fsync in PostgreSQL is doing.
> >> 
> >> Is this true?  I was under the impression that without at least meta-data 
> >> journaling postgresql could still be corrupted by power failure.
> 
> > It is false.  ext2 isn't crash-safe, and PostgreSQL needs an intact file
> > system for WAL recovery.
> 
> But it should be okay to set the filesystem to journal only its own
> metadata.  There's no need for it to journal file contents.

Can you set ext2 to journal metadata?  I didn't know it could do that.

-- 
  Bruce Momjian                     |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                 |  (610) 359-1001
  +  If your life is a hard drive,  |  13 Roberts Road
  +  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073



Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Tom Lane
Bruce Momjian <[EMAIL PROTECTED]> writes:
> scott.marlowe wrote:
>>> There is no real need (or benefit) from having the database on a
>>> journalled filesystem - the journalling is only trying to give similar
>>> sorts of guarantees to what the fsync in PostgreSQL is doing.
>> 
>> Is this true?  I was under the impression that without at least meta-data 
>> journaling postgresql could still be corrupted by power failure.

> It is false.  ext2 isn't crash-safe, and PostgreSQL needs an intact file
> system for WAL recovery.

But it should be okay to set the filesystem to journal only its own
metadata.  There's no need for it to journal file contents.

regards, tom lane



Re: [PERFORM] statistics

2004-04-07 Thread Paul Thomas
On 07/04/2004 22:05 Jaime Casanova wrote:
> What are the statistics? Where can I view them? Where can I find info
> about their fields, and why are they valuable for performance?
>
> thanx in advance,
>
> Jaime Casanova


OK. An idiot's guide to statistics by a full-time idiot...

Let's start with a simple premise. I'm an RDBMS (forget that I'm actually 
an idiot for a moment...) and I've been asked for

select * from foo where bar = 7;

How do I go about fulfilling the request in the most efficient manner? 
(i.e., ASAP!)

One way might be to read through the whole table and return only those 
rows which match the where criterion - a sequential scan on the table.

But wait a minute, there is an index on column bar. Could I use this 
instead? Well, of course I could use it, but I have to keep sight of the 
goal of returning the data ASAP, and I know that the act of reading 
index/reading table/... will have a performance penalty due to a lot more 
head movement on the disk. So how do I choose between a sequential 
scan and an index scan? Let's look at a couple of extreme scenarios:

1) let's look at the case where all or virtually all of the bar 
columns are populated with the value 7. In this case it would be more 
efficient to read sequentially through the table.

2) the opposite of (1) - very few of the bar columns have the value 7. In 
this case using the index could be a winner.

So generalising, I need to be able to estimate whether doing a sequential 
scan is more efficient than an index scan, and this comes down to 2 factors:

a) the cost of moving the disk heads all over the place (random page cost)
b) the spread of values in the selecting column(s)
(a) is specified in postgresql.conf (see the archives for much discussion 
about what the value should be..)
(b) is determined by the dastardly trick of actually sampling the data in 
the table!!! That's what analyze does. It samples your table(s) and uses 
the result to feed into its decision about when to flip between 
sequential and index scans.
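
If a worked example helps, here's a minimal sketch you can paste into psql 
(the table and column names are made up, not from any real schema):

-- a throwaway table with an index on bar
CREATE TABLE foo (id serial PRIMARY KEY, bar integer);
CREATE INDEX foo_bar_idx ON foo (bar);
-- ...load some data...
ANALYZE foo;                       -- (b): sample the table and store the stats

-- what the planner now "knows" about column bar
SELECT null_frac, n_distinct, most_common_vals, most_common_freqs
FROM pg_stats
WHERE tablename = 'foo' AND attname = 'bar';

SHOW random_page_cost;             -- (a): the head-movement cost knob

EXPLAIN SELECT * FROM foo WHERE bar = 7;   -- which scan does it pick?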

Hope this makes some kind of sense...

--
Paul Thomas
+------------------------------+--------------------------------------------+
| Thomas Micro Systems Limited | Software Solutions for Business            |
| Computer Consultants         | http://www.thomas-micro-systems-ltd.co.uk  |
+------------------------------+--------------------------------------------+



Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Bruce Momjian
scott.marlowe wrote:
> > One thing I recommend is to use ext2 (or almost anything but ext3). 
> > There is no real need (or benefit) from having the database on a
> > journalled filesystem - the journalling is only trying to give similar
> > sorts of guarantees to what the fsync in PostgreSQL is doing.
> 
> Is this true?  I was under the impression that without at least meta-data 
> journaling postgresql could still be corrupted by power failure.

It is false.  ext2 isn't crash-safe, and PostgreSQL needs an intact file
system for WAL recovery.

-- 
  Bruce Momjian                     |  http://candle.pha.pa.us
  [EMAIL PROTECTED]                 |  (610) 359-1001
  +  If your life is a hard drive,  |  13 Roberts Road
  +  Christ can be your backup.     |  Newtown Square, Pennsylvania 19073



[PERFORM] statistics

2004-04-07 Thread Jaime Casanova
What are the statistics? Where can I view them? Where can I find info about 
their fields, and why are they valuable for performance?

thanx in advance,

Jaime Casanova




Re: [PERFORM] Postgresql educational sources

2004-04-07 Thread scott.marlowe
On Wed, 7 Apr 2004, Nid wrote:

> Rather than ask some general, unanswerable question on how to tune my 
> database...I thought I'd ask where I might find an introduction to...or 
> manual/tutorial for the depths of managing a postgres db.  Books?  
> Websites?  Assume a basic to intermediate knowledge of DBs in general 
> with a desire to learn about postgres from the ground up.  If it makes a 
> difference I'm using a postgres db in a Red Hat Linux OS environment.  
> Thanks!

The online (administration) docs are quite good, and for tuning, look at 
the excellent tuning document on varlena:

http://www.varlena.com/varlena/GeneralBits/Tidbits/perf.html






Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread scott.marlowe
On Wed, 7 Apr 2004, Andrew McMillan wrote:

> On Wed, 2004-04-07 at 20:56, huang yaqin wrote:
> > Hello, Richard Huxton,
> > 
> > You said turning fsync off may cause data loss - that's terrible.
> > I use a SCSI disk, and the file system is ext3. I've tuned postgresql.conf and 
> > can't get any improvement. So what can I do?
> > Is there a difference between SCSI and IDE disks?
> 
> Yes, turning off fsync means that the database is not guaranteeing
> consistency of writes to disk any longer.  On the other hand your IDE
> system probably never was, because IDE drives just typically turn on
> write caching in hardware without telling anyone.
> 
> SCSI typically doesn't turn on write caching in the physical drive by
> default, as Tom Lane pointed out earlier.  Good SCSI has a battery
> backed up cache, and then it is OK to turn on write caching, because the
> controller has enough battery to complete all writes in the event of a
> power failure.

Actually, almost all SCSI drives turn on write caching by default; they 
just don't lie about fsync, so you still have a one-update-per-revolution 
limit, but other things can be happening while that write is being 
committed, due to the multi-threaded nature of both the SCSI interface and 
the kernel drivers associated with them.

It would appear the linux kernel hackers are trying to implement the 
multi-threaded features of the latest ATA spec, so that, at some future 
date, you could have IDE drives that cache AND tell the truth about their 
sync AND can do more than one thing at a time.

> One thing I recommend is to use ext2 (or almost anything but ext3). 
> There is no real need (or benefit) from having the database on a
> journalled filesystem - the journalling is only trying to give similar
> sorts of guarantees to what the fsync in PostgreSQL is doing.

Is this true?  I was under the impression that without at least meta-data 
journaling postgresql could still be corrupted by power failure.

> The suggestion someone else made regarding use of software raid is
> probably also a good one if you are trying to use the on-board RAID at
> the moment.

Some onboard RAID controllers are fairly good (Dell's 2600 series have an 
Adaptec on board that can have a battery-backed cache and is OK; the LSI 
MegaRAID-based one is faster under Linux, though).  But some of them are 
pretty poor performers.

> Finally, I would say that because you are seeing poor performance on one
> box and great performance on another, you should look at the hardware,
> or at the hardware drivers, for the problem - not so much at PostgreSQL.

More than likely, the biggest issue is that the SCSI drives are performing 
proper fsync, while the IDE drives are lying.  Definitely a time to look 
at a good caching RAID controller.




[PERFORM] Postgresql educational sources

2004-04-07 Thread Nid



Rather than ask some general, unanswerable question on how to tune my 
database...I thought I'd ask where I might find an introduction to...or 
manual/tutorial for the depths of managing a postgres db.  Books?  
Websites?  Assume a basic to intermediate knowledge of DBs in general 
with a desire to learn about postgres from the ground up.  If it makes a 
difference I'm using a postgres db in a Red Hat Linux OS environment.  
Thanks!

nid


Re: [PERFORM] PostgreSQL and Linux 2.6 kernel.

2004-04-07 Thread Neil Conway
On Mon, 2004-04-05 at 11:36, Josh Berkus wrote:
> Unfortunately, these days only Tom and Neil seem to be seriously working on 
> the query planner (beg pardon in advance if I've missed someone)

Actually, Tom is the only person actively working on the planner --
while I hope to contribute to it in the future, I haven't done so yet.

-Neil





Re: [PERFORM] [ADMIN] Raw devices vs. Filesystems

2004-04-07 Thread Steve Atkins
On Wed, Apr 07, 2004 at 09:09:16AM -0700, Josh Berkus wrote:

> If your intention in this test is to show the superiority of raw devices, let 
> me give you a reality check: barring some major corporate backing getting 
> involved, we can't possibly implement our own PG-FS for database support.  We 
> already have a TODO list which is far too long for our developer pool, and 
> implementing a custom FS either takes a large team (OCFS) or several years of 
> development (Reiser).   

Is there any documentation as to what guarantees PostgreSQL requires
from the filesystem, or what posix semantics can be relaxed?

Cheers,
  Steve



Re: [PERFORM] [ADMIN] Raw devices vs. Filesystems

2004-04-07 Thread Josh Berkus
Grega,

> Furthermore, this filesystem would be a blazing one stop solution for
> all replication issues PostgreSQL currently suffers from, as its main
> design goal was to present "a consistent file system image across the
> servers in a cluster".

Does it work, though?   Without Oracle admin tools?

> Now, if both goals can be achieved in one go, hell, I'm willing to try
> it out myself in an attempt to extract from it some performance
> indicators that could be compared to other database performance tests
> sent to both this and the PERFORM mailing list.

Hey, any test you wanna run is fine with us.  I'm pretty sure that OCFS 
belongs to Oracle, though, patent & copyright, so we couldn't actually use it 
in practice.

If your intention in this test is to show the superiority of raw devices, let 
me give you a reality check: barring some major corporate backing getting 
involved, we can't possibly implement our own PG-FS for database support.  We 
already have a TODO list which is far too long for our developer pool, and 
implementing a custom FS either takes a large team (OCFS) or several years of 
development (Reiser).   

Now, if you know somebody who might pay for one, then great 

-- 
Josh Berkus
Aglio Database Solutions
San Francisco



Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Aaron Werman
With fsync on, I'm thinking 50 inserts, if autocommitting, is 50 TPS = ~100 IOs per
second (50 WAL + checkpoint IO) = roughly the I/O rate of a single drive.

Huang - Are you using a single drive for pg? If so, there is a safety
problem of both the data and logs used for recovery on the same drive. If
the drive crashes, there is nothing left for recovery.

Also, there is a big contention issue, since the log is a fast sequential
write, and checkpointing is random. If the log is on a separate drive,
you'll probably see insert speed at disk sequential write speed, since the
other drive(s) should hopefully be able to keep up when checkpointing. If
they share the same drive, you'll see an initial burst of inserts, then an
order of magnitude performance drop-off as soon as you checkpoint - because
the disk is interleaving the log and data writes.

fsync off is only appropriate for externally recoverable processes, such as
loading an empty server from a file.
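
By "externally recoverable" I mean something like a bulk COPY of a flat file,
which can simply be re-run from the source if the server dies part way through.
A minimal sketch, with made-up table and file names:

BEGIN;
COPY orders FROM '/tmp/orders.dat';   -- server-side load of the dump file
COMMIT;
ANALYZE orders;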

/Aaron

- Original Message - 
From: "Richard Huxton" <[EMAIL PROTECTED]>
To: "huang yaqin" <[EMAIL PROTECTED]>; "Tom Lane" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, April 07, 2004 4:33 AM
Subject: Re: [PERFORM] good pc but bad performance,why?


On Wednesday 07 April 2004 05:00, huang yaqin wrote:
> hello!
>
> Thanks, you are right.
> I use "postmaster -o "-F" " to start my PG,and performance improved
> greatly.

I don't think Tom was recommending turning fsync off. If you have a system
crash/power glitch then the database can become corrupted.

If you are happy with the possibility of losing your data, write performance will
improve noticeably.

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] plan problem

2004-04-07 Thread Richard Huxton
On Wednesday 07 April 2004 10:03, Ken Geis wrote:
> Richard Huxton wrote:
> > On Tuesday 06 April 2004 21:25, Ken Geis wrote:
> >>I am trying to find an efficient way to draw a random sample from a
> >>complex query.  I also want it to be easy to use within my application.
> >>
> >>So I've defined a view that encapsulates the query.  The id in the
> >>"driving" table is exposed, and I run a query like:
> >>
> >>select * from stats_record_view
> >>  where id in (select id from driver_stats
> >>order by random()
> >>limit 3);
> >
> > How about a join?
> >
> > SELECT s.*
> > FROM
> > stats_record_view s
> > JOIN
> > (SELECT id FROM driver_stats ORDER BY random() LIMIT 3) AS r
> > ON s.id = r.id;
>
> Yes, I tried this too after I sent the first mail, and this was somewhat
> better.  I ended up adding a random column to the driving table, putting
> an index on it, and exposing that column in the view.  Now I can say
>
> SELECT * FROM stats_record_view WHERE random < 0.093;
>
> For my application, it's OK if the same sample is picked time after time
> and it may change if data is added.

Fair enough - that'll certainly do it.

> > Also worth checking the various list archives - this has come up in the
> > past, but some time ago.
>
> There are some messages in the archives about how to get a random
> sample.  I know how to do that, and that's not why I posted my message.
>   Are you saying that the planner behavior I spoke of is in the
> archives?  I wouldn't know what to search on to find that thread.  Does
> anyone think that the planner issue has merit to address?  Can someone
> help me figure out what code I would look at?

I was assuming that after getting a random subset they'd see the same problem 
you are. If not, it's probably worth looking at - in which case, an EXPLAIN 
ANALYZE of your original query would be good.
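
That is, something along these lines, using the names from your first mail:

EXPLAIN ANALYZE
SELECT * FROM stats_record_view
 WHERE id IN (SELECT id FROM driver_stats ORDER BY random() LIMIT 3);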

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Shridhar Daithankar
Sending again because of an MUA error... chose a wrong address in From. :-(

 Shridhar

On Wednesday 07 April 2004 17:21, Shridhar Daithankar wrote:
> On Wednesday 07 April 2004 16:59, Andrew McMillan wrote:
> > One thing I recommend is to use ext2 (or almost anything but ext3).
> > There is no real need (or benefit) from having the database on a
> > journalled filesystem - the journalling is only trying to give similar
> > sorts of guarantees to what the fsync in PostgreSQL is doing.
>
> That is not a correct assumption. A journalling file system ensures file
> system consistency even at the cost of losing some data. And postgresql can
> not guarantee recovery if the WAL logs are corrupt. Some months back, there was
> a case reported where ext2 corrupted the WAL and database. Backup is the only
> solution then..
>
> Journalling file systems are usually very close to ext2 in performance,
> many times a lot better. With ext2, you are buying a huge risk.
>
> Unless there are good reasons, I would not put a database on ext2.
> Performance isn't one of them..
>
>  Shridhar



Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Andrew McMillan
On Wed, 2004-04-07 at 20:56, huang yaqin wrote:
> Hello, Richard Huxton,
>   
>   You said turning fsync off may cause data loss - that's terrible.
>   I use a SCSI disk, and the file system is ext3. I've tuned postgresql.conf and can't get 
> any improvement. So what can I do?
>   Is there a difference between SCSI and IDE disks?

Yes, turning off fsync means that the database is not guaranteeing
consistency of writes to disk any longer.  On the other hand your IDE
system probably never was, because IDE drives just typically turn on
write caching in hardware without telling anyone.

SCSI typically doesn't turn on write caching in the physical drive by
default, as Tom Lane pointed out earlier.  Good SCSI has a battery
backed up cache, and then it is OK to turn on write caching, because the
controller has enough battery to complete all writes in the event of a
power failure.

One thing I recommend is to use ext2 (or almost anything but ext3). 
There is no real need (or benefit) from having the database on a
journalled filesystem - the journalling is only trying to give similar
sorts of guarantees to what the fsync in PostgreSQL is doing.

The suggestion someone else made regarding use of software raid is
probably also a good one if you are trying to use the on-board RAID at
the moment.

Finally, I would say that because you are seeing poor performance on one
box and great performance on another, you should look at the hardware,
or at the hardware drivers, for the problem - not so much at PostgreSQL.

Of course if it is application performance you want to achieve, we _can_
help here, but you will need to provide more details of what you are
trying to do in your application, including;
 - confirmation that you have done a VACUUM and ANALYZE of all tables
before you start
 - output from EXPLAIN ANALYZE for slow queries
 - anything else you think is useful.

without that sort of detail we can only give vague suggestions, like
"wrap everything in a transaction" - excellent advice, certainly, but
you can read that in the manual.

There are no magic bullets, but I am sure most of the people on this
list have systems that regularly do way more than 50 inserts / second on
server hardware.

Regards,
Andrew McMillan

-
Andrew @ Catalyst .Net .NZ  Ltd,  PO Box 11-053,  Manners St,  Wellington
WEB: http://catalyst.net.nz/ PHYS: Level 2, 150-154 Willis St
DDI: +64(4)916-7201   MOB: +64(21)635-694  OFFICE: +64(4)499-2267
 http://survey.net.nz/ - any questions?
-




Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Steven Butler
It sounds almost like you're doing one insert per transaction.  Try wrapping
multiple inserts into a single transaction and see if that helps.  This may
not be appropriate for your application, but it does guarantee that
committed transactions will not be lost.

My apologies if you are already doing this. :)

BEGIN;
insert ...
insert ...
insert ...
COMMIT;

Regards,
Steve Butler

- Original Message - 
From: "huang yaqin" <[EMAIL PROTECTED]>
To: "Richard Huxton" <[EMAIL PROTECTED]>
Cc: <[EMAIL PROTECTED]>
Sent: Wednesday, April 07, 2004 6:56 PM
Subject: Re: [PERFORM] good pc but bad performance,why?


Hello, Richard Huxton,

  You said turning fsync off may cause data loss - that's terrible.
I use a SCSI disk, and the file system is ext3. I've tuned postgresql.conf and can't
get any improvement. So what can I do?
Is there a difference between SCSI and IDE disks?

 Regards,
Huang yaqin





Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Dennis Bjorklund
On Wed, 7 Apr 2004, huang yaqin wrote:

> You said turning fsync off may cause data loss - that's terrible. I use
> a SCSI disk, and the file system is ext3. I've tuned postgresql.conf and can't get
> any improvement. So what can I do?

Make sure you do as much as possible inside one transaction. If you want 
to do 1000 inserts, then do BEGIN; insert ; insert; ... ; COMMIT;

-- 
/Dennis Björklund




Re: [PERFORM] plan problem

2004-04-07 Thread Ken Geis
Richard Huxton wrote:
> On Tuesday 06 April 2004 21:25, Ken Geis wrote:
>> I am trying to find an efficient way to draw a random sample from a
>> complex query.  I also want it to be easy to use within my application.
>>
>> So I've defined a view that encapsulates the query.  The id in the
>> "driving" table is exposed, and I run a query like:
>>
>> select * from stats_record_view
>>   where id in (select id from driver_stats
>>     order by random()
>>     limit 3);
>
> How about a join?
>
> SELECT s.*
> FROM
>     stats_record_view s
> JOIN
>     (SELECT id FROM driver_stats ORDER BY random() LIMIT 3) AS r
> ON s.id = r.id;

Yes, I tried this too after I sent the first mail, and this was somewhat 
better.  I ended up adding a random column to the driving table, putting 
an index on it, and exposing that column in the view.  Now I can say

SELECT * FROM stats_record_view WHERE random < 0.093;

For my application, it's OK if the same sample is picked time after time 
and it may change if data is added.
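
Roughly like the following, for the archives - the column and index names here
are just illustrative, and the view has to be re-created to expose the new
column:

ALTER TABLE driver_stats ADD COLUMN random float8;
UPDATE driver_stats SET random = random();
CREATE INDEX driver_stats_random_idx ON driver_stats (random);
-- re-create stats_record_view so it exposes driver_stats.random, then:
SELECT * FROM stats_record_view WHERE random < 0.093;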

...
> Also worth checking the various list archives - this has come up in the past, 
> but some time ago.
There are some messages in the archives about how to get a random 
sample.  I know how to do that, and that's not why I posted my message. 
 Are you saying that the planner behavior I spoke of is in the 
archives?  I wouldn't know what to search on to find that thread.  Does 
anyone think that the planner issue has merit to address?  Can someone 
help me figure out what code I would look at?

Ken Geis





Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread huang yaqin
Hello, Richard Huxton,

You said turning fsync off may cause data loss - that's terrible.
I use a SCSI disk, and the file system is ext3. I've tuned postgresql.conf and can't get 
any improvement. So what can I do?
Is there a difference between SCSI and IDE disks?

 Regards,
Huang yaqin

=== 2004-04-07 09:33:00 ===

>On Wednesday 07 April 2004 05:00, huang yaqin wrote:
>> hello!
>>
>>  Thanks, you are right.
>>   I use "postmaster -o "-F" " to start my PG,and performance improved
>> greatly.
>
>I don't think Tom was recommending turning fsync off. If you have a system
>crash/power glitch then the database can become corrupted.
>
>If you are happy with the possibility of losing your data, write performance will
>improve noticeably.
>
>--
>  Richard Huxton
>  Archonet Ltd
>

= = = = = = = = = = = = = = = = = = = =


Best regards!


huang yaqin
[EMAIL PROTECTED]
  2004-04-07






Re: [PERFORM] plan problem

2004-04-07 Thread Richard Huxton
On Tuesday 06 April 2004 21:25, Ken Geis wrote:
> I am trying to find an efficient way to draw a random sample from a
> complex query.  I also want it to be easy to use within my application.
>
> So I've defined a view that encapsulates the query.  The id in the
> "driving" table is exposed, and I run a query like:
>
> select * from stats_record_view
>   where id in (select id from driver_stats
> order by random()
> limit 3);

How about a join?

SELECT s.*
FROM
stats_record_view s
JOIN
(SELECT id FROM driver_stats ORDER BY random() LIMIT 3) AS r
ON s.id = r.id;

Or, what about a cursor and fetch forward (or back?) a random number of rows 
before each fetch. That's probably not going to be so random though.
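
Something like this rough sketch - the skip counts are arbitrary, and as I say
the randomness is dubious:

BEGIN;
DECLARE c CURSOR FOR SELECT * FROM stats_record_view;
MOVE FORWARD 42 IN c;    -- skip an arbitrary number of rows
FETCH 1 FROM c;
MOVE FORWARD 17 IN c;
FETCH 1 FROM c;
CLOSE c;
COMMIT;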

Also worth checking the various list archives - this has come up in the past, 
but some time ago.

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] good pc but bad performance,why?

2004-04-07 Thread Richard Huxton
On Wednesday 07 April 2004 05:00, huang yaqin wrote:
> hello!
>
>   Thanks, you are right.
>I use "postmaster -o "-F" " to start my PG,and performance improved
> greatly.

I don't think Tom was recommending turning fsync off. If you have a system 
crash/power glitch then the database can become corrupted.

If you are happy with the possibility of losing your data, write performance will 
improve noticeably.

-- 
  Richard Huxton
  Archonet Ltd



Re: [PERFORM] [ADMIN] Raw devices vs. Filesystems

2004-04-07 Thread Grega Bremec
...and on Wed, Apr 07, 2004 at 01:26:02AM -0400, Tom Lane used the keyboard:
> 
> After that, we get to implement our own filesystem-equivalent management
> of disk space allocation, disk I/O scheduling, etc.  Are we really
> smarter than all those kernel hackers doing this for a living?  I doubt it.
> 
> After that, we get to re-optimize all the existing Postgres behaviors
> that are designed to sit on top of a standard Unix buffering filesystem
> layer.
> 
> After that, we might reap some performance benefits.  Or maybe not.
> There's not a heck of a lot of hard evidence that we would --- and
> what there is traces to twenty-year-old assumptions about disk drive
> and OS behavior, which are quite unlikely to still apply today.
> 
> Personally, I have a lot of more-promising projects to pursue...
> 

Has anyone tried PostgreSQL on top of OCFS? Personally, I'm not sure it
would even work, as Oracle clearly state that OCFS was _never_ meant to
be a fully fledged UNIX filesystem with POSIX features such as correct
timestamp updates, inode changes, etc., but OCFSv2 brings some features
that might lead one into thinking they're about to make it suitable for
uses beyond that of just having Oracle databases sitting on top of it.

Furthermore, this filesystem would be a blazing one stop solution for
all replication issues PostgreSQL currently suffers from, as its main
design goal was to present "a consistent file system image across the
servers in a cluster".

Now, if both goals can be achieved in one go, hell, I'm willing to try
it out myself in an attempt to extract from it some performance
indicators that could be compared to other database performance tests
sent to both this and the PERFORM mailing list.

So, anyone? :)

Cheers,
-- 
Grega Bremec
Senior Administrator
Noviforum Ltd., Software & Media
http://www.noviforum.si/




Re: [PERFORM] performance comparission postgresql/ms-sql server

2004-04-07 Thread Heiko Kehlenbrink
hi shridhar,


> Heiko Kehlenbrink wrote:
>
>> hi list,
>>
>> i want to convince people to use postgresql instead of ms-sql server, so i
>> set up a kind of comparison of insert data / select data from postgresql /
>> ms-sql server
>>
>> the table i use was pretty basic,
>>
>> id   bigserial
>> dist  float8
>> x float8
>> y float8
>> z float8
>>
>> i filled the table with a function which filled x,y,z with incrementally
>> increasing values (1,2,3,4,5,6...), computing from that the dist value
>> for every tuple (sqrt((x*x)+(y*y)+(z*z))).
>>
>> this works fine for both dbms
>>
>> postgresql needs 13:37 min for 10,000,000 tuples,
>> ms-sql needs 1:01:27 h for 10,000,000 tuples.
>>
>> so far so good.
>>
>> i attached an index on the dist column and started to query the dbs with
>> scripts which select a serial range of 100,000, 200,000, 500,000 tuples
>> based on the dist column.
>> i randomly compute the start and the end distance and made a "select
>> avg(dist) from table where dist > startdist and dist < enddist"
>
> Some basics to check quickly.
>
> 1. vacuum analyze the table before you start selecting.

was done,

> 2. for slow running queries, check explain analyze output and find out who
> takes maximum time.

[EMAIL PROTECTED]:~> psql -d test -c 'explain analyse select avg(dist)
from massive2 where dist > (100*sqrt(3.0))::float8 and dist <
(150*sqrt(3.0))::float8;'
NOTICE:  QUERY PLAN:

Aggregate  (cost=14884.61..14884.61 rows=1 width=8) (actual
time=3133.24..3133.24 rows=1 loops=1)
  ->  Index Scan using massive2_dist on massive2  (cost=0.00..13648.17
rows=494573 width=8) (actual time=0.11..2061.38 rows=49 loops=1)
Total runtime: 3133.79 msec

EXPLAIN

seems to me that most time was needed for the index scanning...

> 3. Check for typecasting. You need to typecast the query correctly e.g.
>
> select avg(dist) from table where dist > startdist::float8 and
> dist < enddist::float8
>
> This might still end up with a sequential scan depending upon the plan, but if
> an index scan is picked up, it might be plenty fast..
>
nope, the dist column is float8 and the query bounds are float8 too; also,
the explain says that an index scan was done.

> Post explain analyze for the queries if things don't improve.
>
see above..

>   HTH
>
>   Shridhar
>
best regards
heiko





