Josh Berkus josh@agliodbs.com writes:
1) When is it necessary to run REINDEX or drop/create
an index? All I could really find in the docs is:
If you need to VACUUM FULL, you need to REINDEX as well.
For example,
if you drop millions of rows from a table.
That's probably a
-Original Message-
From: Alex Turner [mailto:[EMAIL PROTECTED]
Sent: Monday, April 18, 2005 5:50 PM
To: Bruce Momjian
Cc: Kevin Brown; pgsql-performance@postgresql.org
Subject: Re: [PERFORM] How to improve db performance with $7K?
Does it really matter at which end of the cable
Tambet Matiisen [EMAIL PROTECTED] writes:
Is my current understanding correct:
1) VACUUM defragments each page locally - moves free space to the end of
page.
2) VACUUM FULL defragments table globally - tries to fill up all
partially free pages and deletes all resulting empty pages.
Both
Good question. If the SCSI system was moving the head from track 1 to 10, and
a request then came in for track 5, could the system make the head stop at
track 5 on its way to track 10? That is something that only the controller
could do. However, I have no idea if SCSI does that.
|| SCSI,
On Mon, 2005-04-18 at 08:50 +0400, Nurlan Mukhanov (AL/EKZ) wrote:
I'm trying to restore my database from dump in several parrallel processes,
but restore process works too slow.
Number of rows about 100 000 000,
RAM: 8192M
CPU: Ultra Sparc 3
Number of CPU: 4
OS: SunOS sun 5.8
RDBMS:
Simon Riggs wrote:
On Mon, 2005-04-18 at 08:50 +0400, Nurlan Mukhanov (AL/EKZ) wrote:
I'm trying to restore my database from dump in several parrallel processes, but restore process works too slow.
Number of rows about 100 000 000,
RAM: 8192M
CPU: Ultra Sparc 3
Number of CPU: 4
OS: SunOS sun
Tambet,
Hmm, thanks for a tip. BTW, is output of
select count(1), sum(relpages) from pg_class where relkind in
('r','i','t')
Well, if you do that for all databases in the cluster, it's the number you
start with. However, setting FSM_pages to that would be assuming that you
excpected 100%
Mohan, Ross wrote:
The only part I am pretty sure about is that real-world experience shows SCSI
is better for a mixed I/O environment. Not sure why, exactly, but the
command queueing obviously helps, and I am not sure what else does.
|| TCQ is the secret sauce, no doubt. I think NCQ
Clustered file systems is the first/best example that
comes to mind. Host A and Host B can both request from diskfarm, eg.
-Original Message-
From: Bruce Momjian [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 19, 2005 12:10 PM
To: Mohan, Ross
Cc: pgsql-performance@postgresql.org
Mohan, Ross wrote:
Clustered file systems is the first/best example that
comes to mind. Host A and Host B can both request from diskfarm, eg.
So one host writes to part of the disk and another host writes to a
different part?
[EMAIL PROTECTED] wrote on 04/19/2005 11:10:22 AM:
What is 'multiple initiators' used for in the real world?
I asked this same question and got an answer off list: Somebody said their
SAN hardware used multiple initiators. I would try to check the archives
for you, but this thread is
Josh Berkus josh@agliodbs.com writes:
select count(1), sum(relpages) from pg_class where relkind in
('r','i','t')
Well, if you do that for all databases in the cluster, it's the number you
start with. However, setting FSM_pages to that would be assuming that you
excpected 100% of the rows
Well, more like they both are allowed to issue disk
requests and the magical clustered file system manages
locking, etc.
In reality, any disk is only reading/writing to one part of
the disk at any given time, of course, but that in the multiple
initiator deal, multiple streams of requests from
Tom,
Not at all. What it says is that you expect 100% of the pages to have
useful amounts of free space, which is a *much* weaker criterion.
Hmmm. Good point.
This seems to be another instance where my rule-of-thumb was based on false
logic but nevertheless arrived at correct numbers.
Josh Berkus josh@agliodbs.com writes:
Not at all. What it says is that you expect 100% of the pages to have
useful amounts of free space, which is a *much* weaker criterion.
H actually, it seems like, if you are vacuuming regularly, you only
*do* need to track pages that have been
On Tue, Apr 19, 2005 at 10:06:40AM -0400, Tom Lane wrote:
BTW, VACUUM FULL does the data movement back-to-front, and stops as soon
as it finds a tuple it cannot move down; which is a reasonable strategy
since the goal is merely to make the file shorter. But it's entirely
likely that there
Alvaro Herrera [EMAIL PROTECTED] writes:
Am I right in thinking that vacuum does at least two passes: one
front-to-back to find removable tuples, and other back-to-front for
movement?
VACUUM FULL, yes. VACUUM only does the first one.
I know maintenance_work_mem is used for storing TIDs of
Folks,
Params: PostgreSQL 8.0.1 on Solaris 10
Statistics = 500
(tablenames have been changed to protect NDA)
e1=# select tablename, null_frac, correlation, n_distinct from pg_stats where
tablename = 'clickstream1' andattname = 'session_id';
tablename | null_frac | correlation |
-Original Message-
From: Josh Berkus [mailto:[EMAIL PROTECTED]
Sent: Tuesday, April 19, 2005 2:09 PM
To: pgsql-perform
Subject: [PERFORM] Bad n_distinct estimation; hacks suggested?
[...]
(BTW, increasing the stats to 1000 only doubles n_distinct,
and doesn't solve the problem)
Josh Berkus josh@agliodbs.com writes:
As you can see, n_distinct estimation is off by a factor of 10x and it's
causing query planning problems. Any suggested hacks to improve the
histogram on this?
What's the histogram itself look like? (I'd like to see the whole
pg_stats row not just
I have dbt-2 tests automatically running against each pull from CVS
and have started to automatically compile results here:
http://developer.osdl.org/markw/postgrescvs/
I did start with a bit of a minimalistic approach, so I'm open for any
comments, feedback, etc.
Mark
On Thu, Apr 14, 2005 at 10:51:46AM -0500, Matthew Nuzum wrote:
So if you all were going to choose between two hard drives where:
drive A has capacity C and spins at 15K rpms, and
drive B has capacity 2 x C and spins at 10K rpms and
all other features are the same, the price is the same and C
On Mon, Apr 18, 2005 at 07:41:49PM +0200, Jacques Caron wrote:
It would be interesting to actually compare this to real-world (or
nearly-real-world) benchmarks to measure the effectiveness of features like
TCQ/NCQ etc.
I was just thinking that it would be very interesting to benchmark
On Mon, Apr 18, 2005 at 10:20:36AM -0500, Dave Held wrote:
Hmm...so you're saying that at some point, quantity beats quality?
That's an interesting point. However, it presumes that you can
actually distribute your data over a larger number of drives. If
you have a db with a bottleneck of one
On Mon, Apr 18, 2005 at 06:41:37PM -, Mohan, Ross wrote:
Don't you think optimal stripe width would be
a good question to research the binaries for? I'd
think that drives the answer, largely. (uh oh, pun alert)
EG, oracle issues IO requests (this may have changed _just_
recently) in
On Tue, Apr 19, 2005 at 11:22:17AM -0500, [EMAIL PROTECTED] wrote:
[EMAIL PROTECTED] wrote on 04/19/2005 11:10:22 AM:
What is 'multiple initiators' used for in the real world?
I asked this same question and got an answer off list: Somebody said their
SAN hardware used multiple
You should re-run the function test using SQL as the function language
instead of plpgsql. There might be some performance to be had there.
--
Jim C. Nasby, Database Consultant [EMAIL PROTECTED]
Give your computer some brain candy! www.distributed.net Team #1828
Windows: Where do
On Mon, Apr 18, 2005 at 11:36:01AM -0300, Rodrigo Moreno wrote:
I really worried about that, because it's no enough anymore, and users claim
about performace. But running the vacuumdb full, everthing starts to run
better again, so i think the problem is not related to a specific query.
Vacuum
On Mon, Apr 18, 2005 at 10:44:43AM -0500, Dave Held wrote:
Since you are fetching the entire table, you are touching all the rows.
If the query were to fetch the rows in index order, it would be seeking
all over the table's tracks. By fetching in sequence order, it has a
much better chance of
Now that we've hashed out which drives are quicker and more money equals
faster...
Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid
option would you use for a standalone postgres server?
a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
b) 1xRAID1 for OS/xlog, 1xRAID5 for
http://stats.distributed.net is setup with the OS, WAL, and temp on a
RAID1 and the database on a RAID10. The drives are 200G SATA with a
3ware raid card. I don't think the controller has battery-backed cache,
but I'm not sure. In any case, it's almost never disk-bound on the
mirror; when it's
A friend of mine has an application where he's copying in 4000 rows at a
time into a table that has about 4M rows. Each row is 40-50 bytes. This
is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk
SATA mirror, running FBSD 4.10-stable. There's one index on the table.
What's
Jim C. Nasby [EMAIL PROTECTED] writes:
Actually, the planner (at least in 7.4) isn't smart enough to consider
if the sort would fit in memory or not.
Really? Have you read cost_sort()?
It's certainly possible that the calculation is all wet, but to claim
that the issue is not considered is
Jeff,
Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid
option would you use for a standalone postgres server?
a) 3xRAID1 - 1 for data, 1 for xlog, 1 for os?
b) 1xRAID1 for OS/xlog, 1xRAID5 for data
c) 1xRAID10 for OS/xlong/data
d) 1xRAID1 for OS, 1xRAID10 for data
My experience:
1xRAID10 for postgres
1xRAID1 for OS + WAL
Jeff Frost wrote:
Now that we've hashed out which drives are quicker and more money equals
faster...
Let's say you had a server with 6 separate 15k RPM SCSI disks, what raid
option would you use for a standalone postgres server?
a)
Jim C. Nasby [EMAIL PROTECTED] writes:
A friend of mine has an application where he's copying in 4000 rows at a
time into a table that has about 4M rows. Each row is 40-50 bytes. This
is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2 disk
SATA mirror, running FBSD 4.10-stable.
On Tue, Apr 19, 2005 at 11:01:26PM -0400, Tom Lane wrote:
Jim C. Nasby [EMAIL PROTECTED] writes:
Actually, the planner (at least in 7.4) isn't smart enough to consider
if the sort would fit in memory or not.
Really? Have you read cost_sort()?
It's certainly possible that the
Jim C. Nasby [EMAIL PROTECTED] writes:
What's really odd is that neither the CPU or the disk are being
hammered. The box appears to be pretty idle; the postgresql proces is
using 4-5% CPU.
Is he committing every row? In that case you would see fairly low i/o
bandwidth usage because most of
Quoting Tom Lane [EMAIL PROTECTED]:
Jim C. Nasby [EMAIL PROTECTED] writes:
A friend of mine has an application where he's copying in 4000 rows at a
time into a table that has about 4M rows. Each row is 40-50 bytes. This
is taking 25 seconds on a dual PIII-1GHz with 1G of RAM and a 2
Is there a way to look at the stats tables and tell what is jamming up
your postgres server the most? Other than seeing long running queries
and watch top, atop, iostat, vmstat in separate xterms...I'm wondering
if postgres keeps some stats on what it spends the most time doing or if
there's
40 matches
Mail list logo