spiky load and you
can add/remove machines to meet that load, but if you end up needing to have the
machines running a significant percentage of the time, dedicated boxes are
cheaper (as well as faster)
David Lang
--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org
, and it changes over
the time. Scaling up/down has helped us cope.
how do you add another server without having to do a massive data copy in the
process?
David Lang
- Live relocation of databases helps with hardware upgrades and spreading
of load.
Main issues:
- We are not overprovisioning
.
note that the ext3, reiserfs, jfs, and xfs developers (at least) consider
fsck necessary even for journaling filesystems. they just let you get away
without it being mandatory after an unclean shutdown.
David Lang
---(end of broadcast)---
TIP 2
On Wed, 9 Aug 2006, Stephen Frost wrote:
* David Lang ([EMAIL PROTECTED]) wrote:
there's a huge difference between 'works on debian' and 'supported on
debian'. I do use debian extensively (along with slackware on my personal
machines), so I am comfortable getting things to work. but 'supported
I'm talking about support, it's not just postgresql
support, but also hardware/driver support that can run into these problems
David Lang
from the support groups of
companies)
David Lang
on it.
David Lang
On Sat, 11 Mar 2006, Joost Kraaijeveld wrote:
Date: Sat, 11 Mar 2006 09:17:09 +0100
From: Joost Kraaijeveld [EMAIL PROTECTED]
To: David Lang [EMAIL PROTECTED]
Cc: Richard Huxton dev@archonet.com, pgsql-performance@postgresql.org
Subject: Re: [PERFORM] x206-x225
On Fri, 2006-03-10 at 23:57
then wait for), so you can only do one transaction per
rotation.
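
The one-transaction-per-rotation limit can be turned into a rough ceiling. This is a minimal sketch, assuming a single drive with no write cache where every commit has to wait for the platter to come around again; the function name and the RPM values are illustrative assumptions, not figures from this thread.

```python
def max_commits_per_second(rpm: float) -> float:
    """Upper bound on synchronous commits/sec if each commit costs one full rotation."""
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    # rotations per minute -> rotations per second
    return rpm / 60.0


if __name__ == "__main__":
    # Common spindle speeds, chosen only for illustration.
    for rpm in (7200, 10000, 15000):
        print(f"{rpm:>6} rpm: at most {max_commits_per_second(rpm):.0f} commits/sec")
```

A controller with battery-backed cache, or several transactions committed together, would lift this ceiling, so treat the number as a bound for the uncached single-drive case only.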
David Lang
) )
if you could drop that constraint (the cost of which would be extra 'real'
compares within a bucket) then a helper function per datatype could work
as you are talking.
David Lang
intuition
holds up in such a high-dimensional space as we have here.
I will say that I'm not understanding the problem well enough to
understand the multi-dimensional nature of this problem.
David Lang
for databases of this size.
David Lang
if there
is such a limit on Linux, but there definitely is on some other Unixen).
Linux doesn't have any ability to limit the amount of memory used for
caching (there are periodically requests for such a feature)
David Lang
Look around and see if you can reduce the memory used by processes
do a pretty good job of this (especially if large_file is
noticeably larger than the amount of ram you have)
David Lang
that got us off on this tangent, when
doing new writes to an array you don't have to read the blocks as they are
blank, assuming your caching is enough so that you can write blocksize*n
before the system starts actually writing the data)
David Lang
Alex.
On 12/25/05, Michael Stone [EMAIL PROTECTED
am looking at where RAID0 is looking
appropriate for a database (a multi-TB array that gets completely reloaded
every month or so as data expires and new data is loaded from the
authoritative source, adding another 16 drives to get redundancy isn't
reasonable)
David Lang
Alex.
On 12/26/05
, not the controllers.
Thanks for the clarification, I knew that PATA didn't do hotswap, and I've
seen discussions on the linux-kernel list about SATA hotswap being worked
on, but I thought that SCSI handled it. how recent a kernel have you had
problems with?
David Lang
for
sequential writes (for example data mining where you do a large import at
one time, but seldom do other updates). I'm assuming a controller with a
reasonable amount of battery-backed cache.
David Lang
.
David Lang
trying to get some started, but haven't had a chance yet)
David Lang
I know it
depends a lot on the system but for now this database is about 20 gigabytes.
Not too large right now but it may grow 5x in the next year.
Thanks,
Juan
On Wednesday 21 December 2005 22:09, Juan Casero wrote:
I just
, or dogs (and it could even be both, depending
on your workload)
David Lang
Thanks,
Juan
On Thursday 22 December 2005 22:12, David Lang wrote:
On Wed, 21 Dec 2005, Juan Casero wrote:
Date: Wed, 21 Dec 2005 22:31:54 -0500
From: Juan Casero [EMAIL PROTECTED]
To: pgsql-performance
postgresql processes running in parallel on the
current systems (assuming the application can scale).
note that like hyperthreading, the strands aren't full processors, their
efficiency depends on how much the other threads sharing the core stall
waiting for external things.
David Lang
a dual opteron with 16gigs of ram will allow you to
work with much larger sets of data, and you can go beyond that if needed.
David Lang
your database significantly. you may be better off
with fewer, but dedicated drives rather than more, but shared drives.
David Lang
On Mon, 19 Dec 2005, David Lang wrote:
this is getting dangerously close to being able to fit in ram. I saw an
article over the weekend that Samsung is starting to produce 8G DIMMs, that
can go 8 to a controller (instead of 4 per as is currently done), when
motherboards come out that support
between machines this would be a hook
that the cluster engine could use to put its own plan into place without
having to modify and recompile)
David Lang
these config options listed as tweaking targets fairly
frequently, has anyone put any thought or effort into creating a test
program that could analyse the actual system and set the defaults based on
the measured performance?
David Lang
on. and this by itself
can result in significant wins (does Oracle support Opteron CPUs in 64
bit mode yet? as of this summer it just wasn't an option)
David Lang
drives are better (less time to read or write a track)
so the 15k drive option is better
one other note, you probably don't want to use all the disks in a raid10
array, you probably want to split a pair of them off into a separate raid1
array and put your WAL on it.
David Lang
optimize things.
David Lang
can test this (with significant data risk) by putting the WAL
on a ramdisk and seeing what your performance looks like.
David Lang
to have a filesystem journal on a
different drive.
David Lang
of narrow rows). I'm waiting to see what happens once I have
data/pg_xlog on the 2nd disk set.
in that case you logically have two disks, so see the post from Ron earlier
in this thread.
David Lang
, and the
binary representation of the data will probably reduce your network
traffic as a side effect.
and for things like date which get parsed in multiple ways until one is
found that seems sane, there's a significant amount of work that the
server could avoid.
David Lang
The other thought, of course
in
parallel throwing the data at one database than it is to throw more
hardware at the database server to speed it up (and yes, assuming that MPP
splits the parsing costs as well, it can be an answer for some types of
systems)
David Lang
quite a
few of these on each chunk of data without it being measurable in your
overall time)
an alternative would be to add a 1-byte data type before each data element
to specify its type, but then the server side code would have to be
smarter to deal with the additional possibilities.
David
it easily becomes a
win.
David Lang
, parsing bugs could happen in the server as
well. (in fact, the server could parse things to the intermediate format
and then convert them, this sounds expensive, but given the high clock
multipliers in use, it may not end up being measurable)
David Lang
the
sizeof from the data itself as you suggest above has a penalty, namely it
spreads the data that needs to be accessed to process a line between
different cache lines, so in some cases it won't be worth it)
David Lang
On Fri, 2 Dec 2005, Qingqing Zhou wrote:
I don't have all the numbers readily available (and I didn't do all the
tests on every filesystem), but I found that even with only 1000
files/directory ext3 had some problems, and if you enabled dir_hash some
functions would speed up, but writing lots
. if
you do a ls -l on the parent directory you will see that the size of the
directory is large if it's ever had lots of files in it, the only way to
shrink it is to mv the old directory to a new name, create a new directory
and move the files from the old directory to the new one.
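
The rename, recreate, and move-back procedure described above can be sketched as follows. This is a minimal sketch under stated assumptions: the function name `rebuild_directory` and the `.old` suffix are hypothetical, nothing else is writing to the directory while it runs, and only permission bits (not ownership) are carried over to the new directory.

```python
import os
import tempfile


def rebuild_directory(path: str) -> None:
    """Replace `path` with a freshly created directory holding the same entries."""
    path = os.path.abspath(path)
    old = path + ".old"  # hypothetical temporary name for the oversized directory
    if os.path.exists(old):
        raise FileExistsError(old)
    mode = os.stat(path).st_mode & 0o7777
    os.rename(path, old)   # mv the old directory to a new name
    os.mkdir(path)         # create a new, small directory under the original name
    os.chmod(path, mode)   # keep the original permission bits
    for name in os.listdir(old):
        # Same parent directory, so these renames stay on one filesystem.
        os.rename(os.path.join(old, name), os.path.join(path, name))
    os.rmdir(old)          # the old directory is empty now and can be removed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "spool")
        os.mkdir(target)
        for i in range(5):
            open(os.path.join(target, f"f{i}"), "w").close()
        rebuild_directory(target)
        print(sorted(os.listdir(target)))
```

Anything that holds the directory open, or creates files in it during the move, would need to be stopped first; the sketch does not guard against that.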
David Lang
re-running the tests to get a complete set of benchmarks in the next few
days. My tests had their times vary from 4 min to 80 min depending on the
filesystem in use (ext3 with hash_dir posted the worst case). what testing
have other people done with different filesystems?
David Lang
it every few days to defrag
things forever after.
David Lang
I can only think of two other options:
1. Change the database schema to reduce the number of tables involved.
I'm assuming that of the 3500 tables most hold the same data but for
different clients (or something similar). This might
On Thu, 1 Dec 2005, Qingqing Zhou wrote:
David Lang [EMAIL PROTECTED] wrote
a few weeks ago I did a series of tests to compare different filesystems.
the test was for a different purpose so the particulars are not what I
would do for testing aimed at postgres, but I think the data is relevant
tests that your own database has trouble with :)
David Lang
-- Forwarded message -- Date: Thu, 01 Dec 2005 16:14:25
David,
The choice of benchmark depends on what kind of application you would
like to see performance for.
When someone speaks about one or other database
the CPU more and overlap it with your
seeking.
David Lang
on a different machine when doing the copy? (I'm thinking that
the first machine may be able to do a lot of the parsing and conversion,
leaving the second machine to just worry about doing the writes)
David Lang
(at least
until the postgres project itself starts implementing similar features :-)
David Lang
Thanks,
Brendan Duddridge | CTO | 403-277-5591 x24 | [EMAIL PROTECTED]
ClickSpace Interactive Inc.
Suite L100, 239 - 10th Ave
On Sun, 27 Nov 2005, Andreas Pflug wrote:
David Lang wrote:
Postgres needs to work on the low end stuff as well as the high end stuff
or people will write their app to work with things that DO run on low end
hardware and they spend much more money than is needed to scale the
hardware up
, but if we can have a bunch of people run similar tests
we should learn a lot.
David Lang
by the way, this is the discussion that prompted me to start this project
http://lwn.net/Articles/161323/
David Lang
-writing their app.
Part of the reason that I made the post on /. to start this was the hope
that a reasonable set of benchmarks could be hammered out and then more
people than just me could run them to get a wider range of results.
David Lang
win with the small systems,
but for most other uses the large system would win easily. and in any case
it's not the open and shut case that you keep presenting it as.
David Lang